Online reinforcement learning for a class of partially unknown continuous‐time nonlinear systems via value iteration

Su, Hanguang; Zhang, Huaguang; Zhang, Kun; Gao, Wenzhong

doi:10.1002/oca.2391

Cited by 19 publications

(22 citation statements)

References 72 publications

“…Theorem 1. For the model NN (10), if the activation function is the sigmoid function and the hidden-to-output weight is updated by the adaptive law (13), then the approximate error x and wmo is uniformly ultimately bounded.…”

Section: Model NN Design and the Convergence Analysis (mentioning)

confidence: 99%

“…The optimal control calculated from our algorithm FIGURE 7 The optimal control from the paper [13] FIGURE 8 The cost function produced in our algorithm (upper) and the optimal cost function (lower) [13]. Finally, the initial admissible control law is obtained from NMPC in proposed algorithm and the initial weights of neural networks can be designed reasonably.…”

Section: Figure (mentioning)

confidence: 99%

“…Adaptive dynamic programming (ADP) is an important part of optimal control theory, which can avoid the "curse of dimensionality" [5,6]. Until to now, the ADP algorithms have many research results in discrete systems [7][8][9][10][11] and continuous systems [12][13][14][15][16][17][18]. Zhang et al [18] introduced a precompensator to construct an augmented system and the Hamilton-Jacobi-Bellman (HJB) equation was solved by the least squared technique, neural network (NN) approximator and policy iteration algorithm.…”

Section: Introduction (mentioning)

confidence: 99%

“…Yang et al [16] proposed an output‐based tracking control scheme for a class of continuous‐time nonlinear systems via the adaptive dynamic programming (ADP) technique. For partially unknown continuous‐time nonlinear systems, a online adaptive critic algorithm using least squares support vector machine was presented in Sun et al [14] and a value iteration‐based ADP algorithm was proposed in Su et al [13] to solve the optimal control problem. Wei et al [15] designed the continuous‐time time‐varying iteration algorithm to deal with the optimal control problem for the continuous‐time time‐varying nonlinear system.…”

Section: Introduction (mentioning)

confidence: 99%

A novel optimal control design for unknown nonlinear systems based on adaptive dynamic programming and nonlinear model predictive control

Zhang; Zheng. 2021. Asian Journal of Control

This paper presents a novel adaptive optimal control algorithm by combining adaptive dynamic programming with nonlinear model predictive control for unknown continuous-time affine nonlinear systems. The adaptive optimal control design is realized by the model-critic-actor architecture. Model neural network, critic neural network and actor neural network are constructed to approximate the system dynamics, the cost function and the optimal control law respectively. The random initialization of neural networks usually influences the control performance, so three neural networks are initialized properly to obtain the suitable initial values so that the control performance is improved. Especially, actor neural network is initialized to approximate the near-optimal control law which is obtained from nonlinear model predictive control. The convergence of the proposed algorithm is proved by the Lyapunov theory. Finally, simulation results are provided to illustrate the effectiveness of the proposed algorithm.

“…Compared with the existing literature on VI based IRL [64][65][66], the main contributions of this chapter are enumerated here.…”

mentioning

confidence: 99%

Applications of integral reinforcement learning control in electrical machines and power converter systems

Yu

Neighbor Q‐learning based consensus control for discrete‐time multi‐agent systems

Zhu; Yuan; Dong; et al. 2022. Optim Control Appl Methods

The neighbor Q-learning based consensus control algorithm is developed for discrete-time multi-agent systems in this article. To realize the proposed algorithm, a new actor-critic architecture is employed for each agent. The critic network of each agent approximates its Q-function while the actor network produces control signal by minimizing the Q-function. Considering the distribution metrics of the systems, the neighbors' Q-functions of each agent are applied to the update procedure of the critic network to stabilize the learning process and avoid the overestimation problem. The convergence properties and stability analysis for the proposed algorithm are provided. Different discount factors corresponding to various topology structures are discussed in the convergence analysis section. The accurate system model is nonessential for the algorithm which is too intricate to build up for practical systems. Finally, three simulation examples including different discount factors are conducted to demonstrate the effectiveness of the consensus control algorithm.
