2022
DOI: 10.1109/tsmc.2020.3003224
Adaptive Identifier-Critic-Based Optimal Tracking Control for Nonlinear Systems With Experimental Validation

Cited by 79 publications (34 citation statements)
References 47 publications

“…With all transition samples participating in the AI-VIRL, this model-free, Q-learning-like algorithm benefits from a form of experience replay, widely used in reinforcement learning. Under certain assumptions, convergence of the AI-VIRL C-NN to the optimal controller, which implies stability of the closed loop, has been analyzed before in the literature [2,3,6-8,11,12] and is not discussed here.…”
Section: The AI-VIRL Solution for the LRM Output Tracking (mentioning)
confidence: 99%
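As a minimal sketch of what a "model-free, Q-learning-like algorithm with experience replay" can look like, the Python fragment below replays stored transitions through a tabular value update. The buffer capacity, learning rate, discount factor, and cost-minimization convention are all illustrative assumptions, not details taken from the cited paper.

```python
# Hedged sketch: tabular Q-learning-style value iteration with experience
# replay. All constants and the cost convention are assumptions.
import random
import numpy as np

GAMMA = 0.95   # discount factor (assumed)
ALPHA = 0.1    # learning rate (assumed)

n_states, n_actions = 10, 3
Q = np.zeros((n_states, n_actions))
replay_buffer = []                 # stores (s, a, cost, s_next) transitions

def store(transition, capacity=1000):
    """Append a transition; drop the oldest once capacity is exceeded."""
    replay_buffer.append(transition)
    if len(replay_buffer) > capacity:
        replay_buffer.pop(0)

def replay_update(batch_size=32):
    """One value-iteration sweep over a random minibatch of stored samples."""
    batch = random.sample(replay_buffer, min(batch_size, len(replay_buffer)))
    for s, a, cost, s_next in batch:
        target = cost + GAMMA * Q[s_next].min()   # minimize cost, so take min
        Q[s, a] += ALPHA * (target - Q[s, a])

# Usage: call store((s, a, cost, s_next)) after each plant step, then call
# replay_update() repeatedly so every stored sample keeps participating.
```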
“…Value Iteration (VI) is a popular approximate dynamic programming [1-7] and reinforcement learning [8-13] algorithm, together with Policy Iteration. The VI Reinforcement Learning (VIRL) algorithm comes in many implementation flavors: online or offline, off-policy or on-policy, batch-wise or adaptive-wise, with known or unknown system dynamics.…”
Section: Introduction (mentioning)
confidence: 99%
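To ground the excerpt's terminology, the following is the textbook value-iteration loop for a finite Markov decision process. The transition tensor P and stage cost C in the toy instance are assumed numbers; none of this is the cited paper's VIRL implementation.

```python
# Textbook value iteration on a finite MDP (cost-minimization convention).
import numpy as np

def value_iteration(P, C, gamma=0.95, tol=1e-8):
    """P[a, s, s'] = transition probability, C[s, a] = stage cost.
    Returns the converged value function and the greedy policy."""
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    while True:
        # Bellman backup: Q(s, a) = C(s, a) + gamma * E[V(s') | s, a]
        Q = C + gamma * np.einsum("ast,t->sa", P, V)
        V_new = Q.min(axis=1)                 # greedy over actions
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmin(axis=1)
        V = V_new

# Toy 2-state, 2-action instance (assumed numbers).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
C = np.array([[1.0, 0.5], [0.2, 2.0]])
V_star, pi_star = value_iteration(P, C)
```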
“…In this paper, the SMOFCC scheme is designed based on known system dynamics, and it can be extended to the model-free case as long as sufficient system data are available. To achieve this goal, one strategy is to employ an observer [15] or identifier [48] to estimate the system dynamics and then apply the estimate directly in the control design. Another way is to develop a purely model-free control method, i.e., to design the controller directly from system input-output data [28].…”
Section: Remark (mentioning)
confidence: 99%
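The "identifier" strategy mentioned in the excerpt can be sketched as fitting an approximate model x_{k+1} ≈ f_hat(x_k, u_k) from recorded input-output data and then handing f_hat to a model-based design. The random-feature basis, the least-squares fit, and the stand-in data below are all assumptions for illustration.

```python
# Hedged sketch of a data-driven identifier: least-squares fit of
# x_{k+1} ≈ f_hat(x_k, u_k) over a random-feature basis (all assumed).
import numpy as np

rng = np.random.default_rng(0)
n_x, n_u, n_feat = 2, 1, 50
W_feat = rng.normal(size=(n_feat, n_x + n_u))   # fixed random basis weights

def features(x, u):
    """Nonlinear expansion of the state-input pair (assumed basis)."""
    return np.tanh(W_feat @ np.concatenate([x, u]))

# Recorded (x_k, u_k, x_{k+1}) triples; here a stand-in for real plant data.
X = rng.normal(size=(200, n_x))
U = rng.normal(size=(200, n_u))
X_next = 0.9 * X + 0.1 * U @ np.ones((n_u, n_x))

Phi = np.array([features(x, u) for x, u in zip(X, U)])
Theta, *_ = np.linalg.lstsq(Phi, X_next, rcond=None)   # X_next ≈ Phi @ Theta

def f_hat(x, u):
    """Identified dynamics, usable inside a model-based control design."""
    return features(x, u) @ Theta
```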
“…Additionally, compared with the data-driven method in [48], this work provides a neural-network-based technique that avoids the Kronecker product in estimating the actor/critic term. Actor/critic-based approaches have been discussed in [49,50] for nonlinear affine systems using the residual error δ_hjb. However, in view of the identifier considered in [49,50], the computation of the residual error δ_hjb and the training laws for the actor/critic weights differ between the proposed method and the work in [49,50].…”
Section: ARL-Based Control Design for Independent Joints (mentioning)
confidence: 99%
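To illustrate what a residual like δ_hjb is, the sketch below evaluates a continuous-time HJB residual for a linear-in-parameters critic V(x) ≈ w^T φ(x) and descends its square. The quadratic basis, toy affine dynamics, and simple gradient-descent law are assumptions for illustration; they are not the exact training laws of the works cited above.

```python
# Hedged sketch: critic weights trained to drive an HJB residual delta_hjb
# toward zero for a toy affine system (all numbers and laws assumed).
import numpy as np

def phi(x):                       # assumed quadratic critic basis
    return np.array([x[0]**2, x[0]*x[1], x[1]**2])

def dphi(x):                      # Jacobian of phi with respect to x
    return np.array([[2*x[0], 0.0],
                     [x[1],   x[0]],
                     [0.0,    2*x[1]]])

A = np.array([[0.0, 1.0], [-1.0, -0.5]])   # toy affine dynamics dx = Ax + Bu
B = np.array([[0.0], [1.0]])
R_inv = np.array([[1.0]])                  # inverse of the input weight R = I

w = np.zeros(3)                   # critic weights
lr = 1e-2                         # learning rate (assumed)

for _ in range(2000):
    x = np.random.uniform(-1.0, 1.0, size=2)
    grad_V = dphi(x).T @ w                    # approximate gradient of V(x)
    u = -0.5 * R_inv @ B.T @ grad_V           # actor from the stationarity condition
    xdot = A @ x + B @ u
    cost = x @ x + float(u @ u)               # stage cost x'Qx + u'Ru with Q = R = I
    delta_hjb = cost + grad_V @ xdot          # HJB residual
    w -= lr * delta_hjb * (dphi(x) @ xdot)    # descend delta_hjb**2 (actor held fixed)
```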