Summary
In this paper, a modified value-iteration-based approximate dynamic programming method is proposed for a class of affine nonlinear continuous-time systems whose dynamics are partially unknown. The value iteration algorithm is established in an online fashion, and a convergence proof is given. To attenuate the effect of the unknown system dynamics, an integral reinforcement learning scheme is also used. The proposed method uses a single-network structure to estimate both the value functions and the control policies: the iteration is implemented on the actor/critic structure, but only the critic neural network (NN) needs to be identified. A least-squares scheme is then derived for updating the NN weights. Finally, a linear system and a nonlinear system are tested to evaluate the performance of the proposed online value iteration algorithm. Both examples show the feasibility and effectiveness of the proposed algorithm.
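The combination of value iteration, integral reinforcement learning, and a least-squares critic update described in this summary can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the paper's algorithm: the scalar plant `dx/dt = a*x + b*u`, the quadratic cost weights `Q` and `R`, the one-weight critic `V(x) = w*x^2`, the interval length, and the names `rollout` and `value_iteration` are all hypothetical choices made for illustration. The learner uses only measured trajectory data and the input gain `b`; the drift `a` appears only inside the simulator.

```python
import numpy as np

# Hypothetical scalar plant dx/dt = a*x + b*u (illustrative values only).
# A_TRUE is used only by the simulator; the learner never reads it.
A_TRUE, B, Q, R = -1.0, 1.0, 1.0, 1.0


def rollout(x0, w, horizon=0.1, dt=1e-3):
    """Apply u = -(B/R)*w*x for one interval of length `horizon`.

    Returns the end states x(t+T) and the integral of Q*x^2 + R*u^2,
    both computed with a forward-Euler simulation (an assumption).
    """
    x = np.array(x0, dtype=float)
    cost = np.zeros_like(x)
    for _ in range(int(round(horizon / dt))):
        u = -(B / R) * w * x                  # policy read off the critic V(x) = w*x^2
        cost += dt * (Q * x**2 + R * u**2)    # running integral of the stage cost
        x = x + dt * (A_TRUE * x + B * u)     # Euler step of the plant
    return x, cost


def value_iteration(n_iter=50, x0=(-2.0, -1.0, 0.5, 1.0, 2.0)):
    """Integral-RL value iteration with a least-squares critic update."""
    w = 0.0                                   # start from V_0(x) = 0
    x0 = np.asarray(x0, dtype=float)
    phi = (x0**2).reshape(-1, 1)              # critic feature x^2 at each sample state
    for _ in range(n_iter):
        x_next, cost = rollout(x0, w)
        target = cost + w * x_next**2         # integral cost plus V_k at the end state
        w = float(np.linalg.lstsq(phi, target, rcond=None)[0][0])
    return w
```

For these assumed values, the weight returned by `value_iteration()` should approach the positive root of the scalar Riccati equation `2*a*w - (b**2 / R) * w**2 + Q = 0`, up to the Euler discretization error. A single critic weight suffices here because the actor is computed directly from the critic gradient, which mirrors the single-network idea in the summary.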
In this article, a novel neural network (NN) optimal control approach using adaptive critic designs is developed for nonlinear discrete-time (DT) systems with time delays. First, to eliminate the delay term of the control input, a time-delay matrix function is developed by designing an M network. Furthermore, the cost function is approximated by the critic NN, and the control signal can be obtained directly from the critic NN according to the equilibrium condition. In addition, to shorten the learning time and reduce the computational burden in the control process, a novel control strategy with fewer adjustable parameters is proposed for the time-delay DT nonlinear systems, in which the norm of the critic NN weight estimates is updated to generate a novel long-term performance function. The proposed control algorithm thus has the advantage of reducing the number of adaptive learning parameters and lessening the computational burden. Lyapunov stability analysis shows that the controlled time-delay DT system is uniformly ultimately bounded. Finally, three simulations are presented to demonstrate the control performance of the developed method.
KEYWORDS: adaptive critic designs, discrete-time systems, neural network, optimal control, time delay