Abstract-This paper proposes an alternative solution for adaptive optimal tracking control of completely unknown nonlinear systems. First, an adaptive identifier is used to estimate the unknown system dynamics. Then, a recently developed system augmentation approach is adopted to design the optimal control, in which the reference signal is incorporated into the augmented system, so that the feedforward and feedback controls can be obtained simultaneously. A critic neural network (NN) is then used to estimate the augmented performance index and to calculate the optimal control action; thus, the widely used actor NN is not needed. Finally, a new adaptive law recently proposed by the authors is used to update the NN weights online. The closed-loop stability and the convergence of the optimal control are both proved. The feasibility of the proposed approach is demonstrated by a simulation example.
I. INTRODUCTION
The objective of optimal tracking control (OTC) is to design a controller such that the system state or output tracks a given reference in an optimal manner by minimizing a predefined performance index. Directly extending the optimal control schemes used for regulation to the OTC problem is not straightforward [1]. For specific continuous-time linear systems, the OTC may be designed by using the Riccati equation method [1,2]. However, only a few results have been reported for nonlinear systems because it is not trivial to solve the associated Hamilton-Jacobi-Bellman (HJB) equation [3]. Moreover, the direct application of dynamic programming (DP) [5] to the OTC problem also encounters difficulties for high-order systems.
Adaptive dynamic programming (ADP), proposed by Werbos [6], has been developed as a feasible method to address optimal control problems forward-in-time for discrete-time (DT) systems. However, extending ADP methods to continuous-time (CT) systems [7] entails challenges in proving the closed-loop system stability. Moreover, most available ADP results assume that the system dynamics are partially or fully known. To relax these requirements on the system dynamics, Zhang et al. [8] used a neural network (NN) identifier to reconstruct the unknown drift dynamics and proposed an adaptive optimal control. We have also suggested a new 'identifier-