An online adaptive optimal control scheme is proposed for continuous-time nonlinear systems with completely unknown dynamics. It is achieved by developing a novel identifier-critic based approximate dynamic programming (ADP) algorithm with a dual neural network (NN) approximation structure. First, an adaptive NN identifier is designed to obviate the requirement of complete knowledge of the system dynamics, and a critic NN is employed to approximate the optimal value function. Then the optimal control law is computed from the information provided by the identifier NN and the critic NN, so that an actor NN is not needed. In particular, a novel adaptive law design method based on the parameter estimation error is proposed to update the weights of both the identifier NN and the critic NN online and simultaneously; the weights converge to small neighborhoods of their ideal values. The closed-loop system stability and the convergence to a small vicinity of the optimal solution are proved by means of Lyapunov theory. The proposed adaptation algorithm is also improved to achieve finite-time (FT) convergence of the NN weights. Finally, simulation results are provided to exemplify the efficacy of the proposed methods.
The vast majority of available parameter estimation methods assume that the parameters to be estimated are constant or slowly time-varying. They mainly depend on a predictor or observer design, so a large adaptive gain must be used to achieve fast adaptation; this may result in high-frequency oscillations when the system is subject to large uncertainties or disturbances. This paper is concerned with adaptive online estimation of time-varying parameters for two kinds of linearly parameterized nonlinear systems. By dividing time into small intervals, the time-varying parameters are approximated by polynomials with unknown coefficients. Then a novel adaptive law design methodology is developed to estimate those constant coefficients, for which the parameter estimation error information is explicitly derived and used to drive the adaptation. To guarantee the continuity of the parameter estimates for all time, a parameter resetting scheme is introduced at the beginning of each interval. Finite-time estimation convergence and robustness against disturbances are both proved. Extensive simulation examples are provided to demonstrate the efficacy of the proposed algorithms for estimating time-varying parameters.
In practical systems, time-varying behavior of the plant parameters may be unavoidable due to complex mechanisms, model-plant mismatch, or unmeasured inputs that affect the system dynamics [7,8]. Consequently, the estimation of such time-varying behavior is of great importance for control system design. To address this issue, some efforts have been made to exploit the ability of gradient algorithms and LS algorithms to estimate time-varying parameters. In particular, various well-known approaches, for example, gradient algorithms, recursive/nonrecursive LS algorithms, and modified LS algorithms with a constant/variable forgetting factor, are reviewed and compared in terms of performance and robustness [9].
It is shown in [9] that the nonrecursive algorithms are more robust to disturbances under slowly time-varying parameters. In [10], the application of a local regression of a time-related polynomial to the RLS algorithm with a forgetting factor is shown to reduce the mean-square estimation error. However, these algorithms all assume that the parameters to be estimated are slowly time-varying; that is, they may fail to retain their properties for fast time-varying parameters.
It is well known that a continuous time-varying function can be locally approximated by polynomials with constant coefficients. Following this idea, under the assumption that the time-varying parameters can be precisely represented as a linear combination of known functions and unknown constants [11,12], conventional LS estimation has been adopted for time-varying systems. In [13], a local Taylor expansion of a time-related polynomial is used to approximate time-varying parameters, and the induced constant Taylor series coefficients are estimated using a standard LS approach. It is noted that such approxima...
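The idea of approximating a time-varying parameter by a polynomial with constant coefficients on each small interval can be illustrated with a minimal sketch. It assumes a scalar model y(t) = phi(t) * theta(t) with a known regressor phi, and it fits the coefficients on each interval by ordinary batch least squares. This is a swapped-in, simpler technique, not the adaptive law or the resetting scheme of the paper. The function name `estimate_piecewise_poly`, the model, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def estimate_piecewise_poly(t, y, phi, interval, order=2):
    """Estimate theta(t) in y = phi * theta(t), one polynomial per time interval.

    Assumption (illustrative only): on each interval starting at t_k,
    theta(t) ~= c0 + c1*(t - t_k) + ... + c_order*(t - t_k)**order,
    and the constant coefficients are found by batch least squares.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    phi = np.asarray(phi, dtype=float)
    theta_hat = np.empty(len(t))

    t0 = t[0]
    # Index of the interval that each sample falls into.
    idx = np.floor((t - t0) / interval).astype(int)

    for k in np.unique(idx):
        mask = idx == k
        tau = t[mask] - (t0 + k * interval)  # local time inside the interval
        # Columns [1, tau, tau^2, ...]: the polynomial basis for theta(t).
        basis = np.vander(tau, order + 1, increasing=True)
        # y = phi * (basis @ c), so the regressor for c is phi * basis.
        regressor = phi[mask][:, None] * basis
        coeffs, *_ = np.linalg.lstsq(regressor, y[mask], rcond=None)
        theta_hat[mask] = basis @ coeffs

    return theta_hat


if __name__ == "__main__":
    t = np.arange(0.0, 10.0, 0.005)
    theta = 2.0 + np.sin(2.0 * t)        # fast time-varying parameter
    phi = 1.0 + 0.5 * np.sin(3.0 * t)    # known regressor, never zero here
    y = phi * theta                      # noise-free measurements
    theta_hat = estimate_piecewise_poly(t, y, phi, interval=0.5, order=2)
    print("max abs error:", np.max(np.abs(theta_hat - theta)))
```

Because each interval is fitted independently, the estimate in this sketch can jump at interval boundaries. The abstract above addresses that issue with a parameter resetting scheme at the beginning of each interval, which is not reproduced here.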