Trajectory tracking control of wheeled mobile robots (WMRs) has long been an important research topic in control theory and robotics. Although various tracking control methods with stability guarantees have been developed for WMRs, it remains difficult to design optimal or near-optimal tracking controllers under uncertainties and disturbances. In this paper, a near-optimal tracking control method is presented for WMRs based on receding-horizon dual heuristic programming (RHDHP). In the proposed method, a backstepping kinematic controller is designed to generate desired velocity profiles, and a receding-horizon strategy is used to decompose the infinite-horizon optimal control problem into a series of finite-horizon optimal control problems. In each horizon, a closed-loop tracking control policy is successively updated using a class of approximate dynamic programming algorithms called finite-horizon dual heuristic programming (DHP). The convergence of the proposed method is analyzed, and the tracking control system based on RHDHP is shown to be asymptotically stable via a Lyapunov approach. Simulation results on three tracking control problems demonstrate that the proposed method achieves improved control performance compared with conventional model predictive control (MPC) and DHP. It is also illustrated that the proposed method has a lower computational burden than conventional MPC, which is very beneficial for real-time tracking control.
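The receding-horizon decomposition described above can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the paper's model: a scalar toy system stands in for the WMR dynamics, and a crude coordinate-wise policy-improvement sweep stands in for DHP. The structural point it shows is the one in the abstract: in each horizon a closed-loop feedback policy is successively updated, the first control is applied, and the policy is shifted to warm-start the next horizon instead of being re-optimized from scratch.

```python
# Illustrative receding-horizon loop with successive policy updates.
# All dynamics, costs, and gains below are assumptions for the sketch.

def dynamics(x, u):
    return x + 0.1 * u            # toy integrator (stands in for the WMR model)

def stage_cost(x, u):
    return x**2 + 0.1 * u**2      # assumed quadratic tracking cost

def rollout_cost(x0, gains, N):
    """Cost of applying linear feedback u = -k_t * x over an N-step horizon."""
    x, total = x0, 0.0
    for k in gains[:N]:
        u = -k * x
        total += stage_cost(x, u)
        x = dynamics(x, u)
    return total

def improve_policy(x0, gains, N, step=0.05):
    """One crude policy-improvement sweep: nudge each gain if cost decreases."""
    for t in range(N):
        base = rollout_cost(x0, gains, N)
        trial = gains.copy()
        trial[t] += step
        if rollout_cost(x0, trial, N) < base:
            gains = trial
        else:
            trial[t] -= 2 * step  # try the opposite direction
            if rollout_cost(x0, trial, N) < base:
                gains = trial
    return gains

x, N = 2.0, 5
gains = [0.5] * N                 # initial admissible feedback policy
for _ in range(20):               # receding-horizon loop
    for _ in range(3):            # a few successive policy updates per horizon
        gains = improve_policy(x, gains, N)
    x = dynamics(x, -gains[0] * x)    # apply only the first control
    gains = gains[1:] + [gains[-1]]   # shift: warm-start the next horizon
```

Because each horizon starts from the previous horizon's shifted policy, only a few update sweeps are needed per step, which is the source of the computational savings over independently re-solved MPC horizons.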
In this paper, a learning-based predictive control (LPC) scheme is proposed for adaptive optimal control of discrete-time nonlinear systems under stochastic disturbances. The proposed LPC scheme differs from conventional model predictive control (MPC), which uses open-loop optimization or simplified closed-loop optimal control techniques in each horizon. In LPC, the control task in each horizon is formulated as a closed-loop nonlinear optimal control problem, and a finite-horizon iterative reinforcement learning (RL) algorithm is developed to obtain closed-loop optimal or suboptimal solutions. Therefore, in LPC, RL and adaptive dynamic programming (ADP) serve as a new class of closed-loop learning-based optimization techniques for nonlinear predictive control under stochastic disturbances. Moreover, LPC decomposes the infinite-horizon optimal control problem of previous RL and ADP methods into a series of finite-horizon problems, so that computational costs are reduced and learning efficiency is improved. Convergence of the finite-horizon iterative RL algorithm in each prediction horizon and Lyapunov stability of the closed-loop control system are proved. Moreover, by using successive policy updates between adjacent time horizons, LPC has lower computational costs than conventional MPC, whose optimization procedures in different prediction horizons are independent. Simulation results illustrate that, compared with conventional nonlinear MPC as well as ADP, the proposed LPC scheme achieves better performance in terms of both policy optimality and computational efficiency.
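The finite-horizon iterative step inside each prediction horizon can be sketched as a value-iteration recursion. This is a minimal discretized illustration under assumed toy dynamics, stage cost, and grids (none of which come from the paper): starting from V_0 = 0, each iteration builds V_{i+1}(x) = min_u [ c(x, u) + V_i(f(x, u)) ], so after N iterations V_N is the N-step optimal cost-to-go that defines the closed-loop policy for one horizon.

```python
import numpy as np

# Finite-horizon value iteration on a discretized toy system.
# Grids, dynamics, and cost are assumptions for illustration only.

xs = np.linspace(-2.0, 2.0, 41)   # state grid
us = np.linspace(-1.0, 1.0, 21)   # control grid

def f(x, u):
    # toy stable nonlinear-free dynamics, clipped to the grid range
    return np.clip(0.9 * x + 0.2 * u, -2.0, 2.0)

def c(x, u):
    return x**2 + 0.5 * u**2      # assumed stage cost

N = 6                             # prediction-horizon length
V = np.zeros_like(xs)             # V_0 = 0 (no terminal cost)
for _ in range(N):
    Vn = np.empty_like(V)
    for j, x in enumerate(xs):
        # one-step lookahead over all candidate controls,
        # interpolating the previous iterate at the successor state
        q = [c(x, u) + np.interp(f(x, u), xs, V) for u in us]
        Vn[j] = min(q)
    V = Vn                        # V_{i+1} replaces V_i
```

In the LPC scheme the recursion would be carried out with function approximators and warm-started from the preceding horizon's iterate rather than from V_0 = 0; the tabular version above only shows the structure of the finite-horizon backup.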