This article presents a rigorous formulation of the pursuit-evasion (PE) game when velocity constraints are imposed on the agents, or players, of the game. The game is formulated as an infinite-horizon problem with a non-quadratic cost functional, and sufficient conditions are derived that guarantee capture in finite time. A novel tracking Hamilton–Jacobi–Isaacs (HJI) equation associated with the non-quadratic value function is employed and solved for the Nash-equilibrium velocity policies of each agent with arbitrary nonlinear dynamics. In contrast to existing approaches to proving capture in the PE game, the proposed method does not assume that players move at their maximum velocities; instead, it accounts for the velocity constraints a priori. Obtaining the optimal actions requires solving the HJI equation online and in real time. We overcome this difficulty with an on-policy integral reinforcement learning (IRL) technique. The persistence of excitation required for IRL to work is satisfied inherently until capture occurs, at which time the game ends. Furthermore, a nonlinear backstepping control method is proposed to track the desired optimal velocity trajectories for players with generalized Newtonian dynamics. Simulation results demonstrate the validity of the proposed methods.
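To make the IRL idea concrete, below is a minimal sketch of the integral-reinforcement-learning policy-iteration step on a toy scalar single-player system with a quadratic cost. This is a deliberate simplification: the abstract's game is two-player with a non-quadratic value function and nonlinear dynamics, whereas here the dynamics, cost weights, learning interval, and initial gain are all illustrative assumptions chosen so the iteration has a known closed-form answer. The key feature it does illustrate is that policy evaluation uses only cost measured along the trajectory over intervals of length T, via the integral Bellman equation, with no explicit model in the evaluation step.

```python
import numpy as np

# Minimal IRL policy-iteration sketch (illustrative assumptions throughout):
# scalar system x' = a*x + b*u, stage cost q*x^2 + r*u^2, value V(x) = p*x^2.
# Policy evaluation fits p to the integral Bellman equation
#   p*x(t)^2 - p*x(t+T)^2 = int_t^{t+T} (q*x^2 + r*u^2) dtau
# using only cost data accumulated along the trajectory.

a, b, q, r = 1.0, 1.0, 1.0, 1.0   # toy dynamics/cost parameters (assumed)
dt, T = 1e-3, 0.05                # Euler step and IRL reinforcement interval

def simulate(x, k, steps):
    """Roll the closed loop x' = (a - b*k)*x forward, accumulating cost."""
    cost = 0.0
    for _ in range(steps):
        u = -k * x
        cost += (q * x * x + r * u * u) * dt
        x += (a * x + b * u) * dt
    return x, cost

k = 2.0                           # initial stabilizing gain (a - b*k < 0)
steps = int(T / dt)
for _ in range(10):               # policy iteration
    X, Y = [], []
    x = 1.0
    for _ in range(40):           # data collected along one trajectory;
        x_next, c = simulate(x, k, steps)   # excitation persists while x != 0
        X.append(x * x - x_next * x_next)
        Y.append(c)
        x = x_next
    p = np.linalg.lstsq(np.array(X)[:, None], np.array(Y), rcond=None)[0][0]
    # Improvement: u* = -(1/(2r))*b*dV/dx = -(b*p/r)*x, since dV/dx = 2*p*x.
    k = b * p / r

# Reference: the scalar Riccati equation 2*a*p - b^2*p^2/r + q = 0 gives
# p* = (a + sqrt(a^2 + b^2*q/r)) * r / b^2, so k* = b*p*/r = 1 + sqrt(2) here.
p_star = (a + np.sqrt(a * a + b * b * q / r)) * r / b**2
print(k, b * p_star / r)
```

In the paper's game setting the scalar regression above would be replaced by a least-squares fit over a basis for the non-quadratic value function, but the structure of the update, evaluate from measured cost integrals, then improve greedily, is the same.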
This paper briefly reviews the dynamics and control architectures of unmanned vehicles, reinforcement learning (RL) in optimal control theory, and RL-based applications in unmanned vehicles. Nonlinearities and uncertainties in the dynamics of unmanned vehicles (e.g., aerial, underwater, and tailsitter vehicles) pose critical challenges to their control systems. Solving the Hamilton–Jacobi–Bellman (HJB) equation for an optimal controller becomes difficult in the presence of nonlinearities, uncertainties, and actuator faults. RL-based approaches are therefore widely used in unmanned vehicle systems to solve the HJB equation: they learn the optimal solution from online data measured along the system trajectories. This makes them well suited to partially or completely model-free optimal control design and optimal fault-tolerant control design for unmanned vehicle systems.
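The "learn the optimal solution from online data" idea can be sketched with model-free Q-learning policy iteration on a toy scalar discrete-time system. All parameters below are illustrative assumptions (the abstract's vehicle dynamics are nonlinear and uncertain); the point is only the mechanism: the learner never uses the dynamics coefficients (a, b) directly, it fits a quadratic Q-function to measured transitions and improves the feedback gain from the fit.

```python
import numpy as np

# Model-free Q-learning policy-iteration sketch (illustrative assumptions):
# unknown scalar dynamics x_{k+1} = a*x_k + b*u_k, cost q*x^2 + r*u^2.
# The learner fits Q(x,u) = h11*x^2 + 2*h12*x*u + h22*u^2 from data only.

rng = np.random.default_rng(0)
a, b = 1.2, 1.0                  # true dynamics, never used by the learner
q, r = 1.0, 1.0

def features(x, u):
    # Quadratic basis for Q(x, u)
    return np.array([x * x, 2 * x * u, u * u])

K = 1.0                          # initial stabilizing gain (|a - b*K| < 1)
for _ in range(6):               # policy iteration
    Phi, C = [], []
    x = 1.0
    for _ in range(200):
        u = -K * x + 0.5 * rng.standard_normal()  # exploration noise (PE)
        x_next = a * x + b * u   # measured transition; no model identified
        cost = q * x * x + r * u * u
        # Bellman equation for the current policy:
        #   Q(x, u) - Q(x', -K*x') = cost
        Phi.append(features(x, u) - features(x_next, -K * x_next))
        C.append(cost)
        x = x_next
    h11, h12, h22 = np.linalg.lstsq(np.array(Phi), np.array(C), rcond=None)[0]
    K = h12 / h22                # greedy improvement: argmin_u Q(x, u)

# Reference gain from the discrete Riccati recursion (uses the model,
# for comparison only).
p = 1.0
for _ in range(200):
    p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
K_star = a * b * p / (r + b * b * p)
print(K, K_star)
```

The exploration noise plays the role of persistence of excitation; without it the regression matrix is rank-deficient and the Q-function parameters are not identifiable. Fault-tolerant variants follow the same pattern, relearning the Q-function from post-fault data.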