For the pursuit-evasion (PE) game, this paper proposes an evasion strategy that coordinates control of the angle of attack, bank angle, and body morphing using deep reinforcement learning. To account for both evasion and ballistic regression of the aircraft, the specified miss distance (SMD) and residual energy are used as the optimization objectives, in order to obtain the optimal control strategy for an encounter with a pursuer in the terminal guidance phase. Because rewards are sparse and reward shaping cannot be applied to this problem, we modify the DQN algorithm with the mechanism of Monte-Carlo reinforcement learning to improve sampling efficiency and achieve end-to-end learning. Finally, a linear analytical solution of the problem based on SMD is analyzed theoretically and used to compare with and explain the strategy obtained by reinforcement learning.
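The abstract does not spell out how the Monte-Carlo mechanism enters DQN. One common reading, offered here only as an assumption, is that the bootstrapped one-step target is replaced by the full discounted return of a finished episode, so that a sparse terminal reward reaches every earlier step directly. The minimal sketch below computes such return targets; the function name `monte_carlo_returns`, the discount `gamma`, and the example rewards are hypothetical and not taken from the paper.

```python
from typing import List


def monte_carlo_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Discounted return G_t for every step of one finished episode.

    Assumption: these returns would serve as regression targets for Q(s_t, a_t)
    in place of the bootstrapped target r_t + gamma * max_a Q(s_{t+1}, a).
    """
    returns = [0.0] * len(rewards)
    g = 0.0
    # Walk backwards so each step reuses the return of its successor.
    for t in range(len(rewards) - 1, -1, -1):
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns


# Sparse-reward episode: no signal until the terminal step (illustrative values).
episode_rewards = [0.0, 0.0, 0.0, 1.0]
targets = monte_carlo_returns(episode_rewards, gamma=0.5)
print(targets)  # [0.125, 0.25, 0.5, 1.0]
```

With a one-step target, the terminal reward would need several updates to propagate back to the first step; with the return target it is visible at every step after a single episode, which is one plausible source of the improved sampling efficiency the abstract mentions.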