The penetration of unmanned aerial vehicles (UAVs) is an important aspect of UAV games. In recent years, UAV penetration has generally been solved using artificial intelligence methods such as reinforcement learning. However, the high sample demand of the reinforcement learning method poses a significant challenge specifically in the context of UAV games. To improve the sample utilization in UAV penetration, this paper innovatively proposes an improved sampling mechanism called task completion division (TCD) and combines this method with the soft actor critic (SAC) algorithm to form the TCD-SAC algorithm. To compare the performance of the TCD-SAC algorithm with other related baseline algorithms, this study builds a dynamic environment, a UAV game, and conducts training and testing experiments in this environment. The results show that among all the algorithms, the TCD-SAC algorithm has the highest sample utilization rate and the best actual penetration results, and the algorithm has a good adaptability and robustness in dynamic environments.
The penetration of unmanned aerial vehicles (UAVs) is an essential and important link in modern warfare. Enhancing UAV’s ability of autonomous penetration through machine learning has become a research hotspot. However, the current generation of autonomous penetration strategies for UAVs faces the problem of excessive sample demand. To reduce the sample demand, this paper proposes a combination policy learning (CPL) algorithm that combines distributed reinforcement learning and demonstrations. Innovatively, the action of the CPL algorithm is jointly determined by the initial policy obtained from demonstrations and the target policy in the asynchronous advantage actor-critic network, thus retaining the guiding role of demonstrations in the initial training. In a complex and unknown dynamic environment, 1000 training experiments and 500 test experiments were conducted for the CPL algorithm and related baseline algorithms. The results show that the CPL algorithm has the smallest sample demand, the highest convergence efficiency, and the highest success rate of penetration among all the algorithms, and has strong robustness in dynamic environments.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.