Autonomous maneuvering decisions of unmanned aerial vehicles (UAVs) in short-range air combat remain a challenging research topic. This paper proposes a decision method based on an improved deep deterministic policy gradient (DDPG) algorithm. First, the problem model is improved from the perspective of energy air combat, and a decision model with engine thrust, angle of attack, and roll angle as control variables is established. The normal and tangential overloads are determined by these control variables, and the decision is constrained by flight stability and threshold ranges. Subsequently, the learning algorithm for the maneuver command decision is designed based on the DDPG framework. In line with energy air combat, speed is introduced into the return function in some states so that the return value better reflects reality. To address the slow learning speed of the DDPG algorithm, the winning rate is introduced into the ε-greedy strategy to adjust the exploration and exploitation probabilities in real time. To address the drop in computational efficiency caused by the large amount of experience data, similar experiences are excluded based on vector distance. The simulation results show that the DDPG-based algorithm realizes autonomous decisions of engine thrust, roll angle, and angle of attack under constraints, and the comparative simulation shows that the improvement measures are effective.
INDEX TERMS Unmanned aerial vehicle (UAV), maneuvering decision, deep deterministic policy gradient (DDPG), short-range air combat, reinforcement learning (RL)
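The two DDPG adjustments named in the abstract above (a win-rate-driven ε-greedy schedule and distance-based exclusion of similar experiences) can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything here is an assumption for illustration: the linear schedule, the bounds `eps_min` and `eps_max`, the Euclidean metric, the threshold `min_distance`, and all function names are hypothetical.

```python
import math


def epsilon_from_win_rate(win_rate, eps_min=0.05, eps_max=0.9):
    """Hypothetical linear schedule: explore more while the win rate is low.

    The paper's real mapping from win rate to epsilon is not stated in the
    abstract; a linear interpolation is assumed here.
    """
    win_rate = min(max(win_rate, 0.0), 1.0)  # clamp to [0, 1]
    return eps_max - (eps_max - eps_min) * win_rate


def is_novel(candidate, buffer, min_distance):
    """True if candidate is at least min_distance from every stored vector.

    Euclidean distance is an assumption; the abstract only says
    "vector distance".
    """
    return all(math.dist(candidate, stored) >= min_distance for stored in buffer)


def add_experience(buffer, candidate, min_distance=0.1):
    """Append candidate to buffer only if no similar experience is stored."""
    if is_novel(candidate, buffer, min_distance):
        buffer.append(tuple(candidate))
        return True
    return False


if __name__ == "__main__":
    replay = []
    for vec in [(0.0, 0.0), (0.05, 0.0), (1.0, 0.0)]:
        print(vec, "stored" if add_experience(replay, vec) else "skipped")
    print("epsilon at 50% win rate:", epsilon_from_win_rate(0.5))
```

The linear scan in `is_novel` costs time proportional to the buffer size on every insertion; a large replay buffer would likely need a spatial index or a sampled comparison instead.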
When unmanned aerial vehicles (UAVs) are used in search and rescue operations, electro-optical (EO) devices usually serve as the detection equipment, and area coverage is the main search method. However, the sector scanning mode of EO devices places higher demands on task parameter planning. First, to ensure that no coverage is missed, a method to determine the full coverage width of EO equipment in sector scanning mode is proposed. Second, the constraint that no intervals are missed and a model of the speed-to-height ratio constraint are established, and the constraints arising from other factors are addressed in the context of the problem. Third, a coverage efficiency index is proposed for the boustrophedon coverage of a rectangular area, and a comprehensive coverage index is established. Finally, task parameter planning algorithms are designed based on the Immune Algorithm (IA), Grey Wolf Optimization (GWO), and Variable Neighborhood Search (VNS), respectively. The simulation results show that the designed algorithms can effectively solve task planning problems. In general, IA is more suitable for offline planning, VNS is suitable for online real-time planning, and GWO has characteristics between the two. The coverage process based on the optimized parameters meets all constraints, achieves higher search efficiency, and does not miss areas, which supports the correctness of the models and the effectiveness of the planning algorithms. The research presented in this paper provides a technical basis for efficient and fully automated target search and rescue.
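As a rough illustration of boustrophedon coverage of a rectangular area, the sketch below generates back-and-forth lane waypoints from a given coverage width. It assumes a fixed, already-known coverage width (`swath`) and a flat axis-aligned rectangle; it does not model the sector-scanning geometry, the speed-to-height ratio constraint, or the coverage indices from the abstract. The function name and parameters are hypothetical.

```python
import math


def boustrophedon_waypoints(length, width, swath):
    """Return lane end points for back-and-forth coverage of a rectangle.

    length: extent along the lane direction (x axis)
    width:  extent across the lanes (y axis)
    swath:  assumed full coverage width of one pass
    """
    if length <= 0 or width <= 0 or swath <= 0:
        raise ValueError("length, width and swath must be positive")

    # Round the lane count up, then shrink the spacing so that it never
    # exceeds the swath; adjacent lanes therefore leave no uncovered strip.
    n_lanes = math.ceil(width / swath)
    spacing = width / n_lanes

    waypoints = []
    for i in range(n_lanes):
        y = (i + 0.5) * spacing  # lane centre line
        # Alternate direction on every lane to form the boustrophedon path.
        xs = (0.0, length) if i % 2 == 0 else (length, 0.0)
        waypoints.extend((x, y) for x in xs)
    return waypoints


if __name__ == "__main__":
    for point in boustrophedon_waypoints(100.0, 30.0, 10.0):
        print(point)
```

Rounding the lane count up trades a small amount of overlap for a guarantee of no gaps, which mirrors the "no missed coverage" requirement in spirit only; the paper's own parameter planning optimizes these quantities rather than fixing them.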