This paper proposes a feature fusion algorithm for solving the path planning problem of multiple unmanned aerial vehicles (UAVs) under GPS and communication denial conditions. Because GPS and communication are blocked, UAVs cannot obtain the precise position of a target, which causes path planning algorithms to fail. The proposed feature fusion proximal policy optimization (FF-PPO) algorithm, based on deep reinforcement learning (DRL), fuses image recognition information with the original image, enabling multi-UAV path planning without an accurate target location. In addition, the FF-PPO algorithm adopts an independent policy for multi-UAV communication denial environments, which enables distributed control so that multiple UAVs can complete the cooperative path planning task without communication. The success rate of the proposed algorithm exceeds 90% in the multi-UAV cooperative path planning task. Finally, the feasibility of the algorithm is verified in simulation and on hardware.
In this paper, we propose C51-Duel-IP (C51 Dueling DQN with Independent Policy), a dynamic-destination path-planning algorithm for autonomous navigation and collision avoidance of multiple unmanned aerial vehicles (UAVs) in a communication denial environment. The algorithm expresses the Q function output by the Dueling network as a Q distribution, which improves the fitting ability of the Q value. We also extend the single-step temporal difference (TD) to the N-step temporal difference, which solves the problem of inflexible updates in the single-step case. More importantly, we use an independent policy to achieve autonomous avoidance and navigation of multiple UAVs without any communication between them. Under communication denial, the independent policy keeps the behavior of multiple UAVs consistent and avoids greedy behavior by individual UAVs. For multi-UAV dynamic destination scenarios, our work covers two cases: path planning with UAVs taking off from different initial positions, and dynamic path planning with UAVs taking off from the same initial position. Hardware-in-the-loop (HITL) experiment results show that the C51-Duel-IP algorithm is much more robust and effective than the original Dueling-IP and DQN-IP algorithms in an urban simulation environment. Our independent policy algorithm performs similarly to the shared policy, with the significant advantage that it can run in a communication denial environment.
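To illustrate the step from single-step to N-step temporal difference mentioned in this abstract, the following is a minimal sketch of the standard scalar N-step bootstrapped target. It is not the authors' implementation: the function name `n_step_target`, the discount `gamma`, and the scalar `bootstrap_value` are assumptions made for illustration, and the distributional (C51) form of the target is not reproduced here.

```python
def n_step_target(rewards, bootstrap_value, gamma=0.99):
    """Standard N-step TD target (illustrative, hypothetical helper).

    Computes r_0 + gamma*r_1 + ... + gamma^(n-1)*r_(n-1) + gamma^n * bootstrap_value,
    where n = len(rewards) and bootstrap_value is the value estimate of the
    state reached after n steps. With n = 1 this is the single-step TD target.
    """
    target = bootstrap_value
    # Fold the rewards in from the last step back to the first, so each
    # pass applies one more factor of gamma to everything accumulated so far.
    for reward in reversed(rewards):
        target = reward + gamma * target
    return target


# Three rewards of 1.0 followed by a bootstrap estimate of 10.0.
three_step = n_step_target([1.0, 1.0, 1.0], 10.0, gamma=0.5)

# Single-step special case for comparison.
one_step = n_step_target([2.0], 4.0, gamma=0.5)
```

With a longer reward window, the target relies more on observed rewards and less on the bootstrapped estimate, which is the usual motivation for N-step updates.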