Fixed-Wing UAVs flocking in continuous spaces: A deep reinforcement learning approach

Yan, Chao; Xiang, Xiaojia; Wang, Chang

doi:10.1016/j.robot.2020.103594

Cited by 46 publications

(15 citation statements)

References 34 publications

“…To verify the feasibility and effectiveness of our proposed method, we build a UAV hardware-in-the-loop (HITL) real-time simulation system [27] and conduct the flight simulation experiments based on this system.…”

Section: Methods (mentioning)

confidence: 99%

Online Identification-Verification-Prediction Method for Parallel System Control of UAVs

et al. 2021

Self Cite

To efficiently control a large-scale unmanned aerial vehicle (UAV) swarm that performs complex tasks with limited manpower in a non-ideal environment, this paper proposes a parallel UAV swarm control method. The key technology of parallel control is to establish, on the ground, an artificial UAV system in one-to-one correspondence with the airborne UAV swarm. This paper focuses on the computational experiment algorithms used to establish the artificial UAV system, including data processing, model identification, model verification, and state prediction. Furthermore, this paper performs a comprehensive flight mission with four common modes (climbing, level flight, turning, and descending) for verification. The results of the identification experiment show good consistency between the outputs of the refined dynamics model and the real flight data. The prediction experiment results show that the proposed prediction method can generally keep the state prediction error within 10% for about 16 s.

“…Recently, many approaches have been developed to realize flocking navigation for multi-UAV systems. For example, Yan et al [7] considered the leader-followers flocking problem of fixed-wing UAVs in the context of deep reinforcement learning. The followers can always follow the leader closely.…”

Section: Introduction (mentioning)

confidence: 99%

“…Most of the previous studies either predefine the path of every UAV, or give the information of the leader to them [7][8][9][10][11], both of which are hard to realize in practice. Firstly, the mechanism of receiving the path information remotely from the ground station requires a communication device to be equipped on each UAV, which in turn burdens the data transmission load.…”

Section: Introduction (mentioning)

confidence: 99%

Towards Flocking Navigation and Obstacle Avoidance for Multi-UAV Systems through Hierarchical Weighting Vicsek Model

et al. 2021

Self Cite

Flocking navigation and obstacle avoidance in complex environments remain challenging for multiple unmanned aerial vehicle (multi-UAV) systems, especially when only one UAV (termed the information UAV) knows the predetermined path and the communication range is limited. To this end, we propose a hierarchical weighting Vicsek model (HWVEM). In this model, a hierarchical weighting mechanism and an obstacle avoidance mechanism are designed. Based on the hierarchical weighting mechanism, all the UAVs are divided into different layers and assigned different weights according to the layer to which they belong. The purpose is to align the rest of the UAVs with the information UAV more efficiently. Subsequently, an obstacle avoidance mechanism that uses only local information is developed to ensure system safety in an environment filled with obstacles of differing size and shape. A series of simulations has been conducted to demonstrate the high performance of HWVEM in terms of convergence time, success rate, and safety.

“…Unmanned aerial vehicle (UAV) flocking has also been a target for the application of deep reinforcement learning. Using simulation, a flocking controller was trained to control a follower's roll angle and velocity to keep a certain distance from a leader to avoid collisions [11]. In terms of deep reinforcement learning applied to control the attitude of aircraft, DDPG, trust region policy optimisation (TRPO [12]) and proximal policy optimisation (PPO [13]) algorithms have been used for quadrotors [14].…”

Section: Introduction (mentioning)

confidence: 99%

Unmanned Aerial Vehicle Pitch Control under Delay Using Deep Reinforcement Learning with Continuous Action in Wind Tunnel Test

2021

Nonlinear flight controllers for fixed-wing unmanned aerial vehicles (UAVs) can potentially be developed using deep reinforcement learning. However, there is often a reality gap between the simulation models used to train these controllers and the real world. This study experimentally investigated the application of deep reinforcement learning to the pitch control of a UAV in wind tunnel tests, with a particular focus on the effect of time delays on flight controller performance. Multiple neural networks were trained in simulation with different assumed time delays and then tested in the wind tunnel. The neural networks trained with shorter delays tended to be susceptible to delay in the real tests and produced fluctuating behaviour. The neural networks trained with longer delays behaved more conservatively and did not produce oscillations, but suffered steady-state errors under some conditions due to unmodelled frictional effects. These results highlight the importance of performing physical experiments to validate controller performance, and show that the training approach used with reinforcement learning needs to be robust to reality gaps between simulation and the real world.
