2020
DOI: 10.3390/robotics9010008
Sim-to-Real Quadrotor Landing via Sequential Deep Q-Networks and Domain Randomization

Abstract: The autonomous landing of an Unmanned Aerial Vehicle (UAV) on a marker is one of the most challenging problems in robotics. Many solutions have been proposed, with the best results achieved via customized geometric features and external sensors. This paper discusses for the first time the use of deep reinforcement learning as an end-to-end learning paradigm to find a policy for the autonomous landing of UAVs. Our method is based on a divide-and-conquer paradigm that splits the task into sequential sub-tasks, each one a…
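The divide-and-conquer idea in the abstract — splitting the landing task into sequential sub-tasks, each with its own learned policy — can be sketched as a simple dispatcher. All names and sub-tasks below (marker alignment, then descent) are illustrative assumptions, not the authors' implementation:

```python
# Hedged sketch of sequential sub-task policies for autonomous landing.
# Each sub-task (here: horizontal alignment, then vertical descent) would in
# practice be a separately trained DQN; plain functions stand in for them.

def align_policy(observation):
    # placeholder for a sub-policy trained only for marker alignment
    return "move_toward_marker"

def descend_policy(observation):
    # placeholder for a sub-policy trained only for vertical descent
    return "descend"

def select_action(observation, aligned):
    """Dispatch to the sub-policy responsible for the current sub-task."""
    if not aligned:
        return align_policy(observation)
    return descend_policy(observation)

print(select_action({}, aligned=False))  # -> move_toward_marker
print(select_action({}, aligned=True))   # -> descend
```

The design point is that each sub-policy faces a simpler, narrower problem than a single end-to-end policy would, which is what makes the sequential decomposition tractable.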

Cited by 33 publications
(27 citation statements)
References 31 publications
“…In 2013, DeepMind innovatively combined deep learning (DL) with reinforcement learning (RL), creating a new hotspot in artificial intelligence known as DRL [20]. By leveraging the decision-making capabilities of RL and the perception capabilities of DL, DRL has proven effective at controlling UAVs [21][22][23][24][25][26][27][28][29][30][31]. Zhu [21] proposed a framework for target-driven visual navigation that addressed some of the limitations preventing DRL algorithms from being applied in realistic settings.…”
Section: Related Work
confidence: 99%
“…Singla [28] designed a deep recurrent Q-network [34] with temporal attention that exhibited significant improvements over DQN and D3QN [32] for UAV motion planning in cluttered, unseen environments. For the autonomous landing task of a UAV, Polvara [29] introduced a sequential DQN that is comparable with standard DQN and human pilots, while being quantitatively better under noisy conditions. Wang [30] proposed a fast recurrent deterministic policy gradient algorithm to address the UAV's autonomous navigation problem in large-scale complex environments.…”
Section: Related Work
confidence: 99%
“…The target network retains a fixed value while the online Q-network learns for a number of steps, and is periodically reset to the online Q-network's values [35]. This stabilizes learning because the Q-network is updated toward a fixed target rather than a moving one.…”
Section: Introduction To The Algorithms Of Deep Q-Learning
confidence: 99%
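The fixed-target mechanism quoted above can be sketched in a few lines: hold a frozen copy of the online network's parameters, and sync it every few training steps. This is a minimal illustration under assumed names (plain dicts stand in for network weights), not the cited paper's code:

```python
import copy

class DQNWithTarget:
    """Minimal sketch of the fixed-target trick: the target network is held
    constant for several updates, then periodically reset to the online
    Q-network's parameters. Dicts stand in for real network weights."""

    def __init__(self, sync_every=4):
        self.q_params = {"w": 0.0}                         # online network
        self.target_params = copy.deepcopy(self.q_params)  # frozen target copy
        self.sync_every = sync_every
        self.steps = 0

    def train_step(self, gradient=1.0):
        # the online network is updated on every step
        self.q_params["w"] += gradient
        self.steps += 1
        # the target is reset to the online values only periodically
        if self.steps % self.sync_every == 0:
            self.target_params = copy.deepcopy(self.q_params)

agent = DQNWithTarget(sync_every=4)
for _ in range(3):
    agent.train_step()
print(agent.q_params["w"], agent.target_params["w"])  # -> 3.0 0.0 (target stale)
agent.train_step()
print(agent.target_params["w"])                       # -> 4.0 (target synced)
```

Between syncs, the regression target for the Bellman update stays constant, which is exactly why the quoted text describes this as an effective way of learning.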
“…Responding to these needs, various studies and algorithms have been developed for autonomous flight systems. In particular, many machine-learning-based (ML-based) methods have been proposed for autonomous path finding [6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23]. However, they are limited when applied to a large target area.…”
Section: Introduction
confidence: 99%
“…The authors proposed an MEP-DDPG algorithm to address the UAV's autonomous motion planning (AMP) problem. The authors of [18,19] proposed autonomous landing mechanisms for UAVs based on sequential DQN and DDPG algorithms, respectively.…”
Section: Introduction
confidence: 99%