2018
DOI: 10.1109/lra.2018.2869644
Reinforced Imitation: Sample Efficient Deep Reinforcement Learning for Mapless Navigation by Leveraging Prior Demonstrations

Abstract: This work presents a case study of a learning-based approach for target-driven mapless navigation. The underlying navigation model is an end-to-end neural network which is trained using a combination of expert demonstrations, imitation learning (IL), and reinforcement learning (RL). While RL and IL suffer from a large sample complexity and the distribution mismatch problem, respectively, we show that leveraging prior expert demonstrations for pre-training can reduce the training time to reach at least the same…
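The two-stage recipe described in the abstract — pre-train on expert demonstrations with IL, then fine-tune with RL — can be summarized in a short sketch. The network shape, the observation layout (laser scan plus goal), and the `demos` iterable below are illustrative assumptions, not the authors' implementation:

```python
# A minimal sketch of the abstract's recipe: behavioral cloning on expert
# demonstrations, followed by RL fine-tuning. All shapes and names are
# illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn

policy = nn.Sequential(                # obs = 36 range readings + 2-D goal
    nn.Linear(36 + 2, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 2),                  # (linear, angular) velocity command
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def pretrain_on_demos(demos, epochs=10):
    """Imitation stage: regress expert actions from observations."""
    for _ in range(epochs):
        for obs, expert_action in demos:       # batched tensors
            loss = nn.functional.mse_loss(policy(obs), expert_action)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

# The pre-trained `policy` would then be handed to a policy-gradient RL
# algorithm and improved further on environment reward, which is where
# the sample-efficiency gain reported in the paper comes from.
```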

Cited by 158 publications (100 citation statements)
References 25 publications
“…In 2013, DeepMind innovatively combined deep learning (DL) with RL, forming a new hotspot in artificial intelligence known as DRL [20]. By pairing the decision-making capability of RL with the perception capability of DL, DRL has proven effective at controlling UAVs [21][22][23][24][25][26][27][28][29][30][31]. Zhu [21] proposed a framework for target-driven visual navigation; this framework addressed some of the limitations that prevent DRL algorithms from being applied in realistic settings.…”
Section: Related Work
confidence: 99%
“…Tai [24] designed a DRL-based mapless motion planner that can navigate a nonholonomic mobile robot to desired targets without colliding with obstacles. Pfeiffer [25] presented and analyzed an approach that combines the advantages of both imitation learning and DRL for target-driven mapless navigation. Han [26] introduced a double deep Q-network (Double DQN) [32] with prioritized sample replay, which demonstrated better results than DQN [20] and plain Double DQN when UAVs navigated a 3D obstacle-avoidance environment.…”
Section: Related Work
confidence: 99%
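For context on the Double DQN variant mentioned in that statement, here is a minimal sketch of its target computation — the online network selects the next action and the target network evaluates it, which reduces the overestimation bias of vanilla DQN. All names are placeholders, not Han et al.'s implementation:

```python
# Sketch of the Double DQN target: select with the online network,
# evaluate with the target network. Placeholder names throughout.
import torch

def double_dqn_target(reward, next_obs, done, online_net, target_net, gamma=0.99):
    with torch.no_grad():
        best_action = online_net(next_obs).argmax(dim=1, keepdim=True)    # select
        next_q = target_net(next_obs).gather(1, best_action).squeeze(1)   # evaluate
        return reward + gamma * (1.0 - done.float()) * next_q

# A prioritized replay buffer, as in the cited work, would sample
# transitions with probability proportional to |target - Q(s, a)|.
```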
“…Our method builds on two main ideas for improving RL: using expert demonstrations, and a model-based update for the policy gradient. While similar ideas have been explored in the literature [e.g., 34, 36, 17], our formulation is tailored to the NMP setting and is, to the best of our knowledge, novel.…”
Section: Related Work
confidence: 99%
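One generic way to combine the two ideas named above — expert demonstrations and a policy-gradient update — is to add a behavior-cloning term to the RL loss so the policy stays near the demonstrations while improving on reward. This illustrates the pattern only; it is not the cited paper's NMP formulation, and every name below is assumed:

```python
# Generic sketch: policy-gradient loss plus a behavior-cloning penalty.
# Not the cited paper's method; all names are placeholders.
import torch
import torch.nn.functional as F

def demo_augmented_loss(log_probs, advantages, policy_actions, demo_actions,
                        bc_weight=0.1):
    pg_loss = -(log_probs * advantages).mean()            # REINFORCE-style term
    bc_loss = F.mse_loss(policy_actions, demo_actions)    # stay near the demos
    return pg_loss + bc_weight * bc_loss
```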
“…Sparse reward formulations are naturally suited to many goal-oriented manipulation tasks, but they also create challenges, leading to techniques such as augmenting reinforcement signals through reward shaping [20], [21], learning from expert demonstrations [13], [24], [25], and curriculum learning [1]. The latter guides learning by presenting training samples in a meaningful order of increasing complexity, and has been applied to supervised learning for sequence prediction [26] and to RL for acquiring a curriculum of motor skills for an articulated figure [27].…”
Section: Related Work
confidence: 99%
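Of the sparse-reward techniques listed in that statement, reward shaping has a particularly compact canonical form: potential-based shaping (Ng et al., 1999) adds gamma * phi(s') - phi(s) to the reward without changing the optimal policy. The negative-distance-to-goal potential below is an illustrative choice, not the cited papers' design:

```python
# Sketch of potential-based reward shaping (Ng et al., 1999). The
# negative-distance potential is an illustrative assumption.
import numpy as np

def shaped_reward(reward, state, next_state, goal, gamma=0.99):
    phi = lambda s: -np.linalg.norm(np.asarray(s) - np.asarray(goal))
    return reward + gamma * phi(next_state) - phi(state)
```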