2017
DOI: 10.48550/arxiv.1703.09327
Preprint
DART: Noise Injection for Robust Imitation Learning

Cited by 19 publications (24 citation statements). References 12 publications.
“…For Dial, we find that behaviour cloning alone is not able to solve the task and all of our models never generate the maximum return due to the exposure bias. This is consistent with the previous literature on applying behavior cloning to robotics tasks with continuous states (Laskey et al, 2017). More details can be found in Figure 13.…”
Section: Behavior Cloning (supporting)
confidence: 92%
“…Scientifically, this result is valuable evidence about the limitations of pure imitation in the driving domain, especially in light of recent promising results for high-capacity models (Laskey et al (2017a)). But practically, we needed ways to address this challenge without exposing demonstrators to new states actively (Ross et al (2011); Laskey et al (2017b)) or performing reinforcement learning (Kuefler et al (2017)).…”
Section: Introduction (mentioning)
confidence: 99%
“…Prior work has attempted to reduce the number of human annotations needed [10,23,30] but relabeling is still required. Noise injection during expert demonstrations has also been proposed in order to correct for covariate shift [22]. Other paradigms for human-in-the-loop policy learning include collaboration [14,15], teaching the robot through informative sample selection [7,9,17], and leveraging physical kinesthetic corrections [5,6].…”
Section: Related Work (mentioning)
confidence: 99%