2020
DOI: 10.48550/arxiv.2012.06733
Preprint

Human-in-the-Loop Imitation Learning using Remote Teleoperation

Abstract: Imitation Learning is a promising paradigm for learning complex robot manipulation skills by reproducing behavior from human demonstrations. However, manipulation tasks often contain bottleneck regions that require a sequence of precise actions to make meaningful progress, such as a robot inserting a pod into a coffee machine to make coffee. Trained policies can fail in these regions because small deviations in actions can lead the policy into states not covered by the demonstrations. Intervention-based policy…

Cited by 11 publications (33 citation statements) | References 29 publications
“…The proxy Q-value distribution shown in this section not only explains the avoidance behaviors, but also serves as a good indicator of the learned human preference. We benchmark the performance of two human-in-the-loop methods, HG-DAgger (Kelly et al., 2019) and IWR (Mandlekar et al., 2020). Both methods require warming up through behavior cloning on a pre-collected dataset.…”
Section: Discussion (mentioning)
confidence: 99%
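As context for the warm-up step this statement mentions, below is a minimal behavior-cloning pre-training sketch in PyTorch. The network shape, dimensions, and random stand-in dataset are illustrative placeholders assumed for the example, not details taken from any of the cited papers.

```python
import torch
import torch.nn as nn

# Hypothetical behavior-cloning warm-up: fit a policy to a pre-collected
# demonstration dataset before any human-in-the-loop data is gathered.
obs_dim, act_dim = 32, 7  # assumed dimensions, not from the paper
policy = nn.Sequential(
    nn.Linear(obs_dim, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, act_dim),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Stand-in for a pre-collected dataset of (observation, action) pairs.
demo_obs = torch.randn(1024, obs_dim)
demo_act = torch.randn(1024, act_dim)

for epoch in range(50):
    pred = policy(demo_obs)
    loss = nn.functional.mse_loss(pred, demo_act)  # plain BC regression loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```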
“…DAgger (Ross et al., 2011) and its extended methods (Kelly et al., 2019; Zhang & Cho, 2016; Hoque et al., 2021) correct the compounding error (Ross & Bagnell, 2010) of behavior cloning by periodically requesting the expert to provide more demonstrations. Instead of providing demonstrations upon request, Human-Gated DAgger (HG-DAgger) (Kelly et al., 2019), Expert Intervention Learning (EIL) (Spencer et al., 2020), and Intervention Weighted Regression (IWR) (Mandlekar et al., 2020) empower the expert to intervene in exploration and carry the agent to safe states. However, these methods do not impose constraints to reduce human intervention and do not utilize the data from the free exploration of the agent.…”
Section: Related Work (mentioning)
confidence: 99%
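To make the intervention-based scheme this statement describes concrete, here is a brief Python sketch of a human-gated rollout plus an intervention-weighted imitation loss. The `env` and `human` interfaces (`reset`, `step`, `wants_control`, `act`) are hypothetical placeholders, and the equal-weighting scheme is one simplified reading of IWR, not the paper's exact formulation.

```python
import torch

def run_episode(policy, env, human):
    """Roll out the policy while a human supervisor may take over.

    `env` and `human` are hypothetical stand-ins for a robot environment
    and a remote teleoperation client; they are not APIs from the paper.
    """
    data, obs = [], env.reset()
    for _ in range(env.horizon):
        if human.wants_control(obs):  # human-gated takeover, as in HG-DAgger
            action, intervened = human.act(obs), True
        else:
            with torch.no_grad():
                action, intervened = policy(obs), False
        data.append((obs, action, intervened))
        obs = env.step(action)
    return data

def iwr_loss(policy, obs, act, intervened):
    """Intervention-weighted regression (simplified reading of IWR):
    re-weight samples so the intervention and non-intervention subsets
    contribute roughly equally, keeping rare human corrections from
    being drowned out by on-policy data."""
    n_total = intervened.numel()
    n_int = int(intervened.sum())
    w_int = n_total / (2.0 * max(n_int, 1))
    w_auto = n_total / (2.0 * max(n_total - n_int, 1))
    weights = torch.where(
        intervened,
        torch.full_like(intervened, w_int, dtype=torch.float32),
        torch.full_like(intervened, w_auto, dtype=torch.float32),
    )
    per_sample = ((policy(obs) - act) ** 2).mean(dim=-1)  # squared-error BC loss
    return (weights * per_sample).mean()
```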
“…By incorporating humans into training, previous works successfully improve performance on visual-input control tasks such as Atari games [1, 51, 64]. Robotic control tasks also benefit from human feedback [28, 39, 58, 46, 65]. The other category is to have humans in the loop at both training and test time to accurately accomplish human-assistive tasks.…”
Section: Related Work (mentioning)
confidence: 99%
“…are several works showing how humans can interactively teach robotic agents, for example Saxena et al. (2014); Paxton et al. (2017); Mandlekar et al. (2018); Cabi et al. (2019); Mandlekar et al. (2020). In Saxena et al. (2014), the authors demonstrate large-scale crowd-sourcing of data for the perceptual and knowledge-base components of a robotics system.…”
(mentioning)
confidence: 99%