2018 IEEE International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2018.8461249

Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation

Abstract: Imitation learning is a powerful paradigm for robot skill acquisition. However, obtaining demonstrations suitable for learning a policy that maps from raw pixels to actions can be challenging. In this paper we describe how consumer-grade Virtual Reality headsets and hand tracking hardware can be used to naturally teleoperate robots to perform complex tasks. We also describe how imitation learning can learn deep neural network policies (mapping from pixels to actions) that can acquire the demonstrated skills. Ou…
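
The pipeline the abstract describes, collecting teleoperated demonstrations and then training a deep network that maps raw pixels to actions by supervised regression (behavior cloning), can be sketched compactly. The sketch below is an illustrative assumption, not the paper's actual architecture: the conv stack, the 84x84 input resolution, the PixelPolicy/bc_loss names, and the 7-dimensional action (an assumed 6-DoF end-effector delta plus a gripper command) are all placeholders.

```python
# Minimal behavior-cloning sketch (PyTorch): a CNN policy regressed onto
# demonstrated actions. All shapes and names here are illustrative
# assumptions, not the paper's reported architecture.
import torch
import torch.nn as nn

ACTION_DIM = 7  # assumed: 6-DoF end-effector delta + gripper command

class PixelPolicy(nn.Module):
    def __init__(self, action_dim=ACTION_DIM):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        # For an 84x84 RGB frame the conv stack yields 64 x 7 x 7 features.
        self.head = nn.Sequential(
            nn.Linear(64 * 7 * 7, 256), nn.ReLU(),
            nn.Linear(256, action_dim),
        )

    def forward(self, obs):
        # obs: (batch, 3, 84, 84) in [0, 1]; returns (batch, action_dim)
        return self.head(self.encoder(obs))

def bc_loss(policy, frames, actions):
    # Behavior cloning = supervised regression onto demonstrated actions.
    return nn.functional.mse_loss(policy(frames), actions)

# One gradient step on a placeholder batch standing in for
# (frame, action) pairs harvested from VR-teleoperated demonstrations.
policy = PixelPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
demo_frames = torch.rand(16, 3, 84, 84)
demo_actions = torch.randn(16, ACTION_DIM)
opt.zero_grad()
loss = bc_loss(policy, demo_frames, demo_actions)
loss.backward()
opt.step()
```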

Cited by 411 publications (314 citation statements) | References 42 publications
“…In this work we concentrate on the former. The majority of work in behaviour cloning operates on a set of configuration-space trajectories that can be collected via tele-operation [15], [16], kinesthetic teaching [17], [18], sensors on a human demonstrator [19], [20], [21], [22], through motion planners [5], or even by observing humans directly. Expanding further on the latter, learning by observing humans has previously been achieved through hand-designed mappings between human actions and robot actions [1], [2], [23], visual activity recognition and explicit hand-tracking [24], [25], and more recently by a system that infers actions from a single video of a human via an end-to-end trained system [4].…”
Section: Related Work
confidence: 99%
“…A comparison of various control interfaces shows that general purpose hardware is deficient while special purpose hardware is more accurate but is not widely available [19,26]. Virtual reality-based free-space controllers have recently been proposed both for data collection [23,39] and policy learning [40,43]. While these methods have shown the utility of data, they do not provide a seamlessly scalable data collection mechanism.…”
Section: Related Work
confidence: 99%
“…While these methods have shown the utility of data, they do not provide a seamlessly scalable data collection mechanism. Often the data is either collected locally or requires a powerful local client computer to render the high-definition sensor stream to a VR headset [39,43]. The use of VR hardware and the requirement of client-side compute resources have limited the deployment of these interfaces on crowdsourcing platforms.…”
Section: Related Work
confidence: 99%
“…liuboyi17@mails.ucas.edu.cn; lj.wang1@siat.ac.cn 2 Ming Liu is with the Department of ECE, Hong Kong University of Science and Technology. eelium@ust.hk 3 Boyi Liu is also with the University of Chinese Academy of Sciences. 4 Cheng-Zhong Xu is with the University of Macau.…”
Section: Introduction
confidence: 99%