2019 International Conference on Robotics and Automation (ICRA) 2019
DOI: 10.1109/icra.2019.8793485
Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks

Abstract: Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback. However, it is non-trivial to manually design a robot controller that combines modalities with very different characteristics. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally intractable to deploy on real robots due to sample complexity. We use self-supervision to learn a compact and multimodal representation of …

Cited by 256 publications (139 citation statements). References 55 publications.
“…The Deterministic model is based on the model proposed in our previous work [39], which does not use a probabilistic graphical model framework. Instead we use deterministic encoders to learn the representation and deterministic decoders to predict the same self-supervised objectives.…”
Section: A Deterministic Model
confidence: 99%
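The snippet above describes a deterministic variant of the paper's architecture: per-modality encoders produce a compact fused representation, and deterministic decoder heads are trained on self-supervised targets. A minimal NumPy sketch of that data flow is below; all layer sizes, modality dimensions, and head names (e.g. `pose_head`, `contact_head`) are illustrative assumptions, not the paper's actual hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    # Parameters of a simple dense layer (weights, bias); random init for the sketch.
    return rng.standard_normal((in_dim, out_dim)) * 0.1, np.zeros(out_dim)

def forward(x, layer):
    # Dense layer followed by a tanh nonlinearity.
    W, b = layer
    return np.tanh(x @ W + b)

# Hypothetical per-modality encoders: flattened image features,
# a window of force-torque readings, and proprioception.
img_enc = linear(128, 32)
force_enc = linear(32, 16)
proprio_enc = linear(8, 8)

# Fusion layer: concatenate the modality codes into one compact representation.
fusion = linear(32 + 16 + 8, 24)

# Deterministic decoder heads for self-supervised objectives
# (illustrative targets: next end-effector position, binary contact).
pose_head = linear(24, 3)
contact_head = linear(24, 1)

def encode(img, force, proprio):
    # Encode each modality, concatenate, then fuse into one representation.
    z = np.concatenate([forward(img, img_enc),
                        forward(force, force_enc),
                        forward(proprio, proprio_enc)])
    return forward(z, fusion)

# One forward pass on random inputs of the assumed shapes.
z = encode(rng.standard_normal(128),
           rng.standard_normal(32),
           rng.standard_normal(8))
next_pose = forward(z, pose_head)
contact_logit = z @ contact_head[0] + contact_head[1]
```

In training, the decoder-head errors would be backpropagated into the encoders so the fused representation `z` captures information shared across vision, touch, and proprioception; the sketch shows only the forward pass.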
“…In our previous work [39], we have used the same robot for real-world experiments. Here, we use the Franka Panda robot (also with 7-DoF, torque-controlled) to emphasize that the results reported in [39] are reproducible on different hardware. Four sensor modalities are available in both simulation and real hardware, including proprioception, an RGB-D camera, and a force-torque sensor.…”
Section: Experiments: Design and Setup
confidence: 99%