2021
DOI: 10.48550/arxiv.2108.05877
Preprint

DexMV: Imitation Learning for Dexterous Manipulation from Human Videos

Abstract: We propose to perform imitation learning for dexterous manipulation from human demonstration videos. We record human videos of manipulation tasks (1st row) and perform 3D hand-object pose estimation on the videos (2nd row) to construct the demonstrations. A paired simulation system provides the same dexterous manipulation tasks for a multi-finger robot hand (3rd row), including relocate, pour, and place inside, which we solve using imitation learning with the inferred demonstrations.
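Below is a minimal sketch of the three-stage pipeline the abstract describes (video → 3D hand-object pose estimation → robot-hand demonstrations). Every function here is a hypothetical stub, not the authors' API: a real system would substitute learned pose estimators and the paper's paired simulator.

```python
import numpy as np

# Hypothetical stubs standing in for the paper's components.
def estimate_hand_object_poses(frame: np.ndarray):
    """Placeholder for 3D hand-object pose estimation on one RGB frame."""
    hand_pose = np.zeros(51)   # e.g. MANO-style hand parameters
    object_pose = np.zeros(7)  # position (3) + orientation quaternion (4)
    return hand_pose, object_pose

def retarget_to_robot_hand(hand_pose: np.ndarray) -> np.ndarray:
    """Placeholder for mapping a human hand pose to robot joint angles."""
    return np.zeros(30)        # e.g. a 30-DoF multi-finger hand

def build_demonstrations(video_frames):
    """Convert a human manipulation video into per-frame robot demonstrations
    that an imitation-learning algorithm can consume in simulation."""
    demos = []
    for frame in video_frames:
        hand_pose, object_pose = estimate_hand_object_poses(frame)
        robot_qpos = retarget_to_robot_hand(hand_pose)
        demos.append({"robot_qpos": robot_qpos, "object_pose": object_pose})
    return demos

if __name__ == "__main__":
    frames = [np.zeros((480, 640, 3))] * 10   # dummy 10-frame video
    print(len(build_demonstrations(frames)))  # -> 10 demonstration steps
```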

Cited by 8 publications (16 citation statements)
References 55 publications
“…For articulated objects, we additionally provide part annotations similar to PartNet [34]. Worth mentioning, by providing object meshes, HOI4D could facilitate research in instance-level HOI and also makes it possible to transfer the human interaction trajectories to a simulation environment for applications such as robot imitation learning [38]. Label propagation.…”
Section: Category-level Pose Annotation (mentioning)
confidence: 99%
“…Inspired by DexMV [38], we divide the demonstration collection process into three steps named hand joint retargeting, state-only demonstration collection and state-action demonstration collection. We transform the human hand pose represented as 51 DoF MANO model [39] to 30 DoF Adroit Hand pose in the hand joint retargeting step.…”
Section: D2 Demonstration Collection (mentioning)
confidence: 99%
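The retargeting step quoted above (51-DoF MANO hand → 30-DoF Adroit Hand) is commonly posed as a per-frame optimization that matches fingertip positions. The sketch below illustrates that idea under stated assumptions: `robot_fingertips` is a toy linear stand-in for the robot's forward kinematics, and the five fingertip targets are assumed to come from the fitted MANO hand; neither is the cited paper's actual implementation.

```python
import numpy as np
from scipy.optimize import minimize

N_DOF = 30  # Adroit Hand joint count, per the quoted statement

def robot_fingertips(qpos: np.ndarray) -> np.ndarray:
    """Toy forward kinematics: 30 joint angles -> 5 fingertip positions.
    A real implementation would query the simulator's kinematic chain."""
    J = np.random.default_rng(0).standard_normal((15, N_DOF)) * 0.01
    return (J @ qpos).reshape(5, 3)

def retarget_frame(human_tips: np.ndarray, q_prev: np.ndarray) -> np.ndarray:
    """Find joint angles whose fingertips match the human fingertip targets,
    with a small smoothness penalty toward the previous frame's solution."""
    def cost(q):
        tip_err = np.sum((robot_fingertips(q) - human_tips) ** 2)
        smoothness = 1e-2 * np.sum((q - q_prev) ** 2)
        return tip_err + smoothness
    return minimize(cost, q_prev, method="L-BFGS-B").x

human_tips = np.zeros((5, 3))                    # dummy fingertip targets (m)
q = retarget_frame(human_tips, np.zeros(N_DOF))  # warm-start from rest pose
print(q.shape)                                   # (30,)
```

Chaining these per-frame solutions gives a state-only demonstration; recovering the actions that connect consecutive states (e.g., via inverse dynamics or a tracking controller) would then yield the state-action demonstrations the statement mentions.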
“…It was not until recently that data-driven approaches have begun to promote research on learning human manipulation [2,15,20,36,44,61]. Prior work has tried to empower a machine with complex skills such as hand-object localization [46], pose estimation [30], grasp generation [11], and action imitation [42].…”
Section: Introduction (mentioning)
confidence: 99%