2021
DOI: 10.48550/arxiv.2109.12098
Preprint

CLIPort: What and Where Pathways for Robotic Manipulation

Mohit Shridhar,
Lucas Manuelli,
Dieter Fox

Abstract: How can we imbue robots with the ability to manipulate objects precisely but also to reason about them in terms of abstract concepts? Recent works in manipulation have shown that end-to-end networks can learn dexterous skills that require precise spatial reasoning, but these methods often fail to generalize to new goals or quickly learn transferable concepts across tasks. In parallel, there has been great progress in learning generalizable semantic representations for vision and language by training on large-s…

Cited by 5 publications
(8 citation statements)
References 43 publications
“…Moreover, our approach can learn a concept hierarchy starting from zero known concepts, displaying the adaptability of our model under a continual learning setup. Previous work has focused on language conditioned manipulation (Shridhar, Manuelli, and Fox 2021;Liu et al 2021;Brohan et al 2023b,a). Shridhar, Manuelli, and Fox 2021 computes a pick and place location conditioned on linguistic and visual inputs.…”
Section: Related Work (mentioning)
confidence: 99%
“…Human Subjects Experiment Domain We evaluate our approach with the human-subjects experiment in a 2-D object rearrangement domain, which is a problem commonly used in language grounding and HRI research (Liu et al 2021;Shridhar, Manuelli, and Fox 2021). The domain we choose for this study is the House-Construction domain which we introduce in Domains Section.…”
Section: Human-subjects Experiments (mentioning)
confidence: 99%
“…They demonstrated that the final goal-conditioned policy performed extremely well. [223] proposed CLIPort, also based on a large-scale unsupervised representation learning model named CLIP [196], which demonstrated surprisingly good generality when trained with a few demonstrations for natural-language-conditioned robotic manipulation.…”
Section: Interactive Grasp Synthesis (mentioning)
confidence: 99%
“…CLIP Extensions Despite CLIP [8] being fairly new, multiple derivative works across different sub-fields have emerged. CLIP was combined with a GAN to modify images based on a text prompt [43] and in robotics to generalize to unseen objects in manipulations tasks [44]. Other work focused on understanding CLIP in more detail.…”
Section: Referring Expression Segmentation (mentioning)
confidence: 99%