2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021
DOI: 10.1109/iccv48922.2021.00674

Where2Act: From Pixels to Actions for Articulated 3D Objects

Cited by 85 publications (46 citation statements)
References 34 publications
“…Researchers have used other forms of supervision (strong supervision, weak supervision, imitation learning, reinforcement learning, inverse reinforcement learning) to build interactive understanding of objects. This can be in the form of learning a) where and how to grasp [9,21,26,27,30,35,36,39,43,51], b) state classifiers [25], c) interaction hotspots [15,42,44,61], d) spatial priors for action sites [46], e) object articulation modes [12,38], f) reward functions [29,31,50,52], g) functional correspondences [34]. While our work pursues similar goals, we differ in our supervision source (observation of human hands interacting with objects in egocentric videos).…”
Section: Related Work (mentioning)
confidence: 99%
“…This is crucial because in the beginning, when there are much fewer positive examples than negative examples and the dataset is imbalanced, the model may converge to a suboptimal solution in which all values in the output are close to 0. This technique is also used in other work with similar problems [30,48]. We use a similar strategy for balancing data across different tasks.…”
Section: Training (mentioning)
confidence: 99%
“…Prior work has avoided this roadblock in two ways: (1) with human supervision [19,2]; or (2) by greatly constraining the space of possible actions [24,7,8,21,20]. Although labelled data (e.g., keypoint annotations where an object should be grasped and interacted with) remove the need to sample actions, they can be expensive, time-consuming to collect and may encode irrelevant human biases.…”
Section: Introduction (mentioning)
confidence: 99%