2022 International Conference on 3D Vision (3DV) 2022
DOI: 10.1109/3dv57658.2022.00047

Reconstructing Action-Conditioned Human-Object Interactions Using Commonsense Knowledge Priors

Abstract: We present a method for inferring diverse 3D models of human-object interactions from images. Reasoning about how humans interact with objects in complex scenes from a single 2D image is a challenging task given ambiguities arising from the loss of information through projection. In addition, modeling 3D interactions requires the generalization ability towards diverse object categories and interaction types. We propose an action-conditioned modeling of interactions that allows us to infer diverse 3D arrangemen…

Cited by 12 publications (2 citation statements)
References 36 publications
“…(Huang et al 2022) create new datasets containing RGBD videos and pseudo GT 3D human and rigid object models. With the help of BEHAVE, (Xie, Bhatnagar, and Pons-Moll 2022; Wang et al 2022; Xie, Bhatnagar, and Pons-Moll 2023) get better performance on single-view HOI 3D reconstruction. However, there is still an ignored gap from 2D to 3D, that is humans interacting with articulated objects.…”
Section: Related Work (mentioning)
confidence: 99%
“…1, are part of our daily routines. Being able to synthesize such interactions in a virtual 3D environment through textual instructions has widespread applications in several areas, including computer graphics and robotics [ALNM20; HTBT22; WLK*22], movie script visualization [HMLC09] and game design [SSR07]. For instance, in a digitally created movie scene or a virtual role‐playing game, it is natural for the character to interact with the scene objects based on a set of instructions, such as yielding tools, using objects, or eating various items.…”
Section: Introduction (mentioning)
confidence: 99%