2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr42600.2020.01072

LatentFusion: End-to-End Differentiable Reconstruction and Rendering for Unseen Object Pose Estimation

Cited by 94 publications (113 citation statements)
References 27 publications
“…These works can generalize to unseen objects, but still require 3D object models. Park et al [19] relax this assumption by estimating a 3D geometric model by learning a 3D object representation that enforces consistency across multiple views. Then, this estimated 3D object model can be rendered as a depth image of the object in a desired pose.…”
Section: Related Work
confidence: 99%
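The pipeline described in the citation above can be illustrated with a minimal sketch: encode a few reference views of a novel object into a shared latent 3D volume, then decode a depth image from that volume. All module names, layer sizes, the averaging-based view fusion, and the orthographic depth rendering below are illustrative assumptions, not the paper's actual architecture; a real system would unproject features along camera rays using each view's pose and transform the volume into the query pose before rendering.

```python
import torch
import torch.nn as nn


class LatentVolumeRenderer(nn.Module):
    """Toy stand-in for a LatentFusion-style pipeline (hypothetical layers/sizes)."""

    def __init__(self, feat_dim=16, grid=16):
        super().__init__()
        self.grid = grid
        # 2D encoder: image -> per-pixel features (toy).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1),
        )
        # 3D network over the fused latent volume (toy), predicting per-voxel occupancy.
        self.volume_net = nn.Sequential(
            nn.Conv3d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv3d(feat_dim, 1, 3, padding=1),
        )

    def lift(self, feats):
        # Placeholder "unprojection": tile 2D features along a depth axis.
        # A real system would unproject along camera rays using each view's pose.
        n, c, _, _ = feats.shape
        f = nn.functional.interpolate(feats, size=(self.grid, self.grid))
        return f.unsqueeze(2).expand(n, c, self.grid, self.grid, self.grid)

    def forward(self, views):
        # views: (B, V, 3, H, W) reference images of a novel object.
        b, v = views.shape[:2]
        feats = self.encoder(views.flatten(0, 1))              # (B*V, C, H, W)
        vols = self.lift(feats).reshape(b, v, -1, self.grid, self.grid, self.grid)
        fused = vols.mean(dim=1)                                # fuse views by averaging
        occ = torch.sigmoid(self.volume_net(fused))             # (B, 1, D, H, W)
        # Toy orthographic depth rendering in the canonical frame: expected depth
        # along z.  A real renderer would first transform the volume into the query pose.
        z = torch.linspace(0, 1, self.grid).view(1, 1, -1, 1, 1)
        weights = occ / (occ.sum(dim=2, keepdim=True) + 1e-6)
        return (weights * z).sum(dim=2)                         # (B, 1, H, W)


model = LatentVolumeRenderer()
depth = model(torch.rand(1, 4, 3, 64, 64))  # 4 reference views of one object
print(depth.shape)                          # torch.Size([1, 1, 16, 16])
```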
“…Thus, it allows for directly synthesizing the object appearance, eliminating the intermediate step of reconstructing the object in 3D. LatentFusion [37] proposes a 3D latent space based object representation for unseen object pose estimation. In contrast to ours, it requires multi-view imagery of the test object to form the latent space and depth measurements at test time.…”
Section: Object Pose Estimation
confidence: 99%
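The test-time use of depth measurements mentioned in this citation is typically a render-and-compare loop: a candidate pose is refined by gradient descent so that the depth rendered from the object representation matches the observed depth. The sketch below only illustrates that outer loop; the render_depth function is a hypothetical smooth stand-in for a differentiable renderer, not the paper's, and the pose is reduced to a 3D translation for brevity.

```python
import torch
import torch.nn.functional as F


def render_depth(translation, size=32):
    # Toy differentiable "renderer": a smooth depth blob whose image position
    # and depth scale follow a 3D translation (hypothetical stand-in).
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, size), torch.linspace(-1, 1, size), indexing="ij"
    )
    cx, cy, tz = translation[0], translation[1], translation[2]
    blob = torch.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / 0.1)
    return blob * (1.0 + tz)


# Observed depth of the object at the (unknown) true translation.
true_t = torch.tensor([0.3, -0.2, 0.5])
observed = render_depth(true_t)

# Render-and-compare: optimize the pose so the rendered depth matches the observation.
pose = torch.zeros(3, requires_grad=True)
opt = torch.optim.Adam([pose], lr=0.05)
for step in range(200):
    opt.zero_grad()
    loss = F.mse_loss(render_depth(pose), observed)
    loss.backward()
    opt.step()

print(pose.detach(), "vs true", true_t)  # should land near the true translation
```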
“…In contrast to ours, it requires multi-view imagery of the test object to form the latent space and depth measurements at test time. In contrast to both NOCS [50] and LatentFusion [37], our model enables 3D object pose estimation from a single RGB image as input.…”
Section: Object Pose Estimation
confidence: 99%
“…Early works [10,11] directly apply 2D CNN on RGB-D image for 6DoF pose regression, while 2D CNN does not characterize 3D geometric information well. To better explore the 3D geometric information, the 3D space corresponding to depth image is divided into voxel grids, and then 3D CNN is applied on voxels [18][19][20]. Higher-dimensional convolution on 3D voxel requires huge computational resources.…”
Section: Related Work
confidence: 99%
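The voxel-based pipeline mentioned in this citation can be sketched as follows: back-project the depth image into camera-space points, bin the points into an occupancy grid, and apply a small 3D CNN to the grid. The camera intrinsics, grid resolution, and the pose-regression head below are placeholder assumptions for illustration, not taken from the cited works.

```python
import torch
import torch.nn as nn


def depth_to_voxels(depth, fx=100.0, fy=100.0, cx=32.0, cy=32.0, grid=32):
    # Back-project the depth image into camera-space points (placeholder intrinsics),
    # then mark the grid cells containing at least one point as occupied.
    H, W = depth.shape
    v, u = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = torch.stack([x, y, z], dim=-1).reshape(-1, 3)
    pts = pts[pts[:, 2] > 0]                                   # drop zero-depth pixels
    lo, hi = pts.min(dim=0).values, pts.max(dim=0).values
    idx = ((pts - lo) / (hi - lo + 1e-6) * (grid - 1)).long()
    vox = torch.zeros(grid, grid, grid)
    vox[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return vox


# Minimal 3D CNN over the voxel grid, e.g. as a toy pose-regression backbone.
net = nn.Sequential(
    nn.Conv3d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv3d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool3d(1), nn.Flatten(),
    nn.Linear(16, 7),                      # e.g. quaternion + translation
)

depth = torch.rand(64, 64) + 0.5           # fake depth image in meters
pose = net(depth_to_voxels(depth)[None, None])  # add batch and channel dims
print(pose.shape)                           # torch.Size([1, 7])
```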