2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021
DOI: 10.1109/cvpr46437.2021.00467

Unsupervised Learning of 3D Object Categories from Videos in the Wild

citations
Cited by 51 publications
(45 citation statements)
references
References 30 publications
“…Neural radiance fields. Recently, a variety of methods based on NeRF [94] have become popular for novel view synthesis [5,39,53,63,77,79,80,85,99,107,115,124,136,141], 3D reconstruction [8,8,15,27,35,54,60,61,102,116,132,139,142,143], generative modeling [70,90,100,121] and semantic segmentation [149]. The majority of these models demonstrate impressive results on novel view synthesis but are only applicable in the single-scene setting.…”
Section: Related Work (mentioning)
confidence: 99%
“…A long-standing objective in computer vision has been to understand the 3D structure of scenes and objects from a single image. Many works have approached this problem by encoding a relationship between appearance and structure using prior knowledge of that structure as supervision [16,20,24,39]. Until recently however, the problem of deriving such a model from only single-view observations has remained very difficult.…”
Section: Related Work (mentioning)
confidence: 99%
“…The densities for each point are predicted by aggregating information from all other points using a single attention layer. Similarly for category-specific reconstruction, NerFormer [38] proposed to replace the MLP in NeRF-WCE [23] with a transformer model to allow for spatial reasoning. Our work crucially differs from these methods at their core: the rendering framework.…”
Section: Related Work (mentioning)
confidence: 99%