2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00873

FrameNet: Learning Local Canonical Frames of 3D Surfaces From a Single RGB Image

Abstract: In this work, we introduce the novel problem of identifying dense canonical 3D coordinate frames from a single RGB image. We observe that each pixel in an image corresponds to a surface in the underlying 3D geometry, where a canonical frame can be identified as represented by three orthogonal axes, one along its normal direction and two in its tangent plane. We propose an algorithm to predict these axes from RGB. Our first insight is that canonical frames computed automatically with recently introduced directi…
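
As an illustration of the frame described in the abstract (one axis along the surface normal, two in the tangent plane), the following is a minimal sketch of building such an orthonormal frame at a single surface point. It is not the paper's method: the function `local_frame` and its `tangent_hint` argument are hypothetical names introduced here, and the tangent direction is obtained by a plain Gram-Schmidt projection rather than by any learned prediction.

```python
import math


def normalize(v):
    """Return v scaled to unit length; reject the zero vector."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return tuple(c / norm for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def local_frame(normal, tangent_hint):
    """Hypothetical helper: build (t1, t2, n) from a normal and a rough tangent.

    n  is the unit normal,
    t1 is tangent_hint with its normal component removed (Gram-Schmidt),
    t2 = n x t1 completes a right-handed orthonormal frame.
    A hint exactly parallel to the normal has no tangent component and
    raises ValueError; a nearly parallel hint gives an unstable t1.
    """
    n = normalize(normal)
    along_normal = dot(tangent_hint, n)
    t1 = normalize(tuple(h - along_normal * c for h, c in zip(tangent_hint, n)))
    t2 = cross(n, t1)
    return t1, t2, n


if __name__ == "__main__":
    t1, t2, n = local_frame((0.0, 0.0, 2.0), (1.0, 0.0, 1.0))
    print(t1, t2, n)
```

The two tangent axes are only defined up to a rotation about the normal, which is why the sketch needs a direction hint to pick one particular frame.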

Citations: cited by 50 publications (43 citation statements)
References: 51 publications (113 reference statements)
“…3D generation and reconstruction has been studied extensively in the computer vision and graphics communities (Saxena et al 2009; Chaudhuri et al 2011; Kalogerakis et al 2012; Chang et al 2015; Rezende et al 2016; Soltani et al 2017; Kulkarni et al 2015; Tulsiani et al 2016; Huang et al 2019; Jiang et al 2019). Most methods in the literature focus on recovering the 3D structure from 2D images by using explicit 3D supervision.…”
Section: Single View 3D Reconstruction and Generation (mentioning)
confidence: 99%
“…Huang et al [28] propose FrameNet, a model to learn a canonical frame from a RGB image, where a canonical frame is represented by three orthogonal directions, one along the normal direction and two in the tangent plane. Dasgupta et al [29] propose a novel method called DeLay for room layout estimation from a single monocular RGB image that uses a CNN model to generate an initial belief map, which is then used by an optimization algorithm to predict the final room layout.…”
Section: B Monocular Depth Estimation (mentioning)
confidence: 99%
“…Unfortunately, such method assumes the ground plane is always visible in the images, and only applies to vehicle-control use cases. In addition, there are recent work making use of local surface frame representation for a variety of 3D scene understanding tasks [20,21]. However, our method extends beyond these ideas by estimating both local and global aligned surface geometry from single images and use such correspondences to estimate camera orientation.…”
Section: Related Work (mentioning)
confidence: 99%
“…We find this choice leads to the best performance in our experiments. However, other surface frame representations could also be used [20,21].…”
Section: Approach (mentioning)
confidence: 99%