2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
DOI: 10.1109/cvpr.2018.00405
Im2Pano3D: Extrapolating 360° Structure and Semantics Beyond the Field of View

Abstract: We present Im2Pano3D, a convolutional neural network that generates a dense prediction of 3D structure and a probability distribution of semantic labels for a full 360° panoramic view of an indoor scene when given only a partial observation (≤ 50%) in the form of an RGB-D image. To make this possible, Im2Pano3D leverages strong contextual priors learned from large-scale synthetic and real-world indoor scenes. To ease the prediction of 3D structure, we propose to parameterize 3D surfaces with their plane equati…
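The plane-equation parameterization the truncated abstract refers to has a compact geometric reading. A minimal sketch, with symbols (n, d, v) introduced here for illustration rather than taken from the paper: each pixel's surface is a plane with unit normal n at distance d from the camera origin, and the pixel's 3D point p lies on that plane along the pixel's unit viewing ray v.

\[
n^{\top} p = d, \qquad p = t\,v \;\Longrightarrow\; t = \frac{d}{n^{\top} v}, \qquad p = \frac{d}{n^{\top} v}\, v .
\]

Predicting (n, d) instead of raw depth lets the network describe large planar regions (walls, floors, ceilings) with spatially smooth quantities; the same pixel-wise conversion appears as the PN-layer in the citation statements below.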

Cited by 77 publications (58 citation statements)
References: 40 publications
“…distance-to-origin). We then pass both predicted outputs through a differentiable PN-layer [22] to convert the estimated surface normals and plane distances into a pixel-wise prediction of 3D locations. Direct supervision is provided to the 1) surface normal predictions via a cosine loss, 2) plane offset predictions via an ℓ1 loss, and 3) final 3D point locations via an ℓ1 loss to ensure consistency between the surface normal and plane offset predictions.…”
Section: Geometry Estimation
Mentioning confidence: 99%
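The statement above describes the conversion and its three losses concretely enough to sketch. What follows is a minimal, hedged PyTorch sketch, not the cited implementation: ray construction assumes an equirectangular panorama, normals are assumed oriented so that n·v > 0, and all names and shapes are illustrative.

    import math
    import torch
    import torch.nn.functional as F

    def pixel_rays(h, w):
        # Unit viewing rays for an equirectangular panorama, shape (3, h, w).
        theta = torch.linspace(-math.pi, math.pi, w)        # azimuth
        phi = torch.linspace(-math.pi / 2, math.pi / 2, h)  # elevation
        phi, theta = torch.meshgrid(phi, theta, indexing="ij")
        return torch.stack([torch.cos(phi) * torch.sin(theta),
                            torch.sin(phi),
                            torch.cos(phi) * torch.cos(theta)])

    def pn_layer(normals, offsets, rays, eps=1e-6):
        # normals: (B, 3, H, W) unit plane normals n
        # offsets: (B, 1, H, W) plane distance-to-origin d
        # rays:    (3, H, W)    unit viewing rays v
        # A point on the plane n·p = d along ray v is p = (d / n·v) v.
        n_dot_v = (normals * rays).sum(dim=1, keepdim=True)
        depth = offsets / n_dot_v.clamp(min=eps)
        return depth * rays                                 # (B, 3, H, W)

    def geometry_loss(pred_n, pred_d, gt_n, gt_points, rays):
        cos_loss = (1 - F.cosine_similarity(pred_n, gt_n, dim=1)).mean()
        gt_d = (gt_n * gt_points).sum(dim=1, keepdim=True)  # d = n·p
        offset_loss = F.l1_loss(pred_d, gt_d)
        point_loss = F.l1_loss(pn_layer(pred_n, pred_d, rays), gt_points)
        return cos_loss + offset_loss + point_loss

Because pn_layer is built from differentiable tensor operations, the ℓ1 loss on the reconstructed 3D points back-propagates into both the normal and offset branches, which is what enforces consistency between them.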
“…. Following the convention [37,31], we formulate the input to both scan completion modules using a similar tensor form…”
Section: Scan Completion Modules
Mentioning confidence: 99%
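The "tensor form" referred to above is not spelled out in the excerpt; a plausible, hedged sketch is a per-pixel channel concatenation with an observation mask. The channel choice here (RGB, normals, plane distance, mask) is an assumption for illustration, not a specification from the cited works.

    import torch

    def make_input(rgb, normals, plane_dist, observed_mask):
        # rgb: (B, 3, H, W), normals: (B, 3, H, W), plane_dist: (B, 1, H, W),
        # observed_mask: (B, 1, H, W) in {0, 1}; unobserved pixels are zeroed.
        x = torch.cat([rgb, normals, plane_dist], dim=1) * observed_mask
        return torch.cat([x, observed_mask], dim=1)         # (B, 8, H, W)

Appending the mask as an extra channel lets the completion module distinguish observed-but-zero-valued pixels from genuinely missing ones.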
“…Flint et al. used a combination of monocular features with multiple-view and 3D features to infer a Manhattan World representation of the environment [9]. In [30], an RGB-D panorama is split and half of the scene is taken as input, with the aim of producing a reasonable, cluttered room-layout estimate for the unseen portion of the panorama.…”
Section: Cloud
Mentioning confidence: 99%
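The half-panorama setup described in [30] matches Im2Pano3D's own ≤ 50% partial-observation input. A minimal, hedged sketch assuming an equirectangular layout, with function and variable names invented for illustration:

    import numpy as np

    def split_panorama(pano):
        # pano: (H, W, C) equirectangular RGB-D panorama.
        h, w, _ = pano.shape
        mask = np.zeros((h, w), dtype=bool)
        mask[:, : w // 2] = True                  # front half is observed
        observed = np.where(mask[..., None], pano, 0)
        return observed, mask                     # predict the masked-out half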