2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019
DOI: 10.1109/cvpr.2019.00631

Deep Surface Normal Estimation With Hierarchical RGB-D Fusion

Abstract: The growing availability of commodity RGB-D cameras has boosted the applications in the field of scene understanding. However, as a fundamental scene understanding task, surface normal estimation from RGB-D data lacks thorough investigation. In this paper, a hierarchical fusion network with adaptive feature re-weighting is proposed for surface normal estimation from a single RGB-D image. Specifically, the features from color image and depth are successively integrated at multiple scales to ensure global surfac…

Cited by 70 publications (55 citation statements)
References 30 publications
“…Therefore, a confidence map network branch was set following the method proposed by Zeng et al [11], which generates confidence maps to indicate whether side effects resulted from pixel holes on or not. Confidence maps [19] of depth image were produced by combining mask images [21] ( ) with relative coarse depth images ( and were denoted as according to resolution.…”
Section: Our Methods; citation type: mentioning
confidence: 99%
“…As is shown in Figure 3, pixel holes in mask images ( ) suggest that there are lots of missing pixels in ground truth depth images, which inevitably causes deviation to supervised learning. Therefore, we adopted a multi-layer convolution network ( ) for producing confidence map [19] of input depth images. stands for scale value of images, i.e., if the resolution of 2D images can be denoted as , then the corresponding is defined as .…”
Section: Our Methods; citation type: mentioning
confidence: 99%