2022
DOI: 10.48550/arxiv.2203.01502
Preprint

NeW CRFs: Neural Window Fully-connected CRFs for Monocular Depth Estimation

Abstract: Estimating the accurate depth from a single image is challenging since it is inherently ambiguous and ill-posed. While recent works design increasingly complicated and powerful networks to directly regress the depth map, we take the path of CRFs optimization. Due to the expensive computation, CRFs are usually performed between neighborhoods rather than the whole graph. To leverage the potential of fully-connected CRFs, we split the input into windows and perform the FC-CRFs optimization within each window, whi…
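The window-splitting idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `window_partition` and `pair_counts`, the non-overlapping layout, and the 8x8 grid with window size 4 are all assumptions chosen for illustration. The sketch only shows how restricting full connectivity to windows reduces the number of pixel pairs a pairwise term would have to cover.

```python
def window_partition(grid, win):
    """Split a 2D grid (list of rows) into non-overlapping win x win windows.

    Hypothetical helper for illustration; assumes the grid height and width
    are exact multiples of `win`.
    """
    h, w = len(grid), len(grid[0])
    if h % win or w % win:
        raise ValueError("grid height and width must be multiples of win")
    windows = []
    for top in range(0, h, win):
        for left in range(0, w, win):
            # Take `win` rows starting at `top`, and `win` columns starting at `left`.
            windows.append([row[left:left + win] for row in grid[top:top + win]])
    return windows


def pair_counts(h, w, win):
    """Return (pairs in one fully-connected graph, pairs summed over all windows)."""
    n = h * w
    full = n * (n - 1) // 2
    per_window = (win * win) * (win * win - 1) // 2
    num_windows = (h // win) * (w // win)
    return full, num_windows * per_window


# Usage: an 8x8 grid whose cell value is its flat index, split into 4x4 windows.
grid = [[r * 8 + c for c in range(8)] for r in range(8)]
windows = window_partition(grid, 4)
full_pairs, windowed_pairs = pair_counts(8, 8, 4)
```

In this toy setting, the fully-connected graph over all 64 cells has 2016 pairs, while four fully-connected 4x4 windows have 480 pairs in total.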

Cited by 18 publications (26 citation statements)
References 24 publications
“…The first is GLP-depth [17] which proposes a hierarchical transformer encoder that captures global features and a simple decoder that considers the local context. The second method [37] employs a neural window fully-connected Conditional Random Fields (CRFs) module for the decoder and a vision transformer for the encoder.…”
Section: Related Work (mentioning)
confidence: 99%
“…The clouds can be calculated from multiple un-referenced images of various point of views inside a scenery (Kuhn et al, 2020; Zhang et al, 2020). Even monocular methods for producing highly accurate depth maps from single photographies exist (Alhashim and Wonka, 2018; Yuan et al, 2022).…”
Section: Technological Outlook (mentioning)
confidence: 99%
“…pixel-wise) loss is not sufficient, as this basic approach lacks the ability to model the structure of the output. To address this shortcoming, standard approaches turn to using additional modeling components such as, for example, anchor boxes [37,30], non-maximal suppression [37,30], matching losses [2,5,6] or conditional random fields [3,55].…”
Section: Introduction (mentioning)
confidence: 99%