2022
DOI: 10.1109/tpami.2020.3019967
Abstract: The success of monocular depth estimation relies on large and diverse training sets. Due to the challenges associated with acquiring dense ground-truth depth across different environments at scale, a number of datasets with distinct characteristics and biases have emerged. We develop tools that enable mixing multiple datasets during training, even if their annotations are incompatible. In particular, we propose a robust training objective that is invariant to changes in depth range and scale, advocate the use …
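The "robust training objective that is invariant to changes in depth range and scale" mentioned in the abstract can be illustrated with a small sketch. Below is a minimal, hypothetical PyTorch version of a scale- and shift-invariant loss, assuming a closed-form least-squares alignment of the prediction to the ground truth before measuring the residual; it illustrates the idea rather than reproducing the paper's exact objective.

```python
import torch

def scale_shift_invariant_loss(pred, target, mask):
    """Illustrative scale- and shift-invariant loss (a sketch, not the
    paper's exact formulation). Aligns `pred` to `target` with a
    closed-form least-squares fit of a per-image scale s and shift t,
    then averages the absolute residual over valid pixels."""
    # Flatten spatial dimensions: (B, H*W)
    pred = pred.flatten(1)
    target = target.flatten(1)
    mask = mask.flatten(1).float()

    n = mask.sum(dim=1).clamp(min=1)
    # Normal equations for s, t minimizing sum_i m_i (s*p_i + t - y_i)^2
    sum_p = (mask * pred).sum(dim=1)
    sum_y = (mask * target).sum(dim=1)
    sum_pp = (mask * pred * pred).sum(dim=1)
    sum_py = (mask * pred * target).sum(dim=1)
    det = n * sum_pp - sum_p ** 2
    s = (n * sum_py - sum_p * sum_y) / det.clamp(min=1e-8)
    t = (sum_y - s * sum_p) / n

    aligned = s.unsqueeze(1) * pred + t.unsqueeze(1)
    residual = mask * (aligned - target).abs()
    return (residual.sum(dim=1) / n).mean()
```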

Cited by 551 publications (531 citation statements)
References 59 publications
“…We compare two recent methods for monocular depth estimation, monodepth2 [4] and MiDaS [13], using official public implementations of both.…”
Section: Depth Estimation 2.1 Methods
Mentioning, confidence: 99%
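For context, both models named in this statement ship with public PyTorch implementations. The sketch below shows one plausible way to run MiDaS through torch.hub; the entry-point names ("MiDaS", "transforms") follow the intel-isl/MiDaS hub configuration at the time of writing and may differ across releases.

```python
import numpy as np
import torch

# Load the official MiDaS model and its preprocessing via torch.hub.
# Entry-point names may vary between MiDaS releases.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS")
midas.eval()
midas_transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = midas_transforms.default_transform

# Stand-in for an RGB image (H, W, 3), uint8, as the transform expects.
img = (np.random.rand(384, 512, 3) * 255).astype(np.uint8)
batch = transform(img)            # (1, 3, H', W') float tensor
with torch.no_grad():
    inv_depth = midas(batch)      # relative inverse-depth map
```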
“…The hyperparameter ε controls the minimal offset from the camera to regions in the image. For stylization, a pre-trained AdaIN encoder/decoder [15, 6] and a pre-trained depth network [13] are used. A computational advantage of our method is that it is learning-free: given a pretrained encoder, decoder, and depth estimation network, the method does not require additional training for new styles.…”
Section: Proposed Extension
Mentioning, confidence: 99%
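As background for this statement, AdaIN stylization re-normalizes content features with the style's per-channel statistics, and a depth map can then gate the stylization strength per pixel. The sketch below shows the generic AdaIN core plus a hypothetical depth-weighted blend; the names (`depth_aware_adain`, `eps_offset`) are illustrative and not taken from the cited work.

```python
import torch

def adain(content_feat, style_feat, eps=1e-5):
    """Adaptive Instance Normalization: re-normalize content features
    with the per-channel mean/std of the style features."""
    # Per-channel statistics over spatial dims, shapes (B, C, 1, 1)
    c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
    c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
    s_std = style_feat.std(dim=(2, 3), keepdim=True) + eps
    return s_std * (content_feat - c_mean) / c_std + s_mean

def depth_aware_adain(content_feat, style_feat, depth, eps_offset=0.1):
    """Hypothetical depth-weighted blend: stylize distant regions more.
    `depth` has shape (B, 1, h, w) matching the feature map's spatial
    size, with values in [0, 1]; `eps_offset` is illustrative."""
    alpha = (depth + eps_offset).clamp(max=1.0)   # per-pixel strength
    stylized = adain(content_feat, style_feat)
    return alpha * stylized + (1 - alpha) * content_feat
```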
“…An alternative is to train on simulated data, but then generalization to diverse real-world scenes can become an issue. Therefore, while supervised learning methods have demonstrated impressive results [25], [26], [27], [28], it is desirable to develop algorithms that function in the absence of large annotated datasets.…”
Section: Related Work
Mentioning, confidence: 99%
“…The 20-channel network output is then processed by a single 3 × 3 convolutional layer to shrink the channels to 1. The design of this compact fully convolutional network has been inspired by the refinement layer used by previous works on supervised depth estimation [27], [26].…”
Section: Model Details
Mentioning, confidence: 99%
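The output head described in this statement is simple enough to state directly. A minimal PyTorch sketch, assuming padding is used to preserve spatial resolution (the cited paper may handle borders differently):

```python
import torch.nn as nn

# A single 3x3 convolution reducing a 20-channel feature map to 1 channel.
# Applied to a (B, 20, H, W) input, this yields a (B, 1, H, W) prediction.
head = nn.Conv2d(in_channels=20, out_channels=1, kernel_size=3, padding=1)
```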