2023
DOI: 10.48550/arxiv.2303.05021
Preprint

DiffusionDepth: Diffusion Denoising Approach for Monocular Depth Estimation

Abstract: Monocular depth estimation is a challenging task that predicts the pixel-wise depth from a single 2D image. Current methods typically model this problem as a regression or classification task. We propose DiffusionDepth, a new approach that reformulates monocular depth estimation as a denoising diffusion process. It learns an iterative denoising process to 'denoise' random depth distribution into a depth map with the guidance of monocular visual conditions. The process is performed in the latent space encoded b…
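
The iterative denoising idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a deterministic DDIM-style update and a linear noise schedule, works on a flat list of depth values rather than in a latent space, and replaces the trained image-conditioned network with a hypothetical stand-in, `toy_noise_predictor`, that treats the condition as if it already encoded the clean depth. All function names and parameters below are illustrative.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars = []
    prod = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def toy_noise_predictor(x_t, alpha_bar_t, condition):
    """Hypothetical stand-in for a trained, image-conditioned denoiser.

    It pretends the condition is the clean depth and returns the noise
    that would explain x_t under the forward diffusion model.
    """
    scale = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [(x - scale * c) / sigma for x, c in zip(x_t, condition)]


def denoise_depth(condition, num_steps=50, seed=0):
    """Start from Gaussian noise and refine it step by step into depth values."""
    rng = random.Random(seed)
    alpha_bars = make_alpha_bars(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in condition]  # random initial "depth"
    for t in reversed(range(num_steps)):
        a_t = alpha_bars[t]
        a_prev = alpha_bars[t - 1] if t > 0 else 1.0
        eps = toy_noise_predictor(x, a_t, condition)
        # Estimate the clean signal implied by the predicted noise.
        x0 = [(xi - math.sqrt(1.0 - a_t) * e) / math.sqrt(a_t)
              for xi, e in zip(x, eps)]
        # Deterministic DDIM-style move to the previous (less noisy) step.
        x = [math.sqrt(a_prev) * x0i + math.sqrt(1.0 - a_prev) * e
             for x0i, e in zip(x0, eps)]
    return x


if __name__ == "__main__":
    guidance = [1.5, 2.0, 10.0, 0.3]  # made-up depth values used as the condition
    print(denoise_depth(guidance))
```

Because the stand-in predictor is an oracle, the loop recovers the condition values up to floating-point error. A real model would instead learn the noise prediction from image features, and its output would only approximate the true depth.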

Cited by 5 publications (10 citation statements)
References 37 publications
“…[42]-[44] extended diffusion model in image segmentation and Chen et al. [21] leveraged diffusion model to generate detection box proposals. There are also attempts at applying diffusion models to MDE [23], [24], but they focus on a fully supervised setting. To alleviate the performance degradation induced by diffusing on sparse ground-truth, Saxena et al. [23] and Duan et al. [24] both proposed to diffuse on the denoised output of network to assist training.…”
Section: B Diffusion Model
Citation type: mentioning
confidence: 99%
“…There are also attempts at applying diffusion models to MDE [23], [24], but they focus on a fully supervised setting. To alleviate the performance degradation induced by diffusing on sparse ground-truth, Saxena et al. [23] and Duan et al. [24] both proposed to diffuse on the denoised output of network to assist training. Self-supervised MDE with the diffusion model is more challenging due to the lack of depth ground-truth.…”
Section: B Diffusion Model
Citation type: mentioning
confidence: 99%