2020
DOI: 10.3390/s21010054

Monocular Depth Estimation with Joint Attention Feature Distillation and Wavelet-Based Loss Function

Abstract: Depth estimation is a crucial component in many 3D vision applications. Monocular depth estimation is gaining increasing interest due to flexible use and extremely low system requirements, but inherently ill-posed and ambiguous characteristics still cause unsatisfactory estimation results. This paper proposes a new deep convolutional neural network for monocular depth estimation. The network applies joint attention feature distillation and wavelet-based loss function to recover the depth information of a scene…

Citations: cited by 13 publications (6 citation statements)
References: 50 publications
“…There are also hybrid methods that use both stereo pair data and video sequence image frames [32,33]. Other refinement strategies, such as [34][35][36]. Some recent approaches used relatively bulky architectures to improve the depth quality [37], which indicates a higher memory cost as well as time.…”
Section: Self-supervised Methods (mentioning)
confidence: 99%
“…
                  #params  REL↓   RMSE↓  δ1↑    δ2↑    δ3↑
DORN [15]         110.0M   0.072  2.626  0.932  0.984  0.994
AFDB-Net [39]     139.2M   0.071  2.848  0.933  0.983  0.995
VNL [66]          114.2M   0.072  3.258  0.938  0.990  0.998
BTS [34]          112.8M   0.059  2.756  0.956  0.993  0.998
PGA-Net [61]      168.3M   0.063  2.634  0.952  0.992  0.998
TransDepth [65]   311.3M   0.064  2.755  0.956  0.994  0.999
DPT [48]          123…”
Section: Methods (mentioning)
confidence: 99%
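The quoted table does not define its column headers. As a rough sketch, assuming REL, RMSE and δ1 to δ3 follow the definitions commonly used in the monocular depth estimation literature (mean absolute relative error, root mean squared error, and the share of pixels whose prediction-to-ground-truth ratio stays below 1.25, 1.25² and 1.25³), they could be computed as below. The function name `depth_metrics` and the dictionary keys are hypothetical and are not taken from the cited papers.

```python
import math


def depth_metrics(pred, gt):
    """Return REL, RMSE and threshold accuracies for paired depth values.

    pred, gt: equal-length sequences of strictly positive depths.
    Assumes the commonly used definitions; the quoted table may differ.
    """
    if len(pred) != len(gt) or len(gt) == 0:
        raise ValueError("pred and gt must be non-empty and equally long")
    n = len(gt)
    # REL: mean absolute error relative to the ground-truth depth.
    rel = sum(abs(p - g) / g for p, g in zip(pred, gt)) / n
    # RMSE: square root of the mean squared error.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / n)
    # delta_k: fraction of values with max(p/g, g/p) < 1.25**k.
    ratios = [max(p / g, g / p) for p, g in zip(pred, gt)]
    d1, d2, d3 = (sum(r < 1.25 ** k for r in ratios) / n for k in (1, 2, 3))
    return {"REL": rel, "RMSE": rmse, "d1": d1, "d2": d2, "d3": d3}


if __name__ == "__main__":
    # Toy values only; these are not results from any paper.
    print(depth_metrics([2.0, 4.0], [2.0, 2.0]))
```

Under these assumed definitions, lower REL and RMSE and higher δ values indicate better predictions, which matches the arrows in the table header.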
“…Later on, attention was implemented as spatial attention [1,57], channel-wise attention [22,54], and mix attention [56] to improve classification and detection accuracy. Recent monocular depth estimation methods [25,32,36,39,62] also applied the attention mechanism. However, these attention implementations require relatively heavy computational resources.…”
Section: Related Work (mentioning)
confidence: 99%
“…Recently, Refs. [31, 41, 46] studied wavelet-based loss function to improve structural details while [32] proposed learning sparse wavelet coefficients to estimate depth maps. Researchers have also replaced the UNET encoder backbone with dense pretrained image networks, such as ResNet, DenseNet [7], etc., and tuned the decoder accordingly to estimate depth.…”
Section: Nested DWT Net Architecture (mentioning)
confidence: 99%