2019
DOI: 10.1049/iet-cvi.2018.5645

Fully convolutional multi‐scale dense networks for monocular depth estimation

Abstract: Monocular depth estimation is of vital importance in understanding the 3D geometry of a scene. However, inferring the underlying depth is ill-posed and inherently ambiguous. In this study, two improvements to existing approaches are proposed. One is about a clean improved network architecture, for which the authors extend Densely Connected Convolutional Network (DenseNet) to work as end-to-end fully convolutional multi-scale dense networks. The dense upsampling blocks are integrated to improve the output resol…

Cited by 8 publications (7 citation statements)
References 39 publications
“…Xu et al [19] Fu et al [10] Liu et al [21] Our estimations can be directly applied. 2) The depth ground truth gotten by LIDAR can be straightly exploited without processing into a dense map.…”
Section: B. State-of-the-art Comparisons
confidence: 99%
“…It is worth noting that Eigen et al [2] and Liu et al [21] used the effective part of ground truth data, which was the lower 2/3 part of the image. Meanwhile, Liu et al [21] and Fu et al [10] applied the dense depth maps to train the network. While the input of our network is the full image, and the original sparse data is utilized as the ground truth.…”
Section: B. State-of-the-art Comparisons
confidence: 99%
“…Ranftl et al [16] suggested the application of vision transformers as the backbone for dense predictions, where tokens from the different stages of the vision transformer were assembled in an image-like representation using a convolutional decoder. Liu et al [21] proposed a similar approach to ours in which they adopted fully convolutional multiscale dense network based on DenseNet169 [22] for monocular depth estimation. They also proposed a dense upsampling block that includes a sequence of convolutional filters followed by a pixel shuffle operation to obtain higher resolution output.…”
Section: Related Work
confidence: 99%
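The dense upsampling block described above pairs convolutional filters with a pixel shuffle (sub-pixel convolution) operation to raise output resolution. As a minimal sketch of the pixel shuffle step alone — not the authors' full block — the rearrangement can be written in plain NumPy; the function name and the channels-first layout are assumptions for illustration:

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r).

    Each group of r*r input channels is scattered into an r-by-r
    spatial neighbourhood of one output channel, trading channel
    depth for spatial resolution (the sub-pixel convolution trick).
    """
    c_r2, h, w = x.shape
    c = c_r2 // (r * r)
    # Split the channel axis into (C, r, r) sub-pixel offsets ...
    x = x.reshape(c, r, r, h, w)
    # ... interleave the offsets with the spatial axes: (C, H, r, W, r)
    x = x.transpose(0, 3, 1, 4, 2)
    # Merge the interleaved axes into the upscaled spatial grid.
    return x.reshape(c, h * r, w * r)
```

In a full upsampling block, a convolution would first expand a feature map from C to C*r*r channels, after which this rearrangement yields an r-times higher-resolution map without any learned deconvolution.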
“…Eigen et al [33] first applied a multi-scale CNN architecture to predict depth maps from monocular images, which helps capture image details. Following this, some other CNN-based methods [34] were proposed to estimate monocular depth. Xu et al [35] combined CNN and conditional random field to improve the smoothness of estimated depth maps.…”
Section: Depth Prediction From a Single Image
confidence: 99%