2022
DOI: 10.48550/arxiv.2205.08780
Preprint

Visual Attention-based Self-supervised Absolute Depth Estimation using Geometric Priors in Autonomous Driving

Abstract: Although existing monocular depth estimation methods have made great progress, predicting an accurate absolute depth map from a single image is still challenging due to the limited modeling capacity of networks and the scale ambiguity issue. In this paper, we introduce a fully Visual Attention-based Depth (VADepth) network, where spatial attention and channel attention are applied to all stages. By continuously extracting the dependencies of features along the spatial and channel dimensions over a long distanc…

Citations: cited by 1 publication (1 citation statement)
References: 39 publications
“…These assumptions allow them to use camera extrinsic parameters, in particular the camera height and compare it with the estimated height by fitting plane on the road, to recover the scale as a post-processing step. [44,50] also use camera height for scale but incorporate it within training. These methods rely on heuristics that a flat road plane is visible in the area of interest and that the camera position and orientation remain constant over time, which are often not realistic.…”
Section: Scale-disambiguation/consistency-enforcement Via Supervision
Citation type: mentioning
confidence: 99%
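The post-processing scale recovery described in the citation statement (fit a plane to the road, compare the estimated camera height with the known one) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation of VADepth or of any cited paper: the function names are hypothetical, the road points are assumed to be already back-projected into a camera frame whose y axis points down, and the 1.65 m camera height is only an example value.

```python
import math


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_ground_plane(points):
    """Least-squares fit of y = a*x + b*z + c to (x, y, z) road points.

    Assumes the road is roughly flat and not vertical in the camera frame.
    """
    n = len(points)
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    szz = sum(p[2] * p[2] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    szy = sum(p[2] * p[1] for p in points)

    # Normal equations M @ [a, b, c] = r, solved with Cramer's rule.
    m = [[sxx, sxz, sx], [sxz, szz, sz], [sx, sz, n]]
    r = [sxy, szy, sy]
    det = _det3(m)
    if abs(det) < 1e-12:
        raise ValueError("road points are degenerate; cannot fit a plane")

    coeffs = []
    for col in range(3):
        # Replace one column of M with r to get the numerator determinant.
        mc = [row[:] for row in m]
        for i in range(3):
            mc[i][col] = r[i]
        coeffs.append(_det3(mc) / det)
    return tuple(coeffs)  # (a, b, c)


def estimated_camera_height(points):
    """Distance from the camera origin to the fitted plane a*x - y + b*z + c = 0."""
    a, b, c = fit_ground_plane(points)
    return abs(c) / math.sqrt(a * a + b * b + 1.0)


def scale_factor(points, known_camera_height):
    """Ratio that maps relative (unscaled) depth to metric depth."""
    return known_camera_height / estimated_camera_height(points)


# Synthetic flat road in relative units: the plane y = 0.5 below the camera.
road_points = [(x, 0.5, z) for x in range(-2, 3) for z in range(1, 6)]
scale = scale_factor(road_points, known_camera_height=1.65)  # example height
```

Multiplying the predicted relative depth map by `scale` would then give an absolute depth estimate. As the statement notes, this only holds while a flat road plane is visible and the camera pose stays fixed.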