2020
DOI: 10.1007/978-3-030-58452-8_42

Learning Stereo from Single Images

Abstract: Supervised deep networks are among the best methods for finding correspondences in stereo image pairs. Like all supervised approaches, these networks require ground truth data during training. However, collecting large quantities of accurate dense correspondence data is very challenging. We propose that it is unnecessary to have such a high reliance on ground truth depths or even corresponding stereo pairs. Inspired by recent progress in monocular depth estimation, we generate plausible disparity maps from sin…
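The abstract describes fabricating stereo training pairs from single images by turning a monocular depth estimate into a disparity map and forward warping the input image. Below is a minimal sketch of that idea, not the authors' pipeline: the function name and the nearest-wins collision rule are my own assumptions, and a real pipeline would additionally handle occlusions and fill the resulting holes.

```python
# Minimal sketch (my own, not the authors' pipeline): forward-warp a single image
# by a disparity map derived from monocular depth to fabricate the "right" view
# of a training stereo pair. Collisions keep the nearest surface; holes are left
# empty and would need inpainting in practice.
import numpy as np

def synthesize_right_view(left, disparity):
    """left: (H, W, 3) image, disparity: (H, W) in pixels. Returns a warped right view."""
    h, w = disparity.shape
    right = np.zeros_like(left)
    best_disp = np.full((h, w), -np.inf)          # largest disparity (nearest pixel) wins
    for y in range(h):
        for x in range(w):
            xt = int(round(x - disparity[y, x]))  # a rightward baseline shifts content left
            if 0 <= xt < w and disparity[y, x] > best_disp[y, xt]:
                best_disp[y, xt] = disparity[y, x]
                right[y, xt] = left[y, x]
    return right

# Toy usage: random arrays stand in for an image and a mono-depth-derived disparity map.
left = np.random.rand(4, 8, 3).astype(np.float32)
disp = np.random.uniform(0.0, 2.0, size=(4, 8)).astype(np.float32)
print(synthesize_right_view(left, disp).shape)    # (4, 8, 3)
```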

Cited by 53 publications (32 citation statements)
References 75 publications
“…However, a data gap between the virtual data and the real autonomous driving scenario is inevitable. Therefore, we propose a 6-DoF pose augmentation by generating a random pose T_aug and then forward warping [38] the original image X_t to synthesize a new arbitrary view X_t^aug:…”
Section: 6-DoF Pose Augmentation By Means Of Forward Warping (mentioning)
confidence: 99%
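The statement above generates a random pose T_aug and forward warps X_t into a new view X_t^aug. A rough sketch of such a warp is given below, assuming known pinhole intrinsics K and a per-pixel depth map; the pose sampling ranges, helper names, and nearest-wins splatting are illustrative assumptions, not the cited implementation [38].

```python
# Rough sketch (assumptions: pinhole intrinsics K, metric depth, small random pose;
# not the implementation of [38]) of forward warping X_t into an augmented view X_t^aug.
import numpy as np

def random_pose(max_angle=0.05, max_trans=0.10):
    """Small random 6-DoF pose T_aug as a 4x4 matrix (first-order rotation)."""
    r = np.random.uniform(-max_angle, max_angle, 3)
    t = np.random.uniform(-max_trans, max_trans, 3)
    skew = np.array([[0, -r[2], r[1]], [r[2], 0, -r[0]], [-r[1], r[0], 0]])
    T = np.eye(4)
    T[:3, :3] = np.eye(3) + skew      # R ~ I + [r]_x for small angles
    T[:3, 3] = t
    return T

def forward_warp(image, depth, K, T_aug):
    """Splat every pixel of `image` into the view defined by T_aug (near points win)."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], -1).reshape(-1, 3).T   # 3 x N homogeneous pixels
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)               # back-project to 3-D
    pts = T_aug @ np.vstack([pts, np.ones((1, pts.shape[1]))])        # move to the new camera
    proj = K @ pts[:3]
    u = np.round(proj[0] / proj[2]).astype(int)
    v = np.round(proj[1] / proj[2]).astype(int)
    z = proj[2]
    out = np.zeros_like(image)
    src = image.reshape(-1, image.shape[-1])
    for i in np.argsort(-z):                    # paint far-to-near so near points overwrite
        if z[i] > 0 and 0 <= u[i] < w and 0 <= v[i] < h:
            out[v[i], u[i]] = src[i]
    return out

# Toy usage with random stand-in data.
K = np.array([[100.0, 0, 32], [0, 100.0, 24], [0, 0, 1.0]])
img = np.random.rand(48, 64, 3).astype(np.float32)
depth = np.random.uniform(1.0, 10.0, (48, 64))
print(forward_warp(img, depth, K, random_pose()).shape)   # (48, 64, 3)
```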
“…Strictly related to our work is [61], estimating depth from single images to synthesize virtual right views and thus obtain stereo pairs used to train deep stereo networks. Despite the analogy of using single image depth estimation, we point out that our goal differs from [61], since we aim at modeling arbitrary motions in the scene (i.e., optical flow) rather than a horizontal pixel displacement between synchronized images (i.e., disparity). Purposely, we will describe the additional strategies required to attain, from single still images, the best training data for optical flow networks.…”
Section: Related Work (mentioning)
confidence: 99%
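To make the distinction drawn above concrete, the toy snippet below casts a rectified-stereo disparity map as optical flow with a zero vertical component; the sign convention is an assumption of mine and the arrays are random stand-ins.

```python
# Toy illustration (mine, not from the cited works): for a rectified stereo pair,
# disparity is optical flow constrained to a purely horizontal displacement,
# whereas the flow networks discussed here must model arbitrary 2-D motion.
import numpy as np

disparity = np.random.uniform(0.0, 30.0, size=(4, 6))           # horizontal shift in pixels
flow_from_disparity = np.stack(
    [-disparity,                       # u: content shifts left toward the right view (sign is a convention)
     np.zeros_like(disparity)],        # v: zero by construction for rectified stereo
    axis=-1)
print(flow_from_disparity.shape)       # (4, 6, 2)
```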
“…In case the network estimates inverse depth, we first bring it to the depth domain. D_0 usually shows blurred edges [61, 49], causing flying pixels in the 3D space, which can easily be sharpened via edge-preserving filters [32].…”
Section: Depthstillation Pipeline (mentioning)
confidence: 99%
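As a hedged illustration of the step quoted above, the sketch below maps an inverse-depth prediction to the depth domain and applies a generic edge-preserving filter; OpenCV's bilateral filter is an assumed stand-in for the filters of [32], and the sigma values are placeholders to be tuned to the depth range.

```python
# Hedged sketch of the quoted step: convert an inverse-depth prediction to depth,
# then apply an edge-preserving filter so blurred depth edges do not unproject into
# "flying pixels". The bilateral filter here is an assumption, not the method of [32].
import numpy as np
import cv2

def inverse_depth_to_sharp_depth(inv_depth, eps=1e-6):
    depth = (1.0 / np.maximum(inv_depth, eps)).astype(np.float32)   # inverse depth -> depth
    # Edge-preserving smoothing: blurs flat regions while keeping depth discontinuities.
    return cv2.bilateralFilter(depth, d=9, sigmaColor=2.0, sigmaSpace=9.0)

# Toy usage with a random inverse-depth map standing in for a network prediction.
inv_d = np.random.uniform(0.05, 1.0, size=(48, 64)).astype(np.float32)
print(inverse_depth_to_sharp_depth(inv_d).shape)   # (48, 64)
```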