EdgeStereo: An Effective Multi-task Learning Network for Stereo Matching and Edge Detection

Song, Xiaoning; Zhao, Xu; Fang, Liangji; Hu, Hanwen; Yu, Yizhou

doi:10.1007/s11263-019-01287-w

Cited by 144 publications

(73 citation statements)

References 67 publications


“…It is tempting to use a state-of-the-art stereo algorithm instead, e.g. [3,2,29], however most modern stereo algorithms are supervised using the LiDAR ground truth from the KITTI dataset. Using one of these would cause us to be implicitly learning from laser-scanned ground-truth data.…”

Section: Computing Depth Hints (mentioning)

confidence: 99%

Self-Supervised Monocular Depth Hints

Watson, Firman, Brostow et al. 2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)



Monocular depth estimators can be trained with various forms of self-supervision from binocular-stereo data to circumvent the need for high-quality laser scans or other ground-truth data. The disadvantage, however, is that the photometric reprojection losses used with self-supervised learning typically have multiple local minima. These plausible-looking alternatives to ground truth can restrict what a regression network learns, causing it to predict depth maps of limited quality. As one prominent example, depth discontinuities around thin structures are often incorrectly estimated by current state-of-the-art methods. Here, we study the problem of ambiguous reprojections in depth prediction from stereo-based self-supervision, and introduce Depth Hints to alleviate their effects. Depth Hints are complementary depth suggestions obtained from simple off-the-shelf stereo algorithms. These hints enhance an existing photometric loss function, and are used to guide a network to learn better weights. They require no additional data, and are assumed to be right only sometimes. We show that using our Depth Hints gives a substantial boost when training several leading self-supervised-from-stereo models, not just our own. Further, combined with other good practices, we produce state-of-the-art depth predictions on the KITTI benchmark. We demonstrate that our selective training using Depth Hints is a general enhancement that can improve multiple leading self-supervised training algorithms, allowing our implementations to reach better minima. The Depth Hints can come from the same stereo image data, e.g. via OpenCV's stereo estimates [13,14]. We show that our selective training with Depth Hints, coupled with sensible network design choices, leads us to outperform most other algorithms. We achieve state-of-the-art results on the KITTI dataset [8], outperforming both our baseline model and previously published results.
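The abstract above says the hints enhance an existing photometric loss and are assumed to be right only sometimes. One plausible reading is a per-pixel selection rule: the photometric term is always applied, and the hint is used as extra supervision only where it reprojects better than the current prediction. The sketch below illustrates that reading; the function name `hinted_loss`, the flat-list inputs, and the log-L1 pull toward the hint are illustrative assumptions, not details taken from this page.

```python
import math


def hinted_loss(reproj_pred, reproj_hint, depth_pred, depth_hint):
    """Mean per-pixel loss that trusts a depth hint only where it reprojects better.

    All arguments are flat lists with one value per pixel (hypothetical inputs):
      reproj_pred  photometric reprojection error using the predicted depth
      reproj_hint  photometric reprojection error using the hint depth
      depth_pred   predicted depth
      depth_hint   hint depth from an off-the-shelf stereo algorithm
    """
    total = 0.0
    for l_pred, l_hint, d, h in zip(reproj_pred, reproj_hint, depth_pred, depth_hint):
        loss = l_pred  # the photometric term is always applied
        if l_hint < l_pred:
            # The hint explains the image better at this pixel, so also pull
            # the prediction toward it (log-L1 distance, chosen for illustration).
            loss += math.log(1.0 + abs(d - h))
        total += loss
    return total / len(reproj_pred)


# Pixel 0: the hint reprojects better, so it adds supervision.
# Pixel 1: the hint reprojects worse, so it is ignored.
example = hinted_loss([0.5, 0.2], [0.1, 0.9], [10.0, 4.0], [12.0, 8.0])
```

Under this rule a wrong hint costs nothing, because it only contributes at pixels where it already beats the network's own estimate.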


“…The FlowNet provides the basic 2D encoder-decoder structure. Later, a lot of networks [23,24,26,27,32] have been proposed based on this. Optical flow estimation requires precise per-pixel localization, and it also depends on finding correspondences between two input images.…”

Section: End-to-end Stereo Matching (mentioning)

confidence: 99%

“…Tons of algorithms based on this have been proposed. These methods could roughly be categorized into two groups: 2D encoder-decoder structures [23][24][25][26][27] and regularization modules composed of 3D convolutions [28][29][30][31]. DispNetC [24] computes a correlation volume from the left and right image features (encoding) and utilizes a CNN to directly regress (decoding) disparity maps.…”

Section: Introduction (mentioning)

confidence: 99%

Review of Stereo Matching Algorithms Based on Deep Learning

Zhou, Meng, Cheng 2020

Computational Intelligence and Neuroscience


Stereo vision is a flourishing field, attracting the attention of many researchers. Recently, leveraging the development of deep learning, stereo matching algorithms have achieved remarkable performance far exceeding traditional approaches. This review presents an overview of different stereo matching algorithms based on deep learning. For convenience, we classified the algorithms into three categories: (1) non-end-to-end learning algorithms, (2) end-to-end learning algorithms, and (3) unsupervised learning algorithms. We have provided comprehensive coverage of the remarkable approaches in each category and summarized their strengths, weaknesses, and major challenges. Speed, accuracy, and time consumption were adopted to compare the different algorithms.
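The citation statement from this review describes DispNetC as computing a correlation volume from left and right image features. As a rough illustration of that idea only, the sketch below builds a correlation volume for a single scanline: for each candidate disparity d, the left feature at column x is compared with the right feature at column x - d. The function name, the list-of-lists layout, and the normalisation by channel count are assumptions made for this example, not the network's actual implementation.

```python
def correlation_volume(left, right, max_disp):
    """Correlation volume for one scanline.

    left, right: lists of per-column feature vectors (lists of floats).
    Returns volume[d][x] = mean over channels of left[x] * right[x - d],
    or 0.0 where column x - d falls outside the right image.
    """
    width = len(left)
    channels = len(left[0])
    volume = []
    for d in range(max_disp + 1):
        row = []
        for x in range(width):
            if x - d < 0:
                row.append(0.0)  # no valid match at this disparity
            else:
                dot = sum(l * r for l, r in zip(left[x], right[x - d]))
                row.append(dot / channels)
        volume.append(row)
    return volume


# Toy features: the right scanline is the left one shifted by one column,
# so the response at column 2 is strongest for disparity 1.
left_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
right_feats = [[0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
volume = correlation_volume(left_feats, right_feats, max_disp=1)
```

A decoder network would then regress disparity from such a volume; that part is not sketched here.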


“…The authors propose a feature-based matching methodology as opposed to a deep learning-based approach. The main reason for this decision is the fact that most deep learning methods demand a large amount of processing power [5][6][7]. This will be an extremely limiting factor if environments with restricted resources are considered where the resources are not abundant.…”

Section: Introduction (mentioning)

confidence: 99%

Performance analysis of a fuzzy disparity selector for stereo matching of image segments under radiometric variations

Shetty, George, Nayak et al. 2020

Turk J Elec Eng & Comp Sci


Stereo matching algorithms generate disparity maps, which contain the depth information of the environment, from two or more images of a scene taken from different viewpoints. The process of obtaining dense disparity maps is a problem which is still being actively researched. The presence of radiometric differences in the images only further complicates the stereo matching problem. In the present research work, the images are initially split into small patches of pixels, such that pixels in each patch have similar intensities. The authors attempt to study the effect of the parameters, namely, the tuning parameter 'α' and the number of segments, while the images are subjected to variations in exposure and illumination. The value 'α' performs the function of a weight signifying the contribution of each data cost, when the two data costs are combined in a linear fashion. Lastly, the results of this methodology are compared with other methods that try to tackle the problem of stereo matching under radiometric variations.
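The abstract above describes α as a weight that sets the contribution of each of two data costs when they are combined linearly. A minimal sketch of one such combination follows, assuming a convex blend α·A + (1 − α)·B with α in [0, 1]; the abstract does not give the exact formula, and the function names and toy costs are hypothetical.

```python
def combine_costs(cost_a, cost_b, alpha):
    """Blend two per-disparity data costs with weight alpha (assumed in [0, 1])."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(cost_a, cost_b)]


def best_disparity(cost_a, cost_b, alpha):
    """Index of the disparity with the lowest blended cost (winner-takes-all)."""
    combined = combine_costs(cost_a, cost_b, alpha)
    return min(range(len(combined)), key=combined.__getitem__)


# Toy costs for three candidate disparities at one pixel.
cost_a = [4.0, 1.0, 3.0]
cost_b = [0.0, 5.0, 2.0]
blended = combine_costs(cost_a, cost_b, 0.5)
```

With these toy values the selected disparity changes as α moves from 0 to 1, which is the kind of sensitivity the authors report studying.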

