SegStereo: Exploiting Semantic Information for Disparity Estimation

Yang, Gang; Zhao, Hengshuang; Shi, Jianping; Deng, Zhidong; Jia, Jiaya

doi:10.1007/978-3-030-01234-2_39

Cited by 305 publications

(197 citation statements)

References 44 publications

Supporting

Mentioning

196

Contrasting


“…Ladicky et al [15] estimate the depth based on different canonical views and show that semantic knowledge helps to improve the prediction. This is verified for stereo-matching methods by [26,31], too. In this paper, we empirically prove that this concept also holds for depth estimation from a single monocular image with a CNN.…”

Section: Related Work (supporting)

confidence: 58%

SDNet: Semantically Guided Depth Estimation Network

Ochs

Kretz

Mester

2019

Lecture Notes in Computer Science


Autonomous vehicles and robots require a full scene understanding of the environment to interact with it. Such a perception typically incorporates pixel-wise knowledge of the depths and semantic labels for each image from a video sensor. Recent learning-based methods estimate both types of information independently using two separate CNNs. In this paper, we propose a model that is able to predict both outputs simultaneously, which leads to improved results and even reduced computational costs compared to independent estimation of depth and semantics. We also empirically prove that the CNN is capable of learning more meaningful and semantically richer features. Furthermore, our SD-Net estimates the depth based on ordinal classification. On the basis of these two enhancements, our proposed method achieves state-of-the-art results in semantic segmentation and depth estimation from single monocular input images on two challenging datasets.


“…When applied to down-scaled images, these methods run faster, but give blurry results and inaccurate disparity estimates for the far-field. Recent "deep" stereo methods perform well on low-resolution benchmarks [5,11,16,21,38], while failing to produce SOTA results on high-res benchmarks [26]. This is likely due to: 1) Their architectures are not efficiently designed to operate on high-resolution images.…”

Section: Introduction (mentioning)

confidence: 99%

Hierarchical Deep Stereo Matching on High-Resolution Images

Yang

Manela

Happold

et al. 2019

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)


Figure 1: Illustration of on-demand depth sensing with a coarse-to-fine hierarchy on the proposed dataset. Our method (HSM) captures the coarse layout of the scene in 91 milliseconds, finds the far-away car (shown in the red box) in 175 ms, and recovers the details of the car given extra 255 ms.

Abstract: We explore the problem of real-time stereo matching on high-res imagery. Many state-of-the-art (SOTA) methods struggle to process high-res imagery because of memory constraints or speed limitations. To address this issue, we propose an end-to-end framework that searches for correspondences incrementally over a coarse-to-fine hierarchy. Because high-res stereo datasets are relatively rare, we introduce a dataset with high-res stereo pairs for both training and evaluation. Our approach achieved SOTA performance on Middlebury-v3 and KITTI-15 while running significantly faster than its competitors. The hierarchical design also naturally allows for anytime on-demand reports of disparity by capping intermediate coarse results, allowing us to accurately predict disparity for near-range structures with low latency (30 ms). We demonstrate that the performance-vs-speed tradeoff afforded by on-demand hierarchies may address sensing needs for time-critical applications such as autonomous driving.


“…On the KITTI 2012 dataset, "Noc" means non-occluded regions and "All" means all regions. Notice that we perform comparably to SegStereo [27] on KITTI 2015 but considerably better on the KITTI 2012 dataset.…”

Section: Results (mentioning)

confidence: 80%

StereoDRNet: Dilated Residual StereoNet

Chabra

Straub

Sweeney

et al. 2019

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)


Figure 1: StereoDRNet enables estimation of high-quality depth maps, which opens the door to high-quality reconstruction by passive stereo video. In this figure we compare the output from dense reconstruction [15] built from depth maps generated by StereoDRNet, PSMNet [2] and a structured light system [25] (termed Ground Truth). We report and visualize point-to-plane distance RMS error on the reconstructed meshes with respect to the ground truth, demonstrating the improvement in reconstruction over the state of the art.

Abstract: We propose a system that uses a convolutional neural network (CNN) to estimate depth from a stereo pair, followed by volumetric fusion of the predicted depth maps to produce a 3D reconstruction of a scene. Our proposed depth refinement architecture predicts view-consistent disparity and occlusion maps that help the fusion system produce geometrically consistent reconstructions. We utilize 3D dilated convolutions in our proposed cost filtering network, which yields better filtering while almost halving the computational cost in comparison to state-of-the-art cost filtering architectures. For feature extraction we use the Vortex Pooling architecture [26]. The proposed method achieves state-of-the-art results on the KITTI 2012, KITTI 2015 and ETH 3D stereo benchmarks. Finally, we demonstrate that our system is able to produce high-fidelity 3D scene reconstructions that outperform the state-of-the-art stereo system.
