2021
DOI: 10.48550/arxiv.2108.09770
Preprint

MobileStereoNet: Towards Lightweight Deep Networks for Stereo Matching

Abstract: Recent methods in stereo matching have continuously improved the accuracy using deep models. This gain, however, is attained with a high increase in computation cost, such that the network may not fit even on a moderate GPU. This issue raises problems when the model needs to be deployed on resource-limited devices. For this, we propose two light models for stereo vision with reduced complexity and without sacrificing accuracy. Depending on the dimension of cost volume, we design a 2D and a 3D model with encode…

Citations: cited by 3 publications (12 citation statements)
References: references 28 publications (72 reference statements)
“…Xing et al. [8] proposed an adjust multi-branch module, which combines depth-wise convolution to reduce the number of channels. The main contribution of MobileStereoNet [9] is to reform the cost volume with convolution and to use 2D convolution only for the disparity regression. However, the final FLOPs and latency are still too large and far from real-time.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
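
The statement above refers to a cost volume and to disparity regression. As a rough illustration only, the sketch below builds correlation-style matching scores along one scanline and regresses a disparity as a softmax-weighted average (often called soft-argmin when applied to negated costs). This is a generic technique, not the MobileStereoNet architecture; the function names and the toy one-hot features are assumptions made for this example.

```python
import math

def correlation_scores(left_feat, right_feat, x, max_disp):
    """Similarity between left pixel x and right pixel x - d for each candidate d.

    left_feat / right_feat are lists of per-pixel feature vectors on one scanline
    (hypothetical inputs for this sketch).
    """
    scores = []
    for d in range(max_disp):
        xr = x - d
        if xr < 0:
            # Candidate falls outside the right image: mark it as invalid.
            scores.append(float("-inf"))
        else:
            scores.append(sum(a * b for a, b in zip(left_feat[x], right_feat[xr])))
    return scores

def regress_disparity(scores):
    """Softmax over candidate scores, then the expected disparity index."""
    m = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return sum(d * w for d, w in enumerate(weights)) / total

# Toy scanline: the left view is the right view shifted by 2 pixels.
width, true_disp = 6, 2
right = [[10.0 if i == j else 0.0 for j in range(width)] for i in range(width)]
left = [right[x - true_disp] if x >= true_disp else [0.0] * width
        for x in range(width)]

estimate = regress_disparity(correlation_scores(left, right, x=4, max_disp=4))
print(round(estimate, 3))
```

Because the regression is a weighted average rather than a hard argmax, it is differentiable and can return sub-pixel values when several candidates have similar scores.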
“…With the goal of achieving real-time processing on edge devices (GPU/NPU), many lightweight methods have been proposed [7]-[12]. Those methods can be roughly divided into two categories: multi-stage method [8], [10], [13] and model compression method [7], [9], [11]. The computational complexity of the network depends on two factors: the size of the feature map and the number of convolution kernels.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
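
The two factors named in the statement above can be made concrete with the standard multiply-accumulate (MAC) count for a convolution layer. The sketch below is a back-of-the-envelope calculation under simplifying assumptions (stride 1, "same" padding, bias ignored); the layer sizes are illustrative and are not taken from any of the cited papers.

```python
def conv2d_macs(height, width, c_in, c_out, k):
    """MACs of a standard k x k convolution over a height x width output map."""
    return height * width * c_in * c_out * k * k

def depthwise_separable_macs(height, width, c_in, c_out, k):
    """MACs of a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = height * width * c_in * k * k
    pointwise = height * width * c_in * c_out
    return depthwise + pointwise

# Illustrative layer: 64 x 128 feature map, 32 -> 32 channels, 3 x 3 kernels.
standard = conv2d_macs(64, 128, 32, 32, 3)
separable = depthwise_separable_macs(64, 128, 32, 32, 3)
half_res = conv2d_macs(32, 64, 32, 32, 3)

print(standard)   # 75497472
print(separable)  # 10747904
print(half_res)   # 18874368
```

Halving the feature-map resolution in both dimensions cuts the cost by a factor of four, while swapping the standard convolution for a depthwise-separable one cuts it by roughly a factor of seven for this layer.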
“…With the rise of deep learning, stereo matching continued to be reformed by these modern techniques. Following the general paradigm for stereo reconstruction, deep models can be divided into two categories: the methods that formulate one or some of the steps with a deep learning framework [2,28,37], and the approaches that transfer the full process in an end-to-end scheme [4,12,19,21,29,39]. Following the recent research, our model is also an end-to-end one, processing a 3-tuple sample.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…The cell-phone of the 90s was a phone, the modern cellphone is a handheld computational imaging platform [9] that is capable of acquiring high-quality images, pose, and depth. Recent years have witnessed explosive advances in passive depth imaging, from single-image methods that leverage large data priors to predict structure directly from image features [39,40] to efficient multi-view approaches grounded in principles of 3D geometry and epipolar projection [49,46]. Alongside, progress has been made in the miniaturization and cost-reduction [3] of active depth systems such as LiDAR and correlation time-of-flight sensors [29].…”
Section: Introduction
Citation type: mentioning
confidence: 99%