This paper addresses the weak adaptability of current monocular depth estimation algorithms to viewpoint (angle) transformations: these CNN-based algorithms lack estimation accuracy and robustness. The paper proposes a lightweight network based on convolution and capsule feature fusion (CNNapsule). First, it introduces a fusion block module that integrates CNN features and matrix capsule features to improve the network's adaptability to perspective transformations. The fused and deconvolved features are then combined through skip connections to generate the depth image. In addition, a loss function is designed according to the long-tail distribution, gradient similarity, and structural similarity of the datasets. Finally, comparisons on the NYU Depth V2 and KITTI datasets show that the proposed method achieves better accuracy on the C1 and C2 indices and a better visual effect than traditional methods and deep learning methods without transfer learning, while requiring 65% fewer trainable parameters than methods presented in the literature. The generalization of the method is verified through comparative tests on data collected from the internet and mobile phones.
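The abstract's composite loss combines three ingredients: a term that compensates for the long-tail depth distribution, a gradient-similarity term, and a structural-similarity term. A minimal sketch of such a loss is shown below; the function name, weights, and the specific choice of a log-space L1 term for the long-tail compensation are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def depth_loss(pred, gt, w_grad=1.0, w_struct=1.0, eps=1e-6):
    """Illustrative composite depth loss (hypothetical weights and terms):
    log-depth error + gradient similarity + an SSIM-style structural term."""
    # Log-space L1: compresses large depths so the rare, distant pixels of a
    # long-tailed depth distribution do not dominate the loss.
    log_diff = np.log(pred + eps) - np.log(gt + eps)
    l_depth = np.mean(np.abs(log_diff))

    # Gradient similarity: penalize differences between the vertical and
    # horizontal depth gradients to preserve depth edges.
    dpy, dpx = np.gradient(pred)
    dgy, dgx = np.gradient(gt)
    l_grad = np.mean(np.abs(dpx - dgx)) + np.mean(np.abs(dpy - dgy))

    # Structural term: a global SSIM-style comparison of means and variances.
    mu_p, mu_g = pred.mean(), gt.mean()
    var_p, var_g = pred.var(), gt.var()
    cov = ((pred - mu_p) * (gt - mu_g)).mean()
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    ssim = ((2 * mu_p * mu_g + c1) * (2 * cov + c2)) / (
        (mu_p ** 2 + mu_g ** 2 + c1) * (var_p + var_g + c2))
    l_struct = (1.0 - ssim) / 2.0

    return l_depth + w_grad * l_grad + w_struct * l_struct
```

With identical prediction and ground truth, all three terms vanish, so the loss is zero; any depth, edge, or structure mismatch increases it.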
Objective Obtaining scene depth is crucial for 3D reconstruction, autonomous driving, and related tasks. Current methods based on lidar or time-of-flight (ToF) cameras are not widely applicable because of their high cost. In contrast, inferring scene depth from a single RGB image is more cost-effective and has broader potential for applications. Inspired by the recent success of deep learning on various ill-posed problems, many researchers adopt convolutional neural networks to estimate reasonable and accurate monocular depths. However, most existing deep-learning studies focus on enhancing the feature extraction capability of the network and pay little attention to the distribution of image depths. Estimating the pixel distributions of images can not only improve inference precision but also make the reconstructed 3D images more consistent with the ground truth. Therefore, we propose a new adaptive depth distribution module, which allows the model to predict a different depth distribution for each image during training. Methods The NYU Depth V2 dataset created by New York University is employed. Overall, our model is built on an encoder-decoder structure with skip connections, which has been shown to guide image generation more effectively. An indirect representation of depth maps based on plane coefficients is also introduced to implicitly add a plane constraint to the depth estimation and obtain smoother results in the planar regions of a scene. Specifically, two subnetworks with different lightweight designs are adopted at the bottleneck and the other upsampling stages of the network to enhance the model's feature extraction capability.
In addition, an adaptive depth distribution estimation module is designed to estimate a different depth distribution for each input image, which brings the pixel distribution of the predicted depth maps closer to the ground truth. A two-stage training strategy is employed: in the first stage, we load weights pretrained on ImageNet into the backbone network and optimize the model with a loss function at the 2D level only; in the second stage, we perform joint training with loss functions at both the 2D and 3D levels. Results and Discussions Our study employs multiple metrics, including root mean square error (RMSE), relative error (REL), and intersection over union (IoU), to quantitatively evaluate the inference ability of the proposed model. As shown in Table 1, the proposed lightweight network outperforms most of the listed methods with only 46 M parameters, which shows that the overall structure of the model is concise and effective. The visual comparison of 3D depth reconstructions (Fig. 5) demonstrates that the proposed network outputs smoother and more continuous depth predictions in planar regions, as well as reasonable predictions in partially occluded or missing areas of those regions. In terms of depth distribution, the carefully designed adap...
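For reference, the standard definitions behind the RMSE and REL metrics cited above (plus the commonly reported δ-threshold accuracy) can be sketched as follows; the function name is illustrative, and the paper's exact evaluation protocol (depth caps, crops, valid-pixel masks) may differ.

```python
import numpy as np

def depth_metrics(pred, gt):
    """Standard monocular-depth evaluation metrics over valid pixels:
    RMSE, mean absolute relative error, and delta_1 threshold accuracy."""
    # Root mean square error in metric depth.
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    # Mean absolute relative error, normalized by the ground-truth depth.
    rel = np.mean(np.abs(pred - gt) / gt)
    # Threshold accuracy: fraction of pixels with max(pred/gt, gt/pred) < 1.25.
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = np.mean(ratio < 1.25)
    return rmse, rel, delta1
```

A perfect prediction yields RMSE = 0, REL = 0, and δ₁ = 1; doubling every depth yields REL = 1 and δ₁ = 0, since every pixel's ratio is 2.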
Objective Vision-based depth estimation is an important research direction in computer vision and is of great significance for three-dimensional (3D) reconstruction, semantic segmentation, navigation, and related tasks. Monocular depth estimation has the advantages of low cost and easy installation, which binocular stereo vision and lidar cannot offer, and it has received increasing attention in recent years. There is a strong correlation between the out-of-
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.