2017 IEEE International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2017.7989236

DeepVO: Towards end-to-end visual odometry with deep Recurrent Convolutional Neural Networks

Abstract: This paper studies the monocular visual odometry (VO) problem. Most existing VO algorithms are developed under a standard pipeline including feature extraction, feature matching, motion estimation, and local optimisation. Although some of them have demonstrated superior performance, they usually need to be carefully designed and specifically fine-tuned to work well in different environments. Some prior knowledge is also required to recover an absolute scale for monocular VO. This paper presents a novel end-to-…
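The abstract describes replacing the classic feature-based pipeline with a recurrent convolutional network trained end-to-end. As a rough illustration of that idea (this is not the authors' released implementation; the encoder depth, hidden size, frame-pair stacking and 6-DoF output parameterisation are assumptions for illustration), a DeepVO-style model might be sketched as follows:

```python
# Minimal sketch of a DeepVO-style recurrent convolutional network (PyTorch).
# NOT the authors' implementation; layer sizes and the FlowNet-like encoder
# are assumptions used only to illustrate the CNN + LSTM structure.
import torch
import torch.nn as nn

class RCNNVO(nn.Module):
    def __init__(self, hidden_size=1000):
        super().__init__()
        # CNN encoder applied to two consecutive frames stacked on the
        # channel axis (2 x RGB = 6 channels).
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 64, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(64, 128, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        # Recurrent part models temporal dependencies across the sequence.
        self.rnn = nn.LSTM(256 * 4 * 4, hidden_size, num_layers=2, batch_first=True)
        # Regress a 6-DoF relative pose (translation + rotation) per step.
        self.head = nn.Linear(hidden_size, 6)

    def forward(self, frames):               # frames: (B, T, 3, H, W)
        pairs = torch.cat([frames[:, :-1], frames[:, 1:]], dim=2)  # (B, T-1, 6, H, W)
        b, t = pairs.shape[:2]
        feats = self.encoder(pairs.flatten(0, 1)).flatten(1)       # (B*(T-1), F)
        out, _ = self.rnn(feats.view(b, t, -1))                    # (B, T-1, hidden)
        return self.head(out)                                      # (B, T-1, 6)

# Toy usage; real KITTI frames are larger than this dummy input.
poses = RCNNVO()(torch.randn(2, 5, 3, 64, 192))
print(poses.shape)  # torch.Size([2, 4, 6])
```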

Cited by 687 publications (603 citation statements)
References 27 publications
“…Sequence 10

Model            t_rel   r_rel   t_rel   r_rel
Two-stream [16]  0.0554  0.0830  0.0870  0.1592
ResNet18 [10]    0.1094  0.0602  0.1443  0.1327
DeepVO [20]      0.2157  0.0709  0.2153  0.3311
PointNet [17]    0.0946  0.0442  0.1381  0.1360
PointGrid [12]   0.0550  0.0690  0.0842  0.1523
DeepPCO (Ours)   0.0263  0.0305  0.0247  0.0659…”
Section: Sequence 04 (mentioning)
confidence: 99%
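The t_rel and r_rel figures in tables like this are relative translational and rotational errors on KITTI-style trajectories. As a hedged illustration (the official KITTI benchmark averages errors over sub-sequences of 100-800 m; the per-frame version below is a simplified approximation, and the function name is ours), the metrics can be approximated as:

```python
# Simplified sketch of relative translational / rotational error metrics
# (t_rel, r_rel). The official KITTI evaluation averages over sub-sequences
# of fixed lengths; this per-frame version is illustrative only.
import numpy as np

def relative_errors(pred_rel, gt_rel):
    """pred_rel, gt_rel: lists of 4x4 relative pose matrices (frame i -> i+1)."""
    t_err, r_err, dist = 0.0, 0.0, 0.0
    for P, G in zip(pred_rel, gt_rel):
        E = np.linalg.inv(G) @ P                    # residual transform
        t_err += np.linalg.norm(E[:3, 3])           # translation residual (m)
        cos = (np.trace(E[:3, :3]) - 1.0) / 2.0     # rotation residual (rad)
        r_err += np.arccos(np.clip(cos, -1.0, 1.0))
        dist += np.linalg.norm(G[:3, 3])            # ground-truth path length
    return t_err / dist, r_err / dist               # errors per metre travelled

# Toy usage: 1 m forward per frame, one step with 5 cm lateral drift.
G = [np.eye(4) for _ in range(10)]
for g in G:
    g[0, 3] = 1.0
P = [g.copy() for g in G]
P[0][1, 3] = 0.05
print(relative_errors(P, G))  # (0.005, 0.0)
```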
“…The key concept of our architecture is inspired by [5], which uses an LSTM for action recognition in videos. A similar approach was also used in [32] to predict a full 6-DoF camera pose. In contrast to the previous approaches, we also evaluate a bidirectional LSTM version (see Figure 3).…”
Section: Layer (mentioning)
confidence: 99%
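This statement describes feeding per-frame features to an LSTM, additionally evaluated in a bidirectional variant, to regress a full 6-DoF camera pose. A minimal sketch of such a bidirectional-LSTM pose head, assuming PyTorch and with feature and hidden sizes chosen for illustration rather than taken from the cited paper, could look like:

```python
# Hedged sketch of a bidirectional-LSTM pose head: per-frame CNN features are
# processed in both temporal directions before regressing a 6-DoF pose.
# Feature and hidden sizes are assumptions, not the cited paper's values.
import torch
import torch.nn as nn

class BiLSTMPoseHead(nn.Module):
    def __init__(self, feat_dim=512, hidden=256):
        super().__init__()
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, 6)   # forward + backward states -> 6-DoF pose

    def forward(self, feats):                # feats: (B, T, feat_dim) per-frame features
        out, _ = self.rnn(feats)             # (B, T, 2*hidden)
        return self.fc(out)                  # (B, T, 6) per-frame pose estimates

head = BiLSTMPoseHead()
print(head(torch.randn(1, 8, 512)).shape)   # torch.Size([1, 8, 6])
```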
“…We aim at utilizing multiple observations from a sequence to reduce the ambiguity of a single image. Since images are high-dimensional data with much redundant information, learning from raw data is ineffective [34]. On the other hand, LSTMs cannot preserve long-term knowledge [29].…”
Section: Content-augmented Pose Estimation (mentioning)
confidence: 99%