2019
DOI: 10.48550/arxiv.1906.11435
Preprint

DeepVIO: Self-supervised Deep Learning of Monocular Visual Inertial Odometry using 3D Geometric Constraints

Cited by 7 publications (13 citation statements)
References 30 publications
“…Supervised learning VO methods infer the camera pose by learning directly from real image data: Flowdometry [43] casts VO as a regression problem, using FlowNet [44] to extract optical flow features and a fully connected layer to predict camera translation and rotation, while DVO [13] and ESP-VO [45] incorporate recurrent neural networks (RNNs) to implicitly model the sequential motion dynamics of image sequences. Han, L. et al. [13] presented a self-supervised deep learning network for monocular VIO; Shamwell et al. [46] presented an unsupervised deep neural network that fuses RGB-D imagery with inertial measurements for absolute trajectory estimation. Inspired by this work, we incorporated raw IMU data into a visual-odometry-based deep keypoint fusion model to regularize the camera pose and align the depth map.…”
Section: Deep Visual-inertial Odometry Learning Methods
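The excerpt above describes Flowdometry-style pose regression: optical flow features from a FlowNet encoder are mapped to a relative camera pose through fully connected layers. The following is a minimal sketch of that formulation only; it is not the published Flowdometry or DeepVIO architecture, and the feature sizes, layer widths, and the class name FlowPoseRegressor are illustrative assumptions.

```python
# Minimal sketch (not the published Flowdometry architecture): optical-flow
# features from a FlowNet-style encoder are flattened and passed through
# fully connected layers that regress a 6-DoF relative pose
# (3 translation + 3 rotation components). All sizes are illustrative.
import torch
import torch.nn as nn

class FlowPoseRegressor(nn.Module):
    def __init__(self, feat_channels=1024, feat_h=6, feat_w=20):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(feat_channels * feat_h * feat_w, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, 6),  # [tx, ty, tz, roll, pitch, yaw]
        )

    def forward(self, flow_features):
        # flow_features: (B, C, H, W) produced by a FlowNet-like encoder
        return self.fc(flow_features.flatten(start_dim=1))

# Usage: regress the relative pose between two consecutive frames.
features = torch.randn(4, 1024, 6, 20)   # placeholder encoder output
pose = FlowPoseRegressor()(features)     # (4, 6) relative poses
```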
“…Both extensions improve the baseline, and the attention module performs well. When coupled with self-supervised depth estimation, DeepVIO, trained to have consistent pose estimation, outperforms all of the state of the art, including the classical SLAM system libviso2 and the learning-based techniques Sc-SfMLearner and SfMLearner [5,13,62]. Figures 6 and 7 show the trajectory in the XY-plane.…”
Section: Pose Estimation
“…VIOLearner [49] presents an online error correction module for deep visual-inertial odometry that estimates the trajectory by fusing RGB-D images with inertial data. DeepVIO [18] recently proposed a network that fuses visual and inertial features and is trained with a dedicated loss.…”
Section: Learning-based Pose Estimation
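As a rough illustration of the fusion idea described in the excerpt above, the sketch below concatenates a visual feature vector with an LSTM summary of raw IMU readings and regresses a 6-DoF relative pose. It is not DeepVIO's actual network or loss; the IMU encoder choice, the dimensions, and the name VIFusionPose are assumptions for illustration.

```python
# Minimal sketch of visual-inertial feature fusion (illustrative, not the
# DeepVIO architecture): raw IMU readings between two frames are summarized
# by an LSTM, concatenated with a visual feature vector, and mapped to a
# 6-DoF relative pose. Feature sizes are arbitrary.
import torch
import torch.nn as nn

class VIFusionPose(nn.Module):
    def __init__(self, visual_dim=512, imu_dim=6, imu_hidden=128):
        super().__init__()
        self.imu_encoder = nn.LSTM(imu_dim, imu_hidden, batch_first=True)
        self.regressor = nn.Sequential(
            nn.Linear(visual_dim + imu_hidden, 256),
            nn.ReLU(inplace=True),
            nn.Linear(256, 6),
        )

    def forward(self, visual_feat, imu_seq):
        # visual_feat: (B, visual_dim); imu_seq: (B, T, 6) accel + gyro
        _, (h_n, _) = self.imu_encoder(imu_seq)
        fused = torch.cat([visual_feat, h_n[-1]], dim=1)
        return self.regressor(fused)

# Usage: fuse one visual feature with 100 IMU samples per frame pair.
pose = VIFusionPose()(torch.randn(2, 512), torch.randn(2, 100, 6))
```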
“…In the case of a localization problem, for example, learning from complementary sensor types such as cameras and IMUs can bring improved robustness and redundancy when sensor corruptions and failures occur. Existing methods for learning-based VIO rely on different strategies to fuse visual and inertial data, including so-called FC-fusion networks (Clark et al., 2017b; Han et al., 2019) and soft/hard feature selection networks (Chen et al., 2019; Almalioglu et al., 2019). Broadly speaking, these strategies typically learn additive interactions for integrating multiple streams of information, i.e.…”
Section: Introduction
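To make the contrast between FC-fusion and soft feature selection concrete, the sketch below applies a learned sigmoid mask to the concatenated visual and inertial features, re-weighting channels instead of passing the plain concatenation forward. This is only a hedged approximation of the soft-selection strategy mentioned in the excerpt; the masking details in the cited works differ, and the name SoftFusion and all sizes are illustrative.

```python
# Minimal sketch of "soft" feature selection fusion, as contrasted with plain
# FC-fusion (concatenation) in the excerpt. Illustrative only: a learned
# sigmoid mask re-weights each channel of the concatenated visual and
# inertial features before they are passed on.
import torch
import torch.nn as nn

class SoftFusion(nn.Module):
    def __init__(self, visual_dim=512, inertial_dim=128):
        super().__init__()
        total = visual_dim + inertial_dim
        # The mask is conditioned on both modalities and gates each channel.
        self.mask_net = nn.Sequential(nn.Linear(total, total), nn.Sigmoid())

    def forward(self, visual_feat, inertial_feat):
        combined = torch.cat([visual_feat, inertial_feat], dim=1)
        return combined * self.mask_net(combined)  # soft re-weighting

# Usage: fuse a 512-d visual feature with a 128-d inertial feature.
fused = SoftFusion()(torch.randn(2, 512), torch.randn(2, 128))  # (2, 640)
```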