2018 IEEE International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2018.8460564

Driven to Distraction: Self-Supervised Distractor Learning for Robust Monocular Visual Odometry in Urban Environments

Abstract: We present a self-supervised approach to ignoring "distractors" in camera images for the purposes of robustly estimating vehicle motion in cluttered urban environments. We leverage offline multi-session mapping approaches to automatically generate a per-pixel ephemerality mask and depth map for each input image, which we use to train a deep convolutional network. At run-time we use the predicted ephemerality and depth as an input to a monocular visual odometry (VO) pipeline, using either sparse features or den…
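The abstract describes feeding a predicted per-pixel ephemerality mask into a VO front-end. A minimal sketch of that idea, assuming (as a hypothetical illustration, not the paper's exact weighting) that feature matches landing on high-ephemerality pixels are simply rejected before pose estimation:

```python
import numpy as np

def filter_matches_by_ephemerality(matches, ephemerality, threshold=0.5):
    """Reject feature matches that land on likely-ephemeral pixels.

    matches:      (N, 2) integer pixel coordinates (x, y) in the current image
    ephemerality: (H, W) per-pixel mask in [0, 1]; 1 = likely a distractor
    threshold:    hypothetical hard cutoff; the paper may instead weight
                  residuals softly by the predicted ephemerality
    """
    xs, ys = matches[:, 0], matches[:, 1]
    scores = ephemerality[ys, xs]       # look up mask value under each match
    return matches[scores < threshold]  # keep only matches on static pixels

# Toy example: a 4x4 mask whose top-left quadrant is "ephemeral"
mask = np.zeros((4, 4))
mask[:2, :2] = 1.0
pts = np.array([[0, 0], [3, 3], [1, 1], [2, 0]])
kept = filter_matches_by_ephemerality(pts, mask)  # → [[3, 3], [2, 0]]
```

The surviving matches would then be passed to the usual sparse or dense pose estimator.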

Cited by 63 publications (54 citation statements)
References 34 publications
“…However, these approaches commonly handle non-rigid objects within a relatively static background, which is outside the main scope of this paper. Most recently, Barnes et al [78] show that explicitly modeling moving things with a 3D prior map can avoid visual odometry drift. We also consider moving object segmentation, in an unsupervised setting with videos only.…”
Section: Related Work
confidence: 99%
“…Whereas single-view depth estimation and multi-view geometry are mostly taken as individual tasks, a few works combine both. The single-view depth estimation can be useful for scale estimation in monocular visual odometry [2,69,71] or fused with SfM-based depth estimates in static environments [12,54,71]. Kumar et al [37] used singleview depth estimation for depth initialization in a multibody or non-rigid SfM-based approach similar to [36].…”
Section: Related Work
confidence: 99%
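The excerpt above notes that single-view depth estimation can supply the metric scale that monocular VO lacks. A minimal sketch of one common way to do this (a robust median-ratio alignment between triangulated and predicted depths at sparse points; an assumed heuristic, not necessarily the cited works' exact scheme):

```python
import numpy as np

def recover_scale(triangulated_depths, predicted_depths):
    """Estimate a metric scale factor for an up-to-scale monocular
    reconstruction by comparing it to single-view network depth
    predictions at the same sparse points. The median of the
    per-point ratios is robust to outlier depths."""
    ratios = predicted_depths / triangulated_depths
    return np.median(ratios)

# Up-to-scale VO depths whose true metric scale is 2.5
vo_depths = np.array([1.0, 2.0, 4.0, 8.0])
net_depths = 2.5 * vo_depths
scale = recover_scale(vo_depths, net_depths)  # → 2.5
```

Multiplying the estimated trajectory by this factor would yield a metrically scaled result.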
“…The key idea applied here is to integrate single-view depth information to provide the metric scale. In contrast to [2,69,71], we apply this idea additionally for scale-aware pose estimation of moving objects. First, object instances in the images I 0 and I 1 detected by a Mask R-CNN [23] (implementation of [63]) are paired based on sparse flow correspondences (p i 0 , p i 1 ) [18] using a simple voting scheme.…”
Section: Monocular Scene Flow
confidence: 99%
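The excerpt above pairs Mask R-CNN instances across two frames by a voting scheme over sparse flow correspondences. A minimal sketch of such a scheme (names and data layout are assumptions for illustration): each correspondence whose endpoints fall inside an instance mask in each frame casts a vote for that instance pair.

```python
import numpy as np

def pair_instances(masks0, masks1, pts0, pts1):
    """Pair instance masks across frames I0 and I1 by voting: each sparse
    flow correspondence (p0, p1) votes for the pair (i, j) of instances
    whose masks contain its endpoints. Hypothetical helper illustrating
    the voting idea; points are (x, y), masks are boolean (H, W) arrays."""
    votes = np.zeros((len(masks0), len(masks1)), dtype=int)
    for (x0, y0), (x1, y1) in zip(pts0, pts1):
        for i, m0 in enumerate(masks0):
            if not m0[y0, x0]:
                continue
            for j, m1 in enumerate(masks1):
                if m1[y1, x1]:
                    votes[i, j] += 1
    # Each frame-0 instance pairs with its highest-voted frame-1 instance
    return {i: int(np.argmax(votes[i]))
            for i in range(len(masks0)) if votes[i].max() > 0}

# Toy example: left/right half-image instances in both frames
m_left = np.zeros((4, 4), dtype=bool);  m_left[:, :2] = True
m_right = np.zeros((4, 4), dtype=bool); m_right[:, 2:] = True
pts0 = [(0, 0), (1, 1), (3, 0), (2, 3)]
pts1 = [(1, 0), (0, 1), (3, 1), (3, 3)]
pairing = pair_instances([m_left, m_right], [m_left, m_right], pts0, pts1)
# → {0: 0, 1: 1}
```

The paired instances could then each be handed to a scale-aware per-object pose estimator, as the excerpt describes.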
“…Deep learning can be used to improve the accuracy of VO by directly influencing the precision of the keypoint detector. In Barnes, Maddern, Pascoe, and Posner (), a deep neural network has been trained to learn keypoint distractors in monocular VO. The so-called learned ephemerality mask acts as a rejection scheme for keypoint outliers which might decrease the accuracy of vehicle localization.…”
Section: Deep Learning for Driving Scene Perception and Localization
confidence: 99%