2018
DOI: 10.1007/978-3-030-01249-6_43

Supervising the New with the Old: Learning SFM from SFM

Abstract: Recent work has demonstrated that it is possible to learn deep neural networks for monocular depth and ego-motion estimation from unlabelled video sequences, an interesting theoretical development with numerous advantages in applications. In this paper, we propose a number of improvements to these approaches. First, since such self-supervised approaches are based on the brightness constancy assumption, which is valid only for a subset of pixels, we propose a probabilistic learning formulation where the network …

Citations: cited by 136 publications (108 citation statements)
References: 26 publications (46 reference statements)

“…This baseline is similar to the additional supervision from SLAM found in [17,35]. Similarly, Zhu et al [43] add a supervised loss [1] to solve for optical flow and Kuznietsov et al [18] add a supervised loss for depth estimation from LiDAR.…”
Section: Baseline Loss Functions (mentioning)
confidence: 95%
“…Klodt and Vedaldi [17] use sparse depths and poses from a traditional SLAM system as a supervisory signal to train depth and pose prediction networks. They train from monocular videos (in contrast to [35]), which requires special consideration of scale, and modeling of uncertainty in the depth and poses.…”
Section: Additional Supervision (mentioning)
confidence: 99%
“…Minimizing the epipolar and re-projection errors of all matches using CNNs mimics the non-linear pose estimation [3]. The experiment shows that this weak supervisory signal significantly improves the pose estimation and is superior to other SfM supervisions such as [24].…”
Section: Learning From Indirect Methods (mentioning)
confidence: 91%
“…Since both Klodt et al [24] and ours use self-supervised weak supervisions, we redo the experiments in [24] that use self-generated poses and sparse depth maps from ORB-…”
Section: Depth Estimation (mentioning)
confidence: 99%
“…The predicted unreliable matches are prevented from being utilized during joint learning process to improve the robustness of our model against possible occlusions or ambiguous matches. Unlike existing methods [23,22,19,12] where the uncertainty map is inferred from an input image, our uncertainty module leverages the matching score volume $C_{st}$ to provide more informative cues, as in the approaches for confidence estimation in stereo matching [17]. Concretely, a series of convolutional layers with parameters $W_C$ are applied to predict the uncertainty map $\sigma$ from matching similarity scores $C_{st}$ such that $\sigma = F(C_{st}; W_C) \in \mathbb{R}^{H \times W \times 1}$.…”
Section: Feature Extraction and Similarity Score Computation (mentioning)
confidence: 99%