2021
DOI: 10.48550/arxiv.2105.14520
Preprint

Unsupervised Joint Learning of Depth, Optical Flow, Ego-motion from Video

Jianfeng Li, Junqiao Zhao, Shuangfu Song, et al.

Abstract: Estimating geometric elements such as depth, camera motion, and optical flow from images is an important part of a robot's visual perception. We use a joint self-supervised method to estimate these three geometric elements. The depth network, optical flow network, and camera motion network are independent of each other but are jointly optimized during the training phase. Compared with independent training, joint training can make full use of the geometric relationships between the elements and provide dynamic and s…
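The geometric relationship the abstract alludes to can be made concrete: given a depth map and the camera's ego-motion, the flow of every static pixel is fully determined, so the optical-flow network can be checked against this "rigid flow". The sketch below illustrates that idea only; the function names, NumPy implementation, and the plain L1 penalty are assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

def rigid_flow(depth, K, R, t):
    """Flow induced on a static scene by camera motion (R, t).

    Back-project each pixel with its depth, apply the camera motion,
    re-project with intrinsics K, and return the pixel displacement.
    """
    h, w = depth.shape
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T  # 3 x N homogeneous pixels
    cam = np.linalg.inv(K) @ pix * depth.reshape(1, -1)  # 3 x N 3-D points in camera frame
    cam2 = R @ cam + t.reshape(3, 1)                     # points after camera motion
    pix2 = K @ cam2
    pix2 = pix2[:2] / pix2[2:3]                          # perspective re-projection
    return (pix2 - pix[:2]).T.reshape(h, w, 2)

def flow_consistency_loss(flow_pred, depth, K, R, t):
    """L1 gap between predicted optical flow and the rigid flow implied
    by depth + ego-motion; zero wherever the scene is truly static."""
    return np.abs(flow_pred - rigid_flow(depth, K, R, t)).mean()
```

With identity intrinsics, unit depth, and a pure x-translation, every pixel shifts by exactly one unit in x, so a zero flow prediction incurs a mean L1 loss of 0.5 over the two flow channels.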

Cited by 2 publications (3 citation statements)
References 32 publications
“…In the context of monocular camera data, the lack of direct depth ground truth for comparison has necessitated the use of a 2D-plane-based loss function in previous research. This challenge is further exacerbated by the necessity to incorporate the learning of the camera pose algorithm, complicating the application of specialized loss functions focused on features like lines, surfaces, or vanishing points, as well as those designed for generative models [64][65][66].…”
Section: Self-supervised Loss Functions Including Depth Consistency
confidence: 99%
“…Contrary to two-stage learning methods or transfer learning utilized in some previous studies, our research is dedicated to a purely self-supervised learning approach. Through experimentation, we have found that the implementation of the geometric consistency constraint loss function, as adopted in recent camera pose prediction research [17,64,65,70,72], significantly improves prediction performance. The reduction in loss through geometric consistency not only addresses the occlusion issue but also bolsters camera pose prediction.…”
Section: Camera Pose Estimation
confidence: 99%
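The geometric consistency constraint this statement credits with handling occlusion is commonly formulated (e.g. in SC-SfMLearner-style methods) as a normalized depth disagreement: the depth of one frame, projected into the other, should match that frame's own depth, and pixels where it does not are down-weighted as likely occlusions. A minimal sketch of that idea, with illustrative names and under the assumption that the projected and interpolated depth maps are already computed:

```python
import numpy as np

def geometric_consistency(d_proj, d_interp, eps=1e-7):
    """Per-pixel depth inconsistency between frame A's depth projected
    into frame B (d_proj) and frame B's own depth sampled at the
    warped locations (d_interp).

    Returns the mean inconsistency (usable as a loss term) and a
    per-pixel weight mask that suppresses inconsistent pixels, which
    typically correspond to occlusions or moving objects.
    """
    diff = np.abs(d_proj - d_interp) / (d_proj + d_interp + eps)  # in [0, 1)
    mask = 1.0 - diff  # high weight where the two depths agree
    return diff.mean(), mask
```

Because the difference is normalized by the sum of the two depths, the term is scale-invariant, which matters for monocular depth networks whose absolute scale is undetermined.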
“…without recognizing their relevance. Some methods [5,3,17,15] train motion-semantics networks in a multi-task manner, using loss functions that may contradict each other, which may drop the performance.…”
Section: Introduction
confidence: 99%