2022
DOI: 10.1109/tits.2020.3010418

Unsupervised Learning of Depth, Optical Flow and Pose With Occlusion From 3D Geometry

Abstract: Optical flow estimation is a fundamental problem of computer vision and has many applications in the fields of robot learning and autonomous driving. This paper reveals novel geometric laws of optical flow based on the insight and detailed definition of non-occlusion. Then, two novel loss functions are proposed for the unsupervised learning of optical flow based on the geometric laws of non-occlusion. Specifically, after the occlusion part of the images are masked, the flowing process of pixels is carefully co…

Cited by 48 publications (18 citation statements)
References 53 publications
“…On both monocular depth estimation and monocular visual odometry by the unsupervised learning of depth and pose, our method outperforms recent state-of-the-art methods. Besides, our method does not need auxiliary tasks, such as optical flow estimation [9]-[12], [26], [39], semantic segmentation [39] or dynamic mask estimation [11], [26], normal estimation [8]. This paper realizes end-to-end iterative view synthesis and pose refinement to jointly optimize the pose and depth estimation networks, allowing the overall parameters to self-learn according to the optimization objective.…”
Section: B Evaluation Results
type: mentioning
confidence: 99%
“…parameters from the video to make the depth prediction more accurate, and use the predicted depth map directly to deal with occlusion. DOP [12] divides an image into three regions, static regions, dynamic regions, and occlusion regions by the information of adjacent frames. They solve the occlusion problem by explicit geometric calculation using the predicted point cloud.…”
Section: Self-supervised From Monocular Video
type: mentioning
confidence: 99%
“…Works that employ these algorithmic networks are trained with images and, in the case of supervised learning, with ground truth depth data. The trained network can then be used to estimate depths from unknown 2D images, so called monocular- or single-view depth [7][8][9][10][11].…”
Section: Introduction
type: mentioning
confidence: 99%
“…Recently, some works (Liu et al, 2019a; Wang et al, 2021d; Puy et al, 2020; Li et al, 2021b,a; Wang et al, 2021a) have been done to realize supervised estimation of 3D scene flow from two consecutive frames of point clouds. However, just like it is difficult to obtain the ground truth of optical flow (Wang et al, 2021c, 2020b), the ground truth of 3D scene flow is also difficult to obtain. Therefore, it is essential to perform unsupervised learning of 3D scene flow.…”
Section: Introduction
type: mentioning
confidence: 99%