2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00697
Unsupervised High-Resolution Depth Learning From Videos With Dual Networks

Abstract: Unsupervised depth learning takes the appearance difference between a target view and a view synthesized from its adjacent frame as the supervisory signal. Since the supervisory signal comes only from the images themselves, the resolution of the training data significantly impacts performance. High-resolution images contain more fine-grained details and provide a more accurate supervisory signal. However, due to limited memory and computation power, the original images are typically down-sampled during trainin…
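The supervisory signal described in the abstract can be sketched as a simple photometric (appearance) difference between the target view and the view synthesized from an adjacent frame. This is a minimal illustrative sketch, not the paper's implementation; the function name and the use of a plain L1 difference are assumptions.

```python
import numpy as np

def photometric_loss(target, synthesized):
    # L1 appearance difference between the target view and the view
    # synthesized from an adjacent frame -- the self-supervisory signal
    # the abstract describes. (Name and plain-L1 choice are illustrative;
    # real pipelines often combine L1 with SSIM and masking.)
    return float(np.mean(np.abs(target - synthesized)))

# Toy 2x2 grayscale "images": identical views give zero loss.
target = np.array([[0.2, 0.4], [0.6, 0.8]])

# A constant brightness offset of 0.1 yields a mean L1 error of about 0.1,
# hinting at why interframe brightness inconsistency (mentioned below for
# endoscopy) corrupts this signal.
shifted = target + 0.1
```

Because every pixel contributes to the mean, down-sampling the images removes fine-grained detail from this signal, which is the resolution effect the abstract highlights.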

Cited by 65 publications (30 citation statements) | References 49 publications
“…To handle edge cases, such as object motion and occlusion/disocclusion, a predictive explainable mask was used. Inspired by this work, later works extended it by imposing extra geometric priors (Yang et al, 2018; Mahjourian et al, 2018; Chen et al, 2019b; Bian et al, 2019), introducing visual representation learning (Zhan et al, 2018; Spencer et al, 2020; Shu et al, 2020), and adding a self-attention mechanism (Zhou et al, 2019; Johnston and Carneiro, 2020). Unfortunately, these methods may not be generally applicable to endoscopy due to the unique characteristics of minimally invasive surgery environments (interframe brightness inconsistency, for example).…”
Section: Self-supervised Depth and Ego-motion Estimation
confidence: 99%
“…Results. To extensively evaluate the performance, we compare with four types of methods including: (1) self-supervised monocular-based methods (M) [58,44,43,59,60,17,18,15], (2) selfsupervised stereo-based methods (S) [61,62,16,63,64], (3) supervised methods (Sup) [45,46], and (4) methods that use LiDAR signal as guidance (L) [1,47]. The performance comparison with the state-of-the-art methods for these groups is shown in Table 7.…”
Section: Depth Prediction
confidence: 99%
“…Due to the limitations of the re-projection photometric loss, the self-supervised monocular (M) [58,44,43,59,60,17,18,15] and stereo-based methods (S) [61,62,16,63,64] usually have an Abs Rel over 0.1. With the advantage of using the sparse LiDAR, the performance of our initial depth maps [41] on the KITTI dataset, AP@0.7 for cars, AP@0.5 for pedestrians and cyclists.…”
Section: Depth Prediction
confidence: 99%
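The "Abs Rel over 0.1" in the snippet above refers to the absolute relative error, the standard monocular-depth benchmark metric: the mean of |prediction − ground truth| / ground truth over valid pixels. A minimal sketch, assuming the conventional definition (the function name and sample values are illustrative):

```python
import numpy as np

def abs_rel(pred, gt):
    # Absolute relative error: mean(|pred - gt| / gt) over pixels with
    # valid (positive) ground-truth depth. Pixels without LiDAR ground
    # truth are conventionally excluded via the validity mask.
    valid = gt > 0
    return float(np.mean(np.abs(pred[valid] - gt[valid]) / gt[valid]))

# Toy per-pixel depths in metres; the last pixel has no ground truth.
pred = np.array([1.1, 2.0, 4.5, 3.0])
gt = np.array([1.0, 2.0, 5.0, 0.0])
```

Here the three valid pixels contribute relative errors of 0.1, 0.0, and 0.1, so the Abs Rel is about 0.067, i.e. under the 0.1 threshold the snippet uses to characterise purely photometric methods.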
“…For the second category, additional predictions of the camera's relative pose are required. Recently, abundant works have improved the performance of self-supervised MDE through new loss functions [10,49,13,42,68], new architectures [38,66,58,14,32], and new supervision from extra constraints [54,57,40,2,26,68,15]. In this paper, we further explore the potential of self-supervised MDE by training on stereo images.…”
Section: Related Work
confidence: 99%