2020
DOI: 10.1016/j.cviu.2019.102848

On the benefit of adversarial training for monocular depth estimation

Abstract: In this paper we address the benefit of adding adversarial training to the task of monocular depth estimation. A model can be trained in a self-supervised setting on stereo pairs of images, where depth (disparities) are an intermediate result in a right-to-left image reconstruction pipeline. For the quality of the image reconstruction and disparity prediction, a combination of different losses is used, including L1 image reconstruction losses and left-right disparity smoothness. These are local pixel-wise loss…
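To make the abstract's loss combination concrete, below is a minimal PyTorch-style sketch of an L1 image reconstruction term plus an edge-aware left-right disparity smoothness term. The function names, the (B, C, H, W) tensor layout, and the weight alpha_smooth are illustrative assumptions, not the paper's exact implementation.

```python
# Hedged sketch of the self-supervised loss combination described in the
# abstract: L1 reconstruction + edge-aware disparity smoothness.
# Assumptions: tensors are (B, C, H, W); left_recon is the left image
# reconstructed from the right image via the predicted disparity map.
import torch

def l1_reconstruction_loss(left, left_recon):
    # Pixel-wise L1 distance between the input left image and its
    # reconstruction from the right view.
    return torch.mean(torch.abs(left - left_recon))

def disparity_smoothness_loss(disp, image):
    # Penalise disparity gradients, down-weighted where the image itself
    # has strong gradients (likely true depth discontinuities).
    disp_dx = torch.abs(disp[:, :, :, 1:] - disp[:, :, :, :-1])
    disp_dy = torch.abs(disp[:, :, 1:, :] - disp[:, :, :-1, :])
    img_dx = torch.mean(torch.abs(image[:, :, :, 1:] - image[:, :, :, :-1]), 1, keepdim=True)
    img_dy = torch.mean(torch.abs(image[:, :, 1:, :] - image[:, :, :-1, :]), 1, keepdim=True)
    return torch.mean(disp_dx * torch.exp(-img_dx)) + torch.mean(disp_dy * torch.exp(-img_dy))

def total_loss(left, left_recon, disp, alpha_smooth=0.1):
    # alpha_smooth is an assumed weighting, not the paper's value.
    return l1_reconstruction_loss(left, left_recon) + alpha_smooth * disparity_smoothness_loss(disp, left)
```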



Cited by 26 publications (11 citation statements)
References 34 publications
“…Alternatively, the recent PackNet model (Guizilini et al., 2020a) proposes to automatically scale estimations with additional constraints imposed by the instantaneous velocity of the ego-vehicle. Some works have also moved to a stereo setup to disambiguate the scale factor, using additional information at train time only (Godard et al., 2017; Groenendijk et al., 2020) or also at run time (Chang and Chen, 2018; Kendall et al., 2017; Cheng et al., 2019), thus abandoning the monocular setup.…”
Section: Background and Related Work
confidence: 99%
“…GAN-based methods can be divided into two categories, namely fully-supervised and semi-supervised settings. The former [21,33] uses the discriminator to distinguish model predictions from ground truths, while the latter [57,26] uses the GAN to explore the contribution of unlabeled data. [88] uses a cooperative learning framework [71,74] for generative saliency prediction.…”
Section: Vision Transformers
confidence: 99%
“…Pilzer et al. [40] proposed a cycled architecture consisting of two generators and two discriminators for depth estimation, inspired by CycleGAN [41]. Groenendijk et al. [42] evaluated depth estimation performance by integrating a vanilla GAN with a PatchGAN [43] discriminator, Least-Squares GAN (LSGAN) [44], and Gradient-Penalty Wasserstein GAN (WGAN-GP) [45], [46], combined with a pixel-level reconstruction loss. Ji et al. [34] integrate a PatchGAN-based pair discriminator and a depth discriminator into one framework, feeding the discriminator outputs back to the generator as a loss for more realistic and accurate depth estimation.…”
Section: GANs For Depth Prediction
confidence: 99%
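For reference, the three adversarial objectives named in this excerpt differ only in the discriminator/critic loss. The following PyTorch sketch shows the three variants side by side; all variable names and the penalty weight gp_weight=10.0 are illustrative assumptions, not the cited papers' code.

```python
# Hedged sketch of vanilla GAN (PatchGAN-style), LSGAN, and WGAN-GP
# discriminator/critic objectives. d_real / d_fake are raw logits,
# possibly per-patch score maps as in PatchGAN.
import torch
import torch.nn.functional as F

def vanilla_gan_d_loss(d_real, d_fake):
    # Binary cross-entropy on logits: real patches -> 1, fake patches -> 0.
    return (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
            + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))

def lsgan_d_loss(d_real, d_fake):
    # Least-squares objective: regress real scores to 1 and fake scores to 0.
    return torch.mean((d_real - 1) ** 2) + torch.mean(d_fake ** 2)

def wgan_gp_d_loss(critic, real, fake, gp_weight=10.0):
    # Wasserstein critic loss plus a gradient penalty on interpolated samples.
    fake = fake.detach()  # critic update only; no gradients to the generator
    d_loss = critic(fake).mean() - critic(real).mean()
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grads = torch.autograd.grad(critic(interp).sum(), interp, create_graph=True)[0]
    penalty = ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()
    return d_loss + gp_weight * penalty
```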
“…Recently, several approaches for predicting depth using generative adversarial networks (GANs) have been proposed to improve CNN-based studies [34], [37]-[40], [42]. GAN [2] is highlighted as an innovative concept, mainly in the fields of image synthesis, style transfer, and image super-resolution.…”
Section: Introduction
confidence: 99%