2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019
DOI: 10.1109/cvpr.2019.00465

Learning the Depths of Moving People by Watching Frozen People

Abstract: [Figure labels only; the abstract text was not captured. Train: static scene, moving camera; MannequinChallenge (MC) Dataset; RGB image; human mask; initial depth from flow; predicted depth; MVS depth (supervision). Inference: moving people, moving camera; our depth predictions.]

citations
Cited by 234 publications
(187 citation statements)
references
References 52 publications
“…Generally, monocular depth estimation includes techniques that use a temporal sequence of images captured from a single camera[9]-[11]. However, in this paper, we focus on methods that use only a single image from a single viewpoint for depth estimation.…”
mentioning
confidence: 99%
“…This does not mean, however, that these approaches are not suitable in general. Recent work that explicitly dealt with monocular depth estimation, for instance, with focus on videos with humans [61] or for obstacle detection in autonomous cars [62], showed that monocular depth estimation from neural networks could be a promising technique in the future.…”
Section: Crime Scene Analysis Framework: Processing Pipeline
mentioning
confidence: 99%
“…YouTube is used as a data source for many different applications. For video analytics, [4] used ~2000 videos obtained from YouTube of people doing "The Mannequin Challenge" [5], to train a model to predicting dense depth in scenarios where both a monocular camera and people in the scene are freely moving. Reference [6] uses video clips collected from YouTube depicting human actors performing various acrobatic stunts (e.g.…”
Section: Related Work
mentioning
confidence: 99%