2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016
DOI: 10.1109/cvpr.2016.595

Deep Stereo: Learning to Predict New Views from the World's Imagery

Abstract: Deep networks have recently enjoyed enormous success when applied to recognition and classification problems in computer vision [20,29], but their use in graphics problems has been limited ([21, 7] are notable recent exceptions). In this work, we present a novel deep architecture that performs new view synthesis directly from pixels, trained from a large number of posed image sets. In contrast to traditional approaches which consist of multiple complex stages of processing, each of which require careful tunin…

Cited by 481 publications (447 citation statements)
References 37 publications
“…view extrapolation); 2) Unlike [27], who have a fixed number of input views, our multiview network is more flexible at both training and test time as it could take in an arbitrary number of input views for joint prediction, which is particularly beneficial when the number of input views varies at test time.…”
Section: Comparison With Deepstereo [27] (mentioning)
confidence: 99%
“…Buehler et al [26] presented a unifying framework for these image-based rendering techniques. The recent DeepStereo work by Flynn et al [27] is a learning-based extension that performs compositing through learned geometric reasoning using a CNN, and can generate intermediate views of a scene by interpolating from a set of surrounding views. While these methods yield high-quality novel views, they do so by compositing the corresponding input image rays for each output pixel and can therefore only generate already seen content, (e.g.…”
Section: Related Work (mentioning)
confidence: 99%
“…In particular, our work bears resemblances to [23], which generates such imagery using a convolutional neural network (CNN). In contrast to [23], our goal is not to learn an image synthesis pipeline to be queried at arbitrary poses, but rather to learn a correction to the appearance of a single image taken from a fixed pose.…”
Section: Related Work (mentioning)
confidence: 91%
“…Our work is related to the field of image-based rendering (e.g., [22], [23]), which aims to synthesize new views of a scene by blending existing images. In particular, our work bears resemblances to [23], which generates such imagery using a convolutional neural network (CNN).…”
Section: Related Work (mentioning)
confidence: 99%