2017 IEEE Intelligent Vehicles Symposium (IV) 2017
DOI: 10.1109/ivs.2017.7995953
|View full text |Cite
|
Sign up to set email alerts
|

Geometry-based next frame prediction from monocular video

Abstract: Abstract-We consider the problem of next frame prediction from video input. A recurrent convolutional neural network is trained to predict depth from monocular video input, which, along with the current video image and the camera trajectory, can then be used to compute the next frame. Unlike prior nextframe prediction approaches, we take advantage of the scene geometry and use the predicted depth for generating the next frame prediction. Our approach can produce rich next frame predictions which include depth … Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
2
2
1

Citation Types

0
23
0

Year Published

2017
2017
2023
2023

Publication Types

Select...
4
3
2

Relationship

0
9

Authors

Journals

citations
Cited by 27 publications
(24 citation statements)
references
References 28 publications
0
23
0
Order By: Relevance
“…In some earlier studies [23][24][25], a next frame was predicted using the current frame and previous sequential frames. A dual-motion GAN model (ConvLSTMGAN) was proposed [23], and image prediction was performed using a visible light image.…”
Section: Prediction Of Next Framementioning
confidence: 99%
See 1 more Smart Citation
“…In some earlier studies [23][24][25], a next frame was predicted using the current frame and previous sequential frames. A dual-motion GAN model (ConvLSTMGAN) was proposed [23], and image prediction was performed using a visible light image.…”
Section: Prediction Of Next Framementioning
confidence: 99%
“…In this method, the proposed network was trained in a hybrid way using real and synthetic videos. In [25], a method for generating the next frame using a visible light image and ConvLSTM was proposed. In this study, the depth image is predicted using a current image and camera The remainder of this study is organized as follows.…”
Section: Prediction Of Next Framementioning
confidence: 99%
“…Structured random forests [28], feed-forward CNNs [29] and variational autoencoders [30] have all been used to predict dense pixel trajectories from single frames. Prediction of raw pixels-as opposed to pixel trajectories-has been attempted using ordinary feedforward networks, GANs [31], [32] and RNNs [33], [34].…”
Section: Background and Related Workmentioning
confidence: 99%
“…LSTM is a widely applicable kind of RNN which contains feedback connections for both single data points and entire data sequences in deep learning [ 50 ]. The optimization task regarding accurate future image prediction has been a highlighted problem in artificial intelligence in recent several years [ 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61 , 62 , 63 , 64 , 65 , 66 , 67 ]. Kalchbrenner et al have developed a video pixel network to predict the joint distribution of future image in pixel videos [ 60 ].…”
Section: Introductionmentioning
confidence: 99%