2018
DOI: 10.1007/978-3-030-01424-7_62

Location Dependency in Video Prediction

Abstract: Deep convolutional neural networks are used to address many computer vision problems, including video prediction. The task of video prediction requires analyzing the video frames, temporally and spatially, and constructing a model of how the environment evolves. Convolutional neural networks are spatially invariant, though, which prevents them from modeling location-dependent patterns. In this work, the authors propose location-biased convolutional layers to overcome this limitation. The effectiveness of locat…
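The core idea in the abstract can be sketched in code. A minimal, hypothetical realization of a location-biased convolution is a standard convolution plus a learnable per-location bias map; the paper's exact formulation may differ, and the function names below are illustrative only.

```python
import numpy as np

def conv2d_same(x, k):
    # naive single-channel 'same' cross-correlation (deep-learning "convolution")
    # with zero padding; x: (H, W), k: (kh, kw)
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    h, w = x.shape
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def location_biased_conv(x, k, bias_map):
    # adding a per-pixel bias map (same shape as the output) breaks spatial
    # invariance, letting the layer respond differently at different locations
    return conv2d_same(x, k) + bias_map
```

In a trained network `bias_map` would be a learned parameter; here it is simply passed in to keep the sketch dependency-free.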


Cited by 8 publications (5 citation statements)
References 6 publications (7 reference statements)
“…We evaluated our LFDTN model on this dataset. We compared our model against many well-known models such as Conv-PGP [8], VLN-ResNet [23], VLN-LDC [21], HPNetT [6], and PredRNN [24]. († http://ais.uni-bonn.de/~hfarazi/LFDTN/) Fig. 7: Internal states' formation for two samples for a) "Moving MNIST on STL" and b) "NGSIM" datasets.…”
Section: Results
confidence: 99%
See 1 more Smart Citation
“…We evaluated our LFDTN model on this dataset. We compared our model against many well-known models like Conv-PGP [8], VLN-ResNet [23], VLN-LDC [21], HPNetT [6], PredRNN [24], † http://ais.uni-bonn.de/~hfarazi/LFDTN/ Fig. 7: Internal states' formation for two samples for color a) "Moving MNIST on STL", and b) "NGSIM" datasets.…”
Section: Resultsmentioning
confidence: 99%
“…To get a better result and help the model decide how reliable each local velocity is, we also include each direction's variance [σ_x², σ_y²] and concatenate it channel-wise. Since a convolutional model cannot learn location-dependent features, similar to the positional encoding proposed by Azizi et al. [21], we add two additional channels to the input. To grant the model the ability to account for former velocities, we channel-wise concatenate the same saved features from previous time steps.…”
Section: A Transform Model
confidence: 99%
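The positional encoding described in this citation statement — appending two extra channels that encode spatial position — can be sketched as follows. This is a generic coordinate-channel construction, assuming coordinates normalized to [-1, 1]; the cited papers may use a different normalization or encoding.

```python
import numpy as np

def add_coord_channels(features):
    # features: (C, H, W) array; returns (C + 2, H, W) with normalized
    # x and y coordinate maps appended channel-wise, so a convolutional
    # layer can condition on spatial location
    c, h, w = features.shape
    ys, xs = np.meshgrid(np.linspace(-1.0, 1.0, h),
                         np.linspace(-1.0, 1.0, w),
                         indexing="ij")
    return np.concatenate([features, xs[None], ys[None]], axis=0)
```

The same concatenation pattern extends naturally to the variance channels and the saved features from previous time steps mentioned above.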
“…While they require a large number of parameters, they lack interpretability. Two successful examples are Location-Dependent Video Ladder Networks [5] and PredRNN++ [6], which employs a stack of LSTM modules to predict plausible future frames.…”
Section: Related Work
confidence: 99%
“…Transposed-convolutional layers are used for up-sampling the representations. To use location-dependent features, we used the newly proposed location-dependent convolutional layer [15]. In order to limit the number of parameters used, a shared learnable bias between both output heads is implemented.…”
Section: Deep Learning Visual Perception
confidence: 99%