2018
DOI: 10.1007/978-3-030-01258-8_11
DYAN: A Dynamical Atoms-Based Network for Video Prediction

Abstract: The ability to anticipate the future is essential when making real-time critical decisions, provides valuable information for understanding dynamic natural scenes, and can help unsupervised video representation learning. State-of-the-art video prediction is based on complex architectures that need to learn large numbers of parameters, are potentially hard to train, slow to run, and may produce blurry predictions. In this paper, we introduce DYAN, a novel network with very few parameters that is easy to train, which produc…

Cited by 26 publications (33 citation statements)
References 38 publications
“…The deterministic models are trained to predict the future frames, exactly as in the ground truth video. The deterministic models we use are PredNet [15], MCnet [14], Future GAN [45] and DYAN [46]. On the other hand, the stochastic models model uncertainty by being trained to predict a distribution of possible futures using noise as an additional input.…”
Section: Databasementioning
confidence: 99%
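The distinction drawn in the citation statement above, between deterministic models that output a single future and stochastic models that take noise as an extra input to produce a distribution of possible futures, can be illustrated with a toy sketch. The linear maps below are hypothetical stand-ins, not any of the cited architectures (PredNet, MCnet, FutureGAN, DYAN):

```python
import numpy as np

rng = np.random.default_rng(0)

def deterministic_predict(frames, W):
    """Deterministic model: a single point estimate of the next frame.
    Here a toy linear map applied to the last observed frame vector."""
    return frames[-1] @ W

def stochastic_predict(frames, W, V, n_samples=5):
    """Stochastic model: a noise vector z is an additional input, so
    repeated calls sample different plausible futures."""
    samples = []
    for _ in range(n_samples):
        z = rng.standard_normal(frames[-1].shape[0])
        samples.append(frames[-1] @ W + z @ V)
    return samples

frames = [np.ones(4)]          # one flattened 4-pixel "frame"
W = np.eye(4)                  # toy dynamics
V = 0.1 * np.eye(4)            # toy noise injection
```

Calling `deterministic_predict` twice gives identical outputs, while `stochastic_predict` returns a set of distinct samples, mirroring how stochastic video models express uncertainty over the future.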
“…Later, Wu et al [92] extended this approach, conditioning predictions on the trajectories of objects that their model segments and tracks. Recent work has used other motion representations, such as factorizing a scene into stationary and moving components [12,85,83], per-pixel kernels [24,86,64,70,47], or Eulerian motion [55]. Work in 3D view synthesis has adopted a similar copy-and-paste approach, known as appearance flow [102,66,66].…”
Section: Related Workmentioning
confidence: 99%
“…The main consideration is to minimize the reconstruction error between the true future frame and the generated future frame. Such models can be classified as direct prediction models [35,46,43,21,3,39,30,38,18,25] and transformation-based prediction models [49,40,37,32]. Direct prediction models predict pixel values of future frames directly.…”
Section: Video Generation and Video Predictionmentioning
confidence: 99%
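The direct-vs-transformation-based taxonomy in the statement above can be made concrete with a minimal sketch. The functions below are illustrative only: `direct_predict` regresses pixel values outright, while `transform_predict` applies a predicted transformation (here a toy integer shift standing in for learned flows or per-pixel kernels) to the previous frame:

```python
import numpy as np

def direct_predict(prev_frame, net):
    # Direct prediction: the model outputs next-frame pixel values itself.
    # `net` is a hypothetical learned mapping; any callable works here.
    return net(prev_frame)

def transform_predict(prev_frame, shift):
    # Transformation-based prediction: the model outputs a transformation
    # (toy example: a (dy, dx) shift) that is applied to the previous frame.
    return np.roll(prev_frame, shift, axis=(0, 1))

frame = np.arange(9, dtype=float).reshape(3, 3)
direct_out = direct_predict(frame, lambda f: 0.5 * f)   # toy "network"
warped_out = transform_predict(frame, (1, 0))           # copy pixels down one row
```

Transformation-based models reuse the pixels of the previous frame, which tends to preserve texture and reduce blur, whereas direct models must synthesize every pixel from scratch.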