2021
DOI: 10.48550/arxiv.2107.08037
Preprint
CCVS: Context-aware Controllable Video Synthesis

Abstract: This presentation introduces a self-supervised learning approach to the synthesis of new video clips from old ones, with several new key elements for improved spatial resolution and realism: It conditions the synthesis process on contextual information for temporal continuity and ancillary information for fine control. The prediction model is doubly autoregressive, in the latent space of an autoencoder for forecasting, and in image space for updating contextual information, which is also used to enforce spatio…
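The doubly autoregressive rollout described in the abstract can be illustrated with a minimal sketch. All component names and the linear stand-ins below are hypothetical placeholders, not the paper's actual encoder, predictor, or decoder: the point is only the control flow, i.e. forecasting the next latent from past latents, decoding it, and appending the decoded frame back into the image-space context before the next step.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 16

# Stand-ins for the learned components (hypothetical, not from the paper):
# a frame encoder, a latent-space predictor, and a frame decoder.
W_enc = rng.normal(size=(64, LATENT_DIM)) * 0.1
W_pred = rng.normal(size=(LATENT_DIM, LATENT_DIM)) * 0.1
W_dec = rng.normal(size=(LATENT_DIM, 64)) * 0.1

def encode(frame):          # image space -> latent space
    return frame @ W_enc

def predict_next(latents):  # autoregression #1: forecast in latent space
    return latents[-1] @ W_pred

def decode(z):              # latent space -> image space
    return z @ W_dec

def synthesize(context_frames, n_future):
    """Doubly autoregressive rollout: each decoded frame is appended to
    the context (autoregression #2 in image space) before predicting the
    next latent, so contextual information is continually updated."""
    context = list(context_frames)
    generated = []
    for _ in range(n_future):
        latents = [encode(f) for f in context]
        z_next = predict_next(latents)
        frame = decode(z_next)
        generated.append(frame)
        context.append(frame)  # update image-space context
    return generated

past = [rng.normal(size=64) for _ in range(2)]   # two conditioning frames
future = synthesize(past, n_future=3)
print(len(future), future[0].shape)              # 3 (64,)
```

In the real model the predictor attends over all past latents rather than the last one alone, and ancillary control signals would enter `predict_next` as extra conditioning inputs.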

Cited by 1 publication (1 citation statement)
References 39 publications (71 reference statements)
“…In Tab. 1, NÜWA significantly outperforms:

    Method                       Cond.  FVD↓
    …                            2      181
    VideoFlow [18]               3      131
    LVT [31]                     1      126±3
    SAVP [20]                    2      116
    DVD-GAN-FP [7]               1      110
    Video Transformer (S) [44]   1      106±3
    TriVD-GAN-FP [23]            1      103
    CCVS [25]                    1      99±2
    Video Transformer (L) [44]   1      94±2 …”

Section: Comparison With State-of-the-art
Confidence: 99%