2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018)
DOI: 10.1109/cvpr.2018.00819
Controllable Video Generation with Sparse Trajectories

Cited by 79 publications (83 citation statements). References 12 publications.
“…Meanwhile, Reda et al. [34] propose to model moving appearances with both convolutional kernels as in [10] and vectors as optical flow. Our closest prior work is [11], which also composes the pixel- and flow-based predictions through occlusion maps. However, our proposed method differs in three aspects: (1) our pixel and flow prediction tasks are trained separately; (2) we employ an occlusion inpainter for our pixel generation so that more contextual information after warping can be utilized; (3) instead of predicting occlusion as another side task, we directly use the predicted flows as a proxy for post-warping confidence.…”
Section: High-fidelity Video Prediction
confidence: 99%
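The occlusion-map composition this excerpt describes, blending a flow-warped prediction with a directly generated (pixel-based) one, can be sketched as follows. The nearest-neighbour warping, the function names, and the soft blending weights are illustrative assumptions, not the cited papers' actual implementations.

```python
import numpy as np

def warp_with_flow(frame, flow):
    """Backward-warp a frame (H, W, C) by an optical-flow field (H, W, 2).

    Nearest-neighbour sampling is used for simplicity; real systems
    typically use bilinear sampling.
    """
    h, w = frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    return frame[src_y, src_x]

def compose_prediction(flow_pred, pixel_pred, occlusion_map):
    """Blend the two predictions per pixel.

    occlusion_map is in [0, 1]: 1 where warping is unreliable (occluded),
    so the pixel-based prediction takes over there.
    """
    occ = occlusion_map[..., None]  # broadcast over the channel axis
    return occ * pixel_pred + (1.0 - occ) * flow_pred
```

Point (3) of the quote suggests deriving `occlusion_map` from the predicted flow itself (e.g., from flow magnitude or forward-backward inconsistency) rather than predicting it as a separate side task.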
“…A second task closely related to story visualization is video generation, especially text-to-video [24,13] or image-to-video generation [1,31,32]. Existing approaches generate only short video clips [13,5,12] without scene changes. The biggest challenge in video generation is ensuring a smooth motion transition across successive video frames.…”
Section: Related Work
confidence: 99%
“…The biggest challenge in video generation is ensuring a smooth motion transition across successive video frames. Existing work uses a trajectory, skeleton, or simple landmarks to help model motion features [12,37,33]. To this end, researchers disentangle dynamic and static features for motion and background, respectively [32,24,31,6].…”
Section: Related Work
confidence: 99%
“…To make predictions appear more realistic, others tackled the problem by learning separate representations for the static and dynamic components of a video. This is done either by incorporating motion conditions, such as optical-flow information [12], [34], [15], [17], [25], or by learning sparse features that represent pixel dynamics [26]. Decomposing the video into static and non-static components allows the network to simply reproduce the values of the static part for the majority of pixels.…”
Section: Related Work
confidence: 99%
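The static/dynamic decomposition idea in this excerpt can be illustrated with a minimal sketch. Using the per-pixel temporal median as the static component and an additive residual as the dynamic component is a simplifying assumption for illustration; the cited works learn these representations rather than computing them in closed form.

```python
import numpy as np

def decompose_clip(clip):
    """Split a clip (T, H, W, C) into a static component (the per-pixel
    temporal median, shared by all frames) and per-frame dynamic residuals."""
    static = np.median(clip, axis=0)          # (H, W, C)
    dynamic = clip - static[None]             # (T, H, W, C)
    return static, dynamic

def reconstruct(static, dynamic):
    """Recompose frames; static pixels are simply reproduced, and only the
    dynamic residual varies over time."""
    return static[None] + dynamic
```

For mostly static scenes the residual is near zero almost everywhere, which is the quoted intuition: the network only needs to model the few pixels that actually move.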