2020
DOI: 10.1007/978-3-030-58558-7_18

DTVNet: Dynamic Time-Lapse Video Generation via Single Still Image

Abstract: This paper presents a novel end-to-end dynamic time-lapse video generation framework, named DTVNet, to generate diversified time-lapse videos from a single landscape image, conditioned on normalized motion vectors. The proposed DTVNet consists of two submodules: an Optical Flow Encoder (OFE) and a Dynamic Video Generator (DVG). The OFE maps a sequence of optical flow maps to a normalized motion vector that encodes the motion information inside the generated video. The DVG contains motion and content streams…
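From the abstract, the pipeline factors into an OFE that compresses a flow sequence into a normalized motion vector and a DVG that fuses that vector with the still image. A minimal PyTorch sketch of that structure follows; every layer choice, the dimensions (`motion_dim`, `n_frames`), and the fusion-by-addition step are illustrative assumptions, not the authors' published architecture.

```python
# Structural sketch of the DTVNet pipeline described in the abstract.
# Layer choices and dimensions are illustrative guesses, not the
# authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OpticalFlowEncoder(nn.Module):
    """OFE: maps a sequence of optical flow maps (B, T, 2, H, W)
    to a single unit-norm motion vector."""
    def __init__(self, motion_dim=128):
        super().__init__()
        self.conv = nn.Sequential(  # shared 2D conv applied per flow map
            nn.Conv2d(2, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, motion_dim)

    def forward(self, flows):
        b, t = flows.shape[:2]
        feats = self.conv(flows.flatten(0, 1)).flatten(1)  # (B*T, 64)
        feats = feats.view(b, t, -1).mean(dim=1)           # temporal pooling
        return F.normalize(self.fc(feats), dim=1)          # normalized motion vector

class DynamicVideoGenerator(nn.Module):
    """DVG: a content stream encodes the still image, a motion stream
    injects the motion vector, and a decoder emits T frames."""
    def __init__(self, motion_dim=128, n_frames=16):
        super().__init__()
        self.n_frames = n_frames
        self.content = nn.Conv2d(3, 64, 3, padding=1)
        self.motion = nn.Linear(motion_dim, 64)
        self.decoder = nn.Conv2d(64, 3 * n_frames, 3, padding=1)

    def forward(self, image, motion_vec):
        c = F.relu(self.content(image))                # (B, 64, H, W)
        m = self.motion(motion_vec)[:, :, None, None]  # broadcast motion code
        video = self.decoder(c + m)                    # fuse streams, decode frames
        b, _, h, w = video.shape
        return video.view(b, self.n_frames, 3, h, w)

# Usage sketch with random data:
ofe, dvg = OpticalFlowEncoder(), DynamicVideoGenerator()
flows = torch.randn(2, 15, 2, 64, 64)  # two clips of 15 flow maps
image = torch.randn(2, 3, 64, 64)
video = dvg(image, ofe(flows))         # (2, 16, 3, 64, 64)
```

At inference, sampling different unit-norm motion vectors in place of the OFE output would yield the diversified videos the abstract describes.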

Cited by 20 publications (25 citation statements)
References 32 publications (56 reference statements)
“…When the motion cue is not provided at all, videos are generated in a stochastic manner, constrained by the spatial information provided by the input image [2,5,8,18,45,47,48]. The model used to generate videos can be a generative adversarial network (GAN) [48] or a variational autoencoder (VAE) [8]. This kind of stochastic video generation can only handle short dynamic patterns in distribution.…”
Section: Image-to-Video Generation (mentioning)
confidence: 99%
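The quoted statement describes generation where diversity comes only from a sampled latent code, with the input image supplying spatial constraints. A toy sketch of that sampling loop, where `generator` is a hypothetical stand-in for the cited GAN [48] or VAE [8] decoders:

```python
# Toy illustration of stochastic image-to-video generation: with no
# motion cue, diversity comes solely from resampling a latent code z.
# `generator` is a hypothetical stand-in for the cited GAN/VAE models.
import torch

def generate_videos(generator, image, n_samples=4, z_dim=100):
    """Decode several videos from one image by resampling z ~ N(0, I)."""
    videos = []
    for _ in range(n_samples):
        z = torch.randn(1, z_dim)           # stochastic motion code
        videos.append(generator(image, z))  # spatially conditioned on the image
    return torch.stack(videos)
```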
“…Therefore, we compare with Monkey-Net. We also compare with DTV-Net [22] and MD-GAN [12]. DTV-Net uses optical flow to extract a motion vector.…”
Section: Video Reconstruction (mentioning)
confidence: 99%
“…Each optical flow was extracted using the Farneback method, except for DTV-Net. For DTV-Net, we used optical flow extracted by ARFLow [23], following the original method [22].…”
Section: Optical Flow Evaluation (mentioning)
confidence: 99%
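For context, the Farneback method referenced above is available directly in OpenCV; a minimal extraction loop might look like the following, with parameter values that are common defaults rather than the settings used in the cited evaluation:

```python
# Dense Farneback optical flow between consecutive frames with OpenCV.
# Parameter values are common defaults, not the cited paper's settings.
import cv2

def farneback_flows(frames):
    """frames: list of HxWx3 uint8 BGR images -> list of HxWx2 flow maps."""
    grays = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]
    flows = []
    for prev, nxt in zip(grays, grays[1:]):
        flow = cv2.calcOpticalFlowFarneback(
            prev, nxt, None,
            pyr_scale=0.5, levels=3, winsize=15,
            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        flows.append(flow)  # (H, W, 2): per-pixel (dx, dy) displacement
    return flows
```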
“…This topic has drawn great attention in recent years [6,34,36], as video representation has become an evolving trend in content consumption that makes people more engaged. More specifically, landscape animation [2,27,32] can benefit a broad range of applications, such as video footage production, social engagement boosting, and virtual background animation for online video conferences. Thus, we mainly focus on landscape animation in this paper.…”
Section: Introduction (mentioning)
confidence: 99%
“…Besides, the lack of a motion embedding leads to a lack of diverse video generation capability, which limits the practicality of these methods. To solve these problems, some works use an encoder-decoder paradigm to explicitly learn motion, using a convolutional encoder to embed estimated optical flows into a latent space [2,32]. Relying on optical flows as the representation of motion, these works are able to learn motion well and then generate vivid videos.…”
Section: Introduction (mentioning)
confidence: 99%