2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.00248

Animating Arbitrary Objects via Deep Motion Transfer

Abstract: This paper introduces a novel deep learning framework for image animation. Given an input image with a target object and a driving video sequence depicting a moving object, our framework generates a video in which the target object is animated according to the driving sequence. This is achieved through a deep architecture that decouples appearance and motion information. Our framework consists of three main modules: (i) a Keypoint Detector trained without supervision to extract object keypoints, (ii) a Dense Motion …
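
As a concrete reading of the appearance/motion decoupling described above, here is a minimal sketch of the first module, the unsupervised Keypoint Detector. The heatmap-plus-soft-argmax design, layer sizes, and all names are illustrative assumptions, not the paper's code (PyTorch):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class KeypointDetector(nn.Module):
        # Hypothetical stand-in: predicts K heatmaps and reduces each one to
        # an (x, y) coordinate with a spatial soft-argmax, so keypoints can
        # be learned without any keypoint labels.
        def __init__(self, num_kp=10):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, num_kp, 3, padding=1),
            )

        def forward(self, img):
            heat = self.net(img)                                # (B, K, H', W')
            b, k, h, w = heat.shape
            prob = F.softmax(heat.view(b, k, -1), -1).view(b, k, h, w)
            ys = torch.linspace(-1, 1, h, device=img.device)
            xs = torch.linspace(-1, 1, w, device=img.device)
            kp_y = (prob.sum(dim=3) * ys).sum(dim=2)            # (B, K)
            kp_x = (prob.sum(dim=2) * xs).sum(dim=2)            # (B, K)
            return torch.stack([kp_x, kp_y], dim=-1)            # (B, K, 2)

    detector = KeypointDetector()
    src = torch.randn(1, 3, 64, 64)   # appearance: the single target image
    drv = torch.randn(1, 3, 64, 64)   # motion: one frame of the driving video
    kp_src, kp_drv = detector(src), detector(drv)

The keypoints extracted from the driving frames carry the motion information while the single source image supplies appearance; the abstract's Dense Motion module (not sketched here) would combine the two.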

Cited by 351 publications (426 citation statements)
References: 37 publications
“…Several video generation methods have been proposed for face animation. Given a face image, such methods perform video prediction [26] or motion transfer [22,28] to manipulate faces. Recently, Pumarola et al. [21] presented a work performing anatomically-aware face animation.…”
Section: Related Work (mentioning)
confidence: 99%
“…To generate domain-specific images, the Conditional GAN (CGAN) [27] has been proposed. A CGAN typically combines a vanilla GAN with some external information, such as class labels or tags [29,30,4,40,37], text descriptions [33,50], human pose [8,38,28,22,35], or reference images [25,16]. Image-to-Image Translation frameworks adopt input-output data to learn a parametric mapping between inputs and outputs.…”
Section: Related Work (mentioning)
confidence: 99%
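
To make the conditioning mechanism in the quoted passage concrete, here is a minimal, hypothetical CGAN generator in the same spirit: an embedded class label (the "external information") is concatenated with the noise vector. All layer sizes and the 28x28 output are arbitrary assumptions, not any cited paper's architecture:

    import torch
    import torch.nn as nn

    class ConditionalGenerator(nn.Module):
        # Vanilla-GAN generator plus external information: a class-label
        # embedding is concatenated with the noise before upsampling.
        def __init__(self, noise_dim=100, num_classes=10, embed_dim=16):
            super().__init__()
            self.embed = nn.Embedding(num_classes, embed_dim)
            self.net = nn.Sequential(
                nn.Linear(noise_dim + embed_dim, 256), nn.ReLU(),
                nn.Linear(256, 28 * 28), nn.Tanh(),  # e.g. 28x28 grayscale
            )

        def forward(self, z, labels):
            cond = torch.cat([z, self.embed(labels)], dim=1)
            return self.net(cond).view(-1, 1, 28, 28)

    z = torch.randn(4, 100)
    labels = torch.randint(0, 10, (4,))
    fake = ConditionalGenerator()(z, labels)   # (4, 1, 28, 28)

The same concatenation trick generalizes to the other conditioning signals the passage lists (text embeddings, pose maps, reference-image features).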
“…To deform the input into the target image, backward motion is usually considered: a vector from each target point back to its source point [14,6,33]. However, such a motion representation may not be suitable for convolution/deconvolution networks with spatially aligned operators, since the representation is aligned to the unknown deformed image rather than to the input.…”
Section: Motion Field Direction (mentioning)
confidence: 99%
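
A short sketch of the backward-warping convention the quoted passage refers to, assuming PyTorch's grid_sample and its normalized [-1, 1] coordinates; the backward_warp helper and the zero-flow usage below are illustrative, not code from the cited papers:

    import torch
    import torch.nn.functional as F

    def backward_warp(source, flow):
        # Backward warping: for each pixel of the output we look up where it
        # came from in the source, so `flow` stores target->source vectors
        # in normalized coordinates, matching grid_sample's convention.
        b, _, h, w = source.shape
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
        identity = torch.stack([xs, ys], dim=-1).expand(b, h, w, 2)
        grid = identity + flow                      # (B, H, W, 2)
        return F.grid_sample(source, grid, align_corners=True)

    source = torch.randn(1, 3, 64, 64)
    flow = torch.zeros(1, 64, 64, 2)    # zero flow reproduces the input
    warped = backward_warp(source, flow)

Because `grid` indexes pixels of the output being synthesized, the flow field lives in the coordinate frame of the unknown deformed image, which is exactly the misalignment with the input that the passage points out.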