2022
DOI: 10.1007/978-3-031-19790-1_6

StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN

Cited by 49 publications (43 citation statements)
References 34 publications
“…Only a few works target the disentanglement of pose and expression for talking face generation. Almost all of them [6,24,37] are based on 3DMMs that explicitly decouple pose and expression. PIRenderer [24] extracts the 3DMM parameters for a driving face through a pre-trained model and then predicts the flow given a source face and the 3DMM parameters.…”
Section: Decoupling (mentioning; confidence: 99%)
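To make the pipeline described in this statement concrete, below is a minimal sketch of the PIRenderer-style idea: 3DMM parameters extracted from the driving face condition a small network that predicts a dense flow field, which then warps the source image. All module names, the parameter dimension, and the network shapes are assumptions for illustration, not the paper's actual code.

```python
# Minimal sketch: driving 3DMM parameters + source image -> dense flow -> warp.
# PARAM_DIM and the network architecture are hypothetical choices.
import torch
import torch.nn as nn
import torch.nn.functional as F

PARAM_DIM = 73  # assumed size of the 3DMM parameter vector

class FlowPredictor(nn.Module):
    """Maps a source image plus driving 3DMM parameters to a dense flow field."""
    def __init__(self, param_dim=PARAM_DIM):
        super().__init__()
        self.encode = nn.Conv2d(3 + param_dim, 32, 3, padding=1)
        self.to_flow = nn.Conv2d(32, 2, 3, padding=1)  # 2 channels: (dx, dy)

    def forward(self, source, params):
        b, _, h, w = source.shape
        # Broadcast the parameter vector over the spatial grid and concatenate.
        p = params.view(b, -1, 1, 1).expand(b, params.shape[1], h, w)
        feat = F.relu(self.encode(torch.cat([source, p], dim=1)))
        return self.to_flow(feat)

def warp(source, flow):
    """Warp the source image with the predicted flow via grid_sample."""
    b, _, h, w = source.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1).unsqueeze(0).expand(b, h, w, 2)
    offset = flow.permute(0, 2, 3, 1)  # flow offsets in normalized coordinates
    return F.grid_sample(source, grid + offset, align_corners=True)

source = torch.rand(1, 3, 64, 64)          # source face image
driving_params = torch.rand(1, PARAM_DIM)  # 3DMM parameters of the driving face
flow = FlowPredictor()(source, driving_params)
reenacted = warp(source, flow)             # coarsely warped result
```

In the actual method the flow warps intermediate feature maps rather than raw pixels and a refinement network follows, but the conditioning structure is the same.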
“…During inference, it can transfer only the expression from the driving face by replacing the expression parameters of the source face with those of the driving one. StyleHEAT [37] follows a similar approach based on a pre-trained StyleGAN. However, the performance of these methods heavily depends on the accuracy of the 3DMMs.…”
Section: Decoupling (mentioning; confidence: 99%)
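The expression-only transfer mentioned here is simple because the 3DMM explicitly separates pose and expression: it amounts to splicing coefficient vectors. A minimal sketch follows; the 64/9 split between expression and pose coefficients is an assumed layout, not the exact parameterization used by PIRenderer or StyleHEAT.

```python
# Minimal sketch: transfer only the expression by overwriting the expression
# block of the source's 3DMM parameters. EXP_DIM/POSE_DIM are assumed sizes.
import numpy as np

EXP_DIM = 64   # assumed number of expression coefficients
POSE_DIM = 9   # assumed number of pose (rotation + translation) coefficients

def transfer_expression(source_params, driving_params):
    """Keep the source pose, take the driving expression."""
    mixed = source_params.copy()
    mixed[:EXP_DIM] = driving_params[:EXP_DIM]  # overwrite expression block only
    return mixed

source_params = np.random.randn(EXP_DIM + POSE_DIM)
driving_params = np.random.randn(EXP_DIM + POSE_DIM)
mixed = transfer_expression(source_params, driving_params)
assert np.allclose(mixed[EXP_DIM:], source_params[EXP_DIM:])  # pose untouched
```

This also makes the cited limitation visible: if the 3DMM estimator leaks pose into the expression coefficients, the splice transfers unwanted head motion along with the expression.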
“…ATVG [Chen et al. 2019] and MakeItTalk [Zhou et al. 2020] first generate facial landmarks from audio and then render the video using a landmark-to-video network. Dense flow fields are another active research direction [Siarohin et al. 2019; Yin et al. 2022]. [Zhang et al. 2021a] predict the 3DMM coefficients from audio and then transfer these parameters into a flow-based warping network.…”
Section: Audio-based Single Image Facial Animation (mentioning; confidence: 99%)
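The audio-driven variant cited last is a two-stage pipeline: an audio encoder predicts per-frame 3DMM coefficients, and those coefficients drive a flow-based warping network (the FlowPredictor/warp sketch above could serve as that second stage). Below is a minimal sketch of the first stage; the mel-window shape, layer sizes, and module name are assumptions, not the cited paper's architecture.

```python
# Minimal sketch: audio features -> 3DMM coefficients, the first stage of the
# audio-driven pipeline. Shapes and the network design are hypothetical.
import torch
import torch.nn as nn

class AudioTo3DMM(nn.Module):
    """Maps a window of mel-spectrogram frames to 3DMM coefficients."""
    def __init__(self, n_mels=80, window=16, param_dim=73):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                    # (B, n_mels * window)
            nn.Linear(n_mels * window, 256),
            nn.ReLU(),
            nn.Linear(256, param_dim),       # predicted 3DMM coefficients
        )

    def forward(self, mel_window):
        return self.net(mel_window)

mel = torch.rand(1, 80, 16)      # one audio window (mel bins x frames)
params = AudioTo3DMM()(mel)      # (1, 73) coefficients for the warping stage
```

Running this per audio window yields a coefficient sequence; feeding each frame's coefficients to the flow-based warping stage animates the single source image.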