2023
DOI: 10.1007/978-3-031-25198-6_11

Shallow Diffusion Motion Model for Talking Face Generation from Speech

Cited by 5 publications
(2 citation statements)
References 41 publications
“…While the above approaches are currently the only end-to-end diffusion based methods, a number of structural based approaches, that leverage diffusion models have also been proposed in recent months. Zhang et al (2022) proposed an approach that used audio to predict landmarks, before using a diffusion based renderer to output the final frame. Zhua et al (2023) also utilised a diffusion model similarly, using it to take the source image and the predicted motion features as input to generate the high-resolution frames.…”
Section: Diffusion-based Generation (mentioning)
confidence: 99%
“…Such talking face generation requires producing realistic facial movements and synchronized speech in response to audio input. With the rapid evolution of deep learning, it becomes easily to handle with a huge amount of audio and visual data and producing satisfying results with techniques like Generative Adversarial Network (GAN) [18], [19] and diffusion model [20], [21]. Recent methods focus on the optimization on the important parts, such as identity preservation [22], face animation [23], pose control [5] and audio-video synchronization [24].…”
Section: A Talking Face Generation (mentioning)
confidence: 99%