2021
DOI: 10.48550/arxiv.2106.14014
Preprint

Txt2Vid: Ultra-Low Bitrate Compression of Talking-Head Videos via Text

Cited by 1 publication (2 citation statements)
References 25 publications (30 reference statements)
“…Talking head generation works can be broadly classified in three categories based on the type of input they use to generate a talking head: Text-driven [16,33,36], Audio-driven [9,13,18,31,37,43,45], and Video-driven [12,27,29,39,44] Talking Head Generation.…”
Section: Related Work (mentioning)
confidence: 99%
“…Inspired by the recent success of GANs in generating static faces from text [38], Li et al. [16] proposed a method to use text for driving animation parameters of the mouth, upper face and head. Txt2Vid [33] converts the spoken language and facial webcam data into text and transmits it to achieve low-bandwidth video conferencing using talking head generation. However, this method relies heavily on the generated speech, altering the original speaker's voice, prosody, and head movements in the video call.…”
Section: Related Work (mentioning)
confidence: 99%