2018 12th International Conference on Software, Knowledge, Information Management & Applications (SKIMA)
DOI: 10.1109/skima.2018.8631536

Synthesizing Expressive Facial and Speech Animation by Text-to-IPA Translation with Emotion Control

Abstract: Given the complexity of the human facial anatomy, animating facial expressions and lip movements for speech is a very time-consuming and tedious task. In this paper, a new text-to-animation framework for facial animation synthesis is proposed. The core idea is to improve the expressiveness of lip-sync animation by incorporating facial expressions in 3D animated characters. This idea is realized as a plug-in in Autodesk Maya, one of the most popular animation platforms in the industry, such that professional ani…
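The abstract's core idea, layering an emotional facial expression on top of phoneme-driven lip shapes, can be pictured as a per-frame blend of two weight vectors. The sketch below only illustrates that idea and is not the paper's plug-in code; the blend-shape names, the linear mixing rule, and the emotion_strength parameter are assumptions.

```python
# Minimal sketch (assumption): combine lip-sync viseme weights with an
# emotion pose so the mouth shape and the expression coexist on one rig.
def blend_expression(viseme_weights, emotion_weights, emotion_strength=0.6):
    """Return combined blend-shape weights, clamped to [0, 1].

    viseme_weights / emotion_weights: dicts of blend-shape name -> weight.
    emotion_strength: hypothetical global control for how strongly the
    emotion is layered onto the neutral face (not from the paper).
    """
    combined = dict(viseme_weights)
    for shape, weight in emotion_weights.items():
        combined[shape] = min(1.0, combined.get(shape, 0.0) + emotion_strength * weight)
    return combined

# Example: an open "ah" mouth shape plus a happy expression.
print(blend_expression({"jaw_open": 0.8, "lips_wide": 0.2},
                       {"lips_wide": 0.5, "brow_raise": 0.4}))
```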

Cited by 6 publications (6 citation statements) | References 16 publications
“…hundreds of speech sentences and tens of emotions captured from people) required for a statistical model [21] or a clustering model [23]. Using the low-dimensional parameter space, compared to the slider-based system [24], our approach requires a considerably smaller number of parameters for emotion controls.…”
Section: Results
confidence: 99%
“…Although their approach could generate an emotional speech animation through a simple interface of emotion control, training a CAT model requires the collection of a speech and video corpus, which is laborious, whenever a new emotion is introduced. Stef et al. introduced an approach that converted a given text into the International Phonetic Alphabet (IPA), mapped the corresponding lip shape to each symbol, and generated emotional animation by a key-framing technique [24]. Their approach relied on a commercial animation tool, and it was necessary to adjust a large number of emotional parameters to create a desired expression.…”
Section: Emotional Speech Animation
confidence: 99%
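To make the quoted description of the pipeline concrete, the following sketch takes a text fragment already converted to IPA, looks each symbol up in a hypothetical IPA-to-viseme table, and emits a keyframe schedule. In the actual plug-in such keys would be set on a Maya rig; here they are simply returned as data, and the table, timing, and names are placeholders rather than the authors' mapping.

```python
# Hypothetical IPA-symbol -> viseme blend-shape mapping (illustrative only).
IPA_TO_VISEME = {
    "h": "viseme_breath",
    "ə": "viseme_schwa",
    "l": "viseme_L",
    "oʊ": "viseme_O",
}

def keyframe_schedule(ipa_symbols, frame_step=4, hold=1.0):
    """Turn a sequence of IPA symbols into (frame, blend_shape, weight) keys.

    Each symbol peaks at `hold` on its own frame and relaxes to zero on the
    next, a crude stand-in for the key-framing step described above.
    """
    keys = []
    for i, symbol in enumerate(ipa_symbols):
        shape = IPA_TO_VISEME.get(symbol, "viseme_rest")
        frame = i * frame_step
        keys.append((frame, shape, hold))              # peak of the mouth shape
        keys.append((frame + frame_step, shape, 0.0))  # relax before the next symbol
    return keys

# "hello" written roughly as IPA symbols: h, ə, l, oʊ
for key in keyframe_schedule(["h", "ə", "l", "oʊ"]):
    print(key)
```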
“…Similarly, since facial expression and the underlying emotional state of the subject can also affect measurement accuracy, we are interested in normalizing the facial expression. By analyzing 2D [52] and 3D [53] facial information and the associated emotional states, we may also be able to further improve the robustness of the proposed method in the future.…”
Section: Discussion
confidence: 99%
“…Some of these methods target rigged 3D characters or meshes with predefined mouth blend shapes that correspond to speech sounds [33,34,35,36,37,38]; these have primarily focused on mouth motions only and show a finite number of emotions, blinks, and facial action unit movements. Realistic Speech-Driven Facial Animation with GANs (RSDGAN) [42] used a GAN-based approach to produce quality videos.…”
Section: Phoneme and Visemes Generation of Videos
confidence: 99%