2017 IEEE International Conference on Multimedia and Expo (ICME)
DOI: 10.1109/icme.2017.8019546

Visual speech synthesis from 3D mesh sequences driven by combined speech features


Cited by 4 publications (3 citation statements). References 15 publications.
“…They focused on the tip of the tongue between the teeth or the back of the tongue, but this type of approach requires sufficient data collected from the subject in advance to produce animation results. Kuhnke and Ostermann collected a sequence of 3D mesh data along the phoneme label by capturing 3D facial movement and recording voice data at the same time [11]. With a regression-based method, they presented a combination method of several speech features for better performance.…”
Section: Text-driven Speech Animation Generation (mentioning)
Confidence: 99%
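The statement above describes a regression from combined speech features to captured 3D mesh sequences. The sketch below illustrates one plausible reading of that pipeline; the specific features (MFCCs plus phoneme one-hot labels), the PCA mesh compression, and the ridge regressor are illustrative assumptions, not details taken from the cited paper.

```python
# Minimal sketch (not the authors' code): regressing 3D mesh parameters from
# combined speech features, in the spirit of Kuhnke and Ostermann [11].
# Feature choices, dimensions, and the ridge regressor are assumptions.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge

n_frames, n_mfcc, n_phonemes, n_vertices = 1000, 13, 40, 5000

# Per-frame acoustic features and phoneme labels (placeholders for real data).
mfcc = np.random.randn(n_frames, n_mfcc)
phoneme_ids = np.random.randint(0, n_phonemes, size=n_frames)
phoneme_onehot = np.eye(n_phonemes)[phoneme_ids]

# Combine several speech features into one regression input per frame.
X = np.hstack([mfcc, phoneme_onehot])

# Captured mesh sequence, flattened to (frames, 3 * vertices), then compressed
# with PCA so the regressor predicts a low-dimensional mesh representation.
meshes = np.random.randn(n_frames, 3 * n_vertices)
pca = PCA(n_components=30)
Y = pca.fit_transform(meshes)

# Linear regression from combined speech features to mesh PCA coefficients.
model = Ridge(alpha=1.0).fit(X, Y)

# Synthesis: predict coefficients for new speech frames, reconstruct meshes.
coeffs = model.predict(X)
predicted = pca.inverse_transform(coeffs).reshape(n_frames, n_vertices, 3)
```

Under these assumptions, synthesis at test time only requires extracting the same combined feature vector from new speech and applying the learned linear map, which is why such regression approaches need sufficient subject-specific training data up front.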
“…It should be able to express the shape of the lips that is precisely synchronized with the speaking voice. Numerous studies have presented ways to create visual speech animation with the speech track [1,2,3,4,5,6,7,8,9,10,11] while other approaches have focused on simulating facial movements from a set of physical properties [12,13,14,15] or synthesizing emotional expressions from given facial models [16,17,18,19,20]. For more natural and realistic facial animation, an explicit solution is needed to combine the lip movements and facial expressions into one animation sequence.…”
Section: Introduction (mentioning)
Confidence: 99%
“…ECAs are those CAs that can facilitate full virtual body and the available embodiment in order to incorporate humanlike responses. The ECA technology ranges from chatbots and 2D/3D realizations in a form of talking heads [22][23][24] to fully articulated embodied conversational agents engaged in various concepts of HMI, including sign language [25], storytelling [26], companions [27], and virtual hosts within user interfaces, and even used as moderators of various concepts in ambient-assisted living environments [28][29][30][31][32].…”
Section: Introduction (mentioning)
Confidence: 99%