2010 7th International Symposium on Chinese Spoken Language Processing
DOI: 10.1109/iscslp.2010.5684832
Development of an articulatory visual-speech synthesizer to support language learning

Abstract: This paper presents a two-dimensional (2D) visual-speech synthesizer to support language learning. A visual-speech synthesizer animates the human articulators in synchronization with speech signals, e.g., output from a text-to-speech synthesizer. A visual-speech animation can offer language learners a concrete illustration of how to move and where to place the articulators when pronouncing a phoneme. We adopt 2D vector-based viseme models and compile a collection of visemes to cover the articulation …
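The 2D vector-based viseme animation the abstract describes can be sketched as interpolation between viseme shapes; a minimal sketch, assuming each viseme is a list of 2D control points (the viseme names and coordinates below are invented for illustration, not the synthesizer's actual data):

```python
# Hypothetical sketch: blending 2D vector-based visemes by linear
# interpolation of their control points. The viseme names and point
# layouts are illustrative assumptions, not the paper's real models.

def blend_visemes(src, dst, t):
    """Interpolate two visemes (lists of (x, y) control points) at t in [0, 1]."""
    return [(sx + t * (dx - sx), sy + t * (dy - sy))
            for (sx, sy), (dx, dy) in zip(src, dst)]

# Fabricated example: a contour moving between two articulations.
viseme_a = [(0.0, 0.0), (0.5, 0.4), (1.0, 0.0)]
viseme_b = [(0.0, 0.1), (0.5, 0.8), (1.0, 0.1)]

midpoint = blend_visemes(viseme_a, viseme_b, 0.5)
```

Rendering such intermediate shapes in time with the speech signal is what produces an animation synchronized with, e.g., text-to-speech output.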


Cited by 8 publications (12 citation statements); references 1 publication.
“…Our exaggerated feedback is implemented on the previous visual-speech synthesizer reported in [9][10], which visualized pronunciation movement from midsagittal and front views of the vocal tract. It focuses on providing corrective feedback and can offer a reliable visualization for coarticulation.…”
Section: The Exaggeration Method
Mentioning, confidence: 99%
“…In our visual-speech synthesis system, each phoneme is assigned to two visemes as the key frames for animation generation [9][10]. Each viseme can be assumed as the representation of a key articulatory action.…”
Section: Realization of Visual Exaggeration
Mentioning, confidence: 99%
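The two-visemes-per-phoneme keyframing this statement describes could be laid out roughly as follows; a minimal sketch, where the phoneme-to-viseme table, viseme identifiers, and timings are assumptions invented for illustration:

```python
# Hypothetical sketch: assigning each phoneme two visemes as animation
# key frames, per the citing paper's description. The table and timings
# below are fabricated for illustration.

def build_keyframes(phonemes, viseme_table):
    """Expand (phoneme, start, end) triples into (time, viseme) key frames."""
    frames = []
    for phoneme, start, end in phonemes:
        onset, target = viseme_table[phoneme]  # two visemes per phoneme
        frames.append((start, onset))
        frames.append((end, target))
    return frames

viseme_table = {"b": ("lips_closed", "lips_release"),
                "a": ("jaw_open_onset", "jaw_open_full")}
utterance = [("b", 0.00, 0.08), ("a", 0.08, 0.25)]
keyframes = build_keyframes(utterance, viseme_table)
```

In-between frames would then be generated by interpolating between consecutive key frames, which is where exaggeration of the articulatory movement could be applied.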
“…Learning approaches to pronunciation can be roughly divided into two types: phonics training and whole-word training [13]. The phonics training approach emphasizes phoneme training and aims to correct phoneme-level errors [14]. In contrast, whole-word training emphasizes meaning to encourage memorization by students.…”
Section: Introduction
Mentioning, confidence: 99%
“…WASAY generates synchronized animations of the speech articulators in the midsagittal and the front views. The initial implementation of WASAY [6] uses context-independent visemes, and blending such visemes does not offer a reliable visualization for coarticulation.…”
Section: Introduction
Mentioning, confidence: 99%