Detecting Mispronunciations of L2 Learners and Providing Corrective Feedback Using Knowledge-Guided and Data-Driven Decision Trees

Li, Wei; Li, Kehuang; Siniscalchi, Sabato Marco; Chen, Nancy F.; Lee, Chin‐Hui

doi:10.21437/interspeech.2016-517

Cited by 17 publications

(14 citation statements)

References 21 publications


“…To provide those situations of oral interaction technically to learners, dialogue-based CALL (Computer Aided Language Learning) systems have been developed [1,2,3], where not only pronunciation errors but also grammatical errors can be detected and their corrective feedback is also provided. To assess learners' pronunciation, native speakers' acoustic models are often referred to and comparison is made between learners' speech and its corresponding native model.…”

Section: Background and Objective (mentioning)

confidence: 99%

A Study of Objective Measurement of Comprehensibility through Native Speakers' Shadowing of Learners' Utterances

et al. 2018


While learners desire to acquire so comprehensible pronunciations as to make themselves understood smoothly, acquisition often becomes difficult because, outside of classrooms, it is not rare that learners can hardly find chances to talk in the target language. Even when they talk to native speakers, they may receive only lenient or superficial suggestions from native speakers. How can learners know native speakers' honest perception on their utterances? In this paper, shadowing is introduced not to learners but to native listeners, who are asked to shadow learners' utterances. Since shadowing is as simultaneous repetition as possible, it is expected that native listeners' perceived comprehensibility can be measured objectively as smoothness of natives' shadowings. Experiments show that 1) shadowers' subjective assessment of learners' speech and that of their shadowings are highly correlated and that 2) the former is more correlated with the GOP scores of natives' shadowings than those of learners' speech. These results suggest it is valid to regard comprehensible pronunciation as shadowable pronunciation.


“…As can be seen in Section 4, the number of the correctly pronounced phone is much larger than the number of mispronounced, which could make trained models biased. To prevent the bias problem, we adopt other phones' correctly pronounced observations as mispronounced samples of the target phone as much as the difference between the number of correct instances and the number of incorrect instances to make a balance [5].…”

Section: Methods (mentioning)

confidence: 99%
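The balancing step described in the statement above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the citing authors' code: the function name `balance_with_other_phones`, the `(phone, is_correct, features)` tuple layout, and the use of random sampling to pick the borrowed observations are all hypothetical. The statement itself only says that correctly pronounced observations of other phones are reused as mispronounced samples of the target phone, as many as the gap between the correct and incorrect counts.

```python
import random


def balance_with_other_phones(target, samples, seed=0):
    """Build a balanced training set for one target phone.

    samples: list of (phone, is_correct, features) tuples (assumed layout).
    Returns (positives, negatives) as lists of (features, label) pairs,
    where label 1 = correctly pronounced, 0 = mispronounced.
    """
    rng = random.Random(seed)

    correct = [s for s in samples if s[0] == target and s[1]]
    incorrect = [s for s in samples if s[0] == target and not s[1]]

    # Number of extra negative samples needed to match the correct count.
    deficit = max(len(correct) - len(incorrect), 0)

    # Candidate pool: correctly pronounced observations of every OTHER phone.
    pool = [s for s in samples if s[0] != target and s[1]]

    # Assumption: borrowed observations are drawn at random without replacement.
    borrowed = rng.sample(pool, min(deficit, len(pool)))

    positives = [(feat, 1) for _, _, feat in correct]
    negatives = [(feat, 0) for _, _, feat in incorrect + borrowed]
    return positives, negatives
```

With five correct and one incorrect observation of the target phone, the sketch borrows four observations from other phones, so both classes end up with five samples.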

“…There have been several studies to detect pronunciation errors of learners [2][3][4] [5]. The study of [2] suggested an extended recognition network (ERN), which expands pronunciation dictionaries of learners by predicting frequent erroneous pronunciation sequences.…”

Section: Introduction (mentioning)

confidence: 99%

“…The study of Ryu and Chung [10] proposes articulatory Goodness-Of-Pronunciations (aGOPs) as novel features for pronunciation assessment in English spoken by Korean learners. Furthermore, Li et al [5] extended GOP into speech attributes to detect mispronunciation of onset consonants in learners' Chinese by decision trees. By using speech attributes, they have shown the possibility to provide corrective feedback to improve pronunciation.…”

Section: Introduction (mentioning)

confidence: 99%


Mispronunciation Diagnosis of L2 English at Articulatory Level Using Articulatory Goodness-Of-Pronunciation Features

Ryu, Chung 2017

7th ISCA Workshop on Speech and Language Technology in Education (SLaTE 2017)


This paper proposes a method to provide an articulatory diagnosis of English produced by Korean learners using articulatory Goodness-Of-Pronunciation (aGOP) features, which are based on the distinctive feature theory in phonology. Previous studies on mispronunciation diagnosis have mainly dealt with pronunciation errors at phone-level. They inform learners of which phone is recognized as a diagnosis, when the corresponding segment is realized as a mispronunciation. However, to provide learners more effective corrective feedback, diagnosis had better be performed at articulatory-level, such as place and manner of articulation, rather than at phone-level. This study aims to provide automatic articulatory diagnosis using articulation-based confidence scores. At first, the speech of learners is forced-aligned and recognized to compute the GOP and aGOPs. When the forced-aligned segment is a consonant, articulatory diagnosis is conducted in three articulatory categories: voicing, place of articulation, and manner of articulation. Otherwise, diagnosis is performed in terms of rounding, height, and backness corresponding to articulatory characteristics of vowels. Experimental results show that F1 scores for voicing, place, and manner corresponding to consonants are 0.828, 0.754, and 0.781, respectively, whereas F1 score for rounding, height, and backness corresponding to vowels are 0.843, 0.782, and 0.824, respectively. These results indicate that the proposed method yields effective articulatory diagnosis.


“…Faced with the challenges of inconsistency in non-native phone-based labeling and imperfect acoustic modeling, our previous work [18,19] has investigated articulatory-based modeling for CAPT, where speech attributes [20,21] …”

Section: Introduction (mentioning)

confidence: 99%

Improving Mispronunciation Detection for Non-Native Learners with Multisource Information and LSTM-Based Deep Models

Chen, Siniscalchi, et al. 2017

Interspeech 2017

Self Cite


In this paper, we utilize manner and place of articulation features and deep neural network models (DNNs) with long short-term memory (LSTM) to improve the detection performance of phonetic mispronunciations produced by second language learners. First, we show that speech attribute scores are complementary to conventional phone scores, so they can be concatenated as features to improve a baseline system based only on phone information. Next, pronunciation representation, usually calculated by frame-level averaging in a DNN, is now learned by LSTM, which directly uses sequential context information to embed a sequence of pronunciation scores into a pronunciation vector to improve the performance of subsequent mispronunciation detectors. Finally, when both proposed techniques are incorporated into the baseline phone-based GOP (goodness of pronunciation) classifier system trained on the same data, the integrated system reduces the false acceptance rate (FAR) and false rejection rate (FRR) by 37.90% and 38.44% (relative), respectively, from the baseline system.
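For readers unfamiliar with the metrics in this abstract, the following Python sketch shows how FAR, FRR, and a relative reduction are commonly computed in mispronunciation detection. The definitions are the conventional ones and are assumed here, since the abstract does not spell them out: FAR is the share of truly mispronounced phones that the detector accepts as correct, and FRR is the share of correctly pronounced phones that it flags. The function names and the boolean encoding are hypothetical.

```python
def far_frr(labels, predictions):
    """Compute (FAR, FRR) from parallel boolean lists.

    labels[i]      True if phone i is truly mispronounced.
    predictions[i] True if the detector flags phone i as mispronounced.
    """
    pairs = list(zip(labels, predictions))
    n_mispronounced = sum(1 for y, _ in pairs if y)
    n_correct = len(pairs) - n_mispronounced

    # False acceptance: a mispronounced phone that was not flagged.
    false_accepts = sum(1 for y, p in pairs if y and not p)
    # False rejection: a correct phone that was flagged.
    false_rejects = sum(1 for y, p in pairs if not y and p)

    return false_accepts / n_mispronounced, false_rejects / n_correct


def relative_reduction(baseline, improved):
    """Relative reduction of an error rate with respect to a baseline."""
    return (baseline - improved) / baseline
```

Under these definitions, a relative reduction of 37.90% means the new FAR is 62.10% of the baseline FAR; it says nothing about the absolute rates, which the abstract does not report.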
