Interspeech 2017
DOI: 10.21437/interspeech.2017-1522

Detection of Mispronunciations and Disfluencies in Children Reading Aloud

Abstract: To automatically evaluate the performance of children reading aloud or to follow a child's reading in reading tutor applications, different types of reading disfluencies and mispronunciations must be accounted for. In this work, we aim to detect most of these disfluencies in sentence and pseudoword reading. Detecting incorrectly pronounced words, and quantifying the quality of word pronunciations, is arguably the hardest task. We approach the challenge as a two-step process. First, a segmentation using task-sp…

Citations: cited by 8 publications (17 citation statements)
References: 14 publications
“…It has also been used to align and correct approximate transcriptions of long audio recordings [30], and for audio indexing and displaying subtitles. In a tutoring application, this approach was used by [31] to track and align a child's read passage.…”
Section: Lightly-supervised Lattice Decoding (mentioning)
confidence: 99%
“…Using less reduced data contributes to performance improvement in speech recognition tasks [13]. Therefore, we expect similar performance improvement in classifying phonemes compared to existing models based on hidden Markov model or the maximum entropy model that use MFCC [9,12].…”
Section: Phoneme Classifier (mentioning)
confidence: 91%
“…Therefore, we first created a phoneme classifier that can provide aligned phoneme labels given just a speech file. Unlike existing research that detects miscues with miscue annotated corpus [9], our approach starts the miscue detection task using speech files with only transcriptions, but no aligned phonemes. Figure 1 shows our phoneme classification pipeline as well as the data augmentation process.…”
Section: Phoneme Classifier (mentioning)
confidence: 99%
“…In [16], the problems raised by the accurate and robust recognition of children's speech is reported. More recently, Proença et al [17] propose another specialized ASR system with the same goal. Both publications are more involved in solving the problem of the recognition of disfluent speech than in correlating results with subjective scores of the quality of the reading.…”
Section: L1 Learners (mentioning)
confidence: 99%