Interspeech 2019
DOI: 10.21437/interspeech.2019-2320

Child Speech Disorder Detection with Siamese Recurrent Network Using Speech Attribute Features

Abstract: Acoustics-based automatic assessment is a highly desirable approach to detecting speech sound disorder (SSD) in children. The performance of an automatic speech assessment system depends greatly on the availability of a good amount of properly annotated disordered speech, which is a critical problem particularly for child speech. This paper presents a novel design of child speech disorder detection system that requires only normal speech for model training. The system is based on a Siamese recurrent network, w…

Citations: cited by 20 publications (21 citation statements)
References: 13 publications (12 reference statements)
“…Mauro et al incorporate the speech of a reference speaker to detect mispronunciations at the phoneme level [11]. Wang et al use siamese networks for modeling discrepancy between normal and distorted children's speech [12]. We take a similar approach but we do not need a database of reference speech.…”
Section: Related Work (mentioning)
confidence: 99%
“…The absolute difference of the two LSTM networks' output is calculated and then fed as input to the fully connected layer with the two outputs as illustrated in figure 3. This architecture is similar with the one used in [19]. Henceforth, the first model will be referred to as MaLSTM, while the second model is called Siamese-Classifier.…”
Section: = σ( (mentioning)
confidence: 99%
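
The classifier head described in the statement above (absolute difference of the two branch outputs, fed to a fully connected layer with two outputs) can be sketched roughly as follows. This is a minimal illustration, not the cited authors' implementation: the class names are hypothetical, the weights are random and untrained, and a plain tanh recurrent cell stands in for the LSTM branches to keep the example dependency-free. The `manhattan_similarity` helper assumes that "MaLSTM" refers to the usual Manhattan-distance similarity exp(-||h_a - h_b||_1); the quoted text does not define it.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Random weight matrix as a list of rows (untrained, illustration only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


class TinyRecurrentEncoder:
    """Assumption: a simple tanh recurrent cell standing in for an LSTM branch."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        self.w_in = make_matrix(hidden_dim, input_dim, rng)
        self.w_rec = make_matrix(hidden_dim, hidden_dim, rng)

    def encode(self, frames):
        """Return the final hidden state after consuming all feature frames."""
        h = [0.0] * self.hidden_dim
        for frame in frames:
            drive = matvec(self.w_in, frame)
            memory = matvec(self.w_rec, h)
            h = [math.tanh(p + q) for p, q in zip(drive, memory)]
        return h


class SiameseClassifier:
    """Two branches sharing one encoder, then |h_a - h_b| -> 2-way dense layer."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.encoder = TinyRecurrentEncoder(input_dim, hidden_dim, rng)  # shared weights
        self.fc = make_matrix(2, hidden_dim, rng)  # fully connected, two outputs
        self.bias = [0.0, 0.0]

    def forward(self, seq_a, seq_b):
        h_a = self.encoder.encode(seq_a)
        h_b = self.encoder.encode(seq_b)
        diff = [abs(p - q) for p, q in zip(h_a, h_b)]  # element-wise absolute difference
        logits = [v + b for v, b in zip(matvec(self.fc, diff), self.bias)]
        return softmax(logits)


def manhattan_similarity(h_a, h_b):
    """Assumed MaLSTM-style score: exp(-L1 distance), which lies in (0, 1]."""
    return math.exp(-sum(abs(p - q) for p, q in zip(h_a, h_b)))


if __name__ == "__main__":
    model = SiameseClassifier(input_dim=3, hidden_dim=4)
    utt_a = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4]]
    utt_b = [[0.5, 0.1, -0.2], [0.3, 0.3, 0.0], [0.2, -0.4, 0.1]]
    print(model.forward(utt_a, utt_b))
```

Because the head only sees the absolute difference, the output is the same whichever order the two utterances are passed in, and the two sequences may have different lengths.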
“…GRU is a simplified architecture with an efficiency degree that is comparable to LSTM. These two approaches have been adopted for building automatic speech assessment systems [10, 16-19], e.g., the work done by Korzekwa et al on dysarthric speech [16].…”
Section: Automatic Assessment Approaches (mentioning)
confidence: 99%
“…Mel-frequency cepstral coefficients (MFCCs) are commonly used in speech assessment systems for acoustic modeling [10, 17-19, 24] and feature extraction [25,26]. While deep learning models recently attract intense attentions, Mel Spectrogram is also getting increasingly popular [10,12,16].…”
Section: Speech Representation (mentioning)
confidence: 99%
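
Both MFCCs and mel spectrograms rest on the mel frequency scale. As a rough illustration, the sketch below converts between Hz and mel and places filter centre frequencies evenly on the mel axis. It assumes the common formula mel = 2595 * log10(1 + f / 700); other conventions exist, and the quoted papers do not say which one they use. The function names and the example parameters (40 bands, 0 to 8000 Hz) are hypothetical.

```python
import math


def hz_to_mel(freq_hz):
    """Hz -> mel, using the 2595 * log10(1 + f / 700) convention (an assumption)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels, fmin_hz, fmax_hz):
    """Centre frequencies (Hz) of n_mels bands spaced evenly on the mel axis."""
    mel_lo = hz_to_mel(fmin_hz)
    mel_hi = hz_to_mel(fmax_hz)
    step = (mel_hi - mel_lo) / (n_mels + 1)
    return [mel_to_hz(mel_lo + step * (i + 1)) for i in range(n_mels)]


if __name__ == "__main__":
    for centre in mel_center_frequencies(n_mels=40, fmin_hz=0.0, fmax_hz=8000.0)[:5]:
        print(round(centre, 1))
```

Even spacing in mel gives bands that are narrow at low frequencies and progressively wider at high frequencies, which is the property that mel-based features rely on.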