Proceedings of SLPAT 2015: 6th Workshop on Speech and Language Processing for Assistive Technologies 2015
DOI: 10.18653/v1/w15-5111

Automatic dysfluency detection in dysarthric speech using deep belief networks

Abstract: Dysarthria is a speech disorder caused by difficulties in controlling muscles, such as the tongue and lips, that are needed to produce speech. These differences in motor skills cause speech to be slurred, mumbled, and spoken relatively slowly, and can also increase the likelihood of dysfluency. This includes nonspeech sounds, and 'stuttering', defined here as a disruption in the fluency of speech manifested by prolongations, stop-gaps, and repetitions. This paper investigates different types of input features …

Cited by 28 publications (9 citation statements)
References 10 publications

“…45 MFCC and 14 LPCC features from TORGO dataset [72] has been used in this case study for the detection of disfluencies [72]. The experimental results obtained showed that MFCCs and LPCCs produce similar detection accuracies of approximately 86% for repetitions and 84% for non-speech disfluencies [105].…”
Section: Statistical Approaches (mentioning)
confidence: 99%
“…In 2005, Oue et al [105] introduced deep belief network for the automatic detection of repetitions, non-speech disfluencies. 45 MFCC and 14 LPCC features from TORGO dataset [72] has been used in this case study for the detection of disfluencies [72].…”
Section: Statistical Approaches (mentioning)
confidence: 99%
“…Due to recent advancements in deep learning, the improvement in speech technology surpasses the shallow neural network based approaches, and thus, resulted in a shift towards deep learning based framework and, disfluency identification is no exception. The work in [19] used deep belief networks with cepstral features for the detection of repetitions and stop gaps on TORGO dataset. T. Kourkounakis et al [20] introduced a deep residual neural network and bi-directional long term short memory (ResNet+BiLSTM) based method to learn stutter-specific features from the audio.…”
Section: Related Work (mentioning)
confidence: 99%
“…signal and Gaussian Mixture Models (GMMs) to model the distribution of the spectral representation of a waveform. However, HMM-GMM-based systems require a large amount of data to be trained, which is not efficient in the case of dysarthric speech where the corpora used for training are always small [6]. Therefore, these approaches cannot be applied with ease in the context of dysarthric speech.…”
Section: Related Work (mentioning)
confidence: 99%