2011
DOI: 10.1109/tasl.2010.2072499

Articulatory Knowledge in the Recognition of Dysarthric Speech

Abstract: Disabled speech is not compatible with modern generative and acoustic-only models of speech recognition (ASR). This work considers the use of theoretical and empirical knowledge of the vocal tract for atypical speech in labeling segmented and unsegmented sequences. These combined models are compared against discriminative models such as neural networks, support vector machines, and conditional random fields. Results show significant improvements in accuracy over the baseline through the use of product…
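The abstract names discriminative baselines such as support vector machines for labeling speech frames. As a minimal sketch of that kind of frame-level discriminative baseline (the synthetic features and phone classes below are illustrative assumptions standing in for real acoustic or articulatory data):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Synthetic stand-in for per-frame acoustic features (e.g., 13-dim MFCCs)
# and phone-class labels; real experiments would use dysarthric speech data.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 13))    # 1000 frames, 13 features each
y = rng.integers(0, 5, size=1000)  # 5 hypothetical phone classes

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# RBF-kernel SVM as a frame-level discriminative classifier.
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X_train, y_train)
print("frame accuracy:", clf.score(X_test, y_test))
```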

Cited by 88 publications (67 citation statements)
References 49 publications
“…The histogram of ALS patients follows the general trend of warping factor distribution for females (typically < 1.0) and males (typically > 1.0). Figures 5, 6, 7, and 8 give the PERs of speaker-independent dysarthric (due to ALS) speech recognition results using different context models and recognizers, respectively: (1) monophone GMM-HMM, (2) triphone GMM-HMM, (3) monophone DNN-HMM, and (4) triphone DNN-HMM, each with individual or combined VTLN, Procrustes matching, and fMLLR. These results suggest that VTLN, Procrustes matching, and fMLLR were all effective for speaker-independent dysarthric speech recognition from acoustic data, articulatory data, or their combination.…”
Section: Recognizer and Experimental Setup
confidence: 99%
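The statement above refers to VTLN warping factors below 1.0 for female and above 1.0 for male speakers. A minimal sketch of the piecewise-linear VTLN frequency warping commonly used in such front ends (conventions for applying the factor vary across toolkits; the boundary fraction and example values here are illustrative assumptions, not taken from the cited work):

```python
import numpy as np

def vtln_warp(freqs, alpha, f_max=8000.0, boundary=0.875):
    """Piecewise-linear VTLN: scale frequencies by 1/alpha below a
    boundary, then connect linearly so f_max maps onto itself.
    alpha < 1.0 is typical for female, > 1.0 for male speakers."""
    f_b = boundary * f_max
    return np.where(
        freqs <= f_b,
        freqs / alpha,
        # linear segment from (f_b, f_b/alpha) up to (f_max, f_max)
        f_b / alpha + (freqs - f_b) * (f_max - f_b / alpha) / (f_max - f_b),
    )

# Example: warp mel filterbank center frequencies for a male speaker.
centers = np.linspace(100, 7800, 10)
print(vtln_warp(centers, alpha=1.1))
```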
“…Based on the recent literature on speech recognition with articulatory data (e.g., [7, 14-20]), we hypothesized the following for dysarthric speech recognition for ALS: 1) adding articulatory data (collected from ALS patients) would improve speech recognition performance, 2) feature normalization in the articulatory space, the acoustic space, or both is critical and necessary for speaker-independent dysarthric speech recognition with articulatory data, and 3) the recent state-of-the-art approach, the deep neural network (DNN)-hidden Markov model (HMM), would outperform the long-standing Gaussian mixture model (GMM)-HMM approach.…”
Section: Introduction
confidence: 99%
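Hypothesis 3 above contrasts GMM and DNN frame models. A minimal sketch of that comparison on synthetic frame features (the feature dimensions, class structure, and model sizes are illustrative assumptions; real systems model senones over acoustic and articulatory features):

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for combined acoustic+articulatory frame features.
rng = np.random.default_rng(0)
n_classes = 4
X = np.concatenate([rng.normal(loc=i, size=(300, 20)) for i in range(n_classes)])
y = np.repeat(np.arange(n_classes), 300)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# GMM baseline: one class-conditional mixture per phone class;
# classify frames by maximum log-likelihood across classes.
gmms = [GaussianMixture(n_components=2, random_state=0).fit(X_tr[y_tr == c])
        for c in range(n_classes)]
scores = np.stack([g.score_samples(X_te) for g in gmms], axis=1)
gmm_acc = np.mean(scores.argmax(axis=1) == y_te)

# DNN counterpart: a small MLP classifying the same frames directly.
mlp = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=300, random_state=0)
mlp.fit(X_tr, y_tr)
print(f"GMM frame accuracy: {gmm_acc:.3f}, DNN: {mlp.score(X_te, y_te):.3f}")
```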
“…For example, [17] studied the predictability of articulatory errors and trained a Bayesian network to build an augmented ASR system that modeled the statistical relationships between vocal tract configurations and their acoustic consequences. Similarly, [18] focused on aspects of syllabic strength for moderate hypokinetic dysarthric speech.…”
Section: Related Work
confidence: 99%
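The Bayesian-network approach described above relates vocal tract configurations to their acoustic consequences probabilistically. A minimal sketch of that idea with one discrete articulatory variable and one discrete acoustic observation (all probability values below are made-up illustrative numbers, not from the cited work):

```python
import numpy as np

# Prior over articulatory configurations, e.g., {open, closed} lips.
p_artic = np.array([0.6, 0.4])

# P(acoustic class | configuration): rows = configurations,
# columns = acoustic classes; values are illustrative only.
p_acoustic_given_artic = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
])

# Posterior over configurations given an observed acoustic class,
# via Bayes' rule: P(a | o) ∝ P(o | a) P(a).
observed = 2
joint = p_acoustic_given_artic[:, observed] * p_artic
posterior = joint / joint.sum()
print("P(configuration | acoustic class 2) =", posterior)
```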
“…HMMs are a broad class of doubly stochastic models for non-stationary signals that can be inserted into other stochastic models to incorporate information from several hierarchical knowledge sources [6]. An HMM-based recognizer using a variable-duration Hamming window, proposed in [7], raises the recognition rate for dysarthric speech to 80%. Discrete articulatory feature recognition has been applied to identify values for concurrent features in [8]. Here, articulatory features are collected into different categories, each with a number of possible values.…”
Section: Introduction
confidence: 99%
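The variable-duration Hamming windowing mentioned in [7] amounts to framing the signal with a window length that can differ per frame. A minimal sketch of such framing (the signal, frame positions, and window lengths are illustrative assumptions):

```python
import numpy as np

def frame_with_hamming(signal, frame_starts, frame_lengths):
    """Extract frames at the given start samples, each multiplied by a
    Hamming window of its own (possibly different) length."""
    frames = []
    for start, length in zip(frame_starts, frame_lengths):
        frame = signal[start:start + length]
        frames.append(frame * np.hamming(len(frame)))
    return frames

# Illustrative: a 1-second 16 kHz sine, framed with two window sizes.
sr = 16000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 220 * t)
frames = frame_with_hamming(signal, frame_starts=[0, 400],
                            frame_lengths=[400, 640])
print([f.shape for f in frames])
```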