Regularized Speaker Adaptation of KL-HMM for Dysarthric Speech Recognition

Kim, Myungjong; Kim, Younggwan; Yoo, Joohong; Wang, Jun; Kim, Hoirin

doi:10.1109/tnsre.2017.2681691

Cited by 44 publications

(21 citation statements)

References 36 publications

“…Our experimental results demonstrated that CLSTM-RNN has the potential to improve the ASR performance as a speaker-independent acoustic model for the patients with ALS. To further improve the ASR accuracies, techniques for session/speaker variability compensation including acoustic feature transformation [25,26], acoustic model adaptation [27], and pronunciation variation modeling [27,28] can be further applied. We speculate that the results may improve once a larger training dataset from more ALS patients is obtained.…”

Section: Discussion (mentioning)

confidence: 99%

“…Our approach presents a possibility in effectively modeling dysarthric speech (even low intelligible speech) in a speaker-independent way. Future directions include 1) a test of the CLSTM-RNN approach using a larger dataset collected from more subjects, 2) applying speaker adaptation/normalization techniques [27], and 3) using articulatory information [25,29].…”

Section: Discussion (mentioning)

confidence: 99%

Dysarthric Speech Recognition Using Convolutional LSTM Neural Network

Kim, Cao, An, et al. 2018

Interspeech 2018

Self Cite

Dysarthria is a motor speech disorder that impedes the physical production of speech. Speech in patients with dysarthria is generally characterized by poor articulation, breathy voice, and monotonic intonation. Therefore, modeling the spectral and temporal characteristics of dysarthric speech is critical for better performance in dysarthric speech recognition. Convolutional long short-term memory recurrent neural networks (CLSTM-RNNs) have recently successfully been used in normal speech recognition, but have rarely been used in dysarthric speech recognition. We hypothesized CLSTM-RNNs have the potential to capture the distinct characteristics of dysarthric speech, taking advantage of convolutional neural networks (CNNs) for extracting effective local features and LSTM-RNNs for modeling temporal dependencies of the features. In this paper, we investigate the use of CLSTM-RNNs for dysarthric speech recognition. Experimental evaluation on a database collected from nine dysarthric patients showed that our approach provides substantial improvement over both standard CNN and LSTM-RNN based speech recognizers.

“…System References Model Hybrid [74], [75], [113], [141], [153], [154], [168], [178]-[180], [195], [212], [213], [230], [240], [284]-[290] E2E…”

Section: Level (mentioning)

confidence: 99%

Adaptation Algorithms for Neural Network-Based Speech Recognition: An Overview

Bell, Fainberg, Klejch, et al. 2021

IEEE Open J. Signal Process.

“…Korean Phonetically Optimized Words (KPOW) database, Korean Phonetically Balanced Words (KPBW) database, Korean Phonetically Rich Words (KPRW) database and SI dysarthria adaptation were used for dysarthric speech recognition with KL-HMM and compared with GMM-HMM and DNN-HMM. The KL-HMM framework was shown to be effective in improving performance for dysarthric speakers [29].…”

Section: Related Work (mentioning)

confidence: 99%

Dysarthric Speech Recognition using Convolutional Recurrent Neural Networks

Albaqshi, Sagheer 2020

IJIES

Automatic speech recognition (ASR) automatically transcribes the human voice into text. Recently, ASR systems have nearly reached human performance in specific scenarios. In contrast, dysarthric speech recognition (DSR) remains a challenging task for many reasons, including unintelligible speech, irregular phoneme articulation, and the scarcity and heterogeneity of data. Most existing DSR work uses ASR systems trained on unimpaired speech to recognize impaired speech, which is impractical and inefficient. In this paper, we developed a deep convolutional recurrent neural network (CRNN) model and compared its performance with a vanilla convolutional neural network (CNN) model. We train both models on samples from the Torgo dataset, which contains a mix of impaired and unimpaired speech data. The experimental results show that the CRNN model attains 40.6% against 31.4% for the vanilla CNN. This indicates the effectiveness of the proposed hybrid CRNN structure in improving the recognition of dysarthric speech.
