Adversarial Data Augmentation for Disordered Speech Recognition

Jin, Zengrui; Xie, Xurong; Yu, Jianwei; Liu, Shansong; Liu, Xunying; Meng, Helen

doi:10.48550/arxiv.2108.00899

Cited by 1 publication

(1 citation statement)

References 42 publications

(60 reference statements)


“…A second approach is to decrease the model size [7], or to train an inserted small module instead of finetuning the whole model [8,9], so the number of parameters learned on the dysarthric data is limited. Thirdly and differently from the solutions that work on training strategy or model structure, [10,11,12,13] focus directly on the data and do augmentation to generate more dysarthric speech for use in training.…”

Section: Introduction (mentioning)

confidence: 99%

Weak-Supervised Dysarthria-Invariant Features for Spoken Language Understanding Using an Fhvae and Adversarial Training

hamme

2023

2022 IEEE Spoken Language Technology Workshop (SLT)


The scarcity of training data and the large speaker variation in dysarthric speech lead to poor accuracy and poor speaker generalization of spoken language understanding systems for dysarthric speech. Through work on the speech features, we focus on improving the model generalization ability with limited dysarthric data. Factorized Hierarchical Variational Auto-Encoders (FHVAE) trained unsupervisedly have shown their advantage in disentangling content and speaker representations. Earlier work showed that the dysarthria shows in both feature vectors. Here, we add adversarial training to bridge the gap between the control and dysarthric speech data domains. We extract dysarthric and speaker invariant features using weak supervision. The extracted features are evaluated on a Spoken Language Understanding task and yield a higher accuracy on unseen speakers with more severe dysarthria compared to features from the basic FHVAE model or plain filterbanks.


