ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp43922.2022.9746989

Exploiting Cross Domain Acoustic-to-Articulatory Inverted Features for Disordered Speech Recognition

Cited by 9 publications (4 citation statements) · References 35 publications

“…Their experimental results on the UASpeech corpus demonstrated that the AVSR based on cross-domain visual feature generation outperformed baseline ASR and AVSR without this approach. Hu et al. [130] proposed a cross-domain acoustic-to-articulatory inversion approach. Their model was pre-trained using parallel acoustic-articulatory data from the 15-hour TORGO corpus.…”
Section: Trends of AVSR Technologies for Dysarthric Speech (mentioning)
Confidence: 99%
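
The statement above describes the paper's core technique: an inversion model that maps acoustic features to articulatory trajectories, pre-trained on parallel acoustic-articulatory data (TORGO) before being applied cross-domain. As a rough illustration only, and not the authors' actual architecture, here is a minimal PyTorch sketch of such an acoustic-to-articulatory inversion (AAI) model; the class name, layer sizes, and feature dimensions are all hypothetical:

import torch
import torch.nn as nn

class AAIModel(nn.Module):
    """Hypothetical acoustic-to-articulatory inversion (AAI) network.

    Regresses frame-level articulatory trajectories (e.g. EMA sensor
    coordinates) from acoustic features. Layer sizes and dimensions are
    illustrative, not taken from the cited paper.
    """
    def __init__(self, acoustic_dim=40, articulatory_dim=12, hidden=256):
        super().__init__()
        # Bidirectional LSTM encoder over the acoustic frame sequence.
        self.encoder = nn.LSTM(acoustic_dim, hidden, num_layers=2,
                               batch_first=True, bidirectional=True)
        # Linear regression head: one articulatory vector per frame.
        self.head = nn.Linear(2 * hidden, articulatory_dim)

    def forward(self, acoustic_feats):
        # acoustic_feats: (batch, frames, acoustic_dim)
        hidden_states, _ = self.encoder(acoustic_feats)
        return self.head(hidden_states)  # (batch, frames, articulatory_dim)

model = AAIModel()
feats = torch.randn(4, 200, 40)       # 4 utterances, 200 frames each
pred = model(feats)                   # predicted articulatory trajectories
# Frame-level MSE against measured articulatory data is the standard
# pre-training objective for inversion models of this kind.
loss = nn.functional.mse_loss(pred, torch.randn(4, 200, 12))

After pre-training, the predicted (inverted) articulatory features are typically fused with the acoustic features as input to the downstream recognizer, which is the cross-domain exploitation the title refers to.
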
“…and a 1h evaluation set (1892 utt.). Further speed perturbation [24], [104] produces a 34.1h augmented training set (61813 utt.).…”
Section: A. Experiments on Dysarthric Speech (mentioning)
Confidence: 99%
“…After removal of excessive silence, the training and test sets contain 6.5 hours (14541 utterances) and 1 hour (1892 utterances) of speech, respectively. After data augmentation with both speaker-dependent and speaker-independent speed perturbation [15], [126], the augmented training set contains 34.1 hours of data (61813 utterances).…”
Section: A. Experiments on Dysarthric Speech (mentioning)
Confidence: 99%
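
Both statements above refer to speed perturbation for data augmentation. This is typically implemented by resampling the waveform at a few fixed factors; the Kaldi-style defaults 0.9/1.0/1.1 are assumed below, and the cited works additionally combine speaker-dependent with speaker-independent factors, which is how 14541 training utterances grow to 61813 rather than a plain threefold copy. A minimal sketch using torchaudio's sox effects; the 16 kHz rate and the factor list are assumptions, not details from the cited papers:

import torch
import torchaudio

def speed_perturb(waveform, sample_rate=16000, factors=(0.9, 1.0, 1.1)):
    """Return one perturbed copy of the waveform per speed factor.

    Uses sox's 'speed' effect (resampling, so pitch shifts with tempo),
    followed by 'rate' to restore the original sampling rate. The
    0.9/1.0/1.1 factors are common defaults, assumed here.
    """
    copies = []
    for f in factors:
        perturbed, _ = torchaudio.sox_effects.apply_effects_tensor(
            waveform, sample_rate,
            [["speed", str(f)], ["rate", str(sample_rate)]])
        copies.append(perturbed)
    return copies

wav = torch.randn(1, 16000)             # 1 s of dummy 16 kHz audio
augmented = speed_perturb(wav)          # 3 copies: slow, original, fast
print([c.shape[1] for c in augmented])  # ~17778, 16000, ~14545 samples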