Interspeech 2022
DOI: 10.21437/interspeech.2022-712

Conformer Based Elderly Speech Recognition System for Alzheimer’s Disease Detection

Cited by 11 publications
(6 citation statements)
“…It is noticed that when using the ground truth transcripts rather than ASR outputs, a comparable or worse performance was obtained with a F1 score of 87%. Wang et al (2022a) employed ASR optimization using neural architecture search, cross-domain adaptation and fine-grained elderly speaker adaptation and multi-pass rescoring based system combination with hybrid TDNN.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…For example, Sarawgi et al (2020) extracted three diverse features and used model fusion strategies, resulting in an accuracy of 88% on Pitt dataset and 83.3% on the ADReSS dataset. Wang et al (2022a) employed ASR optimization and model fusion strategies based on BERT and RoBERTa features. As a result, the paper achieved state-of-the-art performance with a F1 score of 92% on the Pitt dataset.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…We conduct our experiments with either ground truth manual transcripts or transcripts generated by ASR system from audios. ASR systems: The experimental results in [29] suggest that the transcripts generated by the adapted hybrid CNN-TDNN ASR system [24] achieve better AD detection performance than those obtained from the adapted E2E Conformer model [38]. Hence, the hybrid CNN-TDNN ASR system is used.…”
Section: Text Data; citation type: mentioning
confidence: 99%
“…The wav2vec 2.0 models have also been used, for example, for detection of aphasia [22], for detection of stuttering [23], and for speech rating of disordered children's speech [24]. Various pre-training approaches have been used to detect Alzheimer's disease [25], [26], and heart failure [27]. However, only a few studies have applied these techniques on multi-class classification of voice disorders.…”
Section: Introduction; citation type: mentioning
confidence: 99%