Interspeech 2021
DOI: 10.21437/interspeech.2021-1519

Using the Outputs of Different Automatic Speech Recognition Paradigms for Acoustic- and BERT-Based Alzheimer’s Dementia Detection Through Spontaneous Speech

Abstract: Exploring acoustic and linguistic information embedded in spontaneous speech recordings has proven to be efficient for automatic Alzheimer's dementia detection. Acoustic features can be extracted directly from the audio recordings; however, in fully automatic systems, linguistic features need to be extracted from transcripts generated by an automatic speech recognition (ASR) system. We explore two state-of-the-art ASR paradigms, Wav2vec2.0 (for transcription and feature extraction) and time delay neural netwo…

Citations: cited by 27 publications (27 citation statements)
References: 23 publications
“…We benchmark our proposed GPT-3 embedding (Babbage) method against other state-of-the-art AD detection models. The existing methods include the studies from Luz et al [21], Balagopalan & Novikova [8] and Pan et al [31], which all used the ADReSSo Challenge data. The models selected are all trained based on the 10-fold CV and evaluated on the same unseen test set to ensure fair comparison.…”
Section: Results (mentioning)
confidence: 99%
“…The models selected are all trained based on the 10-fold CV and evaluated on the same unseen test set to ensure fair comparison. For example, we do not include Model 4 & 5 in Pan et al [31] as the models were trained by holding out 20% of the training set. Instead, we select the best model (Model 2), which was trained using 10-fold CV.…”
Section: Results (mentioning)
confidence: 99%
“…In addition, some other text-based pre-trained models work well. For example, the accuracies of BERT, part of BERT or BERT-based adaptation models [46,47,54,65] were between 81% and 84.51%. Except for the text-based pre-trained models, audio and image-based pre-trained models also have been explored in speech-based AD detection.…”
Section: Comparisons of Methods for the ADReSS Challenge (mentioning)
confidence: 99%
“…based assistive technologies more natural alternatives [23], [24] even though speech quality is degraded. To this end, in recent years there has been increasing interest in developing ASR technologies that are suitable for dysarthric [9], [25]-[40] and elderly speech [14], [41]-[46].…”
Section: Introduction (mentioning)
confidence: 99%