Interspeech 2020
DOI: 10.21437/interspeech.2020-3137
Multimodal Inductive Transfer Learning for Detection of Alzheimer’s Dementia and its Severity

Abstract: Alzheimer's disease is estimated to affect around 50 million people worldwide and is rising rapidly, with a global economic burden of nearly a trillion dollars. This calls for scalable, cost-effective, and robust methods for detection of Alzheimer's dementia (AD). We present a novel architecture that leverages acoustic, cognitive, and linguistic features to form a multimodal ensemble system. It uses specialized artificial neural networks with temporal characteristics to detect AD and its severity, which is ref…


Cited by 36 publications (24 citation statements)
References 25 publications
“…Karlekar et al. (2018) achieved 91% accuracy using a Convolutional Neural Network (CNN)-RNN model trained on part-of-speech-tagged utterances. Using CNNs on both DementiaBank and ADReSS data, Sarawgi et al. (2020) presented an ensemble of three models: disfluency, acoustic, and intervention. Balagopalan et al. (2020) and Pappagari et al. (2020) showed that fine-tuned bidirectional encoder representations from transformers (BERT) outperformed models with hand-engineered features.…”
Section: Related Work
confidence: 99%
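The three-model ensemble mentioned in the excerpt above can be illustrated with a minimal sketch. The Python snippet below assumes each specialized model (disfluency, acoustic, intervention) exposes scikit-learn-style fit/predict methods and fuses their binary predictions by majority vote; the interfaces and the fusion rule are illustrative assumptions, not the exact architecture of Sarawgi et al. (2020).

```python
# Minimal sketch of a three-model ensemble for AD detection.
# Assumes each base model exposes scikit-learn-style fit/predict;
# majority-vote fusion is an illustrative choice, not necessarily
# the rule used in the cited work.
import numpy as np

class ThreeModelEnsemble:
    def __init__(self, disfluency_model, acoustic_model, intervention_model):
        # One specialized model per feature stream.
        self.models = [disfluency_model, acoustic_model, intervention_model]

    def fit(self, feature_sets, y):
        # feature_sets: list of three feature matrices, one per stream.
        for model, X in zip(self.models, feature_sets):
            model.fit(X, y)
        return self

    def predict(self, feature_sets):
        # Stack per-stream binary predictions and take a majority vote.
        votes = np.stack([m.predict(X) for m, X in zip(self.models, feature_sets)])
        return (votes.sum(axis=0) >= 2).astype(int)
```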
“…Following several other works that used the DB data set (Hernández-Domínguez et al., 2018; Pou-Prom and Rudzicz, 2018; Sarawgi et al., 2020), all of our experiments are conducted with K-fold cross-validation. While the small size of the DB data set helps to justify this as a validation procedure, optimizing a cross-validated performance metric (accuracy, F1, etc.)…”
Section: Discussion
confidence: 99%
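As a rough illustration of the K-fold cross-validation protocol referred to above, the sketch below evaluates a placeholder classifier over stratified folds and reports the mean and standard deviation of accuracy and F1. The classifier, K = 10, and metrics are assumptions for illustration, not the exact setup of the cited studies.

```python
# Illustrative K-fold cross-validation loop, as commonly used on small
# corpora such as DementiaBank. Classifier and K=10 are placeholders.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score

def cross_validate(X, y, n_splits=10, seed=0):
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    accs, f1s = [], []
    for train_idx, test_idx in skf.split(X, y):
        clf = LogisticRegression(max_iter=1000)
        clf.fit(X[train_idx], y[train_idx])
        pred = clf.predict(X[test_idx])
        accs.append(accuracy_score(y[test_idx], pred))
        f1s.append(f1_score(y[test_idx], pred))
    # Report mean +/- std over folds rather than a single optimistic number.
    return np.mean(accs), np.std(accs), np.mean(f1s), np.std(f1s)
```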
“…Methods with bimodal input features (both acoustic and linguistic) are also used for AD recognition in various studies (Sarawgi et al., 2020a; Sarawgi et al., 2020b; Campbell et al., 2020; Koo et al., 2020; Pompili et al., 2020; Rohanian et al., 2020). However, in this work, we restrict ourselves to the NLP-based approaches.…”
Section: Bimodal Methods
confidence: 99%
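One simple way to realize bimodal (acoustic + linguistic) input, shown here only as a hedged sketch, is early fusion: concatenating per-speaker feature vectors from both modalities before a single classifier. The feature dimensions, placeholder data, and SVM classifier below are illustrative assumptions; the cited works combine modalities in a variety of ways.

```python
# Minimal early-fusion sketch for bimodal (acoustic + linguistic) AD
# recognition: per-speaker feature vectors from the two modalities are
# concatenated before a single classifier. Feature extraction and the
# classifier choice are illustrative, not those of the cited works.
import numpy as np
from sklearn.svm import SVC

def fuse_features(acoustic_feats, linguistic_feats):
    # Both arrays have shape (n_samples, d); concatenate along features.
    return np.concatenate([acoustic_feats, linguistic_feats], axis=1)

# Example with random placeholder features for 10 speakers.
rng = np.random.default_rng(0)
X = fuse_features(rng.normal(size=(10, 40)), rng.normal(size=(10, 25)))
y = np.array([0, 1] * 5)  # placeholder AD / non-AD labels
clf = SVC(kernel="rbf").fit(X, y)
```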