Background: The manual diagnosis of neurodegenerative disorders such as Alzheimer's disease (AD) and related Dementias has been a challenge. Currently, these disorders are diagnosed using specific clinical diagnostic criteria and neuropsychological examinations. Machine Learning algorithms that build automated diagnostic models from low-level linguistic features of verbal utterances could aid the diagnosis of patients with probable AD in a large population. For this purpose, we developed different Machine Learning models on the DementiaBank language transcript clinical dataset, consisting of 99 patients with probable AD and 99 healthy controls.
Results: Our models learned several syntactic, lexical, and n-gram linguistic biomarkers to distinguish the probable AD group from the healthy group. Compared with the healthy group, the probable AD patients used significantly fewer syntactic components and significantly more lexical components in their language. We also observed a significant difference in the use of n-grams, as the healthy group were able to identify and make sense of more objects in their n-grams than the probable AD group. Our best diagnostic model, built with Support Vector Machines (SVM), significantly distinguished the probable AD group from the healthy elderly group with a better Area Under the Receiver Operating Characteristic Curve (AUC).
Conclusions: Experimental and statistical evaluations suggest that using ML algorithms to learn linguistic biomarkers from the verbal utterances of elderly individuals could help the clinical diagnosis of probable AD. We emphasise that the best ML model for predicting the disease group combines significant syntactic, lexical and top n-gram features. However, there is a need to train the diagnostic models on larger datasets, which could lead to a better AUC and clinical diagnosis of probable AD.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1456-0) contains supplementary material, which is available to authorized users.
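
For readers unfamiliar with the AUC metric mentioned above, the following is a minimal sketch of how it can be computed from classifier scores. It is not the authors' evaluation code. The function name `auc` and the example scores are hypothetical, and the label 1 is assumed to mark the probable AD group. The sketch uses the pairwise interpretation of AUC: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Assumption: label 1 is the positive class, label 0 the negative class.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up scores for four speakers: two positive, two negative.
example_auc = auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups. The quadratic loop is fine for a dataset of this size (198 speakers); a rank-based formulation would be preferable for much larger ones.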
Early diagnosis of neurodegenerative disorders (ND) such as Alzheimer's disease (AD) and related Dementias is currently a challenge. AD can only be diagnosed by examining the patient's brain after death, and Dementia is typically diagnosed by consensus using specific diagnostic criteria and extensive neuropsychological examinations with tools such as the Mini-Mental State Examination (MMSE) or the Montreal Cognitive Assessment (MoCA). In this paper, we use several Machine Learning (ML) algorithms to build diagnostic models using syntactic and lexical features extracted from the verbal utterances of patients with AD and related Dementias. Our best diagnostic model, built with Support Vector Machines (SVM), distinguished the AD and related Dementias group from the healthy elderly group with a 74% F-measure. Additionally, we perform several statistical tests to assess the significance of the selected linguistic features. Our results show that syntactic and lexical features could be good indicators for helping to diagnose AD and related Dementias.
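
As a reminder of what the F-measure reported above summarises, here is a minimal sketch of the standard F1 computation (the harmonic mean of precision and recall) for one positive class. The abstract does not say whether the 74% figure is per-class or averaged, so this is an illustration of the metric only. The function name `f_measure` and the example labels are hypothetical.

```python
def f_measure(y_true, y_pred, positive=1):
    """F1 score for one class: harmonic mean of precision and recall.

    Assumption: `positive` marks the class of interest (e.g. the patient group).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 2 true positives, 1 false positive, 1 false negative.
example_f = f_measure([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```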
It has been quite a challenge to diagnose Mild Cognitive Impairment due to Alzheimer's disease (MCI) and Alzheimer-type dementia (AD-type dementia) using the currently available clinical diagnostic criteria and neuropsychological examinations. We therefore propose an automated diagnostic technique that applies a variant of deep neural network language models (DNNLM) to the verbal utterances of affected individuals. Motivated by the success of DNNLM on natural language tasks, we propose a combination of deep neural network and deep language models (D2NNLM) for classifying the disease. Results on the DementiaBank language transcript clinical dataset show that D2NNLM learned several linguistic biomarkers in the form of higher-order n-grams well enough to distinguish the affected group from the healthy group with reasonable accuracy on very sparse clinical datasets.
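
To make the notion of higher-order n-grams concrete, the sketch below counts word n-grams up to a chosen order in a transcript. It illustrates the feature type only and is not the D2NNLM model or the authors' preprocessing. The function names, the whitespace tokenisation, and the example sentence are all assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token tuples from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, max_n=3):
    """Count every n-gram of order 1..max_n in a lower-cased transcript.

    Assumption: simple whitespace tokenisation; a real pipeline would
    likely handle punctuation and transcription markers as well.
    """
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


# Invented utterance, not taken from any clinical dataset.
counts = ngram_counts("the boy is taking the cookie", max_n=3)
```

Each distinct n-gram becomes one feature, so the feature space grows quickly with the order while most counts stay at zero or one. This is one reason small clinical transcript sets are described as sparse.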