Background
We sought to leverage data routinely collected in electronic health records (EHRs), with the goal of developing patient risk stratification tools for predicting risk of developing Alzheimer's disease (AD).
Method
Using EHR data from the University of Michigan (UM) hospitals and consensus‐based diagnoses from the Michigan Alzheimer's Disease Research Center, we developed and validated a cohort discovery tool for identifying patients with AD. Applied to all UM patients, these labels were used to train an EHR‐based machine learning model for predicting AD onset within 10 years.
Results
Applied to a test cohort of 1697 UM patients, the model achieved an area under the receiver operating characteristic curve of 0.70 (95% confidence interval = 0.63‐0.77). Important predictive factors included cardiovascular factors and laboratory blood testing.
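For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an area under the receiver operating characteristic curve (AUROC) and a 95% confidence interval can be computed. The abstract does not state how its interval was obtained; the percentile bootstrap used here is an assumption, and the function names (`auroc`, `bootstrap_ci`) and the toy labels and scores are hypothetical illustrations, not study data.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (an assumed method;
    the source does not say which one was used)."""
    n = len(labels)
    if not 0 < sum(labels) < n:
        raise ValueError("need at least one positive and one negative label")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy example (made-up values): 1 = developed AD, 0 = did not.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
point_estimate = auroc(toy_labels, toy_scores)
interval = bootstrap_ci(toy_labels, toy_scores, n_boot=200)
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is why a value of 0.70 is described as modest.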
Conclusion
Routinely collected EHR data can be used to predict AD onset with modest accuracy. Mining routinely collected data could shed light on early indicators of AD appearance and progression.
Introduction
Models characterizing intermediate disease stages of Alzheimer's disease (AD) are needed to inform clinical care and prognosis. Current models, however, use only a small subset of available biomarkers, capturing only coarse changes along the complete spectrum of disease progression. We propose the use of machine learning techniques and clinical, biochemical, and neuroimaging biomarkers to characterize progression to AD.
Methods
We used a large multimodal longitudinal data set of biomarkers and demographic and genotype information from 1624 participants from the Alzheimer's Disease Neuroimaging Initiative. Using hidden Markov models, we characterized intermediate disease stages. We validated inferred disease trajectories by comparing time to first clinical AD diagnosis. We trained an L2-regularized logistic regression model to predict disease trajectory and evaluated its discriminative performance on a test set.
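As an illustration of the final modeling step named above, the following is a minimal sketch of L2-regularized logistic regression fitted by batch gradient descent. It is not the authors' implementation: the optimizer, the hyperparameters (`lam`, `lr`, `epochs`), the function names, and the one-feature toy data are all assumptions made for the example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l2_logistic(X, y, lam=0.1, lr=0.1, epochs=2000):
    """Minimize mean log-loss + (lam / 2) * ||w||^2 by gradient descent.
    The intercept b is left unpenalized, a common convention."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            # Data gradient plus the L2 penalty gradient lam * w[j].
            w[j] -= lr * (grad_w[j] / n + lam * w[j])
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability of the positive class for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy example (made-up values): one feature, binary trajectory label.
X_toy = [[0.0], [1.0], [2.0], [3.0]]
y_toy = [0, 0, 1, 1]
w_fit, b_fit = fit_l2_logistic(X_toy, y_toy, lam=0.1)
w_strong, _ = fit_l2_logistic(X_toy, y_toy, lam=1.0)
```

Raising `lam` shrinks the fitted weights toward zero, which is the effect the L2 penalty is meant to have when many correlated biomarkers are used as predictors.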
Results
We identified 12 distinct disease states. Progression to AD occurred most often through one of two possible paths through these states. Paths differed in terms of rate of disease progression (by 5.44 years on average), amyloid and total-tau (t-tau) burden (by 10% and 69%, respectively), and hippocampal neurodegeneration (P < .001). On the test set, the predictive model achieved an area under the receiver operating characteristic curve of 0.85.
Discussion
Progression to AD, in terms of biomarker trajectories, can be predicted based on participant-specific factors. Such disease staging tools could help in targeting high-risk patients for therapeutic intervention trials. As longitudinal data with richer features are collected, such models will help increase our understanding of the factors that drive the different trajectories of AD.
Introduction: Studies investigating the relationship between blood pressure (BP) measurements from electronic health records (EHRs) and Alzheimer's disease (AD) rely on summary statistics, like BP variability, and have only been validated at a single institution. We hypothesize that leveraging BP trajectories can accurately estimate AD risk across different populations.
Methods: In a retrospective cohort study, EHR data from Veterans Affairs (VA) patients were used to train and internally validate a machine learning model to predict AD onset within 5 years. External validation was conducted on patients from Michigan Medicine (MM).
Results: The VA and MM cohorts included 6860 and 1201 patients, respectively. Model performance using BP trajectories was modest but comparable (area under the receiver operating characteristic curve [AUROC] = 0.64 [95% confidence interval (CI) = 0.54-0.73] for VA vs. AUROC = 0.66 [95% CI = 0.55-0.76] for MM).
Conclusion: Approaches that directly leverage BP trajectories from EHR data could aid in AD risk stratification across institutions.