Background: COVID-19 is caused by the SARS-CoV-2 virus and has strikingly heterogeneous clinical manifestations, with most individuals contracting mild disease but a substantial minority experiencing fulminant cardiopulmonary symptoms or death. The clinical covariates and the laboratory tests performed on a patient provide robust statistics to guide clinical treatment. Deep learning approaches on a data set of this nature enable patient stratification and provide methods to guide clinical treatment.

Objective: Here, we report on the development and prospective validation of a state-of-the-art machine learning model to provide mortality prediction shortly after confirmation of SARS-CoV-2 infection in the Mayo Clinic patient population.

Methods: We retrospectively constructed one of the largest reported and most geographically diverse laboratory information system and electronic health record COVID-19 data sets in the published literature, which included 11,807 patients residing in 41 states of the United States of America and treated at medical sites across 5 states in 3 time zones. Traditional machine learning models were evaluated independently as well as in a stacked learner approach by using AutoGluon, and various recurrent neural network architectures were considered. The traditional machine learning models were implemented using the AutoGluon-Tabular framework, whereas the recurrent neural networks utilized the TensorFlow Keras framework. We trained these models to operate solely using routine laboratory measurements and clinical covariates available within 72 hours of a patient’s first positive COVID-19 nucleic acid test result.

Results: The GRU-D recurrent neural network achieved peak cross-validation performance, with an area under the receiver operating characteristic curve (AUROC) of 0.938 (SE 0.004). This model retained strong performance when the follow-up time was reduced to 12 hours (AUROC 0.916 [SE 0.005]), and the leave-one-out feature importance analysis indicated that the most independently valuable features were age, Charlson comorbidity index, minimum oxygen saturation, fibrinogen level, and serum iron level. In the prospective testing cohort, this model provided an AUROC of 0.901 and a statistically significant difference in survival (P<.001, hazard ratio for those predicted to survive, 95% CI 0.043-0.106).

Conclusions: Our deep learning approach using GRU-D provides an alert system to flag mortality for COVID-19–positive patients by using clinical covariates and laboratory values within a 72-hour window after the first positive nucleic acid test result.
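As a minimal sketch of the headline metric: AUROC can be read as the probability that a randomly chosen patient who died receives a higher risk score than a randomly chosen survivor. The Python below computes it by direct pairwise comparison and summarizes per-fold values as a mean and standard error. The function names and the toy inputs are hypothetical, and treating the reported SE as a standard error across cross-validation folds is an assumption, not something the abstract states.

```python
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_and_se(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard error of the mean (sample variance, n - 1 denominator)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate a standard error")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, (variance / n) ** 0.5


if __name__ == "__main__":
    # Toy data only: 1 = died, 0 = survived; scores are made-up risk estimates.
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    # Hypothetical per-fold AUROC values, not the study's results.
    print(mean_and_se([0.90, 0.94, 0.92]))
```

The pairwise loop is quadratic in cohort size, so a rank-based implementation from an established library would be the practical choice for a cohort of this scale; the loop is used here only because it makes the definition explicit.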
BACKGROUND: COVID-19 is caused by the SARS-CoV-2 virus and has strikingly heterogeneous clinical manifestations, with most individuals contracting mild disease but a substantial minority experiencing fulminant cardiopulmonary symptoms or death. The clinical covariates and the lab tests performed on a patient provide robust statistics to guide clinical treatment. Deep learning approaches on a dataset of this nature enable patient stratification and provide methods to guide clinical treatment.

OBJECTIVE: Here we report on the development and prospective validation of a state-of-the-art machine learning model to provide mortality prediction shortly after confirmation of SARS-CoV-2 infection in the Mayo Clinic patient population.

METHODS: We constructed one of the largest reported and most geographically diverse laboratory information system (LIS) and electronic health record (EHR) COVID-19 datasets in the published literature, which included 11,808 patients with residence in 41 states, treated at medical sites across five states in three time zones. These data were split by date into 80/20 training and prospective testing cohorts. In the training data, model selection and evaluation were performed using stratified 10-fold cross-validation. Traditional machine learning models were evaluated independently as well as in a stacked learner approach using AutoGluon, and various recurrent neural network architectures were considered. We trained these models to operate solely using routine laboratory measurements and clinical covariates available within 72 hours of a patient’s first positive COVID-19 nucleic acid test.

RESULTS: The GRU-D recurrent neural network achieved peak cross-validation performance with 0.938±0.004 AUROC. In cross-validation, this model provided an accuracy of 89% (95% CI: [88,90]), a recall of 80% (95% CI: [74,85]), a precision of 17% (95% CI: [15,19]), a negative predictive value (NPV) of 99% (95% CI: [99,100]), and statistically significant stratification in our Cox proportional hazards survival model (risk 18.9, P<.001). The model retained strong performance when the follow-up time was reduced to 12 hours (0.916±0.005 AUROC), and leave-one-out feature importance analysis indicated that the most independently valuable features were age, Charlson score, minimum oxygen saturation, fibrinogen level, and serum iron level. In the prospective testing cohort, this model provided an AUROC of 0.901, an accuracy of 78% (95% CI: [76,79]), a recall of 85% (95% CI: [77,91]), a precision of 14% (95% CI: [12,17]), an NPV of 99% (95% CI: [99,100]), and a statistically significant difference in survival (P<.001, hazard ratio for those predicted to survive: 95% CI [0.043,0.106]).

CONCLUSIONS: Our deep learning approach using GRU-D provides an alert system to flag mortality for COVID-19-positive patients, using clinical covariates and lab values within a 72-hour window after the first positive nucleic acid test.
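The combination of high recall and NPV with low precision reported above is what one would expect when the outcome is rare. The following is a minimal sketch of how the four metrics are derived from a confusion matrix. The `Confusion` class, the function name, and the counts are all invented for illustration; they are not the study's data or code.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Confusion:
    tp: int  # predicted death, died
    fp: int  # predicted death, survived
    tn: int  # predicted survival, survived
    fn: int  # predicted survival, died


def classification_metrics(c: Confusion) -> Dict[str, float]:
    """Accuracy, recall, precision and NPV from raw counts.

    Assumes every denominator is non-zero; a production version
    would need to handle empty rows or columns explicitly.
    """
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "recall": c.tp / (c.tp + c.fn),
        "precision": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


if __name__ == "__main__":
    # Hypothetical cohort of 1,000 patients with 20 deaths (2% prevalence).
    example = Confusion(tp=17, fp=100, tn=880, fn=3)
    for name, value in classification_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

With these made-up counts, recall is 0.85 and NPV is about 0.997, while precision is only about 0.145: the 100 false alarms outnumber the 17 true positives because survivors vastly outnumber deaths. This is the usual trade-off for an alert system tuned to miss few deaths.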