Background Sepsis is diagnosed in millions of people every year, resulting in a high mortality rate. Although patients with sepsis present multimorbid conditions, including cancer, sepsis predictions have mainly focused on patients with severe injuries. Objective In this paper, we present a machine learning–based approach to identify the risk of sepsis in patients with cancer using electronic health records (EHRs). Methods We used deidentified longitudinal EHRs of 8580 patients with cancer from the Samsung Medical Center in Korea, collected between 2014 and 2019. To build a prediction model based on physical status that would differ between sepsis and nonsepsis patients, we analyzed 2462 laboratory test results and 2266 medication prescriptions using graph network and statistical analyses. The medication relationships and lab test results from each analysis were used as additional learning features to train our predictive model. Results Patients with sepsis showed differential medication trajectories and physical status. For example, in the network-based analysis, narcotic analgesics were prescribed more often in the sepsis group, along with other drugs. Likewise, 35 types of lab tests, including albumin, globulin, and prothrombin time, showed significantly different distributions between sepsis and nonsepsis patients (P<.001). Our model outperformed the model trained using only common EHRs, improving accuracy, area under the receiver operating characteristic curve (AUROC), and F1 score by 11.9%, 11.3%, and 13.6%, respectively. For the random forest–based model, the accuracy, AUROC, and F1 score were 0.692, 0.753, and 0.602, respectively. Conclusions We showed that lab tests and medication relationships can be used as efficient features for predicting sepsis in patients with cancer. Consequently, identifying the risk of sepsis in patients with cancer using EHRs and machine learning is feasible.
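The three metrics reported above (accuracy, AUROC, and F1 score) can be illustrated with a minimal pure-Python sketch. The labels, scores, and the 0.5 decision threshold below are made-up toy values, not data from the study, and the function names are hypothetical; the abstract does not describe how the authors computed their metrics.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is quadratic in the
    sample size, which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = sepsis, 0 = nonsepsis.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(accuracy(y_true, y_pred))  # 0.625
print(f1_score(y_true, y_pred))  # 6/9, about 0.667
print(auroc(y_true, scores))     # 0.8125
```

Accuracy and F1 depend on the chosen threshold, whereas AUROC is computed from the raw scores, which is one reason all three are usually reported together.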
Understanding mortality that results from debilitation across multiple diseases is crucial for patient stratification. Here, we systematically report comprehensive mortality data that map the temporal correlation of diseases that tend toward in-hospital death. We used a mortality trajectory model that represents the temporal ordering of strongly correlated disease appearances, starting from one initial diagnosis and terminating in a fatal outcome, in a set of patients across multiple admissions. Based on longitudinal healthcare records of 10.4 million patients from over 350 hospitals, we profiled 300 mortality trajectories, starting from 118 diseases, in 311,309 patients. Three-quarters (75%) of 59,794 end-stage patients and their deaths accrued throughout 160,360 multiple disease appearances in a short-term period (<4 years, 3.5 diseases per patient). This overlooked and substantial real-world heterogeneity of patients and outcomes is unraveled in our trajectory map at the disease-wide level. For example, the converged dead-end in our trajectory map presents an extreme diversity of sepsis patients based on 43 prior diseases, including lymphoma and cardiac diseases. The trajectories involving the largest number of deaths for each age group highlight the essential predisposing diseases, such as acute myocardial infarction and liver cirrhosis, which lead to over 14,000 deaths. In conclusion, deciphering the debilitation processes of patients, consisting of the temporal correlations of diseases that tend toward hospital death, is feasible at a population-wide level.
Observations of comorbidity between heart diseases, including cardiac dysfunction (CD), and cognitive impairment, such as Alzheimer's disease and dementia (AD/D), are increasing. This comorbidity might be due to a pleiotropic effect of genetic variants shared between CD and AD/D. Here, we validated comorbidity of CD and AD/D based on diagnostic records from millions of patients in Korea and the University of California, San Francisco Medical Center (odds ratio 11.5; 95% confidence interval [CI] 8.5–15.5). By integrating a comprehensive human disease–SNP association database (VARIMED, VARiants Informing MEDicine) and whole-exome sequencing of 50 brains from individuals with and without Alzheimer's disease (AD), we identified missense variants in coding regions including APOB, a known risk factor for CD and AD/D, which potentially have a pleiotropic role in both diseases. Of the identified variants, site-directed mutation of ADIPOQ (268 G > A; Gly90Ser) in neurons produced abnormal aggregation of tau proteins (p = 0.02), suggesting a functional impact for AD/D. The association of CD and ADIPOQ variants was confirmed based on domain deletion in cardiac cells. Using UK Biobank data from over 500,000 individuals, we examined a pleiotropic effect of the ADIPOQ variant by comparing CD- and AD/D-associated phenotypic evidence, including cardiac hypertrophy and cognitive degeneration. These results indicate that convergence of health care records and genetic evidence may help to dissect the molecular underpinnings of heart disease and associated cognitive impairment, and could potentially serve a prognostic function. Validation of disease–disease associations through health care records and genomic evidence can determine whether health conditions share risk factors based on pleiotropy.
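For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained from diagnostic records, the following is a minimal sketch using a 2x2 contingency table and the Woolf (log) method. The abstract does not say which interval method the authors used, so the Woolf method is an assumption here, and the counts and the function name `odds_ratio_ci` are hypothetical and chosen only for illustration.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio with a Woolf (log-scale) confidence interval.

    2x2 table layout (assumed for this sketch):
        a = patients with both conditions
        b = patients with the first condition only
        c = patients with the second condition only
        d = patients with neither condition
    z = 1.96 corresponds to a 95% interval. All cells must be nonzero.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Woolf method")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_value, ci_low, ci_high = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR = {or_value:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```

An interval that excludes 1, as in the reported 8.5–15.5, indicates that the two conditions co-occur more often than expected by chance at the chosen confidence level.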