Background Many mortality prediction models have been developed for patients in intensive care units (ICUs); most are based on data available at ICU admission. We investigated whether machine learning methods using analyses of time-series data improved mortality prognostication for patients in the ICU by providing real-time predictions of 90-day mortality. In addition, we examined to what extent such a dynamic model could be made interpretable by quantifying and visualising the features that drive the predictions at different timepoints. Methods Based on the Simplified Acute Physiology Score (SAPS) III variables, we trained a machine learning model on longitudinal data from patients admitted to four ICUs in the Capital Region, Denmark, between 2011 and 2016. We included all patients older than 16 years of age, with an ICU stay lasting more than 1 h, and who had a Danish civil registration number to enable 90-day follow-up. We leveraged static data and physiological time-series data from electronic health records and the Danish National Patient Registry. A recurrent neural network was trained with a temporal resolution of 1 h. The model was internally validated using the holdout method with 20% of the training dataset and externally validated using previously unseen data from a fifth hospital in Denmark. Its performance was assessed with the Matthews correlation coefficient (MCC) and area under the receiver operating characteristic curve (AUROC) as metrics, using bootstrapping with 1000 samples with replacement to construct 95% CIs. A Shapley additive explanations algorithm was applied to the prediction model to obtain explanations of the features that drive patient-specific predictions, and the contributions of each of the 44 features in the model were analysed and compared with the variables in the original SAPS III model. Findings From a dataset containing 15 615 ICU admissions of 12 616 patients, we included 14 190 admissions of 11 492 patients in our analysis.
Overall, 90-day mortality was 33⋅1% (3802 patients). The deep learning model showed a predictive performance on the holdout testing dataset that improved over the timecourse of an ICU stay: MCC 0⋅29 (95% CI 0⋅25-0⋅33) and AUROC 0⋅73 (0⋅71-0⋅74) at admission, 0⋅43 (0⋅40-0⋅47) and 0⋅82 (0⋅80-0⋅84) after 24 h, 0⋅50 (0⋅46-0⋅53) and 0⋅85 (0⋅84-0⋅87) after 72 h, and 0⋅57 (0⋅54-0⋅60) and 0⋅88 (0⋅87-0⋅89) at the time of discharge. The model exhibited good calibration properties. These results were validated in an external validation cohort of 5827 patients with 6748 admissions: MCC 0⋅29 (95% CI 0⋅27-0⋅32) and AUROC 0⋅75 (0⋅73-0⋅76) at admission, 0⋅41 (0⋅39-0⋅44) and 0⋅80 (0⋅79-0⋅81) after 24 h, 0⋅46 (0⋅43-0⋅48) and 0⋅82 (0⋅81-0⋅83) after 72 h, and 0⋅47 (0⋅44-0⋅49) and 0⋅83 (0⋅82-0⋅84) at the time of discharge. Interpretation The prediction of 90-day mortality improved with 1-h sampling intervals during the ICU stay. The dynamic risk prediction can also be explained for an individual patient, visualising the features contributing to the prediction at any point in ...
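The abstract above reports the MCC with 95% CIs built from 1000 bootstrap samples drawn with replacement. As a minimal sketch of that kind of evaluation, the Python below computes the MCC from binary labels and a percentile bootstrap interval. The function names (`mcc`, `bootstrap_ci`), the percentile method, and the toy labels are assumptions for illustration; the abstract does not state which interval construction the authors used.

```python
import math
import random


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def bootstrap_ci(y_true, y_pred, metric=mcc, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed method, not confirmed by the source)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        # Resample cases with replacement, keeping label/prediction pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    toy_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0] * 20
    toy_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0] * 20
    print(mcc(toy_true, toy_pred), bootstrap_ci(toy_true, toy_pred))
```

Resampling whole cases keeps each prediction paired with its outcome, so the interval reflects sampling variability in the evaluation set rather than in the model.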
Implementing precision medicine hinges on the integration of omics data, such as proteomics, into the clinical decision-making process, but the quantity and diversity of biomedical data, and the spread of clinically relevant knowledge across multiple biomedical databases and publications, pose a challenge to data integration. Here we present the Clinical Knowledge Graph (CKG), an open-source platform currently comprising close to 20 million nodes and 220 million relationships that represent relevant experimental data, public databases and literature. The graph structure provides a flexible data model that is easily extendable to new nodes and relationships as new databases become available. The CKG incorporates statistical and machine learning algorithms that accelerate the analysis and interpretation of typical proteomics workflows. Using a set of proof-of-concept biomarker studies, we show how the CKG might augment and enrich proteomics data and help inform clinical decision-making.
Background Intensive-care units (ICUs) treat the most critically ill patients, which is complicated by the heterogeneity of the diseases that they encounter. Severity scores based mainly on acute physiology measures collected at ICU admission are used to predict mortality, but are non-specific, and predictions for individual patients can be inaccurate. We investigated whether inclusion of long-term disease history before ICU admission improves mortality predictions. Methods Registry data for long-term disease histories for more than 230 000 Danish ICU patients were used in a neural network to develop an ICU mortality prediction model. Long-term disease histories and acute physiology measures were aggregated to predict mortality risk for patients for whom both registry and ICU electronic patient record data were available. We compared mortality predictions with admission scores on the Simplified Acute Physiology Score (SAPS) II, the Acute Physiologic Assessment and Chronic Health Evaluation (APACHE) II, and the best available multimorbidity score, the Multimorbidity Index. An external validation set from an additional hospital was acquired after model construction to confirm the validity of our model. During initial model development data were split into a training set (85%) and an independent test set (15%), and a five-fold cross-validation was done during training to avoid overfitting. Neural networks were trained for datasets with disease history of 1 month, 3 months, 6 months, 1 year, 2⋅5 years, 5 years, 7⋅5 years, 10 years, and 23 years before ICU admission. Findings Mortality predictions with a model based solely on disease history outperformed the Multimorbidity Index (Matthews correlation coefficient 0⋅265 vs 0⋅065), and performed similarly to SAPS II and APACHE II (Matthews correlation coefficient with disease history, age, and sex 0⋅326 vs 0⋅347 and 0⋅300 for SAPS II and APACHE II, respectively).
Diagnoses up to 10 years before ICU admission affected current mortality prediction. Aggregation of previous disease history and acute physiology measures in a neural network yielded the most precise predictions of in-hospital mortality (Matthews correlation coefficient 0⋅391 for in-hospital mortality compared with 0⋅347 with SAPS II and 0⋅300 with APACHE II). These results for the aggregated model were validated in an external independent dataset of 1528 patients (Matthews correlation coefficient for prediction of in-hospital mortality 0⋅341). Interpretation Longitudinal disease-spectrum-wide data available before ICU admission are useful for mortality prediction. Disease history can be used to differentiate mortality risk between patients with similar vital signs with more precision than SAPS II and APACHE II scores. Machine learning models can be deconvoluted to generate novel understandings of how ICU patient features from long-term and short-term events interact with each other. Explainable machine learning models are key in clinical settings, and our results emphasise how to progress towards the transformatio...
Sepsis affects millions of people every year, many of whom will die. In contrast to current survival prediction models for sepsis patients, which are primarily based on within-admission clinical measurements (e.g. vital parameters and blood values), we aim to use the full disease history to predict sepsis mortality. We benefit from data in electronic medical records covering all hospital encounters in Denmark from 1996 to 2014. This data set included 6.6 million patients, of whom almost 120,000 were diagnosed with the ICD-10 code A41 ‘Other sepsis’. Interestingly, patients following recurrent trajectories of time-ordered co-morbidities had significantly increased sepsis mortality compared to those who did not follow a trajectory. We identified trajectories which significantly altered sepsis mortality, and found three major starting points in a combined temporal sepsis network: alcohol abuse, diabetes and cardiovascular diagnoses. Many cancers also increased sepsis mortality. Using the trajectory-based stratification model, we explain contradictory reports in relation to diabetes that have recently appeared in the literature. Finally, we compared the predictive power of 18.5 years of disease history with that of scoring based on within-admission clinical measurements, emphasizing the value of long-term data in novel patient scores that combine the two types of data.
The promise of precision medicine is to deliver personalized treatment based on the unique physiology of each patient. This concept was fueled by the genomic revolution, but it is now evident that integrating other types of omics data, like proteomics, into the clinical decision-making process will be essential to accomplish precision medicine goals. However, the quantity and diversity of biomedical data, and the spread of clinically relevant knowledge across myriad biomedical databases and publications, make this exceptionally difficult. To address this, we developed the Clinical Knowledge Graph (CKG), an open-source platform currently comprising more than 16 million nodes and 220 million relationships to represent relevant experimental data, public databases and the literature. The CKG also incorporates the latest statistical and machine learning algorithms, drastically accelerating analysis and interpretation of typical proteomics workflows. We use several biomarker studies to illustrate how the CKG may support, enrich and accelerate clinical decision-making.
Imputation techniques provide a means to replace missing measurements with a value and are used in almost all downstream analyses of mass spectrometry (MS)-based proteomics data using label-free quantification (LFQ). Some methods impute only under the assumption that the limit of detection (LOD) was not passed and therefore impute missing values with too low or too high intensities, potentially leading to biased results in downstream statistical analysis. Here we test how self-supervised deep learning models can impute missing values in the context of LFQ at different levels: precursors, aggregated peptides or protein groups. We demonstrate how collaborative filtering, denoising autoencoders, and variational autoencoders can be used to reconstruct missing values and can make more relevant features available for downstream analysis compared to current approaches. Additionally, we show that deep learning approaches can model data in its entirety for imputation and offer an approach for controlled evaluation of imputation approaches. We applied our method, proteomics imputation modeling mass spectrometry (PIMMS), to an alcohol-related liver disease (ALD) cohort with blood plasma proteomics data available for 358 individuals. We identified 49 additional proteins (+52.7%) that are significantly differentially abundant across disease stages compared to traditional methods and found that some of these were predictive of ALD progression in machine learning models. We therefore suggest the use of deep learning approaches for imputing missing values in MS-based proteomics and provide workflows for these.
Prediction of survival for patients in intensive care units (ICUs) has been the subject of intense research. However, no models exist that embrace the multiverse of data in ICUs. It is an open question whether deep learning methods using automated data integration with minimal pre-processing of mixed data domains such as free text, medical history and high-frequency data can provide discrete-time survival estimates for individual ICU patients. We trained a deep learning model on data from patients admitted to ten ICUs in the Capital Region of Denmark and the Region of Southern Denmark between 2011 and 2018. Inspired by natural language processing, we mapped the electronic patient record data to an embedded representation and fed the data to a recurrent neural network with a multi-label output layer representing the chance of survival at different follow-up times. We evaluated the performance using the time-dependent concordance index. In addition, we quantified and visualized the drivers of survival predictions using the SHAP methodology. We included 37,355 admissions of 29,417 patients in our study. Our deep learning models outperformed traditional Cox proportional-hazard models with concordance index in the ranges 0.72–0.73, 0.71–0.72, 0.71, and 0.69–0.70, for models applied at baseline 0, 24, 48, and 72 h, respectively. Deep learning based on a combination of entity embeddings and survival modelling is a feasible approach to obtain individualized survival estimates in data-rich settings such as the ICU. The interpretable nature of the models enables us to understand the impact of the different data domains.
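The abstract above evaluates models with the time-dependent concordance index. As a rough illustration of what a concordance index measures, the sketch below implements the simpler, time-independent Harrell's C for right-censored data; this is a swapped-in simplification, not the authors' metric. The function name `concordance_index` and the toy inputs are assumptions for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C for right-censored data (simplified, time-independent).

    times  : observed follow-up time per patient
    events : 1 if death was observed at that time, 0 if censored
    risks  : predicted risk score (higher means earlier expected death)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            # Pair is comparable when patient i died before patient j's time.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    print(concordance_index([5, 10, 15, 20], [1, 1, 0, 1], [0.8, 0.6, 0.3, 0.4]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives some context for the reported ranges of roughly 0.69 to 0.73.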
Background The CACNA1C protein is an L-type calcium channel, which influences affective disorders. Purpose The purpose of the present study was to examine the possible association between the different genotypes of rs100677 CACNA1C gene and anxiety and other clinical symptoms in patients with unipolar depression. Patients and controls A total of 754 patients and 708 controls from the Danish Psychiatric Biobank participated. Results A significant correlation was found between anxiety and the A allele. It was further found that patients with the A allele were more often treated with electroconvulsive therapy and patients with the AA phenotype had the highest age. Limitations The only information about controls was their sex and that they were recruited from the blood bank. Two types of inclusion criteria were used. The clinical data were not complete for all patients.