ObjectiveTo conduct a systematic review of deep learning models for electronic health record (EHR) data, and illustrate various deep learning architectures for analyzing different data sources and their target applications. We also highlight ongoing research and identify open challenges in building deep learning models of EHRs.Design/methodWe searched PubMed and Google Scholar for papers on deep learning studies using EHR data published between January 1, 2010, and January 31, 2018. We summarize them according to these axes: types of analytics tasks, types of deep learning model architectures, special challenges arising from health data and tasks and their potential solutions, as well as evaluation strategies.ResultsWe surveyed and analyzed multiple aspects of the 98 articles we found and identified the following analytics tasks: disease detection/classification, sequential prediction of clinical events, concept embedding, data augmentation, and EHR data privacy. We then studied how deep architectures were applied to these tasks. We also discussed some special challenges arising from modeling EHR data and reviewed a few popular approaches. Finally, we summarized how performance evaluations were conducted for each task.DiscussionDespite the early success in using deep learning for health analytics applications, there still exist a number of issues to be addressed. We discuss them in detail including data and label availability, the interpretability and transparency of the model, and ease of deployment.
The graph convolutional networks (GCN) recently proposed by Kipf and Welling are an effective graph model for semi-supervised learning. This model, however, was originally designed to be learned with the presence of both training and test data. Moreover, the recursive neighborhood expansion across layers poses time and memory challenges for training with large, dense graphs. To relax the requirement of simultaneous availability of test data, we interpret graph convolutions as integral transforms of embedding functions under probability measures. Such an interpretation allows for the use of Monte Carlo approaches to consistently estimate the integrals, which in turn leads to a batched training scheme as we propose in this work-FastGCN. Enhanced with importance sampling, FastGCN not only is efficient for training but also generalizes well for inference. We show a comprehensive set of experiments to demonstrate its effectiveness compared with GCN and related models. In particular, training is orders of magnitude more efficient while predictions remain comparably accurate.
Summary Accurate prediction of drug–target interactions (DTI) is crucial for drug discovery. Recently, deep learning (DL) models for show promising performance for DTI prediction. However, these models can be difficult to use for both computer scientists entering the biomedical field and bioinformaticians with limited DL experience. We present DeepPurpose, a comprehensive and easy-to-use DL library for DTI prediction. DeepPurpose supports training of customized DTI prediction models by implementing 15 compound and protein encoders and over 50 neural architectures, along with providing many other useful features. We demonstrate state-of-the-art performance of DeepPurpose on several benchmark datasets. Availability and implementation https://github.com/kexinhuang12345/DeepPurpose. Contact jimeng@illinois.edu Supplementary information Supplementary data are available at Bioinformatics online.
Motivation Drug target interaction (DTI) prediction is a foundational task for in-silico drug discovery, which is costly and time-consuming due to the need of experimental search over large drug compound space. Recent years have witnessed promising progress for deep learning in DTI predictions. However, the following challenges are still open: (1) existing molecular representation learning approaches ignore the sub-structural nature of DTI, thus produce results that are less accurate and difficult to explain; (2) existing methods focus on limited labeled data while ignoring the value of massive unlabelled molecular data. Results We propose a Molecular Interaction Transformer (MolTrans) to address these limitations via: (1) knowledge inspired sub-structural pattern mining algorithm and interaction modeling module for more accurate and interpretable DTI prediction; (2) an augmented transformer encoder to better extract and capture the semantic relations among substructures extracted from massive unlabeled biomedical data. We evaluate MolTrans on real world data and show it improved DTI prediction performance compared to state-of-the-art baselines. Availability The model scripts is available at https://github.com/kexinhuang12345/moltrans. Contact jimeng@illinois.edu Supplementary information Supplementary data are available at Bioinformatics online.
Parkinson’s disease (PD) is associated with diverse clinical manifestations including motor and non-motor signs and symptoms, and emerging biomarkers. We aimed to reveal the heterogeneity of PD to define subtypes and their progression rates using an automated deep learning algorithm on the top of longitudinal clinical records. This study utilizes the data collected from the Parkinson’s Progression Markers Initiative (PPMI), which is a longitudinal cohort study of patients with newly diagnosed Parkinson’s disease. Clinical information including motor and non-motor assessments, biospecimen examinations, and neuroimaging results were used for identification of PD subtypes. A deep learning algorithm, Long-Short Term Memory (LSTM), was used to represent each patient as a multi-dimensional time series for subtype identification. Both visualization and statistical analysis were performed for analyzing the obtained PD subtypes. As a result, 466 patients with idiopathic PD were investigated and three subtypes were identified. Subtype I (Mild Baseline, Moderate Motor Progression) is comprised of 43.1% of the participants, with average age 58.79 ± 9.53 years, and was characterized by moderate functional decay in motor ability but stable cognitive ability. Subtype II (Moderate Baseline, Mild Progression) is comprised of 22.9% of the participants, with average age 61.93 ± 6.56 years, and was characterized by mild functional decay in both motor and non-motor symptoms. Subtype III (Severe Baseline, Rapid Progression) is comprised 33.9% of the patients, with average age 65.32 ± 8.86 years, and was characterized by rapid progression of both motor and non-motor symptoms. These subtypes suggest that when comprehensive clinical and biomarker data are incorporated into a deep learning algorithm, the disease progression rates do not necessarily associate with baseline severities, and the progression rate of non-motor symptoms is not necessarily correlated with the progression rate of motor symptoms.
Recent progress in deep learning is revolutionizing the healthcare domain including providing solutions to medication recommendations, especially recommending medication combination for patients with complex health conditions. Existing approaches either do not customize based on patient health history, or ignore existing knowledge on drug-drug interactions (DDI) that might lead to adverse outcomes. To fill this gap, we propose the Graph Augmented Memory Networks (GAMENet), which integrates the drug-drug interactions knowledge graph by a memory module implemented as a graph convolutional networks, and models longitudinal patient records as the query. It is trained end-to-end to provide safe and personalized recommendation of medication combination. We demonstrate the effectiveness and safety of GAMENet by comparing with several state-of-the-art methods on real EHR data. GAMENet outperformed all baselines in all effectiveness measures, and also achieved 3.60% DDI rate reduction from existing EHR data.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.