While gene expression profiling is commonly used to gain an overview of cellular processes, the identification of upstream processes that drive expression changes remains a challenge. To address this issue, we introduce CARNIVAL, a causal network contextualization tool which derives network architectures from gene expression footprints. CARNIVAL (CAusal Reasoning pipeline for Network identification using Integer VALue programming) integrates different sources of prior knowledge including signed and directed protein–protein interactions, transcription factor targets, and pathway signatures. The use of prior knowledge in CARNIVAL enables capturing a broad set of upstream cellular processes and regulators, leading to a higher accuracy when benchmarked against related tools. Implementation as an integer linear programming (ILP) problem guarantees efficient computation. As a case study, we applied CARNIVAL to contextualize signaling networks from gene expression data in IgA nephropathy (IgAN), a condition that can lead to chronic kidney disease. CARNIVAL identified specific signaling pathways and associated mediators dysregulated in IgAN including Wnt and TGF-β, which we subsequently validated experimentally. These results demonstrated how CARNIVAL generates hypotheses on potential upstream alterations that propagate through signaling networks, providing insights into diseases.
Background Drug-induced liver injury (DILI) is a major safety concern characterized by a complex and diverse pathogenesis. In order to identify DILI early in drug development, a better understanding of the injury and models with better predictivity are urgently needed. One approach in this regard are in silico models which aim at predicting the risk of DILI based on the compound structure. However, these models do not yet show sufficient predictive performance or interpretability to be useful for decision making by themselves, the former partially stemming from the underlying problem of labeling the in vivo DILI risk of compounds in a meaningful way for generating machine learning models. Results As part of the Critical Assessment of Massive Data Analysis (CAMDA) “CMap Drug Safety Challenge” 2019 (http://camda2019.bioinf.jku.at), chemical structure-based models were generated using the binarized DILIrank annotations. Support Vector Machine (SVM) and Random Forest (RF) classifiers showed comparable performance to previously published models with a mean balanced accuracy over models generated using 5-fold LOCO-CV inside a 10-fold training scheme of 0.759 ± 0.027 when predicting an external test set. In the models which used predicted protein targets as compound descriptors, we identified the most information-rich proteins which agreed with the mechanisms of action and toxicity of nonsteroidal anti-inflammatory drugs (NSAIDs), one of the most important drug classes causing DILI, stress response via TP53 and biotransformation. In addition, we identified multiple proteins involved in xenobiotic metabolism which could be novel DILI-related off-targets, such as CLK1 and DYRK2. Moreover, we derived potential structural alerts for DILI with high precision, including furan and hydrazine derivatives; however, all derived alerts were present in approved drugs and were over specific indicating the need to consider quantitative variables such as dose. Conclusion Using chemical structure-based descriptors such as structural fingerprints and predicted protein targets, DILI prediction models were built with a predictive performance comparable to previous literature. In addition, we derived insights on proteins and pathways statistically (and potentially causally) linked to DILI from these models and inferred new structural alerts related to this adverse endpoint.
The global outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) necessitates the rapid development of new therapies against coronavirus disease 2019 (COVID-19) infection. Here, we present the identification of 200 approved drugs, appropriate for repurposing against COVID-19. We constructed a SARS-CoV-2–induced protein network, based on disease signatures defined by COVID-19 multiomics datasets, and cross-examined these pathways against approved drugs. This analysis identified 200 drugs predicted to target SARS-CoV-2–induced pathways, 40 of which are already in COVID-19 clinical trials, testifying to the validity of the approach. Using artificial neural network analysis, we classified these 200 drugs into nine distinct pathways, within two overarching mechanisms of action (MoAs): viral replication (126) and immune response (74). Two drugs (proguanil and sulfasalazine) implicated in viral replication were shown to inhibit replication in cell assays. This unbiased and validated analysis opens new avenues for the rapid repurposing of approved drugs into clinical trials.
While gene expression profiling is commonly used to gain an overview of cellular processes, the identification of upstream processes that drive expression changes remains a challenge. To address this issue, we introduce CARNIVAL, a causal network contextualization tool which derives network architectures from gene expression footprints. CARNIVAL (CAusal Reasoning pipeline for Network identification using Integer VALue programming) integrates different sources of prior knowledge including signed and directed protein-protein interactions, transcription factor targets, and pathway signatures. The use of prior knowledge in CARNIVAL enables capturing a broad set of upstream cellular processes and regulators, leading to a higher accuracy when benchmarked against related tools. Implementation as an integer linear programming (ILP) problem guarantees efficient computation. As a case study, we applied CARNIVAL to contextualize signaling networks from gene expression data in IgA nephropathy, a condition that can lead to chronic kidney disease. CARNIVAL identified specific signaling pathways and associated mediators dysregulated in IgAN including WNT and TGF-β, that we subsequently validated experimentally. These results demonstrated how CARNIVAL generates hypotheses on potential upstream alterations that propagate through signaling networks, providing insights into diseases.
The global outbreak of SARS-CoV-2 necessitates the rapid development of new therapies against COVID-19 infection. Here, we present the identification of 200 approved drugs, appropriate for repurposing against COVID-19. We constructed a SARS-CoV-2-induced protein (SIP) network, based on disease signatures defined by COVID-19 multi-omic datasets(Bojkova et al., 2020; Gordon et al., 2020), and cross-examined these pathways against approved drugs. This analysis identified 200 drugs predicted to target SARS-CoV-2-induced pathways, 40 of which are already in COVID-19 clinical trials(Clinicaltrials.gov, 2020) testifying to the validity of the approach. Using artificial neural network analysis we classified these 200 drugs into 9 distinct pathways, within two overarching mechanisms of action (MoAs): viral replication (130) and immune response (70). A subset of drugs implicated in viral replication were tested in cellular assays and two (proguanil and sulfasalazine) were shown to inhibit replication. This unbiased and validated analysis opens new avenues for the rapid repurposing of approved drugs into clinical trials.
Background: Drug-induced liver injury (DILI) is a major safety concern characterized by a complex and diverse pathogenesis. In order to identify DILI early in drug development, a better understanding of the injury and models with better predictivity are urgently needed. One approach in this regard are in silico models which aim at predicting the risk of DILI based on the compound structure. However, these models do yet show sufficient predictive performance or interpretability to be useful for decision making by themselves, the former partially stemming from the underlying problem of labeling the in vivo DILI risk of compounds in a meaningful way for generating machine learning models.Results: As part of the Critical Assessment of Massive Data Analysis (CAMDA) “CMap Drug Safety Challenge” 2019 (http://papers.camda.info/), chemical structure-based models were generated using the binarized DILIrank annotations. Support Vector Machine (SVM) and Random Forest (RF) classifiers showed comparable performance to previously published models with a mean balanced accuracy over models generated using 5-fold cross-validation inside a 10-fold training scheme of 0.759±0.027 when predicting an external test set. In the models which used predicted protein targets as compound descriptors, we identified the most information-rich proteins which agreed with the mechanisms of action and toxicity of nonsteroidal anti-inflammatory drugs (NSAIDs), one of the most important drug classes causing DILI, and the previously established stress response pathways mediated by mitochondria, p38 MAPK and TP53. In addition, we identified multiple proteins involved in xenobiotic metabolism which could be novel DILI-related off-targets, such as CLK1 and DDR2. Moreover, we derived potential structural alerts for DILI with high precision, including furan and hydrazine derivatives; however, all derived alerts were present in approved drugs and were over specific indicating the need to consider quantitative variables such as dose.Conclusion: Using chemical structure-based descriptors such as structural fingerprints and predicted protein targets, DILI prediction models were built with a predictive performance comparable to previous literature. In addition, we derived insights on proteins and pathways statistically (and potentially causally) linked to DILI from these models and inferred new structural alerts related to this adverse endpoint.
Drug-induced liver injury (DILI) is a class of adverse drug reactions (ADR) that causes problems in both clinical and research settings. It is the most frequent cause of acute liver failure in the majority of Western countries and is a major cause of attrition of novel drug candidates. Manual trawling of the literature is the main route of deriving information on DILI from research studies. This makes it an inefficient process prone to human error. Therefore, an automatized AI model capable of retrieving DILI-related articles from the huge ocean of literature could be invaluable for the drug discovery community. In this study, we built an artificial intelligence (AI) model combining the power of natural language processing (NLP) and machine learning (ML) to address this problem. This model uses NLP to filter out meaningless text (e.g., stop words) and uses customized functions to extract relevant keywords such as singleton, pair, and triplet. These keywords are processed by an apriori pattern mining algorithm to extract relevant patterns which are used to estimate initial weightings for a ML classifier. Along with pattern importance and frequency, an FDA-approved drug list mentioning DILI adds extra confidence in classification. The combined power of these methods builds a DILI classifier (DILIC), with 94.91% cross-validation and 94.14% external validation accuracy. To make DILIC as accessible as possible, including to researchers without coding experience, an R Shiny app capable of classifying single or multiple entries for DILI is developed to enhance ease of user experience and made available at https://researchmind.co.uk/diliclassifier/. Additionally, a GitHub link (https://github.com/sanjaysinghrathi/DILI-Classifier) for app source code and ISMB extended video talk (https://www.youtube.com/watch?v=j305yIVi_f8) are available as supplementary materials.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
334 Leonard St
Brooklyn, NY 11211
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.