Histologically normal tissue adjacent to the tumor (NAT) is commonly used as a control in cancer studies. However, little is known about the transcriptomic profile of NAT, how it is influenced by the tumor, and how the profile compares with non-tumor-bearing tissues. Here, we integrate data from the Genotype-Tissue Expression project and The Cancer Genome Atlas to comprehensively analyze the transcriptomes of healthy, NAT, and tumor tissues in 6506 samples across eight tissues and corresponding tumor types. Our analysis shows that NAT presents a unique intermediate state between healthy and tumor. Differential gene expression and protein–protein interaction analyses reveal altered pathways shared among NATs across tissue types. We characterize a set of 18 genes that are specifically activated in NATs. By applying pathway and tissue composition analyses, we suggest a pan-cancer mechanism in which pro-inflammatory signals from the tumor stimulate an inflammatory response in the adjacent endothelium.
The decreasing cost of genomic technologies has enabled the molecular characterization of large-scale clinical disease samples and of molecular changes upon drug treatment in various disease models. Exploring methods to relate diseases to potentially efficacious drugs through various molecular features is critically important in the discovery of new therapeutics. Here we show that the potency of a drug to reverse cancer-associated gene expression changes positively correlates with that drug’s efficacy in preclinical models of breast, liver and colon cancers. Using a systems-based approach, we predict four compounds showing high potency to reverse gene expression in liver cancer and validate that all four compounds are effective in five liver cancer cell lines. The in vivo efficacy of pyrvinium pamoate is further confirmed in a subcutaneous xenograft model. In conclusion, this systems-based approach may be complementary to the traditional target-based approach in connecting diseases to potentially efficacious drugs.
A central premise in systems pharmacology is that structurally similar compounds have similar cellular responses; however, this principle often does not hold. One of the most widely used measures of cellular response is gene expression. By integrating gene expression data from the Library of Integrated Network-based Cellular Signatures (LINCS) with chemical structure and bioactivity data from PubChem, we performed a large-scale correlation analysis of chemical structures and gene expression profiles of over 11,000 compounds, taking into account confounding factors such as biological conditions (e.g., cell line, dose) and bioactivities. We found that structurally similar compounds do indeed yield similar gene expression profiles. There is a ∼20% chance that two structurally similar compounds (Tanimoto Coefficient ≥ 0.85) share significantly similar gene expression profiles. Regardless of structural similarity, two compounds tend to share similar gene expression profiles in a cell line when they are administered at a higher dose or when the cell line is sensitive to both compounds.
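For readers unfamiliar with the Tanimoto coefficient used as the similarity cutoff above, the following minimal sketch computes it for two binary fingerprints represented as sets of "on" bit indices. The fingerprints here are made up for illustration, and the abstract does not say which fingerprint type the study used; only the 0.85 cutoff comes from the text.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen for this sketch when both are empty
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


SIMILARITY_CUTOFF = 0.85  # threshold quoted in the abstract


def is_structurally_similar(fp_a, fp_b, cutoff=SIMILARITY_CUTOFF):
    """True when the pair meets or exceeds the similarity cutoff."""
    return tanimoto(fp_a, fp_b) >= cutoff


# Hypothetical fingerprints (sets of bit positions that are set).
fp_x = set(range(0, 20))           # 20 bits set
fp_y = set(range(1, 20)) | {25}    # shares 19 bits with fp_x, union of 21
fp_z = set(range(10, 30))          # shares 10 bits with fp_x, union of 30

similar_xy = is_structurally_similar(fp_x, fp_y)  # 19/21 is above the cutoff
similar_xz = is_structurally_similar(fp_x, fp_z)  # 10/30 is below the cutoff
```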
The Gene Expression Omnibus (GEO) contains more than two million digital samples from functional genomics experiments amassed over almost two decades. However, individual sample metadata remains poorly described by unstructured free-text attributes, preventing its large-scale reanalysis. We introduce the Search Tag Analyze Resource for GEO as a web application (http://STARGEO.org) to curate better annotations of sample phenotypes uniformly across different studies, and to use these sample annotations to define robust genomic signatures of disease pathology by meta-analysis. In this paper, we target a small group of biomedical graduate students to show rapid crowd-curation of precise sample annotations across all phenotypes, and we demonstrate the biological validity of these crowd-curated annotations for breast cancer. STARGEO.org makes GEO data findable, accessible, interoperable and reusable (i.e., FAIR) to ultimately facilitate knowledge discovery. Our work demonstrates the utility of crowd-curation and interpretation of open ‘big data’ under FAIR principles as a first step towards realizing an ideal paradigm of precision medicine.
Prediction of new disease indications for approved drugs by computational methods has been based largely on the genomics signatures of drugs and diseases. We propose a method for drug repositioning that uses the clinical signatures extracted from over 13 years of electronic medical records from a tertiary hospital, including >9.4 M laboratory tests from >530,000 patients, in addition to diverse genomics signatures. Cross-validation using over 17,000 known drug–disease associations shows this approach outperforms various predictive models based on genomics signatures and a well-known “guilt-by-association” method. Interestingly, the prediction suggests that terbutaline sulfate, which is widely used for asthma, is a promising candidate for amyotrophic lateral sclerosis, for which there are few therapeutic options. In vivo tests using zebrafish models found that terbutaline sulfate prevents defects in axons and neuromuscular junction degeneration in a dose-dependent manner. Therapeutic potential of terbutaline sulfate was also observed when axonal and neuromuscular junction degeneration had already occurred in the zebrafish model. Cotreatment with a β2-adrenergic receptor antagonist, butoxamine, suggests that the effect of terbutaline is mediated by activation of β2-adrenergic receptors.
Background: The results of clinical laboratory tests are an essential component of medical decision-making. To guide interpretation, test results are returned with reference intervals defined by the range in which the central 95% of values occur in healthy individuals. Clinical laboratories often set their own reference intervals to accommodate variation in local population and instrumentation. For some tests, reference intervals change as a function of sex, age, and self-identified race and ethnicity. Methods: In this work, we develop a novel approach, which leverages electronic health record data, to identify healthy individuals and test for differences in laboratory test values between populations. Results: We found that the distributions of >50% of laboratory tests with currently fixed reference intervals differ among self-identified racial and ethnic groups (SIREs) in healthy individuals. Conclusions: Our results confirm the known SIRE-specific differences in creatinine and suggest that more research needs to be done to determine the clinical implications of using one-size-fits-all reference intervals for other tests with SIRE-specific distributions.
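The definition of a reference interval given above (the range containing the central 95% of values in healthy individuals) can be sketched as the 2.5th and 97.5th percentiles of a healthy cohort's results. This is a minimal illustration using Python's standard library, not the study's pipeline; the function names and the synthetic data are assumptions.

```python
from statistics import quantiles


def central_95_interval(healthy_values):
    """Return (low, high) bounds enclosing the central 95% of the values.

    Splitting the data into 40 equal-probability groups yields cut points
    at 2.5%, 5%, ..., 97.5%; the first and last are the interval bounds.
    """
    cuts = quantiles(healthy_values, n=40, method="inclusive")
    return cuts[0], cuts[-1]


def is_outside(value, interval):
    """True when a result falls below or above the reference interval."""
    low, high = interval
    return value < low or value > high


# Synthetic 'healthy cohort': the integers 0..200 (illustrative only).
interval = central_95_interval(range(201))
flag_high = is_outside(198, interval)
flag_normal = is_outside(100, interval)
```

Computing the interval separately for each population subgroup and comparing the resulting bounds is one simple way to see whether a single fixed interval fits all groups, although a formal comparison would require a statistical test on the distributions.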
Background: Human diseases frequently cause complications, such as obesity-induced diabetes, and share a number of pathological conditions, such as inflammation, through dysfunction of common functional modules, such as protein–protein interactions (PPIs). Methods: Our pipeline, ICod (Interaction analysis for disease Comorbidity), grades similarities between pairs of disease-related PPIs, including comorbid diseases and pathological conditions. ICod displays a disease similarity network consisting of nodes of disease PPIs and edges of similarity value. As a proof of concept, eight complex diseases and pathological conditions, such as type 2 diabetes, obesity, inflammation, and cancers, were examined to discover whether PPIs shared between diseases were associated with comorbidities. Results: Compared against Medicare reports of disease co-occurrences from 31 million patients, the disease similarity network shows that PPIs of pathological conditions, including insulin resistance and inflammation, overlap significantly with PPIs of various comorbid diseases, including diabetes, obesity, and cancers (p < 0.05). Interestingly, connectivity between essential genes was more drastically perturbed by removing a disease-related gene node than by removing a pathological condition-related gene node, such as one related to inflammation. Conclusion: Thus, PPIs of pathological symptoms are underlying functional modules across diseases accompanying comorbidity phenomena, whereas they contribute only marginally to maintaining interactions between essential genes.
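The abstract does not specify how ICod scores similarity or tests overlap. As a generic illustration of what "significant overlap" between two sets of disease-related PPIs can mean, the sketch below computes a Jaccard index and a one-sided hypergeometric p-value. Both choices are assumptions made for this example, and the interaction sets and universe size are hypothetical.

```python
from math import comb


def jaccard(set_a, set_b):
    """Jaccard index: size of the intersection over size of the union."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def overlap_p_value(set_a, set_b, universe_size):
    """P(overlap >= observed) if set_b were drawn at random from the universe.

    One-sided hypergeometric tail. universe_size is the total number of
    candidate interactions and must be at least as large as either set.
    """
    observed = len(set_a & set_b)
    n_a, n_b = len(set_a), len(set_b)
    total = comb(universe_size, n_b)
    tail = sum(
        comb(n_a, i) * comb(universe_size - n_a, n_b - i)
        for i in range(observed, min(n_a, n_b) + 1)
    )
    return tail / total


# Hypothetical interaction identifiers for two conditions.
ppi_condition_1 = {0, 1, 2, 3, 4}
ppi_condition_2 = {3, 4, 5, 6, 7}
UNIVERSE = 20  # assumed number of candidate interactions

similarity = jaccard(ppi_condition_1, ppi_condition_2)
p_value = overlap_p_value(ppi_condition_1, ppi_condition_2, UNIVERSE)
```

With such a small toy universe the overlap is not significant; the point is only to show the shape of the calculation.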
Protein location and function can change dynamically depending on many factors, including environmental stress, disease state, age, developmental stage, and cell type. Here, we describe an integrative computational framework, called the conditional function predictor (CoFP; http://nbm.ajou.ac.kr/cofp/), for predicting changes in subcellular location and function on a proteome-wide scale. The essence of the CoFP approach is to cross-reference general knowledge about a protein and its known network of physical interactions, which typically pool measurements from diverse environments, against gene expression profiles that have been measured under specific conditions of interest. Using CoFP, we predict condition-specific subcellular locations, biological processes, and molecular functions of the yeast proteome under 18 specified conditions. In addition to highly accurate retrieval of previously known gold standard protein locations and functions, CoFP predicts previously unidentified condition-dependent locations and functions for nearly all yeast proteins. Many of these predictions can be confirmed using high-resolution cellular imaging. We show that, under DNA-damaging conditions, Tsr1, Caf120, Dip5, Skg6, Lte1, and Nnf2 change subcellular location and RNA polymerase I subunit A43, Ino2, and Ids2 show changes in DNA binding. Beyond specific predictions, this work reveals a global landscape of changing protein location and function, highlighting a surprising number of proteins that translocate from the mitochondria to the nucleus or from endoplasmic reticulum to Golgi apparatus under stress.
dynamic function prediction | protein translocation | DTT and MMS | systems biology | bioinformatics
A cellular response can induce striking changes in the subcellular location and function of proteins.
As a recent example, the activating transcription factor-2 (ATF2) plays an oncogenic role in the nucleus, whereas genotoxic stress-induced localization within the mitochondria gives ATF2 the ability to act as a tumor suppressor, resulting in promotion of cell death (1). Changes in protein location are typically identified using a variety of experimental methods [e.g., protein tagging (2), immunolabeling (3), or cellular subfractionation of target organelles followed by mass spectrometry (4)]. Although highly successful, such measurements can be laborious and time-consuming, even for a single protein (all methods except mass spectrometry) and condition (all methods). For these reasons and others, computational prediction of protein location and function has been a very active area of bioinformatic research. Early methods attempted to infer protein function based mainly on individual protein features, such as sequence similarity or structural homology (3, 5–17). These methods range from simple sequence-sequence comparisons to profile- or pattern-based supervised learning methods. Other methods predicted protein function using gene expression data (18, 19) based on the observation that proteins wit...