Despite tremendous advances in targeted therapies against lung adenocarcinoma, the majority of patients do not benefit from personalized treatments. A deeper understanding of potential therapeutic targets is crucial to increase the survival of patients. One promising target, ADAR, is amplified in 13% of lung adenocarcinomas, and in vitro studies have demonstrated that its inhibition has the potential to suppress tumor growth. ADAR edits millions of adenosines to inosines within the transcriptome, and while previous studies of ADAR in cancer have focused solely on protein-coding edits, >99% of edits occur in non-protein-coding regions. Here, we develop a pipeline to discover the regulatory potential of RNA editing sites across the entire transcriptome and apply it to lung adenocarcinoma tumors from The Cancer Genome Atlas. This method predicts that 1413 genes contain regulatory edits, predominantly in non-coding regions. Genes with the largest numbers of regulatory edits are enriched in both apoptotic and innate immune pathways, providing a link between these known functions of ADAR and its role in cancer. We further show that despite a positive association between ADAR RNA expression and apoptotic and immune pathways, ADAR copy number is negatively associated with apoptosis and with the signatures of several immune cell types.
Motivation: Technologies that generate high-throughput omics data are flourishing, creating enormous, publicly available repositories of multi-omics data. As many data repositories continue to grow, there is an urgent need for computational methods that can leverage these data to create comprehensive clusters of patients with a given disease. Results: Our proposed approach creates a patient-to-patient similarity graph for each omics data type as an intermediate representation and merges the graphs through subspace analysis on a Grassmann manifold. We hypothesize that this approach generates more informative clusters by preserving the complementary information from each level of omics data. We apply our approach to The Cancer Genome Atlas (TCGA) breast cancer dataset and show that by integrating gene expression, microRNA and DNA methylation data, our proposed method can produce clinically useful subtypes of breast cancer. We then investigate the molecular characteristics underlying these subtypes. We discover a highly expressed cluster of genes on chromosome 19p13 that strongly correlates with survival in TCGA breast cancer patients and validate these results in three additional breast cancer datasets. We also compare our approach with previous integrative clustering approaches and obtain comparable or superior results.
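The graph-merging idea described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption: the Gaussian affinity, the median bandwidth heuristic, the merged Laplacian of the form sum(L_i) - alpha * sum(U_i U_i^T) (one common way to combine graph layers via their spectral subspaces), the weight `alpha`, and the final k-means step. All function names are hypothetical and this is not the authors' implementation.

```python
import numpy as np

def rbf_affinity(X):
    """Patient-to-patient Gaussian affinity for one omics matrix (rows = patients)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    sigma2 = np.median(d2[d2 > 0])  # bandwidth heuristic (assumption)
    W = np.exp(-d2 / (2.0 * sigma2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

def normalized_laplacian(W):
    """Symmetric normalized Laplacian I - D^{-1/2} W D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    return np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

def bottom_eigvecs(L, k):
    """Eigenvectors of the k smallest eigenvalues (eigh sorts ascending)."""
    _, vecs = np.linalg.eigh(L)
    return vecs[:, :k]

def merge_on_grassmann(views, k, alpha=0.5):
    """Combine per-view Laplacians, rewarding agreement with each view's subspace."""
    laplacians = [normalized_laplacian(rbf_affinity(X)) for X in views]
    L_mod = sum(laplacians)
    for L in laplacians:
        U = bottom_eigvecs(L, k)  # a point on the Grassmann manifold
        L_mod = L_mod - alpha * (U @ U.T)
    return bottom_eigvecs(L_mod, k)

def kmeans(U, k, n_iter=100):
    """Plain Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(d, axis=1)
        new = np.array([U[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def cluster_patients(views, k, alpha=0.5):
    """Cluster patients from several omics matrices that share the same row order."""
    U = merge_on_grassmann(views, k, alpha)
    U = U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    return kmeans(U, k)
```

Here `views` would be a list of matrices (for example expression, microRNA and methylation) with one row per patient in the same order; `cluster_patients(views, k=5)` returns one integer label per patient. The choice of `k` and `alpha` is left open in this sketch.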
Scatterplot matrices, or SPLOMs, provide a feasible method of visualizing and representing multi-dimensional data, especially for a small number of dimensions. For very high dimensional data, we introduce a novel technique to summarize a SPLOM as a clustered matrix of glyphs, or a Glyph SPLOM. Each glyph visually encodes a general measure of dependency strength, distance correlation, and a logical dependency class based on the occupancy of the scatterplot quadrants. We present the Glyph SPLOM as a general alternative to the traditional correlation-based heatmap and the scatterplot matrix in two examples: demography data from the World Health Organization (WHO), and gene expression data from developmental biology. By using both dependency class and strength, the Glyph SPLOM illustrates high dimensional data in more detail than a heatmap but with more summarization than a SPLOM. More importantly, the summarization capabilities of Glyph SPLOM allow for the assertion of “necessity” causal relationships in the data and the reconstruction of interaction networks in various dynamic systems.
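The two quantities each glyph encodes can be sketched for a single pair of variables. The distance correlation below follows the standard sample definition based on double-centered distance matrices. The quadrant occupancy is only an illustration: splitting at the medians and the `min_fraction` cutoff are assumptions, the function names are hypothetical, and the mapping from occupancy patterns to the paper's dependency classes is not reproduced here.

```python
import numpy as np

def _double_centered_distances(v):
    """Pairwise |v_i - v_j| with row, column and grand means removed."""
    d = np.abs(v[:, None] - v[None, :])
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_correlation(x, y):
    """Sample distance correlation between two 1-D variables (0 to 1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    A = _double_centered_distances(x)
    B = _double_centered_distances(y)
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0.0:
        return 0.0  # at least one variable is constant
    return float(np.sqrt(max(dcov2, 0.0) / denom))

def quadrant_occupancy(x, y, min_fraction=0.05):
    """Which scatterplot quadrants hold at least min_fraction of the points.

    Keys are (x_is_high, y_is_high); thresholds are the medians (assumption).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_high = x > np.median(x)
    y_high = y > np.median(y)
    occupied = {}
    for xh in (False, True):
        for yh in (False, True):
            fraction = np.mean((x_high == xh) & (y_high == yh))
            occupied[(xh, yh)] = bool(fraction >= min_fraction)
    return occupied
```

A sparse quadrant is what makes a logical reading possible: if the (high x, low y) quadrant is empty, for instance, high x is only ever observed together with high y in that sample.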
Our results suggest that mRNA and protein data possess independent biological and clinical importance, which can be leveraged to create higher-powered expression biomarkers.