DNA accessibility, chromatin regulation, and genome methylation are key drivers of cancer transcription. However, much remains to be understood about the functional implications of sequence-level data for the regulation of gene expression, especially in the noncoding genome. Recently, Kelley, Snoek, and Rinn (Genome Res. 2016) trained neural networks to effectively predict DNA accessibility in multiple cell types. These models make it possible to explore the impact of mutations on predicted accessibility and thus link one aspect of the gene regulation puzzle directly to the sequence level. We present a model with improved performance on the original dataset of 164 ENCODE and Roadmap Epigenomics Consortium sample types, and then extend the method to provide predictions for any sample with RNA-Seq data, without the need for DNase-seq on that sample. We first demonstrate that several model and algorithmic changes improve performance across the 164 cell types from a mean AUC of 0.895 to a mean AUC of 0.910. Unfortunately, current accessibility models require DNase-seq for each new cell type. Models for detecting transcription factor binding sites, which rely on ChIP-seq for training data, share this limitation. To generalize sequence-based predictive models to unseen cell types without retraining, we investigate using RNA-Seq as a proxy signature of cell type. The model aims to capture the interdependence between the gene expression levels that characterize a cell and the regulatory logic by which sequence-level signatures combine to determine accessibility, without restriction to cell type. We explore the model's performance on held-out cell types in the ENCODE and Roadmap Epigenomics Consortium data as well as on data from the TCGA Pan-Cancer initiative.
We examine the impact of noncoding changes in whole-genome sequencing data from TCGA samples, and report on predicted differences in DNA accessibility across cancer subtypes.

Citation Format: Kamil Wnuk, Jeremi Sudol, Shahrooz Rabizadeh, Patrick Soon-Shiong, Christopher Szeto, Charles Vaske. Predicting DNA accessibility in the pan-cancer tumor genome using RNA-Seq, WGS, and deep learning [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2017; 2017 Apr 1-5; Washington, DC. Philadelphia (PA): AACR; Cancer Res 2017;77(13 Suppl):Abstract nr 393. doi:10.1158/1538-7445.AM2017-393
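The abstract above reports performance as a mean AUC across cell types. As a minimal sketch of how such a summary could be computed, the following Python computes a per-cell-type ROC AUC from binary accessibility labels and predicted scores, then averages over cell types. The function names (`roc_auc`, `mean_auc`), the dictionary layout, and the toy data are assumptions made for illustration; the abstract does not describe the authors' evaluation code.

```python
from statistics import mean


def roc_auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) statistic.

    labels: 0/1 values (1 = accessible site); scores: predicted scores.
    Tied scores receive their average rank.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def mean_auc(labels_by_type, scores_by_type):
    """Average the per-cell-type AUCs (hypothetical dict-of-lists layout)."""
    return mean(
        roc_auc(labels_by_type[t], scores_by_type[t]) for t in labels_by_type
    )


# Toy data for two made-up cell types; not taken from the study.
toy_labels = {"type_a": [0, 0, 1, 1], "type_b": [0, 1]}
toy_scores = {"type_a": [0.1, 0.4, 0.35, 0.8], "type_b": [0.2, 0.9]}
toy_mean = mean_auc(toy_labels, toy_scores)
```

Averaging per-cell-type AUCs weights every cell type equally regardless of how many sites it contributes, which is one plausible reading of "mean AUC" but is not confirmed by the abstract.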
We present an extensible platform that integrates state-of-the-art computer vision techniques with mobile communications to deliver a portable visual assistance tool. Live video from a smartphone is streamed over a 3G or wireless connection while an object recognition engine on a desktop processes the data stream. Recognition results are returned to the mobile device in real time and announced by a text-to-speech engine. The system design is complete and includes the ability to add new items, share databases, and provide live remote assistance from a sighted human.
Summary: DNA accessibility is a key dynamic feature of chromatin regulation that can potentiate transcriptional events and tumor progression. To gain insight into chromatin state across existing tumor data, we improved neural network models for predicting accessibility from DNA sequence and extended them to incorporate a global set of RNA sequencing gene expression inputs. Our expression-informed model expanded the application domain beyond specific tissue types to tissues not present in training and achieved consistently high accuracy in predicting DNA accessibility at promoter and promoter flank regions. We then leveraged our new tool by analyzing the DNA accessibility landscape of promoters across The Cancer Genome Atlas. We show that in lung adenocarcinoma the accessibility perspective uniquely highlights immune pathways inversely correlated with a more open chromatin state and that accessibility patterns learned from even a single tumor type can discriminate immune inflammation across many cancers, often with direct relation to patient prognosis.