2016
DOI: 10.1101/gr.199778.115

A synergistic DNA logic predicts genome-wide chromatin accessibility

Abstract: Enhancers and promoters commonly occur in accessible chromatin characterized by depleted nucleosome contact; however, it is unclear how chromatin accessibility is governed. We show that log-additive cis-acting DNA sequence features can predict chromatin accessibility at high spatial resolution. We develop a new type of high-dimensional machine learning model, the Synergistic Chromatin Model (SCM), which when trained with DNase-seq data for a cell type is capable of predicting expected read counts of genome-wid…

citations
Cited by 19 publications
(32 citation statements)
references
References 37 publications
“…An alternative is to perform de novo motif discovery on accessible regions using machine learning frameworks that identify sequence features that are predictive of accessibility or ChIP-seq data. In particular, SeqGL [70] and gkm-SVM [71, 72] use a binary classification framework to discriminate peak from non-peak or flanking regions using k-mer features, while the Synergistic Chromatin Model (SCM) [73] performs L1-regularized Poisson regression to predict quantitative accessibility signal. These approaches can identify sequence specific motifs based on the selected k-mers.…”
Section: Identification Of Regulatory Sequence Elements and Their Gen… (mentioning)
confidence: 99%
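
The statement above describes L1-regularized Poisson regression on k-mer features. As a rough illustration of that general technique only (not the SCM implementation, whose details are not given here), the following sketch fits such a model with proximal gradient descent using NumPy. The function names `kmer_counts`, `soft_threshold`, and `fit_l1_poisson`, the toy sequences and read counts, and all hyperparameters are assumptions made for this example.

```python
import itertools
import numpy as np


def kmer_counts(seq, k=2):
    """Count overlapping DNA k-mers in seq; windows with non-ACGT symbols are skipped."""
    index = {"".join(p): i for i, p in enumerate(itertools.product("ACGT", repeat=k))}
    counts = np.zeros(len(index))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:
            counts[j] += 1
    return counts


def soft_threshold(w, t):
    """Proximal operator of t * ||w||_1: shrink each weight toward zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)


def fit_l1_poisson(X, y, lam=0.1, lr=0.005, n_iter=2000):
    """Minimize mean Poisson negative log-likelihood + lam * ||w||_1.

    The model is log-linear: expected count = exp(X @ w + b).
    The intercept b is not penalized.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = np.log(y.mean() + 1e-8)  # start at the intercept-only fit
    for _ in range(n_iter):
        eta = np.clip(X @ w + b, -20, 20)  # clip to avoid overflow in exp
        mu = np.exp(eta)
        grad_w = X.T @ (mu - y) / n
        grad_b = np.mean(mu - y)
        # Gradient step on the smooth part, then the L1 proximal step.
        w = soft_threshold(w - lr * grad_w, lr * lam)
        b -= lr * grad_b
    return w, b


if __name__ == "__main__":
    # Toy data (hypothetical): four short sequences with made-up read counts.
    seqs = ["ACGTACGTAC", "GGGGCCCCGG", "ATATATATAT", "CGCGCGCGCG"]
    y = np.array([12.0, 3.0, 1.0, 8.0])
    X = np.stack([kmer_counts(s, k=2) for s in seqs])
    w, b = fit_l1_poisson(X, y, lam=0.05)
    print("nonzero k-mer weights:", np.count_nonzero(w), "of", w.size)
```

The L1 penalty drives most k-mer weights to exactly zero, which is why, as the statement notes, the surviving k-mers can be inspected as candidate sequence motifs.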
“…In particular, we aimed to identify DNA sequences that could predict cell-type-specific effects of regulatory variants. We investigated the use of machine learning models to predict the chromatin activity of regulatory elements across our three cell types using DNA sequence only (Zhou and Troyanskaya 2015; Hashimoto et al 2016; Kelley et al 2016; Zeng et al 2016). We developed a four-layered neural network architecture, OrbWeaver, to predict cell-type-specific chromatin accessibility of 500-bp windows centered at a regulatory locus (Fig.…”
Section: Sequence-based Model For Chromatin Activity Explains the Reg… (mentioning)
confidence: 99%
“…Briefly, a library of synthetic oligos containing a 99-bp variable region (phrase) flanked by short constant sequences used as primers is integrated into specific genomic locations in mouse embryonic stem cells (mESCs) by CRISPR/Cas9 based homology-directed repair. Previous work has shown that 20-40% of alleles will have site-specific integration of one of the variable phrases [Hashimoto et al, 2016]. Binding of Tcf7l2 to each integrated phrase is measured by DamID [Vogel et al, 2007] using doxycycline-inducible ectopic expression of a genomically integrated fusion protein of Tcf7l2 and the N126A mutant of Dam, which we have recently shown to allow accurate measurement of Tcf7l2 binding genome-wide with reduced off-target methylation as compared to the wild-type Dam enzyme [Szczesnik et al, 2019].…”
Section: Results (mentioning)
confidence: 99%