2013
DOI: 10.1038/nmeth.2762

Refined DNase-seq protocol and data analysis reveals intrinsic bias in transcription factor footprint identification

Abstract: DNase-seq is a powerful technique for identifying cis-regulatory elements across the genome. We studied the key experimental parameters to optimize the performance of DNase-seq. We found that sequencing short 50-100 bp fragments that accumulate in long inter-nucleosome linker regions is more efficient for identifying transcription factor binding sites than using longer fragments. We also assessed the potential of DNase-seq to predict transcription factor occupancy through the generation of nucleotide-resolution…

Citations: cited by 200 publications (258 citation statements)
References: 22 publications
“…Despite some variation at any individual loci, the SCM model captures the overall structure of DNase I accessibility over the held-out chromosome. DNase-seq is known to have an underlying sequence preference, resulting in the possibility that a SCM model would learn the inherent sequence bias of the DNase I enzyme rather than the relationship between DNA sequence and accessibility (He et al 2013;Lazarovici et al 2013). In order to account for this confounder, we validate our model on DNase I hypersensitive sites (Fig.…”
Section: Results | mentioning
confidence: 99%
“…Several genomic techniques, including ChIP-seq (Barski et al 2007;Johnson et al 2007;Mikkelsen et al 2007), DNase-seq (Crawford et al 2006;Hesselberth et al 2009;Boyle et al 2011;He et al 2014), and ATAC-seq (Buenrostro et al 2013), have been developed to experimentally identify cis-regulatory regions genome-wide. Attempts to use these data to understand gene expression have, however, been impeded by the following factors: Data for only a small subset of transcription factors (TFs) participating in any system can be generated in practice (Gerstein et al 2012); not all TF binding sites necessarily play roles in gene regulation; mapping between enhancers and genes is still an open question; the regulatory environment that controls a gene may depend on a complex interaction of many factors at the promoter and enhancers that may act cooperatively or antagonistically (Montavon et al 2011;Spitz and Furlong 2012); and technical biases in chromatin profiling data may obscure biologically relevant signal.…”
Section: [Supplemental Materials Is Available For This Article] | mentioning
confidence: 99%
“…First, the transcription factor binding sites discovered in most ChIP-seq experiments tend to fall within a set of genomic regions that are DNase I-hypersensitive (Hesselberth et al 2009;Neph et al 2012b;Thurman et al 2012;He et al 2014). The union of DNase-seq (UDHS) peaks across a broad array of human cell types can therefore be used to define a superset of transcription factor binding loci in most cell types.…”
Section: [Supplemental Materials Is Available For This Article] | mentioning
confidence: 99%
“…TFs are known to have important roles in several diseases, e.g. a third of known human developmental disorders are related to deregulated TFs [64]. Several general approaches have been proposed to identify TFs acting as key players in gene regulation depending on the available data: Coexpression analysis combined with computational predictions of TF sequence binding can be used to identify key TFs [18].…”
mentioning
confidence: 99%