2020
DOI: 10.1101/2020.01.31.927798
Preprint

Global reference mapping and dynamics of human transcription factor footprints

Abstract: Combinatorial binding of transcription factors to regulatory DNA underpins gene regulation in all organisms. Genetic variation in regulatory regions has been connected with diseases and diverse phenotypic traits 1, yet it remains challenging to distinguish variants that impact regulatory function 2. Genomic DNase I footprinting enables quantitative, nucleotide-resolution delineation of sites of transcription factor occupancy within native chromatin 3-5. However, to date only a small fraction of such sites h…

Citations: cited by 9 publications (16 citation statements)
References: 44 publications
“…For the Nanog/Oct4/Sox2 TF binding and the K562 chromatin accessibility datasets, we use additional independent binding data to further assess the ability of a model to place importance in biologically relevant regions. For Nanog/Oct4/Sox2 binding, we consider high-resolution binding peaks called from independently collected ChIP-nexus experiments [4], and for K562 chromatin accessibility, we consider a set of high-resolution binding footprints explicitly derived from several DNase-seq profiles using independent signal processing methods [12]. These ChIP-nexus peaks and footprints define high-confidence, high-resolution regions where bound regulatory motifs are expected to be found.…”
Section: Fourier-based Priors Improve Stability Of Attribution Scores
Citation type: mentioning
confidence: 99%
“…For the K562 models and Nanog/Oct4/Sox2 models, we perform further analyses using orthogonal footprints or ChIP-nexus data, respectively. The K562 footprints are computed by Vierstra et al [12]. From their published footprints, we aggregated the footprints of all K562 experiments using BEDtools merge [19].…”
Section: Reliance On Biologically Relevant Regions
Citation type: mentioning
confidence: 99%
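The aggregation step described in the statement above (combining footprints from several experiments into one set) can be sketched in plain Python. This is only an illustration of interval merging in the spirit of BEDtools merge with default settings, not the citing authors' pipeline; the function name `merge_intervals` and the example coordinates are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or book-ended intervals on the same chromosome.

    `intervals` is an iterable of (chrom, start, end) tuples using
    BED-style half-open coordinates. Hypothetical helper for illustration.
    """
    merged = []
    # Sorting by chromosome, then start, makes neighbours adjacent.
    for chrom, start, end in sorted(intervals):
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start <= last[2]:
            # Overlaps or touches the previous interval: extend it.
            last[2] = max(last[2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(m) for m in merged]


# Made-up footprints pooled from several experiments.
footprints = [
    ("chr1", 100, 120),
    ("chr1", 110, 130),
    ("chr1", 130, 140),
    ("chr2", 5, 10),
    ("chr1", 200, 210),
]
print(merge_intervals(footprints))
```

Sorting first keeps the merge to a single pass, which is also why BEDtools merge expects position-sorted input.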
“…We then asked whether aSNPs preferentially disrupted TFBS associated with a specific TF relative to naSNPs. To avoid redundancy due to the similarities between predicted motifs across closely related TFs, we took advantage of the work from Vierstra et al [31], which groups PWMs on the basis of sequence similarity into 286 distinct clusters (see Methods). To identify clusters containing a significant excess of aSNPs relative to naSNPs, we calculated the aSNP fold enrichment score by dividing the aSNP:naSNP within a cluster by the mean value of this ratio across all clusters, similar to analyses above.…”
Section: Results
Citation type: mentioning
confidence: 99%
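The fold enrichment score described in the statement above (the aSNP:naSNP ratio within a cluster divided by the mean of that ratio across all clusters) can be written out as a short calculation. This is a minimal sketch under assumptions: the function name, the input layout, and the counts are hypothetical, and clusters with zero naSNPs are simply skipped, which the citing paper may handle differently.

```python
from statistics import mean


def fold_enrichment(counts):
    """Return per-cluster aSNP fold enrichment scores.

    `counts` maps a cluster name to (n_asnp, n_nasnp). Hypothetical
    input layout; clusters with no naSNPs are skipped (an assumption).
    """
    # Per-cluster aSNP:naSNP ratio.
    ratios = {c: a / n for c, (a, n) in counts.items() if n > 0}
    # Mean of the ratio across all retained clusters.
    avg = mean(ratios.values())
    # Score = cluster ratio relative to the across-cluster mean.
    return {c: r / avg for c, r in ratios.items()}


# Made-up counts for three motif clusters.
example = {"cluster_A": (10, 10), "cluster_B": (30, 10), "cluster_C": (20, 10)}
print(fold_enrichment(example))
```

A score above 1 marks a cluster whose aSNP:naSNP ratio exceeds the across-cluster average, and a score below 1 marks one that falls short of it.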
“…For the mouse FRiS calculations, we aggregated peaks that are available from mouse bulk ATAC-seq and DNAse hypersensitivity experiments provided by the ENCODE project, followed by peak collapsing, resulting in 2,377,227 total peaks averaging 744.9 bp. For the human dataset, we used a human reference dataset for DNAse hypersensitivity 28 that contains 3,591,898 loci defined as TF footprints with an average size of 203.9 bp leading to the lower FRiS values when compared to the aggregate mouse ATAC-seq peak dataset.…”
Section: Quality Metric Calculations
Citation type: mentioning
confidence: 99%
“…We designed and implemented a 250 μm diameter punch schematic across three adjacent 200 μm sections to produce twenty-one distinct trajectories comprised of eight punches spanning the cortex, with an additional twenty punches distributed in the subcortical white matter for a total of 188 spatially mapped tissue punches (Figure 3a). In total, 4,547 cells passed quality filters with a mean of 30,212 reads per cell (estimated mean of 98,274 passing reads per cell with additional sequencing; Methods, Extended Data Figures 2a and 4a), a mean TSS enrichment of 15.80 - more than twice the 'ideal' ENCODE standard for bulk ATAC-seq datasets (>7, GRCh38 RefSeq annotation), and a FRiS of 0.45 using a human reference dataset 28 (Methods, Extended Data Figures 2b and 2d).…”
Citation type: mentioning
confidence: 99%