2019
DOI: 10.1101/637298
Preprint

Quantitative analysis of ZFY and CTCF reveals dependent recognition of tandem zinc finger proteins

Abstract: The human genome has more than 800 C2H2 Zinc Finger-containing genes, and many of them are composed of long tandem arrays of zinc fingers. Current Zinc Finger Protein (ZFP) motif prediction models assume longer finger arrays correspond to longer DNA-binding motifs and higher specificity. However, recent experimental efforts to identify ZFP binding sites in vivo contradict this assumption with many having short reported motifs. Using Zinc Finger Y (ZFY), which has 13 ZFs, we quantitatively characterize its DNA …

citations
Cited by 9 publications
(10 citation statements)
references
References 73 publications
(113 reference statements)
“…To do so requires both the ability to predict the sequence specificity of a novel regulator and the ability to determine significant interactions. Because of the difficulty of meeting both of these challenges, previous work has focused primarily on engineering modular regulatory proteins such as zinc-finger (ZFs) and transcription activator-like (TALs) rather than on de novo prediction of biological targets, a problem that remains largely unsolved (46, 47). In this work, we solve the DNA-specificity code of ECF σs and build a computational pipeline that enables us to use these rules to determine statistically significant putative regulons for ∼67% of bacterial ECF σs.…”
Section: Discussion (mentioning)
confidence: 99%
“…The human genome expresses ~800 multiple zinc-finger genes that could potentially encode TFs of specific targeting, but most of these remain uncharacterized. In fact, well-studied TFs, including CTCF [9], GLI1 [10], and PRDM9 [11], bind to relatively short motifs, leading to the 'many fingers but short motif' paradox [12]. More generally, systematic mapping of the in vitro binding preferences of thousands of eukaryotic DBDs revealed that long motifs supporting specific targeting is the exception rather than the rule [13][14][15].…”
Section: TF Target Search In Vivo: The Challenge of Binding Specificity (mentioning)
confidence: 99%
“…Many additional studies have trained neural networks to predict TF binding and used interpretation methods similar to those that we used to discover known and sometimes novel motifs of TFs [42-47], but the subset of these studies that predicted CTCF binding failed to identify our motif for ZFs 1-2, likely because, unlike our study, their models were not designed to directly learn individual DBD binding preferences. In fact, a previous study suggested that, for TFs with multiple ZFs, some ZFs have consistent binding patterns across the majority of binding sites, while others bind at only a minority of sites and do not always have the same spacing when binding, making their motifs difficult to detect when modeling all TF binding sites together [48]. Properly evaluating differences between wild-type and mutant TF binding requires multiple high-quality replicates of in vivo binding data from each of a wild-type and mutant TF, which, unfortunately, are not always available; this inability to properly detect differential binding due to generating only one biological replicate may explain why the only existing study contrasting wild-type and mutant CTCF binding [49] was unable to obtain some of our results (Supplemental Notes).…”
Section: Discussion (mentioning)