Abstract: Chromatin accessibility is integral to the process by which transcription factors (TFs) read out cis-regulatory DNA sequences, but it is difficult to differentiate between TFs that drive accessibility and those that do not. Deep learning models that learn complex sequence rules provide an unprecedented opportunity to dissect this problem. Using zygotic genome activation in the Drosophila embryo as a model, we generated high-resolution TF binding and chromatin accessibility data, analyzed the data with interpre…
“…In this type of TF cooperativity, binding enhancement occurs when the motifs are spaced within ~150 bp and are strongest the closer the motifs are. Such sequence rules point to nucleosome-mediated cooperativity 15,20,26,[117][118][119][120][121] and may reflect the likelihood by which two motifs are covered by the same nucleosome. Although the mechanism is not well understood, it does not require specific interactions between TFs, which explains how signaling TFs can receive input from a wide variety of TFs in different cell types.…”
Section: Discussion (mentioning)
confidence: 99%
“…Here, we hypothesized that this TF cooperativity is DNA sequence-driven and thus can be studied by measuring the binding of TFs on DNA and identifying the underlying sequence rules using interpretable deep learning. During training, deep learning models accurately learn sequence rules within genomic regions in an inherently combinatorial manner de novo until they can predict the data from sequence alone [15][16][17][18][19][20][21] . The key step is then to interrogate the model and extract the learned sequence rules using interpretation tools 15 .…”
Section: Introduction (mentioning)
confidence: 99%
“…On the other hand, strictly spaced motifs are not frequently observed in the genome, raising the question of whether TF binding cooperativity may also occur through more flexible motif syntax [27][28][29] . For example, we previously found that a TF may enhance the binding of another TF through soft motif syntax, which occurs at variable motif distances within ~150 bp but is stronger at closer distances 15,20 .…”
Section: Introduction (mentioning)
confidence: 99%
“…To discover these potential sequence rules, we performed the TF binding experiments at the highest resolution and leveraged our previously developed deep learning model BPNet to predict the data at base resolution from genomic sequences of 1-kb 15,20,22,37,38 . This approach optimally resolves sequence rules between closely spaced motifs within enhancers 15 .…”
Summary: The response to signaling pathways is highly context-specific, and identifying the transcription factors and mechanisms that are responsible is very challenging. Using the Hippo pathway in mouse trophoblast stem cells as a model, we show here that this information is encoded in cis-regulatory sequences and can be learned from high-resolution binding data of signaling transcription factors. Using interpretable deep learning, we show that the binding levels of TEAD4 and YAP1 are enhanced in a distance-dependent manner by cell type-specific transcription factors, including TFAP2C. We also discovered that strictly spaced Tead double motifs are widespread, highly active canonical response elements that mediate cooperativity by promoting labile TEAD4 protein-protein interactions on DNA. These syntax rules and mechanisms apply genome-wide and allow us to predict how small sequence changes alter the activity of enhancers in vivo. This illustrates the power of interpretable deep learning to decode canonical and cell type-specific sequence rules of signaling pathways.
“…It is probable that additional cofactors, such as Zld itself, may be required for Twi occupancy at these regions. Prior work demonstrated an important role for Zld in promoting Twi binding in the early embryo 39,40 . Along with earlier studies, our data suggest that PF occupancy is regulated by tissue-intrinsic features, including levels of PF expression, the complement of cofactors expressed, and chromatin structure.…”
While chromatin presents a barrier to the binding of many transcription factors, pioneer factors access nucleosomal targets and promote chromatin opening. Despite binding to target motifs in closed chromatin, many pioneer factors display cell-type-specific binding and activity. The mechanisms governing pioneer-factor occupancy and the relationship between chromatin occupancy and opening remain unclear. We studied three Drosophila transcription factors with distinct DNA-binding domains and biological functions: Zelda, Grainy head, and Twist. We demonstrated that the level of chromatin occupancy is a key determinant of pioneering activity. Multiple factors regulate occupancy, including motif content, local chromatin, and protein concentration. Regions outside the DNA-binding domain are required for binding and chromatin opening. Our results show that pioneering activity is not a binary feature intrinsic to a protein but occurs on a spectrum and is regulated by a variety of protein-intrinsic and cell-type-specific features.
Pioneer transcription factors direct cell differentiation by deploying new enhancer repertoires through their unique ability to target and initiate remodelling of closed chromatin. The initial steps of their action remain undefined although pioneers were shown to interact with nucleosomal target DNA and with some chromatin remodelling complexes. We now define the sequence of events that provide pioneers with their unique abilities. Chromatin condensation exerted by linker histone H1 is the first constraint on pioneer recruitment, and this establishes the initial speed of chromatin remodelling. The first step of pioneer action involves recruitment of the LSD1 H3K9me2 demethylase for removal of this repressive mark, as well as recruitment of the MLL complex for deposition of the H3K4me1 mark. Further progression of pioneer action requires passage through cell division, and this involves dissociation of pioneer targets from perinuclear lamins. Only then, the SWI/SNF remodeling complex and the coactivator p300 are recruited, leading to nucleosome displacement and enhancer activation. Thus, the unique features of pioneer actions are those occurring in the lamin-associated compartment of the nucleus. This model is consistent with much prior work that showed a dependence on cell division for establishment of new cell fates.