Many anecdotal observations exist of a regulatory effect of DNA methylation on gene expression. However, in general, the underlying mechanisms of this effect are poorly understood. In this review, we summarize what is currently known about how this important, but mysterious, epigenetic mark impacts cellular functions. Cytosine methylation can abrogate or enhance interactions with DNA-binding proteins, or it may have no effect, depending on the context. Despite being only a small chemical change, the addition of a methyl group to cytosine can affect base readout via hydrophobic contacts in the major groove and shape readout via electrostatic contacts in the minor groove. We discuss the recent discovery that CpG methylation increases DNase I cleavage at adjacent positions by an order of magnitude through altering the local 3D DNA shape and the possible implications of this structural insight for understanding the methylation sensitivity of transcription factors (TFs). Additionally, 5-methylcytosines change the stability of nucleosomes and, thus, affect the local chromatin structure and access of TFs to genomic DNA. Given these complexities, it seems unlikely that the influence of DNA methylation on protein–DNA binding can be captured in a small set of general rules. Hence, data-driven approaches may be essential to gain a better understanding of these mechanisms.
Protein–DNA binding is a fundamental component of gene regulatory processes, but it is still not completely understood how proteins recognize their target sites in the genome. Besides hydrogen bonding in the major groove (base readout), proteins recognize minor-groove geometry using positively charged amino acids (shape readout). The underlying mechanism of DNA shape readout involves the correlation between minor-groove width and electrostatic potential (EP). To probe this biophysical effect directly, rather than using minor-groove width as an indirect measure for shape readout, we developed a methodology, DNAphi, for predicting EP in the minor groove and confirmed the direct role of EP in protein–DNA binding using massive sequencing data. The DNAphi method uses a sliding-window approach to mine results from non-linear Poisson–Boltzmann (NLPB) calculations on DNA structures derived from all-atom Monte Carlo simulations. We validated this approach, which only requires nucleotide sequence as input, based on direct comparison with NLPB calculations for available crystal structures. Using statistical machine-learning approaches, we showed that adding EP as a biophysical feature can improve the predictive power of quantitative binding specificity models across 27 transcription factor families. High-throughput prediction of EP offers a novel way to integrate biophysical and genomic studies of protein–DNA binding.
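The sliding-window idea described above can be illustrated with a minimal Python sketch. It is not the DNAphi implementation: the function names, the pentamer window size, and the lookup table are assumptions for illustration only. The table values are placeholders (the negative count of A/T bases in the window), not electrostatic potentials from Poisson–Boltzmann calculations.

```python
from itertools import product
from typing import Dict, List, Optional


def sliding_window_predict(
    seq: str, table: Dict[str, float], k: int = 5
) -> List[Optional[float]]:
    """Assign each position the table value of the k-mer centred on it.

    Positions too close to either end to have a full window get None.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so each window has a central position")
    seq = seq.upper()
    half = k // 2
    profile: List[Optional[float]] = [None] * len(seq)
    for i in range(half, len(seq) - half):
        window = seq[i - half : i + half + 1]
        profile[i] = table.get(window)  # None if the k-mer is not in the table
    return profile


def toy_table(k: int = 5) -> Dict[str, float]:
    """Hypothetical lookup table: PLACEHOLDER values, not real EP data."""
    return {
        "".join(p): -float(sum(base in "AT" for base in p))
        for p in product("ACGT", repeat=k)
    }


# A sequence containing an A-tract, flanked by G/C bases.
profile = sliding_window_predict("GCAAAAAAGC", toy_table())
print(profile)
```

In a real application the table would hold one precomputed value per k-mer, so prediction needs only the nucleotide sequence as input, as the abstract states.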
Background: DNA shape analysis has demonstrated the potential to reveal structure-based mechanisms of protein–DNA binding. However, information about the influence of chemical modification of DNA is limited. Cytosine methylation, the most frequent modification, represents the addition of a methyl group at the major groove edge of the cytosine base. In mammalian genomes, cytosine methylation most frequently occurs at CpG dinucleotides. In addition to changing the chemical signature of C/G base pairs, cytosine methylation can affect DNA structure. Since the original discovery of DNA methylation, major efforts have been made to understand its effect from a sequence perspective. Compared to unmethylated DNA, however, little structural information is available for methylated DNA, due to the limited number of experimentally determined structures. To achieve a better mechanistic understanding of the effect of CpG methylation on local DNA structure, we developed a high-throughput method, methyl-DNAshape, for predicting the effect of cytosine methylation on DNA shape.
Results: Using our new method, we found that CpG methylation significantly altered local DNA shape. Four DNA shape features (helix twist, minor groove width, propeller twist, and roll) were considered in this analysis. Distinct distributions of effect size were observed for different features. Roll and propeller twist were the DNA shape features most strongly affected by CpG methylation, with an effect size depending on the local sequence context. Methylation-induced changes in DNA shape were predictive of the measured rate of cleavage by DNase I and suggest a possible mechanism for some of the methylation sensitivities that were recently observed for human Pbx-Hox complexes.
Conclusions: CpG methylation is an important epigenetic mark in the mammalian genome. Understanding its role in protein–DNA recognition can further our knowledge of gene regulation. Our high-throughput methyl-DNAshape method can be used to predict the effect of cytosine methylation on DNA shape and its subsequent influence on protein–DNA interactions. This approach overcomes the limited availability of experimental DNA structures that contain 5-methylcytosine.
Electronic supplementary material: The online version of this article (10.1186/s13072-018-0174-4) contains supplementary material, which is available to authorized users.
Enhancers harbor binding motifs that recruit transcription factors (TFs) for gene activation. While cooperative binding of TFs at enhancers is known to be critical for transcriptional activation of a handful of developmental enhancers, the extent of TF cooperativity genome-wide is unknown. Here, we couple high-resolution nuclease footprinting with single-molecule methylation profiling to characterize TF cooperativity at active enhancers in the Drosophila genome. Enrichment of short MNase-protected DNA segments indicates that the majority of enhancers harbor two or more TF binding sites, and we uncover protected fragments that correspond to co-bound sites in thousands of enhancers. We integrate MNase-seq, methylation accessibility profiling, and CUT&RUN chromatin profiling as a comprehensive strategy to characterize co-binding of the Trithorax-like (TRL) DNA binding protein and multiple other TFs and identify states where an enhancer is bound by no TF, by either single factor, by multiple factors, or where binding sites are occluded by nucleosomes. From the analysis of co-binding, we find that cooperativity dominates TF binding in vivo at a majority of active enhancers. TF cooperativity can occur without apparent protein–protein interactions and provides a mechanism to effectively clear nucleosomes and promote enhancer function.
Cell-free DNA (cfDNA) has the potential to enable non-invasive detection of disease states and progression. Beyond its sequence, cfDNA also represents the nucleosomal landscape of cell(s)-of-origin and captures the dynamics of the epigenome. In this review, we highlight the emergence of cfDNA epigenomic methods that assess disease beyond the scope of mutant tumour genotyping. Detection of tumour mutations is the gold standard for sequencing methods in clinical oncology. However, limitations inherent to mutation targeting in cfDNA, and the possibilities of uncovering molecular mechanisms underlying disease, have made epigenomics of cfDNA an exciting alternative. We discuss the epigenomic information revealed by cfDNA, and how epigenomic methods exploit cfDNA to detect and characterize cancer. Future applications of cfDNA epigenomic methods to act complementarily and orthogonally to current clinical practices have the potential to transform cancer management and improve cancer patient outcomes.
We demonstrate here that the α subunit C-terminal domain of Escherichia coli RNA polymerase (αCTD) recognizes the upstream promoter (UP) DNA element via its characteristic minor groove shape and electrostatic potential. In two compositionally distinct crystallized assemblies, a pair of αCTD subunits bind in tandem to the UP element consensus A-tract that is 6 bp in length (A6-tract), each with their arginine 265 guanidinium group inserted into the minor groove. The A6-tract minor groove is significantly narrowed in these crystal structures, as well as in computationally predicted structures of free and bound DNA duplexes derived by Monte Carlo and molecular dynamics simulations, respectively. The negative electrostatic potential of free A6-tract DNA is substantially enhanced compared to that of generic DNA. Shortening the A-tract by 1 bp is shown to “knock out” binding of the second αCTD through widening of the minor groove. Furthermore, in computationally derived structures with arginine 265 mutated to alanine in either αCTD, either with or without the “knockout” DNA mutation, contact with the DNA is perturbed, highlighting the importance of arginine 265 in achieving αCTD–DNA binding. These results demonstrate that the importance of the DNA shape in sequence-dependent recognition of DNA by RNA polymerase is comparable to that of certain transcription factors.
Genome-wide binding profiles of estrogen receptor (ER) and FOXA1 reflect cancer state in ER+ breast cancer. However, routine profiling of tumor transcription factor (TF) binding is impractical in the clinic. Here, we show that plasma cell-free DNA (cfDNA) contains high-resolution ER and FOXA1 tumor binding profiles for breast cancer. Enrichment of TF footprints in plasma reflects the binding strength of the TF in originating tissue. We defined pure in vivo tumor TF signatures in plasma using ER+ breast cancer xenografts, which can distinguish xenografts with distinct ER states. Furthermore, state-specific ER-binding signatures can partition human breast tumors into groups with significantly different ER expression and mortality. Last, TF footprints in human plasma samples can identify the presence of ER+ breast cancer. Thus, plasma TF footprints enable minimally invasive mapping of the regulatory landscape of breast cancer in humans and open vast possibilities for clinical applications across multiple tumor types.
DNA-binding proteins play important roles in various cellular processes, but the mechanisms by which proteins recognize genomic target sites remain incompletely understood. Functional groups at the edges of the base pairs (bp) exposed in the DNA grooves represent physicochemical signatures. As these signatures enable proteins to form specific contacts between protein residues and bp, their study can provide mechanistic insights into protein–DNA binding. Existing experimental methods, such as X-ray crystallography, can reveal such mechanisms based on physicochemical interactions between proteins and their DNA target sites. However, the low throughput of structural biology methods limits mechanistic insight into the selection of many genomic sites. High-throughput binding assays enable prediction of potential target sites by determining relative binding affinities of a protein to massive numbers of DNA sequences. Many currently available computational methods are based on the sequence of standard Watson–Crick bp. They assume that each base pair contributes independently to the overall binding affinity, or alternatively include dinucleotides or short k-mers. These methods cannot be extended directly to physicochemical contacts, and they are not suitable for DNA modifications or non-Watson–Crick bp, such as DNA methylation and synthetic or mismatched bp. The proposed method, DeepRec, can predict relative binding affinities as a function of physicochemical signatures and the effect of DNA methylation or other chemical modifications on binding. Sequence-based modeling methods are, in comparison, a coarse-grained description and cannot achieve such insights. Our chemistry-based modeling framework provides a path towards understanding genome function at a mechanistic level.
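To make the notion of a physicochemical signature concrete, the sketch below encodes each base pair by the functional groups it exposes in the major and minor grooves, using the standard textbook assignment of hydrogen-bond acceptors, donors, methyl groups, and nonpolar hydrogens. It is an illustrative assumption, not the DeepRec encoding: the symbol `m` for a 5-methylcytosine on the read strand, the one-hot layout, and all names are hypothetical.

```python
from typing import Dict, List

# Functional group classes: Acceptor, Donor, Methyl, nonpolar Hydrogen.
GROUPS = "ADMH"

# Groups exposed in the major groove (4 positions), read from the listed
# base towards its Watson-Crick partner. "m" denotes 5-methylcytosine
# (hypothetical symbol chosen for this sketch).
MAJOR: Dict[str, str] = {
    "A": "ADAM", "T": "MADA", "G": "AADH", "C": "HDAA", "m": "MDAA",
}

# Groups exposed in the minor groove (3 positions).
MINOR: Dict[str, str] = {
    "A": "AHA", "T": "AHA", "G": "ADA", "C": "ADA", "m": "ADA",
}


def encode(seq: str) -> List[List[int]]:
    """One-hot encode each base pair as 7 groove positions x 4 group classes."""
    rows: List[List[int]] = []
    for base in seq:
        signature = MAJOR[base] + MINOR[base]
        row: List[int] = []
        for group in signature:
            row.extend(1 if group == g else 0 for g in GROUPS)
        rows.append(row)
    return rows


unmethylated = encode("C")[0]
methylated = encode("m")[0]

# Methylation changes only the first major-groove position (H -> M).
print(unmethylated[:4], methylated[:4])
print(unmethylated[4:] == methylated[4:])
```

Under this representation, cytosine methylation appears as a change at a single major-groove position rather than as a fifth letter, and A·T and T·A pairs become indistinguishable in the minor groove. This is the kind of detail that a purely sequence-based alphabet cannot express.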