While the methylation of DNA in 5′ promoters suppresses gene expression, the role of DNA methylation in gene bodies is unclear [1–5]. In mammals, tissue- and cell type-specific methylation is present in a small percentage of 5′ CpG island (CGI) promoters, while a far greater proportion occurs across gene bodies, coinciding with highly conserved sequences [5–10]. Tissue-specific intragenic methylation might reduce [3] or, paradoxically, enhance transcription elongation efficiency [1,2,4,5]. Capped analysis of gene expression (CAGE) experiments also indicate that transcription commonly initiates within and between genes [11–15]. To investigate the role of intragenic methylation, we generated a map of DNA methylation from human brain encompassing 24.7 million of the 28 million CpG sites. The dense, high-resolution coverage of CpG islands revealed that the majority of methylated CpG islands were in intragenic and intergenic regions, whereas less than 3% of CpG islands in 5′ promoters were methylated. The CpG islands in all three locations overlapped with RNA markers of transcription initiation, and unmethylated CpG islands also overlapped significantly with trimethylation of H3K4, a histone modification enriched at promoters [16]. The general and CpG-island-specific patterns of methylation are conserved in mouse tissues. An in-depth investigation of the human SHANK3 locus [17,18] and its mouse homologue demonstrated that this tissue-specific DNA methylation regulates intragenic promoter activity in vitro and in vivo. These methylation-regulated, alternative transcripts are expressed in a tissue- and cell type-specific manner, and are expressed differentially within a single cell type from distinct brain regions. These results support a major role for intragenic methylation in regulating cell context-specific alternative promoters in gene bodies.
Transposable elements (TEs) have been shown to contain functional binding sites for certain transcription factors (TFs). However, the extent to which TEs contribute to the evolution of TF binding sites is not well known. We comprehensively mapped binding sites for 26 pairs of orthologous TFs in two pairs of human and mouse cell lines (representing two cell lineages), along with epigenomic profiles, including DNA methylation and six histone modifications. Overall, we found that 20% of binding sites were embedded within TEs. This number varied across different TFs, ranging from 2% to 40%. We further identified 710 TF-TE relationships in which genomic copies of a TE subfamily contributed a significant number of binding peaks for a TF, and we found that LTR elements dominated these relationships in human. Importantly, TE-derived binding peaks were strongly associated with open and active chromatin signatures, including reduced DNA methylation and increased enhancer-associated histone marks. On average, 66% of TE-derived binding events were cell type-specific with a cell type-specific epigenetic landscape. Most of the binding sites contributed by TEs were species-specific, but we also identified binding sites conserved between human and mouse, the functional relevance of which was supported by a signature of purifying selection on DNA sequences of these TEs. Interestingly, several TFs had significantly expanded binding site landscapes only in one species, which were linked to species-specific gene functions, suggesting that TEs are an important driving force for regulatory innovation. Taken together, our data suggest that TEs have significantly and continuously shaped gene regulatory networks during mammalian evolution.
Naïve human pluripotent stem cells (hPSCs) provide a unique experimental platform for studying cell fate decisions during pre-implantation development, but their lineage potential remains incompletely characterized. As naïve hPSCs share transcriptional and epigenomic signatures with trophoblast cells, it has been proposed that the naïve state may have an enhanced predisposition for differentiation along this extraembryonic lineage. Here we examined the trophoblast potential of isogenic naïve and primed hPSCs. We found that naïve hPSCs can directly give rise to human trophoblast stem cells (hTSCs) and undergo further differentiation into both extravillous trophoblasts and syncytiotrophoblasts. In contrast, primed hPSCs do not support hTSC derivation, but give rise to non-self-renewing cytotrophoblasts in response to BMP4. Global transcriptome and chromatin accessibility analyses indicate that hTSCs derived from naïve hPSCs are similar to blastocyst-derived hTSCs and acquire features of post-implantation trophectoderm. The derivation of hTSCs from naïve hPSCs will enable elucidation of early mechanisms that govern normal human trophoblast development and associated pathologies.
To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human-mouse erythroid progenitor, lymphoblast, and embryonic stem cell lines. By combining the genome-wide TF occupancy repertoires, associated epigenetic signals, and TF co-association patterns, we deduced several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences (TF OSs). However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Importantly, occupancy-conserved TF OSs tend to be pleiotropic; they function in multiple tissues and also co-associate with multiple TFs. Single nucleotide variants (SNVs) at sites with potential regulatory functions are enriched in occupancy-conserved TF OSs.
Transposable elements (TEs) are an abundant and rich genetic resource of regulatory sequences [1–3]. Cryptic regulatory elements within TEs can be epigenetically reactivated in cancer to influence oncogenesis in a process termed onco-exaptation [4]. However, the prevalence and impact of TE onco-exaptation events across cancer types are poorly characterized. Here, we analyzed 7,769 tumors and 625 normal datasets from 15 cancer types, identifying 129 TE cryptic promoter activation events involving 106 oncogenes across 3,864 tumors. Furthermore, we interrogated the AluJb-LIN28B candidate: the genetic deletion of the TE eliminated oncogene expression, while dynamic DNA methylation modulated promoter activity, illustrating the necessity and sufficiency of a TE for oncogene activation. Collectively, our results characterize the global profile of TE onco-exaptation and highlight this prevalent phenomenon as an important mechanism for promiscuous oncogene activation and ultimately tumorigenesis.