Mobile elements and repetitive genomic regions are sources of lineage-specific genomic innovation and uniquely fingerprint individual genomes. Comprehensive analyses of such repeat elements, including those found in more complex regions of the genome, require a complete, linear genome assembly. We present a de novo repeat discovery and annotation of the T2T-CHM13 human reference genome. We identified previously unknown satellite arrays, expanded the catalog of variants and families for repeats and mobile elements, characterized classes of complex composite repeats, and located retroelement transduction events. We detected nascent transcription and delineated CpG methylation profiles to define the structure of transcriptionally active retroelements in humans, including those in centromeres. These data expand our insight into the diversity, distribution, and evolution of repetitive regions that have shaped the human genome.
Angelman syndrome (AS) is a severe neurodevelopmental disorder caused by the loss of function from the maternal allele of UBE3A, a gene encoding an E3 ubiquitin ligase. UBE3A is only expressed from the maternally inherited allele in mature human neurons due to tissue-specific genomic imprinting. Imprinted expression of UBE3A is restricted to neurons by expression of UBE3A antisense transcript (UBE3A-ATS) from the paternally inherited allele, which silences the paternal allele of UBE3A in cis. However, the mechanism restricting UBE3A-ATS expression and UBE3A imprinting to neurons is not understood. We used CRISPR/Cas9-mediated genome editing to functionally define a bipartite boundary element critical for neuron-specific expression of UBE3A-ATS in humans. Removal of this element led to up-regulation of UBE3A-ATS without repressing paternal UBE3A. However, increasing expression of UBE3A-ATS in the absence of the boundary element resulted in full repression of paternal UBE3A, demonstrating that UBE3A imprinting requires both the loss of function from the boundary element as well as the up-regulation of UBE3A-ATS. These results suggest that manipulation of the competition between UBE3A-ATS and UBE3A may provide a potential therapeutic approach for AS.
Gene expression programs change during cellular transitions. It is well established that a network of transcription factors and chromatin modifiers regulates RNA levels during embryonic stem cell (ESC) differentiation, but the full impact of post-transcriptional processes remains elusive. While cytoplasmic RNA turnover mechanisms have been implicated in differentiation, the contribution of nuclear RNA decay has not been investigated. Here, we differentiate mouse ESCs, depleted for the ribonucleolytic RNA exosome, into embryoid bodies to determine to what degree RNA abundance in the two states can be attributed to changes in transcription versus RNA decay by the exosome. As a general observation, we find that exosome depletion mainly leads to the stabilization of RNAs from lowly transcribed loci, including several protein-coding genes. Depletion of the nuclear exosome cofactor RBM7 leads to similar effects. In particular, transcripts that are differentially expressed between states tend to be more exosome sensitive in the state where expression is low. We conclude that the RNA exosome contributes to down-regulation of transcripts with disparate expression, often in conjunction with transcriptional down-regulation.
Mobile elements and highly repetitive genomic regions are potent sources of lineage-specific genomic innovation and fingerprint individual genomes. Comprehensive analyses of large, composite or arrayed repeat elements and those found in more complex regions of the genome require a complete, linear genome assembly. Here we present the first de novo repeat discovery and annotation of a complete human reference genome, T2T-CHM13v1.0. We identified novel satellite arrays, expanded the catalog of variants and families for known repeats and mobile elements, characterized new classes of complex, composite repeats, and provided comprehensive annotations of retroelement transduction events. Utilizing PRO-seq to detect nascent transcription and nanopore sequencing to delineate CpG methylation profiles, we defined the structure of transcriptionally active retroelements in humans, including for the first time those found in centromeres. Together, these data provide expanded insight into the diversity, distribution and evolution of repetitive regions that have shaped the human genome.