The genetic mechanisms underlying the expansion in size and complexity of the human brain remain poorly understood. L1 retrotransposons are a source of divergent genetic information in hominoid genomes, but their importance in physiological functions and their contribution to human brain evolution are largely unknown. Using multi-omic profiling, we here demonstrate that L1-promoters are dynamically active in the developing and adult human brain. L1s generate hundreds of developmentally regulated and cell-type-specific transcripts, many of which are co-opted as chimeric transcripts or regulatory RNAs. One L1-derived lncRNA, LINC01876, is a human-specific transcript expressed exclusively during brain development. CRISPRi-silencing of LINC01876 results in reduced size of cerebral organoids and premature differentiation of neural progenitors, implicating L1s in human-specific developmental processes. In summary, our results demonstrate that L1-derived transcripts provide a previously undescribed layer of primate- and human-specific transcriptome complexity that contributes to the functional diversification of the human brain.
Genomic sequences with high sequence similarity, such as parent-pseudogene pairs, cause short sequencing reads to align to multiple locations, thus complicating genomic analyses [1]. However, their impact on transcriptomic analyses, including the estimation of gene expression and transcript annotation, has been less studied. Here, we investigated the impact of pseudogenes on transcriptomic analyses by focusing on the disease-relevant example of GBA1 and its expressed pseudogene GBAP1. Using short-read RNA-sequencing data from human brain samples [2], we found that only 42% of all reads mapping to GBA1 did so uniquely, with the remaining reads mapping primarily to GBAP1. This resulted in a significant misestimation of the relative expression of GBA1 to GBAP1. Using targeted long-read RNA-sequencing of 12 human brain regions, we identified 18 GBA1 transcripts that had a novel open reading frame (ORF) and 7 GBAP1 transcripts predicted to encode a protein, despite GBAP1 being classified as a pseudogene. Furthermore, we demonstrated the ability of these transcripts to generate stable protein that lacked GBA’s important function as a lysosomal glucocerebrosidase (GCase). However, we found that these transcripts were surprisingly common, collectively accounting for 32% of transcription from the GBA1 locus in the caudate nucleus, and their usage showed cell type selectivity in human brain. Finally, we used annotation-independent analyses of both long- and short-read RNA-sequencing data sets to show that parent genes were more likely to have evidence of incomplete annotation. Given that 734 (17%) genes causing Mendelian disease have at least one pseudogene, these findings significantly impact our understanding of human disease and highlight the need for long-read RNA-sequencing analyses at many loci.
The human silencing hub (HUSH) complex binds to transcripts of LINE-1 retrotransposons (L1s) and other genomic repeats, recruiting MORC2 and other effectors to remodel chromatin. However, how HUSH and MORC2 operate alongside DNA methylation, a central epigenetic regulator of repeat transcription, remains poorly understood. Here we interrogate this relationship in human neural progenitor cells (hNPCs), a somatic model of brain development that tolerates removal of the DNA methyltransferase DNMT1. Upon loss of MORC2 or the HUSH subunit TASOR in hNPCs, L1s remain silenced by robust promoter methylation. However, genome demethylation and activation of evolutionarily young L1s attract MORC2 binding. Simultaneous depletion of DNMT1 and MORC2 causes massive accumulation of L1 transcripts. We identify the same mechanistic hierarchy at pericentromeric α-satellites and clustered protocadherin genes, repetitive elements important for chromosome structure and neurodevelopment, respectively. Our data delineate the independent epigenetic control of repeats in somatic cells, with implications for understanding the vital functions of HUSH-MORC2 in hypomethylated contexts throughout human development.