SUMMARY: Mutations that lead to splicing defects can have severe consequences on gene function and cause disease. Here, we explore how human genetic variation affects exon recognition by developing a multiplexed functional assay of splicing using Sort-seq (MFASS). We assayed 27,733 variants in the Exome Aggregation Consortium (ExAC) within or adjacent to 2,198 human exons in the MFASS minigene reporter and found that 3.8% (1,050) of variants, most of which are extremely rare, were large-effect splice-disrupting variants (SDVs). Importantly, we find that 83% of SDVs are located outside of canonical splice sites, are distributed evenly across distinct exonic and intronic regions, and are difficult to predict a priori. Our results indicate that extant, rare genetic variants can have large functional effects on splicing at appreciable rates, even outside the context of disease, and that MFASS enables their empirical assessment at scale.
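As a quick sanity check on the figures quoted in this summary, the arithmetic can be sketched as follows. The constant and function names are illustrative assumptions, not part of the MFASS study, and the count of SDVs outside canonical splice sites is only an approximation because the 83% figure is reported as a rounded percentage.

```python
# Counts as reported in the summary above.
TOTAL_VARIANTS = 27_733        # ExAC variants assayed
SDV_COUNT = 1_050              # large-effect splice-disrupting variants
PERCENT_OUTSIDE_CANONICAL = 83 # rounded percentage, as reported


def sdv_rate(sdv_count: int, total: int) -> float:
    """Fraction of assayed variants classified as SDVs."""
    return sdv_count / total


def approx_outside_canonical(sdv_count: int, percent: int) -> float:
    """Approximate number of SDVs outside canonical splice sites.

    Only an estimate: the input percentage is already rounded.
    """
    return sdv_count * percent / 100


rate = sdv_rate(SDV_COUNT, TOTAL_VARIANTS)
outside = approx_outside_canonical(SDV_COUNT, PERCENT_OUTSIDE_CANONICAL)

print(f"SDV rate: {rate:.1%}")
print(f"Approximate SDVs outside canonical splice sites: {outside}")
```

The computed rate rounds to the 3.8% quoted in the summary, and the second figure suggests that roughly 870 of the 1,050 SDVs lie outside canonical splice sites.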
In eukaryotes, nascent RNA transcripts undergo an intricate series of RNA processing steps to achieve mRNA maturation. RNA editing and alternative splicing are two major RNA processing steps that can introduce significant modifications to the final gene products. By tackling these processes in isolation, recent studies have enabled substantial progress in understanding their global RNA targets and regulatory pathways. However, the interplay between individual steps of RNA processing, an essential aspect of gene regulation, remains poorly understood. By sequencing the RNA of different subcellular fractions, we examined the timing of adenosine-to-inosine (A-to-I) RNA editing and its impact on alternative splicing. We observed that >95% of A-to-I RNA editing events occurred in the chromatin-associated RNA prior to polyadenylation. We report about 500 editing sites in the 3′ acceptor sequences that can alter splicing of the associated exons. These exons are highly conserved during evolution and reside in genes with important cellular functions. Furthermore, we identified a second class of exons whose splicing is likely modulated by RNA secondary structures that are recognized by the RNA editing machinery. The genome-wide analyses, supported by experimental validations, revealed remarkable interplay between RNA editing and splicing and expanded the repertoire of functional RNA editing sites.
ENCODE 3 (2012–2017) expanded production and added new types of assays (ref. 8; Fig. 1, Extended Data Fig. 1), which revealed landscapes of RNA binding and the 3D organization of chromatin via methods such as chromatin interaction analysis by paired-end tagging (ChIA-PET) and Hi-C chromosome conformation capture. Phases 2 and 3 delivered 9,239 experiments (7,495 in human and 1,744 in mouse) in more than 500 cell types and tissues, including mapping of transcribed regions and transcript isoforms, regions of transcripts recognized by RNA-binding proteins, transcription factor binding regions, and regions that harbour specific histone modifications, open chromatin, and 3D chromatin interactions. The results of all of these experiments are available at the ENCODE portal (http://www.encodeproject.org). These efforts, combined with those of related projects and many other laboratories, have produced a greatly enhanced view of the human genome (Fig. 2), identifying 20,225 protein-coding and 37,595 noncoding genes
Identification of functional genetic variants and elucidation of their regulatory mechanisms represent significant challenges of the post-genomic era. A poorly understood topic is the involvement of genetic variants in mediating post-transcriptional RNA processing, including alternative splicing. Thus far, little is known about the genomic, evolutionary, and regulatory features of genetically modulated alternative splicing (GMAS). Here, we systematically identified intronic tag variants for genetic modulation of alternative splicing using RNA-seq data specific to cellular compartments. Combined with our previous method that identifies exonic tags for GMAS, this study yielded 622 GMAS exons. We observed that GMAS events are highly cell type independent, indicating that splicing-altering genetic variants could have widespread function across cell types. Interestingly, GMAS genes, exons, and single-nucleotide variants (SNVs) all demonstrated positive selection or accelerated evolution in primates. We predicted that GMAS SNVs often alter binding of splicing factors, with SRSF1 affecting the most GMAS events and demonstrating global allelic binding bias. However, in contrast to their GMAS targets, the predicted splicing factors are more conserved than expected, suggesting that cis-regulatory variation is the major driving force of splicing evolution. Moreover, GMAS-related splicing factors had stronger consensus motifs than expected, consistent with their susceptibility to SNV disruption. Intriguingly, GMAS SNVs in general do not alter the strongest consensus position of the splicing factor motif, except the more than 100 GMAS SNVs in linkage disequilibrium with polymorphisms reported by genome-wide association studies. Our study reports many GMAS events and enables a better understanding of the evolutionary and regulatory features of this phenomenon.
In mammals, small RNAs are important players in post-transcriptional gene regulation. While their roles in mRNA destabilization and translational repression are well appreciated, their involvement in endonucleolytic cleavage of target RNAs is poorly understood. Very few microRNAs are known to guide RNA cleavage. Endogenous small interfering RNAs are expected to induce target cleavage, but their target genes remain largely unknown. We report a systematic study of small RNA-mediated endonucleolytic cleavage in mouse through integrative analysis of small RNA and degradome sequencing data without imposing any bias toward known small RNAs. Hundreds of small cleavage-inducing RNAs and their cognate target genes were identified, significantly expanding the repertoire of known small RNA-guided cleavage events. Strikingly, both small RNAs and their target sites demonstrated significant overlap with retrotransposons, providing evidence for the long-standing speculation that retrotransposable elements in mRNAs are leveraged as signals for gene targeting. Furthermore, our analysis showed that the RNA cleavage pathway is also present in human cells but affects a different repertoire of retrotransposons. These results show that small RNA-guided cleavage is more widespread than previously appreciated. Its impact on retrotransposons in non-coding regions sheds light on important aspects of mammalian gene regulation.
Alternative splicing is an RNA processing mechanism that affects most genes in humans, contributing to disease mechanisms and phenotypic diversity. The regulation of splicing involves an intricate network of cis-regulatory elements and trans-acting factors. Because cis-regulatory elements have high sequence specificity, cis-regulation of splicing can be altered by genetic variants, significantly affecting splicing outcomes. Recently, multiple methods have been applied to understanding the regulatory effects of genetic variants on splicing. However, it is still challenging to go beyond apparent association to pinpoint functional variants. To fill this gap, we utilized large-scale data sets of the Genotype-Tissue Expression (GTEx) project to study genetically modulated alternative splicing (GMAS) via identification of allele-specific splicing events. We demonstrate that GMAS events are shared across tissues and individuals more often than expected by chance, consistent with their genetically driven nature. Moreover, although the allelic bias of GMAS exons varies across samples, the degree of variation is similar across tissues versus individuals. Thus, genetic background drives the GMAS pattern to a similar degree as tissue-specific splicing mechanisms. Leveraging the genetically driven nature of GMAS, we developed a new method to predict functional splicing-altering variants, built upon a genotype-phenotype concordance model across samples. Complemented by experimental validations, this method predicted >1000 functional variants, many of which may alter RNA-protein interactions. Lastly, 72% of GMAS-associated SNPs were in linkage disequilibrium with GWAS-reported SNPs, and such association was enriched in tissues of relevance for specific traits/diseases. Our study enables a comprehensive view of genetically driven splicing variations in human tissues.