The Rbfox family of splicing factors regulates alternative splicing during animal development and in disease, impacting thousands of exons in the maturing brain, heart, and muscle. Rbfox proteins have long been known to bind to the RNA sequence GCAUG with high affinity, but just half of Rbfox binding sites contain a GCAUG motif in vivo. We incubated recombinant RBFOX2 with over 60,000 mouse and human transcriptomic sequences to reveal substantial binding to several moderate-affinity, non-GCAYG sites at a physiologically relevant range of RBFOX concentrations. We find that many of these “secondary motifs” bind Rbfox robustly in cells and that several together can exert regulation comparable to GCAUG in a trichromatic splicing reporter assay. Furthermore, secondary motifs regulate RNA splicing in neuronal development and in neuronal subtypes where cellular Rbfox concentrations are highest, enabling a second wave of splicing changes as Rbfox levels increase.
Messenger RNA isoform differences are predominantly driven by alternative first, internal, and last exons. Despite the importance of classifying exons to understand isoform structure, few tools examine isoform-specific exon usage. We recently observed that alternative transcription start sites often arise near internal exons, creating “hybrid” first/internal exons. To systematically detect hybrid exons, we built the hybrid-internal-terminal (HIT) pipeline to classify exons depending on their isoform-specific usage. On the basis of splice junction reads in RNA sequencing data and probabilistic modeling, the HIT index identified thousands of previously misclassified hybrid first-internal and internal-last exons. Hybrid exons are enriched in long genes and genes involved in RNA splicing and have longer flanking introns and strong splice sites. Their usage varies considerably across human tissues. By developing the first method to classify exons according to isoform contexts, our findings document the occurrence of hybrid exons, a common quirk of the human transcriptome.
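As a rough illustration of how splice junction reads can separate first, internal, and last exons, the sketch below computes a simple signed ratio from the junction reads on each side of an exon. The function names, the ratio, and the cutoff are illustrative assumptions; this is not the published HIT index definition and it omits the probabilistic modeling mentioned above.

```python
def junction_balance(upstream_reads: int, downstream_reads: int) -> float:
    """Signed ratio in [-1, 1] (hypothetical score, not the published index).

    upstream_reads:   reads spanning junctions that join this exon to upstream exons
    downstream_reads: reads spanning junctions that join this exon to downstream exons
    -1 means only downstream junctions (first-exon-like),
    +1 means only upstream junctions (last-exon-like).
    """
    total = upstream_reads + downstream_reads
    if total == 0:
        raise ValueError("exon has no junction reads")
    return (upstream_reads - downstream_reads) / total


def classify_exon(upstream_reads: int, downstream_reads: int,
                  cutoff: float = 0.3) -> str:
    """Label an exon from its junction balance; the cutoff is an assumed value."""
    score = junction_balance(upstream_reads, downstream_reads)
    if score == -1.0:
        return "first"
    if score == 1.0:
        return "last"
    if score <= -cutoff:
        return "hybrid first-internal"
    if score >= cutoff:
        return "hybrid internal-last"
    return "internal"


if __name__ == "__main__":
    # Toy counts only; not data from the study.
    for up, down in [(0, 50), (10, 50), (50, 50), (50, 10), (50, 0)]:
        print(up, down, classify_exon(up, down))
```

An exon that is mostly spliced as internal but also starts some transcripts shows an excess of downstream junction reads, which is why an intermediate negative score is labeled as a hybrid first-internal exon in this sketch.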
Many non-coding variants associated with phenotypes occur in 3’ untranslated regions (3’ UTRs) and may affect interactions with RNA-binding proteins (RBPs) to regulate post-transcriptional gene expression. However, identifying functional 3’ UTR variants has proven difficult. We used allele frequencies from the Genome Aggregation Database (gnomAD) to identify classes of 3’ UTR variants under strong negative selection in humans. We developed intergenic mutability-adjusted proportion singleton (iMAPS), a generalized measure related to MAPS, to quantify negative selection in non-coding regions. This approach, in conjunction with in vitro and in vivo binding data, identifies precise RBP binding sites, miRNA target sites, and polyadenylation signals (PASs) under strong selection. For each class of sites, we identified thousands of gnomAD variants under selection comparable to missense coding variants, and found that sites in core 3’ UTR regions upstream of the most-used PAS are under strongest selection. Together, this work improves our understanding of selection on human genes and validates approaches for interpreting genetic variants in human 3’ UTRs.
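The general idea behind a mutability-adjusted proportion singleton can be sketched as follows: fit the expected singleton proportion as a function of mutability on a presumed-neutral calibration set, then report how far a variant class exceeds that expectation. The code is a simplified, unweighted sketch with hypothetical function names and toy numbers; the calibration set, weighting, and other details of iMAPS itself are not specified here and are assumptions.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept (unweighted)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def adjusted_proportion_singleton(
    variants: Sequence[Tuple[float, bool]], slope: float, intercept: float
) -> float:
    """Observed minus expected singletons, divided by the number of variants.

    variants: (mutability, is_singleton) pairs for one class of sites.
    The expected singleton probability of each variant comes from the
    calibration line; positive values suggest stronger negative selection.
    """
    observed = sum(1 for _, is_singleton in variants if is_singleton)
    expected = sum(slope * mutability + intercept for mutability, _ in variants)
    return (observed - expected) / len(variants)


if __name__ == "__main__":
    # Toy calibration: singleton proportion falls as mutability rises.
    slope, intercept = fit_line([0.1, 0.2, 0.3], [0.6, 0.5, 0.4])
    # Toy variant class: 3 of 4 variants are singletons at mutability 0.2.
    site_class = [(0.2, True), (0.2, True), (0.2, True), (0.2, False)]
    print(adjusted_proportion_singleton(site_class, slope, intercept))
```

Adjusting for mutability matters because highly mutable sites recur in more individuals and so appear as singletons less often, independent of selection.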
Human LUC7 family proteins associate with the U1 small nuclear ribonucleoprotein (snRNP) complex. Mutation or deletion of LUC7L2 is associated with myeloid neoplasms, and depletion of LUC7L2 alters cellular metabolism. Here, we describe distinctive 5' splice site (5'SS) features of exons impacted by each of the three human LUC7s. We find that LUC7L2 and LUC7L enhance splicing of 'right-handed' 5'SS with stronger consensus matching on the intron side of the near-invariant /GU, while LUC7L3 preferentially enhances splicing of 'left-handed' 5'SS with stronger consensus matching on the exon side of the splice junction. Specificity for right- or left-handed 5'SS is conferred by the distinct structured N-terminal domains of LUC7L2 and LUC7L3. Evolutionary analysis shows that divergence of LUC7L3 and LUC7L2 subfamilies occurred prior to the divergence of plants from animals/fungi, and suggests that loss of the LUC7L3 subfamily from the fungal lineage contributed to the predominance of right-handed 5'SS in fungi.