Despite a growing number of splicing mutations found in hereditary diseases, utilization of aberrant splice sites and their effects on gene expression remain challenging to predict. We compiled sequences of 346 aberrant 5′ splice sites (5′ss) that were activated by mutations in 166 human disease genes. Mutations within the 5′ss consensus accounted for 254 cryptic 5′ss and mutations elsewhere activated 92 de novo 5′ss. Point mutations leading to cryptic 5′ss activation were most common in the first intron nucleotide, followed by the fifth nucleotide. Substitutions at position +5 were exclusively G>A transitions, which was largely attributable to high mutability rates of C/G>T/A. However, the frequency of point mutations at position +5 was significantly higher than that observed in the Human Gene Mutation Database, suggesting that alterations of this position are particularly prone to aberrant splicing, possibly due to a requirement for sequential interactions with U1 and U6 snRNAs. Cryptic 5′ss were best predicted by computational algorithms that accommodate nucleotide dependencies and not by weight-matrix models. Discrimination of intronic 5′ss from their authentic counterparts was less effective than for exonic sites, as the former were intrinsically stronger than the latter. Computational prediction of exonic de novo 5′ss was poor, suggesting that their activation critically depends on exonic splicing enhancers or silencers. The authentic counterparts of aberrant 5′ss were significantly weaker than the average human 5′ss. The development of an online database of aberrant 5′ss will be useful for studying basic mechanisms of splice-site selection, identifying splicing mutations and optimizing splice-site prediction algorithms.
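To illustrate what a weight-matrix model of a 5′ss looks like, the following is a minimal sketch, not the scoring scheme used in the study. It assumes a 9-nucleotide site spanning positions -3 to +6, a uniform background of 0.25 per base, and a handful of invented training sequences (`toy_sites`); the function names are hypothetical. Because each position is scored independently, a matrix of this kind cannot capture the nucleotide dependencies that the abstract reports as important for predicting cryptic 5′ss.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0):
    """Build a log-odds weight matrix from equal-length site sequences.

    Each column is scored independently against a uniform background (0.25).
    """
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / 0.25) for b in BASES})
    return pwm

def score_site(site, pwm):
    """Sum the per-position log-odds scores of a candidate site."""
    return sum(column[base] for base, column in zip(site, pwm))

# Invented 9-mers (positions -3..+6) used only to make the sketch runnable;
# they are NOT data from the study.
toy_sites = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "TAGGTAAGT", "CAGGTGAGG"]
pwm = build_pwm(toy_sites)

authentic = score_site("CAGGTAAGT", pwm)
mutant = score_site("CAGGTAAAT", pwm)  # G>A substitution at position +5
print(f"authentic: {authentic:.2f}  +5 G>A mutant: {mutant:.2f}")
```

With these toy inputs the +5 G>A substitution lowers the score of the site, which is the kind of weakening that can shift splicing to a nearby cryptic 5′ss.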
We show that the allele-dependent expression of transcripts encoding soluble HLA-DQbeta chains is determined by branchpoint sequence (BPS) haplotypes in DQB1 intron 3. BPS RNAs associated with low inclusion of the transmembrane exon in mature transcripts showed impaired binding to splicing factor 1 (SF1), indicating that alternative splicing of DQB1 is controlled by differential BPS recognition early during spliceosome assembly. We also demonstrate that naturally occurring human BPS point mutations that alter splicing and lead to recognizable phenotypes cluster in BP and in position -2 relative to BP, implicating impaired SF1-BPS interactions in disease-associated BPS substitutions. Coding DNA variants produced smaller fluctuations of exon inclusion levels than random exonic substitutions, consistent with a selection against coding mutations that alter their own exonization. Finally, proximal splicing in this multi-allelic reporter system was promoted by at least seven SR proteins and repressed by hnRNPs F, H and I, supporting an extensive antagonism of factors balancing the splice site selection. These results provide the molecular basis for the haplotype-specific expression of soluble DQbeta, improve prediction of intronic point mutations and indicate how extraordinary, selection-driven DNA variability in HLA affects pre-mRNA splicing.
Missense, nonsense and translationally silent mutations can inactivate genes by altering the inclusion of mutant exons in messenger RNA, but their overall fraction among disease-causing exonic substitutions is unknown. Here, we have systematically tested missense and silent mutations deposited in the BRCA1 mutation databases of unclassified variants for their effects on exon inclusion in the mRNA experimentally. The introduction of 21 BRCA1 variants in two minigene systems revealed a single example of
The auxiliary factor of U2 small nuclear RNA (U2AF) is a heterodimer consisting of 65- and 35-kD proteins that bind the polypyrimidine tract (PPT) and AG dinucleotides at the 3′ splice site (3′ss). The gene encoding U2AF35 (U2AF1) is alternatively spliced, giving rise to two isoforms U2AF35a and U2AF35b. Here, we knocked down U2AF35 and each isoform and characterized transcriptomes of HEK293 cells with varying U2AF35/U2AF65 and U2AF35a/b ratios. Depletion of both isoforms preferentially modified alternative RNA processing events without widespread failure to recognize 3′ss or constitutive exons. Over a third of differentially used exons were terminal, resulting largely from the use of known alternative polyadenylation (APA) sites. Intronic APA sites activated in depleted cultures were mostly proximal whereas tandem 3′UTR APA was biased toward distal sites. Exons upregulated in depleted cells were preceded by longer AG exclusion zones and PPTs than downregulated or control exons and were largely activated by PUF60 and repressed by CAPERα. The U2AF35 repression and activation was associated with a significant interchange in the average probabilities to form single-stranded RNA in the optimal PPT and branch site locations and sequences further upstream. Although most differentially used exons were responsive to both U2AF subunits and their inclusion correlated with U2AF levels, a small number of transcripts exhibited distinct responses to U2AF35a and U2AF35b, supporting the existence of isoform-specific interactions. These results provide new insights into the function of U2AF and U2AF35 in alternative RNA processing.
Genetic predisposition to type 1 diabetes (T1D) has been associated with a chromosome 11 locus centered on the proinsulin gene (INS) and with differential steady-state levels of INS RNA from T1D-predisposing and -protective haplotypes. Here, we show that the haplotype-specific expression is determined by INS variants that control the splicing efficiency of intron 1. The adenine allele at IVS1-6 (rs689), which rapidly expanded in modern humans, renders the 3′ splice site of this intron more dependent on the auxiliary factor of U2 small nuclear ribonucleoprotein (U2AF). This interaction required both zinc fingers of the 35-kD U2AF subunit (U2AF35) and was associated with repression of a competing 3′ splice site in INS exon 2. Systematic mutagenesis of reporter constructs showed that intron 1 removal was facilitated by conserved guanosine-rich enhancers and identified additional splicing regulatory motifs in exon 2. Sequencing of intron 1 in primates revealed that relaxation of its 3′ splice site in Hominidae coevolved with the introduction of a short upstream open reading frame, providing a more efficient coupled splicing and translation control. Depletion of SR proteins 9G8 and transformer-2 by RNA interference was associated with exon 2 skipping whereas depletion of SRp20 with increased representation of transcripts containing a cryptic 3′ splice site in the last exon. Together, these findings reveal critical interactions underlying the allele-dependent INS expression and INS-mediated risk of T1D and suggest that the increased requirement for U2AF35 in higher primates may hinder thymic presentation of autoantigens encoded by transcripts with weak 3′ splice sites. Electronic supplementary material: The online version of this article (doi:10.1007/s00439-010-0860-1) contains supplementary material, which is available to authorized users.
Auxiliary splicing signals play a major role in the regulation of constitutive and alternative pre-mRNA splicing, but their relative importance in selection of mutation-induced cryptic or de novo splice sites is poorly understood. Here, we show that exonic sequences between authentic and aberrant splice sites that were activated by splice-site mutations in human disease genes have lower frequencies of splicing enhancers and higher frequencies of splicing silencers than average exons. Conversely, sequences between authentic and intronic aberrant splice sites have more enhancers and fewer silencers than average introns. Exons that were skipped as a result of splice-site mutations were smaller, had lower SF2/ASF motif scores, a decreased availability of decoy splice sites and a higher density of silencers than exons in which splice-site mutation activated cryptic splice sites. These four variables were the strongest predictors of the two aberrant splicing events in a logistic regression model. Elimination or weakening of predicted silencers in two reporters consistently promoted use of intron-proximal splice sites if these elements were maintained at their original positions, with their modular combinations producing expected modification of splicing. Together, these results show the existence of a gradient in exon and intron definition at the level of pre-mRNA splicing and provide a basis for the development of computational tools that predict aberrant splicing outcomes.
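The form of a four-variable logistic model of this kind can be sketched as follows. This is an illustration only: the abstract does not report coefficients, so every value in `COEFFS` is hypothetical and was chosen solely to match the reported direction of each effect (skipped exons were smaller, had lower SF2/ASF scores, fewer decoy sites and a higher silencer density). The function and parameter names are likewise assumptions.

```python
import math

# HYPOTHETICAL coefficients: only the signs follow the abstract; the
# magnitudes are invented so that the example runs.
COEFFS = {
    "intercept": 0.0,
    "exon_length_nt": -0.01,    # larger exons: skipping less likely
    "sf2_asf_score": -0.5,      # higher SF2/ASF motif score: less likely
    "decoy_site_count": -0.4,   # more decoy splice sites: less likely
    "silencer_density": 8.0,    # denser silencers: more likely
}

def p_exon_skipping(exon_length_nt, sf2_asf_score, decoy_site_count, silencer_density):
    """Probability of exon skipping (versus cryptic splice-site activation)."""
    z = (
        COEFFS["intercept"]
        + COEFFS["exon_length_nt"] * exon_length_nt
        + COEFFS["sf2_asf_score"] * sf2_asf_score
        + COEFFS["decoy_site_count"] * decoy_site_count
        + COEFFS["silencer_density"] * silencer_density
    )
    return 1.0 / (1.0 + math.exp(-z))

# Two invented exons: a small, silencer-rich exon and a large, enhancer-rich one.
small_exon = p_exon_skipping(60, 1.0, 0, 0.10)
large_exon = p_exon_skipping(250, 3.0, 3, 0.02)
print(f"small exon: {small_exon:.3f}  large exon: {large_exon:.3f}")
```

In practice the coefficients would be estimated from a curated set of skipped and cryptic-site exons rather than set by hand.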
PUF60 is a splicing factor that binds uridine (U)-rich tracts and facilitates association of the U2 small nuclear ribonucleoprotein with primary transcripts. PUF60 deficiency (PD) causes a developmental delay coupled with intellectual disability and spinal, cardiac, ocular and renal defects, but PD pathogenesis is not understood. Using RNA-Seq, we identify human PUF60-regulated exons and show that PUF60 preferentially acts as their activator. PUF60-activated internal exons are enriched for Us upstream of their 3′ splice sites (3′ss), are preceded by longer AG dinucleotide exclusion zones and more distant branch sites, with a higher probability of unpaired interactions across a typical branch site location as compared to control exons. In contrast, PUF60-repressed exons show U-depletion with lower estimates of RNA single-strandedness. We also describe PUF60-regulated, alternatively spliced isoforms encoding other U-bound splicing factors, including PUF60 partners, suggesting that they are co-regulated in the cell, and identify PUF60-regulated exons derived from transposed elements. PD-associated amino-acid substitutions, even within a single RNA recognition motif (RRM), altered selection of competing 3′ss and branch points of a PUF60-dependent exon and the 3′ss choice was also influenced by alternative splicing of PUF60. Finally, we propose that differential distribution of RNA processing steps detected in cells lacking PUF60 and the PUF60-paralog RBM39 is due to the RBM39 RS domain interactions. Together, these results provide new insights into regulation of exon usage by the 3′ss organization and reveal that germline mutation heterogeneity in RRMs can enhance phenotypic variability at the level of splice-site and branch-site selection.
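The AG dinucleotide exclusion zone mentioned above can be measured with a few lines of code. The sketch below assumes one simple operational definition, which may differ from the one used in the study: the number of nucleotides between the 3′ss AG and the nearest upstream AG in the intron. The function name and the example sequences are hypothetical.

```python
def ag_exclusion_zone_length(intron):
    """Length of the AG-free stretch immediately upstream of the 3'ss AG.

    `intron` is the intron sequence (DNA or RNA letters) ending with the
    3' splice site AG. Assumed definition: distance from the 3'ss AG back
    to the closest upstream AG dinucleotide.
    """
    seq = intron.upper().replace("U", "T")
    if not seq.endswith("AG"):
        raise ValueError("intron must end with the 3' splice site AG")
    upstream = seq[:-2]               # drop the 3'ss AG itself
    last_ag = upstream.rfind("AG")    # nearest AG upstream of the 3'ss
    if last_ag == -1:
        return len(upstream)          # no upstream AG in the given sequence
    return len(upstream) - (last_ag + 2)

# Invented sequences for illustration only.
short_zone = ag_exclusion_zone_length("CCAGTTTTCTTTCAG")
no_upstream_ag = ag_exclusion_zone_length("uuuucag")
print(short_zone, no_upstream_ag)
```

A longer value means that more sequence upstream of the 3′ss is free of competing AG dinucleotides, which is the feature the abstract associates with PUF60-activated exons.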
Selective IgA deficiency (IgAD) and common variable immunodeficiency (CVID) are the most common primary immunodeficiencies in humans. A high degree of familial clustering, marked differences in the population prevalence among ethnic groups, association of IgAD and CVID in families, and a predominant inheritance pattern in multiple-case pedigrees have suggested a strong, shared genetic predisposition. Previous genetic linkage, case-control, and family-based association studies mapped an IgAD/CVID susceptibility locus, designated IGAD1, to the MHC, but its precise location within the MHC has been controversial. We have analyzed a sample of 101 multiple- and 110 single-case families using 36 markers at the IGAD1 candidate region and mapped homozygous stretches across the MHC shared by affected family members. Haplotype analysis, linkage disequilibrium, and homozygosity mapping indicated that HLA-DQ/DR is the major IGAD1 locus, strongly suggesting the autoimmune pathogenesis of IgAD/CVID. This is supported by the highest excess of allelic sharing at 6p in the genome-wide linkage analysis of 101 IgAD/CVID families using 383 marker loci, by previously reported restrictions of the T cell repertoires in CVID, the presence of autoantibodies, impaired T cell activation, and a dysregulation of a number of genes in the targeted immune system. IgAD/CVID may thus provide a useful model for the study of pathogenesis and novel therapeutic strategies in autoimmune diseases.