SUMMARY In addition to sculpting eukaryotic transcripts by removing introns, pre-mRNA splicing greatly impacts protein composition of the emerging mRNP. The exon junction complex (EJC), deposited upstream of exon-exon junctions after splicing, is a major constituent of spliced mRNPs. Here we report comprehensive analysis of the endogenous human EJC protein and RNA interactomes. We confirm that the major “canonical” EJC occupancy site in vivo lies 24 nucleotides upstream of exon junctions and that the majority of exon junctions carry an EJC. Unexpectedly, we find that endogenous EJCs multimerize with one another and with numerous SR proteins to form megadalton-sized complexes in which SR proteins are super-stoichiometric to EJC core factors. This tight physical association may explain known functional parallels between EJCs and SR proteins. Further, their protection of long mRNA stretches from nuclease digestion suggests that endogenous EJCs and SR proteins cooperate to promote mRNA packaging and compaction.
TAR DNA-binding protein 43 (TDP-43) is associated with a spectrum of neurodegenerative diseases. Although TDP-43 resembles heterogeneous nuclear ribonucleoproteins, its RNA targets and physiological protein partners remain unknown. Here we identify RNA targets of TDP-43 from cortical neurons by RNA immunoprecipitation followed by deep sequencing (RIP-seq). The canonical TDP-43 binding site (TG)n is 55.1-fold enriched, and moreover, a variant with adenine in the middle, (TG)nTA(TG)m, is highly abundant among reads in our TDP-43 RIP-seq library. TDP-43 RNA targets can be divided into three different groups: those primarily binding in introns, in exons, and across both introns and exons. TDP-43 RNA targets are particularly enriched for Gene Ontology terms related to synaptic function, RNA metabolism, and neuronal development. Furthermore, TDP-43 binds to a number of RNAs encoding proteins implicated in neurodegeneration, including TDP-43 itself, FUS/TLS, progranulin, Tau, and ataxin-1 and -2. We also identify 25 proteins that co-purify with TDP-43 from rodent brain nuclear extracts. Prominent among them are nuclear proteins involved in pre-mRNA splicing and RNA stability and transport. Also notable are two neuron-enriched proteins, methyl CpG-binding protein 2 and polypyrimidine tract-binding protein 2 (PTBP2). A PTBP2 consensus RNA binding motif is enriched in the TDP-43 RIP-seq library, suggesting that PTBP2 may co-regulate TDP-43 RNA targets. This work thus reveals the protein and RNA components of the TDP-43-containing ribonucleoprotein complexes and provides a framework for understanding how dysregulation of TDP-43 in RNA metabolism contributes to neurodegeneration.
The FuncAssociate web application is freely available to all users at http://llama.med.harvard.edu/funcassociate.
Human Staufen1 (Stau1) is a double-stranded RNA (dsRNA)-binding protein implicated in multiple post-transcriptional gene-regulatory processes. Here we combined RNA immunoprecipitation in tandem (RIPiT) with RNase footprinting, formaldehyde cross-linking, sonication-mediated RNA fragmentation and deep sequencing to map Staufen1-binding sites transcriptome wide. We find that Stau1 binds complex secondary structures containing multiple short helices, many of which are formed by inverted Alu elements in annotated 3′ untranslated regions (UTRs) or in 'strongly distal' 3′ UTRs. Stau1 also interacts with actively translating ribosomes and with mRNA coding sequences (CDSs) and 3′ UTRs in proportion to their GC content and propensity to form internal secondary structure. On mRNAs with high CDS GC content, higher Stau1 levels lead to greater ribosome densities, thus suggesting a general role for Stau1 in modulating translation elongation through structured CDS regions. Our results also indicate that Stau1 regulates translation of transcription-regulatory proteins. Staufen proteins are highly conserved dsRNA-binding proteins (dsRBPs) found in most bilateral animals¹. Mammals contain two Staufen paralogs encoded by different loci.
Elucidating the consequences of genetic differences between humans is essential for understanding phenotypic diversity and personalized medicine. Although variation in RNA levels, transcription factor binding, and chromatin have been explored, little is known about global variation in translation and its genetic determinants. We used ribosome profiling, RNA sequencing, and mass spectrometry to perform an integrated analysis in lymphoblastoid cell lines from a diverse group of individuals. We find significant differences in RNA, translation, and protein levels suggesting diverse mechanisms of personalized gene expression control. Combined analysis of RNA expression and ribosome occupancy improves the identification of individual protein level differences. Finally, we identify genetic differences that specifically modulate ribosome occupancy; many of these differences lie close to start codons and upstream ORFs. Our results reveal a new level of gene expression variation among humans and indicate that genetic variants can cause changes in protein levels through effects on translation.
In higher eukaryotes, messenger RNAs (mRNAs) are exported from the nucleus to the cytoplasm via factors deposited near the 5′ end of the transcript during splicing. The signal sequence coding region (SSCR) can support an alternative mRNA export (ALREX) pathway that does not require splicing. However, most SSCR–containing genes also have introns, so the interplay between these export mechanisms remains unclear. Here we support a model in which the furthest upstream element in a given transcript, be it an intron or an ALREX–promoting SSCR, dictates the mRNA export pathway used. We also experimentally demonstrate that nuclear-encoded mitochondrial genes can use the ALREX pathway. Thus, ALREX can also be supported by nucleotide signals within mitochondrial-targeting sequence coding regions (MSCRs). Finally, we identified and experimentally verified novel motifs associated with the ALREX pathway that are shared by both SSCRs and MSCRs. Our results show strong correlation between 5′ untranslated region (5′UTR) intron presence/absence and sequence features at the beginning of the coding region. They also suggest that genes encoding secretory and mitochondrial proteins share a common regulatory mechanism at the level of mRNA export.
Although introns in 5′- and 3′-untranslated regions (UTRs) are found in many protein coding genes, rarely are they considered distinctive entities with specific functions. Indeed, mammalian transcripts with 3′-UTR introns are often assumed nonfunctional because they are subject to elimination by nonsense-mediated decay (NMD). Nonetheless, recent findings indicate that 5′- and 3′-UTR intron status is of significant functional consequence for the regulation of mammalian genes. Therefore these features should be ignored no longer.
Cancer sequencing studies have primarily identified cancer-driver genes by the accumulation of protein-altering mutations. An improved method would be annotation-independent, sensitive to unknown distributions of functions within proteins, and inclusive of non-coding drivers. We employed density-based clustering methods in 21 tumor types to detect variably-sized significantly mutated regions (SMRs). SMRs reveal recurrent alterations across a spectrum of coding and non-coding elements, including transcription factor binding sites and untranslated regions mutated in up to ∼15% of specific tumor types. SMRs reveal spatial clustering of mutations at molecular domains and interfaces, often with associated changes in signaling. Mutation frequencies in SMRs demonstrate that distinct protein regions are differentially mutated among tumor types, as exemplified by a linker region of PIK3CA in which biophysical simulations suggest mutations affect regulatory interactions. The functional diversity of SMRs underscores both the varied mechanisms of oncogenic misregulation and the advantage of functionally-agnostic driver identification.