The RNA recognition motif (RRM), also known as the RNA-binding domain (RBD) or ribonucleoprotein domain (RNP), was first identified in the late 1980s when it was demonstrated that mRNA precursors (pre-mRNA) and heterogeneous nuclear RNAs (hnRNAs) are always found in complex with proteins (reviewed in [1]). Biochemical characterizations of the mRNA polyadenylate binding protein (PABP) and the hnRNP protein C shed light on a consensus RNA-binding domain of approximately 90 amino acids containing a central sequence of eight conserved residues that are mainly aromatic and positively charged [2,3]. This sequence, termed the RNP consensus sequence, was thought to be involved in RNA interaction and was defined as Lys/Arg-Gly-Phe/Tyr-Gly/Ala-Phe/Tyr-Val/Ile/Leu-X-Phe/Tyr, where X can be any amino acid. Later, a second consensus sequence less conserved than the previously characterized one [1]

The RNA recognition motif (RRM), also known as the RNA-binding domain (RBD) or ribonucleoprotein domain (RNP), is one of the most abundant protein domains in eukaryotes. Based on the comparison of more than 40 structures including 15 complexes (RRM-RNA or RRM-protein), we reviewed the structure-function relationships of this domain. We identified and classified the different structural elements of the RRM that are important for binding a multitude of RNA sequences and proteins. Common structural aspects were extracted that allowed us to define a structural leitmotif of the RRM-nucleic acid interface with its variations. Outside of the two conserved RNP motifs that lie in the center of the RRM β-sheet, the two external β-strands, the loops, the C- and N-termini, or even a second RRM domain allow high RNA-binding affinity and specific recognition.
Protein-RRM interactions that have been found in several structures reinforce the notion of an extreme structural versatility of this domain supporting the numerous biological functions of the RRM-containing proteins.
Abbreviations: ACF, APOBEC-1 complementary factor; CBP, cap binding protein; CstF, cleavage stimulation factor; hnRNP, heterogeneous nuclear ribonucleoprotein; HuD, Hu protein D; LRR, leucine rich repeat; MIF4G, middle domain of the translation initiation factor 4G; PABP, polyadenylate binding protein; PIE, polyadenylation inhibition element; PTB, polypyrimidine tract binding protein; RBD, RNA-binding domain; RNP, ribonucleoprotein; RRM, RNA recognition motif; SR, serine/arginine rich proteins; TLS, translocated in liposarcoma; U1A, U2A′, U2B′: U1 snRNP proteins A, A′, B′; U2AF, U2 snRNP auxiliary factor; UHM, U2AF homology motif; UPF, up-frameshift protein.
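The eight-residue RNP consensus sequence quoted above (Lys/Arg-Gly-Phe/Tyr-Gly/Ala-Phe/Tyr-Val/Ile/Leu-X-Phe/Tyr) translates directly into a regular expression over one-letter amino acid codes. The following is a minimal sketch of such a scan, not a method taken from the cited papers; the function name `find_rnp_consensus` and the example peptide are hypothetical, and a strict pattern match like this would miss the degenerate variants that real RRMs often carry.

```python
import re

# RNP consensus as written in the text, in one-letter codes:
# [K/R]-G-[F/Y]-[G/A]-[F/Y]-[V/I/L]-X-[F/Y], where X is any residue.
# The lookahead wrapper lets overlapping matches be reported as well.
RNP_CONSENSUS = re.compile(r"(?=([KR]G[FY][GA][FY][VIL][A-Z][FY]))")


def find_rnp_consensus(protein_seq):
    """Return (0-based start, matched octapeptide) for each consensus hit."""
    seq = protein_seq.upper()
    return [(m.start(), m.group(1)) for m in RNP_CONSENSUS.finditer(seq)]


if __name__ == "__main__":
    # Made-up peptide for illustration only, not a real protein sequence.
    example = "MSKGFGFVTFAA"
    for start, octamer in find_rnp_consensus(example):
        print(start, octamer)
```

A scan of this kind only flags candidate motifs by sequence; whether a hit sits in the central β-strands of a folded RRM still has to be checked against structural or domain-annotation data.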
SUMMARY Mutations affecting spliceosomal proteins are the most common class of mutations in patients with myelodysplastic syndromes (MDS), yet their role in MDS pathogenesis has not been delineated. Here we report that mutations affecting the splicing factor SRSF2 directly impair hematopoietic differentiation in vivo, which is not due to SRSF2 loss of function. By contrast, SRSF2 mutations alter SRSF2’s normal sequence-specific RNA binding activity, thereby altering recognition of specific exonic splicing enhancer motifs to drive recurrent mis-splicing of key hematopoietic regulators. This includes SRSF2 mutation-dependent splicing of EZH2 that triggers nonsense-mediated decay, which, in turn, results in impaired hematopoietic differentiation. These data provide a mechanistic link between a mutant spliceosomal protein, alterations in splicing of key regulators, and impaired hematopoiesis.
The polypyrimidine tract binding protein (PTB) is a 58-kilodalton RNA binding protein involved in multiple aspects of messenger RNA metabolism, including the repression of alternative exons. We have determined the solution structures of the four RNA binding domains (RBDs) of PTB, each bound to a CUCUCU oligonucleotide. Each RBD binds RNA with a different binding specificity. RBD3 and RBD4 interact, resulting in an antiparallel orientation of their bound RNAs. Thus, PTB will induce RNA looping when bound to two separated pyrimidine tracts within the same RNA. This leads to structural models for how PTB functions as an alternative-splicing repressor.
TDP-43 encodes an alternative-splicing regulator with tandem RNA-recognition motifs (RRMs). The protein regulates cystic fibrosis transmembrane conductance regulator (CFTR) exon 9 splicing through binding to long UG-rich RNA sequences and is found in cytoplasmic inclusions of several neurodegenerative diseases. We solved the solution structure of the TDP-43 RRMs in complex with UG-rich RNA. Ten nucleotides are bound by both RRMs, and six are recognized sequence specifically. Among these, a central G interacts with both RRMs and stabilizes a new tandem RRM arrangement. Mutations that eliminate recognition of this key nucleotide or crucial inter-RRM interactions disrupt RNA binding and TDP-43-dependent splicing regulation. In contrast, point mutations that affect base-specific recognition in either RRM have weaker effects. Our findings reveal not only how TDP-43 recognizes UG repeats but also how RNA binding-dependent inter-RRM interactions are crucial for TDP-43 function.
The Fox-1 protein regulates alternative splicing of tissue-specific exons by binding to GCAUG elements. Here, we report the solution structure of the Fox-1 RNA binding domain (RBD) in complex with UGCAUGU. The last three nucleotides, UGU, are recognized in a canonical way by the four-stranded β-sheet of the RBD. In contrast, the first four nucleotides, UGCA, are bound by two loops of the protein in an unprecedented manner. Nucleotides U1, G2, and C3 are wrapped around a single phenylalanine, while G2 and A4 form a base-pair. This novel RNA binding site is independent from the β-sheet binding interface. Surface plasmon resonance analyses were used to quantify the energetic contributions of electrostatic and hydrogen bond interactions to complex formation and support our structural findings. These results demonstrate the unusual molecular mechanism of sequence-specific RNA recognition by Fox-1, which is exceptional in its high affinity for a defined but short sequence element.
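Because Fox-1 recognizes a short, defined element, candidate binding sites can be located with a plain substring search. The sketch below is an illustration under stated assumptions rather than anything described in the abstract: the function name `find_elements` and the second test sequence are hypothetical, and an exact-match search says nothing about affinity or structural context.

```python
def find_elements(rna, element="GCAUG"):
    """Return 0-based start positions of every occurrence of `element`.

    The input is upper-cased and T is converted to U, so DNA-style
    sequences are accepted as well (an assumption made for convenience).
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    start = rna.find(element)
    while start != -1:
        hits.append(start)
        # Step forward by one so overlapping occurrences are not skipped.
        start = rna.find(element, start + 1)
    return hits


if __name__ == "__main__":
    # The heptamer used in the reported complex contains one GCAUG element.
    print(find_elements("UGCAUGU"))
    # Made-up sequence with two elements, for illustration only.
    print(find_elements("AAUGCAUGUCCGCAUGAA"))
```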
A code predicting the RNA sequence that will be bound by a certain protein based on its amino acid sequence or its structure would provide a useful tool for the design of RNA binders with desired sequence-specificity. Such de novo designed RNA binders could be of extraordinary use in both medical and basic research applications. Furthermore, a code could help to predict the cellular functions of RNA-binding proteins that have not yet been extensively studied. A comparative analysis of Pumilio homology domains, zinc-containing RNA binders, hnRNP K homology domains and RNA recognition motifs is performed in this review. Based on this, a set of binding rules is proposed that hints towards a code for RNA recognition by these domains. Furthermore, we discuss the intermolecular interactions that are important for RNA binding and summarize their importance in providing affinity and specificity.