The structure determination of protein-protein complexes is a rather tedious and lengthy process, by both NMR and X-ray crystallography. Several methods based on docking to study protein complexes have also been well developed over the past few years. Most of these approaches are not driven by experimental data but are based on a combination of energetics and shape complementarity. Here, we present an approach called HADDOCK (High Ambiguity Driven protein-protein Docking) that makes use of biochemical and/or biophysical interaction data such as chemical shift perturbation data resulting from NMR titration experiments or mutagenesis data. This information is introduced as Ambiguous Interaction Restraints (AIRs) to drive the docking process. An AIR is defined as an ambiguous distance between all residues shown to be involved in the interaction. The accuracy of our approach is demonstrated with three molecular complexes. For two of these complexes, for which both the complex and the free protein structures have been solved, NMR titration data were available. Mutagenesis data were used in the last example. In all cases, the best structures generated by HADDOCK, that is, the structures with the lowest intermolecular energies, were the closest to the published structure of the respective complexes (within 2.0 Å backbone RMSD).
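The abstract above does not state how an ambiguous distance is evaluated. As a minimal sketch, assuming the r^-6 sum averaging commonly used for ambiguous distance restraints, the effective distance between one residue and a set of candidate partner residues could be computed as follows. The function name `effective_distance` and the coordinate layout are hypothetical and are not taken from the HADDOCK code.

```python
import math
from typing import Sequence

Point = Sequence[float]  # (x, y, z) coordinates in angstroms


def effective_distance(atoms_a: Sequence[Point], atoms_b: Sequence[Point]) -> float:
    """Return an effective distance over all atom pairs (a, b).

    Assumption: r^-6 sum averaging, d_eff = (sum over pairs of d^-6)^(-1/6).
    atoms_a holds the atoms of one residue on the first molecule;
    atoms_b holds the atoms of every candidate residue on the partner.
    """
    if not atoms_a or not atoms_b:
        raise ValueError("both atom lists must be non-empty")
    inv6_sum = sum(
        math.dist(a, b) ** -6
        for a in atoms_a
        for b in atoms_b
    )
    return inv6_sum ** (-1.0 / 6.0)
```

Under this averaging the effective distance is never larger than the shortest individual distance, so a restraint of this kind is satisfied as soon as any one of the candidate contacts is formed. That is what makes it ambiguous rather than pairwise.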
The RNA recognition motif (RRM), also known as the RNA-binding domain (RBD) or ribonucleoprotein domain (RNP), was first identified in the late 1980s when it was demonstrated that mRNA precursors (pre-mRNA) and heterogeneous nuclear RNAs (hnRNAs) are always found in complex with proteins (reviewed in [1]). Biochemical characterizations of the mRNA polyadenylate binding protein (PABP) and the hnRNP protein C shed light on a consensus RNA-binding domain of approximately 90 amino acids containing a central sequence of eight conserved residues that are mainly aromatic and positively charged [2,3]. This sequence, termed the RNP consensus sequence, was thought to be involved in RNA interaction and was defined as Lys/Arg-Gly-Phe/Tyr-Gly/Ala-Phe/Tyr-Val/Ile/Leu-X-Phe/Tyr, where X can be any amino acid. Later, a second consensus sequence less conserved than the previously characterized one [1]
The RNA recognition motif (RRM), also known as RNA-binding domain (RBD) or ribonucleoprotein domain (RNP), is one of the most abundant protein domains in eukaryotes. Based on the comparison of more than 40 structures including 15 complexes (RRM-RNA or RRM-protein), we reviewed the structure-function relationships of this domain. We identified and classified the different structural elements of the RRM that are important for binding a multitude of RNA sequences and proteins. Common structural aspects were extracted that allowed us to define a structural leitmotif of the RRM-nucleic acid interface with its variations. Outside of the two conserved RNP motifs that lie in the center of the RRM β-sheet, the two external β-strands, the loops, the C- and N-termini, or even a second RRM domain allow high RNA-binding affinity and specific recognition. Protein-RRM interactions that have been found in several structures reinforce the notion of an extreme structural versatility of this domain, supporting the numerous biological functions of the RRM-containing proteins.
Abbreviations: ACF, APOBEC-1 complementary factor; CBP, cap binding protein; CstF, cleavage stimulation factor; hnRNP, heterogeneous nuclear ribonucleoprotein; HuD, Hu protein D; LRR, leucine rich repeat; MIF4G, middle domain of the translation initiation factor 4 G; PABP, polyadenylate binding protein; PIE, polyadenylation inhibition element; PTB, polypyrimidine tract binding protein; RBD, RNA-binding domain; RNP, ribonucleoprotein; RRM, RNA recognition motif; SR, serine/arginine rich proteins; TLS, translocated in liposarcoma; U1A, U2A′, U2B′: U1 snRNP proteins A, A′, B′; U2AF, U2 snRNP auxiliary factor; UHM, U2AF homology motif; UPF, up-frameshift protein.
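The eight-residue RNP consensus sequence quoted above translates directly into a regular expression over one-letter amino acid codes. The sketch below is illustrative only: the function name `find_rnp1` and the example sequence are hypothetical, and real RRM annotation relies on profile-based methods rather than a strict pattern match.

```python
import re
from typing import List, Tuple

# Lys/Arg-Gly-Phe/Tyr-Gly/Ala-Phe/Tyr-Val/Ile/Leu-X-Phe/Tyr in one-letter codes;
# "." stands for X (any amino acid).
RNP_CONSENSUS = re.compile(r"[KR]G[FY][GA][FY][VIL].[FY]")


def find_rnp1(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched octamer) for each non-overlapping hit."""
    return [
        (m.start(), m.group())
        for m in RNP_CONSENSUS.finditer(sequence.upper())
    ]


# Hypothetical sequence fragment, used only to exercise the pattern.
hits = find_rnp1("MSKRGFGFVTFEN")
```

A strict pattern like this will miss divergent RRMs, which is consistent with the review's point that much of the binding specificity comes from elements outside the conserved RNP motifs.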
N6A methylation is the most abundant RNA modification occurring within messenger RNA. Impairment of methylase or demethylase function is associated with severe phenotypes and diseases in several organisms. Besides the writer and eraser enzymes of this dynamic RNA epigenetic modification, reader proteins that recognize this modification are involved in numerous cellular processes. Although these reader proteins have not yet been precisely characterized, preliminary data showed that most potential reader proteins contained a conserved YT521-B homology (YTH) domain. Here we define the YTH domain of rat YT521-B as an N6-methylated adenosine reader domain and report its solution structure in complex with an N6-methylated RNA. The structure reveals a binding preference for an NGANNN RNA hexamer and a deep hydrophobic cleft for m6A recognition. These findings establish a molecular function for YTH domains as m6A reader domains and should guide further studies into the biological functions of YTH-containing proteins in m6A recognition.
The heterogeneous nuclear ribonucleoprotein (hnRNP) F is involved in the regulation of mRNA metabolism by specifically recognizing G-tract RNA sequences. We have determined the solution structures of the three quasi RNA recognition motifs (qRRMs) of hnRNP F in complex with G-tract RNA. These structures show that qRRMs bind RNA in a very unusual manner, the G-tract being "encaged", making the qRRM a novel RNA binding domain. We defined a consensus signature sequence for qRRMs and identified other human qRRM-containing proteins, which also specifically recognize G-tract RNAs. Our structures explain how qRRMs can sequester G-tracts, maintaining them in a single-stranded conformation. We also show that isolated qRRMs of hnRNP F are sufficient to regulate the alternative splicing of the Bcl-x pre-mRNA, strongly suggesting that hnRNP F would act by remodeling RNA secondary and tertiary structures.
The protein CNOT4 possesses an N-terminal RING finger domain that acts as an E3 ubiquitin ligase and specifically interacts with UbcH5B, a ubiquitin-conjugating enzyme. The structure of the CNOT4 RING domain has been solved and the amino acids important for the binding to UbcH5B have been mapped. Here, the residues of UbcH5B important for the binding to the CNOT4 RING domain were identified by NMR chemical shift perturbation experiments, and these data were used to generate structural models of the complex with the program HADDOCK. Together with the NMR data, additional biochemical data were included in a second docking run. Comparisons of the resulting model with the structure of the c-Cbl/UbcH7 complex reveal some significant differences, notably at specific residues, and give structural insights into the E2/E3 specificity.
Tra2-β1 is a unique splicing factor as its single RNA recognition motif (RRM) is located between two RS (arginine-serine) domains. To understand how this protein recognizes its RNA target, we solved the structure of Tra2-β1 RRM in complex with RNA. The central 5'-AGAA-3' motif is specifically recognized by residues from the β-sheet of the RRM and by residues from both extremities flanking the RRM. The structure suggests that RNA binding by Tra2-β1 induces positioning of the two RS domains relative to one another. By testing the effect of Tra2-β1 and RNA mutations on the splicing of SMN2 exon 7, we validated the importance of the RNA-protein contacts observed in the structure for the function of Tra2-β1 and determined the functional sequence of Tra2-β1 in SMN2 exon 7. Finally, we propose a model for the assembly of multiple RNA binding proteins on this exon.
The heterogeneous nuclear ribonucleoprotein (hnRNP) F belongs to the hnRNP H family involved in the regulation of alternative splicing and polyadenylation and specifically recognizes poly(G) sequences (G-tracts). In particular, hnRNP F binds a G-tract of the Bcl-x RNA and regulates its alternative splicing, leading to two isoforms, Bcl-xS and Bcl-xL, with antagonistic functions. In order to gain insight into G-tract recognition by hnRNP H family members, we initiated an NMR study of human hnRNP F. We present the solution structure of the three quasi RNA recognition motifs (qRRMs) of hnRNP F and identify the residues that are important for the interaction with the Bcl-x RNA by NMR chemical shift perturbation and mutagenesis experiments. The three qRRMs exhibit the canonical βαββαβ RRM fold, but additional secondary structure elements are present in the two N-terminal qRRMs of hnRNP F. We show that qRRM1 and qRRM2, but not qRRM3, are responsible for G-tract recognition and that the residues of qRRM1 and qRRM2 involved in G-tract interaction are not on the β-sheet surface as observed for the classical RRM but are part of a short β-hairpin and two adjacent loops. These regions define a novel interaction surface for RNA recognition by RRMs.
Sam68 and T-STAR are members of the STAR family of proteins that directly link signal transduction with post-transcriptional gene regulation. Sam68 controls the alternative splicing of many oncogenic proteins. T-STAR is a tissue-specific paralogue that regulates the alternative splicing of neuronal pre-mRNAs. STAR proteins differ from most splicing factors in that they contain a single RNA-binding domain. Their specificity of RNA recognition is thought to arise from their ability to homodimerize, but how dimerization influences their function remains unknown. Here, we establish at atomic resolution how T-STAR and Sam68 bind to RNA, revealing an unexpected mode of dimerization different from other members of the STAR family. We further demonstrate that this unique dimerization interface is crucial for their biological activity in splicing regulation, and suggest that the increased RNA affinity through dimer formation is a crucial parameter enabling these proteins to select their functional targets within the transcriptome.