Abstract: RNA has been found to play an ever-increasing role in a variety of biological processes. The function of most non-coding RNA molecules depends on their structure. Comparing and classifying macromolecular 3D structures is of crucial importance for structure-based function inference and it is used in the characterization of functional motifs and in structure prediction by comparative modeling. However, compared to the numerous methods for protein structure superposition, there are few tools dedicated to the supe…
“…Some algorithms were developed to deal with flexibility by breaking the structures into smaller units that are superimposed independently, for example, in FATCAT for proteins 14 and SupeRNAlign for RNA. 15 Another approach involves the shifting of the level of comparisons from the entire structures to the individual structural elements, usually down to the level of the local environment of individual residues, which does not require superposition of structures, for example, in QCS, SphereGrinder, CAD‐score, LDDT, RPF and INF.…”
Section: Introduction
mentioning
confidence: 99%
“…Many of the measures of structural similarity listed above have been developed for comparing relatively rigid structures, they require superposition of these structures, and they are not easily applicable for the comparison of molecules that exhibit significant flexibility. Some algorithms were developed to deal with flexibility by breaking the structures into smaller units that are superimposed independently, for example, in FATCAT for proteins 14 and SupeRNAlign for RNA 15 . Another approach involves the shifting of the level of comparisons from the entire structures to the individual structural elements, usually down to the level of the local environment of individual residues, which does not require superposition of structures, for example, in QCS, SphereGrinder, CAD‐score, LDDT, RPF and INF.…”
The biologically relevant structures of proteins and nucleic acids and their complexes are dynamic. They include a combination of regions ranging from rigid structural segments to structural switches to regions that are almost always disordered, which interact with each other in various ways. Comparing conformational changes and variation in contacts between different conformational states is essential to understand the biological functions of proteins, nucleic acids, and their complexes. Here, we describe a new computational tool, 1D2DSimScore, for comparing contacts and contact interfaces in all kinds of macromolecules and macromolecular complexes, including proteins, nucleic acids, and other molecules. 1D2DSimScore can be used to compare structural features of macromolecular models between alternative structures obtained in a particular experiment or to score various predictions against a defined "ideal" reference structure. Comparisons at the level of contacts are particularly useful for flexible molecules, for which comparisons in 3D that require rigid-body superpositions are difficult, and in biological systems where the formation of specific inter-residue contacts is more relevant for the biological function than the maintenance of a specific global 3D structure. Similarity/dissimilarity scores calculated by 1D2DSimScore can be used to complement scores describing 3D structural similarity measures calculated by the existing tools.
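The idea of comparing structures at the level of contacts, without any superposition, can be illustrated with a minimal Python sketch. This is not the 1D2DSimScore implementation: the function names `contact_set` and `jaccard`, the 5 Å cutoff, the one-point-per-residue representation, and the choice of the Jaccard index as the similarity score are all assumptions made here for illustration.

```python
from itertools import combinations
from math import dist


def contact_set(coords, cutoff=5.0):
    """Return residue index pairs (i, j), i < j, whose representative
    points lie within `cutoff` of each other (hypothetical definition)."""
    contacts = set()
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        if dist(a, b) <= cutoff:
            contacts.add((i, j))
    return contacts


def jaccard(reference, model):
    """Jaccard index of two contact sets: shared contacts / all contacts."""
    union = reference | model
    if not union:
        return 1.0  # two empty contact sets are treated as identical
    return len(reference & model) / len(union)


# Toy 4-residue chains, one point per residue (illustrative coordinates).
straight = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (12, 0, 0)]
bent = [(0, 0, 0), (4, 0, 0), (8, 0, 0), (8, 4, 0)]    # same contacts, new shape
folded = [(0, 0, 0), (4, 0, 0), (4, 4, 0), (0, 4, 0)]  # gains contact (0, 3)

ref = contact_set(straight)
print(jaccard(ref, contact_set(bent)))    # 1.0
print(jaccard(ref, contact_set(folded)))  # 0.75
```

The bent chain differs from the straight one in 3D but preserves every contact, so it scores 1.0, whereas the folded chain forms one extra contact and scores 0.75. No rotation or translation is needed at any point, which is why scores of this kind suit flexible molecules.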
“…Currently, there exist a wide number of different tools for the pairwise superposition of RNA 3D structures, which can be divided into two main groups based on the specific purpose they were developed to fulfill. The majority of the tools focus on producing a structure-based RNA sequence alignment [42][43][44][45][46][47][48][49] and therefore usually require single RNA chains as input [43,44,47]. Another large group of tools is focused on local tertiary motifs superposition and search [50][51][52][53].…”
Understanding the 3D structure of RNA is key to understanding RNA function. RNA 3D structure is modular and can be seen as a composition of building blocks of various sizes called tertiary motifs. Currently, long-range motifs formed between distant loops and helical regions are largely less studied than the local motifs determined by the RNA secondary structure. We surveyed long-range tertiary interactions and motifs in a non-redundant set of non-coding RNA 3D structures. A new dataset of annotated LOng-RAnge RNA 3D modules (LORA) was built using an approach that does not rely on the automatic annotations of non-canonical interactions. An original algorithm, ARTEM, was developed for annotation-, sequence- and topology-independent superposition of two arbitrary RNA 3D modules. The proposed methods allowed us to identify and describe the most common long-range RNA tertiary motifs. Three basic interaction types were identified to be recurrent in the long-range RNA 3D modules: ribose-ribose interactions, canonical Type I and Type II A-minor interactions, and previously undescribed staple interactions. These three interaction types were found to be different building blocks of the same complex staple motifs common to non-coding RNA 3D structures.
“…CLICK is a topology-independent tool comparing of 3D structures without a scoring function measuring the structural similarity [18, 19]. Similar to SARA-Coffee [20] coupling with sequence alignments, SupeRNAlign iteratively superimposes the RNA fragment structures with R3D and maximizes the local fit [21]. They found that R3D is scoring the best among the tools without ESA-RNA in benchmark.…”
Background
Prediction of RNA-protein 3D complex structures is still challenging. Our team recently proposed PRIME, a template-based approach that builds RNA-protein 3D complex structure models with a higher success rate than computational docking software. However, the scoring function of SARA, the RNA alignment algorithm used in PRIME, is size-dependent, which limits its ability to detect templates in some cases.
Results
Herein, we developed RMalign, a novel RNA 3D structural alignment approach based on the size-independent scoring function RMscore. The parameter in RMscore was optimized on randomly selected RNA pairs, and phase transition points (from dissimilar to similar) were determined on another set of randomly selected RNA pairs. In tRNA benchmarking, the precision of RMscore was higher than that of SARAscore (0.88 and 0.78, respectively) with phase transition points. In balance-FSCOR benchmarking, RMalign performed as well as ESA-RNA with a non-normalized score measuring RNA structural similarity. In balance-x-FSCOR benchmarking, RMalign performed much better than SARA, a state-of-the-art RNA 3D structural alignment approach, owing to its size-independent scoring function. Taking advantage of RMalign, we updated our RNA-protein modeling approach PRIME to version 2.0. PRIME2.0 significantly improves the success rate, by about 10%, over PRIME.
Conclusion
Based on the size-independent scoring function RMscore, a novel RNA 3D structural alignment approach, RMalign, was developed and integrated into PRIME2.0, which could be useful for the biological community in modeling protein-RNA interactions.
Electronic supplementary material
The online version of this article (10.1186/s12864-019-5631-3) contains supplementary material, which is available to authorized users.