Water molecules play a crucial role in biomolecular associations by mediating a hydrogen bond network or filling spaces with van der Waals interactions. Although current drug design technologies take water molecule interactions into account, their applications are still limited by their reliance on either excessive computer resources or a particular potential energy model. Here, we introduce a statistical method that uses experimentally determined water molecules in the binding sites of high-resolution X-ray crystal structures to predict the potential hydration sites in the binding sites of crystal structures of interest. By clustering and analyzing the various interaction patterns of water molecules in the training data set, we derived a tetrahedron-water-cluster model based on a series of residue group triplets that form feature triangles of different shapes. In the tetrahedron-water-cluster model, three polar atoms in the residue group triplet act as the vertices of the bottom triangle of the tetrahedron, and the water molecule that interacts with these three polar atoms is set as the top vertex of the tetrahedron. The shape of the triangle formed by a residue group triplet in the crystal structure of interest is compared with the shapes of the bottom triangles in the training data set to identify the most similar bottom triangle. According to the tetrahedron-water-cluster model, the hydration site for that residue group triplet can then be predicted from the height of the training-set tetrahedron that has the most similar bottom triangle.
A test set containing 193 crystal structures was used to evaluate model performance, and extensive comparison with the recently published program Dowser++ revealed that our model is at least as good at providing an accurate set of the potential hydration sites in crystal structures of interest.
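The geometric idea behind the tetrahedron-water-cluster model can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how triangle shapes are compared or where the top vertex sits relative to the bottom triangle. The sketch assumes that shape is described by sorted side lengths, that similarity is the Euclidean distance between those descriptors, and that the water sits directly above the centroid of the triangle. All function names and the training entries are hypothetical.

```python
import math

def side_lengths(p1, p2, p3):
    """Sorted side lengths of a triangle (assumed shape descriptor)."""
    return tuple(sorted((math.dist(p1, p2), math.dist(p2, p3), math.dist(p1, p3))))

def most_similar(query_shape, training):
    """Return the (shape, height) training entry closest to query_shape.

    Similarity is assumed to be Euclidean distance between sorted side lengths.
    """
    return min(training, key=lambda entry: math.dist(query_shape, entry[0]))

def candidate_sites(p1, p2, p3, height):
    """Place the top vertex at `height` above and below the triangle centroid.

    The plane normal has two orientations, so both candidates are returned;
    choosing between them (for example by steric clashes) is left out here.
    """
    centroid = tuple(sum(c) / 3.0 for c in zip(p1, p2, p3))
    u = tuple(b - a for a, b in zip(p1, p2))
    v = tuple(b - a for a, b in zip(p1, p3))
    normal = (u[1] * v[2] - u[2] * v[1],
              u[2] * v[0] - u[0] * v[2],
              u[0] * v[1] - u[1] * v[0])
    length = math.sqrt(sum(x * x for x in normal))
    if length == 0.0:
        raise ValueError("the three atoms are collinear")
    unit = tuple(x / length for x in normal)
    above = tuple(c + height * n for c, n in zip(centroid, unit))
    below = tuple(c - height * n for c, n in zip(centroid, unit))
    return above, below

# Hypothetical training entries: (sorted side lengths in angstroms, tetrahedron height).
training = [((3.0, 4.0, 5.0), 2.0), ((2.8, 2.8, 2.8), 1.5)]

# Hypothetical polar-atom coordinates from a structure of interest.
a, b, c = (0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)

shape = side_lengths(a, b, c)
matched_shape, matched_height = most_similar(shape, training)
above, below = candidate_sites(a, b, c, matched_height)
```

Sorted side lengths are invariant to rotation, translation, and vertex order, which is why they are a convenient stand-in for "shape" in this sketch.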
Background: RNA-binding proteins participate in many important biological processes concerning RNA-mediated gene regulation, and several computational methods have recently been developed to predict the protein-RNA interactions of RNA-binding proteins. Newly developed discriminative descriptors will help to improve the prediction accuracy of these methods and provide further meaningful information for researchers.

Results: In this work, we designed two structural features (residue electrostatic surface potential and triplet interface propensity). Statistical and structural analysis of protein-RNA complexes showed that the two features were powerful for identifying RNA-binding protein residues. Using these two features together with other structure- and sequence-based features, a random forest classifier was constructed to predict RNA-binding residues. The area under the receiver operating characteristic curve (AUC) of five-fold cross-validation for our method on the training set RBP195 was 0.900. When the method was applied to the test set RBP68, the prediction accuracy (ACC) was 0.868 and the F-score was 0.631.

Conclusions: The good prediction performance of our method revealed that the two newly designed descriptors could be discriminative for inferring protein residues interacting with RNAs. To facilitate the use of our method, a web-server called RNAProSite, which implements the proposed method, was constructed and is freely available at http://lilab.ecust.edu.cn/NABind.

Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1110-x) contains supplementary material, which is available to authorized users.
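For readers unfamiliar with the reported metrics, the sketch below shows how ACC and the F-score are commonly computed from per-residue confusion counts. It assumes the standard F1 definition (harmonic mean of precision and recall), which the abstract does not state explicitly. The counts are hypothetical and are not taken from the RBP68 results.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of residues classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

def f_score(tp, fp, fn):
    """F1: harmonic mean of precision and recall (assumed definition)."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: binding residues are the positive class.
tp, fp, tn, fn = 40, 10, 930, 20

acc = accuracy(tp, fp, tn, fn)
f1 = f_score(tp, fp, fn)
```

Because binding residues are usually a small minority, accuracy can stay high while the F-score is much lower, which is consistent with the two metrics being reported side by side.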