Background: Proteins are involved in many interactions with other proteins, leading to networks that regulate and control a wide variety of physiological processes. Some of these proteins, called hub proteins or hubs, bind to many different protein partners. Protein intrinsic disorder, via diversity arising from structural plasticity or flexibility, provides a means for hubs to associate with many partners (Dunker AK, Cortese MS, Romero P, Iakoucheva LM, Uversky VN: Flexible Nets: The roles of intrinsic disorder in protein interaction networks. FEBS J 2005, 272:5129-5148).
Background: Our first predictor of protein disorder was published just over a decade ago in the Proceedings of the IEEE International Conference on Neural Networks (Romero P, Obradovic Z, Kissinger C, Villafranca JE, Dunker AK (1997) Identifying disordered regions in proteins from amino acid sequence. Proceedings of the IEEE International Conference on Neural Networks, 1: 90-95). By now, more than twenty other laboratory groups have joined the effort to improve the prediction of protein disorder. While the prediction methodologies used for protein intrinsic disorder resemble those used for secondary structure prediction, the two types of structure are entirely different. For example, the two structural classes have very different dynamic properties, with the irregular secondary structure class being much less mobile than the disorder class. The prediction of secondary structure has been useful. The prediction of intrinsic disorder, on the other hand, has been revolutionary, leading to major modifications of the more than 100-year-old views relating protein structure and function. Experimentalists have been providing evidence over many decades that some proteins lack fixed structure or are disordered (or unfolded) under physiological conditions. In addition, experimentalists are showing that, for many proteins, function depends on the unstructured rather than the structured state; such results are in marked contrast to more than 100-year-old views such as the lock-and-key hypothesis. Despite extensive data on many important examples, including disease-associated proteins, the importance of disorder for protein function has been largely ignored. Indeed, to our
Intrinsically disordered proteins carry out various biological functions while lacking ordered secondary and/or tertiary structure. To find general intrinsic properties of amino acid residues that are responsible for the absence of ordered structure in intrinsically disordered proteins, we surveyed 517 amino acid scales. Each of these scales was taken as an independent attribute for the subsequent analysis. For a given attribute value x, which is averaged over a consecutive string of amino acids, and for a given data set having both ordered and disordered segments, the conditional probabilities P(s_o | x) and P(s_d | x) for order and disorder, respectively, can be determined for all possible values of x. Plots of the conditional probabilities P(s_o | x) and P(s_d | x) versus x give a pair of curves. The area between these two curves divided by the total area of the graph gives the area ratio value (ARV), which is proportional to the degree of separation of the two probability curves and therefore provides a measure of the given attribute's power to discriminate between order and disorder. Because the ARV falls between zero and one, a larger ARV corresponds to better discrimination between order and disorder. Starting from the scale with the highest ARV, we applied a simulated annealing procedure to search for alternative scale values and managed to increase the ARV by more than 10%. The ranking of the amino acids in this new TOP-IDP scale is as follows (from order promoting to disorder promoting): W, F, Y, I, M, L, V, N, C, T, A, G, R, D, H, Q, K, S, E, P. A web-based server has been created to apply the TOP-IDP scale to predict intrinsically disordered proteins (http://www.disprot.org/dev/disindex.php).
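The ARV definition above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `window_averages` and `area_ratio_value`, the equal-width binning, the default of 20 bins, and the choice to let empty bins contribute no area are all assumptions made for illustration. The sketch also assumes the ordered and disordered sets are of comparable size, since the probabilities are estimated from raw bin counts.

```python
from typing import Dict, List, Sequence


def window_averages(sequence: str, scale: Dict[str, float], window: int) -> List[float]:
    """Average a per-residue scale over every full window of consecutive residues."""
    values = [scale[aa] for aa in sequence]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def area_ratio_value(ordered: Sequence[float],
                     disordered: Sequence[float],
                     bins: int = 20) -> float:
    """Estimate the ARV from attribute values of ordered and disordered segments.

    P(s_o | x) and P(s_d | x) are estimated per bin from raw counts
    (assumption: equal-width bins, comparably sized classes).
    """
    lo = min(min(ordered), min(disordered))
    hi = max(max(ordered), max(disordered))
    if hi == lo:
        return 0.0  # a constant attribute cannot separate the classes
    width = (hi - lo) / bins

    n_o = [0] * bins
    n_d = [0] * bins
    for x in ordered:
        n_o[min(int((x - lo) / width), bins - 1)] += 1  # clamp x == hi into last bin
    for x in disordered:
        n_d[min(int((x - lo) / width), bins - 1)] += 1

    area_between = 0.0
    for o, d in zip(n_o, n_d):
        if o + d == 0:
            continue  # assumption: an empty bin contributes no area
        p_d = d / (o + d)   # P(s_d | x) in this bin
        p_o = 1.0 - p_d     # P(s_o | x) in this bin
        area_between += abs(p_d - p_o) * width

    total_area = (hi - lo) * 1.0  # x range times the probability range [0, 1]
    return area_between / total_area
```

In use, each candidate scale would be passed through `window_averages` for the ordered and the disordered sequences separately, and the two resulting lists handed to `area_ratio_value`; scales can then be ranked by the returned value.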
Previously described algorithms for mining α-helix-forming molecular recognition elements (MoREs, described in Oldfield et al. (2005) Biochemistry 44 (6) 1989-2000), known also as molecular recognition features (MoRFs, Mohan et al. (2006) J. Mol. Biol. 362 (5) 1043-1059), revealed that regions undergoing disorder-to-order transitions are involved in many molecular recognition events and are crucial for protein-protein interactions. However, these algorithms were developed using a training dataset of limited size. Here we propose to improve the prediction algorithms by (1) including additional α-MoRF examples and their cross-species homologues in the positive training set; (2) carefully extracting monomer structure chains from the PDB as the negative training set; (3) including recently developed disorder predictions, secondary structure predictions, and amino acid indices as attributes; and (4) constructing neural network based predictors and performing validation. Over 50 regions that undergo disorder-to-order transitions and were identified in the PDB, together with a set of corresponding cross-species homologues of each structure-based example, were included in the new positive training set. Over 1500 attributes, including disorder predictions, secondary structure predictions and amino acid indices, were evaluated by the conditional probability method. The top attributes, including VSL2 and VL3 disorder predictions and several physicochemical propensities of amino acid residues, were used to develop the feed-forward neural networks. The sensitivity, specificity and accuracy of the resulting predictor, α-MoRF-PredII, were 0.87 ± 0.10, 0.87 ± 0.11, and 0.87 ± 0.08, respectively, over 10-fold cross-validation. We present the results of these analyses and validation examples to discuss the potential improvement of the α-MoRF-PredII prediction accuracy.
† This work was supported in part by the grants R01 LM007688-01A1 (to A.K.D.) and GM071714-01A2 (to A.K.D. and V.N.U.) from the National Institutes of Health and the Programs of the Russian Academy of Sciences for the "Molecular and cellular biology" and "Fundamental science for medicine" (to V.N.U.).
CORRESPONDING AUTHOR FOOTNOTE To whom correspondence should be addressed at Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, 410 W. 10th Street, HS 5000, Indianapolis,
Identifying and predicting such interactions would provide insights and guides for laboratory experimental efforts to understand the mechanisms of signaling and regulation within biological systems. Further, based on such knowledge, small molecule therapies could be developed to target human diseases (1,2). Molecular recognition serves as the initial step for protein-protein interactions. The mechanisms of signaling and regulatory molecular recognition include high specificity with low affinity, and binding diversity in terms of various structural accommodations at the binding surface. Coupled binding and foldi...
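The sensitivity, specificity and accuracy figures quoted for α-MoRF-PredII above are summaries over cross-validation folds. The following sketch shows one way such "mean ± spread" values could be computed. It assumes binary labels (1 for an α-MoRF residue or region, 0 otherwise), that every fold contains both classes, and that the "±" denotes the sample standard deviation across folds; the source does not state which spread measure was used, and the function names `fold_metrics` and `summarize_folds` are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple

Fold = Tuple[Sequence[int], Sequence[int]]  # (true labels, predicted labels)


def fold_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, accuracy) for one validation fold."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)          # fraction of true positives recovered
    specificity = tn / (tn + fp)          # fraction of true negatives recovered
    accuracy = (tp + tn) / len(y_true)    # fraction of all labels predicted correctly
    return sensitivity, specificity, accuracy


def summarize_folds(folds: Sequence[Fold]) -> List[Tuple[float, float]]:
    """Return [(mean, sample std)] for sensitivity, specificity, accuracy.

    Requires at least two folds, because the sample standard deviation
    is undefined for a single value.
    """
    per_fold = [fold_metrics(t, p) for t, p in folds]
    return [(mean(column), stdev(column)) for column in zip(*per_fold)]
```

With ten folds, `summarize_folds` would return three (mean, standard deviation) pairs in the order sensitivity, specificity, accuracy.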