The dominant view of protein structure-function is that an amino acid sequence specifies a (mostly) fixed three-dimensional (3-D) structure that is a prerequisite to protein function. In contrast to the dominant view, many proteins display functions requiring the disordered state. Our purpose here is to provide a catalogue of disorder-function relationships. The very important molecular details in each example can be obtained from the references provided or from several excellent reviews and commentaries (1-9).
For ordered protein, the ensemble members all have the same time-averaged canonical set of Ramachandran angles along their backbones. For intrinsically disordered protein, the ensemble members have different (and typically dynamic) Ramachandran angles. Such disorder has been characterized by a variety of methods including x-ray crystallography, NMR spectroscopy, CD spectroscopy, and protease sensitivity, to name several. Each of these methods has advantages and limitations that are discussed in more detail elsewhere (10). Although a few disordered proteins and regions have been characterized by several methods as noted below, it would be useful to have more examples with multiple methods of characterization.
In attempts to discover generalities from the known disorder examples, we recently used bioinformatics coupled with data mining (11-15). The results suggested that thousands of natively disordered proteins exist, representing a very substantial fraction of the proteins in the commonly used sequence databases (13,16). From these and related database predictions and from a set of functionally important disordered proteins, Wright and Dyson (17) called for a reassessment of the view that 3-D structure is always a prerequisite to protein function.
In this article, we discuss the following topics:
1. how common is intrinsic disorder?;
2. intrinsic disorder in vivo;
3. functional annotations for 90 proteins having physically characterized regions of disorder;
4. disordered regions without known function;
5. a structure-function proposal called "the protein trinity";
6. the functional repertoires of ordered and disordered protein; and
7. the need for a Disordered Protein Database (DisProt) to complement the Protein Data Bank (PDB).
How Common is Intrinsic Disorder?
A series of predictors of natural disordered regions (PONDRs) have been developed using amino acid sequence as inputs and giving intrinsic order or disorder tendencies as outputs (11,14,15,18,19). The various PONDRs are distinguished by different training sets, by different data representations for their inputs, and by different machine learning models for their development.
For PONDR VL-XT, currently the best characterized of the PONDRs, only 6% of more than 900 non-homologous proteins spanning PDB gave false positive predictions of disorder ≥ 40 consecutive amino acids in length. Even this 6% may be an over-estimate of the false positive error rate, however, because many of these predicted disordered regions are involved in ligand bindin...
Intrinsic disorder refers to segments or to whole proteins that fail to self-fold into fixed 3D structure, with such disorder sometimes existing in the native state. Here we report data on the relationships among intrinsic disorder, sequence complexity as measured by Shannon's entropy, and amino acid composition. Intrinsic disorder identified in protein crystal structures, and by nuclear magnetic resonance, circular dichroism, and prediction from amino acid sequence, all exhibit similar complexity distributions that are shifted to lower values compared to, but significantly overlapping with, the distribution for ordered proteins. Compared to sequences from ordered proteins, these variously characterized intrinsically disordered segments and proteins, and also a collection of low-complexity sequences, typically have obviously higher levels of protein-specific subsets of the following amino acids: R, K, E, P, and S, and lower levels of subsets of the following: C, W, Y, I, and V. The Swiss Protein database of sequences exhibits significantly higher amounts of both low-complexity and predicted-to-be-disordered segments as compared to a non-redundant set of sequences from the Protein Data Bank, providing additional data that nature is richer in disordered and low-complexity segments compared to the commonness of these features in the set of structurally characterized proteins.
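As an illustration of the complexity measure named in this abstract, the following is a minimal sketch of Shannon's entropy computed over the residue frequencies of a sequence window. The function names and the default window length of 45 residues are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Shannon entropy, in bits, of the residue frequencies in a segment."""
    n = len(segment)
    counts = Counter(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def windowed_complexity(sequence: str, window: int = 45) -> list[float]:
    """Entropy of every full-length sliding window along a sequence.

    The window length is an assumed, illustrative default.
    """
    if len(sequence) < window:
        return []
    return [
        shannon_entropy(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]
```

With 20 amino acid types the entropy of a window ranges from 0 bits (a single repeated residue) to log2(20), about 4.32 bits (all residue types equally frequent), so low-complexity segments are those whose windows score toward the lower end of that range.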
Background: Due to the functional importance of intrinsically disordered proteins or protein regions, prediction of intrinsic protein disorder from amino acid sequence has become an area of active research as witnessed in the 6th experiment on Critical Assessment of Techniques for Protein Structure Prediction (CASP6). Since the initial work by Romero et al. (Identifying disordered regions in proteins from amino acid sequences, IEEE Int. Conf. Neural Netw., 1997), our group has developed several predictors optimized for long disordered regions (>30 residues) with prediction accuracy exceeding 85%. However, these predictors are less successful on short disordered regions (≤30 residues). A probable cause is the length dependence of the amino acid compositions and sequence properties of disordered regions.
Results: We proposed two new predictor models, VSL2-M1 and VSL2-M2, to address this length-dependency problem in prediction of intrinsic protein disorder. These two predictors are similar to the original VSL1 predictor used in the CASP6 experiment. In both models, two specialized predictors were first built and optimized for short (≤30 residues) and long disordered regions (>30 residues), respectively. A meta predictor was then trained to integrate the specialized predictors into the final predictor model. As the 10-fold cross-validation results showed, the VSL2 predictors achieved well-balanced prediction accuracies of 81% on both short and long disordered regions. Comparisons over the VSL2 training dataset via 10-fold cross-validation and a blind-test set of unrelated recent PDB chains indicated that VSL2 predictors were significantly more accurate than several existing predictors of intrinsic protein disorder.
Conclusion: The VSL2 predictors are applicable to disordered regions of any length and can accurately identify the short disordered regions that are often misclassified by our previous disorder predictors.
The success of the VSL2 predictors further confirmed the previously observed differences in amino acid compositions and sequence properties between short and long disordered regions, and justified our approaches for modelling short and long disordered regions separately. The VSL2 predictors are freely accessible for non-commercial use at
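The abstract states that a meta predictor integrates the short-region and long-region specialized predictors but does not give the combining rule. The sketch below assumes one plausible form, a per-residue weighted average in which the meta predictor supplies the weight given to the long-region score. The function names, the weighting scheme, and the 0.5 decision threshold are all assumptions for illustration and should not be read as the published VSL2 implementation.

```python
def combine_scores(short_scores, long_scores, meta_weights):
    """Blend two per-residue disorder scores.

    Assumed rule: final = w * long + (1 - w) * short, where w is the
    meta predictor's per-residue weight for the long-region predictor.
    """
    if not (len(short_scores) == len(long_scores) == len(meta_weights)):
        raise ValueError("all score lists must have the same length")
    return [
        w * long_s + (1.0 - w) * short_s
        for short_s, long_s, w in zip(short_scores, long_scores, meta_weights)
    ]


def call_disorder(scores, threshold=0.5):
    """Label each residue 'D' (disordered) or 'O' (ordered).

    The 0.5 threshold is an assumed, illustrative cut-off.
    """
    return "".join("D" if s >= threshold else "O" for s in scores)
```

Under this assumed rule, a weight near 1 defers to the long-region predictor and a weight near 0 defers to the short-region predictor, which is one simple way to let a single model cover disordered regions of any length.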
Regulation, recognition and cell signaling involve the coordinated actions of many players. Signaling scaffolds, with their ability to bring together proteins belonging to common and/or interlinked pathways, play crucial roles in orchestrating numerous events by coordinating specific interactions among signaling proteins. This review examines the roles of intrinsic disorder (ID) in signaling scaffold protein function. Several well-characterized scaffold proteins with structurally and functionally characterized ID regions are used here to illustrate the importance of ID for scaffolding function. These examples include scaffolds that are mostly disordered, only partially disordered or those in which the ID resides in a scaffold partner. Specific scaffolds discussed include RNase, voltage-activated potassium channels, axin, BRCA1, GSK-3beta, p53, Ste5, titin, Fus3, MAP2, D-AKAP2 and AKAP250. Among the mechanisms discussed are: molecular recognition features, fly-casting, ease of encounter complex formation, structural isolation of partners, modulation of interactions between bound partners, masking of intramolecular interaction sites, maximized interaction surface per residue, toleration of high evolutionary rates, binding site overlap, allosteric modification, palindromic binding, reduced constraints for alternative splicing, efficient regulation via posttranslational modification, efficient regulation via rapid degradation, protection of normally solvent-exposed sites, enhancing the plasticity of interaction and molecular crowding. We conclude that ID can enhance scaffold function by a diverse array of mechanisms. In other words, scaffold proteins utilize several ID-facilitated mechanisms to enhance function, and by doing so, get more functionality from less structure.
The functional specificity of type 1 protein phosphatases (PP1) depends on the associated regulatory/targeting and inhibitory subunits. To gain insights into the mechanism of PP1 regulation by inhibitor-2, an ancient and intrinsically disordered regulator, we solved the crystal structure of the complex to 2.5 Å resolution. Our studies show that, when complexed with PP1c, I-2 acquires three regions of order: site 1, residues 12-17, binds adjacent to a region recognized by many PP1 regulators; site 2, amino acids 44-56, interacts along the RVXF binding groove through an unsuspected sequence, KSQKW; and site 3, residues 130-169, forms α-helical regions that lie across the substrate-binding cleft. Specifically, residues 148-151 interact at the catalytic center, displacing essential metal ions, accounting for both rapid inhibition and slower inactivation of PP1c. Thus, our structure provides novel insights into the mechanism of PP1 inhibition and subsequent reactivation, has broad implications for the physiological regulation of PP1, and highlights common inhibitory interactions among phosphoprotein phosphatase family members.
To investigate the determinants of protein order and disorder, three primary and one derivative database of intrinsically disordered proteins were compiled. The segments in each primary database were characterized by one of the following: X-ray crystallography, nuclear magnetic resonance (NMR), or circular dichroism (CD). The derivative database was based on homology. The three primary disordered databases have a combined total of 157 proteins or segments of length ≥ 30 with 18,010 residues, while the derivative database contains 572 proteins from 32 families with 52,688 putatively disordered residues. For the four disordered databases, the amino acid compositions were compared with those from a database of ordered structure. Relative to the ordered protein, the intrinsically disordered segments in all four databases were significantly depleted in W, C, F, I, Y, V, L and N, significantly enriched in A, R, G, Q, S, P, E and K, and inconsistently different in H, M, T, and D, suggesting that the first set be called order-promoting and the second set disorder-promoting. Also, 265 amino acid properties were ranked by their ability to discriminate order and disorder and then pruned to remove the most highly correlated pairs. The 10 highest-ranking properties after pruning consisted of 2 residue contact scales, 4 hydrophobicity scales, 3 scales associated with β-sheets and one polarity scale. Using these 10 properties for comparisons of the 3 primary databases suggests that disorder in all 3 databases is very similar, but with those characterized by NMR and CD being the most similar, those by CD and X-ray being next, and those by NMR and X-ray being the least similar.
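The composition comparison described in this abstract can be sketched as a per-residue relative difference between a disordered set and an ordered reference set. The function names and the particular normalization, (disordered - ordered) / ordered, are assumptions chosen for illustration. The abstract does not state which statistic was used, and this sketch performs no significance testing.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequences):
    """Fractional amino acid composition pooled over a set of sequences."""
    counts = Counter()
    for seq in sequences:
        # Ignore any character that is not one of the 20 standard residues.
        counts.update(res for res in seq if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard residues found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def relative_difference(disordered, ordered):
    """Per-residue (disordered - ordered) / ordered composition difference.

    Positive values mean enrichment in the disordered set, negative values
    mean depletion. Residues absent from the ordered set are skipped.
    """
    dis = composition(disordered)
    ref = composition(ordered)
    return {
        aa: (dis[aa] - ref[aa]) / ref[aa]
        for aa in AMINO_ACIDS
        if ref[aa] > 0
    }
```

Sorting the returned dictionary by value places depleted residues at one end and enriched residues at the other, which mirrors the order-promoting and disorder-promoting grouping proposed in the abstract.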
Intrinsically disordered, highly charged protein sequences act as entropic bristles (EBs), which, when translationally fused to partner proteins, serve as effective solubilizers by creating both large favorable surface area for water interactions and large excluded volumes around the partner. By extending away from the partner and sweeping out large molecules, EBs can enable the target protein to fold free from interference. Using both naturally occurring and artificial polypeptides, we demonstrate the successful implementation of intrinsically disordered fusions as protein solubilizers. The artificial fusions discussed herein have low sequence complexity and high net charge, but are diversified by means of distinctive amino acid compositions and lengths. Using 6xHis fusions as controls, soluble protein expression enhancements from 65% (EB60A) to 100% (EB250) were observed for a 20-protein portfolio. Additionally, these EBs solubilized targets more effectively than frequently used fusions such as maltose-binding protein, glutathione S-transferase, thioredoxin, and N utilization substance A. Finally, although these EBs possess very distinct physicochemical properties, they did not perturb the structure, conformational stability, or function of the green fluorescent protein or the glutathione S-transferase protein. This work thus illustrates the successful de novo design of intrinsically disordered fusions, and presents a promising technology and complementary resource for researchers attempting to solubilize recalcitrant proteins.