Open source code for Probalign and for producing the simulated data, along with all real and simulated data, is freely available from http://www.cs.njit.edu/usman/probalign
A distance constraint model (DCM) is presented that identifies flexible regions within protein structure consistent with specified thermodynamic conditions. The DCM is based on a rigorous free energy decomposition scheme representing structure as fluctuating constraint topologies. Entropy non-additivity is problematic for naive decompositions, limiting the success of heat capacity predictions. The DCM resolves non-additivity by summing over independent entropic components determined by an efficient network-rigidity algorithm. A minimal 3-parameter DCM is demonstrated to accurately reproduce experimental heat capacity curves. Free energy landscapes and quantitative stability-flexibility relationships are obtained in terms of global flexibility. Several connections to experiment are made.
Many reports qualitatively describe conserved stability and flexibility profiles across protein families, but biophysical modeling schemes have not been available to robustly quantify both. Here we investigate an orthologous RNase H pair by using a minimal distance constraint model (DCM). The DCM is an all-atom microscopic model [Jacobs and Dallakyan, Biophys J 2005;88(2):903-915] that accurately reproduces heat capacity measurements [Livesay et al., FEBS Lett 2004;576(3):468-476], and is unique in its ability to harmoniously calculate thermodynamic stability and flexibility in practical computing times. Consequently, quantified stability/flexibility relationships (QSFR) can be determined using the DCM. For the first time, a comparative QSFR analysis is performed, serving as a paradigm study to illustrate the utility of a QSFR analysis for elucidating evolutionarily conserved stability and flexibility profiles. Despite global conservation of QSFR profiles, distinct enthalpy-entropy compensation mechanisms are identified between the RNase H pair. In both cases, local flexibility metrics parallel H/D exchange experiments by correctly identifying the folding core and several flexible regions. Remarkably, at appropriately shifted temperatures (e.g., melting temperature), these differences lead to a global conservation in Landau free energy landscapes, which directly relate thermodynamic stability to global flexibility. Using ensemble-based sampling within free energy basins, rigidly and flexibly correlated regions are quantified through cooperativity correlation plots. Five conserved flexible regions are identified within the structures of the orthologous pair. Evolutionary conservation of these flexibly correlated regions is strongly suggestive of their catalytic importance. Conclusions made herein are demonstrated to be robust with respect to the DCM parameterization.
In this report, we demonstrate that phylogenetic motifs, sequence regions conserving the overall familial phylogeny, represent a promising approach to protein functional site prediction. Across our structurally and functionally heterogeneous data set, phylogenetic motifs consistently correspond to functional sites defined by both surface loops and active site clefts. Additionally, the partially buried prosthetic group regions of cytochrome P450 and succinate dehydrogenase are identified as phylogenetic motifs. In nearly all instances, phylogenetic motifs are structurally clustered, despite little overall sequence proximity, around key functional site features. Based on calculated false-positive expectations and standard motif identification methods, we show that phylogenetic motifs are generally conserved in sequence. This result implies that they can be considered motifs in the traditional sense as well. However, there are instances where phylogenetic motifs are not (overall) well conserved in sequence. This point is enticing, because it implies that phylogenetic motifs are able to identify key sequence regions that traditional motif-based approaches would not. Further, phylogenetic motif results are also shown to be consistent with evolutionary trace results, and bootstrapping is used to demonstrate tree significance.
Background: We examine the accuracy of enzyme catalytic residue predictions from a network representation of protein structure. In this model, amino acid α-carbons specify vertices within a graph and edges connect vertices that are proximal in structure. Closeness centrality, which has shown promise in previous investigations, is used to identify important positions within the network. Closeness centrality, a global measure of network centrality, is calculated as the reciprocal of the average distance between vertex i and all other vertices. Results: We benchmark the approach against 283 structurally unique proteins within the Catalytic Site Atlas. Our results, which are in line with previous investigations of smaller datasets, indicate closeness centrality predictions are statistically significant. However, unlike previous approaches, we specifically focus on residues with the very best scores. Over the top five closeness centrality scores, we observe an average true to false positive rate ratio of 6.8 to 1. As demonstrated previously, adding a solvent accessibility filter significantly improves predictive power; the average ratio is increased to 15.3 to 1. We also demonstrate (for the first time) that filtering the predictions by residue identity improves the results even more than accessibility filtering. Here, we simply eliminate from consideration residues with physicochemical properties unlikely to be compatible with catalytic requirements. Residue identity filtering improves the average true to false positive rate ratio to 26.3 to 1. Combining the two filters has little effect on the results. Calculated p-values for the three prediction schemes range from 2.7E-9 to less than 8.8E-134. Finally, the sensitivity of the predictions to structure choice and slight perturbations is examined. Conclusion: Our results resolutely confirm that closeness centrality is a viable prediction scheme whose predictions are statistically significant. Simple filtering schemes substantially improve the method's predictive power. Moreover, no clear effect on performance is observed when comparing ligated and unligated structures. Similarly, the CC prediction results are robust to slight structural perturbations from molecular dynamics simulation.
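The closeness centrality calculation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`contact_graph`, `closeness_centrality`, `top_ranked`), the 8.5 Å contact cutoff, and the use of unweighted shortest-path hop counts as the "distance" between vertices are all assumptions made for illustration; the abstract only states that edges connect α-carbons that are proximal in structure and that the score is the reciprocal of the average distance to all other vertices.

```python
from collections import deque
from math import dist


def contact_graph(ca_coords, cutoff=8.5):
    """Connect residues whose alpha-carbons lie within `cutoff` angstroms.

    The 8.5 A default is an illustrative assumption, not a value from the abstract.
    """
    n = len(ca_coords)
    neighbors = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(ca_coords[i], ca_coords[j]) <= cutoff:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors


def closeness_centrality(neighbors):
    """Reciprocal of the average shortest-path length from each vertex to all others.

    Path length is counted in edges (breadth-first search), which is an assumption.
    """
    n = len(neighbors)
    if n < 2:
        raise ValueError("need at least two vertices")
    scores = {}
    for source in neighbors:
        hops = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        if len(hops) < n:
            raise ValueError("graph is disconnected; closeness is undefined here")
        # average distance = sum(hops) / (n - 1), so its reciprocal is:
        scores[source] = (n - 1) / sum(hops.values())
    return scores


def top_ranked(scores, k=5):
    """Return the k vertices with the highest scores (ties broken by index)."""
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]


if __name__ == "__main__":
    # Toy "structure": five alpha-carbons spaced 3.8 A apart along a line.
    toy_coords = [(3.8 * i, 0.0, 0.0) for i in range(5)]
    toy_scores = closeness_centrality(contact_graph(toy_coords, cutoff=4.0))
    print(top_ranked(toy_scores, k=3))
```

With the toy coordinates the graph is a simple chain, so the central residue scores highest. On a real structure the top-ranked positions would then be passed to whatever solvent accessibility or residue identity filter is in use.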
We investigate changes in human c-type lysozyme flexibility upon mutation via a Distance Constraint Model, which gives a statistical mechanical treatment of network rigidity. Specifically, two dynamical metrics are tracked. Changes in flexibility index quantify differences within backbone flexibility, whereas changes in the cooperativity correlation quantify differences within pairwise mechanical couplings. Regardless of metric, the same general conclusions are drawn. That is, small structural perturbations introduced by single point mutations have a frequent and pronounced effect on lysozyme flexibility that can extend over long distances. Specifically, an appreciable change occurs in backbone flexibility for 48% of the residues, and a change in cooperativity occurs in 42% of residue pairs. The average distance from mutation to a site with a change in flexibility is 17–20 Å. Interestingly, the frequency and scale of the changes within single point mutant structures are generally larger than those observed in the hen egg white lysozyme (HEWL) ortholog, which shares 61% sequence identity with human lysozyme. For example, point mutations often lead to substantial flexibility increases within the β-subdomain, which is consistent with experimental results indicating that it is the nucleation site for amyloid formation. However, β-subdomain flexibility within the human and HEWL orthologs is more similar despite the lower sequence identity. These results suggest compensating mutations in HEWL reestablish desired properties.
A computational method to identify residues likely to initiate allosteric signals has been developed. The method is based on differences within stability and flexibility profiles between wild-type and perturbed structures as computed by a distance constraint model. Application of the approach to three bacterial chemotaxis protein Y (CheY) orthologs provides a comparison of allosteric response across protein family divergence. Interestingly, we observe a rich mixture of both conservation and variability within the identified allosteric sites. While similarity within the overall response parallels the evolutionary relationships, >50% of the best scoring putative sites are only identified in a single ortholog. These results suggest that detailed descriptions of intraprotein communication are substantially more variable than structure and function, yet do maintain some evolutionary relationships. Finally, structural clusters of large response identify four allosteric hotspots, including the β4/α4 loop known to be critical to relaying the CheY phosphorylation signal.
Background: Gram-negative bacteria use periplasmic-binding proteins (bPBP) to transport nutrients through the periplasm. Despite immense diversity within the recognized substrates, all members of the family share a common fold that includes two domains that are separated by a conserved hinge. The hinge allows the protein to cycle between open (apo) and closed (ligated) conformations. Conformational changes within the proteins depend on a complex interplay of mechanical and thermodynamic response, which is manifested as an increase in thermal stability and decrease of flexibility upon ligand binding.