Evaluating protein structures requires reliable free energies with good estimates of both potential energies and entropies. Although there are many demonstrated successes from using knowledge-based potential energies, computing entropies of proteins has lagged far behind. Here we take an entirely different approach and evaluate knowledge-based conformational entropies of proteins based on the observed frequencies of contact changes between amino acids in a set of 167 diverse proteins, each of which has two alternative structures. The results show that charged and polar interactions break more often than hydrophobic pairs. This pattern correlates strongly with the average solvent exposure of amino acids in globular proteins, as well as with polarity indices and the sizes of the amino acids. Knowledge-based entropies are derived by using the inverse Boltzmann relationship, in a manner analogous to the way that knowledge-based potentials have been extracted. Including these new knowledge-based entropies almost doubles the performance of knowledge-based potentials in selecting the native protein structures from decoy sets. Beyond the overall energy-entropy compensation, a similar compensation is seen for individual pairs of interacting amino acids. The entropies in this report have immediate applications for 3D structure prediction, protein model assessment, and protein engineering and design.
knowledge-based | entropies | free energy | native structure | contact changes
Knowledge of a protein's structure is required to understand its dynamics and function, so improvements in protein structure prediction, especially template-free methods, are essential if the whole protein universe is to be fully comprehended. Computational methods of structure prediction typically yield large numbers of possible structure models (decoys) and then face the challenging task of determining which of these models is most likely to be the native structure.
This well-known bottleneck in protein structure prediction suffers from present-day limitations in structure evaluation. Because the folding of a protein into its native structure is dictated by its free-energy landscape (1), the development of accurate free-energy functions for native structure evaluation is an area of active research. The free energy ΔG of a protein structure can be represented as ΔG = ΔV − TΔS, where ΔV and ΔS represent the energetic (enthalpic) and entropic components, respectively, and T is the temperature. In the conventional folding funnel hypothesis of protein folding, the energies and entropies are captured by the depth and the width of the well, respectively (1). Both the energetic and entropic components are combinations of large numbers of contributions, and hence the accurate prediction of free energies is limited by the ability to assess all of these contributions reliably.
The energetic contribution to the free energy of proteins is usually captured by potential functions, either physics-based or knowledge-based. Physics-based force fields, such as CHARMM, AMBER, GROMOS...
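The inverse Boltzmann relationship named in the abstract can be illustrated with a minimal sketch. This is not the authors' exact formula: the function name, the pseudocount, the choice of the pooled contact-change frequency as the reference state, and the sign convention (contacts that change more often than average receive a positive score, in units of k_B) are all assumptions made for illustration, and the counts in the usage example are invented toy numbers.

```python
import math


def inverse_boltzmann_scores(changed, total, pseudocount=1.0):
    """Return ln(f_pair / f_ref) for each residue-pair type, in units of k_B.

    changed[pair]: hypothetical count of contacts of this type that differ
                   between the two alternative structures.
    total[pair]:   hypothetical count of all contacts of this type.
    The reference frequency f_ref pools all pair types (an assumption).
    """
    ref = sum(changed.values()) / sum(total.values())
    scores = {}
    for pair, n_total in total.items():
        n_changed = changed.get(pair, 0)
        # The pseudocount pulls sparse pair types toward the reference
        # frequency and keeps the logarithm finite when n_changed is 0.
        freq = (n_changed + pseudocount * ref) / (n_total + pseudocount)
        scores[pair] = math.log(freq / ref)
    return scores


if __name__ == "__main__":
    # Invented toy counts, not data from the paper.
    changed = {("LYS", "GLU"): 60, ("LEU", "ILE"): 20}
    total = {("LYS", "GLU"): 100, ("LEU", "ILE"): 100}
    for pair, score in inverse_boltzmann_scores(changed, total).items():
        print(pair, round(score, 3))
```

Under this convention, a pair type that changes at exactly the pooled frequency scores zero, which mirrors how knowledge-based potentials assign zero energy to contacts observed at the reference frequency.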
Protein functional mechanisms usually require conformational changes, and often there are known structures for the different conformational states. However, usually neither the origin of the driving force nor the underlying pathways for these conformational transitions are known. Exothermic chemical reactions may be an important source of forces that drive conformational changes. Here we investigate this type of force, originating from ATP hydrolysis, in the chaperonin GroEL. Specifically, we apply directed forces to drive the GroEL conformational changes and learn that there is a highly specific direction for applied forces to drive the closed form to the open form. For this purpose, we utilize coarse-grained elastic network models. Principal component analysis on 38 GroEL experimental structures yields the most important motions, and these are used in structural interpolation for the construction of a coarse-grained free energy landscape. In addition, we investigate a more random application of forces with a Monte Carlo method and demonstrate pathways for the closed-open conformational transition in both directions by computing trajectories that are shown upon the free energy landscape. The initial root-mean-square deviation (RMSD) between the open and closed forms of the subunit is 14.7 Å, and the final forms from our simulations reach an average RMSD of 3.6 Å from the target forms, closely matching the level of resolution of the coarse-grained model.
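As a rough illustration of what a coarse-grained elastic network looks like, the sketch below builds the connectivity (Kirchhoff) matrix of a Gaussian network model, one of the simplest elastic network variants. The abstract does not say which variant or cutoff was used, so the model choice, the 7.0 Å cutoff, the function name, and the toy coordinates are all assumptions.

```python
import math


def kirchhoff_matrix(coords, cutoff=7.0):
    """Build a Gaussian-network-model Kirchhoff matrix.

    coords: list of (x, y, z) positions in angstroms, one per
            coarse-grained site (for example, one per C-alpha atom).
    cutoff: assumed interaction distance; sites closer than this are
            connected by a spring of unit strength.
    """
    n = len(coords)
    gamma = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                # Off-diagonal entries mark connected pairs.
                gamma[i][j] = gamma[j][i] = -1.0
                # Diagonal entries count the springs at each site.
                gamma[i][i] += 1.0
                gamma[j][j] += 1.0
    return gamma


if __name__ == "__main__":
    # Toy straight chain with 3.8 angstrom spacing, not a real structure.
    chain = [(3.8 * k, 0.0, 0.0) for k in range(4)]
    for row in kirchhoff_matrix(chain):
        print(row)
```

The eigenvectors of such a matrix with the smallest nonzero eigenvalues describe the slowest collective motions of the network; computing them requires a linear algebra library and is left out of this sketch.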
There are several hundred million protein sequences, but the relationships among them are not fully available from existing homolog detection methods. There is an essential need for an improved method to push homolog detection to lower levels of sequence identity. The method used here relies on a language model to represent proteins numerically in a matrix (an embedding) and uses discrete cosine transforms to compress the data to extract the most essential part, significantly reducing the data size. This PRotein Ortholog Search Tool (PROST) is significantly faster with linear runtimes, and most importantly, computes the distances between pairs of protein sequences to yield homologs at significantly lower levels of sequence identity than was previously possible. The extent of allosteric effects in proteins underscores the importance of global aspects of structure and sequence. PROST excels at global homology detection but not at detecting local homologs. Results are validated by strong similarities between the corresponding pairs of structures. The number of remote homologs detected increased significantly and pushes the effective sequence matches more deeply into the twilight zone. Human protein sequences presently having no assigned function now find significant numbers of putative homologs for 93% of cases and structurally verified assigned functions for 76.4% of these cases. The data compression enables massive searches for homologs with short search times while yielding significant gains in the numbers of remote homologs detected. The method is sufficiently efficient to permit whole-genome/proteome comparisons. The PROST web server is accessible at https://mesihk.github.io/prost .
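The idea of compressing an embedding with a discrete cosine transform can be sketched as follows. This is an illustration under stated assumptions, not the PROST implementation: the function names, the number of retained coefficients, the division by sequence length, and the L1 distance are all hypothetical choices, and real language-model embeddings have far more dimensions than the toy rows used here.

```python
import math


def compress_embedding(embedding, keep=3):
    """Compress an L x D embedding to a fixed-length vector of keep * D values.

    embedding: list of L rows (one per residue), each with D numbers.
    keep:      number of low-frequency DCT coefficients retained per
               dimension (an assumed value).
    """
    length = len(embedding)
    if length < keep:
        raise ValueError("sequence shorter than the number of kept coefficients")
    dims = len(embedding[0])
    compressed = []
    for d in range(dims):
        column = [row[d] for row in embedding]
        for k in range(keep):
            # Type-II DCT coefficient k along the sequence axis, divided
            # by L so that k = 0 is the column mean for any length.
            coeff = sum(
                x * math.cos(math.pi * (t + 0.5) * k / length)
                for t, x in enumerate(column)
            ) / length
            compressed.append(coeff)
    return compressed


def l1_distance(a, b):
    """Sum of absolute differences between two compressed vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    # Invented toy embeddings of different lengths (L = 4 and L = 6, D = 2).
    seq_a = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6]]
    seq_b = [[0.1, 0.9], [0.1, 0.8], [0.2, 0.8], [0.3, 0.7], [0.4, 0.7], [0.4, 0.6]]
    print(l1_distance(compress_embedding(seq_a), compress_embedding(seq_b)))
```

Because only the first few coefficients are kept, sequences of different lengths map to vectors of the same size, so a pairwise distance reduces to a fixed-cost comparison.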
The essential aspects of the ribosome’s mechanism can be extracted from coarse-grained simulations, including the ratchet motion, the movement together of critical bases at the decoding center, as well as movements of the peptide tunnel lining that assist in the expulsion of the synthesized peptide. Because the ribosome is so large, coarse-graining helps to simplify it and aids in understanding its mechanism. Results presented here utilize coarse-grained elastic network modeling to extract the dynamics, and both RNAs and proteins are coarse-grained. We review our previous results, showing the well-known ratchet motions and the motions in the peptide tunnel and in the mRNA tunnel. The motions of the lining of the peptide tunnel appear to assist in the expulsion of the growing peptide chain, and clamps with three proteins at the ends of the mRNA tunnel ensure that the mRNA is held tightly during decoding and are essential for the helicase activity at the entrance. The entry clamp may also assist in base recognition to ensure proper selection of the incoming tRNA. The overall precision with which the ribosome operates as a machine is remarkable.