A general method to derive site-site or united-residue potentials is presented. The basic principle of the method is the separation of the degrees of freedom of a system into the primary and secondary ones. The primary degrees of freedom describe the basic features of the system, while the secondary ones are averaged over when calculating the potential of mean force, which is hereafter referred to as the restricted free energy (RFE) function. The RFE can be factored into one-, two-, and multibody terms, using the cluster-cumulant expansion of Kubo. These factors can be assigned the functional forms of the corresponding lowest-order nonzero generalized cumulants, which can, in most cases, be evaluated analytically, after making some simplifying assumptions. This procedure to derive coarse-grain force fields is very valuable when applied to multibody terms, whose functional forms are hard to deduce in another way (e.g., from structural databases). After the functional forms have been derived, they can be parametrized based on the RFE surfaces of model systems obtained from all-atom models or on the statistics derived from structural databases. The approach has been applied to our united-residue force field for proteins. Analytical expressions were derived for the multibody terms pertaining to the correlation between local and electrostatic interactions within the polypeptide backbone; these expressions correspond to up to sixth-order terms in the cumulant expansion of the RFE. These expressions were subsequently parametrized by fitting to the RFEs of selected peptide fragments, calculated with the empirical conformational energy program for peptides force field. 
The new multibody terms enable not only the heretofore predictable α-helical segments, but also regular β-sheets, to form as the lowest-energy structures, as assessed by test calculations on a model helical protein A, as well as a model 20-residue polypeptide (betanova); the latter was not possible without introducing these new terms.
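The relation between the RFE and its cumulant expansion can be illustrated numerically. The sketch below is not the UNRES implementation; it assumes a toy system in which the secondary degrees of freedom take a small set of equally weighted discrete states, and the function names are hypothetical. It compares the exact RFE, F = -(1/β) ln⟨exp(-βE)⟩, with the expansion truncated after the second cumulant, F ≈ ⟨E⟩ - (β/2)(⟨E²⟩ - ⟨E⟩²).

```python
import math


def restricted_free_energy(energies, beta):
    """Exact RFE for equally weighted secondary states (toy assumption).

    F = -(1/beta) * ln( mean_i exp(-beta * E_i) )
    The minimum energy is factored out for numerical stability.
    """
    e_min = min(energies)
    boltzmann_mean = sum(math.exp(-beta * (e - e_min)) for e in energies) / len(energies)
    return e_min - math.log(boltzmann_mean) / beta


def second_order_cumulant_rfe(energies, beta):
    """RFE truncated after the second cumulant: <E> - (beta/2) * Var(E)."""
    n = len(energies)
    mean = sum(energies) / n
    variance = sum((e - mean) ** 2 for e in energies) / n
    return mean - 0.5 * beta * variance


if __name__ == "__main__":
    toy_energies = [0.0, 1.0, 2.0, 3.0]  # arbitrary illustrative values
    for beta in (0.1, 0.5, 1.0):
        exact = restricted_free_energy(toy_energies, beta)
        approx = second_order_cumulant_rfe(toy_energies, beta)
        print(f"beta={beta}: exact={exact:.5f}, second-order={approx:.5f}")
```

At small β (high temperature) the two values agree closely; the gap grows as β increases, which is where the higher-order cumulants start to matter.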
We report the modification and parameterization of the united-residue (UNRES) force field for energy-based protein-structure prediction and protein-folding simulations. We tested the approach on three training proteins separately: 1E0L (β), 1GAB (α), and 1E0G (α + β). Heretofore, the UNRES force field had been designed and parameterized to locate native-like structures of proteins as global minima of their effective potential-energy surfaces, which largely neglected the conformational entropy because decoys composed of only lowest-energy conformations were used to optimize the force field. Recently, we developed a mesoscopic dynamics procedure for UNRES, and applied it with success to simulate protein folding pathways. However, the force field turned out to be largely biased towards α-helical structures in canonical simulations because the conformational entropy had been neglected in the parameterization. We applied the hierarchical optimization method developed in our earlier work to optimize the force field, in which the conformational space of a training protein is divided into levels each corresponding to a certain degree of native-likeness. The levels are ordered according to increasing native-likeness; level 0 corresponds to structures with no native-like elements and the highest level corresponds to the fully native-like structures. The aim of optimization is to achieve the order of the free energies of levels, decreasing as their native-likeness increases. The procedure is iterative, and decoys of the training protein(s) generated with the energy-function parameters of the preceding iteration are used to optimize the force field in the current iteration. We applied the multiplexing replica exchange molecular dynamics (MREMD) method, recently implemented in UNRES, to generate decoys; with this modification, conformational entropy is taken into account.
Moreover, we optimized the free-energy gaps between levels at temperatures corresponding to a predominance of folded or unfolded structures, as well as to structures at the putative folding-transition temperature, changing the sign of the gaps at the transition temperature. This enabled us to obtain force fields characterized by a single peak in the heat capacity at the transition temperature. Furthermore, we introduced temperature dependence to the UNRES force field; this is consistent with the fact that it is a free-energy and not a potential-energy function.
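The role of conformational entropy in the ordering of levels can be illustrated with a small sketch. It is not the optimization procedure of the paper: it assumes that each decoy is given as a hypothetical `(level, energy)` pair with energies in kcal/mol, and it estimates the free energy of a level as F = -RT ln Σ exp(-E/RT) over the decoys assigned to that level. The function names are illustrative only.

```python
import math
from collections import defaultdict

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)


def level_free_energies(decoys, temperature):
    """Free energy of each level from (level, energy) decoy pairs.

    F_level = -RT * ln( sum_i exp(-E_i / RT) ), summed over decoys in the level.
    """
    rt = R_KCAL * temperature
    by_level = defaultdict(list)
    for level, energy in decoys:
        by_level[level].append(energy)
    free = {}
    for level, energies in by_level.items():
        e_min = min(energies)  # factored out for numerical stability
        z = sum(math.exp(-(e - e_min) / rt) for e in energies)
        free[level] = e_min - rt * math.log(z)
    return free


def is_hierarchical(free):
    """True if free energy strictly decreases as the level index increases."""
    levels = sorted(free)
    return all(free[a] > free[b] for a, b in zip(levels, levels[1:]))


if __name__ == "__main__":
    # Toy decoy set: many non-native decoys (level 0), one native-like decoy (level 1).
    crowded = [(0, -10.0)] * 1000 + [(1, -12.0)]
    for t in (50.0, 300.0):
        f = level_free_energies(crowded, t)
        print(t, f, is_hierarchical(f))
```

In this toy set the native-like level has the lower energy, yet at the higher temperature the large number of level-0 decoys lowers the free energy of level 0 below that of level 1. Ordering by energy alone therefore does not guarantee the same ordering of free energies.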
Based on the dipole model of peptide groups developed in our earlier work [Liwo et al., Prot. Sci., 2, 1697 (1993)], a cumulant expansion of the average free energy of the system of freely rotating peptide‐group dipoles tethered to a fixed α‐carbon trace is derived. A graphical approach is presented to find all nonvanishing terms in the cumulants. In particular, analytical expressions for three‐ and four‐body (correlation) terms in the averaged interaction potential of united peptide groups are derived. These expressions are similar to the cooperative forces in hydrogen bonding introduced by Koliński and Skolnick [J. Chem. Phys., 97, 9412 (1992)]. The cooperativity arises here naturally from the higher order terms in the power‐series expansion (in the inverse of the temperature) for the average energy. Test calculations have shown that addition of the derived four‐body term to the statistical united‐residue potential of our earlier work [Liwo et al., J. Comput. Chem., 18, 849, 874 (1997)] greatly improves its performance in folding poly‐L‐alanine into an α‐helix. © 1998 John Wiley & Sons, Inc. J Comput Chem 19: 259–276, 1998
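How an inverse-temperature term emerges from orientational averaging can be seen in the simplest case of two point dipoles. The sketch below is a simplification and not the model of the paper: it assumes that both dipoles rotate isotropically in three dimensions (rather than being tethered to a chain) and uses reduced units in which the dipole-dipole energy is E = μ₁μ₂[n₁·n₂ - 3(n₁·r̂)(n₂·r̂)]/r³. The first cumulant ⟨E⟩ vanishes, so the leading term of the free energy is the second cumulant, -(β/2)⟨E²⟩, with ⟨E²⟩ = (2/3)μ₁²μ₂²/r⁶ (the Keesom result). All function names are hypothetical.

```python
import math


def dipole_pair_energy(n1, n2, rhat, mu1=1.0, mu2=1.0, r=1.0):
    """Point dipole-dipole energy in reduced units for unit orientation vectors."""
    dot12 = sum(a * b for a, b in zip(n1, n2))
    dot1r = sum(a * b for a, b in zip(n1, rhat))
    dot2r = sum(a * b for a, b in zip(n2, rhat))
    return mu1 * mu2 * (dot12 - 3.0 * dot1r * dot2r) / r ** 3


def unit_vectors(n_cos, n_phi):
    """Equal-weight midpoint grid of unit vectors, uniform in cos(theta) and phi."""
    vectors = []
    for i in range(n_cos):
        c = -1.0 + (2 * i + 1) / n_cos
        s = math.sqrt(1.0 - c * c)
        for j in range(n_phi):
            phi = 2.0 * math.pi * (j + 0.5) / n_phi
            vectors.append((s * math.cos(phi), s * math.sin(phi), c))
    return vectors


def mean_square_energy(n_cos=40, n_phi=8):
    """Orientational average <E^2> for unit dipoles at unit distance."""
    grid = unit_vectors(n_cos, n_phi)
    rhat = (0.0, 0.0, 1.0)
    total = sum(dipole_pair_energy(a, b, rhat) ** 2 for a in grid for b in grid)
    return total / len(grid) ** 2


def second_order_free_energy(beta, mu1=1.0, mu2=1.0, r=1.0):
    """Leading (second-cumulant) free-energy term: -(beta/2) * <E^2>."""
    scale = (mu1 * mu2 / r ** 3) ** 2
    return -0.5 * beta * scale * mean_square_energy()


if __name__ == "__main__":
    print("numerical <E^2>:", mean_square_energy(), "analytical:", 2.0 / 3.0)
```

The averaged interaction is attractive and proportional to β, that is, to the inverse of the temperature; terms coupling three or more dipoles appear only at higher orders of the expansion.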
We report the application of the hierarchical optimization method of protein potential-energy landscapes described in the accompanying papers (
The multibody terms pertaining to the correlation between backbone−local and backbone−electrostatic interactions in the UNRES force field for energy-based protein-structure prediction, developed in our laboratory, were reparametrized on the basis of the results of high-level ab initio calculations on relevant model systems. MP2/6-31G(d,p) ab initio calculations were carried out to evaluate the energy surfaces of pairs consisting of N-acetyl-N′-methylacetamide molecules (AcNHMe, which model a regular peptide group) and N-acetyl-N′,N′-dimethylacetamide molecules (AcNMe2, which model a peptide group preceding proline) at various intermolecular distances and orientations. For each pair, the calculated ab initio energy surface was subsequently fitted by a sum of Coulombic and Lennard-Jones components. Then, the restricted free-energy (RFE) surfaces of pairs of free peptide groups as well as the RFE factors corresponding to the coupling of backbone−local and backbone−electrostatic interactions in model tetrapeptides were calculated by numerical integration, with the use of the ab initio-fitted simplified energy functions and the ab initio energy maps of model terminally blocked amino acid residues calculated recently (Ołdziej, S.; Kozłowska, U.; Liwo, A.; Scheraga, H. A. J. Phys. Chem. B, in press, 2003). Next, analytical expressions based on Kubo's generalized cumulant theory from our previous work were fitted to the resulting RFE surfaces to parametrize the backbone−electrostatic and multibody terms in the UNRES force field. The computed coefficients of the cumulant-based expressions are different from those derived earlier, which had been based on the ECEPP/3 force field. To complete the force-field parametrization, the weights of the energy terms were determined, and the coefficients of the cumulant-based expressions were refined simultaneously by using our recently developed method of hierarchical optimization of a protein energy landscape using the protein 1IGD.
The resulting force field correctly predicted significant portions of the structures of α, β, and α + β proteins.
A unified coarse-grained model of three major classes of biological molecules (proteins, nucleic acids, and polysaccharides) has been developed. It is based on the observations that the repeated units of biopolymers (peptide groups, nucleic acid bases, sugar rings) are highly polar and their charge distributions can be represented crudely as point multipoles. The model is an extension of the united residue (UNRES) coarse-grained model of proteins developed previously in our laboratory. The respective force fields are defined as the potentials of mean force of biomacromolecules immersed in water, where all degrees of freedom not considered in the model have been averaged out. Reducing the representation to one center per polar interaction site leads to the representation of average site–site interactions as mean-field dipole–dipole interactions. Further expansion of the potentials of mean force of biopolymer chains into Kubo’s cluster-cumulant series leads to the appearance of mean-field dipole–dipole interactions, averaged in the context of local interactions within a biopolymer unit. These mean-field interactions account for the formation of regular structures encountered in biomacromolecules, e.g., α-helices and β-sheets in proteins, double helices in nucleic acids, and helicoidally packed structures in polysaccharides, which enables us to use a greatly reduced number of interacting sites without sacrificing the ability to reproduce the correct architecture. This reduction results in an extension of the simulation timescale by more than four orders of magnitude compared to the all-atom representation. Examples of the performance of the model are presented.
Figure: Components of the Unified Coarse Grained Model (UCGM) of biological macromolecules.
A method for optimizing potential-energy functions of proteins is proposed. The method assumes a hierarchical structure of the energy landscape, which means that the energy decreases as the number of native-like elements in a structure increases, being lowest for structures from the native family and highest for structures with no native-like element. A level of the hierarchy is defined as a family of structures with the same number of native-like elements (or degree of native likeness). Optimization of a potential-energy function is aimed at achieving such a hierarchical structure of the energy landscape by forcing appropriate free-energy gaps between hierarchy levels to place their energies in ascending order. This procedure is different from methods developed thus far, in which the energy gap and/or the Z score between the native structure and all non-native structures are maximized, regardless of the degree of native likeness of the non-native structures. The advantage of this approach lies in reducing the number of structures with decreasing energy, which should ensure the searchability of the potential. The method was tested on two proteins, PDB ID codes 1FSD and 1IGD, with an off-lattice united-residue force field. For 1FSD, the search of the conformational space with the use of the conformational space annealing method and the newly optimized potential-energy function found the native structure very quickly, as opposed to the potential-energy functions obtained by former optimization methods.
After even incomplete optimization, the force field obtained by using 1IGD located the native-like structures of two peptides, 1FSD and betanova (a designed three-stranded β-sheet peptide), as the lowest-energy conformations, whereas for the 46-residue N-terminal fragment of staphylococcal protein A, the native-like conformation was the second-lowest-energy conformation and had an energy 2 kcal/mol above that of the lowest-energy structure.
The prediction of protein structure solely on the basis of amino acid sequence and potential-energy function is one of the greatest challenges of contemporary computational biology and biophysics (1). This method is based on the physics of protein folding, namely on the thermodynamic hypothesis formulated by Anfinsen (2), according to which the native structure of a protein corresponds to the global minimum of its free energy under given conditions. Thus, protein structure prediction with ab initio methods is accomplished by a search for a conformation corresponding to the global minimum of an appropriate potential-energy function without the use of secondary structure prediction, homology modeling, threading, etc. The necessary condition for this approach to work is that the potential-energy function must locate the native structure of a protein as the one of lowest energy. Crippen and coworkers (3, 4) designed a method that optimized the potential-energy function to locate the native structures of selected training proteins as the lowest-energy structures. However, the condition of ...
We describe the application of our recently proposed method of hierarchical optimization of the protein energy landscape to optimize our off-lattice united-residue (UNRES) force field using single training proteins. First, the IgG-binding domain from streptococcal protein G (PDB code 1IGD) was treated; earlier attempts to use this protein to optimize the force field by optimizing the energy gap and Z score between the nativelike and non-native structures failed. The structure of this protein consists of an N-terminal antiparallel β-hairpin, a middle α-helix, and a C-terminal antiparallel β-hairpin, these elements being referred to as β1, α2, and β3, respectively, with the two hairpins forming a parallel β-sheet packed against the α-helix. In our earlier study, one of these elements was assumed to form at level 1, two at level 2, and three at level 3, and higher levels corresponded to the proper packing of two or more elements. This approach resulted in a structure with the wrong packing of the β-sheet, and attempts at further optimization failed. We therefore tried a hierarchy scheme that corresponds to the sequence of folding events deduced from NMR experiments. In this scheme, level 1 corresponds to structures with either β3 or α2, level 2 to structures with both β3 and α2, level 3 to structures with β3, α2, and the N-terminal strand packed against α2 (with β1 still not fully formed), and level 4 to structures with β1, α2, and β3, with β3 being packed to β1, which also implies the packing of β1 and β3 against α2. This optimization was successful and resulted in a reasonably transferable force field that led to well-foldable proteins. This corroborates the conclusion from our model on-lattice studies (Liwo, A.; Arłukowicz, P.; Ołdziej, S.; Czaplewski, C.; Makowski, M.; Scheraga, H. A. J. Phys. Chem. B 2004, 108, 16918) that a proper design of the structural hierarchy is of crucial importance to the foldability with the resulting potential-energy function. 
Moreover, in the off-lattice approach, the design of the hierarchy also appears to be important to the success of the optimization procedure itself. The next series of calculations was carried out with the LysM domain from the E. coli 1E0G (α + β) protein, which is smaller than 1IGD. In this case, no experimental information about the folding pathway is available; nevertheless, we were able to deduce the appropriate hierarchy by a trial-and-error method. The resulting force field performed worse in tests on α + β- and β-proteins than that derived on the basis of 1IGD with a correct hierarchy, which suggests that the structure of the 1IGD protein encodes more structure-determining interactions common to all proteins than the 1E0G protein does. For 1E0G, we also attempted to carry out a single energy gap and Z-score optimization; this effort resulted in an unsearchable force field. (The nativelike structures could not be found by a global search, although they were the lowest in energy). Technical details of the method, including the maintenance of proper seconda...