A unified coarse-grained model of three major classes of biological molecules (proteins, nucleic acids, and polysaccharides) has been developed. It is based on the observations that the repeated units of biopolymers (peptide groups, nucleic acid bases, sugar rings) are highly polar and their charge distributions can be represented crudely as point multipoles. The model is an extension of the united residue (UNRES) coarse-grained model of proteins developed previously in our laboratory. The respective force fields are defined as the potentials of mean force of biomacromolecules immersed in water, where all degrees of freedom not considered in the model have been averaged out. Reducing the representation to one center per polar interaction site leads to the representation of average site–site interactions as mean-field dipole–dipole interactions. Further expansion of the potentials of mean force of biopolymer chains into Kubo’s cluster-cumulant series leads to the appearance of mean-field dipole–dipole interactions, averaged in the context of local interactions within a biopolymer unit. These mean-field interactions account for the formation of regular structures encountered in biomacromolecules, e.g., α-helices and β-sheets in proteins, double helices in nucleic acids, and helicoidally packed structures in polysaccharides, which enables us to use a greatly reduced number of interacting sites without sacrificing the ability to reproduce the correct architecture. This reduction results in an extension of the simulation timescale by more than four orders of magnitude compared to the all-atom representation. Examples of the performance of the model are presented.
Figure: Components of the Unified Coarse Grained Model (UCGM) of biological macromolecules
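As a point of reference for the dipole–dipole language used in this abstract, the following minimal sketch evaluates the textbook electrostatic energy of two point dipoles in reduced units. It is not the mean-field term of the unified model, which the abstract describes as an average over the degrees of freedom that are not represented explicitly. The function name `dipole_dipole_energy` and the unit prefactor are assumptions made for illustration.

```python
import math

# Assumption: reduced units, so the electrostatic prefactor 1/(4*pi*eps0) is 1.
COULOMB_CONSTANT = 1.0


def dipole_dipole_energy(mu1, mu2, r_vec):
    """Energy of two point dipoles mu1 and mu2 separated by the vector r_vec.

    E = k * [mu1.mu2 - 3 (mu1.r_hat)(mu2.r_hat)] / r^3
    """
    r = math.sqrt(sum(c * c for c in r_vec))
    if r == 0.0:
        raise ValueError("the two dipoles must not coincide")
    r_hat = [c / r for c in r_vec]
    dot12 = sum(a * b for a, b in zip(mu1, mu2))
    dot1r = sum(a * b for a, b in zip(mu1, r_hat))
    dot2r = sum(a * b for a, b in zip(mu2, r_hat))
    return COULOMB_CONSTANT * (dot12 - 3.0 * dot1r * dot2r) / r ** 3


# Head-to-tail alignment is attractive; parallel side-by-side is repulsive.
head_to_tail = dipole_dipole_energy((1, 0, 0), (1, 0, 0), (2, 0, 0))
side_by_side = dipole_dipole_energy((1, 0, 0), (1, 0, 0), (0, 2, 0))
```

The strong orientation dependence of this expression is what allows a small number of dipolar sites to favor regular arrangements; the averaged, temperature-dependent form used in the actual force field is not reproduced here.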
We present the results for CAPRI Round 46, the third joint CASP‐CAPRI protein assembly prediction challenge. The Round comprised a total of 20 targets including 14 homo‐oligomers and 6 heterocomplexes. Eight of the homo‐oligomer targets and one heterodimer comprised proteins that could be readily modeled using templates from the Protein Data Bank, often available for the full assembly. The remaining 11 targets comprised 5 homodimers, 3 heterodimers, and two higher‐order assemblies. These were more difficult to model, as their prediction mainly involved “ab‐initio” docking of subunit models derived from distantly related templates. A total of ~30 CAPRI groups, including 9 automatic servers, submitted on average ~2000 models per target. About 17 groups participated in the CAPRI scoring rounds, offered for most targets, submitting ~170 models per target. The prediction performance, measured by the fraction of models of acceptable quality or higher submitted across all predictor groups, was very good to excellent for the nine easy targets. Poorer performance was achieved by predictors for the 11 difficult targets, with medium and high quality models submitted for only 3 of these targets. A similar performance “gap” was displayed by scorer groups, highlighting yet again the unmet challenge of modeling the conformational changes of the protein components that occur upon binding or that must be accounted for in template‐based modeling. Our analysis also indicates that residues in binding interfaces were less well predicted in this set of targets than in previous Rounds, providing useful insights for directions of future improvements.
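The performance measure named in this abstract, the fraction of submitted models of acceptable quality or higher, can be sketched as follows. The quality labels follow the usual CAPRI ranking (incorrect, acceptable, medium, high); the function name and the sample labels are hypothetical and do not come from Round 46 data.

```python
# Ordered CAPRI-style quality classes; a higher rank means a better model.
QUALITY_RANK = {"incorrect": 0, "acceptable": 1, "medium": 2, "high": 3}


def fraction_acceptable_or_better(labels):
    """Fraction of models whose quality is 'acceptable' or higher."""
    if not labels:
        return 0.0
    threshold = QUALITY_RANK["acceptable"]
    hits = sum(1 for label in labels if QUALITY_RANK[label] >= threshold)
    return hits / len(labels)


# Hypothetical labels for the models submitted for one target.
example_labels = ["incorrect", "acceptable", "high", "incorrect"]
example_fraction = fraction_acceptable_or_better(example_labels)
```

Pooling the labels of all predictor groups for a target before calling the function gives the per-target figure described in the text.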
The performance of the physics-based protocol, whose main component is the United Residue (UNRES) physics-based coarse-grained force field, developed in our laboratory for the prediction of protein structure from amino acid sequence, is illustrated. Candidate models are selected, based on probabilities of the conformational families determined by multiplexed replica-exchange simulations, from the 10th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP10). For target T0663, classified as a new fold, which consists of two α + β domains homologous to those of known proteins, UNRES predicted the correct symmetry of packing, in which the domains are rotated with respect to each other by 180° in the experimental structure. By contrast, models obtained by knowledge-based methods, in which each domain is modeled very accurately but not rotated, resulted in incorrect packing. Two UNRES models of this target were featured by the assessors. Correct domain packing was also predicted by UNRES for the homologous target T0644, which has a similar structure to that of T0663, except that the two domains are not rotated. Predictions for two other targets, T0668 and T0684_D2, are among the best ones by global distance test score. These results suggest that our physics-based method has substantial predictive power. In particular, it has the ability to predict domain-domain orientations, which is a significant advance in the state of the art.
protein folding | structure symmetry | multi-domain packing
Prediction of protein structures from amino acid sequence still remains an unsolved problem of computational biology.
Although, since the famous experiments by Anfinsen (1), it is known that a protein adopts the structure which is the (kinetically reachable) global minimum of the free energy of a system, it is not straightforward to implement this physical principle in practice because of the inaccuracy of existing force fields and because of the enormous difficulty of searching the conformational space of the system. Therefore, the most effective methods for protein-structure prediction nowadays are knowledge-based approaches, in which database information is incorporated explicitly into the procedure (2). These methods can be divided into three categories, namely, comparative (homology) modeling (3-5), in which the target sequence is compared with the sequences for which experimental structures are known and those structures are usually selected as candidate models for which the greatest similarity is observed; threading (6-8), in which the target sequence is superposed on structures from a database, and those which give the highest score (lowest pseudoenergy) are selected as candidate predictions; and, finally, the fragment-assembly or minithreading method developed by David Baker and colleagues (9, 10), in which the predicted structure is assembled from nine-residue fragments extracted from a protein-structure database, and knowledge- and physics-based filters are applied at each asse...
The UNited RESidue (UNRES) model of polypeptide chains is a coarse-grained model in which each amino-acid residue is reduced to two interaction sites, namely a united peptide group (p) located halfway between the two neighboring α-carbon atoms (Cαs), which serve only as geometrical points, and a united side chain (SC) attached to the respective Cα. Owing to this simplification, millisecond Molecular Dynamics simulations of large systems can be performed. While UNRES predicts overall folds well, it reproduces the details of local chain conformation with lower accuracy. Recently, we implemented new knowledge-based torsional potentials (Krupa et al. J. Chem. Theory Comput., 2013, 9, 4620–4632) that depend on the virtual-bond dihedral angles involving side chains: Cα ⋯ Cα ⋯ Cα ⋯ SC (τ(1)), SC ⋯ Cα ⋯ Cα ⋯ Cα (τ(2)), and SC ⋯ Cα ⋯ Cα ⋯ SC (τ(3)) in the UNRES force field. These potentials resulted in significant improvement of the simulated structures, especially in the loop regions. In this work, we introduce the physics-based counterparts of these potentials, which we derived from the all-atom energy surfaces of terminally-blocked amino-acid residues by Boltzmann integration over the angles λ(1) and λ(2) for rotation about the Cα ⋯ Cα virtual-bond angles and over the side-chain angles χ. The energy surfaces were, in turn, calculated by using the semiempirical AM1 method of molecular quantum mechanics. The entropy contribution was evaluated in the harmonic approximation from Hessian matrices. One-dimensional Fourier series in the respective virtual-bond-dihedral angles were fitted to the calculated potentials, and these expressions have been implemented in the UNRES force field. Basic calibration of the UNRES force field with the new potentials was carried out with eight training proteins, by selecting the optimal weight of the new energy terms and reducing the weight of the regular torsional terms. 
The force field was subsequently benchmarked with a set of 22 proteins not used in the calibration. The new potentials result in a decrease of the root-mean-square deviation of the average conformation from the respective experimental structure by 0.86 Å on average; however, improvement of up to 5 Å was observed for some proteins.
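The fitting step mentioned in this abstract, a one-dimensional Fourier series in a virtual-bond dihedral angle fitted to a calculated potential, can be illustrated with a minimal sketch. It assumes the potential is tabulated on a uniform grid over one full period, in which case the least-squares coefficients reduce to simple projections. The function names, the series order, and the synthetic potential are assumptions for illustration and are not the published UNRES parameters.

```python
import math


def uniform_grid(n_points):
    """Dihedral angles in [-pi, pi) on a uniform periodic grid."""
    return [-math.pi + 2.0 * math.pi * k / n_points for k in range(n_points)]


def fit_fourier(values, order):
    """Least-squares Fourier coefficients for samples on uniform_grid(len(values)).

    On a uniform periodic grid the sine and cosine basis functions are
    orthogonal, so each coefficient is a projection of the data.
    """
    m = len(values)
    if m <= 2 * order:
        raise ValueError("need more than 2 * order grid points")
    angles = uniform_grid(m)
    a0 = sum(values) / m
    a = [2.0 / m * sum(v * math.cos(n * t) for v, t in zip(values, angles))
         for n in range(1, order + 1)]
    b = [2.0 / m * sum(v * math.sin(n * t) for v, t in zip(values, angles))
         for n in range(1, order + 1)]
    return a0, a, b


def eval_fourier(a0, a, b, tau):
    """Evaluate U(tau) = a0 + sum_n [a_n cos(n tau) + b_n sin(n tau)]."""
    return a0 + sum(an * math.cos(n * tau) + bn * math.sin(n * tau)
                    for n, (an, bn) in enumerate(zip(a, b), start=1))


# Synthetic (made-up) potential used only to exercise the fit.
grid = uniform_grid(36)
values = [1.0 + 0.5 * math.cos(t) - 0.3 * math.sin(2.0 * t) for t in grid]
a0, a, b = fit_fourier(values, order=3)
```

For data on a non-uniform grid, or with per-point weights, the projections would have to be replaced by a general linear least-squares solve.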
The UNited RESidue (UNRES) coarse-grained model of polypeptide chains, developed in our laboratory, enables us to carry out millisecond-scale molecular-dynamics simulations of large proteins effectively. It performs well in ab initio predictions of protein structure, as demonstrated in the last Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP10). However, the resolution of the simulated structure is too coarse, especially in loop regions, which results from insufficient specificity of the model of local interactions. To improve the representation of local interactions, in this work we introduced new side-chain-backbone correlation potentials, derived from a statistical analysis of loop regions of 4585 proteins. To obtain sufficient statistics, we reduced the set of amino-acid-residue types to five groups, derived in our earlier work on structurally optimized reduced alphabets, based on a statistical analysis of the properties of amino-acid structures. The new correlation potentials are expressed as one-dimensional Fourier series in the virtual-bond-dihedral angles involving side-chain centroids. The weight of these new terms was determined by a trial-and-error method, in which Multiplexed Replica Exchange Molecular Dynamics (MREMD) simulations were run on selected test proteins. The best average root-mean-square deviations (RMSDs) of the calculated structures from the experimental structures below the folding-transition temperatures were obtained with the weight of the new side-chain-backbone correlation potentials equal to 0.57. The resulting conformational ensembles were analyzed in detail by using the Weighted Histogram Analysis Method (WHAM) and Ward's minimum-variance clustering. 
This analysis showed that the RMSDs from the experimental structures dropped by 0.5 Å on average, compared to simulations without the new terms, and the deviation of individual residues in the loop region of the computed structures from their counterparts in the experimental structures (after optimum superposition of the calculated and experimental structure) decreased by up to 8 Å. Consequently, the new terms improve the representation of local structure.
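The replica-exchange simulations referred to in this abstract rely on periodic attempts to swap conformations between replicas running at different temperatures. The sketch below shows the standard Metropolis acceptance rule for such a swap, assuming a temperature-independent energy function; a temperature-dependent force field calls for a modified criterion that is not shown here. The function names and the sample energies are hypothetical.

```python
import math
import random

# Boltzmann constant in kcal/(mol K).
BOLTZMANN_KCAL = 0.0019872041


def exchange_probability(energy_i, temp_i, energy_j, temp_j):
    """Metropolis probability of swapping the conformations of replicas i and j.

    p = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]), with beta = 1 / (k_B T).
    """
    beta_i = 1.0 / (BOLTZMANN_KCAL * temp_i)
    beta_j = 1.0 / (BOLTZMANN_KCAL * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_exchange(energy_i, temp_i, energy_j, temp_j, rng=random):
    """Return True if the swap is accepted for one random draw."""
    return rng.random() < exchange_probability(energy_i, temp_i, energy_j, temp_j)


# Hypothetical energies (kcal/mol) of replicas at 300 K and 310 K.
p_favorable = exchange_probability(-100.0, 300.0, -120.0, 310.0)
p_unfavorable = exchange_probability(-120.0, 300.0, -100.0, 310.0)
```

A swap that moves the lower-energy conformation to the colder replica is always accepted, while the reverse swap is accepted only with a Boltzmann-weighted probability, which keeps each replica sampling its own canonical ensemble.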
Despite years of intensive research, little is known about oligomeric structures present during Alzheimer’s disease (AD). Excess of amyloid beta (Aβ) peptides and their aggregation are the basis of the amyloid cascade hypothesis, which attempts to explain the causes of AD. Because of the intrinsically disordered nature of Aβ monomers and the high aggregation rate of oligomers, their structures are almost impossible to resolve using experimental methods. For this reason, we used a physics-based coarse-grained force field to extensively search the conformational space of the Aβ42 tetramer, which is believed to be the smallest stable Aβ oligomer and the most toxic one. The resulting structures were subsequently optimized, tested for stability, and compared with the proposed experimental fibril models, using molecular dynamics simulations in two popular all-atom force fields. Our results show that the Aβ42 tetramer can form polymorphic stable structures, which may explain different pathways of Aβ aggregation. The models obtained comprise the outer and core chains and, therefore, are significantly different from the structure of mature fibrils. We found that interaction with water is the reason why the tetramer is more compact and less dry inside than fibrils. Physicochemical properties of the proposed all-atom structures are consistent with the available experimental observations and theoretical expectations. Therefore, we provide possible models for further study and design of higher order oligomers.
Summary: Participating as the Cornell-Gdansk group, we have used our physics-based coarse-grained UNited RESidue (UNRES) force field to predict protein structure in the 11th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP11). Our methodology involved extensive multiplexed replica exchange simulations of the target proteins with a recently improved UNRES force field to provide better reproductions of the local structures of polypeptide chains. All simulations were started from fully extended polypeptide chains, and no external information was included in the simulation process except for weak restraints on secondary structure to enable us to finish each prediction within the allowed 3-week time window. Because of the simplified UNRES representation of polypeptide chains, the use of enhanced sampling methods, code optimization and parallelization, and sufficient computational resources, we were able to treat, for the first time, all 55 human prediction targets with sizes from 44 to 595 amino acid residues, the average size being 251 residues. Complete structures of six single-domain proteins were predicted accurately, with the highest accuracy being attained for T0769, for which the Cα RMSD was 3.8 Å for 97 residues of the experimental structure. Correct structures were also predicted for 13 domains of multi-domain proteins with accuracy comparable to that of the best template-based modeling methods. With further improvements of the UNRES force field that are now underway, our physics-based coarse-grained approach to protein-structure prediction will eventually reach global prediction capacity and, consequently, reliability in simulating protein structure and dynamics that are important in biochemical processes. Availability and Implementation: Freely available on the web at
By using the maximum likelihood method for force-field calibration recently developed in our laboratory, which is aimed at achieving the agreement between the simulated conformational ensembles of selected training proteins and the corresponding ensembles determined experimentally at various temperatures, the physics-based coarse-grained UNRES force field for simulations of protein structure and dynamics was optimized with seven small training proteins exhibiting a variety of secondary and tertiary structures. Four runs of optimization, in which the number of optimized force-field parameters was gradually increased, were carried out, and the resulting force fields were subsequently tested with a set of 22 α-, 12 β-, and 12 α + β-proteins not used in optimization. The variant in which energy-term weights, local, and correlation potentials, side-chain radii, and anisotropies were optimized turned out to be the most transferable and outperformed all previous versions of UNRES on the test set.