Abstract: Recent studies have shown that explicit solvent molecular dynamics (MD) simulation followed by structural averaging can consistently improve protein structure models. In this study, we investigate the origin of improvements from averaging. We first show that improvement upon averaging is not limited to explicit water MD simulation, as consistent improvements are also observed for more efficient implicit solvent MD or Monte Carlo minimization simulations. We next examine the changes in model accuracy brought ab…
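As a rough illustration of what "structural averaging" means here, the sketch below superimposes each simulation snapshot onto a reference with the Kabsch algorithm and then takes the per-atom mean of the coordinates. It is a minimal sketch under stated assumptions, not the protocol used in the study: the function names (`kabsch_superpose`, `average_structure`) are hypothetical, snapshots are assumed to be plain N x 3 NumPy arrays of Cα coordinates, and no weighting, filtering, or post-averaging minimization is applied.

```python
import numpy as np


def kabsch_superpose(mobile, reference):
    """Rigidly fit `mobile` (N x 3) onto `reference` (N x 3) with the Kabsch algorithm."""
    mob_center = mobile.mean(axis=0)
    ref_center = reference.mean(axis=0)
    P = mobile - mob_center          # centered mobile coordinates
    Q = reference - ref_center       # centered reference coordinates
    H = P.T @ Q                      # 3 x 3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate the centered mobile coordinates, then move them onto the reference centroid.
    return P @ R.T + ref_center


def average_structure(frames, reference):
    """Superimpose every frame onto `reference` and return the mean coordinates."""
    fitted = [kabsch_superpose(frame, reference) for frame in frames]
    return np.mean(fitted, axis=0)


def rmsd(a, b):
    """Root-mean-square deviation between two already superimposed coordinate sets."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))
```

With snapshots that scatter randomly around a common structure, the averaged coordinates lie closer to that structure than a typical single snapshot does, which is the simplest statistical reason averaging can help. Averaged coordinates can have distorted local geometry, so they generally need some cleanup before use as a model.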
“…Model 1 from our human prediction has a GDT‐HA of 93 and Cα‐RMSD of 0.48 Å. Post analysis on this target indicated a complementary role of Rosetta and explicit water MD as observed previously: while Rosetta successfully recovered a critical error at the C‐terminus and side‐chain orientations around it, MD successfully brought the positions of backbones and side‐chains even closer to the crystal structure. Our MD refinement alone on the input model did not result in this level of accuracy (GDT‐HA of 83 and RMSD 1.6 Å).…”
Because proteins generally fold to their lowest free energy states, energy‐guided refinement in principle should be able to systematically improve the quality of protein structure models generated using homologous structure or co‐evolution derived information. However, because of the high dimensionality of the search space, there are far more ways to degrade the quality of a near native model than to improve it, and hence, refinement methods are very sensitive to energy function errors. In the 13th Critical Assessment of techniques for protein Structure Prediction (CASP13), we sought to carry out a thorough search for low energy states in the neighborhood of a starting model using restraints to avoid straying too far. The approach was reasonably successful in improving both regions largely incorrect in the starting models as well as core regions that started out closer to the correct structure. Models with GDT‐HA over 70 were obtained for five targets and for one of those, an accuracy of 0.5 Å backbone root‐mean‐square deviation (RMSD) was achieved. An important current challenge is to improve performance in refining oligomers and larger proteins, for which the search problem remains extremely difficult.
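The GDT‐HA and RMSD figures quoted above can be made concrete with a small sketch. GDT‐HA is commonly defined as the average, over distance cutoffs of 0.5, 1, 2 and 4 Å, of the percentage of Cα atoms lying within that cutoff of their position in the reference structure. The code below is a simplified illustration, not the official CASP evaluation: it assumes the model has already been superimposed on the native structure, whereas the standard calculation searches over many superpositions to maximize the count at each cutoff, so this version can only underestimate the official score. The function names are hypothetical.

```python
import numpy as np

# Distance cutoffs (in Å) used for the high-accuracy GDT variant.
GDT_HA_CUTOFFS = (0.5, 1.0, 2.0, 4.0)


def ca_rmsd(model, native):
    """Cα RMSD (Å) between two N x 3 coordinate arrays that are already superimposed."""
    diff = model - native
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


def gdt_ha_fixed_superposition(model, native):
    """GDT-HA-style score (0 to 100) for one fixed superposition.

    Simplification: the official score optimizes the superposition separately
    for each cutoff; here a single pre-computed superposition is assumed.
    """
    dist = np.linalg.norm(model - native, axis=1)      # per-residue Cα deviation
    fractions = [(dist <= cutoff).mean() for cutoff in GDT_HA_CUTOFFS]
    return 100.0 * float(np.mean(fractions))


# Toy example: four residues displaced by 0.25, 0.75, 1.5 and 3.0 Å along x.
native = np.zeros((4, 3))
model = native.copy()
model[:, 0] = [0.25, 0.75, 1.5, 3.0]
score = gdt_ha_fixed_superposition(model, native)      # 62.5
deviation = ca_rmsd(model, native)
```

In the toy example one, two, three and four residues fall within the successive cutoffs, giving (25 + 50 + 75 + 100) / 4 = 62.5. The two measures respond differently to errors: a single badly placed residue inflates RMSD, while GDT‐HA only counts it as missing from every cutoff.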
“…Core‐refinement is where recent progress has been found from MD‐based approaches. As we recently described, core‐refinement methods primarily improve residues that are nearly correct, but do not substantially improve regions with significant errors. Rosetta sampling methods could be useful for refining along the two other axes and so complement the limitations in core‐refinement approaches.…”
We report new Rosetta-based approaches to tackling the major issues that confound protein structure refinement, and the testing of these approaches in the CASP11 experiment. Automated refinement protocols were developed that integrate a range of sampling methods using parallel computation and multi-objective optimization. In CASP11, we used a more aggressive large scale structure rebuilding approach for poor starting models, and a less aggressive local rebuilding plus core refinement approach for starting models likely to be closer to the native structure. The more incorrectly modeled a structure was predicted to be, the more it was allowed to vary during refinement. The CASP11 experiment revealed strengths and weaknesses of the approaches: the high-resolution strategy incorporating local rebuilding with core refinement consistently improved starting structures, while the low-resolution strategy incorporating the reconstruction of large parts of the structures improved starting models in some cases but often considerably worsened them, largely because of model selection issues. Overall, the results suggest the high-resolution refinement protocol is a promising method orthogonal to other approaches, while the low-resolution refinement method clearly requires further development.
“…As described in detail previously, this success was attributed to a number of factors: extensive sampling with 30 × 20 ns = 600 ns per target under restraints to prevent large deviations from the initial models, a recently refined version of the CHARMM force field, generation of ensemble averages rather than selection of a single structure, and the use of quality assessment filters to remove decoy sets where scoring was likely not going to discriminate native‐like from non‐native structures. The averaging was especially important since it reproduces the ensemble averaging in experiment but also amplifies recurring native‐like features in a large set of structures over non‐native elements as discussed in detail recently. Achieving consistency in refining template‐based models was a significant milestone, but the extent of refinement remained rather modest with an average of 2.6 GDT‐HA units and a maximum improvement by 6.5 GDT‐HA units for model 1 submissions.…”
Protein structure refinement during CASP11 by the Feig group is described. Molecular dynamics simulations were used in combination with an improved selection and averaging protocol. On average, modest refinement was achieved with some targets improved significantly. Analysis of the CASP submission from our group focused on refinement success vs. amount of sampling, refinement of different secondary structure elements and whether refinement varied as a function of which group provided initial models. The refinement of local stereochemical features was examined via the MolProbity score and an updated protocol was developed that can generate high-quality structures with very low MolProbity scores for most starting structures with modest computational effort.