We describe an automated procedure for protein design, implemented in a flexible software package called Proteus. System setup and calculation of an energy matrix are done with the XPLOR modeling program and its sophisticated command language, which supports several force fields and solvent models. A second program provides algorithms to search sequence space. It allows the system to be decomposed into groups, which can be combined in different ways in the energy function, for both positive and negative design. The whole procedure can be controlled by editing 2-4 scripts. Two applications consider the tyrosyl-tRNA synthetase enzyme and its successful redesign to bind both O-methyl-tyrosine and D-tyrosine. For the latter, we present Monte Carlo simulations where the D-tyrosine concentration is gradually increased, displacing L-tyrosine from the binding pocket and yielding the binding free energy difference, in good agreement with experiment. We also present a complete redesign of the Crk SH3 domain. The top 10,000 sequences are all assigned to the correct fold by the SUPERFAMILY library of Hidden Markov Models. Finally, we report the acid/base behavior of the SNase protein. Sidechain protonation is treated as a form of mutation; it is then straightforward to perform constant-pH Monte Carlo simulations, which yield good agreement with experiment. Overall, the software can be used for a wide range of applications, producing not only native-like sequences but also thermodynamic properties with errors that appear comparable to those of other current software packages.
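The two-stage idea described above (precompute an energy matrix, then search sequence space with Monte Carlo) can be illustrated with a minimal sketch. This is not the Proteus implementation: the function names, the data layout (`single` for one-body terms, `pair` for pairwise terms), and the temperature value `kT=0.6` are all assumptions made for illustration only.

```python
import math
import random


def total_energy(state, single, pair):
    """Sum one-body and pairwise terms for one assignment of states to positions.

    single[i][a]       : hypothetical one-body energy of state a at position i
    pair[(i, j)][a][b] : hypothetical pair energy of states a, b at positions i < j
    """
    energy = sum(single[i][state[i]] for i in range(len(state)))
    for (i, j), table in pair.items():
        energy += table[state[i]][state[j]]
    return energy


def metropolis_search(single, pair, n_steps=5000, kT=0.6, seed=0):
    """Metropolis Monte Carlo over discrete states, tracking the best state seen.

    The full energy is recomputed at every step for clarity; a production
    code would update only the terms involving the changed position.
    """
    rng = random.Random(seed)
    n = len(single)
    state = [rng.randrange(len(single[i])) for i in range(n)]
    energy = total_energy(state, single, pair)
    best_state, best_energy = list(state), energy

    for _ in range(n_steps):
        # Propose a new state (a "mutation" or rotamer change) at one position.
        i = rng.randrange(n)
        trial = list(state)
        trial[i] = rng.randrange(len(single[i]))
        trial_energy = total_energy(trial, single, pair)
        delta = trial_energy - energy

        # Metropolis criterion: always accept downhill moves,
        # accept uphill moves with probability exp(-delta / kT).
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            state, energy = trial, trial_energy
            if energy < best_energy:
                best_state, best_energy = list(state), energy

    return best_state, best_energy
```

Under the same assumptions, a protonation state could be handled as just another entry in the per-position state list, which is the sense in which protonation is treated as a form of mutation.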
Computational protein design will continue to improve as new implementations and parameterizations are explored. An automated protein design procedure is implemented and applied to the full redesign of 16 globular proteins. We combine established but simple ingredients: a molecular mechanics description of the protein where nonpolar hydrogens are implicit, a simple solvent model, a folded state where the backbone is fixed, and a tripeptide model of the unfolded state. Sequences are selected to optimize the folding free energy, using a simple heuristic algorithm to explore sequence and conformational space. We show that a balanced parameterization, obtained here and in our previous work, makes this procedure effective, despite the simplicity of the ingredients. Calculations were done using our Proteins@Home distributed computing platform, with the help of several thousand volunteers. We describe the software implementation, the optimization of selected terms in the energy function, and the performance of the method. We allowed all amino acids to mutate except glycines, prolines, and cysteines. For 15 of the 16 test proteins, the scores of the computed sequences were comparable to those of natural homologues. Using the low-energy computed sequences in a BLAST search of the SWISSPROT database, we could retrieve natural sequences for all protein families considered, with no high-ranking false positives. The good stability of the designed sequences was supported by molecular dynamics simulations of selected sequences, which gave structures close to the experimental native structure.
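The scoring idea in this abstract, a folded-state energy minus an unfolded-state contribution that depends only on amino acid type, can be sketched as follows. The abstract does not specify the heuristic search, so a plain greedy sweep is used here as a stand-in. All names (`folding_score`, `greedy_design`, `reference`, `folded_energy`) are hypothetical, and the per-type reference energies stand in for whatever the tripeptide model would provide.

```python
# Residue types held fixed, as stated in the abstract.
FROZEN = {"G", "P", "C"}


def mutable_positions(native):
    """Indices of positions allowed to mutate (everything except Gly, Pro, Cys)."""
    return [i for i, aa in enumerate(native) if aa not in FROZEN]


def folding_score(seq, folded_energy, reference):
    """Folded-state energy minus a sum of per-type unfolded reference energies.

    folded_energy : hypothetical callable returning the folded-state energy
    reference     : hypothetical dict mapping amino acid type to its
                    unfolded-state (reference) energy
    """
    return folded_energy(seq) - sum(reference[aa] for aa in seq)


def greedy_design(native, alphabet, folded_energy, reference, max_sweeps=10):
    """Greedy stand-in for the heuristic: accept any single mutation that lowers the score."""
    seq = list(native)
    best = folding_score(seq, folded_energy, reference)
    positions = mutable_positions(native)

    for _ in range(max_sweeps):
        improved = False
        for i in positions:
            for aa in alphabet:
                if aa == seq[i]:
                    continue
                trial = list(seq)
                trial[i] = aa
                score = folding_score(trial, folded_energy, reference)
                if score < best:
                    seq, best = trial, score
                    improved = True
        if not improved:
            break  # no single mutation improves the score further

    return "".join(seq), best
```

A toy call, with a made-up energy function that simply rewards leucine, shows the frozen positions being left alone:

```python
toy_energy = lambda s: -sum(1.0 for aa in s if aa == "L")
toy_reference = {"G": 0.0, "A": 0.0, "P": 0.0, "L": 0.0}
designed, score = greedy_design("GAPA", "AL", toy_energy, toy_reference)
```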