We describe an automated procedure for protein design, implemented in a flexible software package called Proteus. System setup and calculation of an energy matrix are done with the XPLOR modeling program and its sophisticated command language, which supports several force fields and solvent models. A second program provides algorithms to search sequence space. It allows the system to be decomposed into groups, which can be combined in different ways in the energy function, for both positive and negative design. The whole procedure can be controlled by editing 2-4 scripts. Two applications consider the tyrosyl-tRNA synthetase enzyme and its successful redesign to bind both O-methyl-tyrosine and D-tyrosine. For the latter, we present Monte Carlo simulations where the D-tyrosine concentration is gradually increased, displacing L-tyrosine from the binding pocket and yielding the binding free energy difference, in good agreement with experiment. A complete redesign of the Crk SH3 domain is also presented. The top 10,000 sequences are all assigned to the correct fold by the SUPERFAMILY library of Hidden Markov Models. Finally, we report the acid/base behavior of the SNase protein. Sidechain protonation is treated as a form of mutation; it is then straightforward to perform constant-pH Monte Carlo simulations, which yield good agreement with experiment. Overall, the software can be used for a wide range of applications, producing not only native-like sequences but also thermodynamic properties with errors that appear comparable to those of other current software packages.
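The two-stage idea described above (precompute an energy matrix, then search sequence space by table lookup) can be illustrated with a minimal Python sketch. This is not the Proteus code or its file format: the function names, the dictionary layout of the pair table, and the toy energies are all assumptions made for illustration, and the search shown is a plain Metropolis Monte Carlo walk.

```python
import itertools
import math
import random


def total_energy(state, self_e, pair_e):
    """Energy of one assignment: single-position terms plus all pair terms."""
    energy = sum(self_e[i][s] for i, s in enumerate(state))
    for (i, j), table in pair_e.items():
        energy += table[state[i]][state[j]]
    return energy


def delta_energy(state, pos, new, self_e, pair_e):
    """Energy change for switching position `pos` to state `new` (lookups only)."""
    old = state[pos]
    delta = self_e[pos][new] - self_e[pos][old]
    # Simple scan over all stored pairs; only pairs involving `pos` contribute.
    for (i, j), table in pair_e.items():
        if i == pos:
            delta += table[new][state[j]] - table[old][state[j]]
        elif j == pos:
            delta += table[state[i]][new] - table[state[i]][old]
    return delta


def metropolis_search(self_e, pair_e, n_steps, kT, seed=0):
    """Metropolis walk over discrete states; returns the lowest energy seen."""
    rng = random.Random(seed)
    state = [rng.randrange(len(row)) for row in self_e]
    energy = total_energy(state, self_e, pair_e)
    best_energy, best_state = energy, list(state)
    for _ in range(n_steps):
        pos = rng.randrange(len(state))
        new = rng.randrange(len(self_e[pos]))
        delta = delta_energy(state, pos, new, self_e, pair_e)
        if delta <= 0.0 or rng.random() < math.exp(-delta / kT):
            state[pos] = new
            energy += delta
            if energy < best_energy:
                best_energy, best_state = energy, list(state)
    return best_energy, best_state


# Hypothetical toy system (made-up numbers): 3 positions, 2 states each.
# Pair tables are keyed by (i, j) with i < j and indexed [state_i][state_j].
SELF_E = [[0.0, 1.0], [0.5, 0.0], [0.2, 0.3]]
PAIR_E = {
    (0, 1): [[0.0, -1.0], [0.4, 0.0]],
    (0, 2): [[0.1, 0.0], [0.0, -0.5]],
    (1, 2): [[0.0, 0.2], [-0.3, 0.0]],
}

best_energy, best_state = metropolis_search(SELF_E, PAIR_E, n_steps=5000, kT=0.6, seed=1)
brute_min = min(
    total_energy(s, SELF_E, PAIR_E) for s in itertools.product(range(2), repeat=3)
)
print(best_state, best_energy, brute_min)
```

In this sketch a "state" at each position stands for any discrete choice (residue type, rotamer, or both). Because every move is scored from stored tables, the cost of a step does not depend on how expensive the original energy calculation was, which is the point of separating the two stages.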
Titratable residues determine the acid/base behavior of proteins, strongly influencing their function; in addition, proton binding is a valuable reporter on electrostatic interactions. We describe a method for pKa calculations, using constant-pH Monte Carlo (MC) simulations to explore the space of sidechain conformations and protonation states, with an efficient and accurate generalized Born (GB) model for the solvent effects. To overcome the many-body dependency of the GB model, we use a "Native Environment" approximation, whose accuracy is shown to be good. It allows the precalculation and storage of interactions between all sidechain pairs, a strategy borrowed from computational protein design, which makes the MC simulations themselves very fast. The method is tested for 12 proteins and 167 titratable sidechains. It gives an rms error of 1.1 pH units, similar to the trivial "Null" model. The only adjustable parameter is the protein dielectric constant. The best accuracy is achieved for values between 4 and 8, a range that is physically plausible for a protein interior. For sidechains with large pKa shifts (≥2), the rms error is 1.6, compared to 2.5 with the Null model and 1.5 with the empirical PROPKA method.
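A single-site Python sketch may help make the constant-pH MC idea concrete. It is not the method of the abstract, which samples many interacting sidechains with GB energies. Here one isolated titratable site flips between protonated and deprotonated states, and the environment is reduced to one hypothetical number, `env_shift_kt` (in units of kT). All function and parameter names are assumptions. For such a site the sampled protonated fraction should approach the Henderson-Hasselbalch value, which gives a simple check.

```python
import math
import random

LN10 = math.log(10.0)


def protonation_energy(ph, pka_model, env_shift_kt=0.0):
    """Free energy (in kT) of the protonated state relative to the deprotonated one.

    pka_model is the model-compound pKa; env_shift_kt is a hypothetical
    stand-in for the protein environment's contribution.
    """
    return LN10 * (ph - pka_model) + env_shift_kt


def analytic_fraction(ph, pka_model, env_shift_kt=0.0):
    """Boltzmann (Henderson-Hasselbalch) protonated fraction for one site."""
    return 1.0 / (1.0 + math.exp(protonation_energy(ph, pka_model, env_shift_kt)))


def sampled_fraction(ph, pka_model, env_shift_kt=0.0, n_steps=50000, seed=0):
    """Metropolis sampling of the two protonation states at fixed pH."""
    rng = random.Random(seed)
    e_prot = protonation_energy(ph, pka_model, env_shift_kt)
    protonated = False
    count = 0
    for _ in range(n_steps):
        # Energy change (in kT) for flipping the current protonation state.
        delta = -e_prot if protonated else e_prot
        if delta <= 0.0 or rng.random() < math.exp(-delta):
            protonated = not protonated
        count += protonated
    return count / n_steps


# One pH unit above the pKa, roughly 1 site in 11 should be protonated.
print(analytic_fraction(5.0, 4.0), sampled_fraction(5.0, 4.0))
```

Under these assumptions the effective pKa of the site is `pka_model - env_shift_kt / LN10`. With `env_shift_kt = 0` the sketch reduces to predicting the model-compound pKa for every site, which is how the "Null" model mentioned above is usually understood.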