We describe a new computer algorithm for finding low-energy conformations of proteins. It is a chain-growth method that uses a heuristic bias function to help assemble a hydrophobic core. We call it the Core-directed chain Growth method (CG). We test the CG method on several well-known literature examples of HP lattice model proteins [in which proteins are modeled as sequences of hydrophobic (H) and polar (P) monomers], ranging from 20 to 64 monomers in two dimensions, and up to 88-mers in three dimensions. Previous nonexhaustive methods (Monte Carlo, a Genetic Algorithm, Hydrophobic Zippers, and Contact Interactions) have been tried on these same model sequences. CG is substantially better at finding the global optima and avoiding local optima, and it does so in comparable or shorter times. CG finds the global minimum energy of the longest HP lattice model chain for which the global optimum is known, a 3D 88-mer that previously had been reachable only by the CHCC complete search method. CG has the potential advantage that it should have nonexponential scaling with chain length. We believe this is a promising method for conformational searching in protein folding algorithms.

Keywords: chain growth algorithm; conformational searching; lattice model; protein folding
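To make the HP lattice model concrete, the following is a minimal sketch of how the energy of one conformation could be scored on a 2D square lattice. It assumes the standard HP convention, in which each pair of H monomers that are lattice neighbors but not adjacent along the chain contributes -1 to the energy. The function names (`is_valid_walk`, `hp_energy`) and the coordinate representation are illustrative assumptions, not taken from the CG implementation, and the sketch covers only the scoring of a given conformation, not the chain-growth search itself.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def is_valid_walk(coords: List[Coord]) -> bool:
    """Return True if coords form a connected, self-avoiding lattice walk."""
    # Self-avoidance: no lattice site may be occupied twice.
    if len(set(coords)) != len(coords):
        return False
    # Connectivity: consecutive monomers must sit on neighboring sites.
    return all(
        abs(x1 - x2) + abs(y1 - y2) == 1
        for (x1, y1), (x2, y2) in zip(coords, coords[1:])
    )


def hp_energy(sequence: str, coords: List[Coord]) -> int:
    """Score a 2D conformation: -1 per non-bonded H-H lattice contact.

    Assumption (hypothetical representation): `sequence` is a string of
    'H' and 'P', and `coords` gives the lattice site of each monomer.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if not is_valid_walk(coords):
        raise ValueError("coords must be a connected, self-avoiding walk")

    occupied = {pos: i for i, pos in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = occupied.get((x + dx, y + dy))
            # Skip empty sites, P monomers, and chain neighbors.
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts
```

For example, the 4-mer `HPPH` folded around a unit square, `[(0, 0), (1, 0), (1, 1), (0, 1)]`, brings its two H monomers into contact and scores -1, whereas any fully extended chain scores 0.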
Reprint requests to: Ken A. Dill, Department of Pharmaceutical Chemistry, Box 1204, University of California, San Francisco, California 94143-1204; e-mail: dill@maxwell.ucsf.edu.

The conformational search problem

There have been many important advances on the road to developing a computer protein folding algorithm (Levitt & Warshel, 1975; Kuntz et al., 1976; Wilson & Doniach, 1989; Skolnick & Kolinski, 1990; Covell, 1992, 1994; Sippl et al., 1992; Vajda et al., 1993; Hinds & Levitt, 1994; Kolinski & Skolnick, 1994; Monge et al., 1994; Wallqvist et al., 1994; Boczko & Brooks, 1995; Srinivasan & Rose, 1995; Sun et al., 1995; Yue & Dill, 1996). In order to devise a computer method that can predict the native structure of a protein from its amino acid sequence alone, it is necessary to have an adequate energy function applied to an appropriate chain representation and searched with a fast conformational search method. Currently, the most popular conformational search methods are Molecular Dynamics (MD) and Monte Carlo (MC) and its variants, simulated annealing and genetic algorithms. But these conformational search methods are too slow and "inefficient"; that is, they get stuck in energy traps and are unable to reach the global minima of their energy functions in a reasonable amount of computer time (hours to weeks on workstations). Here we describe a method that improves on the speed and efficiency of existing search methods.

The main problem in developing a conformational search strategy for protein folding is that the energy landscape is large, and sometimes rugged, and we seek the global minima (rather than local minima), of which there are an exceedingly small number. We are searching for a needle in a haystack. The success of a search strategy can be judged ...