The fastest simple, single-domain proteins fold a million times more rapidly than the slowest. Ultimately this broad kinetic spectrum is determined by the amino acid sequences that define these proteins, suggesting that the mechanisms that underlie folding may be almost as complex as the sequences that encode them. Here, however, we summarize recent experimental results that suggest that (1) despite a vast diversity of structures and functions, there are fundamental similarities in the folding mechanisms of single-domain proteins and (2) rather than being highly sensitive to the finest details of sequence, their folding kinetics are determined primarily by the large-scale, redundant features of sequence that determine a protein's gross structural properties. That folding kinetics can be predicted using simple, empirical, structure-based rules suggests that the fundamental physics underlying folding may be quite straightforward and that a general and quantitative theory of protein folding rates and mechanisms (as opposed to unfolding rates and thus protein stability) may be on the horizon.
To generate structures consistent with both the local and nonlocal interactions responsible for protein stability, 3- and 9-residue fragments of known structures with local sequences similar to the target sequence were assembled into complete tertiary structures using a Monte Carlo simulated annealing procedure (Simons et al., J Mol Biol 1997;268:209-225). The scoring function used in the simulated annealing procedure consists of sequence-dependent terms representing hydrophobic burial and specific pair interactions such as electrostatics and disulfide bonding, and sequence-independent terms representing hard sphere packing, alpha-helix and beta-strand packing, and the collection of beta-strands in beta-sheets (Simons et al., Proteins 1999;34:82-95). For each of 21 small, ab initio targets, 1,200 final structures were constructed, each the result of 100,000 attempted fragment substitutions. The five structures submitted for the CASP III experiment were chosen from the approximately 25 structures with the lowest scores in the broadest minima (assessed through the number of structural neighbors; Shortle et al., Proc Natl Acad Sci USA 1998;95:1158-1162). The results were encouraging: highlights of the predictions include a 99-residue segment for MarA with an rmsd of 6.4 Å to the native structure, a 95-residue (full length) prediction for the EH2 domain of EPS15 with an rmsd of 6.0 Å, a 75-residue segment of DNAB helicase with an rmsd of 4.7 Å, and a 67-residue segment of ribosomal protein L30 with an rmsd of 3.8 Å. These results suggest that ab initio methods may soon become useful for low-resolution structure prediction for proteins that lack a close homologue of known structure.
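The fragment-substitution procedure described above can be illustrated with a minimal sketch. This is not the authors' implementation. The torsion-angle representation, the randomly generated fragment library (`build_toy_library`), the placeholder scoring function (`toy_score`), and the linear cooling schedule are all assumptions made for illustration. The actual scoring terms (hydrophobic burial, pair interactions, strand and helix packing) and the fragment selection by local sequence similarity are not reproduced here.

```python
import math
import random


def build_toy_library(n_residues, sizes=(3, 9), per_position=5, seed=1):
    """Hypothetical fragment library: random (phi, psi) fragments per start position.

    A real library would hold fragments of known structures whose local
    sequences resemble the target; random torsions stand in for them here.
    """
    rng = random.Random(seed)
    library = {}
    for size in sizes:
        # Only allow start positions where the whole fragment fits in the chain.
        for start in range(n_residues - size + 1):
            for _ in range(per_position):
                fragment = [(rng.uniform(-180.0, 180.0), rng.uniform(-180.0, 180.0))
                            for _ in range(size)]
                library.setdefault(start, []).append(fragment)
    return library


def toy_score(conformation):
    """Placeholder score (lower is better): deviation from helical torsions."""
    return sum((phi + 60.0) ** 2 + (psi + 45.0) ** 2
               for phi, psi in conformation) / 1000.0


def anneal_fragments(n_residues, library, score, n_steps=10000,
                     t_start=2.0, t_end=0.1, seed=0):
    """Monte Carlo simulated annealing over fragment substitutions."""
    rng = random.Random(seed)
    # Start from an extended chain (all torsions at 180 degrees).
    conformation = [(180.0, 180.0)] * n_residues
    current = score(conformation)
    best, best_score = list(conformation), current
    positions = sorted(library)
    for step in range(n_steps):
        # Linear cooling from t_start to t_end (an assumed schedule).
        temperature = t_start + (t_end - t_start) * step / max(n_steps - 1, 1)
        start = rng.choice(positions)
        fragment = rng.choice(library[start])
        # Attempt a substitution: overwrite a window of torsions with the fragment.
        trial = list(conformation)
        trial[start:start + len(fragment)] = fragment
        delta = score(trial) - current
        # Metropolis criterion: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            conformation, current = trial, current + delta
            if current < best_score:
                best, best_score = list(conformation), current
    return best, best_score


if __name__ == "__main__":
    n = 20
    model, final_score = anneal_fragments(n, build_toy_library(n), toy_score)
    print(len(model), round(final_score, 2))
```

In the procedure summarized above, many independent runs of this kind would be performed per target (1,200 structures, each from 100,000 attempted substitutions), and the resulting low-scoring structures would then be compared with one another before a few are selected.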