A simple protein folding algorithm using a binary code and secondary structure constraints

Sun, Shenghuan; Thomas, Paul D.; Dill, Ken A.

doi:10.1093/protein/8.8.769

Cited by 108 publications

(73 citation statements)

References 0 publications

“…Such an effective energy function is the common prerequisite for all theoretical approaches to protein folding. Energy functions are often used to drive the conformational changes of a polypeptide chain as it folds through phase space (Wilson & Doniach, 1989; Covell, 1992, 1994; Sun, 1993; Bowie & Eisenberg, 1994; Dandekar & Argos, 1994; Wallqvist & Ullner, 1994; Vieth et al., 1994, 1995; Monge et al., 1995; Mumenthaler & Braun, 1995; Srinivasan & Rose, 1995; Sun et al., 1995). Alternatively, they are employed to discriminate amongst candidate (or "decoy") folds generated by methods that are independent (or semi-independent) of the energy function.…”

Section: Introduction

mentioning

confidence: 99%

Factors affecting the ability of energy functions to discriminate correct from incorrect folds

Park; Huang; Levitt

1997, Journal of Molecular Biology


Eighteen low and medium resolution empirical energy functions were tested for their ability to distinguish correct from incorrect folds from three test sets of decoy protein conformations. The energy functions included 13 pairwise potentials of mean force, covering a wide range of functional forms and methods of parameterization, four potentials that attempt to detect properly formed hydrophobic cores, and one environment-based potential. The first of the three test sets consists of large ensembles of plausible conformations for eight small proteins, all of which have correct native secondary structure and are reasonably compact. The second is the set of all subconformations in a database of known protein structures applied to the sequences in that database (ungapped threading). The third is a set of ensembles of 1000 conformations each for seven small proteins taken from molecular dynamics simulations at 298 K and 498 K. Our results show that there are functions effective for each challenge set; moreover, success in one test is no guarantee of success in another. We examine the factors that seem to be important for accurate discrimination of correct structures in each of the test sets, and note that extremely simple functions are often as effective as more complex functions. © 1997 Academic Press Limited

“…An alternative approach is first to sample conformational space as exhaustively as possible, given computational limits, then to apply a scoring function to assess the fitness of each candidate structure. In either case, reduction of the available conformational space is achieved by discretization on a lattice (Covell, 1992, 1994; Hinds & Levitt, 1992, 1994; Vieth et al., 1994) or sampling in torsion space (Wilson & Doniach, 1989; Bowie & Eisenberg, 1994; Dandekar & Argos, 1994; Monge et al., 1995; Mumenthaler & Braun, 1995; Srinivasan & Rose, 1995; Sun et al., 1995; Yue & Dill, 1996; Simons et al., 1997). However, reduction of the search space also decreases the fidelity with which the native fold can be represented.…”

mentioning

confidence: 99%

Distance geometry generates native‐like folds for small helical proteins using the consensus distances of predicted protein structures

Huang; Samudrala; Ponder

1998, Protein Science


For successful ab initio protein structure prediction, a method is needed to identify native-like structures from a set containing both native and non-native protein-like conformations. In this regard, the use of distance geometry has shown promise when accurate inter-residue distances are available. We describe a method by which distance geometry restraints are culled from sets of 500 protein-like conformations for four small helical proteins generated by the method of Simons et al. (1997). A consensus-based approach was applied in which every inter-Cα distance was measured, and the most frequently occurring distances were used as input restraints for distance geometry. For each protein, a structure with lower coordinate root-mean-square (RMS) error than the mean of the original set was constructed; in three cases the topology of the fold resembled that of the native protein. When the fold sets were filtered for the best scoring conformations with respect to an all-atom knowledge-based scoring function, the remaining subset of 50 structures yielded restraints of higher accuracy. A second round of distance geometry using these restraints resulted in an average coordinate RMS error of 4.38 Å.

Keywords: ab initio folding; distance geometry; energy functions; protein structure prediction

How the sequence of a polypeptide determines its three-dimensional structure remains one of the most important unanswered questions in molecular biology. So-called "ab initio" computational approaches seek the overall fold of the polypeptide, often by starting with a random or extended chain and searching for the most energetically favorable, or statistically probable, conformation for the sequence. An alternative approach is first to sample conformational space as exhaustively as possible, given computational limits, then to apply a scoring function to assess the fitness of each candidate structure. In either case, reduction of the available conformational space is achieved by discretization on a lattice (Covell, 1992, 1994; Hinds & Levitt, 1992, 1994; Vieth et al., 1994) or sampling in torsion space (Wilson & Doniach, 1989; Bowie & Eisenberg, 1994; Dandekar & Argos, 1994; Monge et al., 1995; Mumenthaler & Braun, 1995; Srinivasan & Rose, 1995; Sun et al., 1995; Yue & Dill, 1996; Simons et al., 1997). However, reduction of the search space also decreases the fidelity with which the native fold can be represented. This is problematic, since knowledge-based scoring functions that recognize native folds have difficulty separating near-native folds from non-native folds (Park et al., 1997). Whereas ab initio


“…A certain number of parameters is necessary to fit the general features of the data, but the number of parameters should be kept small to avoid over-fitting. In the case of fitting parameters for fold recognition force fields, literature estimates of the approximate number of parameters span the range from less than ten (Sun et al., 1995; Thomas & Dill, 1996) to tens of thousands (Hendlich et al., 1990; Jones & Thornton, 1993). We tried to explore the lower limit of the number of parameters in our fold recognition force field.…”

Section: Dependence On Number Of Parameters

mentioning

confidence: 99%

Protein fold recognition without Boltzmann statistics or explicit physical basis

Huber; Torda

1998, Protein Science


We present a fast method for finding optimal parameters for a low-resolution (threading) force field intended to distinguish correct from incorrect folds for a given protein sequence. In contrast to other methods, the parameterization uses information from >10⁷ misfolded structures as well as a set of native sequence-structure pairs. In addition to testing the resulting force field's performance on the protein sequence threading problem, results are shown that characterize the number of parameters necessary for effective structure recognition.

Keywords: fold recognition; low-resolution force field; optimization algorithm; parameter determination; threading

Currently, there is no shortage of low-resolution, protein fold recognition force fields (Lemer et al., 1995; Sippl, 1995; Bohm, 1996; Jernigan & Bahar, 1996; Jones & Thornton, 1996; Sippl & Flockner, 1996; Torda, 1997). These are nearly all designed to tackle the threading problem, where a sequence is tested for compatibility with a series of structures and a pseudo-potential energy function is applied to find the most appropriate structure for some sequence.

It is not known whether a protein's fold can be explained simply by internal interactions or whether it is the result of complex interplay with the environment and folding history. Consequently, the optimal fold recognition function may not need to be based on real physical properties. Instead, it may simply reflect some common denominator among naturally expressed proteins (and solved structures).

Originally, it was seen as an achievement for a method to be able to recognize a sequence's native structure from a large number of wrong, decoy structures (Bowie et al., 1991; Jones et al., 1992). Since then, the problem of self-recognition seems to have become a minimal requirement (Defay & Cohen, 1996; Jones & Thornton, 1996). With this baseline, a new force field is probably only interesting if there is evidence of remarkable performance or some cunning innovation. The work here may not satisfy either of these criteria, but it does have some interesting properties. There is no reliance on Boltzmann statistics (Jones et al., 1992) nor on any obvious physics. Rather than merely aim for self-recognition, the methodology optimizes the statistical significance of such recognition. This is based on the philosophy of defining a criterion for force field quality and then adjusting parameters to optimize this property (Seetharamulu & Crippen, 1991; … Shakhnovich, 1996; Ulrich et al., 1997). Next, the parameterization scheme includes the effect of structures generated by threading. Unlike earlier work (Ulrich et al., 1997), a tractable scheme has been devised whereby one can easily handle parameterization with more than 300 native structures and 10⁷ misfolded alternative structures. Most importantly, the force field functional forms were chosen so that one could guarantee convergence using simple gradient-based optimization. Finally, the method was applied to give some estimate of the force field's "learning capacity," o...
