We present a fast method for finding optimal parameters for a low-resolution (threading) force field intended to distinguish correct from incorrect folds for a given protein sequence. In contrast to other methods, the parameterization uses information from >10^7 misfolded structures as well as a set of native sequence-structure pairs. In addition to testing the resulting force field's performance on the protein sequence threading problem, results are shown that characterize the number of parameters necessary for effective structure recognition.

Keywords: fold recognition; low-resolution force field; optimization algorithm; parameter determination; threading

Currently, there is no shortage of low-resolution, protein fold recognition force fields (Lemer et al., 1995; Sippl, 1995; Bohm, 1996; Jernigan & Bahar, 1996; Jones & Thornton, 1996; Sippl & Flockner, 1996; Torda, 1997). These are nearly all designed to tackle the threading problem, where a sequence is tested for compatibility with a series of structures and a pseudo-potential energy function is applied to find the most appropriate structure for some sequence.

It is not known whether a protein's fold can be explained simply by internal interactions or whether it is the result of complex interplay with the environment and folding history. Consequently, the optimal fold recognition function may not need to be based on real physical properties. Instead, it may simply reflect some common denominator among naturally expressed proteins (and solved structures).

Originally, it was seen as an achievement for a method to be able to recognize a sequence's native structure from a large number of wrong, decoy structures (Bowie et al., 1991; Jones et al., 1992). Since then, the problem of self-recognition seems to have become a minimal requirement (Defay & Cohen, 1996; Jones & Thornton, 1996). With this baseline, a new force field is probably only interesting if there is evidence of remarkable performance or some cunning innovation.
The work here may not satisfy either of these criteria, but it does have some interesting properties. There is no reliance on Boltzmann statistics (Jones et al., 1992) nor on any obvious physics. Rather than merely aim for self-recognition, the methodology optimizes the statistical significance of such recognition. This is based on the philosophy of defining a criterion for force field quality and then adjusting parameters to optimize this property (Seetharamulu & Crippen, 1991 khnovich, 1996; Ulrich et al., 1997). Next, the parameterization scheme includes the effect of structures generated by threading. Unlike earlier work (Ulrich et al., 1997), a tractable scheme has been devised whereby one can easily handle parameterization with more than 300 native structures and 10^7 misfolded alternative structures. Most importantly, the force field functional forms were chosen so that one could guarantee convergence using simple gradient-based optimization. Finally, the method was applied to give some estimate of the force field's "learning capacity," o...