TASSER: An automated method for the prediction of protein tertiary structures in CASP6

Zhang, Yang; Arakaki, Adrián K.; Skolnick, Jeffrey

doi:10.1002/prot.20724

Cited by 183 publications

(195 citation statements)

References 35 publications

Supporting

Mentioning

191

Contrasting


“…We remove proteins having more than one ligand in the binding pocket. Because proteins >400 residues cannot be modeled using TASSER (27)(28)(29) in a reasonable amount of computer time, these are excluded. No two proteins in the dataset share >35% sequence identity.…”

Section: Methods

mentioning

confidence: 99%
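The three exclusion rules in the statement above (more than one ligand in the pocket, more than 400 residues, more than 35% pairwise sequence identity) can be sketched as a greedy filter. This is only an illustration under stated assumptions: the `Entry` record, the function names, and the greedy ordering are hypothetical, and `toy_identity` is a position-by-position stand-in for a real alignment-based identity, not the procedure used by the citing authors.

```python
from typing import Callable, List, NamedTuple


class Entry(NamedTuple):
    """Hypothetical record for one candidate protein."""
    name: str
    sequence: str
    n_pocket_ligands: int


def toy_identity(a: str, b: str) -> float:
    """Toy stand-in for sequence identity: fraction of matching positions
    over the shorter sequence, with no alignment (an assumption)."""
    shorter = min(len(a), len(b))
    if shorter == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / shorter


def filter_dataset(
    entries: List[Entry],
    identity: Callable[[str, str], float] = toy_identity,
    max_len: int = 400,
    max_identity: float = 0.35,
) -> List[Entry]:
    """Greedily keep entries that pass all three rules from the quote."""
    kept: List[Entry] = []
    for entry in entries:
        if entry.n_pocket_ligands > 1:      # more than one ligand in the pocket
            continue
        if len(entry.sequence) > max_len:   # too long to model in reasonable time
            continue
        # Drop the entry if it is too similar to anything already kept.
        if any(identity(entry.sequence, k.sequence) > max_identity for k in kept):
            continue
        kept.append(entry)
    return kept
```

Because the filter is greedy, the surviving set depends on the input order; a production pipeline would more likely cluster sequences with a dedicated tool and pick one representative per cluster.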

“…FINDSITE also specifies the chemical properties of the ligands that likely occupy detected binding sites. To assess its validity, we use a representative set of proteins that are weakly homologous to their templates and generate models using two state-of-the-art programs for protein structure modeling: TASSER (27)(28)(29), and MODELLER9v1 (30,31). We demonstrate that FINDSITE operates satisfactorily in the "twilight zone" of sequence similarity (32), which covers roughly two-thirds of known protein sequences (30).…”

mentioning

confidence: 99%


A threading-based method (FINDSITE) for ligand-binding site prediction and functional annotation

Bryliński

Skolnick

2008

Proc. Natl. Acad. Sci. U.S.A.

Self Cite

303

325


The detection of ligand-binding sites is often the starting point for protein function identification and drug discovery. Because of inaccuracies in predicted protein structures, extant binding pocket-detection methods are limited to experimentally solved structures. Here, FINDSITE, a method for ligand-binding site prediction and functional annotation based on binding-site similarity across groups of weakly homologous template structures identified from threading, is described. For crystal structures, considering a cutoff distance of 4 Å as the hit criterion, the success rate is 70.9% for identifying the best of top five predicted ligand-binding sites with a ranking accuracy of 76.0%. Both high prediction accuracy and ability to correctly rank identified binding sites are sustained when approximate protein models (<35% sequence identity to the closest template structure) are used, showing a 67.3% success rate with 75.5% ranking accuracy. In practice, FINDSITE tolerates structural inaccuracies in protein models up to a rmsd from the crystal structure of 8-10 Å. This is because analysis of weakly homologous protein models reveals that about half have a rmsd from the native binding site <2 Å. Furthermore, the chemical properties of template-bound ligands can be used to select ligand templates associated with the binding site. In most cases, FINDSITE can accurately assign a molecular function to the protein model.

pocket detection | protein structure prediction | ligand screening

To date, although the genomes of >500 organisms have been sequenced (1, 2), the biological function of many identified genes/gene products is unknown. This rapid accumulation of protein sequences of unknown structure and function has motivated the development of proteome-scale protocols for protein structure and function prediction (3-5). The detection of ligand-binding sites is often a starting point for structure-based function identification. Knowledge of the ligand-binding site is also essential for structure-based drug discovery (6).

Existing approaches for ligand-binding site prediction can be roughly divided into sequence- and structure-based methods (see refs. 6 and 7). The main strength of sequence-based methods is their ability to identify a ligand-binding/interaction motif in proteins that may not have the same overall fold. However, motif-based searches frequently become ineffective if the binding region is nonlocal in sequence. Homology-based methods require related proteins with significant sequence identity to a protein in the Protein Data Bank (PDB) (8, 9) because the conservation of biochemical function drops rapidly for proteins sharing <35-40% sequence identity (10, 11). In that regard, a number of structure-based approaches have been developed to identify ligand-binding sites (6). Geometry-based methods locate binding residues by searching for cavities/pockets in a protein structure (12-15). Other methods consider blind docking of small molecules into the receptor structure (16, 17), calculate theoretical microscopic titr...
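The abstract's "best of top five" success rate with a 4 Å hit criterion can be illustrated with a short calculation. This is a sketch under assumptions, not the FINDSITE implementation: it assumes the distance is measured between a predicted site center and the observed ligand position, both given as 3D coordinates in Å, and the function and argument names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def top_n_success_rate(
    targets: Sequence[Tuple[List[Point], Point]],
    cutoff: float = 4.0,
    top_n: int = 5,
) -> float:
    """Fraction of targets for which any of the top_n ranked predicted
    sites lies within `cutoff` Å of the observed ligand position.

    Each target is (ranked_predicted_sites, observed_ligand_position).
    """
    if not targets:
        return 0.0
    hits = 0
    for ranked_sites, observed in targets:
        # A target counts as a hit if the best of the top_n sites is close enough.
        if any(math.dist(site, observed) <= cutoff for site in ranked_sites[:top_n]):
            hits += 1
    return hits / len(targets)
```

For example, a benchmark of two targets where only the first has a predicted site within 4 Å of the ligand gives a success rate of 0.5.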

“…In principle, an accurate energy function should always recognize near-native conformations and discriminate them from nonnative conformations. In practice, there are scoring function inaccuracies and structural clustering must be used by de novo structure prediction methods to identify native-like structures (2,4,13). This makes two assumptions: (i) that the native conformation should have more structural neighbors than any other conformation because of the loss in configurational entropy on folding; and (ii) that this near-native energy basin is detected by the knowledge-based scoring functions used in Rosetta in that the basin results from the long-range hydrophobic interactions associated with native globular proteins (17).…”

Section: Differences In Top Cluster

mentioning

confidence: 99%
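Assumption (i) in the statement above, that the native-like conformation has the most structural neighbors, suggests a simple selection rule: pick the conformation with the largest number of others within a distance cutoff. The sketch below assumes a precomputed symmetric pairwise distance matrix (for example rmsd values in Å); the function name and the cutoff are hypothetical, and this is not the clustering code of any of the cited programs.

```python
from typing import List, Tuple


def densest_conformation(dist: List[List[float]], cutoff: float) -> Tuple[int, int]:
    """Return (index, neighbor_count) of the conformation with the most
    neighbors within `cutoff`, given a pairwise distance matrix.

    Ties are resolved in favor of the lowest index.
    """
    counts = [
        sum(1 for j, d in enumerate(row) if j != i and d <= cutoff)
        for i, row in enumerate(dist)
    ]
    best = max(range(len(dist)), key=counts.__getitem__)
    return best, counts[best]
```

With four decoys where decoy 2 is within the cutoff of all three others, the function returns index 2 with three neighbors. Real pipelines usually iterate this step, removing the members of the top cluster and repeating to obtain lower-ranked clusters.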

“…Most widely used standard methods for de novo structure prediction are based on the variants of the Monte Carlo method (4)(5)(6) and are unable to explore low-energy regions efficiently because of the ruggedness of the potential energy surface. To overcome these problems, a number of generalized ensemble Monte Carlo methods have been developed (7)(8)(9)(10).…”

mentioning

confidence: 99%

“…For this reason, there are multiscale approaches that start with low-resolution or reduced-model energy functions and then use all-atom energy functions on a few selected conformations [often relying on additional steps such as use of sequence homologs (2) or clustering (3, 4)] been developed (4,6,12,13). These approaches often fail to generate low-resolution models within the "radius of convergence" (rmsd <3 Å) of the native state necessary for the success of subsequent full-atom refinement (2).…”

mentioning

confidence: 99%


Generalized ensemble methods for de novo structure prediction

Shmygelska

Levitt

2009

Proc. Natl. Acad. Sci. U.S.A.


Current methods for predicting protein structure depend on two interrelated components: (i) an energy function that should have a low value near the correct structure and (ii) a method for searching through different conformations of the polypeptide chain. Identification of the most efficient search methods is essential if we are to be able to apply such methods broadly and with confidence. In addition, efficient search methods provide a rigorous test of existing energy functions, which are generally knowl- By using a set of nonnative low-energy structures found by our extensive sampling, we discovered that the long-range and short-range backbone hydrogen-bonding energy terms of the Rosetta energy discriminate between the nonnative and native-like structures significantly better than the low-resolution score used in Rosetta.

conformational search | protein folding | Rosetta force field

Predicting the functional 3-dimensional structure (the native state) of a protein from its amino acid sequences is of central importance to structural and functional biology and has enormous applications in alleviating human disease. Even if the structures of all proteins were known, we would still not be able to answer questions related to diseases directly caused by protein misfolding, such as certain types of cancer and Alzheimer's and Parkinson disease. For this we would need to understand the physical basis of the energy terms that make the native state so special. Such understanding of the energetics of the system would also lead to more efficient and comprehensive drug design. Structure prediction depends on solving two problems: (i) describing the energy function with sufficient accuracy and (ii) searching the conformational space sufficiently well. These problems are particularly severe for proteins of biologically relevant lengths (>150 aa).

In this work we focus on conformational sampling, which has been recognized as the critical step in high-resolution structure prediction (1-3). Most widely used standard methods for de novo structure prediction are based on the variants of the Monte Carlo method (4-6) and are unable to explore low-energy regions efficiently because of the ruggedness of the potential energy surface. To overcome these problems, a number of generalized ensemble Monte Carlo methods have been developed (7-10). These methods strive to search energy space better by computing the density of states, sampling expanded ranges of temperatures, or computing other physical quantities affecting transitions between the states during the search. In particular, advanced methods such as Temperature Replica Exchange Monte Carlo (TREM) (8) and Hamiltonian Replica Exchange Monte Carlo (HREM) (10), have been shown to outperform standard Monte Carlo in terms of sampling for both simplified and all-atom force fields of small proteins (8,10,11).

For longer proteins, the computational cost and ruggedness of the all-atom energy function makes solving this problem particularly challenging as evidenced by the modest success of full...
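The abstract names temperature replica exchange Monte Carlo without stating its exchange rule. As a rough illustration, the standard Metropolis criterion for swapping the conformations of two replicas at temperatures T_i and T_j accepts with probability min(1, exp[(1/kT_i - 1/kT_j)(E_i - E_j)]). The sketch below implements only that textbook rule; the function names, the unit choice k = 1, and the use of Python's `random` module are assumptions, and nothing here is taken from the citing paper's code.

```python
import math
import random


def swap_probability(e_i: float, e_j: float, t_i: float, t_j: float,
                     k_b: float = 1.0) -> float:
    """Metropolis acceptance probability for exchanging two replicas
    with energies e_i, e_j held at temperatures t_i, t_j."""
    beta_i = 1.0 / (k_b * t_i)
    beta_j = 1.0 / (k_b * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    # delta >= 0 means the swap moves the lower energy to the colder replica.
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(e_i: float, e_j: float, t_i: float, t_j: float,
                 rng: random.Random) -> bool:
    """Draw a uniform random number and decide whether the swap is accepted."""
    return rng.random() < swap_probability(e_i, e_j, t_i, t_j)
```

A swap that hands a lower-energy conformation to the colder replica is always accepted, while the reverse is accepted with an exponentially small probability; this is what lets conformations trapped at low temperature escape through the high-temperature replicas.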


Protein Structure Prediction

Skolnick

2007

Encyclopedia of Life Sciences


The state of the art of the field of protein structure prediction is reviewed. The strengths and weaknesses of the three general approaches, comparative modelling, threading and template‐free modelling, are discussed, and an overview of the results of the critical assessment of structure prediction (CASP) protein structure prediction experiments is summarized. The implications for protein structure prediction of the finding that the library of solved single domain protein structures is likely complete are examined. Recent advances in the modelling of membrane proteins and proteome scale protein structure predictions are presented. Finally, the key remaining unsolved problems in protein structure prediction are described.

