The detection of ligand-binding sites is often the starting point for protein function identification and drug discovery. Because of inaccuracies in predicted protein structures, extant binding pocket detection methods are limited to experimentally solved structures. Here, FINDSITE, a method for ligand-binding site prediction and functional annotation based on binding-site similarity across groups of weakly homologous template structures identified from threading, is described. For crystal structures, considering a cutoff distance of 4 Å as the hit criterion, the success rate is 70.9% for identifying the best of top five predicted ligand-binding sites with a ranking accuracy of 76.0%. Both high prediction accuracy and ability to correctly rank identified binding sites are sustained when approximate protein models (<35% sequence identity to the closest template structure) are used, showing a 67.3% success rate with 75.5% ranking accuracy. In practice, FINDSITE tolerates structural inaccuracies in protein models up to a rmsd from the crystal structure of 8-10 Å. This is because analysis of weakly homologous protein models reveals that about half have a rmsd from the native binding site <2 Å. Furthermore, the chemical properties of template-bound ligands can be used to select ligand templates associated with the binding site. In most cases, FINDSITE can accurately assign a molecular function to the protein model.

pocket detection | protein structure prediction | ligand screening

To date, although the genomes of >500 organisms have been sequenced (1, 2), the biological function of many identified genes/gene products is unknown. This rapid accumulation of protein sequences of unknown structure and function has motivated the development of proteome-scale protocols for protein structure and function prediction (3-5). The detection of ligand-binding sites is often a starting point for structure-based function identification.
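The success rates quoted in the abstract rest on a distance-based hit criterion: a target counts as a success if at least one of its top five predicted sites falls within 4 Å of the true site. A minimal sketch of that bookkeeping follows. It assumes that each predicted site and each bound ligand is represented by a single 3D center point and that the distance is measured between those points; the abstract does not spell out either detail, and the function names `is_hit` and `success_rate` are hypothetical, not part of FINDSITE.

```python
import math


def is_hit(predicted_centers, ligand_center, cutoff=4.0, top_n=5):
    """Return True if any of the first `top_n` predicted site centers lies
    within `cutoff` angstroms of the ligand center.

    Assumption: sites and ligands are reduced to single (x, y, z) points,
    and `predicted_centers` is already sorted by predicted rank.
    """
    return any(
        math.dist(center, ligand_center) <= cutoff
        for center in predicted_centers[:top_n]
    )


def success_rate(targets, cutoff=4.0, top_n=5):
    """Fraction of targets with at least one hit among the top `top_n` sites.

    `targets` is a list of (predicted_centers, ligand_center) pairs.
    """
    if not targets:
        raise ValueError("at least one target is required")
    hits = sum(is_hit(pred, lig, cutoff, top_n) for pred, lig in targets)
    return hits / len(targets)


# Toy data (made up for illustration, not taken from the benchmark):
example_targets = [
    ([(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)], (1.0, 1.0, 1.0)),  # first site is ~1.7 Å away
    ([(20.0, 0.0, 0.0)], (0.0, 0.0, 0.0)),                   # only site is 20 Å away
]
example_rate = success_rate(example_targets)
```

With the toy data above, one of the two targets satisfies the criterion. The ranking accuracy reported alongside the success rate would need an additional rule for deciding when the best site is ranked correctly, which this sketch leaves out.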
Knowledge of the ligand-binding site is also essential for structure-based drug discovery (6). Existing approaches for ligand-binding site prediction can be roughly divided into sequence- and structure-based methods (see refs. 6 and 7). The main strength of sequence-based methods is their ability to identify a ligand-binding/interaction motif in proteins that may not have the same overall fold. However, motif-based searches frequently become ineffective if the binding region is nonlocal in sequence. Homology-based methods require related proteins with significant sequence identity to a protein in the Protein Data Bank (PDB) (8, 9) because the conservation of biochemical function drops rapidly for proteins sharing <35-40% sequence identity (10, 11). In that regard, a number of structure-based approaches have been developed to identify ligand-binding sites (6). Geometry-based methods locate binding residues by searching for cavities/pockets in a protein structure (12-15). Other methods consider blind docking of small molecules into the receptor structure (16, 17), calculate theoretical microscopic titr...