In this article we introduce a new method for the identification and the accurate characterization of protein surface cavities. The method is encoded in the program SCREEN (Surface Cavity REcognition and EvaluatioN). As a first test of the utility of our approach we used SCREEN to locate and analyze the surface cavities of a nonredundant set of 99 proteins cocrystallized with drugs. We find that this set of proteins has on average about 14 distinct cavities per protein. In all cases, a drug is bound at one (and sometimes more than one) of these cavities. Using cavity size alone as a criterion for predicting drug-binding sites yields a high balanced error rate of 15.7%, with only 71.7% coverage. Here we characterize each surface cavity by computing a comprehensive set of 408 physicochemical, structural, and geometric attributes. By applying modern machine learning techniques (Random Forests) we were able to develop a classifier that can identify drug-binding cavities with a balanced error rate of 7.2% and coverage of 88.9%. Only 18 of the 408 cavity attributes had a statistically significant role in the prediction. Of these 18 important attributes, almost all involved size and shape rather than physicochemical properties of the surface cavity. The implications of these results are discussed. A SCREEN Web server is available at http://interface.bioc.columbia.edu/screen.
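The balanced error rate quoted above can be illustrated with a short sketch. It assumes the usual definition, the mean of the false-negative and false-positive rates; the abstract does not spell out its formula, and it does not define "coverage", so that quantity is left out. The function name and the confusion-matrix counts are hypothetical and are not taken from the SCREEN data set.

```python
def balanced_error_rate(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of the false-negative rate and the false-positive rate.

    Assumed (standard) definition; the counts come from a binary confusion
    matrix in which "positive" means a drug-binding cavity.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be represented")
    false_negative_rate = fn / (tp + fn)
    false_positive_rate = fp / (tn + fp)
    return 0.5 * (false_negative_rate + false_positive_rate)


if __name__ == "__main__":
    # Hypothetical counts: binding cavities are a small minority of all cavities.
    print(balanced_error_rate(tp=90, fn=10, tn=1200, fp=60))
    # A classifier that never predicts "binding" has a low raw error rate on
    # such imbalanced data, but its balanced error rate is 0.5.
    print(balanced_error_rate(tp=0, fn=100, tn=1300, fp=0))
```

With about 14 cavities per protein and a drug bound in only one or a few of them, the classes are strongly imbalanced, which is why a balanced measure is more informative here than raw accuracy.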
The crystal structure of the large fragment of the Thermus aquaticus DNA polymerase (Klentaq1), determined at 2.5-Å resolution, demonstrates a compact two-domain architecture. The C-terminal domain is identical in fold to the equivalent region of the Klenow fragment of Escherichia coli DNA polymerase I (Klenow pol I). Although the N-terminal domain of Klentaq1 differs greatly in sequence from its counterpart in Klenow pol I, it has clearly evolved from a common ancestor. The structure of Klentaq1 reveals the strategy utilized by this protein to maintain activity at high temperatures and provides the structural basis for future improvements of the enzyme.
Amplification of DNA fragments by the polymerase chain reaction (PCR) has become an important and widespread tool of genetic analysis since the introduction of the thermostable DNA polymerase from Thermus aquaticus (Taq) (1-3). The enzyme, by enabling the amplification reaction to be performed at higher temperatures, allows the convenience of heat denaturation of DNA without enzyme inactivation. Purified Taq DNA polymerase, however, is devoid of 3'-5' exonuclease activity and thus cannot excise misincorporated nucleotides (4, 5). Consequently, DNA amplification by the Taq DNA polymerase is an error-prone process. Enzymes with N-terminal deletions show a reduced tendency toward errors, as do some recently discovered thermostable DNA polymerases which have an integral editing exonuclease activity (6, 7). The latter enzymes, however, are unable to amplify sequences in excess of 5.0 to 7.0 kb that full-length Taq DNA polymerase (8) or N-terminally deleted enzyme (7) can amplify readily. The amplification of very large DNA fragments (up to 35 kb) was recently achieved by combining an N-terminally deleted Taq DNA polymerase called Klentaq1 with a low level of an archaebacterial thermostable DNA polymerase exhibiting 3'-5' exonuclease activity (9, 10).
Taq DNA polymerase or forms of the enzyme with N-terminal deletions are also used in DNA sequencing (10-12). However, the quality of the data has been limited and the expense kept high by the poor affinity of the enzyme for dideoxynucleotides. Mutants with increased affinity for chain terminators would be of considerable interest.
To understand the structural basis of thermostability and provide the foundation for the improvement of the Taq DNA polymerase, we present here the three-dimensional structure of Klentaq1.
... extension (MGKRKST) was used and yielded crystals diffracting to beyond 2.5-Å resolution (Table 1). Crystals of Klentaq1 were obtained at room temperature by using vapor diffusion against a solution of 6% (wt/vol) polyethylene glycol 3350/50 mM MgCl2/100 mM Tris-HCl, pH 9.0, starting with equal mixtures of protein and polyethylene glycol solutions (13). Klentaq1 crystals are in space group P2₁2₁2 (a = 109.4 Å, b = 136.8 Å, c = 45.6 Å) with one molecule in the asymmetric unit.
Structure Determination. Details of structure determination will be published elsewhere. In brief, heavy-atom derivatives were pr...
The coordination shell of Ca2+ ions in proteins contains almost exclusively oxygen atoms supported by an outer shell of carbon atoms. The bond-strength contribution of each ligating oxygen in the inner shell can be evaluated by using an empirical expression successfully applied in the analysis of crystals of metal oxides.
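The idea of scoring each ligating oxygen by its bond strength can be sketched as follows. The abstract does not give the empirical expression, so this example assumes a power-law bond-valence form of the kind used for metal oxides, s = (R / R0)^(-N). The constants R0 and N below are illustrative assumptions for Ca-O contacts, not values taken from this text, and the function names and distances are hypothetical.

```python
# Assumed parameters for Ca-O contacts (illustrative, not from the abstract).
R0_CA_O = 1.909  # reference distance in Å at which the bond strength equals 1
N_CA_O = 5.4     # exponent controlling how quickly strength decays with distance


def bond_strength(distance: float, r0: float = R0_CA_O, n: float = N_CA_O) -> float:
    """Bond-strength contribution of one oxygen at `distance` (Å) from the ion."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return (distance / r0) ** (-n)


def valence_sum(distances: list[float], r0: float = R0_CA_O, n: float = N_CA_O) -> float:
    """Sum of the bond strengths over all oxygens in the inner shell."""
    return sum(bond_strength(d, r0, n) for d in distances)


if __name__ == "__main__":
    # Hypothetical inner shell: seven oxygens at typical Ca-O distances (Å).
    shell = [2.35, 2.38, 2.40, 2.41, 2.42, 2.45, 2.50]
    print(valence_sum(shell))
```

Under this assumed form, nearer oxygens contribute more than distant ones, and the sum over the inner shell can be compared with the formal charge of the ion.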