Two new programs have been developed for searching the Cambridge Structural Database (CSD) and visualizing database entries: ConQuest and Mercury. The former is a new search interface to the CSD; the latter is a high-performance crystal-structure visualizer with extensive facilities for exploring networks of intermolecular contacts. Particular emphasis has been placed on making the programs as intuitive as possible. Both ConQuest and Mercury run under Windows and various types of Unix, including Linux.
The Chemscore function was implemented as a scoring function for the protein-ligand docking program GOLD, and its performance compared to the original Goldscore function and two consensus docking protocols, "Goldscore-CS" and "Chemscore-GS," in terms of docking accuracy, prediction of binding affinities, and speed. In the "Goldscore-CS" protocol, dockings produced with the Goldscore function are scored and ranked with the Chemscore function; in the "Chemscore-GS" protocol, dockings produced with the Chemscore function are scored and ranked with the Goldscore function. Comparisons were made for a "clean" set of 224 protein-ligand complexes, and for two subsets of this set, one for which the ligands are "drug-like," the other for which they are "fragment-like." For "drug-like" and "fragment-like" ligands, the docking accuracies obtained with Chemscore and Goldscore functions are similar. For larger ligands, Goldscore gives superior results. Docking with the Chemscore function is up to three times faster than docking with the Goldscore function. Both combined docking protocols give significant improvements in docking accuracy over the use of the Goldscore or Chemscore function alone. "Goldscore-CS" gives success rates of up to 81% (top-ranked GOLD solution within 2.0 Å of the experimental binding mode) for the "clean list," but at the cost of long search times. For most virtual screening applications, "Chemscore-GS" seems optimal; search settings that give docking speeds of around 0.25-1.3 min/compound have success rates of about 78% for "drug-like" compounds and 85% for "fragment-like" compounds. In terms of producing binding energy estimates, the Goldscore function appears to perform better than the Chemscore function and the two consensus protocols, particularly for faster search settings. Even at docking speeds of around 1-2 min/compound, the Goldscore function predicts binding energies with a standard deviation of approximately 10.5 kJ/mol.
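The consensus idea described above (generate poses with one function, then rank them with the other) and the success-rate criterion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Pose` record, its field names, and the convention that a higher score is better are hypothetical and are not GOLD's actual data model or API.

```python
from dataclasses import dataclass

@dataclass
class Pose:
    rmsd: float       # RMSD to the experimental binding mode, in angstroms
    goldscore: float  # hypothetical score, assumed higher is better
    chemscore: float  # hypothetical score, assumed higher is better

def top_ranked(poses, key):
    """Return the pose ranked first by the given scoring key."""
    return max(poses, key=key)

def success_rate(complexes, key, cutoff=2.0):
    """Fraction of complexes whose top-ranked pose lies within `cutoff` angstroms.

    `complexes` is a list of pose lists, one list per protein-ligand complex.
    """
    hits = sum(1 for poses in complexes if top_ranked(poses, key).rmsd <= cutoff)
    return hits / len(complexes)
```

Under these assumptions, a "Goldscore-CS"-style ranking corresponds to calling `success_rate(complexes, key=lambda p: p.chemscore)` on poses that were generated with Goldscore.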
The crystallographically determined bond length, valence angle, and torsion angle information in the Cambridge Structural Database (CSD) has many uses. However, accessing it by means of conventional substructure searching requires nontrivial user intervention. In consequence, these valuable data have been underutilized and have not been directly accessible to client applications. The situation has been remedied by development of a new program (Mogul) for automated retrieval of molecular geometry data from the CSD. The program uses a system of keys to encode the chemical environments of fragments (bonds, valence angles, and acyclic torsions) from CSD structures. Fragments with identical keys are deemed to be chemically identical and are grouped together, and the distribution of the appropriate geometrical parameter (bond length, valence angle, or torsion angle) is computed and stored. Use of a search tree indexed on key values, together with a novel similarity calculation, then enables the distribution matching any given query fragment (or the distributions most closely matching, if an adequate exact match is unavailable) to be found easily and with no user intervention. Validation experiments indicate that, with rare exceptions, search results afford precise and unbiased estimates of molecular geometrical preferences. Such estimates may be used, for example, to validate the geometries of libraries of modeled molecules or of newly determined crystal structures or to assist structure solution from low-resolution (e.g. powder diffraction) X-ray data.
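The grouping step described above, in which fragments with identical keys are pooled and a distribution of the geometric parameter is stored, can be illustrated with a minimal Python sketch. The key strings and function names are hypothetical, and a plain dictionary stands in for the search tree and similarity calculation used by the actual program, which are not reproduced here.

```python
from collections import defaultdict
from statistics import mean, stdev

def build_distributions(fragments):
    """Group geometric values by fragment key.

    `fragments` is an iterable of (key, value) pairs, where `key` is a
    hypothetical encoding of the chemical environment and `value` is a
    bond length, valence angle, or torsion angle.
    """
    groups = defaultdict(list)
    for key, value in fragments:
        groups[key].append(value)
    return dict(groups)

def summarize(values):
    """Summary statistics for one distribution."""
    spread = stdev(values) if len(values) > 1 else 0.0
    return {"n": len(values), "mean": mean(values), "stdev": spread}

def lookup(distributions, key):
    """Exact-match lookup only; returns None when no distribution has this key."""
    return distributions.get(key)
```

A query fragment whose key has no exact match would return `None` here; the similarity-based fallback mentioned in the abstract is outside the scope of this sketch.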
DASH is a user-friendly graphical-user-interface-driven computer program for solving crystal structures from X-ray powder diffraction data, optimized for molecular structures. Algorithms for multiple peak fitting, unit-cell indexing and space-group determination are included as part of the program. Molecular models can be read in a number of formats and automatically converted to Z-matrices in which flexible torsion angles are automatically identified. Simulated annealing is used to search for the global minimum in the space that describes the agreement between observed and calculated structure factors. The simulated annealing process is very fast, which in part is due to the use of correlated integrated intensities rather than the full powder pattern. Automatic minimization of the structures obtained by simulated annealing and automatic overlay of solutions assist in assessing the reproducibility of the best solution, and therefore in determining the likelihood that the global minimum has been obtained.
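The simulated annealing search mentioned above can be sketched generically in Python. This is a textbook Metropolis-style annealer, not the DASH implementation: the cooling schedule, step size, and parameter names are assumptions, and `cost` stands in for whatever agreement measure between observed and calculated data is being minimized.

```python
import math
import random

def simulated_annealing(cost, x0, step=0.5, t_start=1.0, t_end=1e-3,
                        cooling=0.95, moves_per_t=50, seed=0):
    """Minimize `cost` over a list of parameters (e.g. torsion angles)."""
    rng = random.Random(seed)
    current = list(x0)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            # Propose a random perturbation of every parameter.
            trial = [x + rng.uniform(-step, step) for x in current]
            delta = cost(trial) - current_cost
            # Always accept downhill moves; accept uphill moves with
            # Boltzmann probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = trial, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
        t *= cooling  # geometric cooling schedule (an assumption)
    return best, best_cost
```

Running the annealer several times with different seeds and comparing the resulting solutions mirrors the reproducibility check described in the abstract.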
The results of the sixth blind test of organic crystal structure prediction methods are presented and discussed, highlighting progress for salts, hydrates and bulky flexible molecules, as well as on-going challenges.
We present a large test set of protein-ligand complexes for the purpose of validating algorithms that rely on the prediction of protein-ligand interactions. The set consists of 305 complexes with protonation states assigned by manual inspection. The following checks have been carried out to identify unsuitable entries in this set: (1) assessing the involvement of crystallographically related protein units in ligand binding; (2) identification of bad clashes between protein side chains and ligand; and (3) assessment of structural errors, and/or inconsistency of ligand placement with crystal structure electron density. In addition, the set has been pruned to assure diversity in terms of protein-ligand structures, and subsets are supplied for different protein-structure resolution ranges. A classification of the set by protein type is available. As an illustration, validation results are shown for GOLD and SuperStar. GOLD is a program that performs flexible protein-ligand docking, and SuperStar is used for the prediction of favorable interaction sites in proteins. The new CCDC/Astex test set is freely available to the scientific community (http://www.ccdc.cam.ac.uk).
We implemented a novel approach to score water mediation and displacement in the protein-ligand docking program GOLD. The method allows water molecules to switch on and off and to rotate around their three principal axes. A constant penalty, sigma(p), representing the loss of rigid-body entropy, is added for water molecules that are switched on, hence rewarding water displacement. We tested the methodology in an extensive validation study. First, sigma(p) is optimized against a training set of 58 protein-ligand complexes. For this training set, our algorithm correctly predicts water mediation/displacement in approximately 92% of the cases. We observed small improvements in the quality of the predicted binding modes for water-mediated complexes. In the second part of this work, an entirely independent set of 225 complexes is used. For this test set, our algorithm correctly predicts water mediation/displacement in approximately 93% of the cases. Improvements in binding mode quality were observed for individual water-mediated complexes.
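The effect of the constant penalty on the on/off decision can be illustrated with a small Python sketch. It assumes a fitness that is maximized, so the penalty is subtracted for each water that is switched on, and it enumerates all on/off combinations exhaustively; GOLD itself searches these states as part of docking, and the `fitness` callable here is a hypothetical stand-in for the docking score.

```python
from itertools import product

def best_water_states(n_waters, fitness, sigma_p):
    """Pick the on/off state of each water that maximizes the penalized fitness.

    `fitness(states)` is a hypothetical callable returning the docking fitness
    for a tuple of booleans (True = water switched on). Each switched-on water
    costs a constant `sigma_p`, which rewards displacement.
    """
    best_states, best_total = None, float("-inf")
    for states in product((False, True), repeat=n_waters):
        total = fitness(states) - sigma_p * sum(states)
        if total > best_total:
            best_states, best_total = states, total
    return best_states, best_total
```

With this form, a water is kept only when the interactions it mediates are worth more than `sigma_p`; otherwise switching it off (displacement) scores better.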
There is currently great interest in comparing protein-ligand docking programs. A review of recent comparisons shows that it is difficult to draw conclusions of general applicability. Statistical hypothesis testing is required to ensure that differences in pose-prediction success rates and enrichment rates are significant. Numerical measures such as root-mean-square deviation need careful interpretation and may profitably be supplemented by interaction-based measures and visual inspection of dockings. Test sets must be of appropriate diversity and of good experimental reliability. The effects of crystal-packing interactions may be important. The method used for generating starting ligand geometries and positions may have an appreciable effect on docking results. For fair comparison, programs must be given search problems of equal complexity (e.g. binding-site regions of the same size) and approximately equal time in which to solve them. Comparisons based on rescoring require local optimization of the ligand in the space of the new objective function. Re-implementations of published scoring functions may give significantly different results from the originals. Ostensibly minor details in methodology may have a profound influence on headline success rates.
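As one possible illustration of the hypothesis testing called for above, the sketch below applies an exact McNemar test to paired pose-prediction outcomes from two programs run on the same test set. The choice of this particular test is an assumption made for illustration; the abstract does not name a specific method.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value for paired success/failure outcomes.

    b: number of complexes where program A succeeds and program B fails
    c: number of complexes where program B succeeds and program A fails
    Complexes where both succeed or both fail carry no information here.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Binomial tail probability under the null hypothesis p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

A small p-value suggests that the difference in success rates is unlikely to arise from chance alone on that test set; it does not address the other issues raised above, such as test-set diversity or unequal search effort.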