We have developed an evolutionary approach for flexible ligand docking. This approach, GEMDOCK, uses a Generic Evolutionary Method for molecular DOCKing and an empirical scoring function. The former combines both discrete and continuous global search strategies with local search strategies to speed up convergence, whereas the latter results in rapid recognition of potential ligands. GEMDOCK was tested on a diverse data set of 100 protein-ligand complexes from the Protein Data Bank. In 79% of these complexes, the docked lowest-energy ligand structures had root-mean-square deviations (RMSDs) below 2.0 Å with respect to the corresponding crystal structures. The success rate increased to 85% if the structural water molecules were retained. We evaluated GEMDOCK on two cross-docking experiments in which each ligand of a protein ensemble was docked into each protein of the ensemble. Seventy-six percent of the docked structures had RMSDs below 2.0 Å when the ligands were docked into foreign structures. We analyzed and validated GEMDOCK with respect to various search spaces and scoring functions, and found that if the scoring function was perfect, then the predicted accuracy was also essentially perfect. This study suggests that GEMDOCK is a useful tool for molecular recognition and may be used to systematically evaluate and thus improve scoring functions.
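The 2.0 Å success criterion above rests on the root-mean-square deviation between the docked and crystal ligand coordinates. The following is a minimal sketch of that measure, not GEMDOCK's own code. It assumes both poses list the same atoms in the same order and already share the protein's coordinate frame, so no superposition is applied. The function name `rmsd` and the coordinates are hypothetical.

```python
import math


def rmsd(docked, crystal):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(docked) != len(crystal) or not docked:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    # Sum squared differences over every atom and every x, y, z component.
    squared = sum(
        (a - b) ** 2
        for p, q in zip(docked, crystal)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(docked))


# Hypothetical three-atom poses (coordinates in Å); the docked pose is the
# crystal pose shifted by 1 Å along x.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (2.5, 1.5, 0.0)]

value = rmsd(docked, crystal)
is_success = value < 2.0  # the 2.0 Å threshold quoted in the abstract
```

A uniform 1 Å shift gives an RMSD of 1 Å, so this hypothetical pose would count as a successful docking under the stated threshold.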
Motivation: Virtual screening of molecular compound libraries is a potentially powerful and inexpensive method for the discovery of novel lead compounds for drug development. The major weakness of virtual screening, the inability to consistently identify true positives (leads), is likely due to our incomplete understanding of the chemistry involved in ligand binding and the subsequently imprecise scoring algorithms. It has been demonstrated that combining multiple scoring functions (consensus scoring) improves the enrichment of true positives. Previous efforts at consensus scoring have largely focused on empirical results, but they have yet to provide a theoretical analysis that gives insight into real features of combinations and data fusion for virtual screening. Results: We demonstrate that combining multiple scoring functions improves the enrichment of true positives only if (a) each of the individual scoring functions has relatively high performance and (b) the individual scoring functions are distinctive. Notably, these two prediction variables are previously established criteria for the performance of data fusion approaches using either rank or score combinations. This work, thus, establishes a potential theoretical basis for the probable success of data fusion approaches to improve yields in in silico screening experiments. Furthermore, it is similarly established that the second criterion (b) can, in at least some cases, be functionally defined as the area between the rank versus score plots generated by the two (or more) algorithms. Because rank-score plots are independent of the performance of the individual scoring function, this establishes a second theoretically defined approach to determining the likely success of combining data from different predictive algorithms. 
This approach is, thus, useful in practical settings in the virtual screening process when at least two individual scoring functions can be estimated as likely to perform well (criterion a), even if no training sets are available. We provide initial validation of this theoretical approach using data from five scoring systems with two evolutionary docking algorithms on four targets: thymidine kinase, human dihydrofolate reductase, and estrogen receptors of antagonists and agonists. Our procedure is computationally efficient, able to adapt to different situations, and scalable to a large number of compounds as well as to a greater number of combinations. Results of the experiment show a fairly significant improvement (vs single algorithms) in several measures of scoring quality, specifically "goodness-of-hit" scores, false positive rates, and "enrichment". This approach (available online at http://gemdock.life.nctu.edu.tw/dock/download.php) has practical utility for cases where the basic tools are known or believed to be generally applicable, but where specific training sets are absent.
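The two ideas in this abstract, rank combination and diversity measured as the area between rank-score plots, can be illustrated with a short sketch. This is one plausible reading rather than the authors' implementation. It assumes lower (more negative) docking scores are better, min-max normalization of scores, average rank as the fusion rule, and a mean absolute gap between the two rank-score curves as the discrete "area". All function names and scores are hypothetical.

```python
def normalize(scores):
    """Min-max scale a {compound: score} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
    return {name: (s - lo) / span for name, s in scores.items()}


def ranks(scores):
    """Rank 1 is the best compound; assumes lower score means better binding."""
    ordered = sorted(scores, key=lambda name: scores[name])
    return {name: i + 1 for i, name in enumerate(ordered)}


def rank_score_curve(scores):
    """Normalized score as a function of rank (position 0 holds rank 1)."""
    return sorted(normalize(scores).values())


def diversity(scores_a, scores_b):
    """Mean absolute gap between two rank-score curves (a discrete area)."""
    fa, fb = rank_score_curve(scores_a), rank_score_curve(scores_b)
    return sum(abs(x - y) for x, y in zip(fa, fb)) / len(fa)


def rank_combination(*score_sets):
    """Fuse scoring functions by average rank and return compounds best-first."""
    all_ranks = [ranks(s) for s in score_sets]
    mean_rank = {
        name: sum(r[name] for r in all_ranks) / len(all_ranks)
        for name in score_sets[0]
    }
    return sorted(mean_rank, key=lambda name: mean_rank[name])


# Hypothetical docking scores from two scoring functions for four compounds.
scores_a = {"c1": -10.0, "c2": -8.0, "c3": -6.0, "c4": -2.0}
scores_b = {"c1": -7.0, "c2": -9.0, "c3": -8.5, "c4": -3.0}

combined = rank_combination(scores_a, scores_b)
gap = diversity(scores_a, scores_b)
```

Note that `diversity` uses only the shape of each score distribution over ranks, not which compound holds which rank, which mirrors the abstract's point that rank-score plots do not depend on how well each scoring function performs.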
Background: Pharmacological interactions are useful for understanding ligand binding mechanisms of a therapeutic target. These interactions are often inferred from a set of active compounds that were acquired experimentally. Moreover, most docking programs only loosely couple the stages (binding-site and ligand preparations, virtual screening, and post-screening analysis) of structure-based virtual screening (VS). An integrated VS environment, which provides a friendly interface to seamlessly combine these VS stages and to identify the pharmacological interactions directly from screening compounds, is valuable for drug discovery. Results: We developed an easy-to-use graphic environment, iGEMDOCK, integrating VS stages (from preparations to post-screening analysis). For post-screening analysis, iGEMDOCK provides biological insights by deriving the pharmacological interactions from screening compounds without relying on the experimental data of active compounds. The pharmacological interactions represent conserved interacting residues, which often form binding pockets with specific physico-chemical properties and perform the essential functions of a target protein. Our experimental results show that the pharmacological interactions derived by iGEMDOCK are often hot spots involved in biological functions. In addition, iGEMDOCK provides visualizations of the protein-compound interaction profiles and the hierarchical clustering dendrogram of the compounds for post-screening analysis. Conclusions: We have developed iGEMDOCK to facilitate steps from preparations of target proteins and ligand libraries toward post-screening analysis. iGEMDOCK is especially useful for post-screening analysis and inferring pharmacological interactions from screening compounds. We believe that iGEMDOCK is useful for understanding the ligand binding mechanisms and discovering lead compounds. iGEMDOCK is available at http://gemdock.life.nctu.edu.tw/dock/igemdock.php.
As more protein structures become available and structural genomics efforts provide structural models in a genome-wide strategy, there is a growing need for fast and accurate methods for discovering homologous proteins and evolutionary classifications of newly determined structures. We have developed 3D-BLAST, in part, to address these issues. 3D-BLAST is as fast as BLAST and calculates the statistical significance (E-value) of an alignment to indicate the reliability of the prediction. Using this method, we first identified 23 states of the structural alphabet that represent pattern profiles of the backbone fragments and then used them to represent protein structure databases as structural alphabet sequence databases (SADB). Our method enhanced BLAST as a search method, using a new structural alphabet substitution matrix (SASM) to find the longest common substructures with high-scoring structured segment pairs from an SADB database. Using personal computers with Intel Pentium4 (2.8 GHz) processors, our method searched more than 10 000 protein structures in 1.3 s and achieved a good agreement with search results from detailed structure alignment methods. [3D-BLAST is available at ]
Protein structure prediction provides valuable insights into function, and comparative modeling is one of the most reliable methods to predict 3D structures directly from amino acid sequences. However, critical problems arise during the selection of the correct templates and the alignment of query sequences therewith. We have developed an automatic protein structure prediction server, (PS)2, which uses an effective consensus strategy both in template selection, which combines PSI-BLAST and IMPALA, and in target–template alignment, which integrates PSI-BLAST, IMPALA and T-Coffee. (PS)2 was evaluated for 47 comparative modeling targets in CASP6 (Critical Assessment of Techniques for Protein Structure Prediction). For the benchmark dataset, the predictive performance of (PS)2, based on the mean GDT_TS score, was superior to 10 other automatic servers. Our method is based solely on the consensus sequence and thus is considerably faster than other methods that rely on the additional structural consensus of templates. Our results show that (PS)2, coupled with suitable consensus strategies and a new similarity score, can significantly improve structure prediction. Our approach should be useful in structure prediction and modeling. The (PS)2 is available through the website at .
We developed a pharmacophore-based evolutionary approach for virtual screening. This tool, termed the Generic Evolutionary Method for molecular DOCKing (GEMDOCK), combines an evolutionary approach with a new pharmacophore-based scoring function. The former integrates discrete and continuous global search strategies with local search strategies to expedite convergence. The latter, integrating an empirical-based energy function and pharmacological preferences (binding-site pharmacological interactions and ligand preferences), simultaneously serves as the scoring function for both molecular docking and postdocking analyses to improve screening accuracy. We apply pharmacological interaction preferences to select the ligands that form pharmacological interactions with target proteins, and use the ligand preferences to eliminate the ligands that violate the electrostatic or hydrophilic constraints. We assessed the accuracy of our approach using human estrogen receptor (ER) and a ligand database from the comparative studies of Bissantz et al. (J Med Chem 2000;43:4759-4767). Using GEMDOCK, the average goodness-of-hit (GH) score was 0.83 and the average false-positive rate was 0.13% for ER antagonists, and the average GH score was 0.48 and the average false-positive rate was 0.75% for ER agonists. The performance of GEMDOCK was superior to competing methods such as GOLD and DOCK. We found that our pharmacophore-based scoring function indeed was able to reduce the number of false positives; moreover, the resulting pharmacological interactions at the binding site, as well as ligand preferences, were important to the screening accuracy of our experiments. These results suggest that GEMDOCK constitutes a robust tool for virtual database screening. Proteins 2005;59:205-220.
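The goodness-of-hit (GH) score and false-positive rate reported above can be computed from four counts. The sketch below uses the commonly cited GH definition, a 3:1 weighting of yield and coverage scaled by one minus the false-positive rate. It assumes this is the definition used in the study, which the abstract does not state. The function names and the example counts are hypothetical, not results from the paper.

```python
def false_positive_rate(active_hits, total_hits, actives, database_size):
    """Fraction of inactive compounds that appear in the hit list."""
    return (total_hits - active_hits) / (database_size - actives)


def goodness_of_hit(active_hits, total_hits, actives, database_size):
    """GH score: weighted yield and coverage, penalized by false positives."""
    hit_yield = active_hits / total_hits  # fraction of the hit list that is active
    coverage = active_hits / actives      # fraction of all actives recovered
    fp_rate = false_positive_rate(active_hits, total_hits, actives, database_size)
    return (0.75 * hit_yield + 0.25 * coverage) * (1.0 - fp_rate)


# Hypothetical screen: 1000 compounds containing 10 actives; the hit list
# holds 10 compounds, 8 of which are active.
fp = false_positive_rate(active_hits=8, total_hits=10, actives=10, database_size=1000)
gh = goodness_of_hit(active_hits=8, total_hits=10, actives=10, database_size=1000)
```

A perfect hit list (all actives and nothing else) gives a GH score of 1.0 and a false-positive rate of 0 under this definition.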
The KDM4/JMJD2 Jumonji C-containing histone lysine demethylases (KDM4A–KDM4D), which selectively remove the methyl group(s) from tri/dimethylated lysine 9/36 of H3, modulate transcriptional activation and genome stability. The overexpression of KDM4A/KDM4B in prostate cancer and their association with androgen receptor suggest that KDM4A/KDM4B are potential progression factors for prostate cancer. Here, we report the crystal structure of the KDM4B·pyridine 2,4-dicarboxylic acid·H3K9me3 ternary complex, revealing the core active-site region and a selective K9/K36 site. A selective KDM4A/KDM4B inhibitor, 4, that occupies three subsites in the binding pocket is identified by virtual screening. Pharmacological and genetic inhibition of KDM4A/KDM4B significantly blocks the viability of cultured prostate cancer cells, which is accompanied by increased H3K9me3 staining and transcriptional silencing of growth-related genes. Significantly, a substantial portion of differentially expressed genes are AR-responsive, consistent with the roles of KDM4s as critical AR activators. Our results point to KDM4 as a useful therapeutic target and identify a new inhibitor scaffold.