In recent years, many virtual screening (VS) tools have been developed that employ different molecular representations and have different speed and accuracy characteristics. In this paper, we compare ten popular ligand-based VS tools using the publicly available Directory of Useful Decoys (DUD) dataset comprising over 100,000 compounds distributed across 40 protein targets. The DUD was developed initially to evaluate docking algorithms, but our results from an operational correlation analysis show that it is also well suited for comparing ligand-based VS tools. Although it is conventional wisdom that 3D molecular shape is an important determinant of biological activity, our results based on permutational significance tests of several commonly used VS metrics show that the 2D fingerprint-based methods generally give better VS performance than the 3D shape-based approaches for surprisingly many of the DUD targets. In order to help understand this finding, we have analysed the nature of the scoring functions used and the composition of the DUD dataset itself. We propose that in order to improve the VS performance of current 3D methods, it will be necessary to devise screening queries which can represent multiple possible conformations and which can exploit knowledge of known actives that span multiple scaffold families.
Author affiliation: INRIA Nancy Grand Est, LORIA, 54506, Vandoeuvre-lès-Nancy, France.
HIV infection is initiated by fusion of the virus with the target cell through binding of the viral gp120 protein with the CD4 cell surface receptor protein and the CXCR4 or CCR5 co-receptors. There is currently considerable interest in developing novel ligands that can modulate the conformations of these co-receptors and, hence, ultimately block virus-cell fusion. This article describes a detailed comparison of the performance of receptor-based and ligand-based virtual screening approaches to find CXCR4 and CCR5 antagonists that could potentially serve as HIV entry inhibitors. Because no crystal structures for these proteins are available, homology models of CXCR4 and CCR5 have been built, using bovine rhodopsin as the template. For ligand-based virtual screening, several shape-based and property-based molecular comparison approaches have been compared, using high-affinity ligands as query molecules. These methods were compared by virtually screening a library assembled by us, consisting of 602 known CXCR4 and CCR5 inhibitors and some 4700 similar presumed inactive molecules. For each receptor, the library was queried using known binders, and the enrichment factors and diversity of the resulting virtual hit lists were analyzed. Overall, ligand-based shape-matching searches yielded higher enrichments than receptor-based docking, especially for CXCR4. The results obtained for CCR5 suggest the possibility that different active scaffolds bind in different ways within the CCR5 pocket.
A new interaction fingerprint (IF) called APIF (atom-pairs-based interaction fingerprint) has been developed for postprocessing protein-ligand docking results. Unlike other existing fingerprints which employ absolute locations of individual interactions, APIF considers the relative positions of pairs of interacting atoms. Docking-based virtual screening was performed with GOLD using the crystal structures of trypsin, rhinovirus, HIV protease, carboxypeptidase, and estrogen receptor-alpha as targets. A score derived from the similarity of the bit strings for each docking solution to that of a known reference binding mode was obtained. Comparisons between APIF, GoldScore function, and standard interaction fingerprint (CHIF) scores were performed using enrichment plots. Superior recovery rates were observed in the IF score cases. Comparable results were achieved by using either of the two interaction fingerprints, substantially improving GoldScore function enrichment factors. Binding mode analyses were also carried out in order to study the best method for selecting conformations with a binding mode similar to that of the reference crystallized complex. These showed that the first conformations retrieved by interaction fingerprint scores had a more similar binding mode to the reference complex than those retrieved by the GoldScore function.
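The abstract above scores each docking solution by the similarity of its interaction bit string to that of a reference binding mode, without naming the similarity measure. As a minimal sketch, assuming the Tanimoto coefficient and fingerprints stored as sets of "on" bit indices (both are assumptions for illustration, as are the names `tanimoto` and `rank_poses`), the rescoring step might look like this:

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    union = fp_a | fp_b
    if not union:
        return 0.0  # two empty fingerprints: define similarity as 0
    return len(fp_a & fp_b) / len(union)


def rank_poses(reference: set, poses: dict) -> list:
    """Return pose names sorted by decreasing similarity to the reference fingerprint."""
    return sorted(poses, key=lambda name: tanimoto(reference, poses[name]), reverse=True)


# Hypothetical data: bit indices stand in for interacting atom-pair features.
reference_fp = {1, 4, 7, 9}
docked_poses = {
    "pose_a": {1, 4, 7},
    "pose_b": {2, 3},
    "pose_c": {1, 9, 12},
}
ranking = rank_poses(reference_fp, docked_poses)
```

Ranking poses by such a score, rather than by the docking score alone, is the general idea behind fingerprint-based postprocessing; the actual APIF bit definitions and scoring are described in the original paper, not here.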
Ligand-based shape matching approaches have become established as important and popular virtual screening (VS) techniques. However, despite their relative success, many authors have discussed how best to choose the initial query compounds and which of their conformations should be used. Furthermore, it is increasingly the case that pharmaceutical companies have multiple ligands for a given target and these may bind in different ways to the same pocket. Conversely, a given ligand can sometimes bind to multiple targets, and this is clearly of great importance when considering drug side-effects. We recently introduced the notion of spherical harmonic-based "consensus shapes" to help deal with these questions. Here, we apply a consensus shape clustering approach to the 40 protein-ligand targets in the DUD data set using PARASURF/PARAFIT. Results from clustering show that in some cases the ligands for a given target are split into two subgroups which could suggest they bind to different subsites of the same target. In other cases, our clustering approach groups together ligands from different targets, and this suggests that those ligands could bind to the same targets. Hence spherical harmonic-based clustering can rapidly give cross-docking information while avoiding the expense of performing all-against-all docking calculations. We also report on the effect of the query conformation on the performance of shape-based screening of the DUD data set and the potential gain in screening performance by using consensus shapes calculated in different ways. We provide details of our analysis of shape-based screening using both PARASURF/PARAFIT and ROCS, and we compare the results obtained with shape-based and conventional docking approaches using MSSH/SHEF and GOLD. The utility of each type of query is analyzed using commonly reported statistics such as enrichment factors (EF) and receiver-operator-characteristic (ROC) plots as well as other early performance metrics.
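The enrichment factor and ROC statistics mentioned above can be illustrated with a short sketch. It assumes the usual definitions (EF as the active rate in the top fraction of the ranked list divided by the active rate in the whole library, and ROC AUC as the probability that a random active outscores a random decoy); the function names and the toy data are hypothetical and are not taken from the paper.

```python
def enrichment_factor(scores, labels, fraction):
    """EF at a given top fraction: (actives in top / n_top) / (actives / N)."""
    n = len(scores)
    total_actives = sum(labels)
    if n == 0 or total_actives == 0:
        raise ValueError("need a non-empty library with at least one active")
    n_top = max(1, int(round(fraction * n)))
    # Rank compounds by decreasing score (higher score = more query-like).
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])
    return (hits_top / n_top) / (total_actives / n)


def roc_auc(scores, labels):
    """ROC AUC via pairwise comparison of actives and decoys; ties count 0.5."""
    actives = [s for s, l in zip(scores, labels) if l == 1]
    decoys = [s for s, l in zip(scores, labels) if l == 0]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = sum(1.0 if a > d else 0.5 if a == d else 0.0
               for a in actives for d in decoys)
    return wins / (len(actives) * len(decoys))


# Toy screening result: 1 marks a known active, 0 a presumed decoy.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
toy_labels = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
ef_20 = enrichment_factor(toy_scores, toy_labels, 0.20)
auc = roc_auc(toy_scores, toy_labels)
```

EF emphasizes early recovery at one chosen cutoff, while the AUC summarizes the whole ranking, which is one reason studies of this kind tend to report both.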
HIV entry inhibitors have emerged as a new generation of antiretroviral drugs that block viral fusion with the CXCR4 and CCR5 membrane coreceptors. Several small molecule antagonists for these coreceptors have been developed, some of which are currently in clinical trials. However, because no crystal structures for the coreceptor proteins are available, the binding modes of the known inhibitors within the coreceptor extracellular pockets need to be analyzed by means of site-directed mutagenesis and computational experiments. Previous studies have indicated that there is more than one binding site within the CCR5 extracellular pocket. This article investigates and develops this hypothesis using a novel spherical harmonic-based consensus shape clustering approach. The consensus shape approach is evaluated using retrospective virtual screening of CXCR4 and CCR5 inhibitors. Multiple combinations of CCR5 ligands in multiple trial superpositions are constructed to find consensus queries that give high virtual screening enrichments. Receiver-operator-characteristic performance analyses for both CXCR4 and CCR5 inhibitors show that the new consensus shape matching approach gives better virtual screening enrichments than existing shape matching and docking virtual screening techniques. The results obtained also provide strong evidence to support the notion that there are three main binding sites within the CCR5 extracellular cavity.
Polypharmacology describes the binding of a ligand to multiple protein targets (a promiscuous ligand) or multiple diverse ligands binding to a given target (a promiscuous target). Pharmaceutical companies are discovering increasing numbers of both promiscuous drugs and drug targets. Hence, polypharmacology is now recognized as an important aspect of drug design. Here, we describe a new and fast way to predict polypharmacological relationships between drug classes quantitatively, which we call Gaussian Ensemble Screening (GES). This approach represents a cluster of molecules with similar spherical harmonic surface shapes as a Gaussian distribution with respect to a selected center molecule. Calculating the Gaussian overlap between pairs of such clusters allows the similarity between drug classes to be calculated analytically without requiring thousands of bootstrap comparisons, as in current promiscuity prediction approaches. We find that such cluster similarity scores also follow a Gaussian distribution. Hence, a cluster similarity score may be transformed into a probability value, or "p-value", in order to quantify the relationships between drug classes. We present results obtained when using the GES approach to predict relationships between drug classes in a subset of the MDL Drug Data Report (MDDR) database. Our results indicate that GES is a useful way to study polypharmacology relationships, and it could provide a novel way to propose new targets for drug repositioning.
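Two ideas in the abstract above lend themselves to a small numerical illustration: the overlap of two Gaussians has a closed form, and a score that follows a Gaussian distribution can be converted to a p-value. The sketch below uses one-dimensional Gaussians and an upper-tail test purely as assumptions for illustration; it is not the GES formulation itself, and the function names and numbers are hypothetical.

```python
import math


def gaussian_overlap(mu1, sigma1, mu2, sigma2):
    """Closed-form integral of the product of two 1D normal densities.

    The result equals a normal density with variance sigma1^2 + sigma2^2
    evaluated at the distance between the two means.
    """
    var = sigma1 ** 2 + sigma2 ** 2
    return math.exp(-((mu1 - mu2) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def upper_tail_p_value(score, mean, std):
    """P(X >= score) for X ~ Normal(mean, std^2), via the complementary error function."""
    z = (score - mean) / std
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical clusters: nearby centers overlap more than distant ones.
close_pair = gaussian_overlap(0.0, 1.0, 0.5, 1.0)
far_pair = gaussian_overlap(0.0, 1.0, 4.0, 1.0)

# Hypothetical background of cluster similarity scores (mean 0.2, std 0.05).
p_value = upper_tail_p_value(0.35, mean=0.2, std=0.05)
```

Because the overlap is analytic, no resampling is needed to compare two clusters, which is the speed advantage the abstract attributes to avoiding bootstrap comparisons.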
The process of HIV entry begins with the binding of the viral envelope glycoprotein gp120 to both the CD4 receptor and one of the CXCR4 or CCR5 chemokine coreceptors. There is currently considerable interest in developing novel ligands which can attach to these coreceptors and hence block virus-cell fusion. This article compares the application of structure-based (docking) and ligand-based (QSAR analyses, pharmacophore modeling, and shape matching) virtual screening tools to find new potential HIV entry inhibitors for the CXCR4 receptor. The comparison is based on retrospective virtual screening of a library containing different known CXCR4 inhibitors from the literature, a smaller set of active CXCR4 inhibitors selected from a large combinatorial virtual library and synthesized by us, and some druglike presumed inactive molecules as the reference set. The enrichment factors and diversity of the retrieved molecular scaffolds in the virtual hit lists were determined. Once the different virtual screening approaches had been validated and the best parameters had been selected, prospective virtual screening of our virtual library was applied to identify new anti-HIV compounds using the same protocol as in the retrospective virtual screening analysis. The compounds selected using these computational tools were subsequently synthesized and assayed and showed activity values ranging from 4 to 0.022 μg/mL.
HIV cell fusion and entry have been validated as targets for therapeutic intervention against infection. Bicyclams were the first low-molecular-weight compounds to show specific interaction with CXCR4. The most potent bicyclam was AMD3100, in which the two cyclam moieties are tethered by a 1,4-phenylenebis(methylene) bridge. It was withdrawn from clinical trials owing to its cardiotoxicity and lack of oral bioavailability. We have designed a combinatorial library of non-cyclam polynitrogenated compounds by preserving the main features of AMD3100. At least two nitrogen atoms on each side of the p-phenylene moiety, one in the benzylic position and the other(s) in the heterocyclic system, were maintained, and the distances between them were similar to the nitrogen atom distances in cyclam. A selection of diverse compounds from this library was prepared, and their in vitro activity was tested in cell cultures against HIV strains. This led to the identification of novel potent CXCR4 coreceptor inhibitors without cytotoxicity at the tested concentrations.