If no structural information about a particular target protein is available, methods of rational drug design try to superimpose putative ligands with a given reference, e.g., an endogenous ligand. The goal of such structural alignments is, on the one hand, to approximate the binding geometry and, on the other hand, to provide a relative ranking of the ligands with respect to their similarity. An accurate superposition is a prerequisite for subsequent exploitation of ligand data by either 3D QSAR analyses, pharmacophore hypotheses, or receptor modeling. We present the automatic method FLEXS for structurally superimposing pairs of ligands, approximating their putative binding site geometry. One of the ligands is treated as flexible, while the other one, used as a reference, is kept rigid. FLEXS is an incremental construction procedure. The molecules to be superimposed are partitioned into fragments. Starting with placements of a selected anchor fragment, computed by two alternative approaches, the remaining fragments are added iteratively. At each step, flexibility is considered by allowing the respective added fragment to adopt a discrete set of conformations. The mean computing time per test case is about 1.5 min on a contemporary workstation. FLEXS is fast enough to be used as a tool for virtual ligand screening. A database of typical drug molecules has been screened for potential fibrinogen receptor antagonists. FLEXS is capable of retrieving all ligands annotated with platelet aggregation properties among the first 20 hits. Furthermore, the program suggests additional interesting candidates, likely to be active at the same receptor. FLEXS proves to be superior to commonly used retrieval techniques based on 2D fingerprint similarities. The accuracy of computed superpositions determines the relevance of subsequently performed ligand analyses.
To validate the quality of FLEXS alignments, we attempted to reproduce a set of 284 mutual superpositions derived from experimental data on 76 protein-ligand complexes of 14 proteins. The ligands considered cover the whole range of drug-size molecules from 18 to 158 atoms (PDB codes: 3ptb, 2er7). The performance of the algorithm critically depends on the sizes of the molecules to be superimposed. The limitations are clearly demonstrated with large peptidic inhibitors in the HIV and the endothiapepsin data sets. Problems also occur in the presence of multiple binding modes (e.g., elastase and human rhinovirus). The most convincing results are achieved with small- and medium-sized molecules (e.g., the ligands of trypsin, thrombin, and dihydrofolate reductase). In more than half of the entire test set, we achieve RMS deviations between computed and observed alignments of below 1.5 Å. This underlines the reliability of FLEXS-generated alignments.
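The RMS deviation criterion quoted above can be illustrated with a minimal sketch. It is not the FLEXS implementation: it assumes that the computed and the observed pose list the same atoms in the same order and are already expressed in the same coordinate frame. The function name and the three-atom coordinates are invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered atom lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical three-atom poses (coordinates in Å, invented for illustration).
observed = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
computed = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]

deviation = rmsd(computed, observed)
# Apply the 1.5 Å threshold mentioned in the abstract.
is_success = deviation < 1.5
```

In this toy case every atom is displaced by 1 Å along z, so the deviation is 1 Å and the pose would count as reproduced under the 1.5 Å threshold.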
Computers in chemistry. Active Learning with Support Vector Machines in the Drug Discovery Process. (WARMUTH*, M. K.; LIAO, J.; RAETSCH, G.; MATHIESON, M.; PUTTA, S.; LEMMEN, C.; J. Chem. Inf. Comput. Sci. 43 (2003) 2, 667-673; Comp. Sci. Dep., Univ. Calif., Santa Cruz, CA 95064, USA; Eng.)
We investigate the following data mining problem from computer-aided drug design: From a large collection of compounds, find those that bind to a target molecule in as few iterations of biochemical testing as possible. In each iteration a comparatively small batch of compounds is screened for binding activity toward this target. We employed the so-called "active learning paradigm" from Machine Learning for selecting the successive batches. Our main selection strategy is based on the maximum margin hyperplane generated by "Support Vector Machines". This hyperplane separates the current set of active compounds from the inactive ones and has the largest possible distance from any labeled compound. We perform a thorough comparative study of various other selection strategies on data sets provided by DuPont Pharmaceuticals and show that the strategies based on the maximum margin hyperplane clearly outperform the simpler ones.
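One batch-selection step of such an active learning loop might look like the sketch below. The abstract does not say which margin-based rule is used, so the sketch assumes one common choice: test next the unlabeled compounds that lie closest to the current hyperplane. It also assumes the hyperplane (`weights`, `bias`) has already been fitted by a Support Vector Machine on the labeled compounds. All names, descriptors, and numbers are hypothetical.

```python
def decision_value(weights, bias, features):
    """Signed, unnormalized distance of a compound from the hyperplane."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def select_batch(weights, bias, unlabeled, batch_size):
    """Return the ids of the unlabeled compounds nearest to the hyperplane."""
    ranked = sorted(
        unlabeled,
        key=lambda cid: abs(decision_value(weights, bias, unlabeled[cid])),
    )
    return ranked[:batch_size]


# Hypothetical hyperplane and two-dimensional compound descriptors.
weights, bias = [1.0, -1.0], 0.0
unlabeled = {
    "cpd_a": [2.0, 0.0],
    "cpd_b": [0.6, 0.5],
    "cpd_c": [0.0, 3.0],
    "cpd_d": [1.0, 1.2],
}

# Compounds to send to the next round of biochemical testing.
batch = select_batch(weights, bias, unlabeled, batch_size=2)
```

After the batch is tested, the new labels would be added to the training set and the hyperplane refitted before the next selection. Ranking by the largest positive decision value instead would favor compounds predicted to be active over compounds the model is unsure about.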
The HYDE scoring function consistently describes hydrogen bonding, the hydrophobic effect, and desolvation. It relies on HYdration and DEsolvation terms that are calibrated using octanol/water partition coefficients of small molecules. We do not use affinity data for calibration; therefore, HYDE is generally applicable to all protein targets. HYDE reflects the Gibbs free energy of binding while only considering the essential interactions of protein-ligand complexes. The greatest benefit of HYDE is that it yields a very intuitive atom-based score, which can be mapped onto the ligand and protein atoms. This allows the direct visualization of the score and consequently facilitates analysis of protein-ligand complexes during the lead optimization process. In this study, we validated our new scoring function by applying it in large-scale docking experiments. We successfully predicted the correct binding mode in 93% of complexes in redocking calculations on the Astex diverse set, while our performance in virtual screening experiments using the DUD dataset showed significant enrichment values, with a mean AUC of 0.77 across all protein targets with little or no structural defects. As part of these studies, we also carried out a very detailed analysis of the data that revealed interesting pitfalls, which we highlight here and which should be addressed in future benchmark datasets.
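For readers unfamiliar with the AUC figure quoted above, the sketch below shows how a ROC AUC can be computed for one target from a ranked list of actives and decoys. It is a generic illustration, not the evaluation code used in the study. It assumes that a higher score means a better-ranked compound, so energy-like scores where lower is better would need their sign flipped first. The scores and labels are invented.

```python
def roc_auc(scores, labels):
    """Probability that a random active outranks a random decoy (ties count half)."""
    actives = [s for s, is_active in zip(scores, labels) if is_active]
    decoys = [s for s, is_active in zip(scores, labels) if not is_active]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for active_score in actives:
        for decoy_score in decoys:
            if active_score > decoy_score:
                wins += 1.0
            elif active_score == decoy_score:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


# Hypothetical screening result: higher score = ranked earlier.
scores = [0.9, 0.8, 0.7, 0.4, 0.3]
labels = [True, False, True, False, False]  # True marks a known active

auc = roc_auc(scores, labels)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to all actives ranked ahead of all decoys. A mean AUC would be obtained by averaging this value over the individual targets.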
In drug design, often enough, no structural information on a particular receptor protein is available. However, frequently a considerable number of different ligands is known together with their measured binding affinities towards a receptor under consideration. In such a situation, a set of plausible relative superpositions of different ligands, hopefully approximating their putative binding geometry, is usually the method of choice for preparing data for the subsequent application of 3D methods that analyze the similarity or diversity of the ligands. Examples are 3D-QSAR studies, pharmacophore elucidation, and receptor modeling. An aggravating fact is that ligands are usually quite flexible and a rigorous analysis has to incorporate molecular flexibility. We review the past six years of scientific publishing on molecular superposition. Our focus lies on automatic procedures to be performed on arbitrary molecular structures. Methodical aspects are our main concern here. Accordingly, plain application studies with few methodical elements are omitted in this presentation. While this review cannot mention every contribution to this actively developing field, we intend to provide pointers to the recent literature providing important contributions to computational methods for the structural alignment of molecules. Finally we provide a perspective on how superposition methods can effectively be used for the purpose of virtual database screening. In our opinion it is the ultimate goal to detect analogues in structure databases of nontrivial size in order to narrow down the search space for subsequent experiments.
Large collections of combinatorial libraries are an integral element in today's pharmaceutical industry. It is of great interest to perform similarity searches against all virtual compounds that are synthetically accessible by any such library. Here we describe the successful application of a new software tool CoLibri on 358 combinatorial libraries based on validated reaction protocols to create a single chemistry space containing over 10^12 possible products. Similarity searching with FTrees-FS allows the systematic exploration of this space without the need to enumerate all product structures. The search result is a set of virtual hits which are synthetically accessible by one or more of the existing reaction protocols. Grouping these virtual hits by their synthetic protocols allows the rapid design and synthesis of multiple follow-up libraries. Such library ideas support hit-to-lead design efforts for tasks like follow-up from high-throughput screening hits or scaffold hopping from one hit to another attractive series.
With the ever-increasing number of synthesis-on-demand compounds for drug lead discovery, there is a great need for efficient search technologies. We present the successful application of a virtual screening method that combines two advances: (1) it avoids full library enumeration, and (2) products are evaluated by molecular docking, leveraging protein structural information. Crucially, these advances enable a structure-based technique that can efficiently explore libraries with billions of molecules and beyond. We apply this method to identify inhibitors of ROCK1 from almost one billion commercially available compounds. Out of 69 purchased compounds, 27 (39%) have Ki values < 10 µM. X-ray structures of two leads confirm their docked poses. This approach to docking scales roughly with the number of reagents that span a chemical space and is therefore multiple orders of magnitude faster than traditional docking.
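The scaling argument can be made concrete with a small back-of-the-envelope calculation. The numbers below are invented and do not describe the library used in the study: they assume a single three-component reaction with 1000 reagents per position. The sketch only compares the number of products with the number of reagents, whereas the real cost also depends on how many partial solutions are carried forward during the search.

```python
from math import prod


def product_count(reagent_counts):
    """Number of enumerated products: one per combination of reagents."""
    return prod(reagent_counts)


def reagent_count(reagent_counts):
    """Number of building blocks that span the same chemical space."""
    return sum(reagent_counts)


# Hypothetical three-component reaction, 1000 reagents per position.
counts = [1000, 1000, 1000]

products = product_count(counts)   # work for enumeration-based docking
reagents = reagent_count(counts)   # rough work for reagent-based docking
ratio = products / reagents
```

In this assumed setting, a billion products are spanned by only 3000 reagents, a ratio of more than five orders of magnitude, which is the kind of gap the abstract refers to.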