Fingerprint-based similarity searching is widely used for virtual screening when only a single bioactive reference structure is available. This paper reviews three distinct ways of carrying out such searches when multiple bioactive reference structures are available: merging the individual fingerprints into a single combined fingerprint; applying data fusion to the similarity rankings resulting from individual similarity searches; and approximations to substructural analysis. Extended searches on the MDL Drug Data Report database suggest that fusing similarity scores is the most effective general approach, with the best individual results coming from the binary kernel discrimination technique.
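The score-fusion idea described above can be illustrated with a minimal sketch. It assumes fingerprints are represented as sets of on-bit indices, that similarity is measured with the Tanimoto coefficient, and that scores are fused with a MAX rule; the abstract names none of these choices, so all three are assumptions made for illustration. The function names and toy bit sets are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def max_score_fusion(references, database):
    """Rank database entries by their highest similarity to any reference.

    The MAX rule is an assumed fusion rule, chosen only for illustration.
    """
    fused = {
        name: max(tanimoto(ref, fp) for ref in references)
        for name, fp in database.items()
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Toy fingerprints (hypothetical bit sets, not real molecules).
references = [frozenset({1, 2, 3, 4}), frozenset({7, 8, 9})]
database = {
    "mol_a": frozenset({1, 2, 3, 5}),
    "mol_b": frozenset({7, 8, 9, 10}),
    "mol_c": frozenset({20, 21}),
}
ranking = max_score_fusion(references, database)
```

In this toy case each database entry is scored once per reference structure and keeps its best score, so a molecule resembling any one of the references can rank highly.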
This paper reports a detailed comparison of a range of different types of 2D fingerprints when used for similarity-based virtual screening with multiple reference structures. Experiments with the MDL Drug Data Report database demonstrate the effectiveness of fingerprints that encode circular substructure descriptors generated using the Morgan algorithm. These fingerprints are notably more effective than fingerprints based on a fragment dictionary, on hashing and on topological pharmacophores. The combination of these fingerprints with data fusion based on similarity scores provides both an effective and an efficient approach to virtual screening in lead-discovery programmes.
Similarity searching using a single bioactive reference structure is a well-established technique for accessing chemical structure databases. This paper describes two extensions of the basic approach. First, we discuss the use of group fusion to combine the results of similarity searches when multiple reference structures are available. We demonstrate that this technique is notably more effective than conventional similarity searching in scaffold-hopping searches for structurally diverse sets of active molecules; conversely, the technique will do little to improve the search performance if the actives are structurally homogeneous. Second, we make the assumption that the nearest neighbors resulting from a similarity search, using a single bioactive reference structure, are also active and use this assumption to implement approximate forms of group fusion, substructural analysis, and binary kernel discrimination. This approach, called turbo similarity searching, is notably more effective than conventional similarity searching.
Preclinical Safety Pharmacology (PSP) attempts to anticipate adverse drug reactions (ADRs) during early phases of drug discovery by testing compounds in simple, in vitro binding assays (that is, preclinical profiling). The selection of PSP targets is based largely on circumstantial evidence of their contribution to known clinical ADRs, inferred from findings in clinical trials, animal experiments, and molecular studies going back more than forty years. In this work we explore PSP chemical space and its relevance for the prediction of adverse drug reactions. Firstly, in silico (computational) Bayesian models for 70 PSP-related targets were built, which are able to detect 93% of the ligands binding at IC50 ≤ 10 µM with an overall correct classification rate of about 94%. Secondly, employing the World Drug Index (WDI), a model for adverse drug reactions was built directly based on normalized side-effect annotations in the WDI, which does not require any underlying functional knowledge. This is, to our knowledge, the first attempt to predict adverse drug reactions across hundreds of categories from chemical structure alone. On average 90% of the adverse drug reactions observed with known, clinically used compounds were detected, with an overall correct classification rate of 92%. Drugs withdrawn from the market (Rapacuronium, Suprofen) were tested in the model and their predicted ADRs align well with known ADRs. The analysis was repeated for acetylsalicylic acid and Benperidol, which are still on the market. Importantly, features of the models are interpretable and back-projectable to chemical structure, raising the possibility of rationally engineering out adverse effects. By combining PSP and ADR models, new hypotheses linking targets and adverse effects can be proposed; examples for the opioid mu and the muscarinic M2 receptors, as well as for cyclooxygenase-1, are presented.
It is hoped that predictive models for adverse drug reactions will help support early SAR, accelerating drug discovery and decreasing late-stage attrition in discovery projects. In addition, models such as the ones presented here can be used for compound profiling in all development stages.
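The two kinds of figure quoted in this abstract (the share of true binders or ADRs that are detected, and the overall correct classification rate) can be computed as in the sketch below. Reading them as recall on the positive class and plain accuracy is an interpretation, not something the abstract states; the function name and the toy labels are hypothetical and are not data from the study.

```python
def detection_and_accuracy(y_true, y_pred):
    """Return (fraction of positives detected, overall correct classification rate).

    y_true and y_pred are equal-length lists of 0/1 labels (1 = active / ADR present).
    """
    true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    positives = sum(y_true)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return true_positives / positives, correct / len(y_true)


# Toy labels only (not data from the study): 4 positives, 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
detected, accuracy = detection_and_accuracy(y_true, y_pred)
```

The two numbers answer different questions: the first ignores false positives entirely, while the second weights every compound equally regardless of class.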
This study describes a method for mining and modeling binding data obtained from a large panel of targets (in vitro safety pharmacology) to distinguish differences between promiscuous and selective compounds. Two naïve Bayes models for promiscuity and selectivity were generated and validated on a test set as well as publicly available drug databases. The model shows a higher score (lower promiscuity) for marketed drugs than for compounds in early development or compounds that failed during clinical development. Such models can be used in triaging high-throughput screening data or for lead optimization.
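As a rough illustration of how a naïve Bayes classifier over structural features might score compounds, the sketch below fits a Bernoulli naïve Bayes model with add-one smoothing to toy fingerprints. The Bernoulli variant, the smoothing, the labels (1 = selective, 0 = promiscuous), and all names are assumptions made for the example; the abstract does not specify the model details.

```python
import math


def train_naive_bayes(samples, labels, n_bits):
    """Fit a Bernoulli naive Bayes model with add-one smoothing.

    samples: list of sets of on-bit indices; labels: list of 0/1.
    Both classes must be present in the training data.
    """
    model = {}
    for cls in (0, 1):
        members = [s for s, y in zip(samples, labels) if y == cls]
        n = len(members)
        log_prior = math.log(n / len(samples))
        # Smoothed estimate of P(bit is on | class) for every bit position.
        p_on = [
            (sum(1 for s in members if bit in s) + 1) / (n + 2)
            for bit in range(n_bits)
        ]
        model[cls] = (log_prior, p_on)
    return model


def log_odds(model, sample):
    """Log-odds of class 1 versus class 0 for one fingerprint."""
    scores = {}
    for cls, (log_prior, p_on) in model.items():
        total = log_prior
        for bit, p in enumerate(p_on):
            total += math.log(p if bit in sample else 1.0 - p)
        scores[cls] = total
    return scores[1] - scores[0]


# Toy training data (hypothetical): label 1 = selective, 0 = promiscuous.
samples = [{0, 1}, {0, 1, 2}, {3}, {2, 3}]
labels = [1, 1, 0, 0]
model = train_naive_bayes(samples, labels, n_bits=4)
score_selective_like = log_odds(model, {0, 1})
score_promiscuous_like = log_odds(model, {3})
```

With this labelling, a higher score corresponds to lower predicted promiscuity, matching the direction of the score described in the abstract.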
We present a workflow that combines data from chemogenomics-based target predictions with Systems Biology databases to better understand off-target related toxicities. By analyzing a set of compounds that share a common toxic phenotype and by comparing the pathways they affect with pathways modulated by nontoxic compounds, we are able to establish links between pathways and particular adverse effects. We further link these predictive results with literature data in order to explain why a certain pathway is predicted. Specifically, relevant pathways are elucidated for the side effects rhabdomyolysis and hypotension. Prospectively, our approach is valuable not only to better understand toxicities of novel compounds early on but also for drug repurposing exercises to find novel uses for known drugs.
We test the hypothesis that fusing the outputs of similarity searches based on a single bioactive reference structure and on its nearest neighbors (of unknown activity) is more effective (in terms of numbers of high-ranked active structures) than a similarity search involving just the reference structure. This turbo similarity searching approach provides a simple way to enhance the effectiveness of simulated virtual screening searches of the MDL Drug Data Report database.
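The turbo similarity searching procedure described above can be sketched as three steps: run a conventional search, take the k nearest neighbours of the reference, and fuse the searches from the reference and those neighbours. The sketch assumes set-based fingerprints, the Tanimoto coefficient, and MAX fusion of scores, none of which is specified in the abstract; the function names and toy molecules are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def conventional_search(reference, database):
    """Rank database names by similarity to a single reference fingerprint."""
    return sorted(
        database, key=lambda name: tanimoto(reference, database[name]), reverse=True
    )


def turbo_search(reference, database, k):
    """Fuse searches from the reference and its k nearest neighbours (MAX rule assumed)."""
    # Step 1: conventional search, keeping the k top-ranked neighbours.
    neighbours = conventional_search(reference, database)[:k]
    # Step 2: treat the neighbours (of unknown activity) as extra references.
    refs = [reference] + [database[name] for name in neighbours]
    # Step 3: score every database entry by its best similarity to any reference.
    fused = {
        name: max(tanimoto(ref, fp) for ref in refs) for name, fp in database.items()
    }
    return sorted(fused, key=fused.get, reverse=True)


# Toy fingerprints (hypothetical bit sets, not real molecules).
reference = frozenset({1, 2, 3, 4})
database = {
    "mol_near": frozenset({1, 2, 3, 5}),
    "mol_hop": frozenset({3, 5, 6}),
    "mol_other": frozenset({4, 40}),
    "mol_far": frozenset({1, 30, 31, 32}),
}
conventional = conventional_search(reference, database)
turbo = turbo_search(reference, database, k=1)
```

In this toy example `mol_hop` shares little with the reference itself but resembles its nearest neighbour, so it moves up the fused ranking; whether such a move retrieves more actives in practice depends on the neighbours actually being active.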
We present a novel method to better investigate adverse drug reactions in chemical space. By integrating data sources about adverse drug reactions of drugs with an established cheminformatics modeling method, we generate a data set that is then visualized with a systems biology tool. Thereby new insights into undesired drug effects are gained. In this work, we present a global analysis linking chemical features to adverse drug reactions.