In silico methods of phenotypic screening are necessary to reduce the time and cost of the experimental in vivo screening of anticancer agents through dozens of millions of natural and synthetic chemical compounds. We used the previously developed PASS (Prediction of Activity Spectra for Substances) algorithm to create and validate the classification SAR models for predicting the cytotoxicity of chemicals against different types of human cell lines using ChEMBL experimental data. A training set from 59,882 structures of compounds was created based on the experimental data (IG50, IC50, and % inhibition values) from ChEMBL. The average accuracy of prediction (AUC) calculated by leave-one-out and a 20-fold cross-validation procedure during the training was 0.930 and 0.927 for 278 cancer cell lines, respectively, and 0.948 and 0.947 for cytotoxicity prediction for 27 normal cell lines, respectively. Using the given SAR models, we developed a freely available web-service for cell-line cytotoxicity profile prediction (CLC-Pred: Cell-Line Cytotoxicity Predictor) based on the following structural formula: http://way2drug.com/Cell-line/.
An essential characteristic of chemical compounds is their biological activity since its presence can become the basis for the use of the substance for therapeutic purposes, or, on the contrary, limit the possibilities of its practical application due to the manifestation of side action and toxic effects. Computer assessment of the biological activity spectra makes it possible to determine the most promising directions for the study of the pharmacological action of particular substances, and to filter out potentially dangerous molecules at the early stages of research. For more than 25 years, we have been developing and improving the computer program PASS (Prediction of Activity Spectra for Substances), designed to predict the biological activity spectrum of substance based on the structural formula of its molecules. The prediction is carried out by the analysis of structure-activity relationships for the training set, which currently contains information on structures and known biological activities for more than one million molecules. The structure of the organic compound is represented in PASS using Multilevel Neighborhoods of Atoms descriptors; the activity prediction for new compounds is performed by the naive Bayes classifier and the structure-activity relationships determined by the analysis of the training set. We have created and improved both local versions of the PASS program and freely available web resources based on PASS (http://www.way2drug.com). They predict several thousand biological activities (pharmacological effects, molecular mechanisms of action, specific toxicity and adverse effects, interaction with the unwanted targets, metabolism and action on molecular transport), cytotoxicity for tumor and non-tumor cell lines, carcinogenicity, induced changes of gene expression profiles, metabolic sites of the major enzymes of the first and second phases of xenobiotics biotransformation, and belonging to substrates and/or metabolites of metabolic enzymes. The web resource Way2Drug is used by over 18,000 researchers from more than 90 countries around the world, which allowed them to obtain over 600,000 predictions and publish about 500 papers describing the obtained results. The analysis of the published works shows that in some cases the interpretation of the prediction results presented by the authors of these publications requires an adjustment. In this work, we provide the theoretical basis and consider, on particular examples, the opportunities and limitations of computer-aided prediction of biological activity spectra.
In silico approaches have been widely recognised to be useful for drug discovery. Here, we consider the significance of available databases of medicinal plants and chemo- and bioinformatics tools for in silico drug discovery beyond the traditional use of folk medicines. This review contains a practical example of the application of combined chemo- and bioinformatics methods to study pleiotropic therapeutic effects (known and novel) of 50 medicinal plants from Traditional Indian Medicine.
In silico methods of phenotypic screening are necessary to reduce the time and cost of the experimental in vivo screening of anticancer agents through dozens of millions of natural and synthetic chemical compounds. We used the previously developed PASS (Prediction of Activity Spectra for Substances) algorithm to create and validate the classification SAR models for predicting the cytotoxicity of chemicals against different types of human cell lines using ChEMBL experimental data. A training set from 59,882 structures of compounds was created based on the experimental data (IG50, IC50, and % inhibition values) from ChEMBL. The average accuracy of prediction (AUC) calculated by leave-one-out and a 20-fold cross-validation procedure during the training was 0.930 and 0.927 for 278 cancer cell lines, respectively, and 0.948 and 0.947 for cytotoxicity prediction for 27 normal cell lines, respectively. Using the given SAR models, we developed a freely available web-service for cell-line cytotoxicity profile prediction (CLC-Pred: Cell-Line Cytotoxicity Predictor) based on the following structural formula: http://way2drug.com/Cell-line/.
Application of structure-activity relationships (SARs) for the prediction of adverse effects of drugs (ADEs) has been reported in many published studies. Training sets for the creation of SAR models are usually based on drug label information which allows for the generation of data sets for many hundreds of drugs. Since many ADEs may not be related to drug consumption, one of the main problems in such studies is the quality of data on drug-ADE pairs obtained from labels. The information on ADEs may be included in three sections of the drug labels: "Boxed warning," "Warnings and Precautions," and "Adverse reactions." The first two sections, especially Boxed warning, usually contain the most frequent and severe ADEs that have either known or probable relationships to drug consumption. Using this information, we have created manually curated data sets for the five most frequent and severe ADEs: myocardial infarction, arrhythmia, cardiac failure, severe hepatotoxicity, and nephrotoxicity, with more than 850 drugs on average for each effect. The corresponding SARs were built with PASS (Prediction of Activity Spectra for Substances) software and had balanced accuracy values of 0.74, 0.7, 0.77, 0.67, and 0.75, respectively. They were implemented in a freely available ADVERPred web service ( http://www.way2drug.com/adverpred/ ), which enables a user to predict five ADEs based on the structural formula of compound. This web service can be applied for estimation of the corresponding ADEs for hits and lead compounds at the early stages of drug discovery.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.