In this work we report the first attempt to study the effect of activity cliffs on the generalization ability of machine learning (ML) based QSAR classifiers, using as a case study a previously reported diverse and noisy dataset focused on drug-induced liver injury (DILI) and more than 40 ML classification algorithms. Here, the hypothesis that removing activity cliffs restores structure-activity relationship (SAR) continuity is tested as a potential solution to this limitation. Previously, a parallel was drawn between activity cliff generators (ACGs) and instances that should be misclassified (ISMs), a related concept from the field of machine learning. Based on this concept, we compared the classification performance of multiple ML classifiers, as well as that of the consensus classifiers derived from predictive classifiers, obtained from training sets including or excluding ACGs. The influence of removing ACGs from the training set on virtual screening performance was also studied for the respective consensus classifiers. In general terms, removing the ACGs from the training process slightly decreased the overall accuracy of the ML classifiers and multi-classifiers, improving their sensitivity (the weakest feature of ML classifiers trained with ACGs) but decreasing their specificity. Although these results do not support a positive effect of ACG removal on the classification performance of ML classifiers, the "balancing effect" of ACG removal was shown to positively influence the virtual screening performance of multi-classifiers based on valid base ML classifiers. In particular, the early recognition ability was significantly favored after ACG removal. The results presented and discussed in this work represent the first step towards the application of a remedial solution to the activity cliffs problem in QSAR studies.
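The trade-off described above (slightly lower accuracy, higher sensitivity, lower specificity) can be illustrated with a minimal sketch. The labels and predictions below are invented toy values, not data from the study, and the function name `classification_metrics` is hypothetical; the sketch only shows how the three metrics are computed from a confusion matrix and how they can move in opposite directions.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        # Sensitivity: fraction of true positives that are recovered.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: fraction of true negatives that are recovered.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Toy example (invented values): 5 positives followed by 5 negatives.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical predictions of a classifier trained WITH ACGs.
pred_with_acgs = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Hypothetical predictions of a classifier trained WITHOUT ACGs.
pred_without_acgs = [1, 1, 1, 1, 0, 1, 1, 1, 0, 0]

metrics_with = classification_metrics(y_true, pred_with_acgs)
metrics_without = classification_metrics(y_true, pred_without_acgs)
print(metrics_with)
print(metrics_without)
```

In this toy case the second classifier is less accurate overall but its sensitivity and specificity are closer to each other, which is the kind of "balancing effect" the abstract refers to.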
Background: In current drug discovery efforts to find disease-modifying therapies for Parkinson's disease (PD), the single-target strategy has proved inefficient. Consequently, the search for multi-potent agents is attracting increasing attention because of the multiple pathogenetic factors implicated in PD. Multiple lines of evidence point to the combination of monoamine oxidase B (MAO-B) inhibition and adenosine A2A receptor (A2AAR) blockade as a promising approach to prevent the neurodegeneration involved in PD. Currently, only two chemical scaffolds have been proposed as potential dual MAO-B inhibitors/A2AAR antagonists (caffeine derivatives and benzothiazinones). Methods: In this study, we conduct a series of chemoinformatics analyses to evaluate and advance the potential of the chromone nucleus as a MAO-B/A2AAR dual binding scaffold. Results: The information provided by SAR data mining analysis, based on network similarity graphs, and by molecular docking studies supports the suitability of the chromone nucleus as a potential MAO-B/A2AAR dual binding scaffold. Additionally, a virtual screening (VS) tool based on a group fusion similarity search (GFSS) approach was developed for the prioritization of potential MAO-B/A2AAR dual binder candidates. Among the several data fusion schemes evaluated, the MEAN-SIM and MIN-RANK GFSS approaches proved to be efficient virtual screening tools. A combinatorial library potentially enriched with MAO-B/A2AAR dual binding chromone derivatives was then assembled and sorted using first the MIN-RANK and then the MEAN-SIM GFSS VS approach. Conclusion: The information and tools provided in this work represent valuable decision-making elements in the search for novel chromone derivatives with a favorable dual binding profile as MAO-B inhibitors and A2AAR antagonists, with the potential to act as disease-modifying therapeutics for Parkinson's disease.
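The two fusion schemes named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: fingerprints are modeled as plain sets of feature identifiers, similarity is the Tanimoto coefficient (the abstract does not state which fingerprint or coefficient was used), and the function names and toy data are hypothetical. MEAN-SIM is taken here as the mean similarity of a candidate to all reference compounds, and MIN-RANK as the best (lowest) rank the candidate reaches in any single-reference ranking.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of features."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def similarity_table(references, database):
    """Map each database compound to its list of similarities, one per reference."""
    return {name: [tanimoto(fp, ref) for ref in references]
            for name, fp in database.items()}


def mean_sim(sims):
    """MEAN-SIM fusion: average similarity over all references (higher is better)."""
    return {name: sum(values) / len(values) for name, values in sims.items()}


def min_rank(sims):
    """MIN-RANK fusion: best rank over the per-reference rankings (lower is better)."""
    names = list(sims)
    n_refs = len(next(iter(sims.values())))
    best = {name: len(names) for name in names}
    for j in range(n_refs):
        # Rank the database by similarity to reference j (rank 1 = most similar).
        # Ties keep their input order because sorted() is stable.
        ordered = sorted(names, key=lambda n: -sims[n][j])
        for rank, name in enumerate(ordered, start=1):
            best[name] = min(best[name], rank)
    return best


# Toy data (invented): two reference compounds and three candidates.
references = [{1, 2, 3, 4}, {3, 4, 5, 6}]
database = {"cand_a": {1, 2, 3, 4}, "cand_b": {3, 4, 5}, "cand_c": {7, 8}}

sims = similarity_table(references, database)
mean_scores = mean_sim(sims)
rank_scores = min_rank(sims)

# Sort by MIN-RANK first, then break ties with MEAN-SIM (descending).
prioritized = sorted(database, key=lambda n: (rank_scores[n], -mean_scores[n]))
print(prioritized)
```

The final sort mirrors the order described in the abstract (MIN-RANK first, MEAN-SIM second); using MEAN-SIM as a tie-breaker is one possible reading of that sentence and is an assumption of this sketch.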
Thermolysin is a bacterial proteolytic enzyme, considered by many authors to be a pharmacological and biological model for mammalian enzymes with similar structural characteristics, such as angiotensin-converting enzyme and neutral endopeptidase. Inhibitors of these enzymes are of therapeutic interest for common diseases such as hypertension and heart failure. In this report, a multiple linear regression model, fitted by ordinary least squares with a genetic algorithm for variable selection, is developed and implemented in the QSARINS software, with appropriate parameters for its fitting. The model is extensively validated according to OECD standards, demonstrating its robustness, stability, low correlation among descriptors and good predictive power. In addition, the model fit is shown not to be the product of a chance correlation. Two possible outliers are identified in the analysis of the model's applicability domain but, in a molecular docking study, they show good activity, so we decided to keep both in our database. The resulting model can be used for the virtual screening of compounds in order to identify new active molecules.