A variety of machine learning methods such as Naïve Bayesian, support vector machines and, more recently, deep neural networks are demonstrating their utility for drug discovery and development. These leverage the generally bigger data sets created from high throughput screening data and allow prediction of bioactivities for targets and molecular properties with increased levels of accuracy. We have only just begun to exploit the potential of these techniques but they may already be fundamentally changing the research process for identifying new molecules and/or repurposing old drugs. The integrated, end-to-end (E2E) application of such machine learning models is broadly relevant and has considerable implications for developing future therapies and their targeting.

Learning from history

'Those who do not remember the past are condemned to repeat it' (Santayana). This observation applies as much to drug discovery as it does to other aspects of human endeavor 1. The history of drug discovery is a prelude to the emerging potential of computer-assisted data exploration. One constant in drug discovery is that every few years the estimated cost to develop drugs rises further. Less than 20 years ago, developing a drug took ~12 years, cost under a billion dollars, and the biggest challenges were failures due to efficacy or toxicity-induced attrition 2. In vitro pharmacological profiling implemented earlier in the drug discovery process helped to identify some predictable undesirable off-*
Tuberculosis is a global health dilemma. In 2016, the WHO reported 10.4 million cases and 1.7 million deaths. The need to develop new treatments for those infected with Mycobacterium tuberculosis (Mtb) has led to many large-scale phenotypic screens and many thousands of new active compounds identified in vitro. However, with limited funding, efforts to discover new active molecules against Mtb need to be more efficient. Several computational machine learning approaches have been shown to have good enrichment and hit rates. We have curated small molecule Mtb data and developed new models with a total of 18,886 molecules with activity cutoffs of 10 μM, 1 μM, and 100 nM. These data sets were used to evaluate different machine learning methods (including deep learning) and metrics and to generate predictions for additional molecules published in 2017. One Mtb model, a combined in vitro and in vivo data Bayesian model at a 100 nM activity cutoff, yielded the following metrics for 5-fold cross validation: accuracy = 0.88, precision = 0.22, recall = 0.91, specificity = 0.88, kappa = 0.31, and MCC = 0.41. We have also curated an evaluation set (n = 153 compounds) published in 2017, and when used to test our model, it showed comparable statistics (accuracy = 0.83, precision = 0.27, recall = 1.00, specificity = 0.81, kappa = 0.36, and MCC = 0.47). We have also compared these models with additional machine learning algorithms, showing that Bayesian machine learning models constructed with literature Mtb data generated by different laboratories were generally equivalent to or outperformed deep neural networks on external test sets. Finally, we have also compared our training and test sets to show they were suitably diverse and different in order to represent useful evaluation sets. Such Mtb machine learning models could help prioritize compounds for testing in vitro and in vivo.
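The metrics quoted above (accuracy, precision, recall, specificity, kappa, and MCC) can all be derived from a binary confusion matrix. The sketch below shows the standard formulas, assuming a two-class active/inactive setup. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the study.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # also called sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

    # Matthews correlation coefficient.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "kappa": kappa,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not the study's data).
example = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

When actives are rare, as is typical at a stringent activity cutoff, accuracy and recall can be high while precision stays low, which is one reason kappa and MCC are reported alongside them.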
The human immunodeficiency virus (HIV) causes over a million deaths every year and has a huge economic impact in many countries. The first class of drugs approved were nucleoside reverse transcriptase inhibitors. A newer generation of reverse transcriptase inhibitors has become susceptible to drug-resistant strains of HIV, and hence, alternatives are urgently needed. We have recently pioneered the use of Bayesian machine learning to generate models with public data to identify new compounds for testing against different disease targets. The current study has used the NIAID ChemDB HIV, Opportunistic Infection and Tuberculosis Therapeutics Database for machine learning studies. We curated and cleaned data from HIV-1 wild-type cell-based and reverse transcriptase (RT) DNA polymerase inhibition assays. Compounds from this database with ≤1 μM HIV-1 RT DNA polymerase activity inhibition and cell-based HIV-1 inhibition are correlated (Pearson r = 0.44, n = 1137, p < 0.0001). Models were trained using multiple machine learning approaches (Bernoulli Naive Bayes, AdaBoost Decision Tree, Random Forest, support vector classification, k-Nearest Neighbors, and deep neural networks as well as consensus approaches) and then their predictive abilities were compared. Our comparison of different machine learning methods demonstrated that support vector classification, deep learning, and a consensus were generally comparable and not significantly different from each other using 5-fold cross validation and using 24 training and test set combinations. These findings are in line with our previous studies for various targets: training and testing with multiple data sets does not reveal a significant difference between support vector machines and deep neural networks.
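The 5-fold cross validation used to compare these methods can be sketched as follows. This is a minimal, unstratified illustration in plain Python. The helper name `k_fold_indices`, the seed, and the sample count are assumptions for illustration, and the study's actual splitting procedure is not described in the abstract.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    if k < 2 or n_samples < k:
        raise ValueError("need k >= 2 and at least k samples")

    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # reproducible shuffle

    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]

    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Each sample is held out exactly once across the five folds.
for fold_number, (train, test) in enumerate(k_fold_indices(20, k=5), start=1):
    print(f"fold {fold_number}: {len(train)} training, {len(test)} held out")
```

Each model would be fitted on the training indices and scored on the held-out indices, with the per-fold scores averaged before methods are compared.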
Recent outbreaks of the Ebola virus (EBOV) have focused attention on the dire need for antivirals to treat infected patients. We identified pyronaridine tetraphosphate as a potential candidate as it is an approved drug in the European Union which is currently used in combination with artesunate as a treatment for malaria (EC50 between 420 nM and 1.14 μM against EBOV in HeLa cells). Range-finding studies in mice directed us to a single 75 mg/kg i.p. dose 1 hr after infection, which resulted in 100% survival and statistically significantly reduced viremia at study day 3 from a lethal challenge with mouse-adapted EBOV (maEBOV). Further, an EBOV window study suggested we could dose pyronaridine 2 or 24 hrs post-exposure to result in similar efficacy. Analysis of cytokine and chemokine panels suggests that pyronaridine may act as an immunomodulator during an EBOV infection. Our studies demonstrate the potential utility of repurposing pyronaridine as an antiviral against EBOV. The drug merits further study in larger animal models, with the added benefit that it is already used as a treatment for malaria.
We have previously described the first Bayesian machine learning models from FDA-approved drug screens, for identifying compounds active against the Ebola virus (EBOV). These models led to the identification of three active molecules in vitro: tilorone, pyronaridine, and quinacrine. A follow-up study demonstrated that one of these compounds, tilorone, has 100% in vivo efficacy in mice infected with mouse-adapted EBOV when dosed intraperitoneally at 30 mg/kg/day. This suggested that we can learn from the published data on EBOV inhibition and use it to select new compounds for testing that are active in vivo. We used these previously built Bayesian machine learning EBOV models alongside our chemical insights for the selection of 12 molecules, absent from the training set, to test for in vitro EBOV inhibition. Nine molecules were directly selected using the model, and eight of these molecules possessed a promising in vitro activity (EC50 < 15 μM). Three further compounds were selected for an in vitro evaluation because they were antimalarials, and compounds of this class like pyronaridine and quinacrine have previously been shown to inhibit EBOV. We identified the antimalarial drug arterolane (IC50 = 4.53 μM) and the anticancer clinical candidate lucanthone (IC50 = 3.27 μM) as novel compounds that have EBOV inhibitory activity in HeLa cells and generally lack cytotoxicity. This work provides further validation for using machine learning and medicinal chemistry expertise to prioritize compounds for testing in vitro prior to more costly in vivo tests, and suggests that this strategy can likely be applied to other pathogens in the future.
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a newly identified virus that had resulted in over 2.5 million deaths and over 116 million cases globally as of March 2021. Small-molecule inhibitors that reverse disease severity have proven difficult to discover. One of the key approaches that has been widely applied in an effort to speed up the translation of drugs is drug repurposing. A few drugs have shown in vitro activity against Ebola viruses and demonstrated activity against SARS-CoV-2 in vivo. Most notably, the RNA polymerase targeting remdesivir demonstrated activity in vitro and efficacy in the early stage of the disease in humans. Testing other small-molecule drugs that are active against Ebola viruses (EBOVs) would appear a reasonable strategy to evaluate their potential for SARS-CoV-2. We have previously repurposed pyronaridine, tilorone, and quinacrine (from malaria, influenza, and antiprotozoal uses, respectively) as inhibitors of Ebola and Marburg viruses in vitro in HeLa cells and mouse-adapted EBOV in mice in vivo. We have now tested these three drugs in various cell lines (VeroE6, Vero76, Caco-2, Calu-3, A549-ACE2, HUH-7, and monocytes) infected with SARS-CoV-2 as well as other viruses (including MHV and HCoV 229E). The compilation of these results indicated considerable variability in antiviral activity observed across cell lines. We found that tilorone and pyronaridine inhibited the virus replication in A549-ACE2 cells with IC50 values of 180 nM and 198 nM, respectively. We used microscale thermophoresis to test the binding of these molecules to the spike protein, and tilorone and pyronaridine bind to the spike receptor binding domain protein with Kd values of 339 and 647 nM, respectively. Human Cmax for pyronaridine and quinacrine is greater than the IC50 observed in A549-ACE2 cells. We also provide novel insights into the mechanism of these compounds, which is likely lysosomotropic.
For the last 50 years we have known of a broad-spectrum agent tilorone dihydrochloride (Tilorone)
Drug-induced liver injury (DILI) is one of the most unpredictable adverse reactions to xenobiotics in humans and the leading cause of postmarketing withdrawals of approved drugs. To date, these drugs have been collated by the FDA to form the DILIRank database, which classifies DILI severity and potential. These classifications have been used by various research groups in generating computational predictions for this type of liver injury. Recently, groups from Pfizer and AstraZeneca have collated DILI in vitro data and physicochemical properties for compounds that can be used along with data from the FDA to build machine learning models for DILI. In this study, we have used these data sets, as well as the Biopharmaceutics Drug Disposition Classification System data set, to generate Bayesian machine learning models with our in-house software, Assay Central. The performance of all machine learning models was assessed through both the internal 5-fold cross-validation metrics and prediction accuracy of an external test set of compounds with known hepatotoxicity. The best-performing Bayesian model was based on the DILI-concern category from the DILIRank database with an ROC of 0.814, a sensitivity of 0.741, a specificity of 0.755, and an accuracy of 0.746. A comparison of alternative machine learning algorithms, such as k-nearest neighbors, support vector classification, AdaBoosted decision trees, and deep learning methods, produced similar statistics to those generated with the Bayesian algorithm in Assay Central. This study demonstrates machine learning models, grouped in a tool called MegaTox, that can be used to screen early-stage clinical compounds, as well as recent FDA-approved drugs, for potential DILI.
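Assuming the reported ROC value is the area under the ROC curve, it can be read as the probability that a randomly chosen DILI-positive compound receives a higher model score than a randomly chosen DILI-negative one. The sketch below computes that quantity by direct pairwise comparison. The function name `roc_auc` and the labels and scores are hypothetical, and this is not the Assay Central implementation.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 class labels (1 = positive, e.g. DILI concern).
    scores: iterable of model scores, higher meaning more likely positive.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and scores for illustration only.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(f"ROC AUC: {roc_auc(labels, scores):.2f}")
```

The pairwise form is quadratic in the number of compounds, which is fine for small test sets. A rank-based calculation gives the same value more efficiently on large ones.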