Jun Liao scite author profile

We investigate the following data mining problem from computer-aided drug design: From a large collection of compounds, find those that bind to a target molecule in as few iterations of biochemical testing as possible. In each iteration a comparatively small batch of compounds is screened for binding activity toward this target. We employ the so-called "active learning paradigm" from Machine Learning for selecting the successive batches. Our main selection strategy is based on the maximum margin hyperplane generated by "Support Vector Machines". This hyperplane separates the current set of active compounds from the inactive ones and has the largest possible distance from any labeled compound. We perform a thorough comparative study of various other selection strategies on data sets provided by DuPont Pharmaceuticals and show that the strategies based on the maximum margin hyperplane clearly outperform the simpler ones.
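As a rough illustration of a margin-based batch selection step, the sketch below ranks unlabeled compounds by their distance to a separating hyperplane and returns the closest ones. The abstract does not say which margin-based rule was used, so "pick the points nearest the hyperplane" is an assumption, as are the names `signed_distance` and `select_batch`, the toy feature vectors, and the hyperplane `(w, b)`, which is taken as given rather than trained by an SVM.

```python
from typing import List, Sequence


def signed_distance(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Signed Euclidean distance from point x to the hyperplane w.x + b = 0."""
    norm = sum(wi * wi for wi in w) ** 0.5
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def select_batch(w: Sequence[float], b: float,
                 unlabeled: Sequence[Sequence[float]],
                 batch_size: int) -> List[int]:
    """Indices of the batch_size unlabeled points closest to the hyperplane.

    Assumed strategy: points near the hyperplane are the ones the current
    classifier is least certain about, so they are screened next.
    """
    ranked = sorted(range(len(unlabeled)),
                    key=lambda i: abs(signed_distance(w, b, unlabeled[i])))
    return ranked[:batch_size]


# Hypothetical hyperplane and pool of four unlabeled "compounds".
w, b = [1.0, -1.0], 0.0
pool = [[3.0, 0.0], [0.5, 0.4], [0.0, 2.0], [1.0, 1.2]]
batch = select_batch(w, b, pool, batch_size=2)
print(batch)
```

In a full loop, the selected batch would be tested, its labels added to the training set, and the hyperplane refit before the next selection.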


Engineering proteinase K using machine learning and synthetic genes

Liao

Warmuth

Govindarajan

et al. 2007

BMC Biotechnol


Background: Altering a protein's function by changing its sequence allows natural proteins to be converted into useful molecular tools. Current protein engineering methods are limited by a lack of high throughput physical or computational tests that can accurately predict protein activity under conditions relevant to its final application. Here we describe a new synthetic biology approach to protein engineering that avoids these limitations by combining high throughput gene synthesis with machine learning-based design algorithms.


Totally corrective boosting algorithms that maximize the margin

Warmuth

Liao

Rätsch

2006


We consider boosting algorithms that maintain a distribution over a set of examples. At each iteration a weak hypothesis is received and the distribution is updated. We motivate these updates as minimizing the relative entropy subject to linear constraints. For example, AdaBoost constrains the edge of the last hypothesis w.r.t. the updated distribution to be at most γ = 0. In some sense, AdaBoost is "corrective" w.r.t. the last hypothesis. A cleaner boosting method is to be "totally corrective": the edges of all past hypotheses are constrained to be at most γ, where γ is suitably adapted. Using new techniques, we prove the same iteration bounds for the totally corrective algorithms as for their corrective versions. Moreover, with adaptive γ, the algorithms provably maximize the margin. Experimentally, the totally corrective versions return smaller convex combinations of weak hypotheses than the corrective ones and are competitive with LPBoost, a totally corrective boosting algorithm with no regularization, for which there is no iteration bound known.
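The "corrective" property can be checked numerically. The sketch below applies the standard AdaBoost reweighting for a hypothesis with ±1 predictions and confirms that the edge of that hypothesis under the updated distribution is zero. The function names and the four-example toy data are illustrative assumptions; the totally corrective variants from the abstract, which constrain all past hypotheses at once, are not implemented here.

```python
import math
from typing import List, Sequence


def edge(dist: Sequence[float], margins: Sequence[int]) -> float:
    """Edge of a hypothesis: sum_i d_i * y_i * h(x_i), with margins[i] = y_i * h(x_i)."""
    return sum(d * u for d, u in zip(dist, margins))


def adaboost_update(dist: Sequence[float], margins: Sequence[int]) -> List[float]:
    """Standard AdaBoost reweighting for margins in {-1, +1}.

    Assumes the current edge lies strictly between -1 and 1, so that the
    coefficient alpha is finite.
    """
    gamma = edge(dist, margins)
    alpha = 0.5 * math.log((1.0 + gamma) / (1.0 - gamma))
    unnormalized = [d * math.exp(-alpha * u) for d, u in zip(dist, margins)]
    total = sum(unnormalized)
    return [v / total for v in unnormalized]


# Toy data: uniform distribution, hypothesis correct on three of four examples.
dist = [0.25, 0.25, 0.25, 0.25]
margins = [1, 1, 1, -1]
new_dist = adaboost_update(dist, margins)

print(edge(dist, margins))      # edge before the update
print(edge(new_dist, margins))  # edge after the update, zero up to rounding
```

After the update, the single misclassified example carries half of the total weight, which is what drives the edge of the last hypothesis to zero.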


Quercetin restrains TGF-β1-induced epithelial–mesenchymal transition by inhibiting Twist1 and regulating E-cadherin expression

Feng

Song

Jiang

et al. 2018

Biochemical and Biophysical Research Communications


Histopathology classification and localization of colorectal cancer using global labels by weakly supervised deep learning

Zhou

Jin

Chen

et al. 2021

Computerized Medical Imaging and Graphics


Similarity-based machine learning support vector machine predictor of drug-drug interactions with improved accuracies

et al. 2018


What is known and objective: Drug‐drug interactions (DDI) are frequent causes of adverse clinical drug reactions. Efforts have been directed at the early stage to achieve accurate identification of DDI for drug safety assessments, including the development of in silico predictive methods. In particular, similarity‐based in silico methods have been developed to assess DDI with good accuracies, and machine learning methods have been employed to further extend the predictive range of similarity‐based approaches. However, the performance of a developed machine learning method is below expectations, partly because of the use of less diverse DDI training data sets and a less optimal set of similarity measures.

Method: In this work, we developed a machine learning model using support vector machines (SVMs) based on the literature‐reported established set of similarity measures and comprehensive training data sets. The established similarity measures include the 2D molecular structure similarity, 3D pharmacophoric similarity, interaction profile fingerprint (IPF) similarity, target similarity and adverse drug effect (ADE) similarity, which were extracted from well‐known databases, such as DrugBank and Side Effect Resource (SIDER). A pairwise kernel was constructed for the known and possible drug pairs based on the five established similarity measures and then used as the input vector of the SVM.

Result: The 10‐fold cross‐validation studies showed a predictive performance of AUROC >0.97, which is significantly improved compared with the AUROC of 0.67 of an analogously developed machine learning model. Our study suggested that a similarity‐based SVM prediction is highly useful for identifying DDI.

Conclusion: In silico methods based on multifarious drug similarities have been suggested to be feasible for DDI prediction in various studies. Our pairwise kernel SVM model had better accuracies than some previous works and can be used as a pharmacovigilance tool to detect potential DDI.
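The abstract does not state how the pairwise kernel was built. One common construction, assumed here purely for illustration, is the symmetric tensor product pairwise kernel, which turns a drug-drug similarity into a similarity between drug pairs. The similarity matrix `sim`, the drug indices, and the function name `pairwise_kernel` are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple

Pair = Tuple[int, int]


def pairwise_kernel(sim: Sequence[Sequence[float]], p: Pair, q: Pair) -> float:
    """Tensor product pairwise kernel between drug pairs p=(a, b) and q=(c, d).

    K(p, q) = sim[a][c] * sim[b][d] + sim[a][d] * sim[b][c]
    The second term makes the value independent of the order within a pair.
    This is an assumed construction, not necessarily the one used in the paper.
    """
    (a, b), (c, d) = p, q
    return sim[a][c] * sim[b][d] + sim[a][d] * sim[b][c]


# Hypothetical symmetric similarity matrix over three drugs (indices 0, 1, 2).
sim = [
    [1.0, 0.2, 0.5],
    [0.2, 1.0, 0.1],
    [0.5, 0.1, 1.0],
]

k_same = pairwise_kernel(sim, (0, 1), (0, 1))
k_swapped = pairwise_kernel(sim, (0, 1), (1, 0))
k_other = pairwise_kernel(sim, (0, 1), (0, 2))
print(k_same, k_swapped, k_other)
```

With several similarity measures, one such kernel could be computed per measure and the results combined, for example by summation, before being passed to an SVM that accepts a precomputed kernel. That combination step is likewise an assumption.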


Scutellarin ameliorates high glucose-induced vascular endothelial cells injury by activating PINK1/Parkin-mediated mitophagy

Rong

Zhao

et al. 2021

Journal of Ethnopharmacology



Active Learning with Support Vector Machines in the Drug Discovery Process.
