2019
DOI: 10.1111/cbdd.13494

A combined drug discovery strategy based on machine learning and molecular docking

Abstract: Data mining methods based on machine learning play an increasingly important role in drug design and discovery. In the current work, eight machine learning methods including decision trees, k‐nearest neighbor, support vector machines, random forests, extremely randomized trees, AdaBoost, gradient boosting trees, and XGBoost were evaluated comprehensively through a case study of ACC inhibitor data sets. Internal and external data sets were employed for cross‐validation of the eight machine learning methods. Res…
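The cross-validation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses only one of the eight listed methods (k-nearest neighbor), written from scratch with the Python standard library, and the two-descriptor data set, the function names `knn_predict` and `k_fold_accuracy`, the fold count, and the value of k are all assumptions made for illustration.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the majority label among the k nearest training points."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def k_fold_accuracy(X, y, n_folds=5, k=3, seed=0):
    """Mean accuracy of k-NN over n_folds shuffled cross-validation folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    # Every n_folds-th shuffled index goes into the same held-out fold.
    folds = [idx[f::n_folds] for f in range(n_folds)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in held_out
        )
        scores.append(correct / len(held_out))
    return sum(scores) / n_folds


# Made-up toy data (NOT the ACC inhibitor set): two well-separated clusters.
rng = random.Random(42)
X = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(20)]
X += [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(20)]
y = ["inactive"] * 20 + ["active"] * 20

cv_accuracy = k_fold_accuracy(X, y)
print(f"5-fold CV accuracy: {cv_accuracy:.2f}")
```

Comparing several methods would amount to running the same folds through each classifier and reporting the per-method mean, ideally alongside a score on a separate external set.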

Cited by 28 publications (18 citation statements)
References 54 publications
“…Therefore, it has higher precision and F1 score for inactive categories. Traditional binary classification methods have reported similar results (usually divided into inactive and active; Zhang et al., 2019). But the structural similarity in high and weakly active compounds may make it difficult to distinguish strongly active compounds from serious structural analogues.…”
Section: Discussion (mentioning)
confidence: 95%
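The per-class precision and F1 score referred to in this statement can be computed as in the short sketch below. The labels and predictions are invented for illustration and do not come from the citing paper; the helper name `per_class_scores` is likewise hypothetical.

```python
def per_class_scores(y_true, y_pred):
    """Return precision, recall and F1 for every label seen in the data."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Made-up example: inactives are easy, the two active classes get confused.
y_true = ["inactive"] * 4 + ["weak", "weak", "strong", "strong"]
y_pred = ["inactive"] * 4 + ["weak", "strong", "weak", "strong"]

scores = per_class_scores(y_true, y_pred)
for label, s in scores.items():
    print(label, s)
```

In this toy case the inactive class scores 1.0 on both metrics while each active class scores 0.5, which mirrors the pattern the statement describes without reproducing any reported numbers.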
“…In order to solve the problem of less data in the drug screening process, here we propose a novel three‐classification strategy (inactive, weakly active, strongly active) by constructing a “hypothetical” negative data set. This strategy is distinguished from the previous machine‐based two‐classification method (active and inactive), focusing more on strongly active compounds (Zhang et al, 2019). In addition, the new test set evaluates the accuracy of multiple machine learning‐based classification models and traditional structure‐based virtual screening methods.…”
Section: Introduction (mentioning)
confidence: 99%
“…Molecular docking is a computational tool for predicting the binding ability and binding mode of proteins and ligands [53]. Its principle is based on the "lock key model" of the interaction between proteins and small ligands, calculating and predicting the conformation and orientation of ligands at protein active sites, so as to judge the binding degree and play an important role in the target prediction of drug organisms [54]. Autodock is a common molecular docking method.…”
Section: Discussion (mentioning)
confidence: 99%
“…However, recent work indicates that this approach may not adequately reduce overfitting in some cases, resulting in higher accuracy estimates than those obtained with other approaches (72). On balance, some studies indicate that bootstrap optimism correction methods perform similarly to other internal validation methods (73,74), random forest models can generalize well to new data (75,76), and random forest combined with bootstrap optimism correction performs similarly to other internal validation methods and other machine learning techniques (73,77,78). There is also evidence that Walsh et al's algorithm (64) using this approach generalizes well to new samples and new suicide-related outcomes (79,80).…”
Section: Internal Validation (mentioning)
confidence: 99%