2019
DOI: 10.1021/acs.jcim.9b00633

LightGBM: An Effective and Scalable Algorithm for Prediction of Chemical Toxicity–Application to the Tox21 and Mutagenicity Data Sets

Abstract: Machine learning algorithms have attained widespread use in assessing the potential toxicities of pharmaceuticals and industrial chemicals because of their higher speed and lower cost compared to experimental bioassays. Gradient boosting is an effective algorithm that often achieves high predictivity, but historically its relatively long computational time limited its application to predicting large compound libraries or developing in silico predictive models that require frequent retraining. LightGBM, a recent …

Citations: cited by 138 publications (113 citation statements)
References: 33 publications
“…To develop significantly accurate models and prevent overfitting to the training datasets, hyperparameters for LightGBM were optimized using stratified fivefold cross-validation with Bayesian optimization [29]. Suitable hyperparameters were determined by searching the largest average area under the curve value in the receiver operating characteristic curve (ROC-AUC) of the five models. Furthermore, each of the final three models with the closest ROC-AUC value to the average ROC-AUC value in fivefold cross-validation was applied to the test datasets.…”
Section: E Model Building
Citation type: mentioning
confidence: 99%
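
The selection procedure described in the statement above can be sketched roughly as follows. This is an illustrative sketch only, not the citing authors' code: the helper names (`stratified_folds`, `roc_auc`, `closest_to_mean`) are hypothetical, the Bayesian optimization loop and the LightGBM training step are omitted, and the fold scores in the usage note are made-up numbers. It shows three pieces: assigning samples to folds with similar class ratios, scoring a fold with ROC-AUC, and keeping the three fold models whose ROC-AUC lies closest to the cross-validation average.

```python
from statistics import mean


def stratified_folds(labels, k=5):
    """Assign sample indices to k folds, dealing each class out round-robin
    so that every fold keeps roughly the same class ratio."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a random positive is scored above
    a random negative (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def closest_to_mean(fold_aucs, n_keep=3):
    """Return the average ROC-AUC and the indices of the n_keep fold models
    whose ROC-AUC is nearest to that average."""
    avg = mean(fold_aucs)
    ranked = sorted(range(len(fold_aucs)), key=lambda i: abs(fold_aucs[i] - avg))
    return avg, sorted(ranked[:n_keep])
```

For example, with hypothetical fold scores `[0.80, 0.85, 0.90, 0.70, 0.75]`, `closest_to_mean` returns an average of 0.80 and keeps the models from folds 0, 1, and 4. In a full pipeline the average would be the objective that a hyperparameter search maximizes, and each kept model would then be applied to the test data.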
“…Therefore, a growing interest exists in a comprehensive in silico approach to detect the potential toxicity of chemicals. The literature presents the results of successful examples of alternative in silico toxicity screening methods and their applications using the Tox21 10K library [19, 20, 21]. However, even though there are 59 types of well-confirmed assay results of agonist/antagonist activities for toxicity targets in the Tox21 10K library, several studies have built models for only a small number of toxicity targets.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…In order to address this issue, it is advisable to adopt Bayesian optimization strategies to perform hyperparameter tuning, as it has better performance on the test set and requires fewer iterations compared to grid and random searches [29, 30]. Our results demonstrated that the baseline lightGBM model could be improved via Bayesian optimization by over 8.0% in ACC value, 11.1% in F-value, 21.9% in MCC value, and 8.5% in AUC value, on the validation set, respectively, thereby verifying the effectiveness of Bayesian optimization.…”
Section: Discussion
Citation type: mentioning
confidence: 58%
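
For readers unfamiliar with the metrics named in the statement above, the following is a minimal sketch of how accuracy (ACC), an F-value, and the Matthews correlation coefficient (MCC) are commonly computed from binary predictions. It assumes that "F-value" means the F1 score, which the quoted text does not confirm; the function name `classification_metrics` is hypothetical, and the labels in the usage note are invented for illustration. The quoted improvement figures are not reproduced here.

```python
from math import sqrt


def classification_metrics(y_true, y_pred):
    """Compute ACC, F1, and MCC for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    acc = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall (assumed meaning of "F-value").
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    # MCC uses all four confusion-matrix cells; define it as 0 when the denominator vanishes.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "F1": f1, "MCC": mcc}
```

As a usage example, `classification_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1])` gives ACC = 0.75, F1 = 2/3, and MCC = 7/15. MCC tends to be the most sensitive of the three on imbalanced data, which may be why it shows the largest change in the quoted comparison, although the statement itself does not say so.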