2017
DOI: 10.1080/1062936x.2017.1397056

Impact assessment of the rational selection of training and test sets on the predictive ability of QSAR models

Abstract: This study analysed the influence of the rational selection of training and test sets on the quality and predictivity of quantitative structure-activity relationship (QSAR) models. The study was carried out on three different datasets of Influenza Neuraminidase (H1N1) inhibitors. The three datasets were divided into training and test sets using three rational selection methods (k-means clustering, the Kennard-Stone algorithm, and activity-based selection), and the results were compared with random selection. Then, a tot…

Citations: cited by 27 publications (29 citation statements)
References: 22 publications
“…Several groups investigated random vs. rational selection of optimal test/training sets, e.g. using cluster- or activity-based splits, with the goal of better reflecting the true predictive power of established models [10][11][12][13][14]. Martin et al. [11] showed that rational selection of training and test sets, compared to random splits, generated better statistical results on the (internal) test sets.…”
Section: Introduction (mentioning)
confidence: 99%
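
As an illustration of the activity-based splits mentioned in the statement above, the following is a minimal sketch, not the procedure of any cited paper. It assumes one common variant: compounds are sorted by activity and every fourth one is sent to the test set, so the test set spans the activity range. The function name `activity_based_split`, the `test_every` parameter, and the example activity values are hypothetical.

```python
def activity_based_split(activities, test_every=4):
    """Split compound indices into training and test sets by activity rank.

    Assumption: compounds are ranked by activity and every `test_every`-th
    ranked compound goes to the test set (hypothetical variant, for
    illustration only).
    """
    if test_every < 2:
        raise ValueError("test_every must be at least 2")
    # Indices of compounds ordered from lowest to highest activity.
    order = sorted(range(len(activities)), key=lambda i: activities[i])
    # Take one compound from the middle of each block of `test_every` ranks.
    test_idx = [idx for rank, idx in enumerate(order)
                if rank % test_every == test_every // 2]
    test_set = set(test_idx)
    train_idx = [idx for idx in order if idx not in test_set]
    return train_idx, test_idx


# Made-up activity values (e.g. pIC50) for eight compounds.
example_activities = [5.1, 7.3, 6.0, 8.2, 4.4, 6.8, 7.9, 5.6]
train_idx, test_idx = activity_based_split(example_activities)
```

Both returned lists are ordered by increasing activity, which makes it easy to check that the test compounds are spread across the range rather than clustered at one end.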
“…Then the remained samples were divided into calibration set and prediction set at a ratio of 2:1 according to the K–S algorithm [28]. The PLSR models were first built by the calibration set and tested by the prediction set to testify the feasibility of spectral detection on Dendrobiums. For both MN and DP, the R_P^2 values of PLSR models are greater than 0.84 (Table 1).…”
Section: Results (mentioning)
confidence: 99%
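
The K–S (Kennard-Stone) split described in the statement above can be sketched as follows. This is a minimal illustration of the standard max-min selection idea in pure Python, not the cited authors' code. It assumes Euclidean distance on the descriptor or spectral vectors; the function name `kennard_stone` and the toy data are hypothetical.

```python
import math


def kennard_stone(X, n_select):
    """Select `n_select` samples from X by Kennard-Stone max-min selection.

    Returns (selected, remaining) as lists of row indices. Euclidean
    distance is an assumption; other metrics can be substituted.
    """
    n = len(X)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(X)")
    dist = [[math.dist(a, b) for b in X] for a in X]
    # Seed the selection with the two most distant samples.
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda pair: dist[pair[0]][pair[1]])
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]
    # Distance from each remaining sample to its nearest selected sample.
    min_d = {k: min(dist[k][i0], dist[k][j0]) for k in remaining}
    while len(selected) < n_select:
        # Pick the sample farthest from everything selected so far.
        nxt = max(remaining, key=lambda k: min_d[k])
        selected.append(nxt)
        remaining.remove(nxt)
        del min_d[nxt]
        for k in remaining:
            min_d[k] = min(min_d[k], dist[k][nxt])
    return selected, remaining


# Toy one-dimensional data; a 2:1 calibration/prediction split of 6 samples.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [10.0]]
calibration, prediction = kennard_stone(X, n_select=(2 * len(X)) // 3)
# calibration == [0, 5, 4, 2], prediction == [1, 3]
```

The full pairwise distance matrix keeps the sketch short but costs quadratic memory, so a large dataset would need a more economical implementation.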
“…The set of compounds was independently divided twice to yield two different training and test sets by random selection. Additionally, a third division into training and test set was obtained based on maximum dissimilarity using the Kennard-Stone algorithm [29,30,31,32]. The ratio training/test set was kept constant in all three sets ( N = 26 for training and N = 8 for test set).…”
Section: Results (mentioning)
confidence: 99%
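
The two random divisions with fixed set sizes described in the statement above (N = 26 training, N = 8 test) could be reproduced along the following lines. This is a sketch only: the seeds and the function name `random_split` are hypothetical, and the citing paper does not say how its random selection was implemented.

```python
import random


def random_split(n_compounds, n_test, seed):
    """Randomly divide compound indices into training and test sets.

    A fixed `seed` makes the division reproducible (seed values here
    are illustrative assumptions).
    """
    if not 0 < n_test < n_compounds:
        raise ValueError("n_test must be between 1 and n_compounds - 1")
    rng = random.Random(seed)
    indices = list(range(n_compounds))
    rng.shuffle(indices)
    test = sorted(indices[:n_test])
    train = sorted(indices[n_test:])
    return train, test


# Two independent random divisions of 34 compounds into 26 + 8.
split_a = random_split(34, n_test=8, seed=1)
split_b = random_split(34, n_test=8, seed=2)
```

Keeping the training/test ratio identical across divisions, as the statement notes, means that differences in model statistics come from which compounds were selected rather than from set size.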