2006
DOI: 10.1002/qsar.200510161

On Selection of Training and Test Sets for the Development of Predictive QSAR models

Abstract: The development of predictive QSAR models depends not only on the statistical method but also on the algorithm used for the selection of training and test sets. Here, we describe the validation of QSAR models for three data sets with different sizes (n = 35, 56 and 87) based on random division, sorted biological activity data and K-means clusters for the factor scores of the original variable matrix along with/without biological activity values. When the training and test sets were generated by random division…
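
The abstract names three ways of dividing compounds into training and test sets: random division, sorting by biological activity, and K-means clusters. The following is a minimal Python sketch of those ideas, not the authors' procedure. The function names, the 25% test fraction and the every-fourth-compound rule are assumptions made for illustration, and the cluster labels are assumed to have been computed beforehand (for example by K-means on factor scores).

```python
import random
from collections import defaultdict


def random_split(n, test_fraction=0.25, seed=0):
    """Random division: shuffle the compound indices and cut off the test part."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return sorted(idx[n_test:]), sorted(idx[:n_test])


def activity_sorted_split(activities, step=4):
    """Sort compounds by activity and send every `step`-th one to the test set."""
    order = sorted(range(len(activities)), key=lambda i: activities[i])
    test = [i for pos, i in enumerate(order) if pos % step == step - 1]
    test_set = set(test)
    train = [i for i in order if i not in test_set]
    return sorted(train), sorted(test)


def cluster_split(labels, test_fraction=0.25, seed=0):
    """Draw roughly the same fraction of test compounds from every cluster."""
    rng = random.Random(seed)
    members = defaultdict(list)
    for i, lab in enumerate(labels):
        members[lab].append(i)
    train, test = [], []
    for lab in sorted(members):
        idx = members[lab][:]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


if __name__ == "__main__":
    # Hypothetical activity values for eight compounds.
    activities = [0.5, 3.1, 1.2, 2.0, 4.4, 0.9, 2.7, 3.8]
    print(activity_sorted_split(activities, step=4))
    # Hypothetical cluster labels: eight compounds in cluster 0, four in cluster 1.
    print(cluster_split([0] * 8 + [1] * 4))
```

In this sketch the activity-sorted and cluster-based splits both spread the test compounds across the range of the data, whereas the random split offers no such guarantee for small data sets.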

Cited by 230 publications (117 citation statements)
References 26 publications
“…Since the performance of a QSAR model depends on the numerical values of the NOELs used in training the model (Leonard and Roy, 2006), training of a QSAR model for assessing the toxicity of cosmetic ingredients should ideally be done with the aid of datasets of potential or actual cosmetic ingredients (e.g. chemicals from the International Nomenclature of Cosmetic Ingredients (INCI) list).…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…So, the selection of the training set is significantly important in QSAR analysis. Predictive potential of a model on the new data set is influenced by the similarity of chemical nature between training set and test set [28][29][30]. The test set molecules will be predicted well when these molecules are very similar to the training set compounds.…”
Section: Cluster Analysis and Validation; citation type: mentioning
confidence: 99%
“…It has been indicated that to achieve the optimal model, the selection of training and test sets should be based on some rational algorithms; otherwise, poor predictive ability of QSAR models may be obtained [7]. Therefore, it is also an important step to select the group of molecules that represent the most critical structural and physicochemical features associated with activity.…”
Section: Introduction; citation type: mentioning
confidence: 99%