2007
DOI: 10.1002/cem.1086

A randomization test for PLS component selection

Abstract: During the last two decades, a number of methods have been developed and evaluated for selecting the optimal number of components in a PLS model. In this paper, a new method is introduced that is based on a randomization test. The advantage of using a randomization test is that in contrast to cross validation (CV), it requires no exclusion of data, thus avoiding problems related to data exclusion, for example in designed experiments. The method is tested using simulated data sets for which the true dimensional…
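The abstract does not spell out the procedure, so the following is only a minimal sketch of a generic randomization test applied to the first PLS component, not the authors' algorithm. It assumes a single response, mean-centred data, and a test statistic equal to the covariance between the response and the first score vector; that choice of statistic and the function names `first_component_covariance` and `randomization_p_value` are illustrative assumptions. The point it shows is the one the abstract stresses: every object is used in every fit, and only the order of the response values is shuffled.

```python
import random


def first_component_covariance(X, y):
    """Covariance between y and the first PLS score t = Xc w, with w proportional to Xc^T yc.

    X is a list of rows (objects x variables), y a list of responses.
    The statistic is an assumed choice for this sketch.
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector of the first component: direction of maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    if norm == 0.0:
        return 0.0
    w = [v / norm for v in w]
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    return sum(t[i] * yc[i] for i in range(n)) / (n - 1)


def randomization_p_value(X, y, n_permutations=1000, seed=0):
    """Fraction of y-permutations whose statistic is at least the observed one.

    No objects are left out: each permutation refits on all rows of X.
    """
    rng = random.Random(seed)
    observed = first_component_covariance(X, y)
    y_perm = list(y)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(y_perm)
        if first_component_covariance(X, y_perm) >= observed:
            exceed += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (exceed + 1) / (n_permutations + 1)
```

A small p-value would suggest that the component captures more covariance with the response than chance alone explains. Extending the idea to later components would require deflating X and y first, which this sketch leaves out.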

Cited by 127 publications (126 citation statements)
References 56 publications
“…To guard against model overfitting, permutation tests with 100 iterations were performed (Fig. 3D) to compare the goodness of fit of the original model with that of randomly permuted models (28). The criteria for validity included the following: all the permuted R² and Q² values to the left were lower than the original points to the right and the regression line of the Q² points intersects the vertical axis (on the left) at or below zero (29).…”
Section: Metabonomic Profiling Of RPLC-MS and HILIC-MS-both
Citation type: mentioning
confidence: 99%
“…The selected number of components using k-fold CV correctly find this range, the actual value of the number of components is immaterial as long as the prediction error is close to its minimum (Wiklund et al, 2007; Ibrahim and Wibowo, 2012). We used 10-fold CV to obtain the appropriate model for predicting water level at Galas River of Kuala Krai using two types of pre-processing data.…”
Section: Model Selection
Citation type: mentioning
confidence: 95%
“…In this study, we will restrict ourselves to the common variants of CV called K-fold CV, where the calibration objects are divided in k segments and for this experiment we use k = 10 (Breiman, 1984; Wiklund et al, 2007; Ibrahim and Wibowo, 2012). The selected number of components using k-fold CV correctly find this range, the actual value of the number of components is immaterial as long as the prediction error is close to its minimum (Wiklund et al, 2007; Ibrahim and Wibowo, 2012).…”
Section: Model Selection
Citation type: mentioning
confidence: 99%
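The k-fold scheme quoted in the two Model Selection statements, where the calibration objects are divided into k segments with k = 10, can be sketched as follows. This is a hedged illustration of the segmentation step only: the function names `k_fold_segments` and `cv_splits` are hypothetical, the random assignment of objects to segments is an assumption, and fitting the PLS model and computing the prediction error on each held-out segment are left to the reader.

```python
import random


def k_fold_segments(n_objects, k=10, seed=0):
    """Randomly assign object indices to k segments of near-equal size."""
    indices = list(range(n_objects))
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list keeps segment sizes within one of each other.
    return [indices[i::k] for i in range(k)]


def cv_splits(n_objects, k=10, seed=0):
    """Yield (calibration, held_out) index lists, holding out one segment per round."""
    for held_out in k_fold_segments(n_objects, k, seed):
        held = set(held_out)
        calibration = [i for i in range(n_objects) if i not in held]
        yield calibration, sorted(held_out)
```

In a component-selection loop, each candidate number of components would be fitted on every `calibration` set and scored on the matching `held_out` set. As the quoted statement notes, the exact number chosen matters less than keeping the prediction error close to its minimum.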
“…The PLS models were validated using response of permutation test through 100 permutations. The permutation test assesses the statistical significance of the estimated predictive power previously calculated by cross validation test 40.…”
Section: Statistical Validation Of the Model
Citation type: mentioning
confidence: 99%