2006
DOI: 10.1007/s11306-006-0022-6

Assessing the performance of statistical validation tools for megavariate metabolomics data

Abstract: Statistical model validation tools such as cross-validation, jack-knifing model parameters and permutation tests are meant to obtain an objective assessment of the performance and stability of a statistical model. However, little is known about the performance of these tools for megavariate data sets, having, for instance, a number of variables larger than 10 times the number of subjects. The performance is assessed for megavariate metabolomics data, but the conclusions also carry over to proteomics, transcrip…

Citations: cited by 139 publications (98 citation statements)
References: 29 publications
“…The PLS-DA included a measured matrix X and a response matrix Y consisting of dummy variables. The quality of the PLS-DA model was assessed using a leave-one-out cross-validation method (Rubingh et al, 2006). The latent variables producing increases in Q² were considered as significant components, where:…”
Section: Spectral Data Pre-processing and Multivariate Analysis (mentioning)
confidence: 99%
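
The leave-one-out assessment described in this statement can be sketched in a few lines. The quoted formula is truncated, so the sketch assumes the commonly used definition Q² = 1 − PRESS / SS_tot, where PRESS is the sum of squared leave-one-out prediction errors. It also assumes a single dummy response (0/1) and a one-component PLS1 model written from scratch; the function names `pls1_one_component` and `q2_leave_one_out` are hypothetical and do not come from the cited papers.

```python
def pls1_one_component(X, y):
    """Fit a one-component PLS1 model on centered data; return a predict function."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector proportional to X^T y (direction of maximal covariance with y).
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    if norm == 0.0:
        # No covariance between X and y: fall back to predicting the mean.
        return lambda x: y_mean
    w = [v / norm for v in w]
    # Scores t = X w, then regress y on t to get the inner coefficient q.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)

    def predict(x):
        score = sum((x[j] - x_mean[j]) * w[j] for j in range(p))
        return y_mean + q * score

    return predict


def q2_leave_one_out(X, y):
    """Q^2 = 1 - PRESS / SS_tot, with PRESS from leave-one-out predictions."""
    press = 0.0
    for i in range(len(X)):
        # Refit the model without sample i, then predict the held-out sample.
        predict = pls1_one_component(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(X[i])) ** 2
    y_mean = sum(y) / len(y)
    ss_tot = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss_tot


# Toy data (made up for illustration): two classes coded as dummy values 0 and 1.
X_demo = [[0.0, 1.0], [0.2, 0.9], [0.1, 1.1], [2.0, 3.0], [2.1, 3.2], [1.9, 2.9]]
y_demo = [0, 0, 0, 1, 1, 1]
q2_demo = q2_leave_one_out(X_demo, y_demo)
```

Centering is redone inside every fold so that the held-out sample never influences the model that predicts it. Selecting the number of components by increases in Q², as the statement describes, would repeat this calculation for models with one, two, and more components.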
“…In PLS-DA, the X matrix is the measured matrix, i.e., the NMR data, and the Y matrix is composed of dummy variables (represented by ones and zeros) that indicate the class for each treatment. The prediction accuracy of the PLS-DA model was assessed using cross-validation with leave-one-out (Rubingh et al, 2006; Westerhuis et al, 2008). A Q² score > 0.08 indicates that the model classification is significantly better than chance, while a score greater than 0.4 indicates that the model is practically robust (Lindon et al, 1999).…”
Section: Spectral Pre-processing and Multivariate Data Analysis (mentioning)
confidence: 99%
“…Other model validation techniques commonly employed include Leave one out cross validation or up to 10-fold cross validation [60,61]. Pérez-Guaita et al [62] more recently evaluated the use of permutation testing, commonly used in metabolomics [63] and proteomics [64], which employs a random reallocation of class labels in order to establish the statistical significance of a cross-validation figure of merit of a classifier. Ultimately, however, the validation of the integrated techniques of spectroscopy and multivariate classifiers will have to comply with the rigours of the clinical environment, including large scale blind datasets and randomised trials [1].…”
Section: Discussion (mentioning)
confidence: 99%
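
The permutation test mentioned in this statement, a random reallocation of class labels used to judge the significance of a cross-validation figure of merit, might look roughly like the sketch below. It rests on several assumptions: the figure of merit is leave-one-out accuracy, the classifier is a simple nearest-centroid rule standing in for PLS-DA, and the p-value uses the common (exceedances + 1) / (permutations + 1) estimate. The names `loo_accuracy` and `permutation_p_value` are hypothetical.

```python
import random


def loo_accuracy(X, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in for PLS-DA)."""
    correct = 0
    for i in range(len(X)):
        # Training set: every sample except the held-out one.
        train = [(x, c) for j, (x, c) in enumerate(zip(X, labels)) if j != i]
        centroids = {}
        for c in set(lbl for _, lbl in train):
            members = [x for x, lbl in train if lbl == c]
            centroids[c] = [sum(col) / len(members) for col in zip(*members)]

        def sq_dist(a, b):
            return sum((u - v) ** 2 for u, v in zip(a, b))

        # Assign the held-out sample to the class with the closest centroid.
        predicted = min(sorted(centroids), key=lambda c: sq_dist(X[i], centroids[c]))
        correct += predicted == labels[i]
    return correct / len(X)


def permutation_p_value(X, labels, n_permutations=200, seed=0):
    """Return the observed LOO accuracy and its permutation p-value."""
    rng = random.Random(seed)
    observed = loo_accuracy(X, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_permutations):
        # Reallocate class labels at random and redo the whole cross-validation.
        rng.shuffle(shuffled)
        if loo_accuracy(X, shuffled) >= observed:
            exceed += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (exceed + 1) / (n_permutations + 1)


# Toy data (made up for illustration): two well-separated groups of four samples.
X_demo = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
          [5.0, 5.1], [5.2, 4.9], [4.9, 5.3], [5.1, 5.0]]
labels_demo = ["a", "a", "a", "a", "b", "b", "b", "b"]
observed_demo, p_demo = permutation_p_value(X_demo, labels_demo)
```

The full cross-validation is repeated for every permuted labelling, so the null distribution reflects the same modelling procedure as the observed figure of merit. With very few samples the number of distinct labellings is small, which limits how small the p-value can become.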