2013
DOI: 10.1002/em.21797

Deciphering the complex: Methodological overview of statistical models to derive OMICS‐based biomarkers

Abstract: Recent technological advances in molecular biology have given rise to numerous large-scale datasets whose analysis imposes serious methodological challenges mainly relating to the size and complex structure of the data. Considerable experience in analyzing such data has been gained over the past decade, mainly in genetics, from the Genome-Wide Association Study era, and more recently in transcriptomics and metabolomics. Building upon the corresponding literature, we provide here a nontechnical overview of well…

Cited by 111 publications (87 citation statements)
References 116 publications (128 reference statements)
“…A caveat of sPLS is that penalisation methods optimise prediction and will tend to select one of two highly correlated exposures with nearly equal coefficient sizes [19]. Competing multipollutant modelling approaches exist [22] (eg, tree-based models, elastic net penalised regression); however, simulation and validation assessments, specifically for data structures relevant for environmental epidemiology (often low dimensional), are lacking.…”
Section: Discussion (mentioning)
confidence: 99%
“…To address potential confounding, we employed a two-stage regression approach [22]: first, each outcome and each exposure were separately regressed on potential confounders, and second, sPLS regression models were fit inputting the residuals. We a priori selected the set of potential confounders [9], including study population and serum cotinine level, and variably age, body mass index (BMI), abstinence period and time of blood sampling, depending on the outcome (as specified in table 1).…”
Section: Methods (mentioning)
confidence: 99%
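The first stage of the two-stage approach quoted above (regressing each outcome and each exposure on the confounders and keeping the residuals) can be sketched as follows. This is a minimal illustration on simulated data, not the cited authors' code: the helper name `residualize`, the array shapes, and the use of ordinary least squares with an intercept are assumptions, and the second stage (the sPLS fit on the residuals) is not shown.

```python
import numpy as np


def residualize(values, confounders):
    """Residuals from an OLS regression of each column of `values`
    on `confounders` plus an intercept (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    confounders = np.asarray(confounders, dtype=float)
    if values.ndim == 1:
        values = values[:, None]
    if confounders.ndim == 1:
        confounders = confounders[:, None]
    # Design matrix: intercept column followed by the confounders.
    design = np.column_stack([np.ones(len(confounders)), confounders])
    # One least-squares fit per column of `values`, solved jointly.
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return values - design @ coef


# Simulated data (assumed sizes): 200 subjects, 3 confounders, 5 exposures.
rng = np.random.default_rng(0)
n = 200
confounders = rng.normal(size=(n, 3))
exposures = confounders @ rng.normal(size=(3, 5)) + rng.normal(size=(n, 5))
outcome = confounders @ rng.normal(size=3) + rng.normal(size=n)

# Stage 1: remove the linear effect of the confounders from both sides.
exposure_resid = residualize(exposures, confounders)
outcome_resid = residualize(outcome, confounders)
```

The residual arrays would then be passed to whichever multivariate model is used in the second stage; by construction they are uncorrelated with the confounders in the sample.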
“…This can lead to a high rate of false positives. Several statistical methods have been proposed to cope with this issue [20] and some of them have been applied [18,21]. However, systematic simulation studies are needed to characterise the efficiency of these approaches in the context of the exposome.…”
Section: Questionnaires (mentioning)
confidence: 99%
“…Partly this is a logical consequence of the fact that most models are based on linear techniques such as Partial Least Squares (PLS) [1]. However, the presence of interaction between biological or chemical variables could easily be envisaged.…”
Section: Introduction (mentioning)
confidence: 99%