2015
DOI: 10.1186/s12859-015-0575-3

Controlling false discoveries in high-dimensional situations: boosting with stability selection

Abstract: Background: Modern biotechnologies often result in high-dimensional data sets with many more variables than observations (n≪p). These data sets pose new challenges to statistical analysis: variable selection becomes one of the most important tasks in this setting. Similar challenges arise in modern data sets from observational studies, e.g., in ecology, where flexible, non-linear models are fitted to high-dimensional data. We assess the recently proposed flexible framework for variable selection called stabil…
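The paper's method combines boosting with stability selection (Meinshausen & Bühlmann, 2010): a selection procedure is run on many random subsamples of the data, and only variables chosen in a large fraction of subsamples are kept. The authors' own implementation is the R package stabs with boosting base learners; the following is a minimal Python sketch of the subsampling idea, using scikit-learn's Lasso as a stand-in base selector (function name and all parameter values here are illustrative, not from the paper):

```python
import numpy as np
from sklearn.linear_model import Lasso

def stability_selection(X, y, alpha=0.1, n_subsamples=100, threshold=0.6, seed=None):
    """Return variables whose selection frequency over random
    half-subsamples exceeds `threshold`, plus all frequencies."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_subsamples):
        # draw a subsample of size n/2 without replacement
        idx = rng.choice(n, size=n // 2, replace=False)
        model = Lasso(alpha=alpha, max_iter=5000).fit(X[idx], y[idx])
        counts += model.coef_ != 0  # count which variables were selected
    freq = counts / n_subsamples
    return np.where(freq >= threshold)[0], freq

# toy example: 3 informative variables among 50
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 50))
y = X[:, 0] + X[:, 1] - X[:, 2] + 0.1 * rng.standard_normal(60)
selected, freq = stability_selection(X, y, seed=1)
print(selected)
```

Noise variables are occasionally selected in individual subsamples, but rarely in more than 60% of them, so thresholding the selection frequency stabilizes the final variable set.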

Cited by 132 publications (161 citation statements)
References 48 publications
“…To determine whether feature selection is sensitive to choice of data reduction procedure, we also performed the MDD/control and continuous prediction without feature selection and obtained similar results. Further, a recently proposed variable selection algorithm, stability selection, was also applied (data not shown) (Hofner, Boccuto, & Göker, 2015; Hofner & Hothorn, 2017; Meinshausen & Bühlmann, 2010; Shah & Samworth, 2013). All features except one selected by this method for different outcomes fall within 39 features in Table A1.…”
Section: Discussion
confidence: 99%
“…Furthermore, this procedure allows for the assessment of the selection stability of variables while controlling for sample error. Hofner et al. suggest setting the upper limit of the per-family error rate (PFER) within α < PFER_max < mα, where m is the number of predictors and α the respective significance level (m_factor · α = 9 × 0.05 = 0.45 and m_facet · α = 34 × 0.05 = 1.7 in our case). Based on this recommendation, we used an even lower PFER of 0.20.…”
Section: Methods
confidence: 98%
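The recommendation quoted above, α < PFER_max < mα, is plain arithmetic once m and α are fixed, so the two ranges in the quote can be checked directly. A short illustrative calculation (the variable names are ours, not the cited study's):

```python
# Recommended range for the per-family error rate: alpha < PFER_max < m * alpha,
# evaluated for the two models quoted (m = 9 factors, m = 34 facets).
alpha = 0.05

ranges = {}
for name, m in [("factor model", 9), ("facet model", 34)]:
    ranges[name] = (alpha, m * alpha)
    print(f"{name}: {alpha} < PFER_max < {m * alpha:.2f}")

# The study's choice of 0.20 lies inside both ranges.
chosen = 0.20
assert all(lo < chosen < hi for lo, hi in ranges.values())
```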
“…We fitted GAM and GAMLSS using an iterative machine-learning approach, component-wise functional gradient descent boosting (Bühlmann & Hothorn; Hothorn et al.; Mayr et al.; Hofner, Boccuto, & Göker; Mayr & Hofner) in a cyclical framework (Thomas et al.). The first step of this process was to compute the negative gradient of a pre-selected loss function, which acts as a working residual by giving more weight to observations not properly predicted in previous iterations.…”
Section: Methods
confidence: 99%
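The component-wise boosting loop described in this quote — compute the negative gradient, fit every base learner to it, update only the best-fitting component — can be sketched for the simplest case, squared-error loss with univariate least-squares base learners (the setting of glmboost in R's mboost; this Python version and its parameter values are an illustration, not the cited implementation):

```python
import numpy as np

def componentwise_boost(X, y, n_iter=200, nu=0.1):
    """Component-wise L2 boosting: in each iteration fit every
    single-variable least-squares base learner to the current
    residuals (the negative gradient of squared-error loss) and
    update only the component with the smallest residual sum of
    squares, damped by the step length `nu`."""
    n, p = X.shape
    coef = np.zeros(p)
    offset = y.mean()
    f = np.full(n, offset)
    for _ in range(n_iter):
        r = y - f  # negative gradient of 0.5 * (y - f)^2
        # univariate LS slope per column: beta_j = <x_j, r> / <x_j, x_j>
        betas = X.T @ r / np.einsum("ij,ij->j", X, X)
        rss = ((r[:, None] - X * betas) ** 2).sum(axis=0)
        j = rss.argmin()  # best-fitting base learner
        coef[j] += nu * betas[j]
        f += nu * betas[j] * X[:, j]
    return offset, coef

# toy example: only columns 3 and 7 carry signal
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
y = 2 * X[:, 3] - X[:, 7] + 0.1 * rng.standard_normal(100)
offset, coef = componentwise_boost(X, y)
print(np.nonzero(np.abs(coef) > 0.1)[0])
```

Because only one coefficient moves per iteration and the step is damped, early stopping of this loop performs implicit variable selection — which is what makes boosting a natural base procedure for stability selection.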