2009
DOI: 10.1214/08-aos646

High-dimensional variable selection

Abstract: This paper explores the following question: what kind of statistical guarantees can be given when doing variable selection in high-dimensional models? In particular, we look at the error rates and power of some multi-stage regression methods. In the first stage we fit a set of candidate models. In the second stage we select one model by cross-validation. In the third stage we use hypothesis testing to eliminate some variables. We refer to the first two stages as “screening” and the last stage as “cleaning.” We…
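The three stages named in the abstract can be illustrated with a minimal sketch. Everything beyond the stage names is an assumption made for illustration: the data are split into three disjoint parts (one per stage), the candidate models are the top-k variables ranked by marginal covariance, the second stage uses a single held-out split as a simple stand-in for cross-validation, and the third stage uses t-statistics with a normal approximation and a Bonferroni correction. The function name `screen_and_clean` and its parameters are hypothetical and are not taken from the paper.

```python
import math
import numpy as np


def screen_and_clean(X, y, alpha=0.05, max_size=10, seed=0):
    """Illustrative three-stage selection; not the authors' implementation."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Assumption: one disjoint third of the data per stage.
    d1, d2, d3 = np.array_split(rng.permutation(n), 3)

    def design(rows, cols):
        # Design matrix with an intercept column for the given rows/columns.
        return np.column_stack([np.ones(len(rows)), X[np.ix_(rows, cols)]])

    # Stage 1 (screening): build nested candidate models on the first part.
    # Candidate k holds the k variables with largest absolute marginal covariance.
    Xc = X[d1] - X[d1].mean(axis=0)
    yc = y[d1] - y[d1].mean()
    order = np.argsort(-np.abs(Xc.T @ yc))

    # Stage 2 (screening): keep the candidate with the smallest
    # held-out squared prediction error on the second part.
    best_cols, best_err = None, np.inf
    for k in range(1, min(max_size, p) + 1):
        cols = order[:k]
        beta, *_ = np.linalg.lstsq(design(d1, cols), y[d1], rcond=None)
        err = np.mean((y[d2] - design(d2, cols) @ beta) ** 2)
        if err < best_err:
            best_cols, best_err = cols, err

    # Stage 3 (cleaning): refit by least squares on the third part and
    # test each coefficient, dropping variables that are not significant.
    A3 = design(d3, best_cols)
    beta, *_ = np.linalg.lstsq(A3, y[d3], rcond=None)
    resid = y[d3] - A3 @ beta
    sigma2 = resid @ resid / (len(d3) - A3.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A3.T @ A3)))
    t_stats = beta[1:] / se[1:]
    # Two-sided p-values from a normal approximation to the t distribution.
    pvals = np.array([math.erfc(abs(t) / math.sqrt(2)) for t in t_stats])
    keep = pvals < alpha / len(best_cols)  # Bonferroni correction
    return sorted(int(j) for j in best_cols[keep])
```

Testing on data that played no part in choosing the model is what keeps the third-stage p-values interpretable in this sketch; reusing the same observations for selection and testing would generally invalidate them.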

Cited by 476 publications (497 citation statements)
References 20 publications
“…consistency results; here, we use honesty following, e.g., Wasserman and Roeder [2009]. We note that there have been some recent theoretical investigations of non-honest forests, including Scornet et al [2015] and Wager and Walther [2015].…”
mentioning
confidence: 99%
“…Conceptually similar screening approaches have also been proposed for variable selection in high-dimensional regression models (13,14). In filtering for microarray applications, the data are first used to identify and remove a set of genes which seem to generate uninformative signal.…”
mentioning
confidence: 99%
“…A lasso 'false find' (ffind) occurs when a non-zero coefficient is assigned to a noncausal predictor. We additionally compare with the 'screen and clean' (S&C) method of Wasserman and Roeder [2009], where the strength of the penalty in the 'screen' stage is chosen via BIC. In Wasserman and Roeder [2009] cross validation is used to determine the penalty for the 'screen' stage -this leads to more variables being carried forward to the 'clean' stage, compared with BIC, and hence more true and false finds.…”
Section: Simulated Data
mentioning
confidence: 99%
“…We additionally compare with the 'screen and clean' (S&C) method of Wasserman and Roeder [2009], where the strength of the penalty in the 'screen' stage is chosen via BIC. In Wasserman and Roeder [2009] cross validation is used to determine the penalty for the 'screen' stage -this leads to more variables being carried forward to the 'clean' stage, compared with BIC, and hence more true and false finds. Due to the high multicollinearity in this particular dataset, the increase in false finds was particularly damaging for both the lasso and the 'screen and clean' methods, so using BIC seemed to give more favourable results for these methods.…”
Section: Simulated Data
mentioning
confidence: 99%