2020
DOI: 10.1080/00949655.2020.1739286

An empirical threshold of selection probability for analysis of high-dimensional correlated data

Cited by 8 publications (10 citation statements)
References 19 publications
“…Alternatively, we also conducted regularization methods, such as LASSO, to identify candidate regions associated with the seed aspect ratio (Figure 6) [71]. The regularization method was performed using an entire dataset at a time and could select several putative markers most likely related to the trait based on the value of selection probability, whereas the ECMLM analysis only tested one marker at a time.…”
Section: Results (mentioning)
confidence: 99%
“…The first one is the theoretical threshold proposed by [70]. The second one is the empirical threshold [71] which basically computes the quantile value of an empirical distribution of selection probability based on permutation. In their extensive simulation studies, it was demonstrated that the number of falsely selected SNPs can be controlled when the empirical threshold is applied to high-dimensional genomic data.…”
Section: Methods (mentioning)
confidence: 99%
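
The statement above describes the empirical threshold only in outline: a quantile of a permutation-based distribution of selection probability. The following is a minimal sketch of that idea, not the cited authors' procedure. Everything named here is an assumption for illustration: the helper names, the top-k marginal-correlation selector (a stand-in for a penalized regression such as LASSO or elastic net), the half-sample subsampling, and the choice to summarize each permutation by its maximum selection probability.

```python
import random


def abs_corr(x, y):
    """Absolute Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def select_top_k(X, y, k):
    """Hypothetical stand-in selector (NOT LASSO): the k features
    with the largest absolute marginal correlation with y."""
    p = len(X[0])
    scores = [abs_corr([row[j] for row in X], y) for j in range(p)]
    return sorted(range(p), key=lambda j: scores[j], reverse=True)[:k]


def selection_probability(X, y, k, n_subsamples, rng):
    """Fraction of random half-samples in which each feature is selected."""
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)
        Xs = [X[i] for i in idx]
        ys = [y[i] for i in idx]
        for j in select_top_k(Xs, ys, k):
            counts[j] += 1
    return [c / n_subsamples for c in counts]


def empirical_threshold(X, y, k, n_subsamples, n_permutations, quantile, rng):
    """Quantile of the null distribution of the maximum selection
    probability, obtained by permuting the response (an assumed summary)."""
    null_max = []
    for _ in range(n_permutations):
        y_perm = y[:]
        rng.shuffle(y_perm)  # breaks any feature-response association
        probs = selection_probability(X, y_perm, k, n_subsamples, rng)
        null_max.append(max(probs))
    null_max.sort()
    pos = min(len(null_max) - 1, int(quantile * len(null_max)))
    return null_max[pos]


if __name__ == "__main__":
    rng = random.Random(1)
    n, p, k = 40, 8, 2
    # Simulated data: only feature 0 is truly associated with the response.
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [3.0 * row[0] + rng.gauss(0, 0.5) for row in X]
    probs = selection_probability(X, y, k, 50, rng)
    thr = empirical_threshold(X, y, k, 50, 20, 0.95, rng)
    print("selection probabilities:", probs)
    print("empirical threshold:", thr)
    print("selected:", [j for j, q in enumerate(probs) if q > thr])
```

In this sketch a feature is reported only when its observed selection probability exceeds the permutation-based threshold. A real analysis would replace `select_top_k` with the penalized regression actually used and tune the number of subsamples and permutations accordingly.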
“…Whilst it is known that the most stable variables (those selected in most subsampled models) are least likely to be false positives [9], the optimal threshold for stability, above which a covariate should be deemed 'important' or 'significant', has not been determined. A stability threshold originally proposed for use with lasso regression [9] has been shown to be too conservative, resulting in true causal variables being missed [12,13]. An empirical method of stability selection, proposed for genetic data using elastic net regression [13], appeared to improve upon the method proposed by Meinshausen and Bühlmann [9], but the issue of missing many true causal variables (false negative results) remained.…”
Section: Introduction (mentioning)
confidence: 99%
“…A stability threshold originally proposed for use with lasso regression [9] has been shown to be too conservative, resulting in true causal variables being missed [12,13]. An empirical method of stability selection, proposed for genetic data using elastic net regression [13], appeared to improve upon the method proposed by Meinshausen and Bühlmann [9], but the issue of missing many true causal variables (false negative results) remained. To date, since a clear, generalisable method to identify a stability threshold is unavailable, arbitrary thresholds have been employed for stability analyses in veterinary epidemiology [14,15] and the need for research to establish a suitable cut-off for the covariate selection in high dimensional data is clear and has been recently re-emphasised in human epidemiology [16].…”
Section: Introduction (mentioning)
confidence: 99%