2019
DOI: 10.1111/biom.13112

Simulation-Selection-Extrapolation: Estimation in High-Dimensional Errors-in-Variables Models

Abstract: Errors-in-variables models in high-dimensional settings pose two challenges in application. First, the number of observed covariates is larger than the sample size, while only a small number of covariates are true predictors under an assumption of model sparsity. Second, the presence of measurement error can result in severely biased parameter estimates, and also affects the ability of penalized methods such as the lasso to recover the true sparsity pattern. A new estimation procedure called SIMulation-SELecti…
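The bias mentioned in the abstract can be illustrated in the simplest low-dimensional case. The sketch below is not the procedure proposed in the paper; it assumes a single covariate, classical additive measurement error, and illustrative values for every parameter (`beta`, `sigma_x`, `sigma_u`, the noise level, and the sample size are all hypothetical). Under those assumptions the naive regression slope is shrunk toward zero by the reliability ratio sigma_x^2 / (sigma_x^2 + sigma_u^2).

```python
import numpy as np

def attenuation_demo(n=20000, beta=2.0, sigma_x=1.0, sigma_u=1.0, seed=0):
    """Compare the slope from the true covariate with the slope from a noisy proxy."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigma_x, n)          # true covariate (unobserved in practice)
    u = rng.normal(0.0, sigma_u, n)          # additive measurement error
    w = x + u                                # observed, error-prone covariate
    y = beta * x + rng.normal(0.0, 0.5, n)   # response depends on the true covariate

    # No-intercept least squares slopes (all variables have mean zero by construction).
    slope_true = np.dot(x, y) / np.dot(x, x)
    slope_naive = np.dot(w, y) / np.dot(w, w)

    # Theoretical limit of the naive slope under classical measurement error.
    reliability = sigma_x**2 / (sigma_x**2 + sigma_u**2)
    return slope_true, slope_naive, beta * reliability

slope_true, slope_naive, expected_naive = attenuation_demo()
print(f"slope using true x:   {slope_true:.3f}")
print(f"slope using noisy w:  {slope_naive:.3f}")
print(f"theoretical limit:    {expected_naive:.3f}")
```

With equal signal and error variances, as assumed here, the naive slope settles near half the true coefficient, which is the kind of distortion that can also disturb variable selection in penalized regression.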

Cited by 15 publications (21 citation statements)
References 26 publications
“…We followed the procedures described in Sørensen et al (2015), Nghiem and Potgieter (2019) and Romeo and Thoresen (2019) to process the raw data using the BGX package of Hein et al (2005), and assumed the measurement error on each gene was mutually independent from that on the other. As a result, the measurement error covariance matrix Σ u was set to be diagonal.…”
Section: Simulation Results (mentioning)
confidence: 99%
“…The gene expression measurements tend to be noisy, where measurement errors come from many sources such as sample preparation, labeling, and hybridization; for example, see Rocke and Durbin (2001) and Zakharkin et al (2005). The gene measurements are also often analyzed on the log scale, making the assumption of additive measurement errors more plausible (Nghiem and Potgieter, 2019). Furthermore, as in common genome wide association studies (Do et al, 2011; Zhou et al, 2018, among others), it is usually assumed that only a few genes are related to the outcome of interest, i.e., a sparsity assumption on the statistical model.…”
Section: Introduction (mentioning)
confidence: 99%
“…Combinations of biomarkers have been integrated using various methods in order to create diagnostic or prognostic tools in CAD. These methods include the least absolute shrinkage and selection operator (LASSO) [10], random forest (RF) classifier [11], support vector machine (SVM) [12], and the gene set variation analysis (GSVA) score [13]. LASSO is a commonly used penalty regression method, which can be applied for selection of variables in high-dimensional data.…”
Section: Introduction (mentioning)
confidence: 99%
“…[13] LASSO is a commonly used penalty regression method, which can be applied for selection of variables in high-dimensional data [10]. LASSO performs via a continuous shrinking operation, minimizing regression coefficients in order to reduce the likelihood of overfitting [14]. RF is a model that can deal with unbalanced sample distribution, generating less biased classifiers [11], but it often fails to be robust and is vulnerable to overfitting.…”
Section: Introduction (mentioning)
confidence: 99%
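The "continuous shrinking operation" described in the statement above can be made concrete in one special case. The sketch below assumes an orthonormal design and the objective (1/2)||y - Xb||^2 + lam * ||b||_1, for which the lasso solution is the soft-thresholded least squares estimate; the coefficient values and the penalty `lam` are hypothetical, chosen only for illustration, and none of this is taken from the cited papers.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink each entry of z toward zero by lam, setting small entries exactly to zero."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Hypothetical least squares coefficients under an orthonormal design.
ols_coefs = np.array([3.0, -0.4, 0.2, -1.5])

# Lasso coefficients for penalty lam = 0.5 in this special case.
lasso_coefs = soft_threshold(ols_coefs, lam=0.5)
print(lasso_coefs)
```

Coefficients smaller in magnitude than the penalty are removed entirely, while the larger ones are kept but pulled toward zero, which is how the lasso performs shrinkage and variable selection at once.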