2017
DOI: 10.1093/bioinformatics/btx265

Predicting phenotypes from microarrays using amplified, initially marginal, eigenvector regression

Abstract: Motivation: The discovery of relationships between gene expression measurements and phenotypic responses is hampered by both computational and statistical impediments. Conventional statistical methods are less than ideal because they either fail to select relevant genes, predict poorly, ignore the unknown interaction structure between genes, or are computationally intractable. Thus, the creation of new methods which can handle many expression measurements on relatively small numbers of patients while also uncove…

Citations: cited by 6 publications (6 citation statements)
References: 34 publications (45 reference statements)
“…More recent work has attempted to avoid this assumption. Ding and McDonald (2017) uses the initially selected set of features to approximate the information lost in the screening step via techniques from numerical linear algebra. An alternative discussed in Piironen and Vehtari (2018) iterates the screening step with the prediction step, adding back features which correlate with the residual.…”
Section: Methods (mentioning)
confidence: 99%
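
The "screening step" this statement refers to can be illustrated with a minimal sketch of generic marginal-correlation screening: rank features by absolute Pearson correlation with the response and keep the top k. This is not the AIMER procedure or the iterative method of Piironen and Vehtari; it shows only the initial selection that such methods start from. The function names, the toy data, and the choice of k are assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def screen_by_marginal_correlation(columns, y, k):
    """Return the (sorted) indices of the k features with largest |corr(feature, y)|."""
    scores = [abs(pearson(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Hypothetical toy data: 40 samples, 200 features, only features 0-2 drive y.
rng = random.Random(0)
n, p = 40, 200
columns = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
y = [columns[0][i] + columns[1][i] - columns[2][i] + 0.1 * rng.gauss(0, 1)
     for i in range(n)]

selected = screen_by_marginal_correlation(columns, y, k=10)
print(selected)
```

Everything outside the selected set is discarded at this point, which is the information loss the statement says later work tries to approximate or recover.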
“…The Oracle estimator uses OLS on the true features and serves as a natural baseline: it uses information unavailable to the analyst (the true genes) but represents the best method were that information available. We also present results for Lasso (Tibshirani, 1996), Ridge (Hoerl and Kennard, 1970), Elastic Net (Zou and Hastie, 2005), SPC (Bair et al., 2006), AIMER (Ding and McDonald, 2017), ISPCA (Piironen and Vehtari, 2018) and PCR using FPS directly without feature screening (using Algorithm 1 without Steps 9–11). For ISPCA, we use the R package to estimate the principal components before performing regression.…”
Section: Methods (mentioning)
confidence: 99%
“…The Oracle estimator uses ordinary least squares (OLS) on the true features and serves as a natural baseline: it uses information unavailable to the analyst (the true genes) but represents the best method were that information available. We also present results for Lasso [46], Ridge [26], Elastic Net [56], SPC [6], AIMER [15], ISPCA [43], and PCR using FPS directly without feature screening (using Algorithm 1 without step 9, 10 and 11). For ISPCA, we use the dimreduce R package to estimate the principal components before performing regression.…”
Section: Synthetic Data Experiments (mentioning)
confidence: 99%
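
The Oracle baseline described in these statements can be sketched as ordinary least squares restricted to the columns known to be truly relevant. This is a hedged, standard-library illustration rather than the citing authors' code: the helper names, the omission of an intercept, the noise-free toy data, and the choice of true features (0 and 5) are all assumptions.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def oracle_ols(rows, y, true_features):
    """OLS without intercept, using only the columns listed in true_features."""
    xs = [[row[j] for j in true_features] for row in rows]
    k = len(true_features)
    xtx = [[sum(r[a] * r[b] for r in xs) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(xs, y)) for a in range(k)]
    return solve_linear_system(xtx, xty)  # normal equations X'X beta = X'y


# Hypothetical toy data: 30 samples, 50 features, y depends on features 0 and 5 only.
rng = random.Random(1)
rows = [[rng.gauss(0, 1) for _ in range(50)] for _ in range(30)]
y = [2.0 * row[0] - 3.0 * row[5] for row in rows]

coef = oracle_ols(rows, y, true_features=[0, 5])
print(coef)
```

The oracle is unavailable in practice because `true_features` is exactly what the analyst does not know; it is useful only as a reference point in simulations where the generating model is fixed in advance.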
“…The first setting is designed to show the advantages of SuffPCR relative to alternative methods, especially SPC. We note that other methods that employ screening by the marginal correlation [15,43] will have similar deficiencies. Because SPC works well if Equation (1) holds, we design Σ to violate this condition.…”
Section: Conditions Favorable To SuffPCR (mentioning)
confidence: 99%
“…There exist different methods aimed at signal reconstruction and deconvolution of the resulting high dimensional and complex datasets, but these methods almost always contain parameters that need to be estimated or present other types of ad hoc features. Developed specifically for Omics data and more particularly gene expression data such methods include the gene shaving method (Hastie et al., 2000), tree harvesting (Hastie et al., 2001), supervised principal components (Bair and Tibshirani, 2004) and amplified marginal eigenvector regression (Ding and McDonald, 2017). They employ widely different strategies to deal with the ubiquitous P ≫ N (many more variables than samples) problem in omics data.…”
Section: Introduction (mentioning)
confidence: 99%