2006
DOI: 10.1198/016214505000000628

Prediction by Supervised Principal Components

Abstract: In regression problems where the number of predictors greatly exceeds the number of observations, conventional regression techniques may produce unsatisfactory results. We describe a technique called supervised principal components that can be applied to this type of problem. Supervised principal components is similar to conventional principal components analysis except that it uses a subset of the predictors selected based on their association with the outcome. Supervised principal components can be applied t…
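The idea in the abstract (screen predictors by their association with the outcome, then run ordinary principal components on the retained subset) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `supervised_pc_fit` and `supervised_pc_predict`, the use of absolute Pearson correlation as the association score, and the fixed `threshold` are all assumptions made for illustration.

```python
import numpy as np


def supervised_pc_fit(X, y, threshold, n_components=1):
    """Illustrative sketch: screen predictors, then PCA + least squares.

    Assumption: association is measured by absolute Pearson correlation
    between each predictor column and the outcome.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean

    # Univariate association score for every predictor.
    col_norms = np.sqrt((Xc ** 2).sum(axis=0))
    col_norms[col_norms == 0] = np.inf  # constant columns get score 0
    y_norm = np.sqrt((yc ** 2).sum())
    if y_norm == 0:
        raise ValueError("outcome is constant")
    scores = np.abs(Xc.T @ yc) / (col_norms * y_norm)

    # Keep only predictors whose score exceeds the threshold.
    keep = np.flatnonzero(scores > threshold)
    if keep.size == 0:
        raise ValueError("threshold removed every predictor")

    # Principal components of the reduced, centered matrix via SVD.
    _, _, Vt = np.linalg.svd(Xc[:, keep], full_matrices=False)
    k = min(n_components, Vt.shape[0])
    loadings = Vt[:k].T                      # shape (n_kept, k)

    # Regress the centered outcome on the leading component scores.
    Z = Xc[:, keep] @ loadings
    coef, *_ = np.linalg.lstsq(Z, yc, rcond=None)

    return {"x_mean": x_mean, "y_mean": y_mean, "keep": keep,
            "loadings": loadings, "coef": coef, "scores": scores}


def supervised_pc_predict(model, X_new):
    """Project new data onto the fitted components and predict."""
    Xc = np.asarray(X_new, dtype=float) - model["x_mean"]
    Z = Xc[:, model["keep"]] @ model["loadings"]
    return model["y_mean"] + Z @ model["coef"]
```

In this sketch the threshold and the number of components are fixed inputs; in practice they would typically be tuned, for example by cross-validation.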

Cited by 670 publications (609 citation statements)
References 28 publications
“…In this section, we will develop a framework to directly make use of an outcome in sparse CCA. Our method for sparse supervised CCA (sparse sCCA) is closely related to the supervised principal components analysis (supervised PCA) method of Bair & Tibshirani (2004) and Bair et al (2006), and so we begin with an overview of that method. …”
Section: Application of Sparse mCCA to DLBCL Copy Number Data (citation type: mentioning)
confidence: 99%
“…Estimated factors are constructed to explain a large amount of variation between predictor variables but do not necessarily account for relationships between the estimated factors and the target relevant variable (i.e., the stock market return). Studies such as Bair, Hastie, Paul, and Tibshirani (2006), Boivin and Ng (2006), Bai and Ng (2008) or Cakmakli and van Dijk (2016) adopt a preselection filter approach to decide which variables should be used in latent factor estimation. Such filter techniques select a subset of predictor variables according to their previous forecast performance.…”
Section: Three-Pass Regression Filter (3PRF) (citation type: mentioning)
confidence: 99%
“…Patients in clinical remission, or with a second malignancy, or with a toxic death as a first event were censored at the date of last contact. As described in detail in supplemental Sections 4C and 5 to 9, a Cox score was used to rank genes based on their association with RFS, and a Cox proportional hazards model-based supervised principal components analysis 21 was used to build the gene expression classifier for RFS from the rank-ordered gene list. Similarly, for the development of the gene expression classifier predictive of end-induction MRD, a modified t test was used to rank genes expressed in pretreatment cells according to their association with day 29 flow MRD, defined as positive or negative at a threshold of 0.01%.…”
Section: Statistical Analyses (citation type: mentioning)
confidence: 99%
“…To develop a gene expression-based classifier predictive of RFS, each of the 23 775 informative probe sets on the gene expression microarrays was ranked based on strength of association with RFS (Cox score). 21 As detailed in supplemental Sections 4C, 5, and 8, a Cox proportional hazards model-based supervised principal component analysis was used to build the expression classifier for RFS, which was optimized by performing 20 iterations of 5-fold cross-validation. 21 The final model incorporated the top 42 Affymetrix microarray probe sets corresponding to 38 unique genes (see supplemental Table 4 for the gene list; false discovery rate = 8.45%, significance analysis of microarrays [SAM]).…”
Section: A Gene Expression Classifier Predictive of Survival (citation type: mentioning)
confidence: 99%