2017
DOI: 10.5705/ss.202015.0088

The statistics and mathematics of high dimension low sample size asymptotics

Abstract: The aim of this paper is to establish several deep theoretical properties of principal component analysis for multiple-component spike covariance models. Our new results reveal an asymptotic conical structure in critical sample eigendirections under the spike models with distinguishable (or indistinguishable) eigenvalues, when the sample size and/or the number of variables (or dimension) tend to infinity. The consistency of the sample eigenvectors relative to their population counterparts is determined by the …

Citations: cited by 42 publications (58 citation statements)
References: 38 publications (66 reference statements)

“…The proof, which is given in Appendix D, is based on Weyl's inequality and a useful variant of the Davis-Kahan theorem. We notice that some preceding works (Onatski, 2012; Shen et al., 2016; Wang and Fan, 2017) have provided similar results under a weaker pervasiveness assumption which allows p/n → ∞ in any manner and the spiked eigenvalues {λ_j}_{j=1}^K are allowed to grow slower than p so long as c_j = p/(nλ_j) is bounded.…”
Section: U-Type Covariance Estimation (mentioning)
confidence: 56%
“…A more important point, however, is that PCA does not require more individuals than variables when it is calculated with singular value decomposition (SVD) as opposed to Eigenvector decomposition (Shen et al, 2016;Yata and Aoshima, 2010) for implementation (see the prcomp function in R). These statistical procedures in fact produce equivalent results, although SVD is more numerically accurate for some data and so preferable on those grounds.…”
Section: Vaccines (mentioning)
confidence: 99%
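
The claim in the statement above, that PCA computed by SVD and PCA computed by eigendecomposition give equivalent results even with fewer individuals than variables, can be checked numerically. The following is a minimal sketch in Python with NumPy, not a reproduction of R's prcomp. The function names `pca_via_svd` and `pca_via_eigh`, the simulated Gaussian data, and the sizes n = 10 and p = 200 are all assumptions chosen for illustration.

```python
import numpy as np


def pca_via_svd(X):
    """PCA from the SVD of the column-centered data matrix (hypothetical helper)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions; s holds the singular values.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = s**2 / (X.shape[0] - 1)
    return variances, Vt


def pca_via_eigh(X):
    """PCA from the eigendecomposition of the p x p sample covariance (hypothetical helper)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (X.shape[0] - 1)
    evals, evecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]         # reorder to descending
    return evals[order], evecs[:, order].T  # rows are principal directions


# Simulated HDLSS-style data: far fewer samples than variables (assumed sizes).
rng = np.random.default_rng(0)
n, p = 10, 200
X = rng.standard_normal((n, p))

var_svd, dirs_svd = pca_via_svd(X)
var_eig, dirs_eig = pca_via_eigh(X)

# Centering removes one degree of freedom, so at most n - 1 components
# carry nonzero variance; only those are compared.
k = n - 1
same_variances = np.allclose(var_svd[:k], var_eig[:k])
# Directions are defined only up to sign, so compare absolute inner products.
alignment = np.abs(np.sum(dirs_svd[:k] * dirs_eig[:k], axis=1))
same_directions = np.allclose(alignment, 1.0, atol=1e-6)
print(same_variances, same_directions)
```

In this sketch the SVD route works directly on the n × p centered data, while the eigendecomposition route first forms the p × p covariance matrix. This is one reason the SVD route is often preferred when p is much larger than n.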
“…Microarray technology is a high-throughput platform for analyzing gene expression profiles in tissues and makes it possible to examine the expression of thousands of genes simultaneously. However, in most cases, a single microarray dataset is high-dimensional low-sample size (HDLSS) data, which indeed presents serious statistical challenges [14]. Therefore, the integration of multiple microarray data sets can increase the sample size and may improve the statistical power.…”
Section: Introduction (mentioning)
confidence: 99%