2013
DOI: 10.1214/13-aos1127
Optimal detection of sparse principal components in high dimension

Abstract: We perform a finite sample analysis of the detection levels for sparse principal components of a high-dimensional covariance matrix. Our minimax optimal test is based on a sparse eigenvalue statistic. Alas, computing this test is known to be NP-complete in general, and we describe a computationally efficient alternative test using convex relaxations. Our relaxation is also proved to detect sparse principal components at near optimal detection levels, and it performs well on simulated datasets. Moreover, using …
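The sparse eigenvalue statistic the abstract refers to maximizes the quadratic form over unit vectors with at most k nonzero coordinates. A minimal brute-force sketch (illustrative names, pure NumPy; not the paper's code) makes the combinatorial cost explicit — the search over supports is exponential in the dimension, which is why the paper turns to a convex relaxation:

```python
import numpy as np
from itertools import combinations

def sparse_eigenvalue(Sigma_hat, k):
    """Brute-force k-sparse largest eigenvalue: maximize over all size-k
    supports the top eigenvalue of the corresponding principal submatrix.
    Exponential in the dimension d -- illustration only."""
    d = Sigma_hat.shape[0]
    best = -np.inf
    for supp in combinations(range(d), k):
        sub = Sigma_hat[np.ix_(supp, supp)]
        best = max(best, np.linalg.eigvalsh(sub)[-1])
    return best

rng = np.random.default_rng(1)
n, d, k = 200, 10, 3
X = rng.normal(size=(n, d))           # pure-noise sample
Sigma_hat = X.T @ X / n               # sample covariance
stat = sparse_eigenvalue(Sigma_hat, k)
```

By construction the statistic is sandwiched between the largest diagonal entry (a 1-sparse vector is feasible) and the unrestricted top eigenvalue.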

Cited by 201 publications
(264 citation statements)
References 75 publications
“…In passing, we note that there is a very interesting line of work on exact or approximate support reconstruction for sparse PCA, i.e., estimating correctly or consistently the positions of non-zeros in v, in a regime where the size of the support is sublinear in n (see e.g., [28,5,12,32,21] and references therein). In an influential paper [28], it was shown that while the estimate via the classical PCA is inconsistent, a simple diagonal thresholding procedure consistently estimates v provided that v is sufficiently sparse.…”
Section: Sparse PCA (mentioning)
confidence: 99%
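The diagonal thresholding procedure mentioned in the citation above exploits that, in a spiked model, coordinates in the spike's support have inflated marginal variance. A minimal sketch (hypothetical threshold choice, not the one from any specific paper):

```python
import numpy as np

# Spiked model: X = Z + sqrt(theta) * g * v, so coordinate i has
# variance 1 + theta * v_i**2 -- inflated exactly on the support of v.
rng = np.random.default_rng(0)
n, d, k, theta = 2000, 100, 5, 2.0
v = np.zeros(d)
v[:k] = 1.0 / np.sqrt(k)                       # unit spike on {0,...,k-1}
Z = rng.normal(size=(n, d))
X = Z + np.sqrt(theta) * rng.normal(size=(n, 1)) * v

# Diagonal thresholding: keep coordinates whose sample variance
# exceeds a threshold slightly above the null level 1.
# The constant 4 is an illustrative, heuristic choice.
sample_var = (X ** 2).mean(axis=0)
tau = 1.0 + 4.0 * np.sqrt(np.log(d) / n)
support_hat = np.flatnonzero(sample_var > tau)
```

Here the signal coordinates have variance 1 + θ/k = 1.4 versus 1 under the null, so with n = 2000 samples the threshold separates them cleanly.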
“…Consider the expression for the KL divergence given in (10). Using (3), we obtain $\mathrm{KL}(P_{0|A} \,\|\, P_{S|A}) = \mathrm{KL}(P_0 \,\|\, P_{S \cap A})$…”
Section: B. Proof of Bound on KL Divergence (mentioning)
confidence: 99%
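The restriction identity quoted above can be checked numerically under the spiked Gaussian model — assuming $P_S = N(0, I_d + \theta vv^\top)$ with a unit spike $v$ supported on $S$, and $P_{\cdot|A}$ denoting the marginal on a coordinate set $A$ (this interpretation of the citing paper's notation is an assumption). The key point is that $(vv^\top)_{A,A} = v_A v_A^\top$, so the restricted alternative is itself a spiked model with support $S \cap A$:

```python
import numpy as np

def gauss_kl(S0, S1):
    """Closed-form KL(N(0,S0) || N(0,S1)) in nats."""
    d = S0.shape[0]
    S1inv = np.linalg.inv(S1)
    return 0.5 * (np.trace(S1inv @ S0) - d
                  + np.log(np.linalg.det(S1)) - np.log(np.linalg.det(S0)))

rng = np.random.default_rng(0)
d, S, A, theta = 8, {0, 1, 2, 3}, {2, 3, 4, 5}, 0.7
v = np.zeros(d)
idx = sorted(S)
v[idx] = rng.normal(size=len(idx))
v /= np.linalg.norm(v)                         # unit spike supported on S

Sigma0 = np.eye(d)
SigmaS = np.eye(d) + theta * np.outer(v, v)    # spiked covariance

a = sorted(A)                                  # LHS: KL of the A-marginals
lhs = gauss_kl(Sigma0[np.ix_(a, a)], SigmaS[np.ix_(a, a)])

sa = sorted(S & A)                             # RHS: spike restricted to S∩A
u = v[sa]
rhs = gauss_kl(np.eye(len(sa)), np.eye(len(sa)) + theta * np.outer(u, u))
```

Both sides reduce to $\tfrac12\big(\log(1+\theta m) - \tfrac{\theta m}{1+\theta m}\big)$ with $m = \|v_{S\cap A}\|^2$, so they agree exactly.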
“…Besides anomaly detection, detection of correlations is also of interest to assess to what extent dimensionality reduction can be performed on a data stream. Reduction of dimensionality is a workhorse of data analysis, and there has been a strong recent interest in modifying principal component analysis to deal with high-dimensional data [10], [12], [26]. Testing when this type of transformation is justified is thus an important problem.…”
(mentioning)
confidence: 99%
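The test alluded to in the citation above — checking whether any correlation structure is present before reducing dimension — can be sketched with the unrestricted top eigenvalue of the sample covariance, which under a pure-noise null concentrates near the Marchenko–Pastur bulk edge $(1 + \sqrt{d/n})^2$. The factors 1.5 and 1.3 below are illustrative margins, not calibrated critical values:

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 500, 50
edge = (1 + np.sqrt(d / n)) ** 2          # Marchenko-Pastur bulk edge

# Null: isotropic Gaussian noise -- top eigenvalue sticks to the edge.
X_null = rng.normal(size=(n, d))
lam_null = np.linalg.eigvalsh(X_null.T @ X_null / n)[-1]

# Alternative: add a strong rank-one spike along v (population
# covariance I + 4 * v v^T) -- top eigenvalue separates from the bulk.
v = np.ones(d) / np.sqrt(d)
X_alt = X_null + 2.0 * rng.normal(size=(n, 1)) * v
lam_alt = np.linalg.eigvalsh(X_alt.T @ X_alt / n)[-1]
```

When the top eigenvalue clears the bulk edge, projecting onto the leading principal components is justified; when it does not, classical PCA recovers nothing, which is where the sparse variants discussed in this paper come in.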
See 1 more Smart Citation
“…[2,15] suggested heuristics for when the detection levels are unknown, but these are not proven to achieve the optimal detection levels. Berthet et al. [47,53] studied whether there exists a polynomial-time computable statistic for reliably detecting the presence of a single spike of $\ell_0$-sparsity $k$. They proved that no polynomial-time algorithm will reconstruct the support unless $k \lesssim \sqrt{n}$.…”
mentioning
confidence: 99%