2018
DOI: 10.1093/biomet/asy011

Robust estimation of high-dimensional covariance and precision matrices

Abstract: High-dimensional data are often most plausibly generated from distributions with complex structure and leptokurtosis in some or all components. Covariance and precision matrices provide a useful summary of such structure, yet the performance of popular matrix estimators typically hinges upon a sub-Gaussianity assumption. This paper presents robust matrix estimators whose performance is guaranteed for a much richer class of distributions. The proposed estimators, under a bounded fourth moment assumption…

Cited by 69 publications (58 citation statements)
References 30 publications
“…All of these applications present some shared challenges: in most cases, the number of features (genes, brain regions, microbial taxa) far exceeds the number of data samples; it is generally impossible, without making additional assumptions or incorporating domain knowledge, to distinguish between direct and indirect correlations; the choice of the correlation or similarity measure is often application-dependent. Methods for microbial ecology network estimation from metagenomic data could benefit greatly from recent advances in high dimensional correlation matrix estimation [67][68][69][70]. Work in progress is aimed at evaluating the applicability of such methods in constructing stable microbial ecology networks from metagenomic data.…”
Section: Discussion (mentioning)
confidence: 99%
“…Furthermore, the proposed method is able to achieve its best observed performance using only 50 samples for feature selection. Work in progress is aimed at further improving the two key components of NBBD, e.g., by incorporating recent advances in high dimensional correlation matrix estimation [67][68][69][70] to improve the reliability and the stability of the resulting networks, and by exploring improved node scoring methods. Other promising directions for future research include systematic evaluation of the NBBD framework for biomarker discovery from different types of omics data, and integrative analyses of multi-omics data [71,72], e.g., using information-preserving low-dimensional network embeddings [73].…”
Section: Discussion (mentioning)
confidence: 99%
“…Both constructions, however, involve brute-force search over every direction in a d-dimensional ε-net, and thus are computationally intractable. From an element-wise perspective, Avella-Medina et al. (2018) combined robust estimates of the first and second moments to obtain variance estimators. In practice, three potential drawbacks of this approach are: (i) the accumulated error consists of the errors from estimating the first and second moments, which may be significant; (ii) the diagonal variance estimators are not necessarily positive and therefore additional adjustments are required; and (iii) using cross-validation to calibrate a total of O(d^2) tuning parameters is computationally expensive.…”
Section: Overview of the Previous Work (mentioning)
confidence: 99%
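The element-wise idea described in the statement above can be illustrated with a minimal sketch. It assumes a simple truncated (clipped) mean as a stand-in robust mean; the statement does not say which robust estimator the cited paper uses, and the function names and the thresholds `tau_first` and `tau_second` are hypothetical. Each entry is formed as a robust second moment minus the product of robust first moments, which also shows why drawback (ii) can occur.

```python
from statistics import fmean


def truncated_mean(values, tau):
    """Average of the values after clipping each one to [-tau, tau].

    This is a stand-in robust mean chosen for illustration (an assumption),
    not necessarily the estimator used in the cited work.
    """
    return fmean(max(-tau, min(tau, v)) for v in values)


def elementwise_robust_cov(rows, tau_first, tau_second):
    """Entry (j, k) = robust E[X_j X_k] - robust E[X_j] * robust E[X_k].

    rows: list of observations, each a list of d floats.
    tau_first, tau_second: hypothetical truncation levels for the first
    and second moments (one shared level each, to keep the sketch short).
    """
    d = len(rows[0])
    cols = [[row[j] for row in rows] for j in range(d)]
    mu = [truncated_mean(col, tau_first) for col in cols]
    cov = [[0.0] * d for _ in range(d)]
    for j in range(d):
        for k in range(j, d):
            products = [a * b for a, b in zip(cols[j], cols[k])]
            value = truncated_mean(products, tau_second) - mu[j] * mu[k]
            cov[j][k] = cov[k][j] = value  # fill both triangles
    return cov


# With very large thresholds nothing is clipped, so the result matches
# the ordinary (1/n-normalised) sample covariance.
data = [[1.0, 2.0], [3.0, 6.0], [5.0, 1.0]]
plain = elementwise_robust_cov(data, 1e9, 1e9)

# A diagonal entry can turn negative when the second moment is clipped
# harder than the first: the clipped mean of squares falls below the
# squared mean.
shifted = [[10.0], [10.0], [10.0]]
negative = elementwise_robust_cov(shifted, 100.0, 50.0)
```

In this sketch a single pair of thresholds is shared by all entries; calibrating a separate threshold per entry is what leads to the O(d^2) tuning parameters mentioned in drawback (iii).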
“…Building on the ideas of and Avella-Medina et al. (2018), we propose user-friendly tail-robust covariance estimators that enjoy desirable finite-sample deviation bounds under weak moment conditions. The constructed estimators only involve simple truncation techniques and are computationally friendly.…”
Section: Overview of the Previous Work (mentioning)
confidence: 99%