Sharp optimal recovery in the two-component Gaussian Mixture Model
Preprint, 2018. DOI: 10.48550/arxiv.1812.08078

Abstract: In this paper, we study the problem of clustering in the two-component Gaussian mixture model when the centers are separated by some ∆ > 0. We present a non-asymptotic lower bound for the corresponding minimax Hamming risk, improving on existing results. We also propose an efficient and adaptive procedure that is minimax rate optimal. Moreover, the rate optimality is sharp in the asymptotic regime where the sample size goes to infinity. Our procedure is based on a variant of Lloyd's iterations initialized by a…
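The truncated sentence points to a Lloyd-type scheme with a data-driven initialization. Below is a minimal sketch under the assumption that the initialization is spectral (sign of the top eigenvector of the Gram matrix); this is a hypothetical illustration of the general approach, not the authors' exact procedure:

```python
import numpy as np

def two_component_lloyd(Y, n_iter=10):
    """Cluster rows of Y into two groups: spectral init + Lloyd iterations.

    Y : (n, d) data matrix from a two-component Gaussian mixture.
    Returns labels in {-1, +1}. Hypothetical sketch, not the paper's algorithm.
    """
    n, d = Y.shape
    # Spectral initialization: sign of the top eigenvector of the Gram matrix.
    G = Y @ Y.T
    _, vecs = np.linalg.eigh(G)          # eigenvalues in ascending order
    labels = np.sign(vecs[:, -1])
    labels[labels == 0] = 1.0
    for _ in range(n_iter):
        # Center estimate: average of the sign-corrected observations
        # (for symmetric centers theta and -theta).
        theta = (labels @ Y) / n
        # Lloyd update: assign each point to the nearer of +theta / -theta,
        # which reduces to the sign of an inner product.
        labels = np.sign(Y @ theta)
        labels[labels == 0] = 1.0
    return labels
```

For symmetric centers θ and −θ, the nearest-center assignment in each Lloyd step reduces to the sign of an inner product, which is what the update above exploits.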

Cited by 16 publications (37 citation statements). References 12 publications (32 reference statements).
“…When Σ is known and the noise is normal, it is easy to see that Σ^{-1/2} Y follows the isotropic Gaussian mixture model where the signal vector is given by Σ^{-1/2} θ. Following reasoning similar to [12], we can show in the general case that inf…”
Section: Statement of the Problem
confidence: 83%
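The quoted reduction is the standard whitening step; a minimal sketch, assuming Σ is known and positive definite (variable names are illustrative):

```python
import numpy as np

def whiten(Y, Sigma):
    """Map observations of a general Gaussian mixture to the isotropic model.

    If Y_i = eta_i * theta + xi_i with xi_i ~ N(0, Sigma), then
    Sigma^{-1/2} Y_i = eta_i * (Sigma^{-1/2} theta) + N(0, I) noise.
    """
    # Inverse square root of Sigma via its eigendecomposition.
    w, V = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = V @ np.diag(1.0 / np.sqrt(w)) @ V.T
    return Y @ Sigma_inv_sqrt  # rows are the whitened observations
```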
“…This particular choice is motivated by the fact that we are seeking bounds that hold adaptively for large classes of Σ, which is typically unknown in practical applications. The proof of the lower bound is inspired by the argument in [12], which only holds for isotropic noise. As for the upper bound, we show that the "averaging" linear classifier, defined as η̂_ave(y) := sign(⟨∑_{i=1}^n Y_i η_i, y⟩), is minimax optimal.…”
Section: Minimax Clustering: The Supervised Case
confidence: 99%
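A minimal sketch of this averaging classifier, assuming a supervised sample (Y_i, η_i) with labels η_i ∈ {−1, +1} (names are illustrative):

```python
import numpy as np

def eta_ave(Y, eta, y):
    """Averaging linear classifier: sign(< sum_i eta_i * Y_i, y >).

    Y   : (n, d) array of labelled observations
    eta : (n,) array of labels in {-1, +1}
    y   : (d,) new observation to classify
    """
    w = eta @ Y          # sum_i eta_i * Y_i, an unnormalized direction estimate
    return np.sign(w @ y)
```

Note that any positive rescaling of the averaged direction (e.g. dividing by n) leaves the sign, and hence the classifier, unchanged.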
“…, ĜK . On the other hand, it is known that rounding is not necessary, as the relaxed SDP solution can be directly used to recover the K-means in (1) when the separation of the cluster centers is large enough, a property often referred to in the literature as hidden integrality [13,10,25,6].…”
Section: SDP Relaxed K-means
confidence: 99%
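The program (1) referenced in the quote is not reproduced here; a minimal sketch of the standard SDP relaxation of K-means (Peng–Wei form), written with cvxpy, is one way to make the statement concrete. All names are illustrative and the citing paper's exact formulation may differ:

```python
import cvxpy as cp
import numpy as np

def sdp_kmeans(X, K):
    """Solve the standard SDP relaxation of K-means.

    maximize  <A, Z>   over symmetric Z
    s.t.      Z PSD, Z >= 0 entrywise, Z 1 = 1, trace(Z) = K,
    where A = X X^T is the Gram matrix of the data.
    """
    n = X.shape[0]
    A = X @ X.T
    Z = cp.Variable((n, n), symmetric=True)
    constraints = [Z >> 0,                      # positive semidefinite
                   Z >= 0,                      # entrywise nonnegative
                   cp.sum(Z, axis=1) == 1,      # rows sum to one
                   cp.trace(Z) == K]
    prob = cp.Problem(cp.Maximize(cp.trace(A @ Z)), constraints)
    prob.solve()
    return Z.value
```

Under sufficient center separation, the hidden-integrality results cited above state that the SDP solution is itself (close to) the block-diagonal cluster indicator matrix, so no rounding step is needed.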
“…This framework is indeed a good benchmark since, on the one hand, it is sufficiently simple to allow us to understand the nature of the target (β_{0,1}, ..., β_{0,K}), with K = 2 and β_{0,1} = −β_{0,2} in our bipartite framework, and to investigate the rate of convergence of estimators of the form of (2), suitably regularized by an ℓ_1-penalty. On the other hand, the two-component high-dimensional Gaussian mixture has recently received a lot of attention [7,2,35,28,24,15,12,3,25,8,30]. Let us emphasize that our goal is not a priori to provide a state-of-the-art method specifically designed to solve the high-dimensional…”
Section: Introduction
confidence: 99%