2019
DOI: 10.1214/18-aos1711

CHIME: Clustering of high-dimensional Gaussian mixtures with EM algorithm and its optimality

Abstract: Unsupervised learning is an important problem in statistics and machine learning with a wide range of applications. In this paper, we study clustering of high-dimensional Gaussian mixtures and propose a procedure, called CHIME, that is based on the EM algorithm and a direct estimation method for the sparse discriminant vector. Both theoretical and numerical properties of CHIME are investigated. We establish the optimal rate of convergence for the excess mis-clustering error and show that CHIME is minimax rate …
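To make the E-step/M-step structure mentioned in the abstract concrete, here is a minimal sketch of EM for a two-component Gaussian mixture with a shared spherical covariance. This is a simplified low-dimensional illustration only, not the CHIME procedure itself, which additionally performs direct sparse estimation of the discriminant vector in high dimensions; the function name and initialization scheme are this sketch's own choices.

```python
# Minimal EM for a two-component Gaussian mixture with shared spherical
# covariance -- an illustration of the E/M structure, NOT CHIME's sparse,
# high-dimensional procedure.
import numpy as np

def em_gmm2(X, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Farthest-point initialization keeps the two starting means apart.
    mu0 = X[rng.integers(n)]
    mu1 = X[((X - mu0) ** 2).sum(axis=1).argmax()]
    pi, sigma2 = 0.5, 1.0
    gamma = np.full(n, 0.5)
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 1.
        d0 = ((X - mu0) ** 2).sum(axis=1)
        d1 = ((X - mu1) ** 2).sum(axis=1)
        log_odds = np.log(pi / (1 - pi)) + (d0 - d1) / (2 * sigma2)
        gamma = 1.0 / (1.0 + np.exp(np.clip(-log_odds, -700, 700)))
        # M-step: update mixing weight, means, and the shared variance.
        pi = gamma.mean()
        mu0 = (1 - gamma) @ X / (1 - gamma).sum()
        mu1 = gamma @ X / gamma.sum()
        d0 = ((X - mu0) ** 2).sum(axis=1)
        d1 = ((X - mu1) ** 2).sum(axis=1)
        sigma2 = ((1 - gamma) * d0 + gamma * d1).sum() / (n * p)
    labels = (gamma > 0.5).astype(int)
    return labels, mu0, mu1
```

In the high-dimensional regime the paper studies, the M-step for the discriminant direction is replaced by a sparse direct estimator; the low-dimensional update above would not attain the minimax rate there.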

Cited by 58 publications (62 citation statements). References 27 publications.
“…The 2D images of 89 cross sections (89 CS), 92 cross sections (92 CS), and 95 cross sections (95 CS) in the T1-weighted brain MRI image were selected for segmentation. The comparison algorithms used were FCM, CoFKM (Cai et al., 2019), the two-layer automatic weighted clustering algorithm (TW-k-means) (Singh et al., 2020), multitask-based K-means (CombKM) (Chen et al., 2013), and collaborative clustering based on sample and feature space (co-clustering) (Gu and Zhou, 2009). In the experiment, the iteration stop threshold ε of each algorithm was set to 0.001, and the maximum number of iterations l was set to 100.…”
Section: Simulation Experiments Analysis, Experimental Background (mentioning)
confidence: 99%
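The stopping rule quoted in this excerpt (change below a threshold ε = 0.001, or at most 100 iterations) is common to the whole family of center-based algorithms it lists. As a hedged sketch, here it is applied to plain k-means (Lloyd's algorithm); the cited methods (FCM, CoFKM, TW-k-means, CombKM) each use their own update rules, not this code.

```python
# k-means with the stopping rule described in the excerpt: iterate until the
# largest center movement falls below eps, or max_iter iterations have run.
import numpy as np

def kmeans(X, k, eps=0.001, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Farthest-first initialization: spread the starting centers out.
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        dmin = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[dmin.argmax()])
    centers = np.array(centers)
    for _ in range(max_iter):
        # Assignment step: nearest center for every point.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Update step: recompute each center (keep it if its cluster emptied).
        new_centers = np.array([X[labels == j].mean(axis=0)
                                if np.any(labels == j) else centers[j]
                                for j in range(k)])
        # Stopping rule: largest center movement below eps.
        if np.linalg.norm(new_centers - centers, axis=1).max() < eps:
            centers = new_centers
            break
        centers = new_centers
    return labels, centers
```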
“…Furthermore, we also assume that the eigenvalues of the covariance matrix Σ are bounded from below and above. This assumption is commonly used in high-dimensional statistics, ranging from high-dimensional linear regression (Javanmard and Montanari, ), covariance matrix estimation (Cai and Yuan, ), and classification (Cai and Liu, ) to clustering (Cai et al., ).…”
Section: Introduction (mentioning)
confidence: 97%
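For reference, the bounded-eigenvalue condition described in the excerpt above is usually stated as follows; this is the standard formulation, not a formula copied from the cited paper.

```latex
% Eigenvalues of the covariance matrix bounded away from 0 and infinity:
% there exist constants $0 < c \le C < \infty$ such that
c \;\le\; \lambda_{\min}(\Sigma) \;\le\; \lambda_{\max}(\Sigma) \;\le\; C .
```

The lower bound rules out degenerate (nearly singular) covariances, and the upper bound rules out unbounded noise directions; together they keep the discriminant problem well conditioned.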
“…Despite the recent progress, we are not aware of any results that characterize the local convergence behavior of the EM algorithm on mixtures of two or more Gaussians. For variants of the EM algorithm for fitting high-dimensional mixture models, we refer readers to Dasgupta and Schulman [10], Cai, Ma and Zhang [6], Wang et al. [28], and Yi and Caramanis [32].…”
Section: Related Work (mentioning)
confidence: 99%