2020
DOI: 10.1214/19-ejs1660

Statistical convergence of the EM algorithm on Gaussian mixture models

Abstract: We study the convergence behavior of the Expectation Maximization (EM) algorithm on Gaussian mixture models with an arbitrary number of mixture components and mixing weights. We show that as long as the means of the components are separated by at least Ω(√(min{M, d})), where M is the number of components and d is the dimension, the EM algorithm converges locally to the global optimum of the log-likelihood. Further, we show that the convergence rate is linear and characterize the size of the basin of attraction t…
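
As a companion to the abstract, here is a minimal sketch of one EM iteration for a spherical Gaussian mixture with known common variance and uniform mixing weights, the kind of update the convergence analysis concerns. The function and variable names (em_step, means, X, sigma) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def em_step(X, means, sigma=1.0):
    """One EM iteration for a spherical GMM with known variance and
    uniform mixing weights (illustrative sketch only)."""
    n, d = X.shape
    M = means.shape[0]
    # E-step: posterior responsibilities p(component j | x_i)
    sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)  # (n, M)
    log_resp = -sq_dist / (2 * sigma ** 2)
    log_resp -= log_resp.max(axis=1, keepdims=True)  # numerical stability
    resp = np.exp(log_resp)
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: each mean becomes a responsibility-weighted average of the data
    new_means = (resp.T @ X) / resp.sum(axis=0)[:, None]
    return new_means
```

Under the paper's separation condition, iterating a step of this form from a sufficiently good initialization contracts toward the true means at a linear rate.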

Cited by 25 publications (38 citation statements). References 26 publications (41 reference statements).
“…Let μ*_1, …, μ*_M denote the ground-truth means of the M Gaussian components in a d-dimensional Gaussian mixture dataset, and let R_min = min_{i≠j} ‖μ*_i − μ*_j‖, where μ*_i and μ*_j denote the ground-truth means of components i and j, denote the minimum distance between the means of any two components. Zhao et al. (2020) provide a convergence guarantee of the EM algorithm to the global optimum as long as the following condition holds: …”
Section: Methods (mentioning)
confidence: 99%
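
The minimum-separation condition this excerpt refers to can be checked numerically. The sketch below computes R_min from a set of candidate means and compares it against C·√(min(M, d)); the constant C and the function name separation_ok are illustrative assumptions, not values from the paper.

```python
import numpy as np

def separation_ok(means, C=1.0):
    """Check an Ω(√(min(M, d)))-style separation condition for component
    means of shape (M, d). The constant C is illustrative only."""
    M, d = means.shape
    # Pairwise distances between all component means
    diffs = means[:, None, :] - means[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    r_min = dists[~np.eye(M, dtype=bool)].min()
    return r_min >= C * np.sqrt(min(M, d)), r_min
```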
“…Currently, the best known sample requirements for EM are n = Ω̃(K³dR²_max/R²_min), whereas for gradient EM, n = Ω̃(K⁶dR⁶_max/R²_min). In addition, the bounds for the resulting errors increase linearly with R_max; see [22, 23]. Note that in these two results, the required number of samples increases at least quadratically with the maximal separation between clusters, even though increasing R_max should make the problem easier.…”
Section: Introduction (mentioning)
confidence: 93%
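
To make the scaling in this excerpt concrete, the snippet below evaluates the two sample-complexity expressions as quoted, ignoring the constants and logarithmic factors hidden in the Ω̃ notation. The function names and the example numbers are illustrative assumptions.

```python
def em_sample_bound(K, d, r_max, r_min):
    """Order of samples needed by EM per the quoted bound (no constants/logs)."""
    return K**3 * d * r_max**2 / r_min**2

def gradient_em_sample_bound(K, d, r_max, r_min):
    """Order of samples needed by gradient EM per the quoted bound."""
    return K**6 * d * r_max**6 / r_min**2

# Example: doubling the maximal separation r_max quadruples the EM bound,
# illustrating the quadratic growth in R_max noted in the excerpt.
print(em_sample_bound(5, 10, 4.0, 2.0), em_sample_bound(5, 10, 8.0, 2.0))
```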
“…Let us compare our results to several recent works that derived convergence guarantees for EM and gradient EM. [23] and [22] proved local convergence to the global optimum under a much larger minimal separation of R_min ≥ C√(min(d, K)) log K. In addition, the requirement on the initial estimates had a dependence on the maximal separation, ‖μ_i − μ*_i‖ ≤ ½R_min − C₁√(min(d, K)) log max(R_max, K³), for a universal constant C₁. These results were significantly improved by [13], who proved the local convergence of the EM algorithm for the more general case of spherical Gaussians with unknown weights and variances.…”
Section: Introduction (mentioning)
confidence: 99%
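
A small sketch of how the initialization requirement quoted above could be checked for a candidate set of initial means. It transcribes the condition exactly as reconstructed in that excerpt; the constant C1 and the helper name in_basin are assumptions for illustration, not values from the cited works.

```python
import numpy as np

def in_basin(init_means, true_means, r_min, r_max, C1=1.0):
    """Check the quoted initialization condition
    ||mu_i - mu*_i|| <= 0.5*R_min - C1*sqrt(min(d, K))*log(max(R_max, K**3))
    for every component (illustrative constant C1)."""
    K, d = true_means.shape
    slack = 0.5 * r_min - C1 * np.sqrt(min(d, K)) * np.log(max(r_max, K**3))
    errs = np.linalg.norm(init_means - true_means, axis=1)
    return bool(np.all(errs <= slack)), errs, slack
```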
“…The convergence of the EM algorithm under the Gaussian mixture model is well studied, and therefore we can impose alternative assumptions to substitute for condition (A1) in this case. For example, if the conditions in Theorem 3.3 of Zhao et al. [47] hold, the EM algorithm in Examples 2 and 3 converges to the MLE.…”
Section: Asymptotics (mentioning)
confidence: 99%