Monotonically Overrelaxed EM Algorithms

Yu, Yaming

doi:10.1080/10618600.2012.672115

Cited by 14 publications

(6 citation statements)

References 34 publications


“…It replaces the M-step by conducting one iteration of Newton's method. Alternative approaches, such as surrogate functions (Lange et al., 2000) and overrelaxed EM algorithm (Yu, 2012), have also been introduced in the literature. Pan and Shen (2007) introduced ℓ1-penalty to the mean parameters for mixture of univariate normal models.…”

Section: Methods (mentioning)

confidence: 99%

Regularized matrix data clustering and its application to image analysis

Shen, Zhang, et al. 2020

Biometrics


We propose a novel regularized mixture model for clustering matrix-valued data. The proposed method assumes a separable covariance structure for each cluster and imposes a sparsity structure (e.g., low rankness, spatial sparsity) for the mean signal of each cluster. We formulate the problem as a finite mixture model of matrix-normal distributions with regularization terms, and then develop an EM-type of algorithm for efficient computation. In theory, we show that the proposed estimators are strongly consistent for various choices of penalty functions. Simulation and two applications on brain signal studies confirm the excellent performance of the proposed method including a better prediction accuracy than the competitors and the scientific interpretability of the solution.


“…The order-one AA scheme (8) is similar but not equivalent to successive overrelaxation (SOR) used in iterative methods for solving large linear systems (e.g., Young (1971)), and it is also similar to the monotonic overrelaxed EM algorithm detailed in Yu (2012). Order-one AA also resembles the STEM procedures described in Varadhan & Roland (2008) which are schemes that utilize Steffensen-type methods to accelerate EM.…”

Section: Anderson Acceleration When M = (mentioning)

confidence: 99%

Damped Anderson Acceleration With Restarts and Monotonicity Control for Accelerating EM and EM-like Algorithms

Henderson, Varadhan 2019

Journal of Computational and Graphical Statistics


The expectation-maximization (EM) algorithm is a well-known iterative method for computing maximum likelihood estimates in a variety of statistical problems. Despite its numerous advantages, a main drawback of the EM algorithm is its frequently observed slow convergence which often hinders the application of EM algorithms in high-dimensional problems or in other complex settings. To address the need for more rapidly convergent EM algorithms, we describe a new class of acceleration schemes that build on the Anderson acceleration technique for speeding fixed-point iterations. Our approach is effective at greatly accelerating the convergence of EM algorithms and is automatically scalable to high dimensional settings. Through the introduction of periodic algorithm restarts and a damping factor, our acceleration scheme provides faster and more robust convergence when compared to un-modified Anderson acceleration, while also improving global convergence. Crucially, our method works as an "off-the-shelf" method in that it may be directly used to accelerate any EM algorithm without relying on the use of any model-specific features or insights. Through a series of simulation studies involving five representative problems, we show that our algorithm is substantially faster than the existing state-of-art acceleration schemes.
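The overrelaxation idea referred to in the citation statement above can be illustrated on a toy problem. The following is only a minimal sketch under stated assumptions, not the scheme of Yu (2012) or of Henderson and Varadhan (2019): it estimates the mixing weight of a two-component normal mixture with known components N(0, 1) and N(3, 1), extrapolates each EM step by a fixed factor `omega`, and falls back to the plain EM step whenever the extrapolated point leaves (0, 1) or does not reach the log-likelihood of the EM step. The function names, the value `omega=1.5`, and the simulated data are all hypothetical choices made for illustration.

```python
import math
import random


def normal_pdf(x, mu):
    """Density of N(mu, 1) at x."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)


def loglik(p, xs):
    """Log-likelihood of mixing weight p for the mixture p*N(0,1) + (1-p)*N(3,1)."""
    return sum(
        math.log(p * normal_pdf(x, 0.0) + (1.0 - p) * normal_pdf(x, 3.0))
        for x in xs
    )


def em_step(p, xs):
    """One plain EM update: the new weight is the mean responsibility."""
    resp = [
        p * normal_pdf(x, 0.0)
        / (p * normal_pdf(x, 0.0) + (1.0 - p) * normal_pdf(x, 3.0))
        for x in xs
    ]
    return sum(resp) / len(resp)


def overrelaxed_em(xs, p0=0.5, omega=1.5, iters=50):
    """EM with a fixed overrelaxation factor and a simple monotonicity safeguard.

    Assumption: the safeguard (accept the extrapolated point only if it is
    feasible and at least as good as the EM step) is a simplification chosen
    for this sketch.
    """
    p = p0
    history = [loglik(p, xs)]
    for _ in range(iters):
        p_em = em_step(p, xs)               # ordinary EM update
        p_try = p + omega * (p_em - p)      # overrelaxed (extrapolated) update
        if 0.0 < p_try < 1.0 and loglik(p_try, xs) >= loglik(p_em, xs):
            p = p_try
        else:
            p = p_em                        # fall back to the monotone EM step
        history.append(loglik(p, xs))
    return p, history


# Simulated data with a hypothetical true weight of 0.3 on the N(0, 1) component.
random.seed(0)
xs = [
    random.gauss(0.0, 1.0) if random.random() < 0.3 else random.gauss(3.0, 1.0)
    for _ in range(200)
]
p_hat, history = overrelaxed_em(xs)
```

Because the plain EM step never decreases the log-likelihood and the extrapolated point is accepted only when it does at least as well, the values in `history` are non-decreasing up to floating-point rounding.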


“…Second, if the data require many machines for storage, then extensive communication among all the machines further increases the time of each iteration; therefore, EM (Dempster et al. 1977) and the family of EM-type algorithms, such as ECM, ECME, AECM, PXEM, and DECME (Meng & Rubin 1993, Liu & Rubin 1994, Meng & van Dyk 1997, Liu et al. 1998, He & Liu 2012), are inefficient in massive data settings simply due to the time-consuming E step or possibly due to the communication cost. The same is also true for EM extensions that modify the M step by borrowing ideas from optimization (Lange 1995, Jamshidian & Jennrich 1997, Neal & Hinton 1998, Salakhutdinov & Roweis 2003, Varadhan & Roland 2008, Yu 2012).…”

Section: Introduction (mentioning)

confidence: 92%

An Asynchronous Distributed Expectation Maximization Algorithm for Massive Data: The DEM Algorithm

Srivastava, DePalma, Liu 2018

Journal of Computational and Graphical Statistics


The family of Expectation-Maximization (EM) algorithms provides a general approach to fitting flexible models for large and complex data. The expectation (E) step of EM-type algorithms is time consuming in massive data applications because it requires multiple passes through the full data. We address this problem by proposing an asynchronous and distributed generalization of the EM called the Distributed EM (DEM). Using DEM, existing EM-type algorithms are easily extended to massive data settings by exploiting the divide-and-conquer technique and widely available computing power, such as grid computing. The DEM algorithm reserves two groups of computing processes called workers and managers for performing the E step and the maximization step (M step), respectively. The samples are randomly partitioned into a large number of disjoint subsets and are stored on the worker processes. The E step of DEM algorithm is performed in parallel on all the workers, and every worker communicates its results to the managers at the end of local E step. The managers perform the M step after they have received results from a γ-fraction of the workers, where γ is a fixed constant in (0, 1]. The sequence of parameter estimates generated by the DEM algorithm retains the attractive properties of EM: convergence of the sequence of parameter estimates to a local mode and linear global rate of convergence. Across diverse simulations focused on linear mixed-effects models, the DEM algorithm is significantly faster than competing EM-type algorithms while having a
