“…Second, if the data require many machines for storage, then extensive communication among all the machines further increases the time of each iteration; therefore, EM (Dempster et al 1977) and the family of EM-type algorithms, such as ECM, ECME, AECM, PXEM, and DECME (Meng & Rubin 1993, Liu & Rubin 1994, Meng & van Dyk 1997, Liu et al 1998, He & Liu 2012), are inefficient in massive data settings simply due to the time-consuming E step or possibly due to the communication cost. The same is also true for EM extensions that modify the M step by borrowing ideas from optimization (Lange 1995, Jamshidian & Jennrich 1997, Neal & Hinton 1998, Salakhutdinov & Roweis 2003, Varadhan & Roland 2008, Yu 2012).…”