Paper of the Month: October 2018

Once a month during the academic year, the statistics faculty select a paper for our students to read and discuss. Papers are selected based on their impact or historical value, or because they contain useful techniques or results.

Notes preparer: Haim Bar

The Expectation Maximization (EM) algorithm was introduced by Dempster, Laird, and Rubin in 19771. As the title of the paper suggests, the EM algorithm is a method to obtain maximum likelihood estimates in cases where the data are incomplete. It has been widely used in many applications, such as data imputation, fitting mixture models, and clustering. The EM algorithm consists of two steps: in the E-step, each missing value is replaced with its expected value, using the current estimates of the parameters in the model. In the M-step, using the available data and the imputed data for missing values, the likelihood function is maximized with respect to each of the model’s parameters in order to obtain new (better) estimates. The algorithm continues until a convergence criterion is met (for example, the improvement in the likelihood of the model is less than some user-defined threshold.) One of the challenges is to identify conditions that ensure that the algorithm will converge. Jeff Wu pointed out in his 1983 Annals of Statistics paper2 a flaw in the convergence proof in the Dempster, Laird, and Rubin paper, and offered not only a correct proof, but also extended the applicability of the algorithm beyond the exponential family models. For our monthly gathering to discuss the paper of the month you may start by reading Jeff Bilmes’ “gentle introduction” to EM algorithm, available here.


  • Dempster, A.P.; Laird, N.M.; Rubin, D.B. (1977). “Maximum Likelihood from Incomplete Data via the EM Algorithm”. Journal of the Royal Statistical Society, Series B, 39 (1): 1–38.
  • Wu, C. F. Jeff (Mar 1983), “On the Convergence Properties of the EM Algorithm”, Annals of Statistics, 11 (1): 95–103.