Some Simplifications for the Expectation-Maximization (EM) Algorithm: The Linear Regression Model Case
Abstract: The EM algorithm is a generic tool that offers maximum likelihood solutions when datasets are incomplete with data values missing at random or completely at random. At least for its simplest form, the algorithm can be rewritten in terms of an ANCOVA regression specification. This formulation allows several analytical results to be derived that permit the EM algorithm solution to be expressed in terms of new observation predictions and their variances. Implementations can be made with a linear regression or a nonlinear regression model routine, allowing missing value imputations, even when they must satisfy constraints. Fourteen example datasets gleaned from the EM algorithm literature are reanalyzed. Imputation results have been verified with SAS PROC MI. Six theorems are proved that broadly contextualize imputation findings in terms of the theory, methodology, and practice of statistical science.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.