The Expectation-Maximization (EM) algorithm is an iterative maximum likelihood parameter estimation algorithm for partially observed data.
Suppose that D is some observed data, U is some unobserved data, and T is a parameterisation of a joint probabilistic model for (D,U). The goal is to choose T so as to maximize the marginal likelihood P(D|T) = sum(U) P(D,U|T), where sum(U) denotes a sum over all values of the unobserved data.
Starting from an initial guess at T, EM obtains the next estimate, T', via a two-step procedure:
(1: "The E-step") Estimate the posterior probability distribution, P(U|D,T), for the missing data given the current parameterisation.
(2: "The M-step") Maximize the following functional with respect to T'
Q(T'|D,T) = sum(U) P(U|D,T) log P(D,U|T')
(That is, maximize the expected complete-data log-likelihood, i.e. the log-likelihood of the observed and unobserved data taken together, with the expectation computed under the posterior distribution over U obtained in the E-step.)
Often the E-step reduces to the computation of certain expectation statistics that characterise the unobserved data sufficiently to proceed with the M-step.
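As a concrete illustration, here is a minimal sketch of the two steps for a two-component, one-dimensional Gaussian mixture. The model, the function name em_gaussian_mixture, and all numerical details are illustrative assumptions, not part of the description above: the E-step computes the posterior "responsibilities" P(U|D,T) for each point, and the M-step re-estimates the weights, means, and variances from the resulting expected sufficient statistics.

# Minimal sketch (illustrative, not from the original text) of EM for a
# two-component, one-dimensional Gaussian mixture.  D is the observed data,
# U is the unobserved component label of each point, and
# T = (weights, means, variances).
import numpy as np

def em_gaussian_mixture(data, n_iters=50, seed=0):
    rng = np.random.default_rng(seed)
    # Initial guess at T.
    weights = np.array([0.5, 0.5])
    means = rng.choice(data, size=2, replace=False)
    variances = np.array([data.var(), data.var()])

    for _ in range(n_iters):
        # E-step: posterior P(U|D,T), i.e. the responsibility of each
        # component for each data point.
        likelihoods = np.stack([
            weights[k]
            * np.exp(-0.5 * (data - means[k]) ** 2 / variances[k])
            / np.sqrt(2 * np.pi * variances[k])
            for k in range(2)
        ], axis=1)                                  # shape (n_points, 2)
        resp = likelihoods / likelihoods.sum(axis=1, keepdims=True)

        # M-step: maximize Q(T'|D,T); for a Gaussian mixture this has a
        # closed form in terms of the expected sufficient statistics.
        n_k = resp.sum(axis=0)
        weights = n_k / len(data)
        means = (resp * data[:, None]).sum(axis=0) / n_k
        variances = (resp * (data[:, None] - means) ** 2).sum(axis=0) / n_k

    return weights, means, variances

# Example usage on synthetic data drawn from two Gaussians.
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])
print(em_gaussian_mixture(data))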
The K-means clustering algorithm can be viewed as a limiting, hard-assignment case of EM applied to a mixture-of-Gaussians model with equal, spherical covariances: as the shared variance tends to zero, the soft posterior assignments of the E-step become the hard cluster assignments of K-means.
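To make the connection explicit, here is a sketch of K-means written to expose that structure (the function kmeans and its details are illustrative, not from the original text): the assignment step plays the role of a degenerate, hard E-step, and the mean update plays the role of the M-step.

# Sketch (illustrative) of K-means highlighting its EM-like structure.
import numpy as np

def kmeans(data, k, n_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iters):
        # Hard "E-step": assign each point to its nearest centre, a
        # degenerate posterior that puts all its mass on one component.
        dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # "M-step": each centre becomes the mean of its assigned points.
        for j in range(k):
            if np.any(assign == j):
                centers[j] = data[assign == j].mean(axis=0)
    return centers, assign

# Example usage on two-dimensional synthetic data.
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
centers, assign = kmeans(data, k=2)
print(centers)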
The EM algorithm can also be viewed as coordinate ascent on a single variational free-energy functional, with the E-step and M-step each maximizing that functional with respect to one of its two arguments (see, e.g., the work of Radford Neal and Geoffrey Hinton).
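The following LaTeX fragment is a brief sketch of that standard formulation; the symbols q and F are introduced here only for illustration and do not appear in the text above.

% Variational ("free energy") view of EM.  q(U) is any distribution over the
% unobserved data; F lower-bounds the log-likelihood.
\[
  F(q, T) \;=\; \sum_{U} q(U) \log P(D, U \mid T) \;-\; \sum_{U} q(U) \log q(U)
  \;\le\; \log P(D \mid T).
\]
% E-step: maximize F over q with T fixed, giving q(U) = P(U|D,T).
% M-step: maximize F over T with q fixed, which is exactly maximizing Q(T'|D,T).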