The Expectation-Maximization (EM) algorithm is an iterative maximum likelihood parameter estimation algorithm for partially observed data.
Suppose that D is some observed data, U is some unobserved data, and T is a parameterisation of a joint probabilistic model for (D,U). The goal is to choose T so as to maximize the marginal likelihood P(D|T) = sum(U) P(D,U|T), where sum(U) denotes a sum over all values of the unobserved data.
Starting from an initial guess at T, EM obtains the next estimate, T', via a two-step procedure:
(1: "The E-step") Estimate the posterior probability distribution, P(U|D,T), for the missing data given the current parameterisation.
(2: "The M-step") Maximize the following functional with respect to T'
Q(T'|D,T) = sum(U) P(U|D,T) log P(D,U|T')
(That is, maximize the expected complete-data log-likelihood, i.e. the log-likelihood of the observed and unobserved data taken together, with the expectation computed under the posterior distribution over U obtained in the E-step.)
Often the E-step reduces to the computation of certain expectation statistics that characterise the unobserved data sufficiently to proceed with the M-step.
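As a concrete illustration, here is a minimal sketch of the two steps for a two-component, one-dimensional Gaussian mixture. The model, the function name em_gaussian_mixture, and all numerical details are illustrative assumptions, not part of the description above: the E-step computes the posterior "responsibilities" P(U|D,T) for each point, and the M-step re-estimates the weights, means, and variances from the resulting expected sufficient statistics.

# Minimal sketch (illustrative, not from the original text) of EM for a
# two-component, one-dimensional Gaussian mixture.  D is the observed data,
# U is the unobserved component label of each point, and
# T = (weights, means, variances).
import numpy as np

def em_gaussian_mixture(data, n_iters=50, seed=0):
    rng = np.random.default_rng(seed)
    # Initial guess at T.
    weights = np.array([0.5, 0.5])
    means = rng.choice(data, size=2, replace=False)
    variances = np.array([data.var(), data.var()])

    for _ in range(n_iters):
        # E-step: posterior P(U|D,T), i.e. the responsibility of each
        # component for each data point.
        likelihoods = np.stack([
            weights[k]
            * np.exp(-0.5 * (data - means[k]) ** 2 / variances[k])
            / np.sqrt(2 * np.pi * variances[k])
            for k in range(2)
        ], axis=1)                                  # shape (n_points, 2)
        resp = likelihoods / likelihoods.sum(axis=1, keepdims=True)

        # M-step: maximize Q(T'|D,T); for a Gaussian mixture this has a
        # closed form in terms of the expected sufficient statistics.
        n_k = resp.sum(axis=0)
        weights = n_k / len(data)
        means = (resp * data[:, None]).sum(axis=0) / n_k
        variances = (resp * (data[:, None] - means) ** 2).sum(axis=0) / n_k

    return weights, means, variances

# Example usage on synthetic data drawn from two Gaussians.
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 200)])
print(em_gaussian_mixture(data))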
The K-means clustering algorithm can be viewed as a limiting, hard-assignment case of EM applied to a mixture-of-Gaussians model with equal, spherical covariances: as the shared variance tends to zero, the soft posterior assignments of the E-step become the hard cluster assignments of K-means.
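To make the connection explicit, here is a sketch of K-means written to expose that structure (the function kmeans and its details are illustrative, not from the original text): the assignment step plays the role of a degenerate, hard E-step, and the mean update plays the role of the M-step.

# Sketch (illustrative) of K-means highlighting its EM-like structure.
import numpy as np

def kmeans(data, k, n_iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(n_iters):
        # Hard "E-step": assign each point to its nearest centre, a
        # degenerate posterior that puts all its mass on one component.
        dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # "M-step": each centre becomes the mean of its assigned points.
        for j in range(k):
            if np.any(assign == j):
                centers[j] = data[assign == j].mean(axis=0)
    return centers, assign

# Example usage on two-dimensional synthetic data.
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])
centers, assign = kmeans(data, k=2)
print(centers)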
The EM algorithm can also be viewed as coordinate ascent on a single variational free-energy functional, with the E-step and M-step each maximizing that functional with respect to one of its two arguments (see, e.g., the work of Radford Neal and Geoffrey Hinton).
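The following LaTeX fragment is a brief sketch of that standard formulation; the symbols q and F are introduced here only for illustration and do not appear in the text above.

% Variational ("free energy") view of EM.  q(U) is any distribution over the
% unobserved data; F lower-bounds the log-likelihood.
\[
  F(q, T) \;=\; \sum_{U} q(U) \log P(D, U \mid T) \;-\; \sum_{U} q(U) \log q(U)
  \;\le\; \log P(D \mid T).
\]
% E-step: maximize F over q with T fixed, giving q(U) = P(U|D,T).
% M-step: maximize F over T with q fixed, which is exactly maximizing Q(T'|D,T).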