4.2.1 Basic Procedure of MLE
In some cases the MVUE may not exist, or it cannot be found by any of the methods discussed so far. The maximum likelihood estimation (MLE) approach is an alternative method for cases where the PDF or the PMF is known. This PDF or PMF involves the unknown parameter θ and is called the likelihood function. With MLE the unknown parameter is estimated by maximizing the likelihood function for the observed data. The MLE is defined as:

$$\hat{\theta} = \arg\max_{\theta} \; p(\mathbf{x}; \theta),$$

where $\mathbf{x} = [x[0], x[1], \ldots, x[N-1]]^T$ is the vector of observed data (of N samples).
It can be shown that the MLE $\hat{\theta}$ is asymptotically unbiased:

$$\lim_{N\to\infty} E[\hat{\theta}] = \theta,$$

and asymptotically efficient:

$$\lim_{N\to\infty} \operatorname{var}(\hat{\theta}) = \mathrm{CRLB}.$$
An important result is that if an MVUE exists, then the MLE procedure will produce it.
Proof

Assume the scalar parameter case. If an MVUE exists, then the derivative of the log-likelihood function can be factorized as

$$\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta} = I(\theta)\left(g(\mathbf{x}) - \theta\right),$$

where $\hat{\theta} = g(\mathbf{x})$ is the MVUE and $I(\theta)$ is the Fisher information. Maximizing the likelihood function, by setting this derivative to zero, yields the MLE

$$\hat{\theta}_{MLE} = g(\mathbf{x}),$$

which is exactly the MVUE.
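As a concrete instance of this factorization (anticipating the example in the next section, where $g(\mathbf{x})$ turns out to be the sample mean), for a DC level in WGN of variance $\sigma^2$ the derivative of the log-likelihood takes exactly this form:

$$\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta} = \frac{1}{\sigma^2}\sum_{n=0}^{N-1}\left(x[n] - \theta\right) = \underbrace{\frac{N}{\sigma^2}}_{I(\theta)}\Big(\underbrace{\bar{x}}_{g(\mathbf{x})} - \theta\Big),$$

so $I(\theta) = N/\sigma^2$ and $g(\mathbf{x}) = \bar{x}$, the sample mean.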
Another important observation is that, unlike the previous estimators, the MLE does not require an explicit analytical expression for p(x; θ)! Indeed, given only a numerical characterization of the PDF as a function of θ (e.g., a histogram plot), one can numerically search for the θ that maximizes it.
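As an illustration, here is a minimal Python sketch of such a numerical search (the data model, grid bounds, and resolution are illustrative assumptions, not part of the original text):

```python
import numpy as np

def log_likelihood(theta, x, sigma2=1.0):
    """Gaussian log-likelihood of a DC level theta in WGN of variance sigma2."""
    N = len(x)
    return -N / 2 * np.log(2 * np.pi * sigma2) \
           - np.sum((x - theta) ** 2) / (2 * sigma2)

# Simulated data: true DC level A = 1.5 in unit-variance WGN (illustrative values).
rng = np.random.default_rng(0)
x = 1.5 + rng.standard_normal(1000)

# Brute-force search over a grid of candidate theta values.
grid = np.linspace(-5.0, 5.0, 10001)
ll = np.array([log_likelihood(t, x) for t in grid])
theta_mle = grid[np.argmax(ll)]
print(theta_mle)  # close to the sample mean x.mean()
```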
4.2.2 Example
Consider the problem of a DC signal embedded in noise:

$$x[n] = A + w[n], \qquad n = 0, 1, \ldots, N-1,$$

where $w[n]$ is WGN with zero mean and known variance $\sigma^2$.
We know that the MVU estimator for θ = A is the sample mean. To see that this is also the MLE, we consider the PDF:

$$p(\mathbf{x}; \theta) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}\left(x[n]-\theta\right)^2\right]$$

and maximize the log-likelihood function by setting its derivative to zero:

$$\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta} = \frac{1}{\sigma^2}\sum_{n=0}^{N-1}\left(x[n]-\theta\right) = 0.$$

Thus

$$\hat{\theta} = \frac{1}{N}\sum_{n=0}^{N-1} x[n] = \bar{x},$$

which is the sample mean.
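As a quick numerical check (a sketch only; the true value, noise variance, and sample sizes below are arbitrary choices), a Monte Carlo simulation confirms that the sample-mean MLE is unbiased with variance close to the CRLB of $\sigma^2/N$:

```python
import numpy as np

rng = np.random.default_rng(1)
A, sigma2, N, trials = 2.0, 1.0, 100, 50_000  # illustrative values

# Each row is one realization of x[n] = A + w[n]; the MLE is the row mean.
x = A + np.sqrt(sigma2) * rng.standard_normal((trials, N))
theta_hat = x.mean(axis=1)

print(theta_hat.mean())  # approx. A          (unbiased)
print(theta_hat.var())   # approx. sigma2 / N (the CRLB)
```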
4.2.3 Example
Consider the problem of a DC signal embedded in noise:

$$x[n] = A + w[n], \qquad n = 0, 1, \ldots, N-1,$$

where w[n] is WGN with zero mean but unknown variance that is also A; that is, the unknown parameter θ = A manifests itself both as the unknown signal level and as the variance of the noise. Although a highly unlikely scenario, this simple example demonstrates the power of the MLE approach, since finding the MVUE by the procedures discussed so far is not easy. The likelihood function for x is given by:

$$p(\mathbf{x}; \theta) = \frac{1}{(2\pi\theta)^{N/2}} \exp\left[-\frac{1}{2\theta}\sum_{n=0}^{N-1}\left(x[n]-\theta\right)^2\right].$$
Now consider p(x; θ) as a function of θ; it is then a likelihood function, and we need to maximize it with respect to θ. For Gaussian PDFs it is easier to find the maximum of the log-likelihood function (since the logarithm is a monotonic function):

$$\ln p(\mathbf{x}; \theta) = -\frac{N}{2}\ln(2\pi\theta) - \frac{1}{2\theta}\sum_{n=0}^{N-1}\left(x[n]-\theta\right)^2.$$
On differentiating we have:

$$\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta} = -\frac{N}{2\theta} + \frac{1}{\theta}\sum_{n=0}^{N-1}\left(x[n]-\theta\right) + \frac{1}{2\theta^2}\sum_{n=0}^{N-1}\left(x[n]-\theta\right)^2.$$
Setting the derivative to zero, expanding the sums, and multiplying through by $2\theta^2/N$ reduces the condition to the quadratic $\theta^2 + \theta - \frac{1}{N}\sum_{n=0}^{N-1} x^2[n] = 0$, whose positive root is the MLE:

$$\hat{\theta} = -\frac{1}{2} + \sqrt{\frac{1}{N}\sum_{n=0}^{N-1} x^2[n] + \frac{1}{4}},$$
where we have assumed θ > 0 in taking the positive root. It can be shown that the MLE is asymptotically unbiased:

$$E[\hat{\theta}] \xrightarrow{\;N\to\infty\;} \theta,$$

and asymptotically achieves the CRLB:

$$\operatorname{var}(\hat{\theta}) \xrightarrow{\;N\to\infty\;} \frac{\theta^2}{N\left(\theta + \frac{1}{2}\right)}.$$
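To make this concrete, here is a minimal Python sketch (the true value A, sample size, and grid bounds are arbitrary illustrative assumptions) that computes the closed-form MLE and cross-checks it against a brute-force maximization of the log-likelihood:

```python
import numpy as np

def mle_dc_level_variance(x):
    """Closed-form MLE of theta = A when x[n] = A + w[n] and w[n] ~ N(0, A)."""
    return -0.5 + np.sqrt(np.mean(x ** 2) + 0.25)

rng = np.random.default_rng(2)
A, N = 3.0, 1000  # illustrative true value and sample size
x = A + np.sqrt(A) * rng.standard_normal(N)

theta_hat = mle_dc_level_variance(x)
print(theta_hat)  # close to A for large N (asymptotically unbiased)

# Cross-check against a brute-force grid search over the log-likelihood.
grid = np.linspace(0.1, 10.0, 5000)
ll = np.array([-N / 2 * np.log(2 * np.pi * t)
               - np.sum((x - t) ** 2) / (2 * t) for t in grid])
print(grid[np.argmax(ll)])  # matches theta_hat to within the grid resolution
```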