Digital image processing

Module 3.4: Vector Quantization

An Iterative Procedure for designing an optimal codebook.

Suppose we begin with an initial estimate of the reconstruction levels. Given these reconstruction levels and the distortion measure, we can determine C_i, at least in theory, by determining the corresponding reconstruction level for all possible values of the input vector using condition (1). Given an estimate of C_i, we can determine the reconstruction level associated with it by computing the centroid of C_i using condition (2). The reconstruction levels so obtained form a new estimate, and the procedure is continued. The iterative procedure has two difficulties:

(i) It requires determining the reconstruction level for every possible input vector, and

(ii) The probability density function of the input vector, which is required to compute the centroid of C_i, is usually not known.

Instead we have training vectors that are representative of the data to be coded. A modification of this method that works directly from such training vectors is the k-means algorithm.

K-Means Algorithm:-

To describe the k-means algorithm, let us suppose we have M training vectors. Since we estimate L reconstruction levels from M training vectors, we assume that M is much larger than L. Typically M is of the order of 10L to 50L or more.

Randomly selecting the initial codevectors from the training set often does not provide sufficient diversity to achieve a good locally optimal codebook. It is often effective to choose initial codevectors from the training set that are farthest apart in terms of the distortion measure. This ensures that the initial codevectors are widely distributed in N-dimensional space.
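One way to pick initial codevectors that are far apart is a greedy "maximin" rule, sketched below. This is only an illustration under stated assumptions: the squared-error distortion, the choice of the first training vector as the starting point, and the names squared_error and farthest_point_init are all hypothetical and are not prescribed by the text.

```python
def squared_error(x, y):
    """Squared Euclidean distortion between two equal-length vectors (assumed measure)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def farthest_point_init(training, L, distortion=squared_error):
    """Pick L initial codevectors from `training` that are mutually far apart.

    Greedy maximin rule (an assumption, one of several possible choices):
    start from the first training vector, then repeatedly add the training
    vector whose distortion to its nearest already-chosen codevector is largest.
    """
    if not 1 <= L <= len(training):
        raise ValueError("need 1 <= L <= number of training vectors")
    codebook = [list(training[0])]
    # nearest[j] = distortion from training[j] to its closest chosen codevector
    nearest = [distortion(v, codebook[0]) for v in training]
    while len(codebook) < L:
        # index of the training vector farthest from the current codebook
        j = max(range(len(training)), key=nearest.__getitem__)
        codebook.append(list(training[j]))
        nearest = [min(d, distortion(v, training[j]))
                   for d, v in zip(nearest, training)]
    return codebook


if __name__ == "__main__":
    vectors = [[0, 0], [0.1, 0], [10, 0], [10, 0.2], [0, 10]]
    print(farthest_point_init(vectors, 3))
```

Because each new codevector is chosen to maximise its distortion from those already selected, near-duplicate training vectors are not picked together, which is the diversity the paragraph above asks for.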

The reconstruction levels are determined by minimizing the average distortion over the training vectors. In the k-means algorithm we begin with an initial estimate of each of the L reconstruction levels.
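For concreteness, the average distortion over the training set can be written out as below. The notation is an assumption made for this sketch, not taken from the text: f_j denotes the j-th training vector, r_i the i-th reconstruction level, and d the distortion measure, with each training vector charged the distortion to its minimum-distortion level.

```latex
D \;=\; \frac{1}{M} \sum_{j=1}^{M} \; \min_{1 \le i \le L} \, d\!\left(\mathbf{f}_j, \mathbf{r}_i\right)
```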

We then classify the M training vectors into L different groups or clusters, one for each reconstruction level, using condition (1). This can be done by comparing a training vector with each reconstruction level and choosing the level that results in minimum distortion.

It is to be noted that we quantise only the given training vectors, not all possible input vectors. A new reconstruction level is then determined from the vectors in each cluster.
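The classification and update steps described above can be sketched as follows. This is a minimal illustration, assuming the squared-error distortion (for which the centroid of a cluster is its mean). The function names, the rule of keeping an old level when its cluster is empty, and the stopping test on the change in average distortion are all assumptions of this sketch rather than part of the text.

```python
def squared_error(x, y):
    """Squared Euclidean distortion (assumed measure)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def classify(training, codebook, distortion=squared_error):
    """For each training vector, return the index of its minimum-distortion level."""
    return [min(range(len(codebook)), key=lambda i: distortion(v, codebook[i]))
            for v in training]


def update_levels(training, labels, codebook):
    """Replace each level by the mean of its cluster (centroid under squared error).

    An empty cluster keeps its previous level (an assumed convention).
    """
    new_codebook = []
    for i, old in enumerate(codebook):
        members = [v for v, lab in zip(training, labels) if lab == i]
        if members:
            new_codebook.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_codebook.append(list(old))
    return new_codebook


def average_distortion(training, labels, codebook, distortion=squared_error):
    """Mean distortion between each training vector and its assigned level."""
    total = sum(distortion(v, codebook[lab]) for v, lab in zip(training, labels))
    return total / len(training)


def k_means(training, codebook, max_iter=100, tol=1e-9):
    """Alternate classification and centroid update until distortion stops falling."""
    prev = d = float("inf")
    for _ in range(max_iter):
        labels = classify(training, codebook)                 # group training vectors
        codebook = update_levels(training, labels, codebook)  # new level per cluster
        d = average_distortion(training, labels, codebook)
        if prev - d <= tol:  # assumed stopping rule
            break
        prev = d
    return codebook, d


if __name__ == "__main__":
    data = [[0, 0], [0, 2], [10, 0], [10, 2]]
    print(k_means(data, [[0, 0], [10, 0]]))
```

Only the M training vectors are ever compared against the L levels, so each pass costs on the order of M times L distortion evaluations, in contrast with the idealised procedure that would need to consider every possible input vector.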