Video Transcript:

In this lecture we will discuss different fundamental operations of image processing and give an overview of the field.
So, let us first understand how images are represented in a computer. Consider an image displayed on your screen, and consider a small portion of it, as shown by this rectangle. If I zoom into this portion, you will find an enlarged view of that part of the image. You can see that some of the details are more visible here, but certain rectangular regions of pixels also become visible. If I zoom further into this area, you will observe small squares of uniform brightness, whose values are shown here by the numbers in those squares. So, finally, at the very bottom level of representation, every point of the image has a number, and that number represents the brightness value at that particular point. However, while displaying it on a screen, that point is represented by a small area, and its brightness value is shown there.
So, the image is represented as a 2D array of integers; these numbers are those integers. As you know, for a two-dimensional array you also need to mention the array size, and that size gives the width and height of the image. For example, the width of this image is 256, which means there are 256 points along its width, and its height is 384, so there are 384 points along its height.
When we consider representing a colour image, there are three such two-dimensional arrays, one for each of the primary colours: red, green and blue. If you consider any particular point of these arrays, the corresponding array elements with the same array indices together represent a colour as a combination of the three primary colours. So, you have a red component of the colour, which I can display using only red on the screen, a green component of the image, which is displayed here in green, and a blue component of the image, displayed in blue. Once we superimpose all these colours on a screen, we get the colour representation of the image itself. So, a colour image in this case is represented by three two-dimensional arrays.
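This representation can be sketched directly with NumPy arrays; the image sizes and pixel values below are just illustrative, matching the lecture's 256 x 384 example:

```python
import numpy as np

# A grayscale image: a 2-D array of integers, one brightness value per
# point. Height = 384 and width = 256, as in the lecture's example;
# values are unsigned 8-bit integers in the range 0..255.
gray = np.zeros((384, 256), dtype=np.uint8)
gray[100, 50] = 200            # set the brightness at one point

# A colour image: three such 2-D arrays, one per primary colour
# (red, green, blue); NumPy conventionally stacks them as a third axis.
colour = np.zeros((384, 256, 3), dtype=np.uint8)
colour[100, 50] = (255, 0, 0)  # a pure red pixel

# The elements with the same array indices across the three planes
# combine into one colour; here we pull out the red plane alone.
red_plane = colour[:, :, 0]
```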
When this information is stored on a computer hard disk, in secondary storage, you know that any information in secondary storage is stored as a file. Similarly, an image has to be stored as a file, and a file consists of a stream of bytes. In this case every byte, or a collection of bytes, represents a pixel, and we can consider the file as a stream of pixels.
However, to use the image in your program, in your computation, you require other associated information about that image, which also has to be stored with this stream. Usually it is stored ahead of the stream in a predefined format, which is called the header of the image file. For example, a header can contain information such as the width, the height, and the number of components, which for a colour image should be three, one per colour channel.
It also records the number of bytes per pixel; as I mentioned, a pixel could be represented by several bytes, but in an elementary representation it is usually 1 byte per pixel, with unsigned integer values varying from 0 to 255. Of course, there should also be an end-of-file marker for the file.
There are different standard file formats available for image representation; they are well standardized and their documentation is available. So, when you get an image file in one of these formats, you should know the corresponding format, parse the header, and then read the image. Some example formats are TIFF, BMP, GIF, etc.
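To make the header-plus-pixel-stream idea concrete, here is a minimal sketch of writing and parsing such a file layout. This is a made-up toy format for illustration only, not TIFF, BMP or any real standard; the field order and names are my own assumptions:

```python
import struct

# Hypothetical toy header: width, height, number of components, and
# bytes per pixel, each a 4-byte little-endian unsigned integer,
# followed immediately by the raw stream of pixel bytes.
HEADER_FMT = "<4I"
HEADER_SIZE = struct.calcsize(HEADER_FMT)

def write_image(width, height, components, bytes_per_pixel, pixels):
    """Pack the header ahead of the pixel stream, as the lecture describes."""
    header = struct.pack(HEADER_FMT, width, height, components, bytes_per_pixel)
    return header + bytes(pixels)

def parse_image(data):
    """Parse the predefined header, then slice off the pixel stream."""
    width, height, components, bpp = struct.unpack_from(HEADER_FMT, data, 0)
    return {"width": width, "height": height,
            "components": components, "bytes_per_pixel": bpp,
            "pixels": data[HEADER_SIZE:]}

# Round-trip a tiny 2x2 single-channel image at 1 byte per pixel.
stream = write_image(2, 2, 1, 1, [10, 20, 30, 40])
info = parse_image(stream)
```

A real format such as BMP differs in field widths, ordering, padding and signatures, but the parse-header-then-read-pixels flow is the same.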
So, let us consider how an image is formed in an optical camera. Consider a camera, represented here, in which the lens plays a very critical role. There is a plane where the image points are projected: the points of the three-dimensional scene are projected onto a two-dimensional plane, which is usually the focal plane of this particular lens. That is where the image is formed, and those points are finally sensed by the corresponding sensors and digitised, giving you a digital representation.
So, let us see how a point in three-dimensional space is mapped to an image point. Consider a point P from which light is reflected; that light passes through the centre of the lens and intersects the plane of projection, or the image plane. This point, represented here by the small p, is the image of the scene point represented by the capital P. So, in this case image formation has taken place due to the phenomenon of reflection, as you can see.
There is another piece of information which is also encoded at this image point: the amount of reflection, that is, the amount of energy received at this point after being reflected from the point P. That is the action of the lens: it tries to collect as much of the reflected energy as possible and focus it onto this point p, and that is what is called focusing. That is why you get a very sharp picture when the lens is properly focused: a sharp point representation of the scene point.
So, this is another encoding: the amount of energy reflected from the scene point is received here and sensed. The interpretation of this image is that it gives a brightness distribution in this two-dimensional plane, where the value at each point is proportional to the amount of energy reflected from its corresponding scene point.
Let us look at that more minutely once again. The rule of projection that I mentioned provides a very simple mathematical tool to compute, given a point P in the three-dimensional world, its corresponding image point in the two-dimensional plane. As we can see, the image point is found if I draw a line from the point P through a particular fixed point O, here the centre of the lens, and extend that ray until it hits the image plane; that intersection point defines the image point of the three-dimensional scene point.
So, this is the centre of projection, as I mentioned, and this is the image plane. We can summarize this rule as: an image point is formed by the intersection with the image plane of the ray from a scene point P passing through the centre of projection O. This kind of projection is known as perspective projection.
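The rule above has a standard algebraic form: with the centre of projection O at the origin and the image plane at distance f from it, a scene point (X, Y, Z) maps to the image point (fX/Z, fY/Z). A small sketch (the function name and the choice f = 1 are my own):

```python
def perspective_project(P, f=1.0):
    """Project a 3-D scene point P = (X, Y, Z) through the centre of
    projection O at the origin onto an image plane at distance f:
    x = f*X/Z, y = f*Y/Z."""
    X, Y, Z = P
    if Z == 0:
        raise ValueError("point lies in the plane of the centre of projection")
    return (f * X / Z, f * Y / Z)

# A point twice as far away projects to an image point half as far from
# the optical axis -- the familiar perspective foreshortening.
p_near = perspective_project((2.0, 1.0, 2.0))   # (1.0, 0.5)
p_far  = perspective_project((2.0, 1.0, 4.0))   # (0.5, 0.25)
```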
But this is not the only way images are formed; there can be other imaging principles, other rules of projection. For example, what I am going to show here is an image of a cube where one of the planes of the cube has been projected. As you can see, all the points are projected in parallel onto this plane. There is a particular direction which has to be considered for this projection: it could be the normal to the plane, or it could be any other direction in three-dimensional space.
So, we can summarize this rule as: image points are formed by the intersection of parallel rays with the image plane. One example of this kind of imaging is X-ray imaging, where parallel X-ray beams pass through our body, through bones and tissues, and then intersect the corresponding X-ray plate, which acts as the image plane in this case, and form the image. This projection is known as parallel projection.
Let us take another imaging principle. Consider the surface of an object, and an imaging sensor with a transmitter, which transmits an electromagnetic wave or an acoustic wave; the reflected wave is then received by a receiver. The duration, the time interval between transmission and reception, can be measured, and if you know the velocity of the wave you can compute the corresponding distance to that point.
Consider scanning this radially: you take a measurement at every regular interval while scanning radially over the surface points. Or you can translate the transmitter-receiver pair along certain directions and repeat its action, so that for every surface point on that path you get a distance. So, you can measure the distance, and not only that: the amount of reflection you get from the surface also tells you about the orientation and surface properties of the particular material.
One example is echocardiography, where acoustic waves, ultrasound waves, are used; so this is one kind of example.
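The distance computation described above is just half the round-trip path: distance = velocity x time interval / 2. A minimal sketch (the 1540 m/s speed of sound in soft tissue is an assumed textbook value, and the echo time is invented for illustration):

```python
def distance_from_echo(time_interval_s, wave_speed_m_per_s):
    """Given the interval between transmission and reception and the
    wave's velocity, the surface distance is half the round-trip path."""
    return wave_speed_m_per_s * time_interval_s / 2.0

# Ultrasound in soft tissue travels at roughly 1540 m/s; an echo
# received 100 microseconds after transmission implies a surface
# about 7.7 cm away.
d = distance_from_echo(100e-6, 1540.0)
```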
So, if I summarize: what is an image, how do I define an image? In a very short sentence we can say it is an impression of the physical world. To elaborate a little more, we can say that it is a spatial distribution of a measurable quantity encoding the geometry and material properties of objects.
So, now I will discuss a few concepts and operations of image processing. In this course we will require some of these concepts; as I mentioned earlier, you are not required to have gone through a first-level image processing course to attend this computer vision course, and these are the primers that I will be discussing here. However, it would be better if you follow some image processing textbooks and learn further details about these concepts.
So, let us first consider a very simple concept: the first-level statistics of the distribution of pixel values, which can be captured in the form of a frequency distribution of the brightness values in a particular image. Here I have shown an image of a scanned page; we can say this class of images is document images. Once again, in this image also you have a brightness value at every pixel.
As you can see, there are mostly two types of pixels: one depicts the text of the document, and the other belongs to the background. Usually, in the histogram, you would have expected to obtain a bimodal kind of characteristic. But in this case, since there are so many white pixels, it is more skewed, and in particular the distribution in the text zone looks a little flat.
So, we will come to the point of how this could be processed further to make it more bimodal, but presently let us concentrate on this fact: an image histogram is nothing but the frequency distribution of the brightness values. From this frequency distribution you can get the probability distribution; you can convert it into a probability distribution of the brightness values.
If I normalize the histogram, that is, divide all the frequencies by the total number of pixels of the image, then I get the probability distribution of the brightness values.
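The histogram and its normalisation can be sketched in a few lines of NumPy; the tiny test image here is my own illustration:

```python
import numpy as np

def histogram_and_pdf(image, levels=256):
    """Frequency distribution of the brightness values, and its
    normalisation (divide every frequency by the total pixel count)
    into a probability distribution."""
    hist = np.bincount(image.ravel(), minlength=levels).astype(float)
    pdf = hist / image.size
    return hist, pdf

img = np.array([[0, 0, 255],
                [255, 255, 128]], dtype=np.uint8)
hist, pdf = histogram_and_pdf(img)
# hist counts occurrences of each brightness value; pdf sums to 1.
```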
So, one of the problems of document image analysis is to separate the foreground from the background, and this process is called binarization. One of the simple techniques of binarization is to use a threshold value to decide whether a pixel is foreground or background.
In our present context, in the example I have given here, the foreground is the dark pixels and the background is the bright pixels, which form the white region of the document. After binarization, each pixel is set to one of two values: for example, we can consider that 255 represents the white region and 0 represents the text portion, the dark pixels.
One of the simplest algorithms for this binarization could be as follows. You choose a threshold value T, some value in the brightness interval, and a pixel greater than T is set to 255; otherwise it is set to 0.
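This rule is a one-liner in NumPy; the sample pixel values are invented for illustration:

```python
import numpy as np

def binarize(image, T):
    """Pixels brighter than the threshold T become 255 (white region /
    background); the rest become 0 (dark pixels / text)."""
    return np.where(image > T, 255, 0).astype(np.uint8)

img = np.array([[10, 200],
                [150, 160]], dtype=np.uint8)
out = binarize(img, 156)
# Only values strictly greater than 156 become 255.
```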
So, this is a very simple algorithm; let us see what its effect is. Consider this document, whose histogram is displayed here. Let us choose a particular value, say 156, perform the thresholding, and then you get an image like this. You see that there are only two types of pixels, pixels with the value 0 and pixels with the value 255, when your threshold value is 156.
If I choose another value, say 192, you get a different binarized image, and you can see the difference between these two images. If the threshold value is higher, you get more foreground pixels and your text becomes sharper, but then there is also more spurious noise in your document, which is not desirable. So, what is the optimum threshold value? What is a desirable threshold value which will make the text sharper, pick up the proper foreground pixels, and also remove the noisy pixels?
Such a manual choice of threshold may not help when you are trying to process various documents, and one of the objectives would be to automate this thresholding operation.
So, one of the techniques that I will discuss here is a Bayesian classification of foreground and background pixels of a particular image. In this case we assume that the histogram of the image is a bimodal histogram; a schematic diagram is shown here. There are two modes, two peaks, in this histogram, and our assumption is that most of the pixels around one mode, around one particular peak, come from the foreground, while the pixels coming from the background are centred around the other mode. So, we consider that there are two classes of pixels, and these are the notations for the classes; here they are denoted w1 and w2, just for an abstract representation of this problem.
So, what do we need to do in this case? We need to compute the probability of class w1 given x and the probability of class w2 given x. This is because, by the Bayesian classification rule, we assign the pixel x to class w1 if the probability of w1 given x is greater than the probability of w2 given x; otherwise we assign it to w2. That is the Bayes classification rule.
So, how can we compute this particular probability, which incidentally is called the posterior probability? We can apply Bayes' theorem, and here I have stated the theorem.
So, consider a pixel x. The probability of a class given the pixel x can be computed from three quantities: the probability of the class itself, multiplied by the probability of x given the class, divided by the probability of x. This is Bayes' theorem. It is simpler to compute the probability of x given the class, which is called the likelihood, than to compute the posterior directly. We can assume that the pixels around one mode form a class distribution coming from the foreground pixels, and we can assume that it is a Gaussian distribution.
So, this is the probability distribution of the pixel x given that it comes from class w1, and similarly for the background class we can consider another probability distribution, the probability distribution of x given w2. These probabilities are called likelihoods, and they can be computed easily, rather than computing the posterior directly.
You can also compute the class probabilities, the probability of each class itself: if I assume some threshold value is chosen, then the proportional areas of the histogram on either side give me those two probabilities. I will describe this in a subsequent slide, but what is interesting to note is that you may not need the probability of x at all in this computation. After computing the two likelihood-prior products you only need to compare their values, because the posteriors are proportional to them; for a particular x, p(x) is already fixed by the data itself.
So, let us see how these computations can be carried out. There is an algorithm by which we determine this threshold; we call it the expectation maximization algorithm. Let me explain the algorithm here. Consider the histogram of the image once again, or the probability distribution of the pixels, and assume a threshold value initially, say at this point. This value divides the brightness interval into two halves: we can consider that one half belongs to the foreground region and the other half belongs to the background region; so this is a representation of the foreground part and this is a representation of the background part.
So, what can we do? Given this threshold, we can compute the probability of w1 and the probability of w2 by computing the areas of the two parts and taking the proportional area of each region as its value. This is how the class probabilities can be computed once a threshold value is given. After that, we can concentrate on each class separately and compute the parameters of the probability of x given w1 by assuming it is Gaussian, and similarly compute the parameters of the probability of x given w2 by assuming it is Gaussian.
If we look at the Gaussian distribution function, as you can see there are two parameters: mu, which is the mean of the distribution, and sigma, which is its standard deviation. So, to compute this likelihood probability you simply need to compute these parameters; then you can compute the probability of any value x given those parameters. Let us call the corresponding parameters for class w1 mu 1 and sigma 1, and the corresponding parameters for w2 mu 2 and sigma 2.
So, this is how the corresponding parameters are computed in this case. You can see that the probability of w1 is computed as the area of p(x), that is, the summation of p(x) from 0 to the threshold, and the probability of w2 is just 1 minus P(w1), because it is the complementary part of the area. Then mu 1 is the mean over the first region and sigma 1 is the standard deviation over that region; similarly, mu 2 is the mean over the second region and sigma 2 squared, the variance, or sigma 2, the standard deviation, is computed from that region.
So, we are computing the means and the variances of these values, and these are all simple arithmetic expressions of weighted means and weighted variances; if you look at a statistics text, it will be very clear how these values are computed by these expressions. Once we get these values, we determine a new threshold value such that the probability of w1 given x is greater than the probability of w2 given x.
As soon as the posterior of w1 becomes less than that of w2, we choose that value as the new threshold. We expected the old value to be the threshold, but after computing these parameters, after maximizing the probabilities of occurrence of these pixels, we find there is a better threshold value which gives a better probabilistic explanation of the observations. So, we take that as the new threshold value, and we keep iterating this process until it converges. This is your Bayesian classification based binarization method.
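The iterative procedure above can be sketched as follows. This is a sketch under the lecture's assumptions only: Gaussian class-conditional likelihoods, priors taken as the two histogram areas, and the new threshold placed where the foreground posterior stops dominating. The function names and the synthetic bimodal test image are my own illustration:

```python
import numpy as np

def gaussian(x, mu, sigma):
    sigma = max(sigma, 1e-6)  # guard against a degenerate (zero) variance
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def em_threshold(image, T=128, levels=256, max_iter=100):
    """From the current threshold T: estimate the priors P(w1), P(w2) as
    the areas of the two histogram halves, fit a Gaussian (mu, sigma) to
    each half, then move T to where P(w1|x) stops exceeding P(w2|x);
    repeat until the threshold converges."""
    x = np.arange(levels)
    p = np.bincount(image.ravel(), minlength=levels) / image.size
    for _ in range(max_iter):
        w1 = p[:T + 1].sum()                 # P(w1): area left of T
        w2 = 1.0 - w1                        # P(w2): complementary area
        if w1 == 0 or w2 == 0:
            break
        mu1 = (x[:T + 1] * p[:T + 1]).sum() / w1
        s1 = np.sqrt((((x[:T + 1] - mu1) ** 2) * p[:T + 1]).sum() / w1)
        mu2 = (x[T + 1:] * p[T + 1:]).sum() / w2
        s2 = np.sqrt((((x[T + 1:] - mu2) ** 2) * p[T + 1:]).sum() / w2)
        # Compare the likelihood-prior products (p(x) cancels); the new
        # threshold is the last x where the foreground posterior dominates.
        score = w1 * gaussian(x, mu1, s1) - w2 * gaussian(x, mu2, s2)
        above = np.nonzero(score > 0)[0]
        T_new = int(above[-1]) if len(above) else T
        if T_new == T:
            break
        T = T_new
    return T

# Synthetic bimodal document-like image: dark "text" pixels around 50,
# bright "background" pixels around 200.
rng = np.random.default_rng(0)
img = np.clip(np.concatenate([
    rng.normal(50, 10, 3000), rng.normal(200, 10, 7000)
]), 0, 255).astype(np.uint8).reshape(100, 100)
T = em_threshold(img)
```

Note the threshold settles between the two modes, pulled slightly toward the smaller class because the priors enter the comparison.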
There is another method we can also consider here, which is quite similar and which also defines an optimization function from which you can get the threshold value. This optimization function is the between-class variance of the two classes. As you can see, the between-class variance is defined by the parameters I discussed earlier: the class probability of w1, the class probability of w2, and the means of the two classes.
So, given a threshold value, I can compute this sigma squared B: the probability of w1 from one part, the probability of w2 from the other part, and mu 1 and mu 2 from the respective regions. You compute this value at every candidate pixel value in your interval, say 0 to 255, and you take the pixel value where the between-class variance is maximum; that you consider as your threshold value. This thresholding technique was proposed by Otsu and is known as Otsu's thresholding technique.
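Otsu's criterion can be sketched with the standard form of the between-class variance, sigma_B^2(T) = P(w1) P(w2) (mu1 - mu2)^2, evaluated at every candidate threshold; the tiny two-cluster test image is my own illustration:

```python
import numpy as np

def otsu_threshold(image, levels=256):
    """Return the threshold T maximising the between-class variance
    sigma_B^2(T) = P(w1) * P(w2) * (mu1 - mu2)^2, computed from the
    normalised histogram at every candidate T."""
    x = np.arange(levels)
    p = np.bincount(image.ravel(), minlength=levels) / image.size
    best_T, best_var = 0, -1.0
    for T in range(levels - 1):
        w1 = p[:T + 1].sum()                 # class probability P(w1)
        w2 = 1.0 - w1                        # class probability P(w2)
        if w1 == 0 or w2 == 0:
            continue
        mu1 = (x[:T + 1] * p[:T + 1]).sum() / w1
        mu2 = (x[T + 1:] * p[T + 1:]).sum() / w2
        var_between = w1 * w2 * (mu1 - mu2) ** 2
        if var_between > best_var:
            best_T, best_var = T, var_between
    return best_T

# Two clusters of brightness values: dark around 40..60, bright 190..210.
img = np.array([40, 50, 60, 190, 200, 210] * 10, dtype=np.uint8).reshape(6, 10)
T = otsu_threshold(img)
```

A production version would compute the running sums incrementally (or use a library routine) instead of this O(levels^2) loop, but the criterion is the same.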
As an example of this processing, you can see the result on that particular document image. We have computed the Otsu threshold, which is 157 in this case, and it gives this kind of image. If I consider the Bayesian threshold, then we find another image; incidentally, though the threshold values are the same, you can see there is a little bit of difference between the two images, though their quality is almost similar. This difference happens because we would like to make the histogram bimodal: before applying the Bayesian classification, we process the image so that the histogram has sharper modes, also in the foreground zone. How we do this processing I will discuss in the next part.
So, this is the method I was referring to: it is a contrast enhancement method, and here the concept of pixel mapping is used. The pixel mapping concept is that you have an input pixel value which is mapped to an output pixel value in such a way that the dynamic range of the input is expanded. Suppose in this case the input dynamic range is from 0 to some value which is, say, approximately half the interval; in the output we convert that dynamic range to 0 to 255, which makes the contrast sharper. One property, of course, that you need to preserve is that this function has to be monotonically increasing, because if you have two pixels x1 and x2 with x1 higher than x2, then the corresponding y1 should also be higher than y2. That keeps the display consistent, showing brighter pixels brighter and darker pixels darker, and that is why you require this property.
One popular function used in this case is derived from the probability distribution of the pixel values itself: it is the cumulative distribution, scaled by 255, which ensures the output range is from 0 to 255.
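This mapping, y = round(255 * CDF(x)), is histogram equalisation; a short sketch, with a low-contrast test image of my own invention:

```python
import numpy as np

def equalize(image, levels=256):
    """Contrast enhancement by mapping each input value through the
    scaled cumulative distribution, y = round(255 * CDF(x)). The CDF is
    monotonically non-decreasing, so brighter stays brighter."""
    p = np.bincount(image.ravel(), minlength=levels) / image.size
    cdf = np.cumsum(p)
    mapping = np.round((levels - 1) * cdf).astype(np.uint8)
    return mapping[image]

# A low-contrast image confined to the range 100..131 spreads out to
# nearly the full 0..255 range after equalisation.
img = np.arange(100, 132, dtype=np.uint8).reshape(4, 8)
out = equalize(img)
```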
So, if I do this operation, you can see that we get a contrast-enhanced image where the features are more prominent. You can also look at the histogram: it has a similar shape, but the dynamic range has been expanded and the modes are more clearly visible. In fact, this is the technique I was referring to; this is the technique we applied to the document that was processed for binarization. So, with this let me stop here. This is the first part of this particular talk; we will go on to the next part in the next lecture. Thank you very much for listening.
Keywords: Images, projection, histogram, thresholding, expectation maximization, Bayesian, equalization