In this lecture, we will discuss the fundamental operations of image processing and give an overview of image processing.
So, let us first understand how images are represented in a computer. Consider an image displayed on your screen, and take a small portion of it, as shown by this rectangle. If I zoom into this portion, you will find an enlarged view of that part of the image. You can see that some of the details are more visible here, but certain rectangular regions of pixels also become visible. If I zoom further into this area, you will observe small squares of uniform brightness, whose values are shown here by the numbers inside those squares. So, finally, as you can see, at the very bottom level of representation every point of the image has a number, and that number represents the brightness value at that particular point. While displaying it on a screen, however, each point is represented by a small area, and this brightness value is shown over that area.
So, the image is represented as a 2D array of integers; these numbers are those integers. As you know, for a two-dimensional array you also need to specify the array size, and that size is given by the width and height of the image. For example, in this case the width of the image is 256, which means there are 256 points along its width, and the height is 384, so there are 384 points along its height.
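The representation described above can be sketched in a few lines of Python; this is my illustrative example, not code from the lecture, and the dimensions are small toy values rather than the 256 by 384 image discussed here.

```python
# A grayscale image as a 2D array of integers: height rows, width columns.
width, height = 4, 3

# Each entry is a brightness value; 0 is black, 255 is white.
image = [
    [ 12,  50, 200, 255],
    [  0,  90, 180, 240],
    [ 30, 120, 160, 210],
]

assert len(image) == height      # number of rows equals the height
assert len(image[0]) == width    # number of columns equals the width
print(image[1][2])               # brightness value at row 1, column 2
```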
When we represent a colour image, there are three such two-dimensional arrays, one for each of the primary colours: red, green and blue. If you consider any particular point, the corresponding array elements with the same array indices together represent a colour as a combination of these three primaries. So, you have a red component of the image, which I can display using only red on the screen; a green component, displayed here in green; and a blue component, displayed in blue. Once we superimpose all these components on a screen, we get the colour representation of the image itself. So, a colour image in this case is represented by three two-dimensional arrays.
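As a small sketch of this three-array representation (my own toy example, not from the lecture), note how the same indices into all three arrays give one colour:

```python
# A 2x2 colour image as three 2D arrays, one brightness value per channel.
red   = [[255,   0], [128,  64]]
green = [[  0, 255], [128,  64]]
blue  = [[  0,   0], [128, 255]]

def pixel_colour(r, c):
    """Combine the three channel values at the same array indices."""
    return (red[r][c], green[r][c], blue[r][c])

print(pixel_colour(0, 0))  # a pure red pixel
print(pixel_colour(1, 0))  # equal channels give a mid grey
```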
When this information is stored on a computer hard disk, that is, in secondary storage, you know that any information in secondary storage is stored as a file. Similarly, an image has to be stored as a file, and a file consists of a stream of bytes. In this case, every byte, or a collection of bytes, represents a pixel, and we can consider the file a stream of pixels.
However, to work with the image in your program, you require other associated information about that image, which also has to be stored with this stream. Usually it is stored ahead of the stream in a predefined format, which is called the header of the image file. For example, a header should contain the width, the height, and the number of components, which for a colour image would be three, one per colour channel.
It should also give the number of bytes per pixel; as I mentioned, a pixel could be represented by several bytes, but in the most elementary representation it is usually 1 byte per pixel, with unsigned integer values varying from 0 to 255. Of course, there should also be an end-of-file marker.
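To make the header-plus-pixel-stream idea concrete, here is a sketch with an invented header layout: four little-endian 16-bit unsigned integers for width, height, number of components, and bytes per pixel, followed by the pixel bytes. This layout is hypothetical, made up for illustration only; real formats such as BMP or TIFF define their own header structures.

```python
import struct

# Hypothetical header: width, height, components, bytes per pixel,
# each a little-endian 16-bit unsigned integer (8 bytes total).
HEADER = struct.Struct("<4H")

def write_image(width, height, components, bytes_per_pixel, pixels):
    """Pack the header ahead of the pixel stream."""
    return HEADER.pack(width, height, components, bytes_per_pixel) + bytes(pixels)

def read_image(data):
    """Parse the header, then slice out the pixel stream."""
    width, height, components, bpp = HEADER.unpack_from(data, 0)
    pixels = data[HEADER.size:]
    return width, height, components, pixels

data = write_image(2, 2, 1, 1, [0, 64, 128, 255])
w, h, c, pix = read_image(data)
print(w, h, c, list(pix))
```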
There are different standard file formats available for image representation; they are well standardized and their documentation is available. So, when you get an image file in one of these formats, you should know the corresponding format, parse the header, and extract the image. Some example formats are TIFF, BMP, GIF, etc.
So, let us consider how an image is formed in an optical camera. Consider a camera, represented here, whose lens plays a very critical role. There is a plane where the points of the three-dimensional scene are projected onto a two-dimensional plane; that is usually the focal plane of the lens. The image is formed there, and those points are finally sensed by corresponding sensors and digitized, giving you a digital representation.
So, let us see how a point in three-dimensional space is mapped to an image point. Consider a point P from which light is reflected; that light passes through the centre of the lens and intersects the plane of projection, or image plane. The point represented here by lowercase p is the image of the scene point represented by capital P. So, in this case image formation has taken place due to the phenomenon of reflection, as you can see.
There is another piece of information encoded at this image point: the amount of energy received there, reflected from the scene point P. That is the action of the lens: it tries to collect as much of the reflected energy as possible and concentrate it on the image point p, which is what is called focusing. That is why you get a very sharp picture when the lens is properly focused: a sharp point representation of the scene point.
So, this is another encoding: the amount of energy reflected from the scene point is received and sensed at the image point. The interpretation of the image, then, is that it holds a brightness distribution over this two-dimensional plane, where the value at each point is proportional to the amount of energy reflected from its corresponding scene point.
Let us look at this more closely once again: the rule of projection I mentioned, which provides a very simple mathematical tool to compute, given a point P in the three-dimensional world, its corresponding image point in the two-dimensional plane. As we can see, the image point is formed if I draw a line from the point P through a particular fixed point O, here the centre of the lens, and extend that ray until it hits the image plane; the intersection point defines the image point of the three-dimensional scene point.
So, O is the centre of projection, as I mentioned, and this is the image plane. We can summarize the rule as follows: the image point is formed by the intersection of the ray from a point P, passing through the centre of projection O, with the image plane. This kind of projection is known as perspective projection.
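In the standard pinhole model, with the centre of projection at the origin and the image plane at distance f along the optical axis, this rule reduces to dividing by depth. The lecture does not write the formula out, so the following is my sketch under those standard assumptions:

```python
def perspective_project(P, f=1.0):
    """Project the 3D point P = (X, Y, Z) through the centre of
    projection at the origin onto an image plane at distance f:
    x = f*X/Z, y = f*Y/Z (the standard pinhole equations)."""
    X, Y, Z = P
    if Z == 0:
        raise ValueError("point lies in the plane of the centre of projection")
    return (f * X / Z, f * Y / Z)

# All points on the same ray through O map to the same image point.
print(perspective_project((2.0, 4.0, 4.0)))  # (0.5, 1.0)
print(perspective_project((1.0, 2.0, 2.0)))  # (0.5, 1.0)
```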
But this is not the only way images are formed; there are other imaging principles, other rules of projection. For example, here I am showing an image of a cube where one of its faces has been projected onto the plane. As you can see, all these points are projected in parallel onto the plane; there is a particular direction of projection that has to be specified, which could be the normal to the plane or any other direction in three-dimensional space.
We can summarize this rule as follows: image points are formed by the intersection of parallel rays with the image plane. One example of this kind of imaging is X-ray imaging, where parallel X-ray beams pass through our body, through bones and tissues, and then strike the X-ray plate, which acts like an image plane in this case and forms the image. This projection is known as parallel projection.
Let us take another imaging principle. Suppose you have the surface of an object, and your imaging sensor has a transmitter, which transmits an electromagnetic or acoustic wave; the reflected wave is then received by a receiver. The time interval between transmission and reception can be measured, and if you know the velocity of the wave, you can compute the corresponding distance to that surface point. You can scan radially, taking a measurement at every regular interval over the surface points, or you can translate the transmitter-receiver pair along certain directions and repeat the operation. So, for every surface point along that path you get a distance. And not only can you measure distance: the amount of reflection you get from the surface also tells you about the orientation and the material properties of the surface.
One example is the echocardiogram, where acoustic waves, ultrasound waves, are used; so this is one such example.
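The distance computation described above is just velocity times half the round-trip time, since the wave travels to the surface and back. A minimal sketch (the tissue speed used below is a typical textbook value, not a figure from the lecture):

```python
def range_from_echo(time_interval_s, wave_speed_m_s):
    """Distance to the surface from the round-trip time of a pulse:
    the wave covers the distance twice, hence the division by two."""
    return wave_speed_m_s * time_interval_s / 2.0

# Ultrasound in soft tissue travels at roughly 1540 m/s; a 0.1 ms
# round trip then corresponds to a depth of about 7.7 cm.
print(range_from_echo(1e-4, 1540.0))
```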
So, to summarize, what is an image; how do I define an image? In a very short sentence, we can say it is an impression of the physical world. To elaborate a little, we can say that it is a spatial distribution of a measurable quantity encoding the geometry and material properties of objects.
Now I will discuss a few concepts and operations from image processing. In this course we will require some of these concepts; as I mentioned earlier, you are not required to have taken a first-level image processing course to attend this computer vision course, and these are the primers that I will be discussing here. However, it would be good to follow some image processing textbooks to learn more details about these concepts.
So, let us first consider a very simple concept: the first-level statistics of the distribution of pixel values, which can be captured in the form of a frequency distribution of the brightness values in a particular image. Here I have shown an image of a scanned page; we call this class of images document images. Once again, in this image you have a brightness value at every pixel.
As you can see, there are mostly two types of pixels: one depicts the text of the document, and the other belongs to the background. Usually, in the histogram, you would expect to obtain a bimodal kind of characteristic. But in this case, since there are so many white pixels, the histogram is more skewed, and the distribution in the text zone looks a little flat.
We will come back to how this could be processed further to make it more bimodal, but for now let us concentrate on this fact: an image histogram is nothing but the frequency distribution of the brightness values. And from this frequency distribution, you can get the probability distribution; you can convert it into a probability distribution of the brightness values.
If I normalize the histogram, that is, divide all the frequencies by the total number of pixels of the image, then I get the probability distribution of the brightness values.
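The histogram and its normalization can be sketched as follows; this is my illustrative version on a tiny toy image, not code from the lecture.

```python
def histogram(image, levels=256):
    """Frequency distribution of brightness values in a 2D image."""
    h = [0] * levels
    for row in image:
        for v in row:
            h[v] += 1
    return h

def normalize(hist):
    """Divide each frequency by the total pixel count to obtain the
    probability distribution of brightness values."""
    n = sum(hist)
    return [f / n for f in hist]

image = [[0, 0, 255], [255, 255, 128]]
h = histogram(image)
p = normalize(h)
print(h[0], h[128], h[255])  # frequencies of the three occupied levels
print(p[255])                # 3 of 6 pixels have value 255
```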
One of the problems of document image analysis is to separate the foreground from the background, and this process is called binarization. One of the simple techniques of binarization is to use a threshold value to decide whether a pixel is foreground or background.
In our present context, in the example I have given here, the foreground is the dark pixels and the background is the bright pixels, the white region of the document. After binarization, each pixel is set to one of two values; for example, we can let 255 represent the white region and 0 represent the text portion, the dark pixels.
One simple binarization algorithm could be as follows: choose a threshold value T, some value in the brightness interval; a pixel greater than T is set to 255, otherwise it is set to 0.
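That rule is a one-liner; here is a small sketch of it (my own example values, not from the lecture):

```python
def binarize(image, T):
    """Set pixels brighter than threshold T to 255 (white background),
    the rest to 0 (dark foreground/text)."""
    return [[255 if v > T else 0 for v in row] for row in image]

# Bright background pixels (200..220) against dark text pixels (25..40).
page = [[200, 30, 220], [210, 25, 40]]
print(binarize(page, 156))  # [[255, 0, 255], [255, 0, 0]]
```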
This is a very simple algorithm; let us see its effect. Consider this document, with its histogram displayed here. Take a particular threshold value, say 156, perform the thresholding, and you get an image like this. You see that there are only two types of pixels: pixels with value 0 and pixels with value 255, when your threshold value is 156.
If I choose another value, say 192, I get a different binarized image, and you can see the difference between these two images. If the threshold value is higher, you get more foreground pixels and your text becomes sharper, but there is also more spurious noise in your document, which is not desirable. So, what is the optimum threshold value? What is a desirable threshold value, one that makes the text sharper, picks up the proper foreground pixels, and also removes the noisy pixels?
This kind of manual choice of threshold may not help when you are trying to process many documents, so one objective would be to automate this thresholding operation.
One of the techniques I will discuss here is a Bayesian classification of foreground and background pixels of a particular image. In this case, we assume that the histogram of the image is bimodal; a schematic diagram is shown here. There are two modes, two peaks, in this histogram, and our assumption is that most of the pixels around one peak come from the foreground.
The pixels coming from the background are centered around the other mode. So, we consider two classes of pixels, and these are the notations for the classes: they are denoted w1 and w2, as an abstract representation of this problem.
So, what do we need to do in this case? We need to compute the probability of class w1 given x, and the probability of class w2 given x. This is because, by the Bayesian classification rule, we assign the pixel x to class w1 if the probability of w1 given x is greater than the probability of w2 given x; otherwise, we assign it to w2. That is the Bayes classification rule.
So, how can we compute this particular probability, which is incidentally called the posterior probability? We can apply Bayes' theorem, and here I have written out the theorem.
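Writing out the theorem referred to above, in the notation used here:

```latex
P(\omega_i \mid x) \;=\; \frac{P(x \mid \omega_i)\, P(\omega_i)}{P(x)},
\qquad i = 1, 2,
```

and the classification rule assigns pixel $x$ to $\omega_1$ whenever $P(\omega_1 \mid x) > P(\omega_2 \mid x)$.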
Consider a pixel x: the probability of a class given that pixel can be computed from three quantities, namely the probability of the class itself, multiplied by the probability of x given that class, and divided by the probability of x. This is Bayes' theorem. It is simpler to compute the probability of x given the class, which is called the likelihood, than to compute the posterior directly. We can assume that the pixels around the foreground mode form a class distribution coming from the foreground, and we can assume it is a Gaussian distribution.
This is the probability distribution of a pixel x given that it comes from class w1. Similarly, for the background class, we can consider another probability distribution: the probability distribution of x given w2. These probabilities are called likelihoods, and they can be computed easily, rather than computing the posterior directly.
You can also compute the class probabilities themselves: if I assume some threshold value has been chosen, then the proportional areas of the histogram on either side give me those two probabilities. I will describe this in a subsequent slide, but it is interesting to note that you may not use the probability of x at all in this computation. After computing the two numerators, you only need to compare them, because each posterior is proportional to its numerator; p(x) is already fixed by the data itself.
So, let us see how these computations can be carried out. There is an algorithm by which we determine this threshold; we call it the expectation-maximization algorithm. Let me explain it here. Consider the histogram of the image once again, or the probability distribution of the pixels, and assume an initial threshold value, say at this point. This value divides the brightness interval into two halves.
We can consider that one half belongs to the foreground region and the other half to the background region; so this is the representation of the foreground part, and this of the background part. Given this threshold, we can compute the probability of w1 and the probability of w2 by computing the area of each part and taking the proportional areas of the two regions.
This is how the class probabilities can be computed once a threshold value is given. After that, we concentrate on each class separately, and from there compute the parameters of the probability of x given w1, assuming it is Gaussian, and similarly the parameters of the probability of x given w2, also assuming it is Gaussian.
If we look again at the Gaussian distribution function, as you can see it has two parameters: mu, which is the mean of the distribution, and sigma, which is its standard deviation. So, to compute this likelihood, you simply need to compute these parameters; then you can compute the probability of any value x given those parameters. Let us call the parameters for class w1 mu 1 and sigma 1, and the corresponding parameters for w2 mu 2 and sigma 2.
This is how the corresponding parameters are computed. You can see that the probability of w1 is computed as the area under p(x), a summation from 0 to the threshold, and the probability of w2 is just 1 minus the probability of w1, because it is the complementary part of the area. Then mu 1 is the mean over the foreground region and sigma 1 the standard deviation over that region; similarly, mu 2 is the mean over the background region and sigma 2 squared the variance, or sigma 2 the standard deviation, over that region. This is how we compute them.
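For concreteness, for a chosen threshold $T$ and normalized histogram $p(x)$, the quantities just described are:

```latex
P(\omega_1) = \sum_{x=0}^{T} p(x), \qquad
P(\omega_2) = 1 - P(\omega_1), \qquad
\mu_1 = \frac{1}{P(\omega_1)} \sum_{x=0}^{T} x\, p(x), \qquad
\sigma_1^2 = \frac{1}{P(\omega_1)} \sum_{x=0}^{T} (x - \mu_1)^2\, p(x),
```

with $\mu_2$ and $\sigma_2^2$ given by the same weighted sums taken over $x = T+1, \dots, 255$.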
So, we are computing the means and variances of these values; these are all simple arithmetic expressions for weighted means and weighted variances. If you consult a statistics text, it will be very clear how these values are computed by these expressions. Once we have these values, we determine a new threshold value such that, up to it, the probability of w1 given x is greater than the probability of w2 given x.
As soon as the posterior of w1 becomes less than that of w2, we take that point as the new threshold value. We expected the old value to be the threshold, but after computing these parameters, after maximizing the probabilities of occurrence of these pixels, we find there is a better threshold value, one that gives a better probabilistic account of the observations. So, we iterate this process: the new value becomes the current threshold, and we keep iterating until the process converges. This is your Bayesian-classification-based binarization method.
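The iterative procedure above can be sketched roughly as follows. This is a simplified illustration under my own assumptions: a 256-level normalized histogram, reasonably well-separated Gaussian-like modes, and an arbitrary initial threshold of 128; it is not code shown in the lecture.

```python
import math

def gaussian(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    sigma = max(sigma, 1e-6)  # guard against a degenerate (zero-spread) class
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def em_threshold(p, T=128, max_iter=100):
    """p: normalized 256-bin histogram. Alternate between estimating
    P(w1), P(w2), mu and sigma of each class from the current split, and
    moving the threshold to the last grey level where class w1 still has
    the larger (unnormalized) posterior P(w1) * p(x | w1)."""
    for _ in range(max_iter):
        P1 = sum(p[:T + 1])
        P2 = 1.0 - P1
        if P1 <= 0.0 or P2 <= 0.0:
            break
        mu1 = sum(x * p[x] for x in range(T + 1)) / P1
        mu2 = sum(x * p[x] for x in range(T + 1, 256)) / P2
        s1 = math.sqrt(sum((x - mu1) ** 2 * p[x] for x in range(T + 1)) / P1)
        s2 = math.sqrt(sum((x - mu2) ** 2 * p[x] for x in range(T + 1, 256)) / P2)
        new_T = T
        for x in range(256):
            if P1 * gaussian(x, mu1, s1) >= P2 * gaussian(x, mu2, s2):
                new_T = x
        if new_T == T:
            return T
        T = new_T
    return T
```

For a bimodal histogram with modes around 50 and 200, the iteration settles near the midpoint where the two weighted Gaussians cross.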
There is another method we can consider here, which is quite similar and which also defines an objective function whose optimization gives you the threshold value. This objective function is the between-class variance of the two classes. The between-class variance, as you can see, is defined by the parameters I discussed earlier: the class probability of w1, the class probability of w2, and the means of the two classes.
Given a threshold value, I can compute this sigma squared B: the probability of w1 from one part, the probability of w2 from the other, and mu 1 and mu 2 from the respective regions. You compute this value at every candidate threshold in your interval, say 0 to 255, and you take the value where the between-class variance is maximum as your threshold. This thresholding technique was proposed by Otsu, and it is known as Otsu's thresholding technique.
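A naive exhaustive-search sketch of this criterion follows; it is my illustration of the idea, not code from the lecture (practical implementations reuse cumulative sums instead of recomputing them at every candidate T).

```python
def otsu_threshold(p):
    """p: normalized 256-bin histogram. Return the threshold T that
    maximizes the between-class variance
    sigma_B^2(T) = P(w1) * P(w2) * (mu1 - mu2)^2."""
    best_T, best_var = 0, -1.0
    for T in range(255):  # T = 255 would leave class w2 empty
        P1 = sum(p[:T + 1])
        P2 = 1.0 - P1
        if P1 <= 0.0 or P2 <= 0.0:
            continue
        mu1 = sum(x * p[x] for x in range(T + 1)) / P1
        mu2 = sum(x * p[x] for x in range(T + 1, 256)) / P2
        var_between = P1 * P2 * (mu1 - mu2) ** 2
        if var_between > best_var:
            best_T, best_var = T, var_between
    return best_T
```

On a clearly bimodal histogram, any threshold between the two clusters attains the maximum, and the search returns one such value.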
You can see an example of this processing with that particular document image. We have computed the Otsu threshold, which is 157 in this case, and it gives this kind of image. If I consider the Bayesian threshold, we get another image; incidentally, though the threshold values are the same, you can see there is a little difference between the two images, although their quality is almost similar. This difference arises because we would like to make the histogram bimodal: before applying Bayesian classification, we process the image so that the histogram has a sharper mode in the foreground zone as well. How we do this processing, I will discuss next.
So, this is the method I was referring to: it is a contrast enhancement method, and here the concept of pixel mapping is used. The idea of pixel mapping is that an input pixel value is mapped to an output pixel value in such a way that the dynamic range of the input is expanded. Suppose, in this case, the input dynamic range runs from 0 to a value around half the interval; in the output we stretch that dynamic range to the full 0 to 255, which makes the contrast sharper. One property you need to preserve, of course, is that this function has to be monotonically increasing: if you have two pixels x1 and x2, and x1 is higher than x2, then the corresponding y1 should also be higher than y2. That keeps the display consistent, showing brighter pixels brighter and darker pixels darker, and that is why you require this property.
One popular function used in this particular case is derived from the probability distribution of the pixel values itself: it is the cumulative distribution, scaled by 255, which ensures the output range runs from 0 to 255.
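This mapping, y = 255 times the cumulative distribution of x (histogram equalization), can be sketched as follows; the toy low-contrast image below is my own example, not from the lecture.

```python
def equalize_map(p):
    """Build the mapping y = round(255 * CDF(x)) from the normalized
    histogram p. The CDF is non-decreasing, so the map preserves the
    brightness ordering of pixels."""
    cdf, acc = [], 0.0
    for px in p:
        acc += px
        cdf.append(acc)
    return [round(255 * c) for c in cdf]

# A low-contrast image whose values occupy only the levels 100..103.
flat = [100] * 4 + [101] * 2 + [102] * 1 + [103] * 1
p = [flat.count(v) / len(flat) for v in range(256)]
m = equalize_map(p)
print(m[100], m[101], m[102], m[103])  # the occupied levels spread out towards 0..255
```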
If I perform this operation, we can see that we get a contrast-stretched image where the features are more prominent. You can also look at its histogram: it has a similar shape, but the dynamic range has been expanded and the modes are more clearly visible. In fact, this is the technique I was referring to, the one we applied to the document that was processed for binarization. With this, let me stop here; this is the first part of this particular talk, and we will continue with the next part in the next lecture. Thank you very much for listening.
Keywords: Images, projection, histogram, thresholding,
expectation maximization, Bayesian, equalization