Welcome, all of you, to this course on the fundamentals of database systems.
This is mostly an undergraduate course for computer science and engineering
as well as information technology students.
So let us start off with what we mean by a database. A database is essentially
a collection of data, but, very importantly, it is not just any data: it is a
collection of interrelated data. The data fields, or the data points, must have
some connections among them. So it is a set of interrelated data.
We all have some idea of what databases are and where they can be used. For
example, in an institute, the entire set of records about the institute
constitutes one database. It will have different entities such as faculty
members, students, staff, etc., as well as other kinds of things such as
courses, and everything together constitutes one database. So that is what a
database is.
And what is a database management system, or DBMS? It is a system that
provides an environment to handle a database. It must be efficient and
convenient to use, and essentially it contains some programs and an interface
to do four things. Number one is to store data. Number two is that it must be
able to visualize data. The third is to access data, which is also known as
querying the data. And the fourth one is to update, or manipulate, the data.
So these are the four tasks of a database management system, and all of them
are carried out over the database.
The question that comes up then is that a normal file system can also do all
of these things. It can of course store data: we store files. It can visualize
data; well, it is not really clear what visualization means, and we will see
that later, but you can at least see what the files and folders are. It can of
course access data: you can access a particular file within a particular
folder. And it lets you update the contents of a file or of a folder. So how
is a database different from a file system, first of all, and how is it more
important than a file system? There are several advantages of a DBMS over a
file system, so let us go through them: the advantages of a DBMS over file
systems.
So the first one is that it reduces the data redundancy and inconsistency
problems: redundancy/inconsistency. Let me go over each of these points a
little bit. What does data redundancy mean? Essentially, it means that the
same piece of data, the same information, is stored in multiple places, and a
well-designed database does not really allow that. So if there is information
about a student, the name of the student, for example, or the roll number, it
should generally be stored in only one place. This is the big difference from
file systems, where most of us replicate files, copy files from here to there,
and so on. What that copying also does is run into the problem of
inconsistency. What inconsistency means is that you have two versions of a
file, and in one version you have changed something and have inadvertently
forgotten to make the same modification to the other file; or two people are
making two different versions of the file. So the information in these two
files may be inconsistent: one of them may not be correct, because the other
one has been updated and the update has not yet reached this copy. A database,
by virtue of storing each piece of information once and in one place, reduces
this inconsistency. So there is no redundancy and there is no inconsistency.
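The idea of storing each fact once can be sketched with a small in-memory SQLite database from Python. This is only an illustration under assumptions: the table names, column names, and sample student are made up here and are not from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE enrollment (
        roll_no INTEGER NOT NULL REFERENCES student(roll_no),
        course  TEXT NOT NULL
    );
    INSERT INTO student VALUES (101, 'Asha Rao');
    INSERT INTO enrollment VALUES (101, 'Databases'), (101, 'Algorithms');
""")

# The name lives in exactly one row, so a correction is a single update.
conn.execute("UPDATE student SET name = 'Asha R. Rao' WHERE roll_no = 101")

# Every course listing picks up the corrected name through the join,
# so there is no second copy that could be left stale.
rows = conn.execute("""
    SELECT e.course, s.name
    FROM enrollment AS e JOIN student AS s ON s.roll_no = e.roll_no
    ORDER BY e.course
""").fetchall()
print(rows)  # [('Algorithms', 'Asha R. Rao'), ('Databases', 'Asha R. Rao')]
```

Had the name been copied into every enrollment row, the same correction would have needed one update per copy, which is exactly where inconsistency creeps in.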
So that's the first point. The second point is a closely related one; it is
called data isolation.
So here what happens is that the data is isolated, in the sense that it is
stored in an internal format and there is only one interface to access it.
Essentially the data is stored in some kind of binary format, which is not the
headache of the end user. The user wants a piece of data, and the database
interface lets that user access the data in a particular format. The other
important advantage of this is that, because the data is isolated and stored
in a format that the database handles, the problem of data formatting is not
there. We often run into this with file systems. For example, you may be
working with an Excel file, which is a spreadsheet, and once you take it to,
say, a Linux system or a Mac system, that Excel file may not open correctly,
because it is not in an open document format and all those things, and the
programs that open it do not behave consistently. That's the problem, but the
database isolates you from that. The third important thing is called data
integrity. What does data integrity mean? Essentially, there are many pieces
of data that carry some semantics, or a correctness condition. For example,
the CPI, or the cumulative point average, or the grade point average, whatever
you call it, is generally constrained to be between zero and ten, in our
Indian systems at least. But suppose you are storing all of this information
in a file or in an Excel spreadsheet or whatever. Now there is no check: if
you by chance enter somebody's CPI as minus one or twelve, nothing checks it
when you put in the data. A database, however, will let you put in some
integrity constraints. So when you define the CPI field you can say that it is
between zero and ten, and then any attempt to enter a value of, for example,
minus one will result in an error, and it will not let you enter it. The logic
is in the database system itself, not in whatever is accessing it: once you
have defined that constraint on the CPI, the check lives within the database
system. So this is the third point. The fourth point is called
the atomicity of operations. This is probably one of the most famous uses of
databases, and as an example let us take the bank transfer case. A certain
amount of money, let us say one hundred rupees, is transferred from person A's
account to person B's account. The atomicity of operations, or the atomicity
of transactions, requires that either the money is transferred completely or
no money is transferred at all. What we mean by that is that it cannot happen
that a hundred rupees has been deducted from A's account but not credited to
B's account, or the other way around, that a hundred rupees has been credited
to B's account but not debited from A's account. You can see what problems
will arise if such inconsistent credits and debits can take place. So a
database system encodes the entire operation of debiting from A's account and
crediting to B's account as one operation, or one transaction (we will see
more about transactions later in the course), and it mandates the atomicity of
it. Either it has happened completely, so one hundred has been both debited
and credited, or it has been neither debited nor credited. So this is the
other important part.
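The last two points, integrity constraints and atomic transactions, can be sketched together using Python's built-in sqlite3 module. This is a minimal illustration under assumptions, not how any particular bank or institute does it: the table layout, the helper names `try_insert_cpi`, `transfer`, and `balances`, the starting balances, and the rule that a balance may not go negative are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        cpi     REAL CHECK (cpi BETWEEN 0 AND 10)    -- integrity constraint
    );
    CREATE TABLE account (
        owner   TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)  -- assumed rule: no overdraft
    );
    INSERT INTO account VALUES ('A', 150), ('B', 20);
""")

def try_insert_cpi(roll_no, cpi):
    """Return True if the row was accepted, False if the CHECK rejected it."""
    try:
        with conn:  # commits on success, rolls back if the block raises
            conn.execute("INSERT INTO student VALUES (?, ?)", (roll_no, cpi))
        return True
    except sqlite3.IntegrityError:
        return False

def transfer(src, dst, amount):
    """Credit dst and debit src as one transaction; return True on success."""
    try:
        with conn:
            # The credit is deliberately done first, so that a failed debit
            # shows the already-applied credit being undone by the rollback.
            conn.execute("UPDATE account SET balance = balance + ? WHERE owner = ?",
                         (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE owner = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    """Return the current balances as a dict keyed by owner."""
    return dict(conn.execute("SELECT owner, balance FROM account ORDER BY owner"))

print(try_insert_cpi(1, 8.5))    # True: within 0..10
print(try_insert_cpi(2, 12))     # False: rejected by the database itself
print(transfer("A", "B", 100))   # True: A has 150, so both updates commit
print(transfer("A", "B", 100))   # False: A has only 50 left, the debit fails
print(balances())                # {'A': 50, 'B': 120}
```

In the second transfer the credit to B had already been applied when the debit from A was rejected, yet B's balance is still 120 afterwards: the rollback undid the whole transaction, so neither half took effect.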
Let us come to the fifth point. The fifth point is concurrency. Very simply,
concurrency means that multiple operations can take place together in the
database. The database generally takes care of these things as long as there
are no conflicts on a particular data item. So if A is transferring money to B
and C is transferring money to D, they do not need to wait for each other, and
both transactions can happen together. In file systems this may not be
possible if all the transactions are recorded in the same file: that one file
first needs to be edited for A to B, then saved, and only then can C to D
happen. But it also depends on the file system that is being used. Probably
the last important thing about databases, or why they are useful over file
systems, is that of security.
Security of the data is paramount in databases, and the database or the data
fields can be made secure in multiple ways. First of all, the employee
database, for example, is secured in the sense that unauthorized users may not
gain access to any of it. That is one part of security. The other, and
probably the more important, part is that there are different access control
levels in the database for different fields and different roles. For example,
an employee may be able to see her own record but not the records of others.
Similarly, a student may see her own CPI but not that of others, and a student
may change her home address but not her CPI, even though the record is about
that student. So even within a record there can be attribute-level security,
attribute-level access privileges, etc. Security mandates that nobody will be
allowed to change a field unless the database system allows it, and these
roles have to be defined very carefully while defining the fields of the
database. These are the six important
points for the advantages of a DBMS over file systems. To go over them once
more: the first one is redundancy and inconsistency, then it is data
isolation, then it is data integrity, the fourth one is atomicity of
operations, the fifth one is concurrency, and the sixth one is security.
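To make the sixth point, attribute-level access privileges, a little more concrete, here is a toy sketch in plain Python. It is not how a DBMS implements security; real systems typically express this through privileges granted to users and roles. The role name, the field names, and the `is_allowed` helper are all hypothetical and chosen only to mirror the student example above.

```python
# Hypothetical privilege table: role -> field -> actions allowed on one's OWN record.
OWN_RECORD_PRIVILEGES = {
    "student": {
        "name": {"read"},
        "address": {"read", "write"},  # a student may change her home address
        "cpi": {"read"},               # ...but may only look at her own CPI
    },
}

def is_allowed(role, user_id, record_owner_id, field, action):
    """Allow an action only on the user's own record, and only if the field permits it."""
    if user_id != record_owner_id:
        return False  # in this sketch, records of others are off limits entirely
    return action in OWN_RECORD_PRIVILEGES.get(role, {}).get(field, set())

print(is_allowed("student", 7, 7, "address", "write"))  # True
print(is_allowed("student", 7, 7, "cpi", "write"))      # False
print(is_allowed("student", 7, 8, "cpi", "read"))       # False
```

The point of the sketch is only that the decision depends on three things at once: who is asking, whose record it is, and which attribute is being touched.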
So, well, that ends the first part of the module. In the next module we will
start on the relational data model.