Module 5 : Geographical Information System (GIS)
  Lecture 40 : Errors and corrections
Error sources in GIS
  • Although GIS provides a tremendous capability for data processing, one should be aware of situations where things may go wrong. To do so, one should know the various sources of error that may limit the applicability of the results.
  • Data quality is an important issue in GIS and indicates how good the data are (Heywood et al., 2000). It describes the overall fitness or suitability of data for a specific purpose, and the extent to which the data are free from error and other problems. It can be assessed through concepts such as error, accuracy, precision, and bias. Further, the resolution and generalization of the source data and the data model can also influence the final results. The data also need to be complete, compatible, consistent, and applicable for the analysis.
Issues related to data quality
  • Some aspects of data quality can be understood in terms of the following:
    • Accuracy
    • Precision
    • Bias
    • Resolution
    • Generalization
    • Completeness
    • Compatibility
    • Consistency
    • Applicability
Errors
  • Definition of Error
    • Difference between a measured quantity and its true (unknown) value
      e = M - T;  where e = error, M = measured value, T = true value
Sources of Errors

Instrumental errors

Caused by imperfections in instrument construction or adjustment.

Natural errors

Caused by changing conditions in the surrounding environment.

Personal errors

Caused by limitations in human senses and manual dexterity.

 
Accuracy Vs. Precision
  • Accuracy : Indicates nearness of measured value (M) to true value (T).
    • The true value is always unknown. Accuracy is therefore indeterminate in practice, and we resort to other ways of assessing the reliability of measurements to decide which measurement is the best (provided we have decided what "best" means).
    • An accurate GIS database indicates a true representation of reality.
  • Precision : It allows us to make such judgments. It refers to the degree of consistency between measurements and is based on the size of the discrepancies in a data set. The term is used in two ways:
    • It is an indication of the spread of the measured values of a quantity. Several values grouped closely together are a more precise set of measurements than a set with a broader spread of values. Precision is inversely proportional to the variance.
    • It denotes something made to a higher manufacturing standard, as in a precision instrument.
  • Bias : It indicates a systematic variation of data from reality. It is a consistent error throughout the dataset.
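The distinction between bias (a consistent offset) and precision (spread) can be illustrated with a minimal Python sketch. Everything in it is assumed for illustration: the reference length `true_length`, the two measurement sets, and the helper `summarize` are hypothetical and do not come from the lecture. In practice the true value is unknown, so a calibrated reference would have to stand in for it.

```python
from statistics import mean, stdev


def summarize(measured, true_value):
    """Return (bias, spread) for repeated measurements of one quantity.

    bias   = mean of the errors e = M - T (systematic part)
    spread = sample standard deviation of the measurements (precision)
    """
    errors = [m - true_value for m in measured]
    return mean(errors), stdev(measured)


# Hypothetical reference value and measurement sets (illustrative only).
true_length = 100.0
set_a = [100.02, 99.97, 100.01, 99.99, 100.01]     # small spread, no offset
set_c = [100.52, 100.47, 100.51, 100.49, 100.51]   # same spread, shifted by 0.5

bias_a, spread_a = summarize(set_a, true_length)
bias_c, spread_c = summarize(set_c, true_length)

print(f"set a: bias = {bias_a:+.3f}, spread = {spread_a:.3f}")
print(f"set c: bias = {bias_c:+.3f}, spread = {spread_c:.3f}")
```

Both sets are equally precise because their spread is the same, but only the second carries a systematic offset. Spread alone therefore cannot reveal a bias.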

Figure 40.1 gives a comparison between accuracy and precision. It shows that in surveying measurements:

  • Pacing less precise
  • Taping more precise
  • EDM most precise

The figure also shows the following interpretations:

Result                                 Observer's point of view
a is accurate but not precise.         Never knows that it is accurate
b is neither precise nor accurate.     Assumed poor
c is precise but not accurate.         Caused by systematic errors
d is accurate and precise.             Desirable for data collection
Observation    Pacing (p)    Taping (t)    EDM (e)
1              571           567.17        567.133
2              563           567.08        567.124
3              566           567.12        567.129
4              588           567.38        567.165
5              557           567.01        567.114
Figure 40.1: Precision and accuracy (Wolf and Ghilani, 1997)
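The precision ranking of the three methods can be checked from the tabulated observations. The following Python sketch computes the sample standard deviation of each column as a measure of spread. The choice of standard deviation as the measure, and the names `observations`, `spread` and `ranking`, are assumptions made for illustration. The numbers are those listed with Figure 40.1.

```python
from statistics import mean, stdev

# Observations listed with Figure 40.1, one list per method.
observations = {
    "pacing": [571, 563, 566, 588, 557],
    "taping": [567.17, 567.08, 567.12, 567.38, 567.01],
    "EDM": [567.133, 567.124, 567.129, 567.165, 567.114],
}

# Sample standard deviation: a smaller value means a more precise set.
spread = {method: stdev(values) for method, values in observations.items()}

for method, values in observations.items():
    print(f"{method:7s} mean = {mean(values):8.3f}   std dev = {spread[method]:.4f}")

# Methods ordered from most precise (smallest spread) to least precise.
ranking = sorted(spread, key=spread.get)
print("most to least precise:", ranking)
```

The spread shrinks from pacing to taping to EDM, which matches the ranking stated for the figure. It says nothing about accuracy, since no true value is available.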
 
The following concepts have been taken from Heywood et al. (2000):
  • Resolution : It describes the smallest feature in the data set that can be displayed or mapped. In raster GIS, it is indicated by the raster cell size. For example, a square cell of 20 m means that only features larger than 20 m x 20 m will be mapped. In vector GIS, resolution is determined by the scale of the original map and the size of the features represented on it.
  • Generalization : It is the process of simplifying the complexities of the real world to produce maps and models.
  • Completeness : It refers to the availability of a complete set of spatial and temporal data.
  • Compatibility and consistency : Different data sets used in creating a GIS database should be developed using similar methods of data capture, storage, manipulation and editing. For example, the maps used should not be at very different scales.
  • Applicability: It describes the appropriateness or suitability of data for a set of commands, operations or analyses.
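The raster resolution example above (a 20 m cell) can be sketched in Python as a simple filter. This is a deliberately crude rule of thumb, not how a GIS package rasterizes data: the function `is_resolvable` and the feature list are hypothetical, and a feature is treated as mappable only if both of its dimensions exceed the cell size.

```python
CELL_SIZE_M = 20.0  # raster cell size from the example in the text


def is_resolvable(width_m, height_m, cell_size_m=CELL_SIZE_M):
    """Rule of thumb: a feature is mapped only if it is larger than one cell
    in both directions (an assumption for illustration)."""
    return width_m > cell_size_m and height_m > cell_size_m


# Hypothetical features as (width, height) in metres.
features = {
    "pond": (35.0, 50.0),
    "shed": (8.0, 12.0),
    "footpath": (2.0, 400.0),
}

mapped = [name for name, (w, h) in features.items() if is_resolvable(w, h)]
print("features resolved at 20 m:", mapped)
```

Under this rule the long but narrow footpath is dropped along with the shed, which shows how the choice of cell size limits what the data set can represent.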