Fundamentals of Data Storage in a GISystem
Instructors' Notes
Syllabus Context
- This unit describes the storage of real world data in a digital format.
- It is best placed in a teaching program after units that provide
insights into modelling the real world; see GISCC section
Representing the earth.
- Issues concerned with how the real world is sampled are also related; see
GISCC section
Abstraction and incompleteness.
Background Information
One of the key words suggested for this module was "discreteness".
This is one of a number of important characteristics of digital data.
When real world information is stored digitally, the technology imposes a
number of changes on its nature. Other units cover the characteristics
enforced by the processes of abstraction, generalisation and so on. This
unit highlights the processes of selection, representation and
quantification, because these processes are required by the technology.
Students should be aware of all these processes and their potential effects
on real world data.
An underlying theme of this unit is that understanding the elements of
computer storage will enable a GIS user to choose sensible storage and
processing procedures for different types of data. Designing data storage
requires the knowledge that not all numbers are stored identically in a
computer system. Table 3 is therefore essential to the theme of
the unit, although following the NCGIA guidelines it is not placed in the
body of the unit text.
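The point that not all numbers are stored identically can be shown to
students with a minimal sketch such as the one below. It uses Python's
standard struct module; the list of numeric types is an assumption chosen for
illustration and does not reproduce Table 3, and the elevation value is
hypothetical.

```python
import struct

# Illustrative numeric storage types (an assumption, not a copy of Table 3).
# The "<" prefix asks struct for standard, platform-independent sizes.
NUMERIC_TYPES = [
    ("8-bit unsigned integer", "<B"),
    ("16-bit signed integer", "<h"),
    ("32-bit signed integer", "<i"),
    ("32-bit float", "<f"),
    ("64-bit float", "<d"),
]

def storage_bytes(fmt):
    """Return the number of bytes one value of this type occupies."""
    return struct.calcsize(fmt)

def round_trip(value, fmt):
    """Store a value in the given type and read it back again."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

for name, fmt in NUMERIC_TYPES:
    print(f"{name}: {storage_bytes(fmt)} byte(s)")

# Hypothetical elevation in metres: the 64-bit float returns it unchanged,
# while the 32-bit float keeps only about seven significant digits.
elevation = 1234.56789
print("32-bit float:", round_trip(elevation, "<f"))
print("64-bit float:", round_trip(elevation, "<d"))
```

The smaller types save space but restrict the range or precision of the
values they can hold, which is the trade-off a storage design has to weigh.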
The description of binary number systems is important only as background
information. Most students with a geography background need not understand
binary numbers; most students with a computing background will already
understand them.
Tables 1 and 2 are peripheral to the theme of this unit, and merely
provide additional information on the existence of non-decimal number
systems and ASCII codes.
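If a quick demonstration of this background material is wanted, a short
sketch along the following lines may be enough. The function names and the
sample values are hypothetical and are not taken from Tables 1 and 2.

```python
def to_binary(number, bits=8):
    """Return a non-negative integer as a zero-padded binary string."""
    return format(number, f"0{bits}b")

def to_hexadecimal(number):
    """Return a non-negative integer as a hexadecimal string."""
    return format(number, "x")

def ascii_codes(text):
    """Return the ASCII code of each character in the text."""
    return [ord(character) for character in text]

print(to_binary(5))         # 00000101
print(to_binary(200))       # 11001000
print(to_hexadecimal(255))  # ff
print(ascii_codes("GIS"))   # [71, 73, 83]
```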
Demonstrations and Exercises
Most GISystem software enables users to specify data types for new data sets
(ERDAS IMAGINE is particularly flexible). An instructive demonstration is to
create data sets that hold the same data in different data types and compare
the file sizes: this could be set as a student exercise.
Changes in storage requirements are
larger with raster than vector files (so IMAGINE provides striking
examples). Changes with vector data are also observable: remind students
that a difference of 2 kbytes on a 100-record test data set becomes 2 Mbytes
on a 100,000-record real-life data set.
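Where no GISystem software is to hand, the file size comparison can be
approximated with a rough sketch like the one below. It writes the same
hypothetical 100 x 100 grid of cell values as raw binary in three data types
and reports the file sizes, then scales a 2 kbyte difference from 100 records
to 100,000 records. The grid, the file names and the function names are all
assumptions for illustration; real GIS file formats add headers and may
compress data, so actual sizes will differ. The sizes assume the usual 4-byte
and 8-byte floats found on common platforms.

```python
import os
import tempfile
from array import array

# Typecodes from Python's array module; the labels are illustrative only.
DATA_TYPES = [
    ("8-bit unsigned integer", "B"),
    ("32-bit float", "f"),
    ("64-bit float", "d"),
]

def write_raster(values, typecode, path):
    """Write the cell values as raw binary and return the file size in bytes."""
    with open(path, "wb") as handle:
        array(typecode, values).tofile(handle)
    return os.path.getsize(path)

def scaled_difference(diff_bytes, test_records, real_records):
    """Scale a size difference seen on a test data set up to a larger one."""
    return diff_bytes * real_records // test_records

# Hypothetical 100 x 100 raster whose cell values all fit in one byte (0-255).
cells = [(row + col) % 256 for row in range(100) for col in range(100)]

with tempfile.TemporaryDirectory() as folder:
    for label, typecode in DATA_TYPES:
        path = os.path.join(folder, f"raster_{typecode}.bin")
        print(f"{label}: {write_raster(cells, typecode, path)} bytes")

# 2 kbytes (taken here as 2000 bytes) on 100 records, scaled to 100,000 records.
print(scaled_difference(2000, 100, 100000), "bytes")
```

The same 10,000 cells occupy four times as much space as 32-bit floats, and
eight times as much as 64-bit floats, as they do as single bytes.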
The suggested study questions emphasise the application of knowledge of data
storage formats.