Fundamentals of Data Storage in a GISystem

Instructors' Notes

Syllabus Context

Background Information

One of the key words suggested for this module was "discreteness". This is one of a number of important characteristics of digital data. When real world information is stored digitally, the technology determines a number of changes in its nature. Abstraction, generalisation and similar processes also impose their own characteristics on the data, and these will be covered in other units. This unit highlights the processes of selection, representation and quantification, because these processes are required by the technology. Students should be aware of all these processes and their potential effects on real world data.

An underlying theme of this unit is that understanding the elements of computer storage will enable a GIS user to nominate sensible storage and processing procedures for different types of data. Designing data storage requires knowing that not all numbers are stored in the same way in a computer system. Table 3 is therefore essential to the theme of the unit, although, following the NCGIA guidelines, it is not placed in the body of the unit text.
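The point that numbers are not all stored alike can be shown to students with a short sketch. The Python below is illustrative only: the type names are common examples and are not taken from Table 3, and the elevation value and the helper as_float32 are invented for the demonstration. It reports the storage size of several numeric types and shows that a value passed through a 32-bit float comes back slightly changed.

```python
import struct

# struct type codes; the "<" prefix selects standard, platform-independent sizes.
# These types are common examples, assumed for illustration.
type_codes = {
    "8-bit integer": "b",
    "16-bit integer": "h",
    "32-bit integer": "i",
    "32-bit float": "f",
    "64-bit float": "d",
}

# Bytes needed to store one value of each type.
type_sizes = {name: struct.calcsize("<" + code) for name, code in type_codes.items()}


def as_float32(value):
    """Round-trip a Python float (64-bit) through 32-bit float storage."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


elevation = 1234.567            # hypothetical elevation in metres
stored = as_float32(elevation)  # what a 32-bit float field would hold

for name, size in type_sizes.items():
    print(name, size, "byte(s)")
print("original:", elevation, "stored as 32-bit float:", stored)
```

The smaller types save space but restrict the range or precision of the values that can be held, which is the trade-off a GIS user weighs when nominating a data type.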

The description of binary number systems is important only as background information. Most students with a geography background need not understand binary numbers; most students with a computing background will already understand them. Tables 1 and 2 are peripheral to the theme of this unit, and merely provide additional information on the existence of non-decimal number systems and ASCII codes.

Demonstrations and Exercises

Most GISystem software enables users to specify data types for new data sets (Erdas IMAGINE is particularly flexible). A useful demonstration is to create several data sets that hold the same data in different data formats and then compare the file sizes. This could be set as a student exercise.
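Where GIS software is not available, the same comparison can be approximated with a short script. The sketch below is a stand-in for the software exercise, not a reproduction of any package's file format: it writes one hypothetical 100 x 100 raster of class codes as flat binary files in four numeric types and compares the resulting file sizes. The raster values, the file names and the function write_raster are all invented for illustration.

```python
import os
import struct
import tempfile

# Hypothetical raster: 100 x 100 cells holding class codes 0..9.
cells = [(row + col) % 10 for row in range(100) for col in range(100)]

# struct type codes with standard ("<") sizes: B = 1 byte, h = 2, f = 4, d = 8.
formats = {
    "8-bit integer": "B",
    "16-bit integer": "h",
    "32-bit float": "f",
    "64-bit float": "d",
}


def write_raster(path, values, code):
    """Write the values as a flat binary file using a single numeric type."""
    with open(path, "wb") as handle:
        handle.write(struct.pack("<%d%s" % (len(values), code), *values))


file_sizes = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, code in formats.items():
        path = os.path.join(tmp, "raster_" + code + ".bin")
        write_raster(path, cells, code)
        file_sizes[name] = os.path.getsize(path)

for name, size in file_sizes.items():
    print(name, size, "bytes")
```

The data are identical in every file; only the storage type differs, and the file size grows in proportion to the bytes used per cell. Real GIS formats add headers and may compress the data, so sizes observed in practice will not match these figures exactly.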

Changes in storage requirements are larger with raster than with vector files (so IMAGINE provides striking examples). Changes with vector data are also observable: remind students that a difference of 2 kbytes on a 100-record test data set becomes a difference of 2 Mbytes on a 100,000-record real-life data set.
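The scaling from test data to real data is simple proportion, and can be worked through with students as follows. This is a minimal sketch that assumes the storage difference grows linearly with the number of records and takes 1 kbyte as 1000 bytes; the function name scaled_difference is invented for illustration.

```python
def scaled_difference(test_bytes, test_records, real_records):
    """Scale a storage difference measured on a test data set to a larger one.

    Assumes every record contributes the same number of bytes.
    """
    bytes_per_record = test_bytes / test_records
    return bytes_per_record * real_records


# 2 kbytes on 100 records is 20 bytes per record.
difference = scaled_difference(2000, 100, 100000)
print(difference / 1_000_000, "Mbytes")
```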

The suggested study questions emphasise the application of knowledge of data storage formats.