Fundamentals of Data Storage

by Carol R. Jacobson, School of Earth Sciences, Macquarie University, NSW, Australia

This unit is part of the NCGIA Core Curriculum in Geographic Information Science. These materials may be used for study, research, and education, but please credit the author, Carol R. Jacobson, and the project, NCGIA Core Curriculum in GIScience. All commercial rights reserved. Copyright 1998 by Carol R. Jacobson.

Your comments on these materials are welcome. A link to an evaluation form is provided at the end of this document.



1. The Relationship between the Real World and Data in a GISystem

1.1 Weaknesses of a Discrete Data Model

1.2 Selection

1.3 Representation

1.4 Quantification


2. Storage of Digital Data within a Computer System

Understanding the elements of computer storage will enable a GIS user to design optimum storage for different types of data.

2.1 Bits

2.2 Binary Systems

2.3 Bytes

2.4 The ASCII Coding System

2.5 Storage of Numerical Data

2.6 Storage of Character Data
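
As an illustration of the storage elements listed above (bits, binary counting, bytes, and ASCII codes), the following is a minimal Python sketch. The function names are hypothetical and are not part of the original unit; the sketch assumes plain unsigned binary integers and 7-bit ASCII characters stored one per byte.

```python
def distinct_values(n_bits: int) -> int:
    """Number of distinct values that n_bits binary digits can represent."""
    return 2 ** n_bits


def unsigned_range(n_bits: int) -> tuple[int, int]:
    """Smallest and largest unsigned integers that fit in n_bits."""
    return 0, 2 ** n_bits - 1


def ascii_codes(text: str) -> list[int]:
    """ASCII code of each character; raises UnicodeEncodeError for non-ASCII text."""
    return list(text.encode("ascii"))


if __name__ == "__main__":
    print(distinct_values(1))        # one bit: 2 states
    print(distinct_values(8))        # one byte: 256 states
    print(unsigned_range(16))        # two bytes: (0, 65535)
    print(ascii_codes("GIS"))        # [71, 73, 83]
    print(format(71, "08b"))         # the letter G as eight bits: 01000111
```

Each character of "GIS" occupies one byte under this assumption, so the three-letter string needs three bytes.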


3. Design of GIS Data for Efficient Storage

3.1 Storage Terminology

3.2 Efficient Use of Data Storage Capacity
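
One way to think about efficient use of storage capacity is to ask how many bytes each raster cell really needs. The sketch below is illustrative only: the function names, the 12-code example, and the 1000 by 1000 grid are assumptions chosen for the example, not values from the unit, and the sketch ignores compression and file headers.

```python
def bits_needed(n_values: int) -> int:
    """Minimum number of bits that can distinguish n_values different codes."""
    if n_values < 1:
        raise ValueError("n_values must be at least 1")
    # (n_values - 1) is the largest code when codes start at 0.
    return max(1, (n_values - 1).bit_length())


def bytes_needed(n_values: int) -> int:
    """Whole bytes per cell needed to hold one of n_values codes."""
    return (bits_needed(n_values) + 7) // 8


def raster_bytes(rows: int, cols: int, bytes_per_cell: int) -> int:
    """Uncompressed size of a raster layer in bytes."""
    return rows * cols * bytes_per_cell


if __name__ == "__main__":
    # Hypothetical layer: 12 land-cover codes on a 1000 x 1000 grid.
    per_cell = bytes_needed(12)
    print(per_cell)                          # 1 byte per cell is enough
    print(raster_bytes(1000, 1000, per_cell))  # 1000000 bytes
    # The same grid stored as 4-byte values takes four times the space.
    print(raster_bytes(1000, 1000, 4))       # 4000000 bytes
```

The comparison suggests why matching the storage type to the range of values matters: the same grid costs four times as much when every cell is given four bytes it does not need.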

4. Summary


5. Review and study questions

(Questions 1 & 2 are from the Original Core Curriculum)
  1. Compare the data storage needs of:
    1. the data transmitted per year by the EOS satellites, which generate 1 Terabyte (10^12 bytes) per day;
    2. the US Bureau of the Census's TIGER files of street networks, which are about 10G (gigabytes) and are updated every 10 years; and
    3. a database of 100M (megabytes) created for use in a one-time environmental impact study.
  2. "User expectations about data volumes rise at least as rapidly as the capacity of available storage devices." Discuss.
  3. Recommend data storage types for storing raster data for the following GISystems data:
    1. remote sensing data;
    2. nominal data for planning zones in a local government area, where a total of 20 codes is used to indicate different zones;
    3. elevation data for a national park, which includes rugged terrain up to 2000 metres;
    4. geological codes, for example: Qal (alluvium), Tea (volcanic agglomerate), Pi (shale), Psn (quartz sandstone), and Cig (granite);
    5. data on the pH of soil samples, where the range of values is 6.5 to 8.5 (data are recorded to one decimal place);
    6. data from a habitat study which record the locations of various tagged animals;
    7. canopy heights in a forested area, recorded in metres to 2 decimal places.
  4. A variety of coding schemes are used to convert non-ASCII data to ASCII equivalents before electronic transmission. Find the names of some of the common methods.

6. References

6.1. Print references

Cartography texts provide useful insights into representing the world abstractly. There are numerous computer books written for lay readers, which provide information on digital data storage. Two readable books for non-computer specialists are:

6.2. Web references


Evaluation

We are very interested in your comments and suggestions for improving this material. Please follow the link above to the evaluation form if you would like to contribute in this manner to this evolving project.


Citation

To reference this material use the appropriate variation of the following format:

The correct URL for this page is: http://www.ncgia.ucsb.edu/giscc/units/u037/u037.html.
Created: February 24, 1998.  Last revised: June 29, 1998.

