UNIT 1: DATA ACQUISITION

Written by Karen Beardsley Willett, University of California, Davis


Context


Acquiring data over the Internet has become common practice over the past few years. Numerous web sites provide free data layers on the web, and most provide adequate metadata (information about the data) for users to determine whether or not the data are appropriate for their needs. One must be cautious, however, when acquiring free data over the Internet. Many data formats, compression methods, and archive media are used for distributing and accessing GIS data sets.

This Unit will focus on methods for acquiring GIS data over the Internet, including understanding Internet data transfer methods and developing skills in manipulating a variety of data formats.
The following is an example of using a combination of ftp, tar, and gzip tools to retrieve and manipulate GIS data over the Internet.


Example Application

A small environmental consulting firm has hired a GIS technician to acquire basic data layers for their state. The company first requires data at a coarse scale (statewide), and later will want to obtain more detailed, finer scale data sets for particular regions in the state. Since the USGS makes numerous GIS data layers available on the Internet, several of these base layers can easily be retrieved by the GIS technician through use of a web browser. The technician proceeds to acquire data layers for Digital Elevation Model (DEM) data, transportation data, and Land Use/Land Cover data. In order to accomplish this task, the GIS technician follows several steps:

  1. From the Netscape browser, open the USGS page http://www.usgs.gov/
  2. Under Science Topics, click on Mapping
  3. Click on US GeoData under Mapping Products and Services
  4. Under 1:250,000-Scale Digital Elevation Model (DEM), review the Condensed User Guide
  5. Click on FTP via State
  6. Using FTP, download one data set in Compressed format
  7. Use GZIP utility (GUNZIP) to uncompress data
  8. As indicated in the instructions, review and download the User Guide from http://edcwww.cr.usgs.gov/glis/hyper/guide/1_dgr_dem
  9. Convert the data from DEM format into the GIS software used by the company

The technician obtains other base layers using a similar method. Some data sets are acquired from public agencies via tape or disk, and these are processed in a similar manner once extracted from the transfer media.
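
The uncompress step (step 7) can also be scripted. The following is a minimal sketch in Python, assuming the downloaded file carries a .gz extension; the function name gunzip_file and any file names passed to it are illustrative and do not come from the USGS instructions. Unlike the GUNZIP utility, this sketch leaves the compressed original in place.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(compressed_path, output_path=None):
    """Decompress a .gz file to disk and return the path of the result."""
    compressed_path = Path(compressed_path)
    if output_path is None:
        # Follow gunzip's naming convention: drop the trailing ".gz".
        if compressed_path.suffix != ".gz":
            raise ValueError("expected a file name ending in .gz")
        output_path = compressed_path.with_suffix("")
    output_path = Path(output_path)
    # Copy in chunks so a large elevation file is never held fully in memory.
    with gzip.open(compressed_path, "rb") as src, open(output_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return output_path
```

Calling gunzip_file("some_dem.gz") would write the uncompressed data to some_dem in the same directory, ready for conversion into the company's GIS software (step 9).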

Learning Outcomes

The following list describes the expected skills which students should master for each level of training, i.e. Awareness/Competency/Mastery.

Awareness:

The expected learning goals of this section are to achieve a general understanding of the Internet tools available for accessing and transferring GIS data files; gain a basic vocabulary of file formats, compression methods, and transfer protocols; and recognize and be able to utilize various media sources for data transfer.

Competency:

The learning goals of this section are to develop the ability to connect to remote computers, locate desired data sets, and retrieve data of various formats over the Internet or from archive media.

Mastery:

The learning goals of this section are to develop the ability to prepare data sets for transfer to other users and to understand interoperability issues relating to shared access of GIS data over the Internet.


Preparatory Units

Recommended:

None

Complementary:

  1. Unit 2 - Locating Demographic Data
  2. Unit 3 - Locating Transportation Network Data
  3. Unit 4 - Locating Land Records Data
  4. Unit 5 - Locating Natural Resources Data
  5. Unit 6 - Locating Terrain Data
  6. Unit 7 - Using and Interpreting Metadata

Awareness


Learning Objectives:

  1. Student can use a web browser to find data.
  2. Student can define basic vocabulary relating to the Internet and data transfer methods.
  3. Student is able to retrieve data over the Internet using File Transfer Protocol (ftp).
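
As an illustration of objective 3, an FTP retrieval can be sketched with Python's standard ftplib module. This is a sketch under stated assumptions, not a prescribed procedure: the function names are hypothetical, and the host, directory, and file name must be supplied by the user because the source does not list them.

```python
from ftplib import FTP


def download_binary(ftp, remote_name, local_path):
    """Fetch one remote file in binary mode over an already open FTP session."""
    with open(local_path, "wb") as fh:
        # RETR is the FTP command that retrieves a file. Binary mode matters
        # because compressed data is corrupted by text-mode transfers.
        ftp.retrbinary(f"RETR {remote_name}", fh.write)
    return local_path


def fetch_anonymous(host, directory, remote_name, local_path):
    """Log in anonymously, change directory, and download one file."""
    with FTP(host) as ftp:   # connects to the host (hypothetical, user supplied)
        ftp.login()          # no arguments means an anonymous login
        ftp.cwd(directory)
        return download_binary(ftp, remote_name, local_path)
```

Splitting the transfer from the login keeps download_binary usable with any session that is already open, for example when several files are retrieved from the same directory.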

Vocabulary:

Topics:

  1. Unit Concepts
    • The Internet is a federation of computer networks that speak the same protocols ("language"). The networks are connected to each other with high-speed telephone circuits. The protocols spoken are computer networking protocols, used to enable computers to communicate with each other (just as human protocols help humans communicate). Three roles are played on the Internet: information provider, user (or customer), and connection provider.

    • There are three basic methods for accessing the Internet: modem, dial-up networking, and high-speed telephone circuits leased from the phone company.

    • There are a number of different ways to find and acquire existing digital data over the Internet. The basic requirement for doing so is an Internet browser and a relatively fast connection. (For example, a 14,400 bps modem from home is usually slower than users are willing to put up with.)

  2. Types of Connectivity and Data Transfer
Tasks:


Competency


Learning Objectives:

  1. Student can use the Internet to connect to a remote computer.
  2. Student can find and retrieve relevant data layers.
  3. Student is able to process data from numerous formats, including GIS export files, tar files, and compressed files.
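
For objective 3, unpacking a tar archive can be sketched with Python's standard tarfile module. The function name and the archive layout are assumptions made for illustration. The check that rejects members pointing outside the destination directory is an added precaution, not something the source prescribes.

```python
import tarfile
from pathlib import Path


def list_and_extract(archive_path, dest_dir):
    """List the members of a tar archive, then unpack it into dest_dir."""
    dest = Path(dest_dir).resolve()
    # Mode "r:*" lets tarfile detect gzip (or no) compression on its own.
    with tarfile.open(archive_path, "r:*") as archive:
        names = archive.getnames()
        for member in archive.getmembers():
            target = (dest / member.name).resolve()
            # Refuse members that would be written outside dest_dir.
            if dest != target and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        archive.extractall(dest)
    return names
```

Returning the member names gives the user a chance to confirm which files (for example, GIS export files) arrived before converting them.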

Vocabulary:


Topics:

  1. Unit Concepts
Tasks:


Mastery


Learning Objectives:

  1. Student understands the basic concept of interoperability.
  2. Student is able to work with Java, Active X, and other Internet mapping tools.

Vocabulary:

Unit Concepts:


  1. Java

    The Java programming language and environment are designed to solve a number of problems in modern programming practice. Java started as part of a larger project to develop advanced software for consumer electronics. These devices are small, reliable, portable, distributed, real-time embedded systems. The project's developers initially intended to use C++ but encountered a number of problems. At first these were just compiler technology problems, but as time passed more problems emerged that were best solved by changing the language.
    Java is commonly thought of as a way to make Web pages sexy by incorporating stock tickers, sound, or video. It has evolved into much more. It is becoming known as a computing platform: the base upon which software developers can build applications. Developers can build a variety of applications using Java, from traditional spreadsheets and word processors to mission-critical applications used by the biggest companies: accounting, asset management, databases, human resources, and sales.

    Java applications, or applets, are different from ordinary applications in that they reside on the network in centralized servers. The network delivers an applet to your system when you request it. For example, let's say that you want to check your personal financial portfolio. You'd dial in to your financial institution and use your Web browser to log into the bank's system. The portfolio data will be shipped to you along with the applet needed to view it. Let's assume that you're considering moving your money from one account to another. No need to perform a series of cut-and-paste exercises. The system will also send you an applet that will allow you to change the rate of interest and length of investment to perform a series of "what-if" scenarios.

    From the corporation's point of view, Java will simplify the creation and deployment of applications, thus saving money. Applications created in Java can be deployed without modification to any computing platform, thus saving the costs associated with developing software for multiple platforms. And because the applications are stored on centralized servers, there is no longer a need to have people insert disks or ship CDs to update software.

  2. Active X

    ActiveX is a Microsoft technology similar to Java but fully integrated only with Microsoft products.
    For more information, read about the Microsoft Component Object Model (COM).

  3. Internet Map Servers

  • Implications for GIS users

    Numerous Java-based mapping applications have been built. Interactive mapping over the Internet is also possible using other methods, including Perl scripting and HTML programming. ESRI Internet Map Server for ArcView has a built-in Java applet (called Map Cafe) that allows Internet manipulation of GIS data layers in a manner that mimics ArcView. Map Objects Internet Map Server may be implemented using Java, Active X, or neither (simple HTML coding).



    Tasks:



    Follow-up Units

    1. Unit 9 - Converting digital spatial data between formats, systems and software

    Resources




    Currently maintained by Steve Palladino
    Created: May 14, 1997. Last updated: October 5, 1998.
    Content comments to Karen Beardsley Willett
    Formatting comments to Steve Palladino