The Internet, a world-wide collection of interconnected networks of computers, has facilitated the accessing and sharing of information around the globe. The World-Wide Web is a project on the Internet that allows hypermedia information retrieval across the network. Geographic Information System (GIS) data were made accessible on the Internet by using the Geographic Resources Analysis Support System (GRASS) and the Common Gateway Interface (CGI) of the Hypertext Transfer Protocol in the World-Wide Web. An X Window System-based GUI enabled users anywhere on the Internet to manipulate and display GIS data layers of interest. A platform-independent, display-only map production system was also developed for data browsing. Data were also organized using visual search techniques (image maps) and made available in a vendor-neutral format (Spatial Data Transfer Standard).
Availability of spatial data in digital forms has long been one of the most significant obstacles to the widespread development and use of GIS applications. The first two International Integrating GIS and Environmental Modeling Conferences/Workshops identified digital spatial data availability as one of the greatest problems facing those developing integrated modeling and GIS applications. To overcome this, numerous data development efforts are being planned, are underway, or have been recently completed to provide digital spatial data. As such efforts develop spatial data in digital form, the data availability obstacle becomes a data access obstacle. Several access-related problems also arise including: (1) locating spatial data in digital form, (2) obtaining access to the desired data in a timely fashion, (3) accessing the most recent version of spatial data sets, (4) compatibility of data formats, (5) large storage requirements for data of interest, and (6) exchange of fees for access to data. The information super highway (the Internet) provides an opportunity to overcome many of these problems and to facilitate the application of GIS--even for the novice user.
The overall goal of this work is to demonstrate the potential of the Internet for overcoming spatial data access problems and for facilitating the use of GIS by large numbers of diverse users. The Internet is an international communication infrastructure composed of thousands of regional networks scattered throughout the globe (Comer, 1995). This world-wide connectivity includes more than 13,500 foreign networks and over 20 million users in over 50 countries. Presently, more than 15,500 billion bytes of information are transferred per month across this network. The Internet supports multiple modes of communication such as electronic mail, remote login sessions, remote file transfers (FTP), and hypermedia searches. The Internet is evolving as a major medium for information sharing and retrieval and is an efficient tool for scientific and engineering research and development.
The World-Wide Web (also called WWW, W3, or the Web) is a wide-area hypermedia information retrieval system providing access to a myriad of documents and data on the Internet. WWW is also a body of software and a set of protocols and conventions to provide easy and consistent access to information on the Web. There are many W3 browsers that can be used to surf the Web, including Mosaic (NCSA, 1995), Netscape (Netscape Communications Corporation, 1995) and Lynx (Montulli, 1995).
With the development of the WWW, opportunities arose to organize various resources found on the Internet in an efficient and user-friendly manner. One major type of resource is geographic information systems (GIS) data.
The main objectives of this work were as follows:
First, a brief overview of important protocols and conventions of the WWW is given. These include the Hypertext Markup Language, the Hypertext Transfer Protocol, and the Common Gateway Interface. Following this overview is a brief tour of an integrated GIS data organization, browsing, analysis, and transfer system. Finally, implementation details for this system are given.
WWW uses Uniform Resource Locators (URLs) to represent hypermedia links and links to network services (hyperlinks) within HTML documents (Berners-Lee, 1994b). The general structure of a URL is: protocol://host:port/path where
protocol is the Internet protocol (e.g., http, ftp, news), host is the name of a host computer connected to the Internet (in RFC1037 format; Hunt, 1992 and Albiltz and Liu, 1992), port is an optional integer value specifying a host port, and path is a filename.
Many of the references in this paper contain URLs.
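The URL structure described above can be illustrated with the urllib.parse module of the Python standard library. This is only a minimal sketch: the helper name split_url, the small table of default ports, and the second sample path are assumptions made for illustration and are not part of the system described in this paper.

```python
from urllib.parse import urlsplit

# Usual default ports for a few protocols (illustrative subset only).
DEFAULT_PORTS = {"http": 80, "ftp": 21, "news": 119}


def split_url(url):
    """Split a URL into its protocol, host, port, and path parts."""
    parts = urlsplit(url)
    # The port is optional; fall back to the protocol's usual port.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": port,
        "path": parts.path or "/",
    }


if __name__ == "__main__":
    print(split_url("http://ingis.acn.purdue.edu/"))
    # Hypothetical URL with an explicit port and a longer path.
    print(split_url("http://ingis.acn.purdue.edu:8080/data/indiana.html"))
```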
Connection: The client establishes a TCP/IP connection to the server using the URL address, usually on port 80.
Request: The client sends a request for the URL using HTTP.
Response: The server processes the request and sends back the requested document.
Close: The server closes the connection and the client terminates the TCP/IP connection.
All Web clients and servers must be able to speak HTTP in order to send and receive hypermedia documents.
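The connection, request, response, and close steps can be sketched with a plain TCP socket. The function names below (build_request, parse_response, fetch) are hypothetical, the request is a bare HTTP/1.0 GET with a Host header added for the benefit of present-day servers, and error handling is left out; it is an illustration of the transaction, not the code of any particular browser.

```python
import socket


def build_request(host, path="/"):
    """Build a minimal HTTP/1.0 GET request as bytes."""
    lines = [f"GET {path} HTTP/1.0", f"Host: {host}", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw):
    """Split a raw HTTP response into (status_code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    header_lines = head.decode("iso-8859-1").split("\r\n")
    # The status line looks like: HTTP/1.0 200 OK
    status_code = int(header_lines[0].split()[1])
    headers = {}
    for line in header_lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


def fetch(host, path="/", port=80, timeout=10.0):
    """Carry out the four steps: connection, request, response, close."""
    # Connection: open a TCP/IP connection to the server.
    with socket.create_connection((host, port), timeout=timeout) as conn:
        # Request: send the request for the document.
        conn.sendall(build_request(host, path))
        # Response: read until the server has sent everything.
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # Close: the server has closed its side.
                break
            chunks.append(data)
    # Leaving the "with" block closes the client side of the connection.
    return parse_response(b"".join(chunks))
```

A call such as fetch("ingis.acn.purdue.edu") would return the status code, the response headers, and the document body, provided the host is reachable.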
There are many Web servers, each differing slightly in functionality and implementation. These servers, like the FTP daemon (Postel and Reynolds, 1985), are programs that respond to an incoming connection and provide a service to the client. Hypertext Transfer Protocol Daemon (HTTPD) (NCSA, 1995a) is a public domain Web server developed by the National Center for Supercomputing Applications (NCSA). HTTPD also records the date and time of requests along with the IP number of the client, which is useful for keeping track of traffic.
Gateway programs or scripts are server-side executable programs that are run (upon request from a client) to serve information. These gateways are initiated when the client requests the URL corresponding to the gateway. Since these scripts are executed on the server, gateway programs are somewhat independent of the client's operating environment. Gateways interact with the client and server using HTTP. Gateways conforming to the HTTP specifications can be developed in any programming language, such as C, FORTRAN, Pascal, PERL, Bourne Shell, C Shell, etc.
Information is passed from the server to the CGI script through command line arguments as well as environment variables. The environment variables are defined when the server executes the gateway program.
Examples of environment variables are:
REQUEST_METHOD: The method with which the request was made. Request method "POST" is generally used in HTML forms.
QUERY_STRING: Arguments to a CGI program.
REMOTE_ADDR: The IP number of the client making the request. This can be used for defining the remote DISPLAY when running X Window System programs.
AUTH_TYPE: If the server supports user authentication for security reasons, this variable is required for protocol-specific user authentication.
CONTENT_TYPE: This defines the type of data attached with the request to the server.
CONTENT_LENGTH: The length of the content attached to the URL which is required to decode the CONTENT of the request from the client.
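A gateway program might read these variables as in the following sketch. The helper names read_form and remote_display are hypothetical, and the assumption that the client's X display is screen 0 on the address in REMOTE_ADDR is made only for illustration.

```python
import os
import sys
from urllib.parse import parse_qs


def read_form(environ, read_body):
    """Return the request's form fields as a dict of lists.

    read_body is a callable that returns the next n characters of the
    request body (for a real gateway, sys.stdin.read).
    """
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # CONTENT_LENGTH says how much form data follows on standard input.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        encoded = read_body(length)
    else:
        # For other requests the arguments arrive in QUERY_STRING.
        encoded = environ.get("QUERY_STRING", "")
    return parse_qs(encoded)


def remote_display(environ):
    """Build an X display name from the client's address (screen 0 assumed)."""
    return environ.get("REMOTE_ADDR", "localhost") + ":0"


if __name__ == "__main__":
    fields = read_form(os.environ, sys.stdin.read)
    print("Content-type: text/plain")
    print()
    print("fields:", fields)
    print("display:", remote_display(os.environ))
```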
CGI scripts can return a variety of document types, such as images, HTML documents, and audio files, in response to the request from the client. Information on the type of document/data that is being sent back depends on the content in the first line of the response. The first line will be different depending on whether the program is returning a full document or a reference to one. In the former case, the first line of the gateway output should be of the form:
Content-type: (a MIME type/subtype encoding (Borenstein and Freed, 1992)).
For an HTML document, the first line of the output is:
Content-type: text/html
Immediately following this is a blank line/linefeed, which indicates to the server that the definition of the output is over.
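The output convention just described (a Content-type line, a blank line, then the document) can be sketched as follows. The function name cgi_response and the sample page are hypothetical.

```python
import sys


def cgi_response(body, content_type="text/html"):
    """Return the full gateway output: header line, blank line, document."""
    # The empty line after the header tells the server the header is over.
    return f"Content-type: {content_type}\n\n{body}"


if __name__ == "__main__":
    page = "<html><body><h1>Available maps</h1></body></html>"
    sys.stdout.write(cgi_response(page))
```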
Sensitive images or image maps are gateway scripts that can be used to make an image region sensitive with hyperlinks pointing to different URLs. The imagemap program is public domain software written in C that provides the above functionality. The image map software requires an ASCII map file that contains coordinates defining the regions (polygons, rectangles or circles) and the corresponding URL to be fetched. The ASCII map file can be prepared using the software xv or mapedit, which are shareware and public domain software, respectively. Geographically-sensitive image maps were used in the present research to arrange spatial data sets.
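The lookup performed by an image map gateway can be sketched as below. This is not the imagemap program itself: the one-line-per-region map format used here (shape, URL, then x,y coordinate pairs) is a simplified assumption, the URLs in the usage example are hypothetical, and the polygon test is the common ray-casting method.

```python
def parse_map(text):
    """Parse map-file lines of the form: shape URL x,y x,y ..."""
    regions = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        shape, url, *coords = line.split()
        points = [tuple(float(v) for v in c.split(",")) for c in coords]
        regions.append((shape, url, points))
    return regions


def in_polygon(x, y, points):
    """Ray-casting test: toggle on each edge crossed to the right of (x, y)."""
    inside = False
    j = len(points) - 1
    for i in range(len(points)):
        xi, yi = points[i]
        xj, yj = points[j]
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


def contains(shape, points, x, y):
    """Return True if the click (x, y) falls inside the region."""
    if shape == "rect":
        (x1, y1), (x2, y2) = points
        return min(x1, x2) <= x <= max(x1, x2) and min(y1, y2) <= y <= max(y1, y2)
    if shape == "circle":
        # First point is the centre, second is any point on the edge.
        (cx, cy), (ex, ey) = points
        return (x - cx) ** 2 + (y - cy) ** 2 <= (ex - cx) ** 2 + (ey - cy) ** 2
    if shape == "poly":
        return in_polygon(x, y, points)
    return False


def lookup(regions, x, y, default=None):
    """Return the URL of the first region containing the click."""
    for shape, url, points in regions:
        if contains(shape, points, x, y):
            return url
    return default
```

For example, with a map file containing the line "rect /indiana/tippecanoe.html 0,0 10,10", a click at (5, 5) would be resolved to /indiana/tippecanoe.html.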
Many elements of an integrated WWW-based GIS data organization, browsing, analysis, and transfer system are discussed below. These elements allow the following interaction:
For example, suppose that the client is presented with an image map of the United States and selects Texas. This leads to a county map of Texas. At this point, the client may select a database either for the entire state or for a particular county. Suppose that the client selects Upshur County, Texas and that no smaller region is defined on the GIS data server. The client would then be given a list of maps available for this county. For the sake of this example, assume that a vector map of streams and a raster map of land use are available. The client could then start a GUI for a GIS (as a CGI program) and then display these maps or perform simple analyses.
After browsing data in this manner, suppose that the client deems the streams map valuable enough to make a personal copy. This vector map could then be downloaded from the server in SDTS format.
This type of interaction via WWW provides an excellent interface as well as world-wide access to a GIS data server. The elements of such a server necessary for the above example were developed. Implementation of each element is described below. First, a CGI interface to GRASS Lite, a graphical user interface to the GRASS GIS, as well as a display-only map-making interface, are described. Next, examples of data presentation/organization using image maps are given. Finally, an SDTS data conversion/transfer system is presented.
All of these interfaces are available at the following URL: http://ingis.acn.purdue.edu/
GRASS was chosen for this project for several reasons: (1) open file formats (Gardels, 1993 and Ireland, 1995), (2) freely-available source code (which allows for modifications and/or customizations), and (3) lack of licensing restrictions. Because GRASS is a public domain system, a WWW server using it does not require costly floating licenses. In addition, one of the aims of our GIS/WWW project was to redistribute data developed for other research, which existed in GRASS format.
As with similar software systems, graphical user interfaces (GUIs) have recently been developed to allow the user to run GRASS commands in a graphical and user-friendly environment. GRASS Lite (Zhuang and Engel, 1995) is a GRASS GUI developed by Xin Zhuang of Wyle Laboratories (Arlington, Virginia) using the Tcl/Tk toolkit (Ousterhout, 1994). Tcl/Tk is native to the X Window System, "a vendor-neutral, system-architecture neutral, network-transparent windowing and user interface standard" (MIT, 1991). Since the X Window System is a network-transparent system, graphical applications can be physically run on the CPU of one machine but displayed on another machine's monitor (perhaps located on another continent), as long as both machines are on the Internet and are running the X Window System.
Using the Common Gateway Interface, software was developed to integrate GRASS Lite on the Internet. To utilize GRASS Lite for display/analysis of data through the Web, clients page through a series of three documents, exchanging information with the server. Each document is created dynamically by a CGI program. The first document uses the FORMS option in the HTML language to allow the user to specify an X Window System display for graphical output. It also instructs the user to allow the server access to their display (using xhost).
The next document allows the user to select a GRASS data set (LOCATION). It also overcomes the restriction that a single user may not run GRASS concurrently (a built-in safeguard of GRASS for database integrity). To allow multiple sessions, a temporary HOME directory is created with a GRASS start-up file (.grassrc) and a data directory. The data directory has a symbolic link to the PERMANENT mapset and a sub-directory containing a default mapset named workspace. Clients have write-access to their workspace but read-only access to the PERMANENT mapset.
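The per-session directory layout might be created along the following lines. This is a sketch under stated assumptions rather than the gateway actually used: the function name make_session_home is hypothetical, the data directory is assumed to take the place of the GRASS LOCATION directory, and the three keys written to .grassrc (GISDBASE, LOCATION_NAME, MAPSET) follow the usual GRASS start-up file convention. Read-only access to PERMANENT is assumed to be enforced by file permissions on the server and is not shown.

```python
import os
import tempfile


def make_session_home(gisdbase, location, parent=None):
    """Create a private HOME with its own .grassrc and writable mapset."""
    home = tempfile.mkdtemp(prefix="grass-www-", dir=parent)
    # The data directory stands in for the GRASS LOCATION (assumption).
    data_dir = os.path.join(home, location)
    os.mkdir(data_dir)
    # Shared data: symbolic link to the server's PERMANENT mapset.
    os.symlink(
        os.path.join(gisdbase, location, "PERMANENT"),
        os.path.join(data_dir, "PERMANENT"),
    )
    # Private, writable mapset for this client.
    os.mkdir(os.path.join(data_dir, "workspace"))
    # Start-up file pointing GRASS at the temporary database.
    with open(os.path.join(home, ".grassrc"), "w") as rc:
        rc.write(f"GISDBASE: {home}\n")
        rc.write(f"LOCATION_NAME: {location}\n")
        rc.write("MAPSET: workspace\n")
    return home
```

The gateway would then set the HOME environment variable to the returned path before starting GRASS, so that each client session sees its own start-up file.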
The final document presented to the user executes GRASS Lite and provides helpful information to begin using the GUI as well as an e-mail address of the maintainer. As clients progress from the first to the third document/CGI program, information obtained from prior FORMS options is passed as hidden options in the HTML document. Each gateway program looks for specific items and encodes them in dynamically-created HTML documents.
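Carrying earlier answers forward as hidden form fields can be sketched as follows. The function names, the field names, and the action URL in the usage note are hypothetical; only the technique (hidden inputs in a dynamically generated form) comes from the description above.

```python
from html import escape


def hidden_fields(values):
    """Render earlier answers as hidden form inputs."""
    return "\n".join(
        f'<input type="hidden" name="{escape(name, quote=True)}" '
        f'value="{escape(value, quote=True)}">'
        for name, value in values.items()
    )


def next_form(action, carried, prompt, field):
    """Build the next document, carrying earlier answers forward."""
    return (
        f'<form method="POST" action="{escape(action, quote=True)}">\n'
        f"{hidden_fields(carried)}\n"
        f'{escape(prompt)} <input type="text" name="{escape(field, quote=True)}">\n'
        '<input type="submit" value="Continue">\n'
        "</form>"
    )
```

For instance, next_form("/cgi-bin/step2", {"display": "myhost:0"}, "LOCATION:", "location") would produce a form that asks for the LOCATION while silently resubmitting the display chosen in the first document.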
In addition to map display, this GIS GUI on WWW allows clients to query GRASS databases and perform algebraic manipulations of raster data before downloading. Additional functionality may be provided in future versions. Figure 1 shows an example GRASS Lite session that was initiated via the WWW.
Figure 1. GRASS Lite Session via a WWW Session
It should be pointed out that the clients actually access a version of the software that was modified such that obvious security holes have been removed (e.g., any GRASS Lite options that gave users access to the UNIX shell have been disabled). More complete versions of this software are available from the author, but these are not recommended for use via WWW without security-related modifications.
The previous example using GRASS Lite requires users to run the X Window System, which is not readily available to most PC users connected to the Internet. If users only need to view data (and not perform any geographic analyses), a platform-independent, display-only system would be useful. For this reason, a similar approach (using CGI scripts and the GRASS software) was followed to build a display-only system. This system is also accessible from the above URL.
After selecting the GRASS data set, the LOCATION is posted to a CGI script that reads the available raster, vector and site data layers from an ASCII file and creates an HTML document with forms. The user has the option to select one raster layer, multiple vector layers and one site layer. The selected data layers along with the location name are posted to another gateway script which creates an HTML document that allows the user to select options to compose the final map. The options specified are the functionalities available in ps.map (Carlson, 1994), the PostScript cartographic output program of GRASS. The selected data layers and the corresponding map compositions are sent to another URL which processes the arguments and develops a script file that can be redirected into ps.map. The script reads the PostScript file created and converts it to a format viewable by most Web browsers. The final raster image is displayed in an HTML document as shown in Figure 2.
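The layer-selection step can be sketched as a check of the posted choices against the list of available layers. The layout of the ASCII file is not given in this paper, so the one-line-per-layer format assumed here (layer type followed by layer name) is hypothetical, as are the function and layer names.

```python
def read_catalog(text):
    """Parse lines of the form '<type> <layer name>' into a dict of lists."""
    catalog = {"raster": [], "vector": [], "site": []}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in catalog:
            catalog[parts[0]].append(parts[1])
    return catalog


def check_selection(catalog, rasters, vectors, sites):
    """Return a list of problems; an empty list means the selection is valid."""
    problems = []
    # One raster layer and one site layer at most; any number of vectors.
    if len(rasters) > 1:
        problems.append("select at most one raster layer")
    if len(sites) > 1:
        problems.append("select at most one site layer")
    for kind, names in (("raster", rasters), ("vector", vectors), ("site", sites)):
        for name in names:
            if name not in catalog[kind]:
                problems.append(f"unknown {kind} layer: {name}")
    return problems


if __name__ == "__main__":
    catalog = read_catalog("raster county\nvector streams\nvector roads\nsite wells\n")
    print(check_selection(catalog, ["county"], ["streams", "roads"], ["wells"]))
```

Checking the posted names against the catalog before building the ps.map script also keeps arbitrary client input out of the commands run on the server.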
Figure 2. Result of Accessing and Displaying the Indiana Data Set County Raster Map
The approach described above can take a significant length of time (more than two minutes) to generate the desired map information and have it returned to the client Web browser (return time depends on the speed of the Internet connection). To overcome this constraint and to reduce the computational load on the server, the GRASS commands that display information were modified to directly create a file in GIF format. The GIF file can be directly displayed in the client WWW browser. A WWW form-based interface was written to provide access to this revised version of the GRASS display commands. The user selects the map or maps to be displayed, selects colors to be used, and provides text to be used as titles. This approach reduced the server computation time for creating requested maps by an order of magnitude as compared with the above approach.
For the GRASS GIS (as well as most UNIX-based software systems), documentation is usually provided in the form of man pages in roff format. This format is versatile in that it may be read on both ASCII and graphics devices. In addition to being the native markup language of Web browsers, HTML also has this characteristic. As demonstrations of the utility of HTML with respect to man pages, many have written conversion utilities for roff-to-HTML. GRASS man pages, however, were written for a slightly different set of roff macros than those used by most UNIX software. Therefore, a custom conversion program was written for GRASS man pages.
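The idea of a roff-to-HTML converter can be illustrated with a few of the standard man macros. This is not the custom converter written for the GRASS man pages, whose macro set is not described here; the function name, the handful of macros handled (.TH, .SH, .PP, .LP, .B, .I), and the sample page in the usage note are assumptions chosen for illustration.

```python
from html import escape


def man_to_html(source):
    """Convert a small subset of man-page macros to HTML."""
    out = []
    for line in source.splitlines():
        if line.startswith('.\\"'):  # roff comment line
            continue
        if line.startswith("."):
            macro, _, rest = line[1:].partition(" ")
            text = escape(rest.strip().strip('"'))
            if macro == "TH":
                out.append(f"<h1>{text}</h1>")
            elif macro == "SH":
                out.append(f"<h2>{text}</h2>")
            elif macro in ("PP", "LP"):
                out.append("<p>")
            elif macro == "B":
                out.append(f"<b>{text}</b>")
            elif macro == "I":
                out.append(f"<i>{text}</i>")
            # Any other request is ignored in this sketch.
        else:
            # Ordinary text lines are passed through with HTML escaping.
            out.append(escape(line))
    return "\n".join(out)
```

Given a page that begins with ".TH r.mapcalc 1" and ".SH NAME", the function would emit a first-level heading followed by a second-level heading, with ordinary text lines passed through unchanged apart from escaping.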
The result was WWW-based documentation for GRASS commands. In addition to the benefits described earlier, WWW-based documentation ultimately becomes indexed in large search systems (e.g., WebCrawler by America Online and Lycos (http://www.lycos.com/)). This indexing points potential users to GRASS when searching WWW pages (e.g., a search for map algebra may point clients to GRASS).
An example illustrating this type of visual search mechanism utilized TIGER data for the State of Indiana. Since TIGER data are stored by counties, an image map of county boundaries in Indiana was created. By selecting the county of interest, a client could download all TIGER line data for that county.
The SDTS format was chosen in the development of a mechanism to make GRASS data available to users (who may or may not have GRASS, but whose chosen GIS will likely be able to import data in this format) through WWW. A series of CGI programs, similar to those described in the previous sections, allow clients to select any vector map in any available GRASS data set. The final CGI document runs the GRASS command v.sdts.out (Stigberg and Qian, 1995), creates a tar archive of the results, and sends the tar file back to the client. The first line of the output for this is:
Web browsers configured to recognize this type of MIME-encoding will present users with a dialog box asking for a file name to save the data.
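The archiving step of this gateway can be sketched with the tarfile module of the Python standard library. The invocation of v.sdts.out is not shown because its arguments are not given here; the sketch assumes only that the exported files have already been written to a directory, and the function name and file name are hypothetical.

```python
import io
import os
import tarfile
import tempfile


def tar_directory(export_dir):
    """Return the bytes of a tar archive holding every file in export_dir."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name in sorted(os.listdir(export_dir)):
            path = os.path.join(export_dir, name)
            if os.path.isfile(path):
                # Store files under their bare names, not server paths.
                archive.add(path, arcname=name)
    return buffer.getvalue()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for a file produced by the export step.
        with open(os.path.join(tmp, "streams.ddf"), "wb") as f:
            f.write(b"placeholder")
        print(len(tar_directory(tmp)), "bytes in archive")
```

Building the archive in memory avoids leaving temporary tar files on the server; for very large data sets, streaming the archive to standard output would be a more suitable design.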
As with other data handling procedures on the Internet, such as FTP, gopher, etc., it is always advisable for users to be aware of the data they are downloading, and more importantly, that others may be able to peek at data streams. For users of these interfaces, this may be of little concern since geographic information downloaded from public servers is rarely sensitive in nature. However, for sensitive data, encryption procedures (such as PGP (Zimmermann, 1995)) are an option.
Several World-Wide Web (WWW)-based interfaces for GIS data sharing and software access were discussed. The GRASS Lite graphical interface to the GRASS GIS was made available via WWW using the CGI interface. This facilitated access to the geographic data sets and allowed simple analyses to be performed without actually downloading data or software. A platform-independent, display-only map creation interface provided a good browsing facility for potential data consumers. The data conversion capabilities of GIS were used to demonstrate the possibility of sharing the data in different formats, particularly in the Spatial Data Transfer Standard (SDTS) format. Presentation of spatial data using image maps allowed users to reach particular data sets quickly and efficiently. These interfaces demonstrated the capacity to view, manipulate, and distribute geographic data via WWW in an efficient, organized, and user-friendly manner. These interfaces are a major advancement in information sharing for GIS data.
Future work may evaluate additional GIS-related applications via WWW. Automated techniques for locating, accessing and using disparate spatial data (or information derived from these data) that are distributed on the Internet are needed. Such techniques will facilitate the development of complex decision support systems and GIS applications that are capable of providing the information required to assist in solving a wide variety of problems. In the more immediate future, other similar interfaces can be developed to extract data from RDBM systems. Other GIS software, such as ARC/INFO, can be integrated on the Internet and the usage of this facility can be restricted to a particular group by taking advantage of the access authorization facility in HTTP. Access authorization and payment mechanisms may also be used to restrict access to particular data to support costly data development, software licenses, and system maintenance.
ASCII: American Standard Code for Information Interchange (ISO 646)
CGI: Common Gateway Interface
FTP: File Transfer Protocol
GIS: Geographic Information System
GRASS: Geographic Resources Analysis Support System
GUI: Graphical User Interface
HTTP: Hypertext Transfer Protocol
HTTPD: Hypertext Transfer Protocol Daemon
MIME: Multipurpose Internet Mail Extensions (RFC 1341)
PERL: Practical Extraction and Report Language
PGP: Pretty Good Privacy
SDTS: Spatial Data Transfer Standard (FIPS 173)
TCP/IP: Transmission Control Protocol/Internet Protocol
TIGER: Topologically Integrated Geographic Encoding and Referencing (U.S. Census Bureau)
URL: Uniform Resource Locator
WWW: World-Wide Web
Albiltz, P. and Liu, C. 1992. DNS and BIND, A Nutshell Handbook, Mar 1993 edn, O'Reilly & Associates, Inc., Sebastopol, Calif.
Berners-Lee, T. 1994a. HTTP: A protocol for networked information, Internet Draft. Internet Engineering Task Force.
Berners-Lee, T. 1994b. Uniform resource locators, a syntax for the expression of access information of objects on the network, Internet Draft. Internet Engineering Task Force. W3 Consortium and MIT Laboratory for Computer Science, 545 Technology Square Cambridge, Massachusetts.
Berners-Lee, T. and Connolly, D.W. Hypertext markup language - 2.0, Internet Draft. Internet Engineering Task Force. W3 Consortium and MIT Laboratory for Computer Science, 545 Technology Square Cambridge, Massachusetts.
Borenstein, N.S. and Freed, N. 1992. Multipurpose internet mail extension (MIME), Internet RFC-1341, Internet Engineering Task Force.
Carlson, P. 1994. ps.map: software for cartographic map creation. GRASS 4.1 Reference Manual, U.S. Army Corps of Engineers, Construction Engineering Research Laboratories, Champaign, Ill.
Comer, D.E. 1995. The Internet Book, Prentice Hall, Englewood Cliffs, N.J.
Gardels, K. 1993. What is open GIS?, GRASSCLIPINGS: The Journal of Open Geographic Information Systems 7(1):40.
Hunt, C. 1992. TCP/IP Network Administration, A Nutshell Handbook, May 1994 edn, O'Reilly & Associates, Inc., Sebastopol, Calif.
Ireland, E. 1995. Data access is the path to mainstream mapping applications, Geo Info Systems 5(5):61-62. OpenGIS Special Section.
McCool, R. 1995. The common gateway interface, Software available from the National Center for Supercomputing Applications at the University of Illinois in
MIT 1991. The X Window System, version 11, release 5 edn, Massachusetts Institute of Technology.
Montulli, L. 1995. Lynx users guide version 2.4, Software available from the University
NCSA httpd 1.4, Software available from the National Center for Supercomputing Applications at the University of Illinois in
NCSA mosaic, Software available from the National Center for Supercomputing Applications at the University of Illinois in
Netscape Communications Corporation 1995. Welcome to Netscape, Software available from Netscape Communications Corporation, 501 E. Middlefield Rd., Mountain View,
NIST 1992. Spatial data transfer standard, Federal Information Processing Standard Publication 173. National Institute of Standards and Technology, U.S. Department of Commerce.
Ousterhout, J.K. 1994. Tcl and the Tk Toolkit, Addison-Wesley.
Postel, J. and Reynolds, J. 1985. File transfer protocol (FTP), Internet RFC-959, Internet Engineering Task Force.
Stigberg, D. and Qian, T. to appear 1995. v.sdts.out: software for exporting SDTS data. GRASS 4.2 Reference Manual, U.S. Army Corps of Engineers, Construction Engineering Research Laboratories, Champaign, Ill.
Zhuang, X. and Engel, B.A. 1995. Tcl/Tk GUI toolkit offers cross-platform application development, GIS World 8(7):58-60.
Zimmermann, P. 1995. The Official PGP User's Guide, MIT Press.
James Darrell McCauley, Engineer (formerly with Purdue University) Case Corporation email: firstname.lastname@example.org Telephone: 708.887.2055
Kumar C. S. Navulur, Research Associate, Agricultural and Biological Engineering Purdue University W. Lafayette, IN 47907-1146 email: email@example.com Telephone: 317.494.1196 Fax: 317.496.1115
Bernard A. Engel, Associate Professor, Agricultural and Biological Engineering Purdue University West Lafayette, IN 47907-1146 email: firstname.lastname@example.org Telephone: 317.494.1198 Fax: 317.496.1115
Raghavan Srinivasan, Agricultural Engineer and Associate Research Scientist Blackland Research Center Texas Agricultural Experiment Station Temple, TX 76502 email: email@example.com Telephone: 817.770.6670 Fax: 817.770.6678