Multi-server Internet GIS: Standardization and Practical Experiences

Carel van den Berg, Frank Tuijnman, and Tom Vijlbrief
Professional GEO Systems
Damrak 44
1012 LK Amsterdam
The Netherlands
E-mails: frank@pgs.nl and tom@pgs.nl

and

Co Meijer, Harry Uitermark and Peter van Oosterom
Cadastre
P.O.Box 9046
7300 GH Apeldoorn
The Netherlands
phone +31-55-5285806 or +31-55-5285163
fax +31-55-3557931
E-mails: uitermark@kadaster.nl, and oosterom@kadaster.nl
 
In this paper we present an approach to an open infrastructure for geographic information on the Internet. This infrastructure enables data providers to publish their data independently, while enabling end-users to access data from several providers simultaneously and to integrate the data locally in a geographic browser. Our goal is that end-users find accessing geographic information in this environment as easy as working with a state-of-the-art GIS package that has all the data of interest on their own computer. The key elements that are required are: a common format for publishing meta-data on each server, a common SQL-derived query protocol, standard file formats, and standard certificate-based authentication procedures for access control and (optionally) billing. An experiment with this approach has been carried out with three data providers in Holland: the Dutch Kadaster, the municipality of Almere, and the cable-tv company Casema. In this paper we present the major design decisions, the choices that we made for the prototype environment, and the relationship to ongoing specification and standardization processes for geographic data, in particular the proposed European CEN standards and the recently accepted specifications from the OpenGIS consortium.

INTRODUCTION

The current wave of GIS software for the Internet is based on the file downloading paradigm, on the picture paradigm (presenting a map as a JPEG picture), or on the client-server paradigm (creating a closed interaction between a client and a single server). None of these approaches can capitalize on the main potential of the Internet: integrated and easy access to a vast amount of geographic information on various servers. In addition, the interaction protocol between client and server is typically proprietary, which means that someone who browses geographic information needs software from the same vendor as the publisher uses.

To make GIS popular on the Internet one needs to create for geographic information the same level of uniformity as the World Wide Web has created for text. The brilliance of the World Wide Web lies in the combination of the hypertext model with the Internet, together with a formatting standard for text (HTML). The hypertext model, however, does not work for geographic data: it is not particularly useful to jump from one map to the next. So another basic metaphor has to be used.

The standard model for geographic data on the computer is the layer model. The layer model of geographic information systems relates to a paper map as the hypertext model relates to text on paper. To make geographic data on the Internet attractive one has to set a standard for the layer model, so that we can obtain a topographic layer from one source, a pollution layer from another, and a property layer from a third source, and dynamically merge them in a geo-web browser.

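To achieve this, the following standard protocols have to be defined (in addition to support for authentication and billing): a method for querying the meta-data, a protocol for querying the geographic information, and a format for returning the geographic information.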

Many aspects of these protocols are subject to ongoing standardization efforts. The format for returning geographic information is essentially a description of a file format. The method for querying the meta-data has a clear relationship with the meta-data standards and with clearinghouse related activities. The protocol for querying geographic information is new. The CEN has acknowledged the need for something like this (see CEN/TC 287, which specifies names and semantics of required spatial operators), but so far no complete proposal exists. The closest relevant specification for the query protocol is the OGC specification for SQL with simple geometric features. Although it is relatively easy to identify protocols that address part of the problem, no comprehensive proposal exists so far that can be used to achieve open access to geographic data publishers on the Internet.

ARCHITECTURE

Our approach to the design of the meta-data structure, the query formalism and the format for the returned data is based on the object-relational formalism, where we include geographic features as attributes (that have a geometric type) within a relational table.

The motivation for this approach is based on the following considerations:

1) Object relational database management systems with support for geographic data are now available from most major vendors (Informix, CA-Ingres, Oracle). Even if one wants to provide access to a file based collection of geographic data, it is not difficult to implement a limited selection capability on top of it, though of course performing such selections will take more time. So it is technically feasible for any organization to implement this functionality.

2) The object relational model is currently the only widely available formalism that can deal with geographic data, and in which all three required elements (a meta-data structure, a query formalism, and a format for returned data) are defined in an integrated manner. This is essential from a technical point of view: the meta-data does not just describe the data, it also has to provide the 'words' that can be used in queries, and it has to be clear which words can replace which syntactic element in a query. The returned data has to be understood as a response to the query, so there has to be a well defined relationship between the semantics of queries and the actual data that is returned.

3) It can be mapped easily to the stateless http protocol, because SQL is also stateless (meaning that any request can be handled independently from previous or subsequent requests).

4) It ensures that the browser requires only knowledge about which data is available, rather than detailed knowledge about file naming conventions, and tiles.
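
Consideration 2 can be illustrated with a small Python sketch in which the meta-data acts as the vocabulary for queries. This is an illustration under stated assumptions, not the structure used in the prototype: the relation and attribute names are borrowed from the example request given later in this paper, while the attribute types, the `Relation` class and the `check_request` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """Meta-data for one published relation (hypothetical structure)."""
    name: str
    attributes: dict  # attribute name -> type name


# Hypothetical meta-data for one server. The names come from the example
# request in this paper; the type names are assumptions for illustration.
METADATA = {
    "percelen": Relation(
        "percelen",
        {
            "magma_oid": "integer",
            "geo_bbox": "rectangle",
            "geo_pgn": "polygon",
            "owners": "text",
        },
    )
}


def check_request(metadata, relation, attributes):
    """Return a list of problems; an empty list means every word is known."""
    rel = metadata.get(relation)
    if rel is None:
        return [f"unknown relation: {relation}"]
    return [
        f"unknown attribute: {name}"
        for name in attributes
        if name not in rel.attributes
    ]


print(check_request(METADATA, "percelen", ["owners", "geo_pgn"]))
print(check_request(METADATA, "percelen", ["price"]))
```

A browser that has retrieved such meta-data can offer the user only the relations and attributes that exist, and a server can use the same check to reject requests that use unknown words.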

In mapping extended SQL to http, in such a way that it can be used effectively for geographic data publication over the Internet, a number of issues have to be tackled:

  1. Geometric types and their semantics have to be defined.
  2. A standard method has to be defined to formulate a query.
  3. A format has to be defined for the returned data.
  4. A safe and sufficient version of SQL has to be defined, in order to be sure that a server cannot be crashed by a malicious client.
  5. Authentication, and billing have to be included to support commercial geographic data publication.
  6. Compression has to be incorporated, in order to be able to transfer geographic data effectively over the Internet.
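
Issue 4 in the list above can be illustrated with a Python sketch that accepts a where clause only if every term matches a small whitelist. The grammar below is an assumption modelled on the example request in this paper; it is not the safe SQL subset of the prototype, and the names `SPATIAL_TERM`, `COMPARE_TERM`, `MAX_TERMS` and `is_safe_where` are hypothetical.

```python
import re

# Assumed grammar for a single where-term, modelled on the example request.
# The actual safe SQL subset is not specified in this text.
SPATIAL_TERM = re.compile(r"WRectangle\.intersects\(-?\d+,-?\d+,-?\d+,-?\d+\)")
COMPARE_TERM = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(<=|>=|=|<|>)'[^']*'")

MAX_TERMS = 8  # hypothetical upper bound on the number of terms per request


def is_safe_where(where: str) -> bool:
    """Accept the clause only if every '&and&'-separated term is whitelisted."""
    terms = where.split("&and&")
    if len(terms) > MAX_TERMS:
        return False
    return all(
        SPATIAL_TERM.fullmatch(term) is not None
        or COMPARE_TERM.fullmatch(term) is not None
        for term in terms
    )


example = (
    "WRectangle.intersects(189000,485000,192000,488000)"
    "&and&owners>='oost'&and&owners<='oostf'"
)
print(is_safe_where(example))
print(is_safe_where("owners>='x';drop table percelen"))
```

Whitelisting the permitted forms, rather than trying to recognize dangerous ones, keeps the server's exposure limited to what the grammar explicitly allows.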
The full paper describes each of these aspects in detail; here we give just an example of a possible http request to a server:
//ooa.kadaster.nl/cgi-bin/magma?coordsys=rdm&database=kad4&relation=percelen&
attributes={magma_oid,geo_bbox,geo_pgn,owners}&
where=WRectangle.intersects(189000,485000,192000,488000)&and&
owners>='oost'&and&owners<='oostf'
This query requests the four named attributes (magma_oid, geo_bbox, geo_pgn, owners) for all parcels in the kad4 database that intersect the selected region and whose owner name lies between 'oost' and 'oostf'. The coordinates are given in RDM (the state plane coordinates that are used as a standard in Holland). As a result, a list of the tuples that match the where clause is returned.
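
As an illustration, the example request above can be assembled from its parts with a few lines of Python. This is a minimal sketch that reproduces the request text exactly as printed; the function names are hypothetical, and a real client would presumably have to percent-encode characters such as the braces and quotes, which is not shown here.

```python
def rectangle_term(xmin, ymin, xmax, ymax):
    """Spatial term selecting features that intersect a rectangle."""
    return f"WRectangle.intersects({xmin},{ymin},{xmax},{ymax})"


def range_terms(attribute, low, high):
    """Two comparison terms that bound an attribute between low and high."""
    return [f"{attribute}>='{low}'", f"{attribute}<='{high}'"]


def build_request(endpoint, coordsys, database, relation, attributes, where_terms):
    """Join the parts in the order used by the example request."""
    attribute_list = "{" + ",".join(attributes) + "}"
    where = "&and&".join(where_terms)
    return (
        f"{endpoint}?coordsys={coordsys}&database={database}"
        f"&relation={relation}&attributes={attribute_list}&where={where}"
    )


url = build_request(
    endpoint="//ooa.kadaster.nl/cgi-bin/magma",
    coordsys="rdm",
    database="kad4",
    relation="percelen",
    attributes=["magma_oid", "geo_bbox", "geo_pgn", "owners"],
    where_terms=[rectangle_term(189000, 485000, 192000, 488000)]
    + range_terms("owners", "oost", "oostf"),
)
print(url)
```

Because every request carries the database, relation, attributes and selection criteria in full, the server needs no memory of earlier requests to answer it.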

TRIAL EXPERIENCES

A trial with this approach has been carried out in the province of Flevoland, where three organizations, the municipality of Almere, the cable-tv company Casema, and the Cadastre, have implemented a system that allows them to access the data of the other parties directly. The data published by the Cadastre consists of the parcel boundaries. The data from the municipality consists of a large scale topographic map and large scale topographic plans for new neighbourhoods (Almere builds about 3000 new houses annually). Casema has published the cable locations, both planned and existing. For browsing, the Java based Lava GIS browser from PGS is used, and on the server side the Magma GeoData publisher is used to interface between the http requests and the various geographic datastores (Ingres 2.0 for the Cadastre, Illustra for the topographic data, and flat DXF files for Casema). Each of the three partners has installed its own server, connected to the Internet, which provides spatial data on request.

In this trial the feasibility of dynamic data integration over the Internet has been demonstrated, supporting both raster and vector data at the client side, and using a Java based GIS browser to give everyone direct access. Secure communication and paid access to data are aspects that will be evaluated in the next phase of the trial.

CONCLUSION

To support open access to geographic data over the Internet three related protocols need to be defined.

The solution proposed in this article is based on the object relational model (with geometric types). Simple SQL requests are encoded as URLs for http, and the result is returned as a list of tuples with geometric attributes.

The trial, with the Lava/Magma software from PGS, has demonstrated that it is possible in practice to implement this approach, and to achieve in this manner open, easy and integrated access to data from different organizations. As such it can be the basis for a public infrastructure that allows each organization to publish its data independently, while at the same time enabling clients, both professionals and citizens, to have integrated access through the Internet to all available geographic information.