FRIEND - FRamework for the Integration of ENvironmental and geographical Data

Martin Brändli and Andreas Ernst
Spatial Data Handling Division
Department of Geography
University of Zurich
Switzerland
brand@geo.unizh.ch
aernst@geo.unizh.ch
 
Different concepts and developments in managing persistent data (flat files, relational and object-oriented databases) have led to significant obstacles for the development of interoperable systems. This is especially true for the domain of geographical information systems (GIS). Currently available GIS are very similar with respect to their architecture and functionality. Nevertheless, they are not compatible with each other concerning the data they manage. Most systems distinguish between the management of geometrical and descriptive data. For performance reasons, geographical data are mainly stored in proprietary databases. Consequently, interoperation between different GIS is very difficult, since many different interfaces have to be supported. FRIEND (Framework for the integration of environmental and geographical data) is an interdisciplinary project in the area of GIS. Its main objective is the integration of heterogeneous, space-related data repositories under a common roof. In particular, it aims at solving (or at least reducing) the integration problem posed by the necessity to develop and maintain many interfaces. This paper outlines the integration approach used in FRIEND. Starting from a framework that approaches data integration from a general database technology view, it specifically describes the integration of geodata components using an approach based on OGIS (Open Geodata Interoperability Specification), implemented with the Java programming language.
 
Typically, there are four choices for bringing together data from different sources: migration (physical transfer of data from different systems to one new system), data transfer between two systems, use of common data catalogs, and federated database management systems (FDBMS). FRIEND is based on the last option. It allows the integration of data without migration and is thus called "logical integration". Within an FDBMS, the local database management systems (the components of the federation) keep their autonomy. Some of the issues of the architectural design of this logical integration will be discussed throughout the paper. FRIEND aims at developing a generic solution for the integration problem. This generic solution is implemented as a framework that is adjustable to particular situations. It utilizes a set of interoperable objects following an object-oriented approach, and is implemented as a layer connected to an object-oriented database management system, which handles data that have to be stored on the global level of the federation. The data model used in the framework is an ODMG-compatible (Object Database Management Group) object-oriented model.
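The idea of logical integration can be illustrated with a minimal Java sketch. It is not the FRIEND framework itself: the names Component, InMemoryComponent, and Federation, as well as the sample object types, are assumptions made purely for illustration. The sketch shows only the central point, namely that the global level forwards requests to autonomous components instead of copying their data.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; none of these types are taken from FRIEND or ODMG.

/** A local system that stays autonomous; the federation only reads through this interface. */
interface Component {
    String name();
    List<String> objectsOfType(String type);
}

/** Stand-in for a local database; it keeps its own storage. */
final class InMemoryComponent implements Component {
    private final String name;
    private final List<String[]> rows = new ArrayList<>(); // each row is {type, id}

    InMemoryComponent(String name) { this.name = name; }

    InMemoryComponent add(String type, String id) {
        rows.add(new String[] {type, id});
        return this;
    }

    @Override public String name() { return name; }

    @Override public List<String> objectsOfType(String type) {
        List<String> result = new ArrayList<>();
        for (String[] row : rows) {
            if (row[0].equals(type)) result.add(row[1]);
        }
        return result;
    }
}

/** Global level: forwards each query to every component; no data is migrated. */
final class Federation {
    private final List<Component> components = new ArrayList<>();

    void register(Component c) { components.add(c); }

    List<String> query(String type) {
        List<String> result = new ArrayList<>();
        for (Component c : components) {
            for (String id : c.objectsOfType(type)) {
                result.add(c.name() + ":" + id); // qualify ids by their source component
            }
        }
        return result;
    }
}

final class FederationDemo {
    public static void main(String[] args) {
        Federation federation = new Federation();
        federation.register(new InMemoryComponent("componentA").add("site", "S1"));
        federation.register(new InMemoryComponent("componentB").add("site", "S2").add("station", "T1"));
        System.out.println(federation.query("site"));
    }
}
```

In this sketch the federation holds no copies of component data; a real FDBMS would additionally need global schemata, transactions, and consistency handling, which are omitted here.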
 
In order to integrate GIS-components on the logical level, the object-oriented data model has to be extended by geographical data types and methods. Since one of the goals of the FRIEND-project is to reach high conformity with international standards, the integration of and access to GIS-components is based on OGIS interfaces. As a consequence, the object-oriented data model used for the integration layer has to be completed with OGIS-compliant data types and concepts. Although the integration layer is developed as generically as possible, we start the implementation of the integration of GIS-components and geographical data from a real world integration problem which exists at the municipality of the city of Zurich (Switzerland), and which may be comparable to the situation in many other urban administrations. During the past thirty years the different institutions of the municipality (surveyor's office, water and energy supply utilities, etc.) developed their own spatial data handling solutions, leading to a heterogeneity of systems and data models in conjunction with multiple and inconsistent data collections of the same objects. Today, this situation is no longer acceptable, and the joint use of the data is required. The most important demand for data integration concerns the supply utilities in conjunction with the surveyor's office, since the supply utilities heavily depend on consistent survey data. The complexity of the present data as well as the apprehension of losing autonomy make a real world integration application difficult at present. However, the knowledge of the situation acquired by an analysis of the current state of the involved institutions motivated us to implement a so-called "miniworld", which simulates the actual situation while substantially reducing the complexity and size of the data and the number of involved institutions.
 
This miniworld is designed in the following manner: The federated system consists of two geodata components, namely the surveyor's office, exposing objects such as parcels, buildings, and landmarks, and the water supply utility, exposing features such as reservoirs, pipes, pumps, consumer sockets, and sleeves. In order to set up a heterogeneous environment, the surveyor's office is implemented with an object-oriented database system and the water supply utility with a relational one. The realization of the miniworld involves four steps, each of them coming closer to the real world situation. The realization follows the five-level schema architecture for an FDBMS proposed by Sheth and Larson (1990):

  1. The two components (survey and water supply) are implemented with the programming language Java, storing the data in flat files. They will act as servers of geographical features. Since both components are expressed in an object-oriented data model, there is no difference between the data models of the two components at this stage. Consequently, local and component schemata are identical.
  2. An export schema is generated for each component, that is, each object intended to be exported is extended by an OGIS-interface.
  3. A viewer is implemented which is able to access geographical objects of both components via the OGIS-interfaces. This viewer is the client part of the miniworld.
  4. The local components are differentiated. The water supply component will store its data in a relational environment, whereas the survey component continues to be based on an object-oriented data model. This differentiation requires an additional schema translation for the water supply component, since the local schema and the component schema are no longer identical.

The main benefit of this incremental and pragmatic approach is that the access mechanisms via OGIS-interfaces used in the miniworld implementation will finally be embodied in the integration layer described above. The current research concentrates on technical issues concerning the Java-based communication between different platforms, and on modeling issues that arise from the mapping of the miniworld data types to the open geodata model types of OGIS. For both concerns, the proposed miniworld serves as an ideal testbed to verify the completeness and adequacy of the concepts and components proposed by OGIS. Since the miniworld is characterized by reduced complexity and amount of data, future work has to prove the adequacy of the chosen approach by using real world data. In addition, the miniworld will act as an application to explore geographical consistency requirements which arise at the global level of federated database management systems.
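The four miniworld steps can be sketched in Java as follows. This is only an illustration under stated assumptions: ExportedFeature is a hypothetical placeholder and not the actual OGIS interface, and the classes Parcel, PipeRow, PipeFeature, and Viewer as well as their fields are invented for this example. The sketch shows how a relational row of the water supply component might be translated so that a viewer can treat it exactly like an object of the survey component.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: ExportedFeature is a placeholder, NOT the real OGIS interface.

/** Common export interface offered by both components (step 2). */
interface ExportedFeature {
    String id();
    String featureType();
    double[] coordinates(); // flat list of x,y pairs
}

/** Survey component: local objects are object-oriented and implement the interface directly. */
final class Parcel implements ExportedFeature {
    private final String id;
    private final double[] outline;

    Parcel(String id, double... outline) {
        this.id = id;
        this.outline = outline.clone();
    }

    @Override public String id() { return id; }
    @Override public String featureType() { return "parcel"; }
    @Override public double[] coordinates() { return outline.clone(); }
}

/** Water supply component after step 4: one row of an assumed relational table (local schema). */
final class PipeRow {
    final int pipeNo;
    final double x1, y1, x2, y2;

    PipeRow(int pipeNo, double x1, double y1, double x2, double y2) {
        this.pipeNo = pipeNo;
        this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
    }
}

/** Schema translation: wraps a relational row so that it appears as an exported object. */
final class PipeFeature implements ExportedFeature {
    private final PipeRow row;

    PipeFeature(PipeRow row) { this.row = row; }

    @Override public String id() { return "pipe-" + row.pipeNo; }
    @Override public String featureType() { return "pipe"; }
    @Override public double[] coordinates() {
        return new double[] {row.x1, row.y1, row.x2, row.y2};
    }
}

/** Viewer (step 3): depends only on ExportedFeature, not on how a component stores its data. */
final class Viewer {
    static List<String> describe(List<ExportedFeature> features) {
        List<String> lines = new ArrayList<>();
        for (ExportedFeature f : features) {
            int vertices = f.coordinates().length / 2;
            lines.add(f.featureType() + " " + f.id() + " (" + vertices + " vertices)");
        }
        return lines;
    }
}
```

Because the viewer depends only on the common interface, differentiating the storage of one component changes the translation class but, in this sketch, not the client code.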

Sheth, A. P., and J. A. Larson (1990): Federated database systems for managing distributed, heterogeneous, and autonomous databases. ACM Computing Surveys, 22 (3), 183-236.