See information about the Specialist Meeting held at the International Conference and Workshop on Interoperating Geographic Information Systems, December 1997.
In GIS, this problem of monolithic software has been a significant impediment to the rapid implementation of analytic tools. New functions can be added to existing packages only with the cooperation of the package’s developer, and with a new compilation of code. The research community is unable to use such "closed" and monolithic systems for rapid prototyping of new ideas, since such activities would require access to the source code and knowledge of internal data structures. As a result, developers of new analytic methods and designers of new scientific models are likely to choose open development environments, such as general-purpose programming languages, rather than GIS platforms to test their ideas--and it can take many years before simple, widely used and tested methods of spatial analysis appear as standard features in the popular GIS’s. Since its inception, NCGIA has been concerned with the need for more accessible prototyping environments, and faster implementation, in its desire to promote GIS as a spatial analysis tool. GRASS frequently served as a rapid prototyping environment because of its open, public-domain status, but that route was only available to users of raster data models, and now appears to be closing.
The advent of an "open" software design philosophy changes this situation dramatically. Rather than making an a priori decision about what functionality to incorporate into a piece of software, an open environment would allow users to combine components (functions or processes) in an ad hoc manner. This not only gives users great flexibility but also allows them to focus on the particular tasks they want to perform. Furthermore, it promotes consistency, since users can use their favorite components for any task, independent of what application they are running. We argue that such open, modular environments are likely to be much more favorable to the rapid advance of GIS as a research platform.
Interoperability attempts to make software systems that are based on different data models work together. The interoperating software systems could be two or more spatial databases or GISs, or a GIS with a spreadsheet and a statistical analysis package. Much of the technical detail of format conversions and of transfers between apparently incompatible platforms would be handled invisibly by the system, since the information necessary to complete such operations successfully should be available directly, without user intervention.
Consider the following example. We wish to evaluate the query "determine the average rainfall in each of the 48 contiguous states". We have available data on the spatial distribution of rainfall, which we conceive as a continuous surface, and on the boundaries of states. Because both inputs are conceived as fields, with respectively a single value of the variables rainfall and "state" at every point within the common boundary of the states, no further specification should be necessary--the process by which the results will be obtained is sufficiently defined. This would be true, for example, if the query were handled by traditional, non-digital methods by giving instructions to an assistant.
In practice, however, in the current state of GIS development, considerable further specification is needed. Even if both data sets are located on the same platform, and accessible by the same GIS, we will still need to specify assorted conversions from or to raster and vector, changes of projections, overlay, calculations, etc. Yet all of this further specification is in principle redundant. Useful further information on the reliability of the results, which might have been available from a statistical analysis, would not be offered by the GIS, though it is dependent on some of the additional parameters, for example raster cell size.
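To make the amount of hidden specification concrete, the following minimal sketch spells out the steps a user currently has to supply for a query of this kind. It is an illustration under stated assumptions, not a description of any particular GIS: the function names, the tiny rainfall grid, and the two rectangular "states" are all hypothetical, both data sets are assumed to share one planar coordinate system, and the vector-to-raster conversion is reduced to a point-in-polygon test on cell centres.

```python
from collections import defaultdict


def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray running right from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def zonal_mean(raster, origin, cell_size, zones):
    """Average the raster values that fall inside each zone.

    raster:    list of rows; row 0 is taken to be the southernmost row
               (an assumption of this sketch)
    origin:    (x, y) of the lower-left corner of the raster
    cell_size: edge length of a square cell, in the same units as origin
    zones:     dict mapping a zone name to its polygon vertex list
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    x0, y0 = origin
    for r, row in enumerate(raster):
        for c, value in enumerate(row):
            # Explicit step 1: vector-to-raster conversion, here done by
            # assigning each cell centre to the polygon that contains it.
            cx = x0 + (c + 0.5) * cell_size
            cy = y0 + (r + 0.5) * cell_size
            for name, polygon in zones.items():
                if point_in_polygon(cx, cy, polygon):
                    # Explicit step 2: overlay and accumulation.
                    sums[name] += value
                    counts[name] += 1
                    break
    # Explicit step 3: the calculation itself.
    return {name: sums[name] / counts[name] for name in sums}


# Hypothetical data: a 2 x 4 rainfall grid and two square "states".
rainfall = [
    [10.0, 10.0, 30.0, 30.0],
    [10.0, 10.0, 30.0, 30.0],
]
states = {
    "West": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "East": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
result = zonal_mean(rainfall, origin=(0.0, 0.0), cell_size=1.0, zones=states)
print(result)
```

Every argument other than the two data sets (the origin, the cell size, the choice of cell centres as representative points) is a decision that the conceptual query never asked for, yet the cell size in particular influences the reliability of the answer.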
This example illustrates how interoperability could be achieved within the set of field representations offered by many current GIS, with the effect of reducing the complexity of the operation for the user, and simplifying the process of learning about GIS. On this basis we could argue that the specification of the overlay function, currently taught as a cornerstone of GIS education, is in fact always redundant. Kemp (1993) and Vckovski (1996) have both discussed this potential for interoperability between field representations.
A further level of interoperability concerns the easy transfer of information between systems. In the previous example, it might be that the rainfall data exists as an IDRISI file on System A running DOS, and the state boundaries as a polygon coverage in ARC/INFO on System B running Unix. The same arguments apply--it should be possible for the systems to anticipate all of the instructions that would have to be given to overcome the current lack of interoperability between them. The term "featurism" is sometimes used to describe this tendency of system designers to require excessive specification of operations, and the consequent excessive complexity of command languages and user interfaces.
The software industry (particularly Microsoft and Apple) has been moving fast in making this grand vision of interoperability happen with the first rudimentary support structures for simple inter-operations. Different interoperability models are available or under development, such as OLE (Object Linking and Embedding), CORBA (Common Object Request Broker Architecture), and OpenDoc. In the desktop market, we find the first spreadsheets, word processors, and spelling checkers that make use of them, and allow users to combine these components and move data around, or apply the desired processes. Users of word processing packages are already familiar with such rudimentary forms of interoperability, since a user of Microsoft Word may now be able to share documents with a colleague without any concern for the compatibility between the respective word processors--there is increasing interoperability not only between word processing software but between operating systems as well.
While the necessary steps towards making GIS software components work together are seen largely as a software-engineering problem, there are deeper semantic problems that are rooted in the use of different spatial data models. This proposed research initiative will focus on the theoretical aspects of interoperability for GIS.
Interoperability is not new to GIS. One of the earliest examples of the concept of interoperation for GIS is the conversion between different map projections. In order to perform, say, an overlay of two data sets, the coordinates in the two data sets are expected to be in the same coordinate reference system; otherwise, the numerical processes of calculating line intersections will yield surprising results. Intelligent data sets would know about their map projections, and intelligent operations would know how to make themselves compatible.
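The idea of data sets that know their own projection can be sketched in a few lines. The example below is only an illustration under assumptions: the `PointSet` class, the `CONVERSIONS` registry, and the reference system labels "lonlat" and "mercator" are hypothetical names, and the only projection supported is a spherical Mercator. The operation `extents_overlap` makes its second argument compatible with the first before comparing coordinates, so the caller never issues a projection command.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6378137.0  # sphere radius assumed by this Mercator sketch


def lonlat_to_mercator(lon, lat):
    """Degrees of longitude/latitude to spherical Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


def mercator_to_lonlat(x, y):
    """Spherical Mercator metres back to degrees of longitude/latitude."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


# Registry of known conversions, keyed by (source, target) reference system.
CONVERSIONS = {
    ("lonlat", "mercator"): lonlat_to_mercator,
    ("mercator", "lonlat"): mercator_to_lonlat,
}


@dataclass(frozen=True)
class PointSet:
    """A data set that carries its coordinate reference system with it."""
    crs: str
    points: tuple

    def to_crs(self, target):
        if target == self.crs:
            return self
        convert = CONVERSIONS[(self.crs, target)]
        return PointSet(target, tuple(convert(x, y) for x, y in self.points))


def bbox(point_set):
    xs = [p[0] for p in point_set.points]
    ys = [p[1] for p in point_set.points]
    return min(xs), min(ys), max(xs), max(ys)


def extents_overlap(a, b):
    """Compare extents after bringing b into the reference system of a."""
    b = b.to_crs(a.crs)
    ax0, ay0, ax1, ay1 = bbox(a)
    bx0, by0, bx1, by1 = bbox(b)
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1


# Hypothetical data: one set in degrees, one stored in Mercator metres.
parcels = PointSet("lonlat", ((-120.0, 34.0), (-119.0, 35.0)))
sites = PointSet(
    "mercator",
    tuple(lonlat_to_mercator(lon, lat) for lon, lat in [(-119.5, 34.5), (-118.0, 36.0)]),
)
print(extents_overlap(parcels, sites))
```

Comparing the raw coordinates of the two sets directly would pit degrees against metres; carrying the reference system in the data set is what lets the operation avoid that.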
The earliest attempts at making GISs work together with other software modules go back to Johnston et al. (1988), who described their efforts to solve an allocation problem by integrating GIS software with other software pieces, demonstrating the difficulties system developers and users had with sharing geographic data across different computational processes. The integration, called Orpheus, was not a GIS software package, but a methodology for using a suite of products from different vendors to accomplish a complex task in an integrated fashion. It included, of course, a GIS package, and other software for image processing, CAD, and surface modeling (then not integrated with GIS), as well as architectural, engineering, and construction software. All products were installed on the same machine running under the same operating system, and transfer of data was done through file systems. Besides the observation that the various software pieces could be used in sequence, the most important aspect of this work was the fact that the team thought they had come up with an informed decision for which the integration was seen as the critical component.
The idea of coupling GIS and other software was formalized by Goodchild (1987), Nyerges (1993), and others. Two packages were said to be tightly coupled when the user was presented with a single interface, and the two packages interacted with a common database. Loose coupling merely required the exchange of data between the two packages, often with a third software component for format conversion. Finally, functions were embedded in GIS when they were executed within the GIS, using the GIS user interface.
In the database arena, a similar observation was made, though the problems addressed were not spatial in their nature. Databases with different schemas were supposed to be used for an integrated analysis. Database management systems designed for different data models, such as hierarchical and relational, were supposed to be used in parallel. There the notions of schema integration and heterogeneous databases were invented. Different approaches to database interoperability have been discussed. The three most common scenarios are
• build a global conceptual schema that unifies all data models.
• build mediators (Wiederhold, 1992) among the different models.

This is the lowest level of interoperability and clearly a software engineering problem. It has only a few problems that are particular to geographic information, such as making cartographic display work consistently across different screen sizes, resolutions, and color schemes. Although there is substantial compatibility across hardware platforms for many operating systems, at this point the GIS user is still faced with a certain degree of incompatibility within the Unix world, and substantial incompatibility between different operating system implementations of some of the most popular products. Some vendors offer products for only one operating system, and very few have tried to establish any significant level of compatibility across the full range of popular operating systems--Unix, Windows 95, Windows NT, and Macintosh.

Such provisions of access to data are similar to the way some word processors are capable of reading files that were generated by another word processor. This works for text because its semantics are well defined: characters in different formats, fonts, sizes, styles, organized between left and right, top and bottom margins. However, as soon as there are semantically richer constructs—style sheets, figures, tables—most formatting gets misinterpreted or lost. Even such simple conversions as from Word for Windows to Word for the Mac do not work reliably and consistently.
For spatial data with a semantically rich structure, this approach is inappropriate. While attractive because it speeds up access, it falls short for at least two reasons: (1) This approach only permits access to bit-strings, but largely ignores the semantics of what has been stored. Users have no control over what operations are supposed to be performed on what. It is, however, the operations that capture the semantics of spatial information. By just accessing raw data, grossly inappropriate use of data will occur; (2) Unless access is made through a high-level query language, such as an extended SQL version, no provisions for concurrent access to the same data are provided, making it impossible for multiple users to do more than view the same data.
The problems of interoperability of GIS data sets are much more severe than for word processing documents because there is a rich variety of data models available for representation of geographic variation--fields, for example, can be represented in six fundamentally different ways. It is not surprising that the easiest transfers of geographic data occur when the data consists of simple points, lines, and areas, with no topological structures and with no attributes. Once representations include relationships, or capture the complex spatial variation of fields, the problem becomes much more difficult. By way of analogy, the problem of achieving interoperability between the field representations in a GIS is perhaps comparable to the problem of interoperating in the text world between a text document and a FAX.

This approach is tedious and results in a lowest-common-denominator data model. Data models that capture more semantics than the exchange platform supports lose that additional information during the exchange. Even if GIS A and GIS B both have provisions for the same powerful data modeling concepts, they could not preserve them when "talking" to each other through the exchange data model.


In this alternative, data flows between any pair of GISs, and is processed by a variety of software modules offering a range of services. Each service must be capable of examining the data, to see if it is suitable for the desired processing--and data can be directed to appropriate services by the user. For this model to work, each data set must be formatted according to certain agreed principles. But it can be coupled with header information that adds detail to the specification. For example, we might agree that all data sets must follow the TIFF specification. But detail in the header of each would provide further information, such as the geographic footprint of the data set, or whether the data represents measurements on a continuous scale or classifications on a nominal scale, that would determine what processes could be meaningfully applied. In this way, it is possible for the general specification to be quite broad.
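A service that examines header information before processing can be sketched as follows. This is a minimal illustration under assumptions, not a proposed format: the `DataSet` and `Service` classes, the header keys "footprint" and "scale", and the scale labels are hypothetical, and the footprints are arbitrary bounding boxes. Each service declares the measurement scales it can meaningfully process and refuses any data set whose header says otherwise.

```python
from dataclasses import dataclass
from statistics import mean, mode


@dataclass
class DataSet:
    """Values in an agreed general format, plus a descriptive header."""
    header: dict  # hypothetical keys: "footprint" and "scale"
    values: list


class Service:
    """Base class: a service inspects the header before it processes."""
    name = "abstract"
    accepted_scales = frozenset()

    def accepts(self, dataset):
        return dataset.header.get("scale") in self.accepted_scales

    def run(self, dataset):
        if not self.accepts(dataset):
            raise ValueError(
                f"{self.name} is not meaningful for "
                f"{dataset.header.get('scale')!r} data"
            )
        return self.compute(dataset.values)

    def compute(self, values):
        raise NotImplementedError


class MeanService(Service):
    # An arithmetic mean only makes sense for measured quantities.
    name = "mean"
    accepted_scales = frozenset({"interval", "ratio"})

    def compute(self, values):
        return mean(values)


class ModeService(Service):
    # The most frequent value is meaningful on any scale.
    name = "mode"
    accepted_scales = frozenset({"nominal", "ordinal", "interval", "ratio"})

    def compute(self, values):
        return mode(values)


# Hypothetical data sets with the same footprint but different scales.
rainfall = DataSet(
    header={"footprint": (-125.0, 24.0, -66.0, 50.0), "scale": "ratio"},
    values=[10.0, 20.0, 30.0],
)
land_cover = DataSet(
    header={"footprint": (-125.0, 24.0, -66.0, 50.0), "scale": "nominal"},
    values=["forest", "urban", "forest"],
)

print(MeanService().run(rainfall))
print(ModeService().run(land_cover))
print(MeanService().accepts(land_cover))
```

The general format stays broad because the restriction lives in the header and in the services, not in the format itself.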
The OGIS specification (http://www.ogis.org) is an example of this approach. It lays out broad specifications for the various classes of geographic data, allowing GIS designers to anticipate the range of specifications of data sets, and to build services accordingly. Data sets may be exchanged between software modules and services produced by the same vendor, within a single user interface, or exchanged between services of different vendors under user control. Thus the same model of distributed, modular processing can be scaled from a single user and workstation to a wide area network of the scale of the Internet.
In interoperating GIS environments, users do not have to be concerned with the location of processing software, or the locations of data. All of the steps of data conversion disappear, and only the environment persists. A user dealing with a representation of a field, for example, would interact with the system as if the object of interaction were a continuous field, rather than a collection of discrete, representative objects. A request that requires the combination of information from two different fields would automatically invoke the necessary command to overlay the two fields.
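Interaction at the level of the field, rather than of its discrete representation, can be sketched as below. The sketch rests on several assumptions: the `Field` interface and its subclasses are hypothetical names, the raster has row 0 as its southernmost row, and the sampled field uses nearest-neighbour lookup purely to keep the example short (it stands in for whatever interpolation a real system would choose). A request that combines two fields simply evaluates both at the same location, so no overlay command is ever issued by the user.

```python
import math
from abc import ABC, abstractmethod


class Field(ABC):
    """A continuous field: one value at every location in its domain."""

    @abstractmethod
    def value_at(self, x, y):
        ...


class RasterField(Field):
    """Field represented by a grid of square cells."""

    def __init__(self, grid, origin, cell_size):
        self.grid = grid            # list of rows, row 0 southernmost
        self.origin = origin        # (x, y) of the lower-left corner
        self.cell_size = cell_size

    def value_at(self, x, y):
        col = int((x - self.origin[0]) // self.cell_size)
        row = int((y - self.origin[1]) // self.cell_size)
        if not (0 <= row < len(self.grid) and 0 <= col < len(self.grid[0])):
            raise ValueError("location is outside the field's domain")
        return self.grid[row][col]


class SampledField(Field):
    """Field represented by sample points, read by nearest neighbour."""

    def __init__(self, samples):
        self.samples = samples      # list of (x, y, value)

    def value_at(self, x, y):
        nearest = min(self.samples, key=lambda s: math.hypot(s[0] - x, s[1] - y))
        return nearest[2]


class CombinedField(Field):
    """Field defined pointwise from two others; the overlay is implicit."""

    def __init__(self, a, b, op):
        self.a, self.b, self.op = a, b, op

    def value_at(self, x, y):
        return self.op(self.a.value_at(x, y), self.b.value_at(x, y))


# Hypothetical data: rainfall held as a raster, evaporation as samples.
rainfall = RasterField([[10.0, 20.0], [30.0, 40.0]], origin=(0.0, 0.0), cell_size=1.0)
evaporation = SampledField([(0.5, 0.5, 4.0), (1.5, 1.5, 6.0)])
net_water = CombinedField(rainfall, evaporation, lambda r, e: r - e)

print(net_water.value_at(0.25, 0.25))
print(net_water.value_at(1.75, 1.75))
```

The caller of `net_water.value_at` cannot tell, and does not need to know, that one input is a grid and the other a set of points.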
Current GIS’s are based on file systems, or database management systems. Interoperability raises the possibility of entirely new architectures that are tied into the system a priori, therefore fixing the operations available to the user. Interoperating GIS’s prohibit operations that make no sense in the environment, such as the combination of a field representation with a collection of discrete objects, or a spreadsheet operation on a collection of geographic lines.
If interaction with GIS’s can be raised to the level of the user’s conceptualization, then entirely new languages of interaction can be designed, that make sense to the user’s conceptualization, rather than addressing the discrete objects that are internal to the representation. In effect, this means that we can redesign some of the early results of research on GIS languages, such as Tomlin’s (1991) map algebra, to be closer to conceptualizations of queries, and thus easier to use.
In an interoperable world, the definition of a data set may be very different from our traditional views. The contents of a single map may be better expressed as several distinct data sets, each of which requires a different conceptualization, and thus a different mode of address in an interoperable world. Thus we need to examine the question of granularity of geographic information, and may need to depart sharply from traditional ideas in this regard.
We commonly think of geographic information as somehow homogeneous, but in reality very different concepts are required to understand the distinction between a set of points sampling variation that is conceived as a single field, versus a collection of points representing the locations of outbreak of a disease, for example. Attempting to establish interoperability across this vast range of distinct concepts may be doomed from the outset--instead, it may be necessary to identify domains within which interoperability can reasonably be achieved, but between which interoperability is practically impossible.
The OGC community has focused to date largely on the data modeling issues. We need to address the question of the highest level of interaction between user and system, which manifests itself in the user interface. We should develop conceptual designs for high-level user interactions that are close to users’ thinking and appropriate to their particular application domain. Such designs can form the basis of future generations of GIS. The same data sets will appear differently to different users--a street may be an artery to a traffic engineer, but a barrier to an ecologist. Issues of scale, temporality, and data quality should be investigated within such an environment.
We organized a session on interoperability at the Third International Conference/Workshop on Integrating GIS and Environmental Modeling, in Santa Fe, NM, in January 1996. Papers were given by Karen Kemp, Kenn Gardels, and Andrej Vckovski; Vckovski also gave a paper at the NSF/ESF Young Scholars conference in August 1996.
We will collaborate fully with OGC in issuing the open call for participation in the specialist meeting, and in planning the program for the conference--we anticipate that the conference program committee will consist of the core group from the initiative, plus a roughly equal number from the OGC community.
Johnston, K., D. Tomlin, H. Keegan, D. Smith, S. Sperry, N. Tonias, B. Baldassano, D. Roche, T. Johnson, and J. Koche (1988) Orpheus: an integration. In ACSM-ASPRS Annual Convention, St. Louis, MO, pp. 11-22.
Kemp, K.K. (1993) Environmental Modeling with GIS: A strategy for dealing with spatial continuity. Technical Report 93-3. Santa Barbara, CA: National Center for Geographic Information and Analysis.
Nyerges, T. (1993) In M.F. Goodchild, B.O. Parks, and L.T. Steyaert, editors, Environmental Modeling with GIS. New York: Oxford University Press.
Tomlin, C.D. (1991) GIS and Cartographic Modeling. Englewood Cliffs, NJ: Prentice Hall.
USDOC (1992) Spatial Data Transfer Standard (SDTS). Federal Information Processing Standards Publication 173 (FIPS 173). Part 2: Spatial Features. Washington, DC: U.S. Government Printing Office.
Vckovski, A. (1996) Virtual data sets - smart data for environmental applications. Proceedings, Third International Conference/Workshop on Integrating GIS and Environmental Modeling, Santa Fe, NM, January 21-25, 1996. CD and http://www.ncgia.ucsb.edu.
Wiederhold, G. (1992) Mediators in the architecture of future information systems. IEEE Computer 25(3): 38-49.