Accounting for the semantic differences between various Geographic Information Systems
Mark Gahegan
Geographic Information Science
Curtin University of Technology
PO BOX U 1987
Perth 6001, WESTERN AUSTRALIA
phone: +618 9266 3309
fax: +618 9266 2819
E-mail: mark@cs.curtin.edu.au
Web: http://www.cs.curtin.edu.au/~mark/
Geographic Information
Systems (GIS) employ distinct conceptual models of geographic space (Goodchild,
1992), often as a reflection of the origins of the software (e.g. CAD and
image processing). Some of these models are radically different, such as
the images employed by Idrisi compared to the object coverages used by
Arc/Info. Others are more subtly different, such as a topologically oriented
coverage compared to the 'spaghetti' polygons used by many 'desktop' GIS.
The meaning of spatial data is not the same within these models, and translation
that is based solely on the geometry can lead to logical inconsistencies
within the translated data. Whilst a good deal of very useful progress
has been made by the likes of ISO TC211 and the Open Geodata Interoperability
Specification (and related models), as yet these standards fall somewhat
short in addressing the semantics of the underlying geographic models.
In earlier work (Gahegan, 1996) a semantic notation was developed to describe
the various transformations that occur as data is operated on or changed
from one conceptual model to another. It is based on a data communication
protocol described by Pascoe & Penny (1995) which has been extended
to encompass certain key geographic properties and both a conceptual and
physical data model. The notation describes a 'before' and 'after' state
for a given transformation and is useful for communicating the likely effects
of a specific transformation in terms of the data properties that may change
as a consequence. In turn this can highlight any changes in the underlying
conceptual model that occur and furthermore can show where assumptions
regarding the meaning of the data are invalid or need to be made explicit.
More recently, further additions have allowed the specification of uncertainty
characteristics within the data (Gahegan & Ehlers, 1997).
This paper proposes
some extensions to the notation to help describe the (sometimes subtle)
differences between the data models used by different GIS and thus to aid
in the interoperability process by providing a concise and symbolic description
of geographic data, specifying its semantic content as opposed to
relying on the geometry to imply a meaning. This description, termed a
'transformation expression', can be applied equally to both datasets and
operations. A dataset contains meaning which is imposed
as a consequence of the conceptual model of the GIS under which it was
gathered. This is represented by an expression of the form:
(<abstract properties>, <geographic model>, <physical data structures>, <system details>),
where:
abstract properties describe the data as the user perceives it (equivalent
to an external view).
geographic model describes the implications and limitations of the
geographic model of space under which the data exists.
physical data structures describe how the data is physically encoded on
the storage device, and are necessary since the choice of data structure
can have an effect on other data properties.
system details describe the actual package and platform that the data
resides in.
In practice, each of these components is further broken down into a number
of distinct parts. Additional components may also be added, to fully embrace
interoperability standards such as the Open Systems Environment (OSE).
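As a minimal sketch of how such a four-part expression might be held in software, the Python below stores each component as a dictionary of named properties. The class name, the property keys and the example values are hypothetical illustrations and are not part of the notation itself; only the four component names come from the text.

```python
from dataclasses import dataclass


@dataclass
class DatasetExpression:
    """Four-part description of a dataset; component names follow the text."""
    abstract_properties: dict       # A: the data as the user perceives it
    geographic_model: dict          # G: the model of space the data lives under
    physical_data_structures: dict  # P: how the data is physically encoded
    system_details: dict            # S: package and platform holding the data

    def as_tuple(self):
        """Return the components in the order used by the expression."""
        return (
            self.abstract_properties,
            self.geographic_model,
            self.physical_data_structures,
            self.system_details,
        )


# Hypothetical example: all keys and values below are illustrative assumptions.
landuse = DatasetExpression(
    abstract_properties={"theme": "land use", "measurement": "nominal"},
    geographic_model={"primitive": "image"},
    physical_data_structures={"encoding": "raster grid"},
    system_details={"package": "Idrisi"},
)
```

Keeping each component as an open dictionary reflects the remark that components are broken down further in practice; a stricter schema could replace the dictionaries once the sub-properties are fixed.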
Transformations require expressions with both a left and right side and
show the changes imposed on the data:
before state -> after state
where the states are described according to the form given above. The
after state contains a revised
expression where any properties that have changed are flagged. Thus it
is straightforward to build a taxonomy of transformation consequences in
terms of the properties of the data that change. A useful high level grouping
is:
- Transformations changing only the abstract data properties (no changes
  in the physical data structures or geographic model).
- Transformations causing the geographic model to change.
- Transformations causing the physical encoding of the data to change.
- Transformations moving the data to another system.
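The grouping above can be sketched as a comparison of the four components before and after a transformation. This is an illustrative Python sketch only: the single-letter keys, the function names and the decision to let one transformation fall into several groups are assumptions, and the component values are placeholders.

```python
COMPONENTS = ("A", "G", "P", "S")  # abstract, geographic model, physical, system


def changed_components(before, after):
    """Return the components whose values differ between the two states."""
    return [c for c in COMPONENTS if before[c] != after[c]]


def classify(before, after):
    """Map a before/after pair onto the high-level groups listed above."""
    changed = set(changed_components(before, after))
    groups = []
    if changed == {"A"}:
        groups.append("abstract properties only")
    if "G" in changed:
        groups.append("geographic model changes")
    if "P" in changed:
        groups.append("physical encoding changes")
    if "S" in changed:
        groups.append("moves to another system")
    return groups


# Placeholder values: only the system component differs.
before = {"A": "land use classes", "G": "coverage", "P": "arc-node", "S": "system 1"}
after = dict(before, S="system 2")
groups = classify(before, after)
```

Here `groups` holds the single entry for a move between systems, since the geographic model and data structures are unchanged.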
For example, using A, G, P, and S to represent the four dataset properties
respectively, a transformation which moves data from one system to another
but uses the same geographic model and data structures is given by:
(A, G, P, S) -> (A, G, P, S')
where a prime is used here to flag the property that has changed.
When considering interoperability, the transformation will often be made
up of several components: first moving the data to a new system, then
operating on it, then possibly moving it back again:
(A, G, P, S) -export-> (A, G', P', S') -import-> (A, G'', P'', S'') -operation-> (A', G'', P'', S'')
The export transformation moves the data into the interoperability format from
the host system, changing its physical structure and (possibly) its geographic
model. From there it is imported into the internal format of the new system,
again changing its physical structure and (possibly) its geographic model.
Next, some operation is carried out (here shown as only affecting the abstract
data properties) after which it may be passed back again to the original
host. For simplicity, only the highest level properties are shown above;
with the introduction of further properties, the transformation expressions
can become quite specific in identifying exactly what has changed.
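A chain of this kind can be sketched by applying each step in turn and recording which components it alters. The Python below is a hedged illustration: the type `Expr`, the step functions and every value string are hypothetical, and for brevity the geographic model is left unchanged by export and import even though the text notes that it may change.

```python
from typing import NamedTuple


class Expr(NamedTuple):
    A: str  # abstract properties
    G: str  # geographic model
    P: str  # physical data structures
    S: str  # system details


def diff(before, after):
    """Names of the components that differ between two expressions."""
    return [n for n in Expr._fields if getattr(before, n) != getattr(after, n)]


def export(e):
    # Assumed effect: new physical encoding inside the interchange format.
    return e._replace(P="interchange encoding", S="interchange format")


def import_(e):
    # Assumed effect: re-encoded into the internal format of the new system.
    return e._replace(P="target internal encoding", S="target GIS")


def operate(e):
    # Assumed effect: only the abstract data properties change.
    return e._replace(A="derived " + e.A)


def run_chain(start, steps):
    """Apply each step in order; return the final state and a change trace."""
    trace, current = [], start
    for step in steps:
        result = step(current)
        trace.append((step.__name__, diff(current, result)))
        current = result
    return current, trace


start = Expr("land use", "coverage", "host internal encoding", "host GIS")
final, trace = run_chain(start, [export, import_, operate])
```

The trace makes the cumulative effect of the chain explicit, which is the information a user would need in order to judge whether the result can safely be passed back to the original host.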
It is a relatively
straightforward task to move from the symbolic description to a set of automated
rules and constraints that can determine if some interoperation is likely
to cause difficulties, by comparing a semantic description of a chosen
operation in one GIS with a description of a chosen dataset within another.
Any semantic differences between the description of the dataset and the
left side of the transformation expression indicate a potential conflict
in meaning that may require resolution. Mismatches can be graded according
to their severity, ranging from warnings to outright conflicts. In some
cases, it may be possible to carry out any required conversion in an automated
fashion; in others, some form of user intervention might be necessary.
In either case, warnings can be issued and the mismatch documented.
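One way such rules might look is sketched below in Python: the description of a dataset is compared, property by property, with the left side of an operation's transformation expression, and each difference is graded. The severity assigned to each component is a hypothetical choice made for illustration, as are all names and property values; the text does not prescribe a particular grading.

```python
from enum import IntEnum


class Severity(IntEnum):
    WARNING = 1
    CONFLICT = 2


# Hypothetical grading: differences in meaning are treated as conflicts,
# differences in encoding or platform as warnings.
SEVERITY_BY_COMPONENT = {
    "A": Severity.CONFLICT,
    "G": Severity.CONFLICT,
    "P": Severity.WARNING,
    "S": Severity.WARNING,
}


def find_mismatches(dataset, required):
    """Compare a dataset description with the left side of an expression.

    Only the properties named in `required` are checked. Each mismatch is
    returned as (severity, component, property, actual, expected), with the
    most severe mismatches first.
    """
    mismatches = []
    for component, expected in required.items():
        for prop, value in expected.items():
            actual = dataset.get(component, {}).get(prop)
            if actual != value:
                mismatches.append(
                    (SEVERITY_BY_COMPONENT[component], component, prop, actual, value)
                )
    return sorted(mismatches, key=lambda m: m[0], reverse=True)


# Illustrative values: a 'spaghetti' polygon dataset offered to an operation
# that expects a topologically structured coverage.
dataset = {
    "G": {"topology": "none"},
    "P": {"encoding": "spaghetti polygons"},
    "S": {"package": "desktop GIS"},
}
required = {
    "P": {"encoding": "arc-node coverage"},
    "G": {"topology": "arc-node"},
}
mismatches = find_mismatches(dataset, required)
```

In this sketch the topological difference is reported first as a conflict and the encoding difference second as a warning; whether either can be converted automatically would be decided by further rules.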
The work is motivated
by research into interoperability and data translation in regard to a new
three-dimensional geo-information system being developed by CSIRO (Australia)
to support the needs of a wide range of geoscientists, including geologists.
The aim is to make this system a semantically rich environment by ensuring
that objects are ascribed meaning based on their modelling role, as opposed
to their geometry. Interoperability issues are not restricted to the more
'standard' GIS, but also include many of the available geological and exploration
packages such as Surpac and Vulcan. These provide a wealth of further
spatial primitives beyond the standard points, lines, regions and surfaces;
including volumes and profiles.
References
Gahegan, M. N. (1996). Specifying the transformations within and between
geographic data models. Transactions in GIS, Vol. 1, No. 2, pp. 137-152.
Gahegan, M. N. and Ehlers, M. (1997). A framework for the modelling of
uncertainty in an integrated geographic information system. Proc. ISPRS
International Workshop on Dynamic and Multi-Dimensional GIS, Hong Kong.
Goodchild, M. F. (1992). Geographical data modeling. Computers and
Geosciences, Vol. 18, No. 4, pp. 401-408.
Pascoe, R. T. and Penny, J. P. (1995). Constructing Interfaces between
(and within) Geographical Information Systems. International Journal of
Geographical Information Systems, Vol. 9, No. 3, pp. 275-291.