Werner Kuhn
Institute for Geoinformatics
University of Muenster
kuhn@ifgi.uni-muenster.de
To achieve interoperability, precise specifications of interfaces are
necessary. Elementary data types must be exactly described to allow for
higher-level types and operations must be specified with their inputs and
outputs. Selecting a language to write specifications with is one of the
most crucial methodological problems.
Natural language specifications are not appropriate as they require human interpretation and may thus lead to different understandings. The consequence would likely be different implementations of components which cannot cooperate. In current practice, most specifications formally describe structures for data types and the signatures of operations. They use some kind of formal syntax from the theory of formal languages (regular grammar, parsing, signatures from algebra). The interpretation of the results of operations, their semantics, is stated - faute de mieux - in natural language.
Logic-based formalisms have been explored as a means to specify semantics.
However, they are generally difficult to understand and check for correctness.
Also, it is difficult to express change in logic by pre- and post-conditions
(Hoare 1969; Wirth 1976; Floyd 1985)
Functional languages support the expression of an operation (acting on something and yielding a result) as a mathematical function. This idea is very close to the needs of interface specifications. It allows for a functional view of the services specified, without involving procedural notions.
Purely functional languages avoid side-effects (something else being operated on at the same time) and make sure that the values of variables, once assigned, never change. This results in referential transparency (the same name always refers to the same thing) and allows for a ‘mathematical reading’ of the code (like a collection of equations). Strictly typed functional languages come with a type theory (Milner 1978; Cardelli and Wegner 1985) which restricts the interpretation of formulae to properly typed models. This avoids some of the fundamental logical antinomies and other puzzles of mathematics.
Class-based functional languages permit an object-oriented style of
coding. The best known examples are Haskell (Hudak, Peyton-Jones et al.
1992; Peterson, Hammond et al. 1996; Peterson, Hammond et al. 1997) and
notably Gofer which allows classes with multiple parameters (Jones 1991;
Jones 1994; Jones 1995). The underlying theory, however, is substantially
simpler than that of standard object-oriented languages (Abadi and Cardelli
1996). For example, it draws a clean distinction between behavior inheritance
and implementation inheritance.
Running example in the full paper.
Functional languages are computationally complete: any recursively computable function can be expressed. Pure functional languages like Haskell or Gofer have minimal ambiguity in their foundations. These are directly related to ‘denotational semantics’ ((Scott 1977), more readable in (Stoy 1977)). Evaluation is done by ‘substitution’, the expression on the left-hand side being replaced with the expression on the right-hand side of an equation.
Only the Boolean type with its operations (and, or, not) is built into
these languages. It can be fully and easily tested for correctness. The
finiteness of computers limits the representable integers to a finite subset.
For simplicity, the languages typically use the integers and floating point
numbers of the underlying hardware. This means effectively that the ANSI
specifications for numbers and their implementation are imported into any
specifications written with a language. Neither the finiteness of representations
nor the use of (standardized) floating point approximations are a problem
when writing interface specifications; they are a problem of discrete computational
modeling (particularly of geometry) as a whole.
Gofer is a functional programming language and must therefore restrict expressions to executable ones. Mathematically, this means the language must be constructive: stating the existence of something without providing a method to construct it is not possible. This coerces the formal statements of (spatial) theories into computational tractability and thereby helps to make specifications implementable. At the same time, it provides a prototyping mechanism to the specifiers - something that is very badly missed in practice today.
Gofer separates the construction of a theory from that of a model for this theory. Classes define theories. Data types are the actual carriers of data from which instances build models for the theories. Thus, models are example implementations and can be used for prototyping.
Theories are restricted to constructive axioms. Properties which cannot
be explained as constructive axioms, can be stated as testable theorems.
Constructive models can be written and tested.
Object-oriented design is motivated by the dominance of objects in human thinking and mathematically based on universal (or multi-sorted or heterogeneous) algebra (Birkhoff 1945; MacLane and Birkhoff 1967; Birkhoff and Lipson 1970). An algebra consists of an (abstract) carrier sort (often called type), a set of operations with their signatures, and axioms defining the behavior of the operations.
Classes in Gofer are algebras and instances define particular models
for these algebras. Multiple instantiations of a class can exist in the
same program. For a particular data type, the behavior of multiple classes
can apply. This is a clean and sound mechanism for multiple inheritance.
From a programmer’s point of view, it is a drawback of Gofer that its standard, freely available implementation does not offer user interface building tools. The language also has no mechanism for persistence. Both these limitations are not an issue in specification writing.
The use of monads allows for a solution to both (King and Wadler 1993;
Peyton Jones and Wadler 1993) and several very powerful methods for user
interface specification exist under Unix.
Gofer (in its current implementation) has some space limitations. However, the limitations of human thinking appear to be more serious: we simply cannot comprehend the interaction of several pages of specification. A design of a database for storing and accessing Gofer code has been completed at TU Vienna and is currently being tested.
Our experience shows that ‘clean’ parts of a specification can be combined
without restrictions on depth (i.e. layer can be put onto layer). Any ‘unclean’
part will surface several layers higher up and confuse issues profoundly.
This is most likely not linked to language specifics, but to a general
problem of specification writing; a sharp tool like Gofer only makes it
painfully visible.
Gofer is a public domain software and implementations for UNIX, Macintosh,
DOS and Windows are available over the Internet.
The use of a purely functional language with a class-based type theory
is highly recommended for writing specifications of OpenGIS interfaces.
The semantics which come with such a language are mathematically sound
and provide a good foundation for building specifications. The constructive
nature and executability of the code ensures ‘no-nonsense’ specifications
and offers prototyping capability.
Birkhoff, G. (1945). Universal Algebra. First Canadian Math. Congress, Toronto University Press.
Birkhoff, G. and J. D. Lipson (1970). "Heterogeneous Algebras." Journal of Combinatorial Theory 8: 115-133.
Cardelli, L. and P. Wegner (1985). "On Understanding Types, Data Abstraction, and Polymorphism." ACM Computing Surveys 17(4): 471 - 522.
Floyd, C. (1985). On the Relevance of Formal Methods to Software Development. Formal Methods and Software Development. C. Floyd, H. Ehrig, M. Nivat, J. Thatcher (Eds.). Springer-Verlag, Lecture Notes in Computer Science: 1-11.
Floyd, R. W. (1979). The Paradigms of Programming. ACM Communications 22 (8), August 1979.
Frank, A.U. and W. Kuhn (1995). Specifying Open GIS with Functional Languages Advances in Spatial Databases (SSD'95), ed. Max J. Egenhofer, and John R. Herring. Springer-Verlag, Lecture Notes in Computer Science Vol. 951: 184-195.
Hoare, C. A. R. (1969). "An Axiomatic Basis for Computer Programming." ACM Communications 12(10): 576-580.
Hudak, P., S. L. Peyton Jones, et al. (1992). "Report on the functional programming language Haskell, Version 1.2." SIGPLAN Notices 27.
Jones, M. P. (1991). An Introduction to Gofer, Yale University.
Jones, M. P. (1994). The Implementation of the Gofer Functional Programming System, Yale University.
Jones, M. P. (1995). Functional Programming with Overloading and Higher-Order Polymorphism. Advanced Functional Programming. J. Jeuring and E. Mejer. Berlin, Springer. 925: 331.
King, D. J. and P. Wadler (1993). Combining Monads. Functional Programming - Workshop Glasgow 1992. S. Ayr, J. Launchbury and P. M. Sansom. Berlin, Springer Verlag.
MacLane, S. and G. Birkhoff (1967). Algebra. New York, Macmillan.
Milner, R. (1978). A Theory of Type Polymorphism in Programming. Journal of Computer and System Sciences 17: 348-375.
Peterson, J., K. Hammond, et al. (1996). Report on the functional programming language Haskell, Version 1.3. http://haskell.cs.yale.edu/haskell-report/haskell-report.html
Peterson, J., K. Hammond, et al. (1997). "The Haskell 1.4 Report." http://haskell.org/report/index.html.
Peyton Jones, S. L. and P. Wadler (1993). Imperative functional programming. ACM Symposium on Principles of Programming Languaes (POPL), Charleston, ACM.
Scott, D. S. (1977). Logic and Programming Languages. ACM Communications 20 (9), September 1977: 634 - 641.
Stoy, J. (1977). Denotational Semantics. Cambridge MA, MIT Press.
Wirth, N. (1976). Algorithms + Data Structures = Programs. Prentice Hall, Englewood Cliffs, N.J.