Developing Internet-based user interfaces for improving spatial data access and usability

Chun Sheng Li, David Bree
Department of Computer Science, University of Manchester
E-mail: { csli | dbree }@cs.man.ac.uk

Adrian Moss, James Petch
Department of Environmental and Geographical Sciences,
Manchester Metropolitan University
E-mail: { a.moss | j.petch }@mmu.ac.uk


Abstract: Recent research findings indicate that a lack of awareness, and problems of accessibility and usability of spatial data are significant bottlenecks to increasing numbers of users and applications. Developing easy to use and widely accessible spatial information systems with user friendly and flexible interfaces has been found useful and effective particularly for inexperienced users. New users need to be aware of data services and be able to easily manipulate data. This paper is concerned with the design and development of user interfaces on WWW as knowledge engineering for improving the accessibility and usability of large and complex spatial data sets.

The WWW is increasingly used in geo-spatial research and spatial data services. This networked hypertext environment (over the Internet) provides an ideal platform for the KINDS (Knowledge-based Interface to National Data Sets) initiative. The paper describes the design and development of the KINDS experimental system. A survey of academics in the north west of England has been carried out, examining their background, potential use of data sets, knowledge of computing and information systems, and human-computer approaches to spatial data. Based on the results, a series of interfaces has been built on the WWW using interactive maps, free text searching and fill-out forms. These enable the user to easily discover information about spatial data services and example applications, and to directly interrogate spatial data using ESRI's Arc/Info GIS over the Internet.

Keywords: Spatial data access, digital data libraries, user interface design, information storage and retrieval, WWW, Arc/Info


1. Introduction

The Manchester Information Data sets and Associated Services (MIDAS) at Manchester University provides on-line access to key strategic research and teaching data sets. These include the UK 1981 and 1991 Census data, Ordnance Survey and Bartholomew mapping data, and Landsat and SPOT satellite imagery. Together these key data sets are known as the 'National Data Sets' (NDS). The NDS are freely available to the UK academic community under the Combined Higher Education Software Team (CHEST) agreement. However, uptake of spatial data sets has been found to be limited (MIDAS Annual Report, 1995).

Accessibility and usability of spatial data sets are major bottlenecks to increasing the number of applications (Li et al., 1995). To increase the accessibility of data, data providers must promote awareness of the existence and contents of those data sets. Many spatial data sets are of potential use to a range of different end users. However, if potential users are unaware of the existence of a data set, the result may be low utilisation and even the waste of an expensive resource (Cornelius and Strachan, 1989; Ruggles, 1990; Walker et al., 1992). The provision of effective meta data (data about data) underpins attempts to increase data accessibility (McLaughlin and Nichols, 1994). The development and deployment of meta information systems is currently receiving global research attention. See, for example, GENIE (Walker et al., 1992), GeoWeb (Plewe, 1994) and Project Alexandria (Frew et al., 1995; Andresen et al., 1995).

Poor usability results from a limited understanding of how spatial data is used. Spatial data analysis is a knowledge intensive activity, and new users face a significant learning curve when adopting spatial data. The associated techniques are often significantly different from other types of analysis (Fotheringham and Rogerson, 1993). Users must familiarise themselves with the command structure and nomenclature of geographical information systems. Often users must also have expertise in data pre-processing and in the systems of data provision used by data providers. Data sets often have to be used together with ancillary software packages that support operations such as data handling and map plotting. To be able to use the data, the user needs knowledge of the data coverage, the data structure and the support systems. The combination of high level technical skills required often results in spatial data handling being the preserve of the highly technically competent.

The Knowledge-based Interface to National Data Sets (KINDS) project aims to extend and intensify the use of the National Data Sets service available from MIDAS by educating potential users in their use. This is a dual mission. Firstly potential users are to be made aware of the existence of the data sets and secondly guided in their use. In order to achieve this, KINDS must reach the widest possible audience through the World Wide Web (WWW) with user friendly, easy to access and effective Internet-based search engines and hypertext interfaces for users to browse and handle spatial data.

This paper presents selected findings from two surveys of potential users, which examined actual and potential use of the NDS, users' knowledge of computing and information systems, and human-computer approaches to spatial data. The survey findings have been used as the basis for the development of the KINDS experimental system. The paper describes the design and development of a series of interfaces on the WWW that enable the user to easily discover information about spatial data services and example applications, and to carry out real-time task processing with ESRI Arc/Info to generate maps over the Internet.

2. KINDS technical survey and findings

An extensive user survey, including both semi-structured interviews and a technical questionnaire survey, was carried out in Manchester at an early stage of the KINDS project to examine the academic requirements for data in teaching and research, and the approaches likely to be used to access spatial data.

The Manchester academic community is amongst the largest in Europe. Seventy-eight researchers, lecturers and post-graduate students in academic departments representing about 28 disciplines were approached to determine their needs for spatial data. The rationale of the user survey was to reveal the extent of data use within the academic community and the paths of dissemination through users. Hence a mixed approach of quantitative and qualitative research methods was adopted. One of the prime objectives of the survey was to examine the informal methods that users employ to gain support. This is the phenomenon of a user who, when faced with a problem using a computer system, opts to speak to a nearby colleague or friend rather than approaching formal support services. In order to test the extent to which this took place in the data processing environment, respondents were asked to suggest likely colleagues to be interviewed. Whilst we fully acknowledge that this process does not adhere to random sampling rules and precludes rigorous statistical analysis, a fuller picture of information dissemination within higher education was revealed as a result. The full results of the user survey will be reported in a future publication. We present some of our analysis here as background to what follows.

2.1 Actual and potential use of the data
Use of spatial data is a knowledge intensive activity. Spatial analysis concepts and techniques are significantly different from those used in other analysis methods (Fotheringham and Rogerson, 1993). Users must familiarise themselves with the command structure and nomenclature of geographical information systems and the environment in which they reside - often GIS are mounted on high end workstations whilst the majority of new users are familiar with personal computers. The actual use of MIDAS data services appeared relatively low (see left columns in Figure 1). However many respondents felt that national data sets would be useful resources when they became aware of the coverage, contents and suitability of the data sets. The right columns in Figure 1 show the users' interest in using national data sets for their teaching and research.

Figure 1. Actual use vs. potential use of data

2.2 The user's technical ability
Knowledge of information technology seems to be a key factor which decides whether data set applications are successful. To investigate how these users access data and how user friendly and effective interfaces are for spatial data handling, a technical questionnaire survey was carried out. User knowledge, interest in networked information systems and human computer interaction issues were examined.

41 (52.5%) respondents returned completed questionnaires. The users were classified into four categories based on knowledge and daily use of information systems and computer-based applications.

Figure 2. Categories of the users

The extensive use of the Email facility indicated that most users already have access to computer networks. The Internet utilities (telnet, FTP, etc.) were also widely used. The WWW, the most recent Internet service, has become one of the major platforms in the academic community. Over 70% of respondents commented that the WWW has been a useful resource for their teaching and research. The following table shows usage of the Internet-based information systems against each category of user.

2.3 User interfaces
The user interface is one of the most important aspects of a spatial information system. Respondents were asked to select the types of interfaces they preferred. Selections were listed in sequence according to the respondent's preference. The results were processed using weighting and ranking techniques to assess which class of users found which interfaces more useful. Table 2 lists the interfaces ranked by the points scored.

Variable  Std Dev  Minimum  Maximum  Sum     Valid N  Label
A         1.24     2        6        184.00  36       interactive dialog boxes
B         1.19     2        6        124.00  29       tree structured menus
D         1.28     2        6        121.00  27       interactive maps
C         1.15     3        6        102.00  24       multi-selectable items
E         1.35     2        5         71.00  23       natural language interfaces

Table 2. Answers ranked by the total points scored

Some users were unable to comment on interfaces of which they had no experience. To address this, data on the respondents' background was considered. The following table shows the ranking indicated by the average points scored.

Variable  Mean  Std Dev  Minimum  Maximum  Valid N  Label
A         5.11  1.24     2        6        36       interactive dialog boxes
D         4.48  1.28     2        6        27       interactive maps
B         4.28  1.19     2        6        29       tree structured menus
C         4.25  1.15     3        6        24       multi-selectable items
E         3.09  1.35     2        5        23       natural language interfaces

Table 3. Answers ranked by the average points
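
The relationship between the two rankings can be checked directly from the published figures. The following minimal Python sketch assumes that the mean in Table 3 is simply the total points in Table 2 divided by the number of valid responses; the function names are illustrative and the weighting scheme used to assign points is not reproduced here.

```python
# Figures copied from Tables 2 and 3: label -> (total points, valid responses).
RESPONSES = {
    "interactive dialog boxes": (184, 36),
    "tree structured menus": (124, 29),
    "interactive maps": (121, 27),
    "multi-selectable items": (102, 24),
    "natural language interfaces": (71, 23),
}


def mean_points(data):
    """Average points per valid response, rounded to two decimals."""
    return {label: round(total / n, 2) for label, (total, n) in data.items()}


def rank_by_total(data):
    """Labels ordered by total points scored (the ordering of Table 2)."""
    return sorted(data, key=lambda label: data[label][0], reverse=True)


def rank_by_mean(data):
    """Labels ordered by average points scored (the ordering of Table 3)."""
    return sorted(data, key=lambda label: data[label][0] / data[label][1],
                  reverse=True)


if __name__ == "__main__":
    means = mean_points(RESPONSES)
    for label in rank_by_mean(RESPONSES):
        print(f"{means[label]:.2f}  {label}")
```

Under this assumption, interactive maps move from third place by total points to second place by average points, because fewer respondents rated them but those who did rated them highly.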

2.4 Bottlenecks for the spatial data use
The survey revealed that many respondents were unaware of the spatial data service and the availability of data sets from MIDAS. Factors such as the significant spatial data handling learning curve, inadequate technical guidance and support help to explain lack of use.

Users with less technical skill preferred ease of use over functionality in user interfaces. The results of the survey were broadly consistent with Davis and Medyckyj-Scott (1994), who reported that inexperienced users had difficulty transferring existing knowledge from their disciplines to spatial data handling. The results indicate that user friendly and flexible interfaces are important for improving spatial data usability.

3. Browsing spatial data

3.1 Bartholomew Map Data set
The Bartholomew data has been used as the test data by the KINDS project. It is a layered vector map data set comprising point, line and area features. The data is structured into several sets comprising World, European, GB, London and Central London coverages. In the Bartholomew (Great Britain) data set, the coverage is divided into tiles based on the National Grid. Each tile, covering a 100 km by 100 km square, is identified by a set of two characters as shown in Figure 3.
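
As an illustration of the two-character tile identifiers, the following Python sketch derives the 100 km square letters from a National Grid easting and northing in metres. It assumes that the tiles follow the standard Ordnance Survey lettering scheme (a 5 by 5 grid of letters with "I" omitted); the function name and the sample co-ordinates are illustrative and are not taken from the KINDS system.

```python
def grid_square(easting, northing):
    """Return the two-letter 100 km square for a National Grid position."""
    if not (0 <= easting < 700_000 and 0 <= northing < 1_300_000):
        raise ValueError("position lies outside the National Grid")

    # Index of the 100 km square east and north of the grid origin.
    e100k = int(easting // 100_000)
    n100k = int(northing // 100_000)

    # First letter selects a 500 km square, second a 100 km square within it.
    first = (19 - n100k) - (19 - n100k) % 5 + (e100k + 10) // 5
    second = (19 - n100k) * 5 % 25 + e100k % 5

    # The letter "I" is not used, so indices past "H" shift by one.
    if first > 7:
        first += 1
    if second > 7:
        second += 1
    return chr(ord("A") + first) + chr(ord("A") + second)


if __name__ == "__main__":
    # Approximate city-centre positions, used only as examples.
    print(grid_square(384_000, 398_000))  # Manchester
    print(grid_square(530_000, 180_000))  # London
```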

The data for either the GB national coverage or a tile is stored in 16 thematic data layers including administrative boundaries, contours, roads, railways and ferries, point features, urban areas, and water, etc. In addition, there is an annotations layer that contains textual information describing the feature data.

Within the Bartholomew data set, the features are organised into classes. Each class is identified by an OBS_ACC_NO (observation accession number), which describes the feature and its entity type (point, line or polygon). The OBS_ACC_NOs uniquely identify each type of feature. For instance, in the 'Roads' data layer, existing motorways have the unique code 235, primary trunk dual carriageway A roads are referenced as 173291, and so forth.

The following is a list of thematic data layers available in the Bartholomew (GB) data:

Data Layer                 Description
Administrative Boundaries  National and regional boundaries. Also includes lochs (lakes) and coastline.
Contours                   Contours at 100m intervals plus 50m and 150m.
Danger Zones
Drainage                   Also includes canals.
Forest Parks
National Parks
National Trust
Other Lines                Includes other rights of way.
Points                     Leisure, physical, road and industrial features and other transport and road distance points.
Roads                      Motorways, A and B roads (dual or single carriageway), minor roads and roads under construction.
Railways & Ferries
Regional Parks
Scenic Areas
Topography                 Rocky shores, beaches and woodland.
Water                      Lochs (lakes) and marshes.
Urban Areas
Annotation                 Cartographically placed text annotation.


3.2 The KINDS thematic map library
Potential spatial data users can gain information about the data set far more easily by browsing its contents than by reading a textual description. Enabling users to browse quickly through maps was therefore a major objective in increasing awareness of the Bartholomew map data. The WWW is a fast and feasible way of presenting maps by distributing GIF format images. Such images are of sufficiently good quality to display spatial objects but require only small amounts of memory and transmission time through the Internet to the user's client software.

A number of sample maps were generated manually using ESRI Arc/Info Arcplot in the early stages of the KINDS project as a feasibility study. After the KINDS experimental system was released on the WWW, users expressed an interest in seeing more detailed feature maps covering more specific areas. A feature map library of the major features contained in the Bartholomew (GB) map data was built to provide as much detailed spatial information as possible. Spatial features are linked to their corresponding textual descriptions in HyperText Markup Language (HTML) pages. The features and their descriptions can be retrieved by either an interactive map interface or a search engine.

A virtual map library has been created. The library comprises a full UK national coverage directory and 55 sub-directories (one for each 100 km square), each named after the corresponding tile of the National Grid. ESRI Arc/Info Arc Macro Language (AML) scripts were written to generate the feature maps and their legends automatically from the data set. Arcplot is unable to export maps in GIF format, so SDSC Image Tools (produced by the San Diego Supercomputer Center) was used to convert maps from Sun raster to GIF format.

3.3 Map interface
An interactive map (sometimes referred to as a clickable image) is an inline image in an HTML document. The position of any mouse click within the image is captured using the HTML ISMAP attribute. When a user clicks on the image, the browser sends the pixel co-ordinates to the WWW server. A program on the server compares the mouse co-ordinates with the boundary location information in the virtual library imagemap database and returns the appropriate URL (HTML document). ISMAP thus provides a limited degree of spatial querying of maps.

The required number of co-ordinates depends on the shape of the region to be defined. For circles, two pairs of co-ordinates are required: the centre and any edge point. For rectangles, the co-ordinates of the upper-left and lower-right corners are required. For polygons, which may have at most 100 vertices, each co-ordinate pair stands for a vertex.
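
The server-side comparison of a click position against these region definitions can be sketched as follows. This is an illustrative Python version, not the imagemap program used by KINDS: the region list, the URLs and the function names are assumptions, and the polygon test uses a standard ray-casting method that the paper does not specify.

```python
import math


def in_circle(point, centre, edge_point):
    """Circle given by its centre and any point on its edge."""
    radius = math.hypot(edge_point[0] - centre[0], edge_point[1] - centre[1])
    return math.hypot(point[0] - centre[0], point[1] - centre[1]) <= radius


def in_rect(point, upper_left, lower_right):
    """Rectangle given by its upper-left and lower-right corners (pixels)."""
    return (upper_left[0] <= point[0] <= lower_right[0]
            and upper_left[1] <= point[1] <= lower_right[1])


def in_polygon(point, vertices):
    """Ray casting: count edge crossings of a ray running right from point."""
    x, y = point
    inside = False
    for i in range(len(vertices)):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % len(vertices)]
        if (y1 > y) != (y2 > y):  # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def resolve_click(point, regions, default_url):
    """Return the URL of the first region containing the click."""
    for shape, coords, url in regions:
        if shape == "circle" and in_circle(point, coords[0], coords[1]):
            return url
        if shape == "rect" and in_rect(point, coords[0], coords[1]):
            return url
        if shape == "poly" and in_polygon(point, coords):
            return url
    return default_url


# Hypothetical regions and URLs, for illustration only.
SAMPLE_REGIONS = [
    ("rect", [(100, 200), (150, 250)], "maps/SJ/index.html"),
    ("circle", [(300, 300), (300, 320)], "maps/london/index.html"),
    ("poly", [(0, 0), (40, 0), (0, 40)], "maps/NT/index.html"),
]

if __name__ == "__main__":
    print(resolve_click((120, 220), SAMPLE_REGIONS, "maps/index.html"))
```

Regions are tested in the order listed, so where two regions overlap the earlier entry wins.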

The ISMAP facility is a powerful tool for producing WYSIWYG ("what you see is what you get") interfaces. The interactive map provides an intuitive and easy to use method for the user to browse swiftly through the "layer-and-tile" structure of the Bartholomew data. The interactive map is based on a UK map using the National Grid and regional and county boundaries for geographical location referencing (Figure 3). Countries are marked in different colours so that users can easily point to an exact area of interest; for example, Scotland is blue, England red and Wales green. The user can simply move the mouse to an area (tile) and click on it to see detailed thematic data, which is linked to the KINDS map library of over 800 feature maps.

3.4 Free text search engine
A "free text" search engine provides a direct entry point for both expert and inexperienced users to quickly discover information about the data set. The interface uses a dialogue box in which the user enters a query, which is passed to the search engine. The engine then consults an index file linked to the KINDS thematic map library and returns the headings of documents which match the user's query.

The index file is an important element in the free text search engine. Spatial features originally organised based on National Grid references have been restructured to be linked with UK counties/regions, and major cities. Each entry in the file is indexed with both the two-character identification of tile and counties or regions as well as major cities covered by the tile. Thus the search engine complements the map interface by allowing the user to search for information about specific cities rather than using the technical tile structure.

The search engine executes a Perl script on the server. It filters out terms (those with fewer than two characters or starting with symbols such as "/", ".", "*") which may cause inaccurate results to be presented. Boolean (logical AND, OR) searches and right-hand truncation searches (with an "*" at the end of the search term) are supported to make searches effective and efficient.
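
The filtering, Boolean and truncation behaviour described above can be sketched as follows. The original engine is a Perl script; this Python version is only an illustration under stated assumptions. The index entries are invented examples, the function names are hypothetical, and a query is treated as either all-AND (the assumed default) or all-OR, since the handling of mixed operators is not described.

```python
# Hypothetical index: tile identifier -> counties, regions and cities covered.
INDEX = {
    "SJ": ["Cheshire", "Greater Manchester", "Manchester"],
    "TQ": ["Greater London", "London"],
    "NT": ["Lothian", "Edinburgh"],
}


def clean_terms(query):
    """Drop terms shorter than two characters or starting with / . or *."""
    return [t for t in query.split() if len(t) >= 2 and t[0] not in "/.*"]


def term_matches(term, words):
    """Exact match, or prefix match when the term ends with '*'."""
    term = term.lower()
    if term.endswith("*"):
        return any(word.startswith(term[:-1]) for word in words)
    return term in words


def search(query, index=INDEX):
    """Return the sorted tile identifiers whose index entry matches."""
    terms = clean_terms(query)
    use_or = any(t.upper() == "OR" for t in terms)
    keywords = [t for t in terms if t.upper() not in ("AND", "OR")]
    if not keywords:
        return []
    combine = any if use_or else all
    hits = []
    for tile, places in index.items():
        # Each entry is searchable by tile identifier and by place-name words.
        words = {tile.lower()}
        for place in places:
            words.update(place.lower().split())
        if combine(term_matches(k, words) for k in keywords):
            hits.append(tile)
    return sorted(hits)


if __name__ == "__main__":
    print(search("manch*"))
    print(search("london OR edinburgh"))
```

Indexing each tile under both its identifier and its place names is what lets a user search by city without knowing the tile structure.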

Help information, including a list of the indexed terms used in the search engine and information about its structure, has been added for inexperienced users. The user is also able to start querying by clicking on the hyperlinks listed in the help file.

4. Making Road Maps - an example for directly handling spatial data across WWW

4.1 Representation of UK Roads
A dynamic link between the WWW and the Bartholomew data using ESRI Arc/Info has been created to familiarise the user with working with spatial data. The Bartholomew (GB) data set is structured into 16 thematic data layers comprising over a thousand classes of features; the roads data layer was selected for the dynamic link experiment. The roads layer includes 22 classes of road features (see the list below) and each feature is available in both the national coverage and individual tiles.

The UK roads are well classified and structured in the data set. The following taxonomic conceptual model shows the hierarchical relations of roads.

Figure 4. The hierarchical structure of roads

The Bartholomew data is mounted on the MIDAS national data sets machine as Arc/Info coverages. To assess the suitability of the data set, users must first manipulate it within Arc/Info. Therefore users must already have considerable knowledge of spatial data handling before being able to make a decision about the use of MIDAS data. New users are thus placed at a considerable disadvantage. The KINDS experimental system reverses this situation by allowing users to manipulate the data set without having to directly use a geographical information system. A visual approach based on a clickable UK map (as in Figure 5) is adopted for choosing the coverage. The user can select features they wish to see using a simple forms interface.

4.2 WWW form-based interface - a front end to the map maker
WWW forms (or fill-out forms) are a computer equivalent of paper forms. When a user 'submits' a WWW form, the user's responses are transmitted to a program on the HTTP server for processing.

A form provides for complex interactions between the user and other software via programs residing in the HTTP cgi-bin (executable programs) directory.

A forms interface to the ESRI Arc/Info system has been created to allow WWW users to interact with spatial data (Figure 5). The interface allows the user to create simple maps which illustrate the contents and potential of the Bartholomew data set. The test application allows the user to select and display elements of the road data layer of the Bartholomew data set.

4.3 AML scripts coding - an automatic process
Submitting the WWW form activates Arc/Info after an AML script has been generated by a translation program residing in the HTTP server's common gateway interface (CGI). The translation program is implemented in Perl. The query string is split into items to determine where to access the data (i.e. which tile), which road features and additional themes are selected, and what colours should be used to mark these features.
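
The splitting of the query string can be sketched as follows. The KINDS translation program is written in Perl and its form field names are not given here, so the field names (tile, road, colour_<road>, theme), the defaults and the function name in this Python sketch are all assumptions made for illustration. The sketch stops at a structured request and does not generate AML.

```python
from urllib.parse import parse_qs


def parse_map_request(query_string):
    """Split a form query string into tile, road colours and extra themes."""
    fields = parse_qs(query_string)

    # Which tile to read; fall back to the national coverage (assumed default).
    tile = fields.get("tile", ["GB"])[0].upper()

    # Each selected road class, paired with its colour field if one was sent.
    roads = {}
    for road in fields.get("road", []):
        roads[road] = fields.get("colour_" + road, ["black"])[0]

    # Additional thematic layers to draw alongside the roads.
    themes = fields.get("theme", [])

    return {"tile": tile, "roads": roads, "themes": themes}


if __name__ == "__main__":
    example = ("tile=sj&road=motorway&road=primary"
               "&colour_motorway=blue&colour_primary=green&theme=urban")
    print(parse_map_request(example))
```

A structure of this kind holds the three pieces of information that the translation step needs before a script can be written: where the data is, what to draw, and how to colour it.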

An AML script can then be coded to produce a map reflecting the user's request. A file comprising map symbols and textual descriptions, used by the AML script to generate map keys (legends), is also produced. The result of the AML script is a 7" by 9" road map in Encapsulated PostScript (EPS) format. The EPS file is then converted into GIF format using the Image Tools package. The entire process normally takes about 60-80 seconds to complete and send a map (9k to 21k in size) back to the user's WWW browser, depending on the speed of the user's network. Figure 6 shows an example road map of south Manchester and Cheshire generated according to the user request in Figure 5.

Example of a Road Map (Figure 6)

After the Arc/Info working environment being set up,

4.4 Outline of the KINDS Map Maker
The following schema illustrates the major processes of the KINDS experimental system.

Figure 7. Outline of the KINDS map maker

5. Future development

KINDS is a three-year initiative, now at the end of its first 18 months of funding. In the remainder of the funding period further data sets and software packages will be added to the existing framework. The existing interface will be complemented by a knowledge base to guide users through use of the data. Knowledge acquisition for the development of the knowledge-based system (KBS) is underway and coding will commence shortly.


Acknowledgements

Andrew Johnston, of the KINDS team, deserves special mention as co-worker in carrying out the user surveys. In addition we wish to thank the other members of the KINDS team (Keith Cole, Kamie Kitmitto, Jim Yip and Andrew Basden), not least for their help and advice. The Knowledge-based Interface to National Data Sets (KINDS) project is a multi-institution initiative comprising MIDAS (Manchester Computing, UK), the Department of Environmental and Geographical Sciences (Manchester Metropolitan University, UK) and the Information Technology Institute (University of Salford, UK). KINDS is funded by the Joint Information Systems Committee (New Technology Initiative NTI-107).

Bartholomew digital data is Copyright Bartholomew and available to UK academics from MIDAS under the terms of the CHEST license agreement. We wish to thank Dr. Tim Rideout of Bartholomew for his support and helpful comments. Bartholomew may be contacted at Dr. Tim Rideout, 12 Duncan St., Edinburgh, EH9 1TA, UK.

ESRI, Arc/Info, Arcplot and ARC Macro Language are registered trademarks of the Environmental Systems Research Institute, Inc., Redlands CA, USA.


References


Andresen, D., Carver, L., Dolin, R., Fischer, C., Frew, J., Goodchild, M., Ibarra, O., Kothuri, R.,
Larsgaard, M., Manjunath, B., Nebert, D., Simpson, J., Smith, T., Yang, T. and Zheng, Q. (1995)
The WWW prototype of the Alexandria digital library.

http://alexandria.sdc.ucsb.edu/public-documents/papers/japan-paper/ 

Berners-Lee, T (1993)  The WWW initiative and HTTP, HTML, etc.  CERN WWW Documentation.

http://www.w3.org/    

Boston, T. and Stockwell, D. (1994) Interactive species distribution, mapping and modelling
using World Wide Web, Proceedings of the Second International WWW Conference 94, Chicago, USA.

http://kaos.erin.gov.au/database/WWW-Fall94/species_paper.html

Crossley, D and Boston, T (1995) A generic map interface to query geographic information 
using the World Wide Web.

http://www.w3.org/pub/Conferences/WWW4/Papers/australia/


Davis, C and Medyckyj-Scott, D (1994)  GIS usability: recommendations based on the user's
view.  International Journal of Geographical Information Systems, vol.8, no.2, pp175-189.

Frew, J., Carver, L., Fischer, C., Goodchild, M., Larsgaard, M., Smith, T. and Zheng, Q. (1995)
The Alexandria rapid prototype: building a digital library for spatial information.

http://www.esri.com/resources/userconf/proc95/to300/p255.html

Fotheringham, A.S. and Rogerson, P.A. (1993) GIS and spatial analytical problems.
International Journal of Geographical Information Systems, vol.7, no.1, pp3-19.

GENIE Project (1994).

http://www-genie.lut.ac.uk/info.html


Li, C S; Kitmitto, K and Cole, K (1995)  Developing a WWW interface to Arc/Info.  Proceedings
of ESRI'95 (UK) user conference, Nottingham University, Sept 14-15, 1995.

Li, C S; Moss, A et al (1995) Access Large and Complex Data sets via WWW.  Proceedings of
NTTS '95,  Nov 20-22, 1995, Bonn, Germany.

Massem, P (1994)  Demo of the interface between Arc/Info and the web. 

http://www.geo.ed.ac.uk/home/RESEARCH/MASSEM.HTML


McLaughlin, J. and Nichols, S. (1994) Developing a national spatial data infrastructure.  
Journal of Surveying Engineering ASCE 120 2 pp62-76.

McCool, R (1994)  The Common Gateway Interface, NCSA WWW Documentation.

http://hoohoo.ncsa.uiuc.edu/cgi/overview.html 


McGranaghan, M (1991) Matching Representations of Geographic Locations.  In Mark, D M and
Frank, A U (eds), Cognitive and Linguistic Aspects of Geographic Space, pp387-402.

Medyckyj-Scott, D and Blades, M (1991)  Cognitive representations of space in the design and
use of geographical information systems.  People and Computers VI, British Computer Society
Conference Series, 1991, pp421-434.

Petch, J; et al (1995)  The KINDS project.  Proceedings of NTTS '95,  Nov 20-22, 1995, Bonn,
Germany.

Petch, J; Moss, A; Johnston, A and Yip, J (1995)  Spatial data services: analysis of user needs. 
(to appear)

Plewe, B (1994) The GeoWeb project.  Proceedings of the Second International WWW Conference 94, 
Chicago, USA.

http://wings.buffalo.edu/~plewe/paperwww.html


Raper, J. and Green, N. (1992) Teaching the principles of GIS: lessons from the GISTutor
project. International Journal of Geographical Information Systems, vol.6, no.4, pp279-290.

Walker, D.R.F., Newman, I.A., Medyckyj-Scott, D.J. and Ruggles, C.L.N. (1992) A system for
identifying data sets for GIS users. International Journal of Geographical Information Systems,
vol.6, no.6, pp511-527.