IRIS: a Tool to Support Data Analysis with Maps

Gennady and Nathalia Andrienko
GMD - German National Research Center for Information Technology
Schloss Birlinghoven, Sankt-Augustin, D-53754 Germany
e-mail: gennady.andrienko@gmd.de
www: http://allanon.gmd.de/and/and.html
Tel: +49-2241-142329
Fax: +49-2241-142072
 
IRIS is a software system designed to assist users in exploring spatially referenced statistical data, such as economic, demographic or ecological data related to geographical locations. Analysis of such data is impossible without representing them on maps. The software systems known as GIS (Geographic Information Systems) can be used for data mapping, but, despite their power in handling geometric information, their visualization facilities need to be improved. The most serious shortcoming is that the user receives no guidance in selecting presentation techniques for the data to be analyzed. Yet finding an adequate presentation is crucial for successful analysis and correct conclusions, and it requires special knowledge from the field of thematic cartography. IRIS incorporates this knowledge in the form of generic, domain-independent rules. This allows it to automatically generate thematic maps that correctly present the user's data.

By generating maps automatically, IRIS frees the user from having to think about how to present her/his data and from the routine work of map building, and allows her/him to concentrate on data analysis. To get a cartographic presentation with IRIS, the user only needs to select the fields to be presented.

To choose adequate presentation techniques for given data, IRIS takes into account data characteristics (types of fields: numeric, categorical, logical; number of different values or value range of a field) and relations among data components (whether the fields to be analyzed are comparable, whether they can be summed to produce some meaningful total, whether one of the fields is included in another). Some of the techniques emphasize these relationships (see Fig.1).
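As an illustration only, rule-based selection of this kind can be sketched in a few lines of Python. The rules, the technique names and the function candidate_techniques below are hypothetical and much simpler than the cartographic knowledge built into IRIS; the sketch only shows how field types and relations among fields can drive the choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    kind: str  # "numeric", "categorical" or "logical"


def candidate_techniques(fields, comparable=False, summable=False):
    """Return the presentation techniques applicable to the selected fields.

    The rules are illustrative assumptions, not the actual IRIS rule base.
    """
    if not fields:
        return []
    kinds = {f.kind for f in fields}
    if len(fields) == 1:
        # One numeric field: shades of one color (choropleth map).
        if kinds == {"numeric"}:
            return ["choropleth"]
        # One categorical or logical field: a distinct color per value.
        return ["qualitative colors"]
    if kinds == {"numeric"} and comparable:
        # Comparable numeric fields can share one scale, so bars apply.
        techniques = ["bars"]
        if summable:
            # Parts of a meaningful total can also be shown as pies.
            techniques.append("pies")
        return techniques
    # Fallback assumption: unrelated fields are shown on separate maps.
    return ["one map per field"]


# Usage: age groups are comparable and sum to the total population.
age_groups = [Field("children", "numeric"), Field("adults", "numeric")]
print(candidate_techniques(age_groups, comparable=True, summable=True))
```

Returning every applicable technique, rather than a single best one, leaves the final choice to the user.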

Different presentation techniques provide different opportunities for analysis. For example, bars allow easy estimation of absolute values and differences, pies are good for seeing proportions, and painting area objects in colors or shades according to the values of an attribute gives an overall view of the variation of this attribute across the territory. All these opportunities can be useful during exploratory data analysis, when the user does not know beforehand the inherent features of the data. Therefore, whenever different presentations of the same data are possible, IRIS offers all of them so that the user can switch between them in the process of data analysis.

Maps on a computer screen should not be mere reproductions of paper maps. The new (compared with the history of cartography) output medium offers new opportunities for analysis: a map can be dynamic and reactive to the user's actions. In IRIS we develop tools for interactive manipulation of maps that are intended to strengthen the potential of different visualization techniques in data exploration. By "interactive manipulation" we do not mean such basic operations as zooming or direct access to data values through the map. Our idea is that each presentation method requires a specific interactive manipulation tool that exploits the principal features of the method and helps in the kind of data analysis this method is suitable for. We expect that in this way the user can utilize the potential of each presentation more completely and effectively.

An example of interactive manipulation is visual comparison, designed for so-called choropleth maps, which represent values of a numeric attribute by the color shades in which objects are painted: the greater the value, the darker the color. A choropleth map is good for studying the spatial distribution of attribute values: colors are promptly perceived by a human, and similarly colored neighboring spots tend to be perceived together as larger patterns (figures, images), which favors finding interesting spatial patterns. By comparing two or more choropleth maps one can reveal relationships among several attributes: relatedness will manifest itself in similar patterns.
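The choropleth principle (the greater the value, the darker the shade) can be illustrated with a minimal Python sketch. The linear scaling, the gray palette and the function name choropleth_shade are assumptions made for this example; they do not describe how IRIS computes its shades.

```python
def choropleth_shade(value, vmin, vmax):
    """Map a value to an (R, G, B) gray shade: greater value, darker shade.

    Assumes linear scaling between vmin and vmax; values outside the
    range are clamped to the lightest or darkest shade.
    """
    if vmax == vmin:
        t = 0.0  # degenerate range: use the lightest shade
    else:
        t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))
    # 255 is the lightest level; the darkest level is kept at 51 rather
    # than 0 so that object boundaries remain visible (an assumption).
    level = round(255 * (1.0 - 0.8 * t))
    return (level, level, level)


# Usage: shades for percentages between 10 and 40.
for percent in (10, 25, 40):
    print(percent, choropleth_shade(percent, 10, 40))
```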

In "visual comparison" some number N between the bounds of the value range of the shown attribute is selected, and the map is redrawn so that values greater than N are depicted by shades of green and those less than N by shades of cyan. The greater the difference between a value and N, the darker the shade used to represent it. Values exactly equal to N are shown in light yellow (see Fig.2).
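The coloring rule of visual comparison can be sketched in Python as follows. The hue names follow the description above; the darkness measure, the use of one common scale on both sides of N and the function name visual_comparison_color are assumptions of this sketch, not details of the IRIS implementation.

```python
def visual_comparison_color(value, n, vmin, vmax):
    """Return (hue, darkness) for a value compared with the reference N.

    hue is "green" above N, "cyan" below N and "light yellow" at N.
    darkness lies in [0, 1] and grows with the distance from N.
    """
    if not vmin <= n <= vmax:
        raise ValueError("N must lie within the value range")
    if value == n:
        return ("light yellow", 0.0)
    # Assumption: both sides share one scale, so equal distances from N
    # get equally dark shades in either hue.
    span = max(vmax - n, n - vmin)
    darkness = 1.0 if span == 0 else min(1.0, abs(value - n) / span)
    hue = "green" if value > n else "cyan"
    return (hue, darkness)


# Usage: percentages between 10 and 40 compared with N = 20.
for percent in (10, 20, 30, 40):
    print(percent, visual_comparison_color(percent, 20, 10, 40))
```

Moving N and recoloring all objects with this rule is what makes the comparison interactive: each choice of N splits the map into a green and a cyan part.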

Thus, visual comparison adds color hue to the expressive means used in the map. This encourages visual grouping of objects: neighboring objects painted in the same color tone tend, despite differences in shades, to be perceived together as a single figure. This favors revealing spatial patterns. In our example (Fig.2) it is clearly seen that the lowest percentage of children (less than the lower quartile) in Bonn is found in the center of the city.

In addition to visualization and related interactive operations, IRIS contains such facilities for data analysis as querying, spreadsheet-style calculations and generation of derived attributes. IRIS is implemented in two variants: as a program running on PCs under Windows and as a WWW application with a Java interface running under any Java-capable WWW browser. The system can be accessed remotely at the address: http://allanon.gmd.de/and/java/iris. The WWW variant of the system has been accessed more than 4500 times by people all over the world. IRIS was included in the Top 1% web applets and Top 10 web applets lists (September 1996) by the independent Java Applet Rating Service.