Spatial analysis of problems of this sort can be handled very naturally through hierarchical statistical modeling, where there is a measurement process at the first level, an explanatory process at the second level, and a prior process at the third level. The resulting models are heteroskedastic and spatial, and the method of statistical analysis is Bayesian.
In our paper, we shall feature epidemiological data, reflecting the importance of disease mapping to society in general. Here, the "polygons" are known as "small areas", a term that has come to mean any set of regions that together make up a larger region of interest. There are a number of issues related to the display, analysis, and interpretation of spatial epidemiological data that we believe are important:
* Improved small-area estimation with a focus on identifying extreme values.
Maps constructed using raw disease-incidence rates or standardized rates do not account for variation in the precision with which these rates estimate true underlying rates, because there are unequal numbers of person-years-at-risk across small areas. Hierarchical statistical analysis avoids this problem because the resulting estimates average small-area disease-incidence rates with regional or national data. Unfortunately, the resulting estimates for low-population areas can be "overly smooth" in the sense that such areas are less likely to be identified as locations of increased risk even when they do in fact have high risk. Through the use of appropriate loss functions, we propose to construct small-area estimates that facilitate the identification of areas of high risk.
* Assessing the fit of the statistical model and determining if high-risk locations have unusually high risk.
The use of statistical models to provide improved small-area estimates introduces the chance that model misspecification will lead to misleading or erroneous policy conclusions. Specifically, it remains to determine whether regions identified as having high risk for disease incidence indicate model failures or new potential risk factors.
* Relating small-area data to point-level epidemiological mechanisms.
Disease-incidence or mortality data are usually reported for small areas, although the increased emphasis on data collection makes it likely that individual incidence data will be available in the future. Regardless, data regarding environmental risk factors are likely to be collected on different geographical scales than the disease-incidence data. Moreover, certain risk factors are determined at the individual or point level.
Statistical models are needed that allow for the integration of individual mechanisms with small-area data and for the possibility of aggregation of some small areas into larger small areas. GIS will play an important role in managing data of different aggregations and in displaying the results of the hierarchical statistical analyses referred to above.
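The "overly smooth" behaviour described under the first issue above can be illustrated with a minimal sketch. It assumes a conjugate Poisson-gamma model with fixed hyperparameters, which is a common textbook choice and not necessarily the model used in this work; the names Area, posterior_mean_risk, shrinkage_weight, SHAPE, and RATE are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Area:
    name: str
    observed: int    # observed case count y_i
    expected: float  # expected count E_i, proportional to person-years-at-risk


def posterior_mean_risk(area: Area, shape: float, rate: float) -> float:
    """Posterior mean of the relative risk theta_i, assuming
    y_i ~ Poisson(E_i * theta_i) and theta_i ~ Gamma(shape, rate)."""
    return (shape + area.observed) / (rate + area.expected)


def shrinkage_weight(area: Area, rate: float) -> float:
    """Weight placed on the area's own raw rate y_i / E_i;
    the remaining weight goes to the prior mean shape / rate."""
    return area.expected / (rate + area.expected)


# Assumed hyperparameters (not estimated from data): prior mean risk = 1.0.
SHAPE, RATE = 5.0, 5.0

# Two hypothetical areas with the same raw rate (4.0) but very different
# amounts of person-years-at-risk.
areas = [Area("small", 4, 1.0), Area("large", 400, 100.0)]

for area in areas:
    raw = area.observed / area.expected
    smoothed = posterior_mean_risk(area, SHAPE, RATE)
    weight = shrinkage_weight(area, RATE)
    print(f"{area.name}: raw={raw:.2f} smoothed={smoothed:.2f} weight={weight:.2f}")
```

Under these assumptions both areas have a raw rate of 4.0, but the low-population area is pulled down to a smoothed estimate of 1.5 while the high-population area stays near 3.86. This is the sense in which a genuinely high-risk small area can be masked by posterior-mean estimates.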
Research presented in this talk is joint with Hal Stern and Deanne Reber of the Department of Statistics, Iowa State University.
Between 1976 and 1983 he was Lecturer and Senior Lecturer at The Flinders University of South Australia. Since 1983 he has been Professor of Statistics and, since 1993, Distinguished Professor in Liberal Arts and Sciences at Iowa State University. He is soon to leave Iowa State to become Professor of Statistics at The Ohio State University.
He is the author of over 120 articles in refereed journals and of two books, the most recent being "Statistics for Spatial Data, rev. edn", published by John Wiley and Sons in 1993. His research interests are in the statistical modeling and analysis of spatio-temporal data, including statistical image analysis and remote sensing.
Dr Cressie is a Fellow of the American Statistical Association and The Institute of Mathematical Statistics, and he is an Elected Member of the International Statistical Institute.