Paul Longley
Department of Geography, University of Bristol


Position Statement

1. Introductory comments

'Econometric theory is like an exquisitely balanced French recipe, spelling out precisely with how many turns to mix the sauce, how many carats of spice to add, and for how many seconds to bake the mixture at exactly 474 degrees of temperature. But when the statistical cook turns to the raw materials, he finds that hearts of cactus fruits are unavailable, so he substitutes cantaloupe; where the recipe calls for vermicelli he uses shredded wheat; and he substitutes green garment dye for curry, ping pong balls for turtle's eggs, and, for Chalifougnac vintage 1883, a can of turpentine.' Valavanis 1959: 83, quoted in Kennedy 1979.
 

   Ever since undergraduate days in Bristol in the 1970s, I have felt fully imbued with the quantitative locational analysis tradition in geography - not least because some of the origins of the approach can be traced to Bristol in the 1960s. Yet I share the frustration aired in some of the other position papers that the spatial 'mainstream' of geography has been sidelined in the major geography journals, that it accounts for a reduced real share of intellectual activity in the subject, that its interdisciplinary outreach has been limited, and that today's GIS practice appears to develop largely separately from academia. (At least GIS and RS together make up one of six 'specialisms' that are key to UK central government's ranking of subject performance in what remains a mainstream discipline.) I should like to comment on the way that research practice, and data handling in particular, may contribute to this state of affairs, and to suggest how reconfiguring some priorities in spatial analysis might be beneficial.

   Spatial analysis, like econometrics, has benefited from the proliferation of digital data sources in recent years. Today's spatial data models allow far 'thicker' depictions of geographical reality to be created than those I cut my own teeth on. The transformational (Martin 1996) or simplifying assumptions entailed in building GIS-based models of real world spatial distributions have become much less heroic as a consequence. And of course it is well known that developments in computer hardware remain more or less commensurate with the increase in available data, making it possible to explore and model spatial interactions in more detail than ever before. Viewed in this context it is paradoxical that, in the UK at least, there is less faith than ever in 'predict and provide' approaches to planning, that business and service planning is turning away from conventional spatial analysis and that some of the spatial analysis community view essentially 'black box' techniques with increasing favour. Why is this the case?

   In the socio-economic realm one suggestion might be that any increase in the sophistication of analytical models has been more than outpaced by increases in the complexity of the systems themselves - witness, for example, the scale and pace of change in the physical forms of urban systems, or the fragmentation of household consumption patterns and lifestyles. A variant on this theme is offered by Curry (1995) and others, who seem to suggest that the quality of digital data can never be adequate for the resolution of significant problems of real world concern. A third suggestion, which is the one I will pursue here, is that the research community should refocus effort away from abstract semantic discussion or analytical elegance and towards the messy empirical problems of data integration. This should be done in as rational, orderly and application-centric a fashion as possible.

   Goodchild and Longley (1999) appraise the 'linear project design' as a model for contemporary research in natural and social science. For generations of students, the formulation of research hypotheses has been followed by choice of a data collection method (and designing a survey schedule, as appropriate), identifying a sample design, piloting, field collection of data (with verification and resampling), collation of results, analysis, and report-writing. They reflect that, although this robust and defensible schema has underlain generations of student dissertations, it was never a panacea in practice, for reasons of data resolution, surrogacy and timeliness - and the amount of funding available for scientific research (we're all researchers now!). Today's GIS environment is also characterised by datasets which are collected by many different means and which pass through many hands. Many of the problems of data resolution, surrogacy and timeliness are today less problematic, yet more data are second hand and more data are collected using unscientific research designs (indeed they are often not principally collected for 'research' at all).

2. The developing digital data infrastructure
2.1 Changes in supply, pricing and access
In physical and social science alike, the costs of data have generally been a (sometimes the) major component of the costs of GIS creation. The order of magnitude of data costs reflects a number of technological and secular imperatives which govern the supply, pricing, and access aspects of data availability.

   In the early days of GIS, the ‘data bottleneck’ of (manual or semi-automated) digitising presented a major impediment to the creation of spatially-referenced databases, particularly if the hard copy source documents were complicated or ambiguous. Early software systems provided (by present day standards) fairly unsophisticated procedures for detecting and correcting the results of error-prone digitising. Moreover, ‘framework’ spatial data, such as those created and maintained by national mapping agencies were available only in hard-copy printed form, and in the early days of GIS there was resistance to initiating the task of converting ‘legacy’ hard copy maps to digital form.

   A wealth of digital data has since come into existence. First, and as with computer hardware, new technology is playing an important role. In particular, the wide (selective) availability of global positioning systems makes creation of new digital datasets much more straightforward than hitherto. Second, most national mapping agencies have gradually overcome their initial reluctance to create digital versions of their paper records, while at smaller scales private providers have created a range of digital atlas products. And third, computerised logging of the physical and social environment takes place with ever-increasing frequency, and to ever-greater levels of detail—for example through high-resolution remote sensing of the physical and built environments and the digital encoding of consumer purchasing behaviour (through loyalty programmes and the development of ‘relationship marketing’) in the socioeconomic realm.

   Yet this has not created a panacea for data modelling. In practice, accurate field recording of data remains an expert task and sound geographical analysis presumes sound data standards. Many national mapping agencies (such as Great Britain’s Ordnance Survey) have only succeeded in ‘going digital’ in the face of increasingly stringent public expenditure constraints by recovering vastly increased proportions of their creation and maintenance costs through user charges: the inevitable consequence is a rationing of framework data on an ‘ability to pay’ basis. Similarly hawkish data pricing regimes may apply to the data products from the new generation of high-resolution satellite sensors, while high royalty charges dissuade many business users from census data and census data products in some countries (such as the UK). At the same time, governments are reluctant to fund even their traditional linear project design-driven surveys, in view of the apparent tide of information created using new data capture technologies. With respect to the academic realm, the rise of interdisciplinary science is leading to a higher incidence of jointly-funded projects, and the commonplace situation in which the creators of spatial data may be widely separated from some of the communities of end users. As creators and users of data become more and more separated, in space, time, and intellectual tradition, the ability to describe data becomes increasingly critical. The creator must be able to tell the user about methods, accuracies, formats, and all of the details needed to transfer, open, and make effective use of the data. Moreover the user must be able to determine whether a given data set meets or falls short of requirements, and this is increasingly accomplished through metadata.
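
   The role of metadata described above can be made concrete with a small sketch. It is purely illustrative: the field names, thresholds and example values are assumptions made for this example and do not follow any particular metadata standard. The idea is only that a creator's description of a data set can be checked mechanically against a user's stated requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metadata:
    """Hypothetical, minimal description of a data set supplied by its creator."""
    title: str
    collection_year: int
    spatial_resolution_m: float   # size of the smallest resolvable unit, in metres
    positional_accuracy_m: float  # stated positional error, in metres
    file_format: str


@dataclass(frozen=True)
class Requirements:
    """Hypothetical statement of what a user needs for one application."""
    earliest_year: int
    max_resolution_m: float
    max_positional_error_m: float
    accepted_formats: frozenset


def shortfalls(meta: Metadata, req: Requirements) -> list:
    """Return the reasons a data set falls short; an empty list means it meets requirements."""
    problems = []
    if meta.collection_year < req.earliest_year:
        problems.append("out of date")
    if meta.spatial_resolution_m > req.max_resolution_m:
        problems.append("too coarse")
    if meta.positional_accuracy_m > req.max_positional_error_m:
        problems.append("insufficiently accurate")
    if meta.file_format not in req.accepted_formats:
        problems.append("unsupported format")
    return problems


# Illustrative values only; they do not describe any real product.
census = Metadata("small area census statistics", 1991, 500.0, 50.0, "csv")
need = Requirements(1995, 100.0, 25.0, frozenset({"csv", "shp"}))
print(shortfalls(census, need))
```

   A check of this kind is only as good as the creator's description: if methods or accuracies are not recorded, the user has nothing to test against.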

2.2 The changing remit and requirements of modelling
The early years of the spatial analysis paradigm were associated with the development of wide-ranging models of physical and social systems. The remit of such models was avowedly ambitious, yet on reflection the data infrastructure was not commensurate with the tasks in hand. A number of commentators have identified reasons for the subsequent demise of large scale socioeconomic modelling activity, although the innovation of GIS has brought with it a renaissance in model-building activity. Moreover, any decline in large-scale modelling of socio-economic systems has been matched by the rapid growth of environmental modelling, much of it coupled with or otherwise making use of GIS.

   The new is quite different from the old, however. Within the socioeconomic realm, Birkin (1996) has described how the current generation of spatial interaction models, for example, seeks only to model limited (in terms of spatial extent, time frame and attribute range) aspects of urban sub-systems. This in part reflects secular trends in all developed societies away from system-wide planning, yet it also reflects a profound reappraisal of what we now consider to be the appropriate domain and capability of analytical models. Today’s urban models are much more data-rich in two respects. First, the revolution in the supply and availability of geographical information means that data no longer represent coarse zonal aggregations, and thus that the data model of spatial distributions bears a closer correspondence with reality. Second, the first generation of urban models used data derived exclusively from public sector sources and which were thus restricted to the limited range of variables of interest to officialdom. Whilst such data can be used, singly or in combination, to create crude indicators of human behaviour and activity patterns, such indicators bear at best a very imperfect correspondence with reality.

   Within the socioeconomic realm, the present status of modelling is rather ambiguous. Within academia, disenchantment with urban modelling leaves it as an area of activity with a significantly reduced real share of intellectual activity compared to, say, twenty years ago. Business applications of data-rich partial models of components of urban systems are buoyant, and today client repeat purchases provide vindication of the validity of spatial interaction and other modelling approaches. Within planning, there has never been a greater need for accurate data and analytical models of urban systems, because the scale and pace of change have never been greater. Yet, in the UK at least, there is disquiet about the ‘predict and provide’ approach to planning which has hitherto been based upon aggregate modelling approaches.

2.3 Model linkage: towards a new perspective?
The linear project design presumed that resources were available for a linear, vertically integrated sequence of events. Today’s research environment is much less straightforward. The strictures of public expenditure make it less likely that large-scale purpose-specific research will be funded, while information commerce makes it far from certain that the best secondary data will be available. Yet data warehouses are bursting with data that might be combined to create richer profiles of landscapes, morphologies, households, and activity patterns than have ever been created before. While the developing geocomputation paradigm presents us with some ‘brute force’ mechanisms for searching out generalisations from large and complex datasets, we may have no way of knowing whether such generalisations hold any scientific validity.

   A negative view of this research environment would suggest that a price has been put on scientific truth that lies beyond the budget of many researchers. There is some truth in this, yet economic imperatives need also to be viewed in their technological context. In truth, as our retrospective of urban modelling above has illustrated, data collected through the linear project design did not provide a panacea in practice. Today’s digital data infrastructure is more detailed, relevant, and up-to-date than ever before. The problem is that this infrastructure is also more piecemeal, and hence possibly ill-founded and unsafe.

   The environment for spatial analysis is GIS, which has always been an applications-led technology. The sophistication of current applications requires a breadth and depth of data that could never have been sustained by established data collection methods. Today’s open and desk-top GIS alike are geared towards the analysis of application-specific ‘horses for courses’ datasets. Such datasets are required to model real-world systems that are dynamic and fast-changing, and thus the timescale between data collection and availability of secondary analysis needs also to be shortened. Our understanding of physical and social systems alike is now of such sophistication that infrequently collected, aggregate, and surrogate spatial data are simply not good enough. These are all crucial considerations, yet they all lie outside the remit of the linear project design. Are we therefore faced with a stark choice between scientific validity and ‘making do’ with inappropriate, overly-aggregate, out-of-date indicators? The rejection of Census-based geodemographics in favour of lifestyles (i.e. data warehouse) analysis in much of business geographics suggests that the road to scientific truth is no simple one-way street, and that proponents of inductive data-led thinking have their supporters in the world of application.

   Framed in these terms, one of the big questions for GIS at the turn of the millennium must be: Can the new digital data infrastructure be assembled together in a sufficiently accurate, orderly and rational way to bridge relevance, richness and academic respectability? Goodchild and Longley (1999) use the term ‘concatenation’ to describe the integration of two or more different data sources, such that the contents of each are accessible in the product. The polygon overlay operation is one simple form of concatenation. They use the complementary term ‘conflation’ to describe the range of functions that attempt to overcome differences between data sets, or to merge their contents. Conflation thus attempts to replace two or more versions of the same information with a single version that reflects the pooling, or weighted averaging, of the sources.
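
   The distinction between the two terms can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the procedure of Goodchild and Longley (1999): the zone keys, household counts and weights are hypothetical, and a simple key-based join stands in for a true geometric overlay.

```python
def concatenate(source_a: dict, source_b: dict) -> dict:
    """Join two sources on a shared zone key, keeping the contents of each accessible."""
    shared = sorted(source_a.keys() & source_b.keys())
    return {zone: {"a": source_a[zone], "b": source_b[zone]} for zone in shared}


def conflate(value_a: float, value_b: float, weight_a: float, weight_b: float) -> float:
    """Replace two versions of the same quantity with a single weighted average."""
    total = weight_a + weight_b
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return (weight_a * value_a + weight_b * value_b) / total


# Hypothetical household counts for the same zones from two sources.
census_households = {"zone1": 120, "zone2": 80}
lifestyle_households = {"zone1": 130, "zone2": 70, "zone3": 40}

# Concatenation: both versions remain visible side by side.
# zone3 is dropped because only one source covers it.
joined = concatenate(census_households, lifestyle_households)

# Conflation: one pooled value per zone, here trusting source "a" three times as much.
pooled = {
    zone: conflate(pair["a"], pair["b"], weight_a=3.0, weight_b=1.0)
    for zone, pair in joined.items()
}
print(joined)
print(pooled)
```

   The choice of weights is the substantive decision in conflation; the arithmetic itself is trivial, which is why the quality information attached to each source matters so much.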

3. Model linkage in practice
3.1 RS–GIS concatenation
Census information and satellite imagery are diverse sources of information. Longley and Mesev (1997) use information from the 1991 UK small area census statistics as ancillary information to improve the classification accuracy of a contemporary (LANDSAT TM) image of Bristol. Information from the Census is used to assist in sample training and post-classification sorting. The resultant hybridised dataset is designed with a specialised purpose in mind—to provide detailed data models of the distribution of population and domestic property. This is used to reappraise conventional analysis of the density at which urban space is occupied—and through comparisons Longley and Mesev (1997) develop density gradient profiles for different categories of urban space filling, such as ‘built form’, ‘residential’, ‘households’, and ‘population’. They demonstrate that the differences between these apparently similar categories are more than semantic, and can heavily condition whether and to what extent we might consider density profiles characteristic of particular settlement types. The optimistic message of this work is that, once the differences between different conceptions of ‘urbanity’ have been clearly grasped, it is possible to develop a range of customised indicators of urban morphology. In this way, customised GIS-based data models are informing our thinking about the ways in which urban settlements fill space, as well as providing detailed information as to the morphology of particular settlement structures.
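
   As a rough indication of what a density profile involves, the sketch below averages a density value over concentric distance bands around a chosen centre. It is an assumption-laden illustration and not the method of Longley and Mesev (1997): the cell coordinates, density values, centre point and band width are all hypothetical.

```python
import math
from collections import defaultdict


def density_profile(cells, centre, band_width):
    """Mean density per concentric distance band around a centre point.

    cells: iterable of (x, y, density) tuples in consistent units.
    Returns {band_index: mean_density}, with band 0 nearest the centre.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    cx, cy = centre
    for x, y, density in cells:
        band = int(math.hypot(x - cx, y - cy) // band_width)
        sums[band] += density
        counts[band] += 1
    return {band: sums[band] / counts[band] for band in sorted(sums)}


# Hypothetical cells (x km, y km, density) for two categories of urban space filling.
built_form = [(0.5, 0.0, 40.0), (0.0, 0.8, 60.0), (1.5, 0.0, 30.0), (0.0, 2.5, 10.0)]
population = [(0.5, 0.0, 20.0), (0.0, 0.8, 40.0), (1.5, 0.0, 45.0), (0.0, 2.5, 15.0)]

print(density_profile(built_form, (0.0, 0.0), 1.0))
print(density_profile(population, (0.0, 0.0), 1.0))
```

   In this invented example the two categories give differently shaped profiles for the same settlement, which is the kind of contrast the text describes as more than semantic.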

3.2 Conflating geodemographics and lifestyles
‘Lifestyles’ is a broad term that has been used to describe data pertaining to the consumption of a wide range of goods and services by identifiable individuals and households. Lifestyles data originate from a diverse range of sources, such as guarantee card returns, questionnaires attached to nationally circulated prize draw entries, and market research surveys. They are usually georeferenced through the postcode system (e.g. in the UK to the unit postcode, which typically comprises 15 or so addresses in urban areas). At least one UK ‘data warehouse’ estimates that it holds up-to-date information on 11 million UK households. Such data have evident use for direct marketing, for past consumption habits are key guides to future behaviour. Harris (1999) has analysed the anonymised individual/household records from one particular lifestyles questionnaire which was mailed out in October 1996. The number of respondents to this survey constitutes 10.8% of all households in Bristol, UK (population 636,000): this makes the survey larger in size than a mini census, yet the characteristics of non-respondents are likely to differ markedly from those of respondents. In recent years, lifestyles approaches have gained some ground as tools for geomarketing at the expense of the use of census and composite geodemographic indicators, because the latter are increasingly out of date (the last UK Census was held in 1991), they are expensive to use because of UK royalty structures and, perhaps most damning of all, the census contains too few variables that bear an identifiable correspondence with consumer behaviour (most notably in the UK, because of the absence of an income question in the Census).

   The ‘geodemographics–lifestyles’ debate thus epitomises the tensions described in Section 2 above. Geodemographics is based on tried and trusted techniques and derives from a dataset (the Census) which has been designed and implemented using the most rigorous research design principles; and yet at the end of the day, it is out of date, and can supply at best only very imperfect indicators of real-world consumer behaviour. Sampling theory tells us that reweighting of largely self-selecting samples on the basis of sub-group response rates is foolhardy; yet survey research practice tells us that quantitative indicators should be direct and transparent, and that survey results are only directly applicable to the population from which the respondents were drawn (few of us would wholly identify with our digital past-selves who filled out a census form at the start of this decade).
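
   For readers unfamiliar with the reweighting at issue, the following sketch shows the arithmetic of weighting respondents by sub-group response rates. The group names, counts and spending figures are hypothetical. The sketch also makes plain the assumption that sampling theory objects to: within each sub-group, self-selected respondents are treated as if they resembled non-respondents.

```python
def response_rate_weights(population_counts: dict, respondent_counts: dict) -> dict:
    """Weight for each sub-group = population count / respondent count."""
    weights = {}
    for group, population in population_counts.items():
        respondents = respondent_counts.get(group, 0)
        if respondents == 0:
            raise ValueError(f"no respondents in group {group!r}; cannot reweight")
        weights[group] = population / respondents
    return weights


def weighted_mean(group_means: dict, respondent_counts: dict, weights: dict) -> float:
    """Estimate a population mean from per-group respondent means and weights."""
    numerator = sum(group_means[g] * respondent_counts[g] * weights[g] for g in weights)
    denominator = sum(respondent_counts[g] * weights[g] for g in weights)
    return numerator / denominator


# Hypothetical figures: renters are heavily under-represented among respondents.
population = {"owner": 600, "renter": 400}
respondents = {"owner": 90, "renter": 10}
mean_spend = {"owner": 50.0, "renter": 20.0}

unweighted = sum(mean_spend[g] * respondents[g] for g in respondents) / sum(respondents.values())
weights = response_rate_weights(population, respondents)
reweighted = weighted_mean(mean_spend, respondents, weights)
print(unweighted, reweighted)
```

   In this invented case each renter who responded stands in for forty households, so the reweighted estimate rests very heavily on ten self-selected returns.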

   A middle path between these two lies in Batey and Brown’s (1995) assertion that lifestyle descriptors can be used as a wrapper to add depth to the labels assigned to different geodemographic groups. Thus, for example, the SuperProfiles category ‘affluent achievers’ has fairly distinctive Census characteristics in terms of house construction type, socio-economic status and car ownership, to which lifestyle labels about theatre and restaurant patronage, share registers, newspaper readership, and credit card usage are added. The data from which these labels are obtained are in many cases collected by unscientific means or strictly pertain only to coarser aggregations of households. Harris’s (1999) cluster analysis of (unweighted) lifestyle data finds some practical validity to this approach, yet it nevertheless runs roughshod over conventional views about how scale and aggregation issues should be tackled.

4. The future of spatial analysis
Goodchild and Longley (1999) suggest that the kinds of circumstances and imperatives presented in the preceding discussion will lead to the emergence of the following kinds of spatial analysis in the coming years:

(source: after Goodchild and Longley 1999)

   This statement has highlighted the way in which the advanced information economy of the late 1990s has multiplied the number of potential sources of (rich) digital information, yet in ways which will be less standardised and less project-specific than those implied by the linear project design. A major challenge to the GIS community is to devise methods to reconcile diverse datasets with different data structures or spatial referencing systems. Only in this way will GIS be able to tease out the complex relationships that exist between projects, data sets, and analytic techniques in modern science. The self-perception of rigour amongst spatial analysts has hitherto been misplaced because of the vagaries and inadequacies of data quality, resolution and richness: progress requires us to face up to the fact that the linear project design was never a panacea in practice.

Batey P, Brown P 1995 From human ecology to customer targeting: the evolution of geodemographics. In Longley P, Clarke G (eds) GIS for business and service planning. Cambridge, GeoInformation International: 77–103
Birkin M 1996 Retail location modelling in GIS. In Longley P A, Batty M (eds) Spatial analysis: modelling in a GIS environment. Cambridge, GeoInformation International: 207–25
Curry M R 1995 GIS and the inevitability of ethical inconsistency. In Pickles J (ed.) Ground truth: the social implications of geographic information systems. New York, Guilford Press: 68–87
Goodchild M F, Longley P A 1999 The future of GIS and spatial analysis. In Longley P A, Goodchild M F, Maguire D J, Rhind D W (eds) Geographical information systems: principles, techniques, management and applications. New York, Wiley: 1: 567–80
Harris R 1999 A comparative analysis of a lifestyle and geodemographic typology. Working paper. Bristol, University of Bristol
Kennedy P 1979 A guide to econometrics. Oxford, Martin Robertson
Longley P A, Mesev V 1997 Beyond analogue models: space filling and density measures of an urban settlement. Papers in Regional Science 76: 409–27
Martin D J 1996 Geographic information systems: socioeconomic applications. London, Routledge
Valavanis S 1959 Econometrics. New York, McGraw-Hill
 
 



 
 

Curriculum Vitae

Paul Longley is Professor of Geography in the School of Geographical Sciences, University of Bristol, where he was previously a reader and a lecturer. His previous appointments have been as lecturer in the Department of City and Regional Planning, University of Wales, Cardiff (1984-92), lecturer in the Department of Geography, University of Reading (1983-84) and Scheel Scholar in the Universitaet Karlsruhe (1980-81).

Research Interests

His research interests are grouped around the use of geographical information systems (GIS) and quantitative methods in urban analysis. They include:
     information integration within GIS (notably remote sensing - GIS integration);
     fractal geometry;
     local taxation;
     urban housing markets;
     statistical modelling;
     social survey research practice.

He is editor of the journal Computers, Environment and Urban Systems, reviews co-editor of Environment and Planning B: Planning and Design and an editorial board member of Papers in Regional Science, Geographical Systems, and GIS Europe. He is co-author (with Michael Batty) of Fractal Cities: a Geometry of Form and Function (Academic Press 1994). He is co-editor (with Michael Goodchild, David Maguire and David Rhind) of the second edition of Geographical Information Systems: Principles, Techniques, Management, Applications (John Wiley, 1998). Other co-edited works include: GIS for Business and Service Planning (with Graham Clarke: GeoInformation International 1995), Spatial Analysis: Modelling in a GIS Environment (with Michael Batty: GeoInformation International 1996) and Geocomputation: a Primer (with Sue Brooks, Rachel McDonnell and Bill Macmillan). He is a past chairperson of the (then) Institute of British Geographers Quantitative Methods Study Group and between 1991 and 1996 was European Organising Secretary of the Regional Science Association International.



 
 

Address

Professor Paul Longley
School of Geographical Sciences
University of Bristol
University Road
Bristol BS8 1SS
United Kingdom

     Telephone (Direct): +44 (0)117 928 7509
     Telephone (Dept. Sec.): +44 (0)117 928 7875
     Fax: +44 (0)117 928 7878

Web page: http://www.ggy.bris.ac.uk/staff/pl/pl.htm
Email: Paul.Longley@bristol.ac.uk

