Thomas C. Hart, Stephanie L. Greene, Alexandr Afonin
Eighty map attributes were compiled on 500 meter cells, including longterm monthly moisture and temperature characteristics, terrain description, and soil character. Climate interpolations were kriged from over 100 stations distributed at an average interval of 30 kilometers, and were corrected for elevation effects via normalization and recalculation with an input DEM. Moisture and temperature zones were derived by ISODATA clustering on sets of nine and six seasonal combinations. There were mixed results with respect to the reliability of the data preparations.
More important was the progression of map data assimilation on the part of the field scientists. The modeled attributes provided a set of field corroborations which gave confidence and useful inference to the team members. Sampling gradients were devised with targets changing as the needed sets were filled. Experimental combinations were abandoned with improved confidence after repeated attempts to discover suitable sites. Nearly 600 seed samples were gathered with a better knowledge of their bias relative to natural distributions and abundances. Map query yielded attribute estimates that greatly extended each sample's passport database within the US and Russian seed archive systems.
Breeding programs make use of large archives of germplasm maintained by national and international genebanks. These institutions are responsible for the collecting, cataloguing, protection, and scientific access of literally millions of germplasm samples from tens of thousands of crop varieties. Unfortunately many national genebanks are under heavy budgetary pressure. This is especially true for the N.I. Vavilov Institute of Plant Industry in Russia, which houses one of the largest and most significant germplasm collections in the world (NRC 1993). Inasmuch as every nation shares a future where each collection may prove critical to future productivity, the collaboration between countries supports an enlightened common interest. For the past five years the USDA National Plant Germplasm System (NPGS) has been providing support to the Vavilov Institute.
Under this momentum during the summer of 1995 a major collecting trip was launched to acquire wild forage species (alfalfa, clover, grasses) in the Western Caucasus Mountains of Southern Russia. The mission's primary goal was to sample populations adapted to the acidic soil regions of the area. The objectives of two national programs and eight scientists from the disciplines of plant breeding, botany and soils, compounded by a field schedule of three months duration within the highly diverse environments of the Caucasus, resulted in an innumerable set of options from both biological and logistical perspectives. Careful planning was imperative to enable an efficient and focussed sampling strategy.
The traditional approach typically builds on known botanical geography. Trip preparation usually includes discussions with local authorities and examination of herbarium specimens to gain better understanding of the local environments and the predicted distributions of flora (Reid and Strickland 1983). The field effort then forms a series of hunting forays where collecting is guided by the progressive en route discovery of plant associations, and by a rhythmic sampling to capture the decay of spatial autocorrelation. In the case of the Caucasus expedition, the ambitious breadth of science and geography led to an early commitment towards habitat-oriented maps to help guide the selection of target areas and sort through the long list of priorities. There was an initial desire to construct a set of all-purpose strata, to be called Terrain Units, to serve as a sampling frame and a multi-attribute site classifier.
Apart from decisions on which area and taxa to target for field collection, the issues of efficiency and more effective sampling design can be addressed by map analysis. Traditionally there has been an emphasis on maximizing the number of accessions brought home, glutting the system with an often limited regard to the materials' representation of the targeted adaptation or the diversity of the landscape. Mapping helps by providing an overview which leads to more diversity from fewer samples: this is achieved in part by eco-geographic stratification and in part by examining off-route opportunities to reduce the ever-present bias of windshield survey. Maps also provide a medium for ranking collecting opportunities by uniqueness and ease of access. Thus the use of models and their map derivatives can rechannel expedition energy towards a more strategic and potentially rich set of samples.
In general the use of empirical models helps create a region-oriented deductive check on what has been to date a largely site-oriented inductive search process. By using "adaptive interpolation" and associated methods (Daly et al. 1994, Hart 1988), local estimates of habitat character can be calculated more precisely using an array of sources, not limited to the direct assignment of the nearest known data loci. The map domain of GIS extends these model applications by providing a medium for interpolation, so that parameters at measured sites can be applied intelligently to the unmeasured sites of germplasm collection, especially if there is a supporting (or "adaptive") factor such as elevation already in continuous mapped form. It should be mentioned that plant habitats which yield important genetic material are frequently divergent inliers within broader mappable spatial units, so that map data alone are usually too coarse to capture all the habitats' key ecological attributes. However, the dangers of oversimplification are outweighed by the benefits of multi-attribute field verification and a systematic option for developing dynamic sampling plans. Field observations strengthen the foundation which maps provide by yielding detail which the maps miss, and by redefining in experiential terms the true character of the map legends.
The prime requirement of the model and map preparations was that field scientists be able to assimilate the results with some confidence regarding the manner in which the data were derived. Care was needed to ensure that the collectors were not alienated by the forced adoption of a new set of tools. The challenge was to show them opportunities to improve their traditional effectiveness, without a distracting amount of extra effort or steep learning curves on their part. Under these circumstances a complex set of algorithms for the data processing would have strained the trust of the collector-user, lengthening the field time required for acceptance.
Accordingly, the methods were as straightforward as possible: raw input data were used without weeding, and were mapped to a UTM system of nested cells, so that raster data of differing cell size would fit together in aggregate when needed. Terrain data were mapped as elevation to 500 meter UTM gridcells (Figure 1.), and as slope and aspect (Figure 2.) to 250 meter UTM gridcells. The elevation raster (the DEM) formed the base for the climate interpolations.
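The nesting of cell sizes can be illustrated with a minimal sketch. It assumes a raster held as a list of rows and aggregates fine cells into coarse ones by averaging whole blocks; the function names `nesting_factor` and `aggregate`, and the sample slope values, are hypothetical and are not taken from the project's software. Averaging suits a quantity such as slope; a circular quantity such as aspect would need a different summary.

```python
def nesting_factor(coarse_m, fine_m):
    """Number of fine cells along one side of a coarse cell."""
    ratio = coarse_m / fine_m
    if ratio != int(ratio):
        raise ValueError("cell sizes do not nest evenly")
    return int(ratio)


def aggregate(raster, factor):
    """Average non-overlapping factor x factor blocks of a 2D raster."""
    rows, cols = len(raster), len(raster[0])
    if rows % factor or cols % factor:
        raise ValueError("raster dimensions must be multiples of the factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [raster[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Hypothetical slope values (degrees) on 250 m cells.
slope_250m = [[2.0, 4.0, 10.0, 12.0],
              [6.0, 8.0, 14.0, 16.0]]

# Four 250 m cells fall inside each 500 m cell.
slope_500m = aggregate(slope_250m, nesting_factor(500, 250))
print(slope_500m)
```

Because each cell size is the next one halved, the same factor calculation applies to any pair of layers that share the UTM grid origin.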
Five attributes of longterm monthly climate (precipitation (Figure 3.), humidity, maximum and minimum temperature (Figure 4.), and average windspeed) were mapped from over 100 stations at an average interval of 30 kilometers, and were corrected for elevation via a three-part process. First, the station data for each month were normalized using multiple linear regression coefficients from station easterliness, northerliness, and elevation. Second, the normalized station array was kriged to create a trend surface as if the study area were flat and sitting at a median elevation. Third, the elevation effect was restored for every raster cell in the continuous map by using the DEM in combination with the regression coefficients in reverse.
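The three-part correction can be sketched for a single month and a single grid cell as follows. This is an illustration under stated assumptions, not the project's code: inverse-distance weighting stands in for kriging to keep the example dependency-free, only the elevation term of the regression is removed and restored (the text does not say how the easting and northing terms were handled), and the station values, units (kilometers), and function names are all hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_regression(stations):
    """Least-squares fit of value = b0 + b1*east + b2*north + b3*elev."""
    X = [[1.0, s["east"], s["north"], s["elev"]] for s in stations]
    y = [s["value"] for s in stations]
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(4)] for i in range(4)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(4)]
    return solve(xtx, xty)


def idw(points, x, y, power=2.0):
    """Inverse-distance weighting; a simple stand-in for kriging."""
    num = den = 0.0
    for px, py, v in points:
        d2 = (px - x) ** 2 + (py - y) ** 2
        if d2 == 0.0:
            return v
        w = d2 ** (-power / 2.0)
        num += w * v
        den += w
    return num / den


def elevation_corrected_value(stations, x, y, cell_elev):
    """Estimate the climate value at one grid cell of known DEM elevation."""
    b3 = fit_regression(stations)[3]               # elevation coefficient
    elevs = sorted(s["elev"] for s in stations)
    ref = elevs[len(elevs) // 2]                   # median station elevation
    # Step 1: normalize each station to the reference elevation.
    flat = [(s["east"], s["north"], s["value"] - b3 * (s["elev"] - ref))
            for s in stations]
    # Step 2: interpolate the "flat" surface at the cell location.
    flat_value = idw(flat, x, y)
    # Step 3: restore the elevation effect using the DEM elevation.
    return flat_value + b3 * (cell_elev - ref)


# Synthetic stations (km, km, km): temperature falls 6 degrees per km.
coords = [(0, 0, 0.2), (30, 0, 0.8), (0, 30, 1.4), (30, 30, 0.5), (15, 15, 2.0)]
stations = [{"east": e, "north": n, "elev": z, "value": 12.0 - 6.0 * z}
            for e, n, z in coords]

estimate = elevation_corrected_value(stations, 10.0, 20.0, 1.5)
print(round(estimate, 3))
```

With these synthetic stations the elevation relationship is exact, so the estimate for a cell at 1.5 km should come out very close to 3.0; real station data would leave residual spatial structure for the interpolation step to carry.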
As an initial reduction of the plethora of 60 monthly climate measures for each 500 meter cell, zones were mapped for moisture (Figure 5.) and for temperature (Figure 6.) by ISODATA clustering on sets of nine and six seasonal combinations respectively. In each result the spatial pattern of clusters showed robust and clear behavior for both amounts and seasonality.
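The zoning step can be illustrated with a simplified sketch. ISODATA is an iterative clustering method that also splits and merges clusters; the code below is plain k-means, which omits those split and merge rules, and is offered only to show how cells with similar seasonal profiles end up in the same zone. The three seasonal values per cell, the choice of two zones, and the evenly spaced starting centers are all assumptions made for the example (the paper used nine and six seasonal combinations).

```python
def kmeans(cells, k, iterations=20):
    """Assign each cell (a tuple of seasonal values) to one of k zones."""
    # Deterministic start: take k cells evenly spaced through the list.
    centers = [list(cells[i * len(cells) // k]) for i in range(k)]
    labels = [0] * len(cells)
    for _ in range(iterations):
        # Assign every cell to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(cell, centers[c])))
            for cell in cells
        ]
        # Move each center to the mean of its member cells.
        for c in range(k):
            members = [cell for cell, lab in zip(cells, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical seasonal precipitation totals (mm) for six cells.
cells = [(80, 60, 90), (85, 65, 95), (20, 10, 30),
         (25, 15, 35), (82, 61, 92), (22, 12, 33)]

zones, centers = kmeans(cells, k=2)
print(zones)
```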
The satellite imagery was rectified to 62.5 (MSS) and 31.25 (TM) meter pixels on a UTM projection, and was presented with grid reference and date labels, but no other obscuring annotation. Thus the prime targets of meadow patches could be evaluated for stable phenology through the multidate dimensions of seasons and years. These data were also the best local reference for familiarizing the team with the potential for collecting in neighboring areas on travel timescales of minutes and hours.
Soil polygons were digitized from four distinct map series which varied in scale, date, geographic fidelity and legend specificity. Each legend entry was given a unique numeric code, and the polygons were plotted with these numbers as labels to guide field markup. The lack of agreement between the different series was a concern from the beginning, as was the inconsistency of map unit size and shape. Roads and major trails were taken from 1950s US 1/250,000 series, and were updated with more recent Russian maps at smaller scales. Satellite imagery was examined to refine alignments where there were questionable features.
The original concept of reducing the entire set of GIS maps to a Terrain Unit stratification turned out to be too ambitious. The all-important soils data were not sufficiently reliable, the climate attributes were best thresholded in a customized manner for groups of plants with similar tolerances, and the terrain variability was on a frequency which would generate tens of thousands of Terrain Unit polygons. There was skepticism that such a combination would result in a useful reduction of the huge amount of cross-variability within the study area, and in the end the fixed strata would only have been explanatory in the manner of a least common denominator. It was decided to keep habitat attributes accessible in their original parametric distributions.
The field "kit" from the map preparations consisted of five two-sheet series of mounted prints of the climate zones, terrain data, and an image mosaic, along with road and soils overlays. These were augmented by a folio of 80 medium-format prints of satellite scenes at a more local scale.
Prior to the team's convening in the field, a set of map slides was distributed with orientation legends so that each member could preview the data formats and the general patterns of the study area. The attitude towards spatial thematic data at the beginning was generally of high interest but with some concern about its routine use. It was apparent early on that the maps could have been designed with easier legends, and that their linkages to such source data as climate station position and monthly totals should have been more direct.
The maps immediately catalyzed a review of the need for sampling on a regular distance interval, and over several days a consensus formed that the slopes of ecogeographic gradients should determine sample frequency. Whether the maps documented such gradients was a question of concern during a "test period" of about ten days duration. In this period the career-long habits of intensive, inductive collecting strategies were challenged by the deductive indicators of the maps. But the consistently strong corroborations between map and landscape caused a general increase in confidence through the first week, to the degree that the material was adopted as the primary medium for determining potential sites 2-3 days in advance.
It was not, however, a case of technology taking over. The map indicators did not supplant the discussion and decision-making based on direct observation. And while the maps were appreciated for their successful summary of conditions over broad areas, they were obviously inappropriate for predicting habitat subtleties at the level of the sampled plant populations. It was necessary and often surprising to fill out detailed site characterization forms. However, the mutual support of observations and maps led the team members to use each medium to better benefit, and it is in this subtle synergy that the maps made perhaps their most worthwhile contribution. The presence of another authority made one's eyes search for more definite clues, as if there were a heightened need to prove one's observational conclusion. It did turn out that many samples were taken off-route and in peculiar places which probably would have been overlooked had the maps not been used. While it is too early to say if these sites contributed to a broader sample of genetic variability, it was an underlying assumption which provided the impetus for making troublesome sidetrips.
There were other practical uses of the maps. For example, a series of sites was designated as gradient test arrays to form experimental sets for comparing habitat variability and genetic variability. Another strategic use was as an accounting tool to track which habitats had been sampled for a taxon. Unsampled habitats within the species' probable range were then put on a site priority list. After several attempts to find these omissions, they were deemed improbable and dropped off the list with more confidence than if the map accounts had not been available. Lastly, with the knowledge that the maps had already effectively documented certain attributes of the sites (e.g. elevation, macroclimate, insolation), the team members could concentrate on those observations which were only possible on site (e.g. soil pH, drainage, land use influences).
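The accounting use described above amounts to a set difference between the habitats in a taxon's probable range and the habitats already sampled, with a rule for retiring habitats after repeated failed searches. The sketch below is illustrative only: the habitat labels, the data layout, and the threshold of three attempts are assumptions (the text says only "several attempts"), and the function names are hypothetical.

```python
MAX_ATTEMPTS = 3  # assumed threshold; the text says only "several attempts"


def unsampled_habitats(probable_range, samples, taxon):
    """Habitats in the taxon's probable range that have no sample yet."""
    sampled = {s["habitat"] for s in samples if s["taxon"] == taxon}
    return sorted(set(probable_range[taxon]) - sampled)


def priority_list(probable_range, samples, failed_attempts, taxon):
    """Unsampled habitats that have not yet been searched MAX_ATTEMPTS times."""
    return [h for h in unsampled_habitats(probable_range, samples, taxon)
            if failed_attempts.get((taxon, h), 0) < MAX_ATTEMPTS]


# Hypothetical habitats keyed by (moisture zone, temperature zone).
probable_range = {
    "Trifolium": [("moist", "cool"), ("moist", "warm"), ("dry", "warm")],
}
samples = [{"taxon": "Trifolium", "habitat": ("moist", "cool")}]
failed_attempts = {("Trifolium", ("dry", "warm")): 3}

todo = priority_list(probable_range, samples, failed_attempts, "Trifolium")
print(todo)
```

Here the dry, warm habitat has been searched three times without success and so drops off the list, leaving only the moist, warm habitat as a remaining target.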
Overall the acceptance of the maps' utility by the field team was surprisingly complete, given that the scales of geography and of montane habitat ecology are to a large degree non-overlapping. The correspondence of the two perceptual realms would have been less if the modeling approach had not produced the refined local estimates of longterm climatic conditions. The success of scale transferability was stimulating in the general sense of team energies, and in the specific cases of two key scientific discourses: firstly on the strategies for sampling genetic variability, and secondly on exploring the relationships between the broadly-described habitat factors and the very narrow and localized samples of genetic composition.
Perhaps more importantly, the spatial data developed in this project will support experiments into the linkages between genetic and environmental variability. Some taxa exhibit greater phenotypic plasticity and less differentiation in measurable genetic terms. Other taxa form large numbers of distinct genotypes whose distribution has little correspondence to habitat change. Still other taxa show measurable genetic differences which do correlate to the habitats from which they were collected. It is of significance to quantify the degree to which each of these possibilities applies to the major taxa collected during the expedition. For the cases where genetics do correspond strongly to collecting site habitat, the research can evaluate the relative strength of linkages among the habitat attributes, and in quantitative terms may offer a calculable model of the environmental factors associated with each sampled adaptation.
Having explored the linkages on a broad variety of forage legume germplasm, the scientists in both countries will be in a better position to set priorities for attaching such map-based descriptors to historic samples within their collections. Finally, the reporting of success and failure in the search for these linkages will affect the objectives and the planning for future efforts to acquire wild germplasm adapted to specific environmental conditions, particularly for crop types related to those tested.
In terms of non-quantitative assessment, the climate maps are certainly more accurate in those areas close to stations with no intervening topographic barriers, and are probably better for those months where the elevation influence was strong. The weaker map areas were apparent during the field itinerary, but for this application the production of somewhat flawed continuous maps was not a waste of effort, because even for those areas the study did produce characterization which forced the team into local learning once the data weaknesses were exposed. It is fortunate that the continuous nature of the raster maps allows a flexible selection of areas for update, such that pockets where geometries or theme results are problematic can be recalculated with either new or reselected inputs.
Verification of the climate surfaces was not performed. Perhaps the only means to check the behavior of the longterm monthly aggregates is to withhold stations from the primary generation of the surfaces, in order to use them as verification data. This was not done on this project, largely because the needs for precise characterization of local habitats outweighed doubts about the subtleties of the surfacing process. In retrospect this may have been an overly expedient decision, but it was difficult to justify effort towards the assessment of supporting spatial data, when the expedition's primary objectives were in the areas of botany, plant ecology, and population genetics.
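For reference, the withholding approach mentioned above can be sketched as a leave-one-out check: each station is withheld in turn, its value is predicted from the remaining stations, and the prediction errors are summarized as a root-mean-square error. This was not done on the project. The sketch uses inverse-distance weighting as a stand-in interpolator and three made-up stations, and the function names are hypothetical.

```python
import math


def idw(points, x, y, power=2.0):
    """Inverse-distance weighting; a simple stand-in for the surfacing step."""
    num = den = 0.0
    for px, py, v in points:
        d2 = (px - x) ** 2 + (py - y) ** 2
        if d2 == 0.0:
            return v
        w = d2 ** (-power / 2.0)
        num += w * v
        den += w
    return num / den


def leave_one_out_rmse(stations):
    """Withhold each station in turn and predict it from the others."""
    squared_errors = []
    for i, (x, y, observed) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]
        predicted = idw(others, x, y)
        squared_errors.append((predicted - observed) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up stations: (easting km, northing km, monthly value).
stations = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (20.0, 0.0, 30.0)]

rmse = leave_one_out_rmse(stations)
print(round(rmse, 3))
```

A check of this kind reports error only at station locations, so it would say little about the remote, high-elevation cells where stations are absent.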
As stated in other presentations at this meeting, the use of models and standard map processing should render project results open to general use, with the expectation that the protocols are repeatable and are familiar enough to be accepted broadly by future users of the database. Often a model construction or a GIS compilation contains special adjustments for the initial case study, so that inconsistencies arise when follow-up activities include other applications within the same geography or a reprocessing into adjacent or different areas. Standardization is a boon to reuse of initial results, but carries with it a cost in terms of flexibility and ease of update.
There are two fundamental reasons why ecogeographic mapping initiatives will never be completely done. First, the world is a changing place. Its environmental dynamics are described only in part by the historic periods of record. Secondly, the source material for map projects of the type described in this paper is always going to be deficient in some regard, and usually will have only partial improvement during any five year period of update. For these reasons the data themselves should be evaluated on the basis of whether they are good enough for now, as opposed to whether they are good enough forever. Similarly the use of models and GIS protocols should be judged against today's state-of-the-art, with the expectation that in five years we will have the benefit of much more method development and hopefully at least an equal amount of learning from mistakes. It is in meetings such as this current one that the community of practitioners can update their sense of today's best options in terms of tools, tricks and the raw material of parametric data.
Bolton, J.L., Goplen, B.P. and Baenziger, H. (1972) World distribution and historic developments. In Hanson, C.H. (ed.) Alfalfa Science and Technology. Agronomy 15:1-34.
Brown, A.H.D., and Marshall, D.R. (1995) A basic sampling strategy: theory and practice. In L. Guarino, V.R. Rao, and R. Reid (eds.) Collecting Plant Genetic Diversity: Technical Guidelines. CAB International, UK.
Daly, C., Neilson, R.P., and Phillips, D.L. (1994) A statistical-topographic model for mapping climatological precipitation over mountainous terrain. J Applied Meteorology 33:140-155.
Greene, S.L., and Hart, T.C. (1996) Plant genetic resource collections: an opportunity for the evolution of global data sets. Proceedings of the Third International Conference on Integrating GIS and Environmental Modeling. NCGIA Santa Barbara, WWW and CD.
Hart, T.C. (1988) Upper Jubba Watershed Performance (JESS Report No. 38), Ministry of National Planning and Jubba Valley Development, Mogadishu, Somalia.
National Research Council (1993) Managing global genetic resources: agricultural crop issues and policies. National Academy Press, Washington D.C.
Reid, R. and Strickland, R.W. (1983) Forage plant collection in practice. In McIvor, J.G. and Bray, R.A. (eds) Genetic Resources for Forage Plants, CSIRO, Australia.
Steiner, J.J. and Greene, S.L. (1996) Proposed ecological descriptors and their utility for plant germplasm collections. Crop Science 36:(in press).
Thomas Hart
Spatial Data Associates
20 Bush Lane, Ithaca, NY 14850
Phone: 607-257-0951
Fax: 607-257-3167
e-mail: jhart@lightlink.com
Stephanie L. Greene
Forage Legume Curator, USDA, ARS National Plant Germplasm System
Washington State University-Prosser
24106 N. Bunn Road, Prosser, WA 99350
Phone: 509-786-9265
Fax: 509-786-9370
e-mail: sgreene@ars-grin.gov
Alexandr Afonin
Agricultural Ecologist, N.I. Vavilov Plant Industry Institute
Bolshaya Morskaya, 42, St. Petersburg, Russia, 190000
Phone: 812-465-0751
Fax: 07-812-311-8762
E-mail: alex@dma.spb.ru