Ranjan Muttiah, Raghavan Srinivasan, Bernard Engel
The theory and development of a public domain neural network package for the GRASS GIS system are described. Classical classifiers using Bayes' selection rule, nearest means, and nearest neighbor were included in the interface for comparison against neural network predictions. Sample applications of the package are presented for erosion best management practice (BMP) identification and remote sensing.
Neural networks are a computational method of data analysis that extends traditional statistical methods such as regression (White, 1989) and function approximation (Baum and Haussler, 1988; Hertz et al., 1991). In statistical regression the modeler has to specify a priori the functional form of the relationship likely to exist in the data set (nonlinear vs. linear vs. multiple regression). The best functional form for the data is chosen using an error measure such as the least squares criterion. Neural networks form an "internal weight" representation of the data so as to minimize an error criterion (usually least squares) without requiring much a priori judgement about the functional form of the data (McClelland et al., 1986). This provides, for example, a means of automating the classification of very large datasets. Since GIS systems are data intensive in the spatial domain, and many different types of datasets (remote sensing, topography, hydrography) are used to make decisions and judgements, neural networks may find a useful role in capturing expertise, and in interpolating and extrapolating knowledge as an aid to decision making. The GIS system chosen here was the Geographic Resources Analysis Support System (GRASS) developed by the US Army Corps of Engineers (CERL, 1993). The primary reason was that GRASS is public domain, and the model developer can write specific routines using the C programming language (Kernighan and Ritchie, 1984) and the GRASS GIS graphics libraries.
The neural network interface developed here for the GRASS GIS platform incorporates only supervised learning. We selected the back-propagation algorithm of McClelland et al. (1986) and Baffes (1989), and the quick-propagation algorithm of Fahlman (1988). In the back-propagation algorithm, the network is iterated in "weight space" to minimize the mean square error measure at the output nodes, given by:
$$E = \frac{1}{2} \sum_j (d_j - o_j)^2$$
Where $d_j$ and $o_j$ are the desired and actual values at the output units of the network. The actual values at the output units are calculated by propagating the input information through the network (using scalar products of weights and inputs at each interconnection). The weights between the units are stored in a weight matrix W. Each hidden unit sums its inputs and then applies a transfer function to this sum. The commonly used sigmoidal transfer function is given by:
Where a is the gain or scaling factor and b is the bias or the amount of translation of the sigmoidal transfer function on the x-axis. It has been shown that back-propagation networks are equivalent to Fourier series approximation when these sigmoidal units are used (Lapedes and Farber, 1987).
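The forward propagation and error measure described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the r.nntool source: the sigmoid is assumed to take the form f(x) = 1 / (1 + exp(-a(x - b))), which is one reading of "gain a" and "translation b on the x-axis", and the error is assumed to be half the sum of squared differences at the output units. All function names are hypothetical.

```python
import math


def sigmoid(x, a=1.0, b=0.0):
    """Sigmoidal transfer function with gain a, shifted by b along the x-axis.

    The exact form is an assumption: f(x) = 1 / (1 + exp(-a * (x - b))).
    """
    return 1.0 / (1.0 + math.exp(-a * (x - b)))


def layer_output(weights, inputs, a=1.0, b=0.0):
    """Each unit forms the scalar product of its weight row with the inputs,
    then applies the transfer function to that sum."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)), a, b)
            for row in weights]


def forward(w_hidden, w_output, inputs, a=1.0, b=0.0):
    """Propagate an input pattern through one hidden layer to the outputs."""
    hidden = layer_output(w_hidden, inputs, a, b)
    return layer_output(w_output, hidden, a, b)


def squared_error(desired, actual):
    """Assumed error measure: 0.5 * sum over output units of (d_j - o_j)^2."""
    return 0.5 * sum((d - o) ** 2 for d, o in zip(desired, actual))
```

Here `w_hidden` holds one row of weights per hidden unit and `w_output` one row per output unit, so a network with two inputs, twelve hidden units, and five outputs would use a 12 x 2 and a 5 x 12 list of lists.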
In a fixed topology network like back-propagation, the number of hidden units has to be decided before training on the data. The weight update between hidden unit i and the output node j on the (n+1)th iteration is given by gradient descent:
$$\Delta w_{ij}(n+1) = \eta \, \delta_{pj} \, O_{pi} + \alpha \, \Delta w_{ij}(n)$$
Where $\delta_{pj}$ is the error back-propagated from the output units to unit j on presentation of input pattern p to the network, $\eta$ is a learning rate, $O_{pi}$ is the input reaching unit j from unit i (which is not in the same layer), and $\alpha$ is a momentum constant that reuses the previous weight change to help avoid local minima on the error surface.
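As a rough illustration of the momentum update described above, the following Python sketch computes a single weight change from a learning rate, a back-propagated error term, the input on the connection, and the previous weight change. The function and parameter names are hypothetical, and the error term shown for an output unit assumes a logistic transfer function with unit gain.

```python
def output_delta(desired, actual):
    """Error term for a logistic output unit with unit gain (an assumption):
    (d - o) multiplied by the sigmoid derivative o * (1 - o)."""
    return (desired - actual) * actual * (1.0 - actual)


def momentum_update(prev_delta, eta, alpha, delta_j, o_i):
    """New weight change for the connection from unit i to unit j.

    eta     : learning rate
    alpha   : momentum constant
    delta_j : error back-propagated to unit j for the current pattern
    o_i     : input reaching unit j from unit i
    """
    return eta * delta_j * o_i + alpha * prev_delta
```

With `alpha` set to zero the rule reduces to plain gradient descent; a larger `alpha` carries more of the previous step into the current one.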
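In quick-propagation the weight updates are made according to:
$$\Delta w(n+1) = \frac{S(n+1)}{S(n) - S(n+1)} \, \Delta w(n), \qquad S(n) = \frac{\partial E}{\partial w}(n)$$
Where $\partial E / \partial w$ is the change in error with respect to a change in the weight between any two units in the network.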
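The quick-propagation step can be sketched for a single weight as follows. This is a simplified reading of Fahlman's rule, not the r.nntool code: it falls back to a plain gradient step when there is no previous step to scale, and it caps the step at `mu` times the previous step. The names, the fallback, and the default values of `eta` and `mu` are assumptions made for illustration.

```python
def quickprop_step(slope, prev_slope, prev_step, eta=0.1, mu=1.75):
    """Return the next weight change for one weight.

    slope, prev_slope : dE/dw at the current and previous iteration
    prev_step         : previous weight change
    eta               : learning rate for the gradient-descent fallback
    mu                : maximum growth factor relative to the previous step
    """
    if prev_step == 0.0 or prev_slope == slope:
        # No usable secant information: take an ordinary gradient step.
        return -eta * slope
    # Secant step: jump towards the minimum of the parabola fitted
    # through the current and previous slopes.
    step = slope / (prev_slope - slope) * prev_step
    limit = mu * abs(prev_step)
    return max(-limit, min(limit, step))


def minimize(grad, w, iterations=20):
    """Apply quickprop_step repeatedly to a one-dimensional problem."""
    prev_slope, prev_step = 0.0, 0.0
    for _ in range(iterations):
        slope = grad(w)
        step = quickprop_step(slope, prev_slope, prev_step)
        w += step
        prev_slope, prev_step = slope, step
    return w
```

For an exactly quadratic error such as E = (w - 3)^2, the uncapped secant step lands on the minimum in one move, which is the behaviour the method relies on near the bottom of an error valley.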
To compare and evaluate the predictions made by neural networks, additional classifiers from the pattern recognition field were incorporated into the neural network interface. Since GRASS already has the maximum likelihood classifier (which assumes a normal distribution function for the training data), classifiers using nearest means, nearest neighbors, and Bayes' rule were developed. The nearest means classifier calculates the mean vector of each training class, and classifies by the minimum Euclidean distance between the input and mean vectors.
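A nearest means classifier of the kind described above can be sketched as follows. The data layout (a dictionary mapping each class label to its list of training vectors) and the function names are assumptions made for illustration.

```python
import math


def class_means(training):
    """Compute the mean vector of each training class.

    training: dict mapping class label -> list of feature vectors.
    """
    means = {}
    for label, vectors in training.items():
        count = len(vectors)
        means[label] = [sum(column) / count for column in zip(*vectors)]
    return means


def nearest_mean(x, means):
    """Assign x to the class whose mean vector is closest in Euclidean distance."""
    return min(means, key=lambda label: math.dist(x, means[label]))
```

A typical use would compute `class_means` once from the training areas and then call `nearest_mean` for every cell of the input map layers.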
In the nearest neighbor classification, the covariance matrix of the training data is calculated and inverted using LU decomposition. If the covariance matrix is singular, the error is reported to the user and the interface returns to the main menu. On successful inversion of the covariance matrix, the input vector is multiplied according to the distance formula:
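A minimal sketch of a covariance-weighted nearest neighbor rule follows, assuming the distance takes the Mahalanobis form d^2 = (x - y)^T C^-1 (x - y) between the input vector x and a training vector y, with C the covariance matrix of the training data. Instead of forming an explicit inverse from an LU factorisation, the sketch solves C z = (x - y) by Gaussian elimination with partial pivoting, which is the same elimination that underlies LU decomposition. All names are hypothetical.

```python
def covariance(vectors):
    """Sample covariance matrix of a list of feature vectors."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(column) / n for column in zip(*vectors)]
    return [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / (n - 1)
             for j in range(dim)] for i in range(dim)]


def solve(matrix, rhs, tol=1e-12):
    """Solve matrix * z = rhs by Gaussian elimination with partial pivoting.

    Raises ValueError when the matrix is (numerically) singular.
    """
    n = len(matrix)
    a = [list(map(float, row)) for row in matrix]
    b = list(map(float, rhs))
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(a[r][k]))
        if abs(a[pivot][k]) < tol:
            raise ValueError("covariance matrix is singular")
        a[k], a[pivot] = a[pivot], a[k]
        b[k], b[pivot] = b[pivot], b[k]
        for r in range(k + 1, n):
            factor = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= factor * a[k][c]
            b[r] -= factor * b[k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        partial = sum(a[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (b[r] - partial) / a[r][r]
    return z


def mahalanobis_sq(x, y, cov):
    """Squared covariance-weighted distance between vectors x and y."""
    diff = [xi - yi for xi, yi in zip(x, y)]
    return sum(d * s for d, s in zip(diff, solve(cov, diff)))


def nearest_neighbor(x, training, cov):
    """training: list of (vector, label) pairs; returns the nearest label."""
    return min(training, key=lambda item: mahalanobis_sq(x, item[0], cov))[1]
```

The `ValueError` stands in for the singular-matrix report described in the text; a caller could catch it and return to a menu.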
In Bayes' classifier, the misclassification error (the overlap of the class probability density functions) is minimized using Fisher's criterion. The general classification rule is given by:
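As an illustration, the sketch below assumes the standard minimum-error Bayes decision rule: assign an input to the class with the largest product of prior probability and class-conditional density. One-dimensional Gaussian densities are used purely to keep the example short, Fisher's criterion is not implemented, and the names and data layout are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """One-dimensional normal density, used here as the class-conditional pdf."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def bayes_classify(x, classes):
    """Pick the class maximizing prior * p(x | class).

    classes: dict mapping label -> (prior, mean, std).
    """
    def score(label):
        prior, mean, std = classes[label]
        return prior * gaussian_pdf(x, mean, std)
    return max(classes, key=score)
```

With equal priors and equal spreads the decision boundary lies midway between the class means; unequal priors shift it towards the less probable class.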
The GIS interface was structured along the lines of the i.maxlik maximum likelihood classifier presently available in GRASS, with a few new features added. The user has the option of either using a pre-existing map with digitized training areas, or selecting training areas within the interface (as points, circles, or polygons). Training areas that need to be deleted can be removed within the interface. The interface stores the training data for each class as a separate map layer.
When using the tool, the user is asked to enter the name of the output map layer, the number of output classes, and the names of the input map layers. Using the lump option of the menu, the tool selects the "dominant" category within a specified window and generates a new map layer. The user can reset the resolution to the newly specified window size, or retain the resolution in effect when the tool was entered. (In existing GRASS routines, when the resolution (window) of a region is enlarged, the middle pixel of the window is selected at the lower resolution.) Training areas are selected using the define areas option. Using the zoom option, the user can zoom to the parts of the output map in which training areas are to be delineated.
If classification uses two input vectors, the user can view the scatter plot of the training data and selectively remove outlier points or points that cause conflicts (i.e., identical input vectors belonging to two different classes). The training and input data are also written as ASCII files for further exploration outside of GRASS (for example with the public domain xgobi viewer of Buja et al., 1986). If using remotely sensed data, the neural network interface allows input of spectral bands and training data as in the i.points program of GRASS. Histograms of the training data can be generated. After calculating the covariance matrix of the training data, the interface prints out the eigenvalues of the matrix. The eigenvalues could be used to identify the dominant or important features of the input map layers used in classification (see Fukunaga, 1972).
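The lump option described above, which keeps the dominant category within each window, can be sketched as a block-wise majority filter. This is an illustrative sketch on a raster held as a list of rows, not the GRASS implementation; the function name and the handling of partial windows at the map edge are assumptions.

```python
from collections import Counter


def lump(raster, window):
    """Aggregate a categorical raster into window x window blocks.

    Each output cell takes the most frequent category found in its block.
    Blocks at the right and bottom edges may be smaller than the window.
    """
    rows, cols = len(raster), len(raster[0])
    lumped = []
    for r0 in range(0, rows, window):
        out_row = []
        for c0 in range(0, cols, window):
            cells = [raster[r][c]
                     for r in range(r0, min(r0 + window, rows))
                     for c in range(c0, min(c0 + window, cols))]
            out_row.append(Counter(cells).most_common(1)[0][0])
        lumped.append(out_row)
    return lumped
```

Unlike picking the middle pixel of each window, this keeps the category that covers most of the window, so small inclusions do not decide the value of the coarser cell.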
The land management application presented here distinguishes areas requiring best management practices (BMPs), because their soil losses exceed the soil erosion tolerance limit, from areas whose soil losses are below the tolerance limit.
The Indian Pine watershed north of the Wabash River in West Lafayette, Indiana was selected as the study area. The USLE K, LS, C, and P maps for the study area were used to calculate the soil loss from each cell in GRASS using the USLE equation (R*K*LS*C*P) (Wischmeier and Smith, 1978). The soil erodibility K-factor map was obtained from the K-factor in the Natural Resources Conservation Service Soils-5 database. The slope-length LS-factor map was obtained by running the r.watershed program in GRASS, which determines the LS-factor from the elevation map. The cropping-management C-factor map was obtained by assigning values based upon the observed crop rotation practices within the Indian Pine watershed (mainly corn-soybeans). The agricultural fields were digitized from aerial photographs maintained by the Agricultural Stabilization and Conservation Service (ASCS) in Lafayette, Indiana. The conservation practice P-factor map was obtained by assigning management practice values based on observation of the fields in the study area. The rainfall and runoff erosivity index R was obtained from the USDA erosion losses handbook (Wischmeier and Smith, 1978). Those parts of the study area above tolerable soil loss (USLE T), and therefore requiring best management practices, are shown as dark areas in the first column of Figure 1.
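The per-cell soil loss calculation and the comparison against the tolerance limit can be sketched as follows. The rasters are plain lists of rows, a single R value and a single tolerance T are assumed for the whole area, and the numbers in the usage note are made up for illustration rather than taken from the study.

```python
def usle_soil_loss(r, k, ls, c, p):
    """Soil loss from one cell using the USLE product R * K * LS * C * P."""
    return r * k * ls * c * p


def bmp_mask(r, k_map, ls_map, c_map, p_map, tolerance):
    """Return a raster with 1 where soil loss exceeds the tolerance, else 0.

    r is a single erosivity value; the factor maps are lists of rows with
    identical dimensions. A uniform tolerance is an assumption of this sketch.
    """
    return [[1 if usle_soil_loss(r, k, ls, c, p) > tolerance else 0
             for k, ls, c, p in zip(k_row, ls_row, c_row, p_row)]
            for k_row, ls_row, c_row, p_row in zip(k_map, ls_map, c_map, p_map)]
```

For example, with hypothetical values R = 180, K = 0.3, C = 0.2, P = 1.0 and a tolerance of 5, a cell with LS = 1.0 is flagged as needing BMPs while a cell with LS = 0.2 is not.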
The best management practices (BMP) areas are clustered into two areas. Training data for r.nntool were selected from the lower left corner of the BMP map. The input data to the neural network (we chose to use quickprop) consisted of the USLE factors, and the output units consisted of binary data by pixel representing whether or not an area required BMPs.
The output of the neural network tool (r.nntool) after training is shown in the second column of Figure 1. The dark areas represent the areas requiring BMPs. As shown, the neural network has predicted whole field areas as requiring BMPs. The areas displayed correspond exactly to the shape and size of the fields shown in the C-factor map in the third column of Figure 1. All the field areas predicted as requiring BMPs also had points within them that had soil loss above the tolerance limit. This shows that representing and presenting the data to the neural network pixel by pixel causes it to predict more global features (note: this would also be true of classical classifiers).
Neural networks have found many interesting uses in remote sensing because they allow integration of remote sensing and other complementary land use information in image classification. Classical classifiers, such as maximum likelihood and nearest neighbor classifiers, have been primarily applicable with only satellite image band information. Neural networks allow for linear and non-linear mappings between satellite spectral data, complementary land use information (e.g., land ownership, slope, and aspect), and land use classes.
A thematic mapper (TM) composite for Temple, Texas using the second, third, and seventh channels was used to identify land use categories using ERDAS and ARC/INFO (McKinney, 1993). The TM scene was taken on March 14, 1992 (used here by permission of Nature Conservancy, Austin, Texas). The TM data were rectified using the road map of Temple, Texas. Areas were identified as water, forests, rangeland, agricultural land (cropland and pasture), or other (urban, barren, and other categories). The TM composite was then imported into GRASS.
To confirm the classification, we mounted a Trimble Pathfinder Basic+ GPS unit (Trimble, 1991) on a vehicle and drove around the area of the TM coverage of Temple. The GPS readings were taken non-differentially; in this mode the GPS unit can be precise to 30 meters. Field GPS surveys were made at two different rangeland sites, five different agricultural land sites, one water body site, six different urban sites, and one forest site. The observed land use agreed with that predicted from the TM composite, except for the water body in the northwest corner of the image, which had a smaller coverage than in the TM image.
For application of the neural network tool, a smaller area (57 square kilometers) of the Temple area was subset from the larger scene (Figure 2). Input into the neural network consisted of the visible green band (0.52-0.60 micrometers) and the mid-infrared band (2.08-2.35 micrometers).
The quick propagation network was chosen and training sites were selected for water, agricultural land, rangeland, forests, and other categories. Figure 2 shows the selected training sites (the sites for each training class are stored as separate map layers). The scatter plot of the selected training sites is shown in Figure 3. Data were interactively cleaned where inappropriate training sites had been selected, and where pixels of the wrong class overlapped in the selected training areas. Figure 3 shows the error at the output units of the quick propagation network over the training cycles.
The network converged to a mean square error of 1.30 after 4000 iterations (epochs). Twelve hidden units were used (chosen by trial and error), and the network had two input units and five output units. There were a total of 177 training points (water had 7 training points, forest had 92 points, agricultural land had 51 points, rangeland had 8 points, and other had 19 points). Once the network had converged, testing was done and an output raster file was then generated and displayed (lower right corner of Figure 4). A nearest means classification was also performed on the data using an option available in r.nntool. Comparisons by area are shown in Table 1.
Table 1. Comparison by land use for the study area.
Land class    Composite (%)    Neural network (%)    Nearest means (%)
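A comparison by area of the kind tabulated above amounts to counting the cells assigned to each class in a classified raster and expressing the counts as percentages. A minimal sketch, assuming equal-area cells and a raster held as a list of rows (the function name is hypothetical):

```python
from collections import Counter


def class_percentages(raster):
    """Percentage of cells assigned to each class in a classified raster."""
    counts = Counter(cell for row in raster for cell in row)
    total = sum(counts.values())
    return {label: 100.0 * count / total for label, count in counts.items()}
```

Running this on the composite, the neural network output, and the nearest means output would give one column of percentages per classifier.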