Sudhakar Mamillapalli, R. Srinivasan, J. G. Arnold, Bernard A. Engel
Effect of Spatial Variability on Basin Scale Modeling
ABSTRACT
With the integration of GIS and distributed parameter hydrologic
models, a watershed can be divided into many
subbasins. However, the effect of discretization on the quality
of the simulated output has not been widely studied. Using the
concept of virtual basins, the different soils and landuses in a subbasin
can be simulated to the level of detail desired. Using a 4300 km^2
watershed in Texas, the present study examines the effect of
increasing levels of discretization and of virtual basins on the accuracy of
the output. The results indicate that, in general, increasing the level
of discretization and increasing the number of soil and landuse combinations
simulated within each subbasin increase the accuracy of the simulation.
There is a level beyond which the accuracy cannot be improved, suggesting
that more detailed simulation may not always be necessary.
Preliminary investigations have been conducted to determine the optimal
configuration.
INTRODUCTION
Hydrologic models can be broadly divided into lumped parameter models
and distributed parameter models. The lumped parameter approach considers
the whole catchment as a single entity and maps the input rainfall
excess to an output hydrograph. Though computationally efficient, this
approach does not
explicitly account for the spatial variability present within the catchment.
Chief among models of this type is the USLE (Wischmeier and Smith, 1978).
Distributed
models divide the catchment
into a number of smaller areas (which could be square elements or
subcatchments), which are assumed to be uniform with respect to the
hydrologic parameters. Hydrology is simulated within each of these elements
and the output routed to the outlet. Hence these models take into
consideration spatial variability of the watershed.
Examples of these include
the AGNPS (AGricultural Non-Point Source Pollution) model (Young et al., 1987),
ANSWERS (Areal Nonpoint Source Watershed Environment Response
Simulation) (Beasley et al., 1980) and SWAT (Soil and Water Assessment
Tool) (Arnold et al., 1993). Considerable time and effort are required to
acquire the data, run the models and interpret the resulting information.
Integration
with a GIS can eliminate many of these problems. Several models have
been integrated with GIS which include AGNPS and GRASS GIS by
Srinivasan and Engel (1994), ANSWERS and GRASS (Rewerts and Engel, 1991),
and SPUR and ERDAS (Sasowsky and Gardner, 1991).
As noted before, these models either discretize the watershed into smaller
elements by overlaying a square grid (ANSWERS or AGNPS) or into various
subbasins (SWAT and SPUR). With the integration of these models
with a GIS, it is possible to divide the watershed into a large number of
elements since the GIS automatically generates the input. Hence we can
consider the spatial variability to the level of detail supported by the
data.
However, as the
number of such elements increases, so does the computation time. It is
not clear from studies to date whether increasing the level of input detail
improves
the accuracy of the simulated output. For effective use of the above tools,
it is necessary to be able to discretize the watershed to an appropriate
level of detail. A coarse discretization may lead to poor simulation results,
whereas a very fine discretization would require far more input data and
significantly increased computation time and storage (which may be important for
large watersheds composed of hundreds of subbasins) with little or no
increase in accuracy. Also, using the concept of virtual basins, different
soil and landuse combinations within a subbasin can be simulated, instead
of considering the dominant landuse and soil to be representative of the
subbasin. It may not be necessary to consider all the combinations within
a subbasin. The impact of increased detail in accounting for the various soil
and landuse combinations within a subbasin, and the effect of the level of
discretization on accuracy, are studied for a watershed in Texas and presented here.
OBJECTIVES
- Quantify the effect that level of discretization has on the
accuracy of output obtained from the distributed parameter hydrologic basin
model SWAT.
- Examine the impact of increased detail in simulating the soil and
landuse combinations within a subbasin using the concept of virtual subbasins
on the accuracy of output using SWAT.
RELEVANT LITERATURE
SWAT (Soil and Water Assessment Tool), a continuous daily time step model
developed by Arnold et al. (1993), was obtained by adding a
new routing structure to the
SWRRB model (Arnold et al., 1990; Williams et al., 1985)
so as to remove SWRRB's restriction of being able to simulate only 10
subwatersheds. The new routing structure of SWAT
routes and adds flows down through the basin reaches and reservoirs. Apart
from this, changes were incorporated to simulate
lateral flow, ground water flow, reach routing transmission losses, and
sediment and chemical movement through ponds, reservoirs, streams and valleys.
SWAT is capable of simulating hundreds of subwatersheds for periods of 100
years or more. The major components of the model include hydrology, weather,
sedimentation, soil temperature, crop growth, nutrients, pesticides,
ground water and lateral flow, and agriculture management. Additional
details about the
model can be found in Arnold et al. (1994).
SWAT allows for considerable
flexibility in watershed discretization. The watershed can be divided into
cells and/or subwatersheds. Different parts of the watershed can be
divided differently. The dominant soil and landuse within each subbasin
are considered to be the soil and landuse of the subbasin.
However, in
order to account for multiple soil and landuse combinations, the concept
of virtual subbasins was incorporated into SWAT. Instead of assuming
the dominant soil and landuse to be the soil and landuse of the
subbasin, each subbasin is discretized into virtual areas (referred to
as virtual basins), each having a unique soil and landuse
combination without reference to its spatial position within the subbasin.
This is similar to the concept of Hydrologic Response Units (HRUs) given
by Maidment (1991).
The hydrologic response is generated within each of these virtual areas and
then the weighted average (by area) of the response from these virtual
subbasins is taken to be the output of the subbasin. Since there can
be large numbers of such combinations, a threshold is set.
Only soil and landuse combinations forming a proportion larger than that
of the threshold are considered. The threshold
is arbitrary and is set by the user.
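The area-weighted averaging described above can be illustrated with a minimal sketch. This is not SWAT code: the function name, the input layout, and the renormalization of the weights over the retained combinations are assumptions made for illustration only, and the response values are made up.

```python
def subbasin_output(virtual_basins, threshold=0.0):
    """Area-weighted mean response of the virtual basins in one subbasin.

    virtual_basins: list of (area_fraction, response) pairs, one per unique
        soil and landuse combination (hypothetical input layout).
    threshold: combinations covering no more than this fraction are dropped.
    """
    kept = [(area, resp) for area, resp in virtual_basins if area > threshold]
    if not kept:
        raise ValueError("no soil and landuse combination exceeds the threshold")
    # Assumption: weights are renormalized over the retained combinations.
    total_area = sum(area for area, _ in kept)
    return sum(area * resp for area, resp in kept) / total_area


# Three combinations covering 50%, 30% and 20% of a subbasin,
# with made-up runoff depths in mm.
combos = [(0.5, 10.0), (0.3, 20.0), (0.2, 40.0)]
all_combos = subbasin_output(combos)        # every combination kept
large_only = subbasin_output(combos, 0.25)  # the 20% combination is dropped
```

Raising the threshold reduces the number of combinations that must be simulated, at the cost of ignoring the response of the smaller areas.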
In SWAT, the dominant soil and landuse are chosen in two steps: first
the dominant landuse within the subbasin is determined, and then the
dominant soil within this landuse is determined. This soil and landuse
are taken to be representative of the subbasin. It should be noted that this
may not be the same as the dominant soil and landuse combination. For
example, consider a subbasin with two landuses A and B
occupying 60% and 40% of the area, where landuse A has 10 different soils
each occupying 10% of its area while landuse B has just one soil occupying
its whole area. In this case the soil and landuse combination chosen will be
landuse A with one of its soils (6% of the subbasin), even though the
combination of landuse B with its single soil occupies a larger proportion
(40%) of the area. Since, between soil and landuse,
landuse affects stream flow more, this approach seems logical. In the
virtual basin approach, all the landuses occupying an area greater than the
landuse threshold are selected. Within these landuses, the soils forming a
proportion greater than the soil threshold are selected. Both
thresholds are set by the user. The effect of varying
the thresholds on the output obtained is studied here.
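The two selection rules can be sketched as follows, using the landuse A and B example from the text. This is an illustrative sketch rather than the SWAT/GRASS interface code: all function names and the input layout are hypothetical, areas are in arbitrary integer units, and "at least the threshold" is assumed as the comparison.

```python
def landuse_totals(areas):
    """Total area of each landuse; areas maps (landuse, soil) -> area."""
    totals = {}
    for (landuse, _soil), area in areas.items():
        totals[landuse] = totals.get(landuse, 0) + area
    return totals


def dominant_soil_and_landuse(areas):
    """Dominant landuse first, then the dominant soil within that landuse."""
    totals = landuse_totals(areas)
    top_landuse = max(totals, key=totals.get)
    soils = {s: a for (lu, s), a in areas.items() if lu == top_landuse}
    return top_landuse, max(soils, key=soils.get)


def select_virtual_basins(areas, landuse_pct, soil_pct):
    """Keep landuses covering at least landuse_pct of the subbasin, and
    within them soils covering at least soil_pct of the landuse area."""
    totals = landuse_totals(areas)
    subbasin_area = sum(totals.values())
    selected = {}
    for (landuse, soil), area in areas.items():
        # Integer arithmetic avoids floating point trouble at the threshold.
        landuse_ok = 100 * totals[landuse] >= landuse_pct * subbasin_area
        soil_ok = 100 * area >= soil_pct * totals[landuse]
        if landuse_ok and soil_ok:
            selected[(landuse, soil)] = area
    return selected


# Landuse A: 60% of the subbasin, split equally among 10 soils.
# Landuse B: 40% of the subbasin, a single soil.
areas = {("A", "soil%d" % i): 60 for i in range(1, 11)}
areas[("B", "soil11")] = 400
```

With these inputs the dominant rule returns landuse A with a soil covering only 6% of the subbasin, whereas the largest single combination is landuse B with its one soil. Thresholds of 5% and 10% retain all 11 combinations, while 5% and 20% retain only the landuse B combination.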
A GRASS GIS interface to the SWAT model was developed
by Srinivasan and Arnold (1994).
Given the appropriate data layers
and databases, the interface extracts the data and writes the SWAT input
file. A number of tools were incorporated allowing for automatic extraction
of various inputs. These include tools for accessing the appropriate
databases, hydrologic tools to retrieve the topographic and other
attributes, including automatic generation of the routing structure, and
aggregation tools to aggregate the inputs at the subwatershed level.
Once the input files are generated, the model can be run and the results
visualized using the output interface. The GIS interface facilitated the
present study to a large extent.
Studies of the impact of the level of discretization on the
output of basin scale models include those of Wood et al. (1988) and
Sasowsky and Gardner (1991).
Wood et al. introduced the concept of the representative
elementary area (REA) in hydrologic modeling. They refer to REA as a
fundamental building block of catchment modeling. They argue that at
smaller scales,
actual patterns of variability of topography, soil or rainfall lead to
differences in the output even though the underlying distribution is
the same. As larger and larger scales are considered, more and more
of the variability is sampled and then finally an area is obtained
whose hydrologic response can be considered to be the net effect of
the individual point hydrologic responses within the subbasin or basin.
So a basin with all its variation in soils, topography, and weather
can be represented by these REAs without much loss in quality of the
output. To show the existence of the REA, Wood et al. discretized the
Coweeta River experimental catchment in North Carolina, which has an area
of 17 km^2, into 3, 19, 39 and 89
subcatchments by the method described by Band (1986).
In order to emulate the point hydrologic response, which can
then be averaged to form the basin hydrologic response, they applied
TOPMODEL (Beven and Kirkby, 1979; Beven, 1986) within each 30 m pixel
of the catchment. The pixel output was then aggregated to form
the subbasin response. The subbasin responses were then arranged
in increasing order of their areas and a running average over 15 subcatchments,
moving in steps of 5, was taken. The mean area within each window
was plotted against the mean response. The graphs indicated that
the areal response stabilized at an area of around 1 km^2. The size was the same
for all the outputs studied. Thus they concluded that the REA for this
catchment was 1 km^2. They made further studies and remarked that the size
of the REA is governed primarily by the topography. Soil and rainfall
variability did not play a big role in determining the size of the
REA. In that study, only variability in soil, rainfall and topography
was studied. In general, large catchments also
have land use variability to consider.
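The running-average procedure attributed to Wood et al. (subcatchments sorted by area, a window of 15 moved in steps of 5) can be sketched as below. The function name and input layout are assumptions for illustration, and the sketch does not reproduce TOPMODEL or any data from that study.

```python
def windowed_means(areas, responses, window=15, step=5):
    """Mean area and mean response in a window sliding over subcatchments
    sorted by increasing area. Returns a list of (mean_area, mean_response)."""
    pairs = sorted(zip(areas, responses))  # sort by area
    points = []
    for start in range(0, len(pairs) - window + 1, step):
        chunk = pairs[start:start + window]
        mean_area = sum(a for a, _ in chunk) / window
        mean_resp = sum(r for _, r in chunk) / window
        points.append((mean_area, mean_resp))
    return points


# Synthetic example: 25 subcatchments with areas 1..25 and a made-up
# response equal to twice the area, giving three windows.
demo_areas = list(range(1, 26))
demo_responses = [2 * a for a in demo_areas]
demo_points = windowed_means(demo_areas, demo_responses)
```

Plotting the mean response against the mean area for each window and looking for the area beyond which the curve flattens is the kind of inspection that led to the REA estimate described above.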
Sasowsky and Gardner (1991) used three
different configurations of the 146 km^2 Walnut Gulch watershed in Arizona:
>= 2nd, >= 4th, >= 13th stream order with 28, 15, and 1
channel segments and 66, 37, and 3 contributing areas, and made SPUR
runs for each of these configurations. The runoff at the outlet was
compared with observed data, and the results seemed to imply that the >= 4th
order stream network gave results as good as the >= 2nd; hence they
concluded that an REA exists for the basin under consideration. However, again
this study does not consider land use variability, and the authors admit that the
evaluation criteria used affect the conclusions that can be drawn.
Also, calibration (of the curve number) needed to be done on the model, and
the model evaluation results depended on the changes made in
the curve number. Due to these ambiguities, it is difficult to conclude that the
study establishes the existence of the REA.
METHODOLOGY
A watershed in Texas with an area of 4297 km^2 was used in this study. It
had originally been subdivided into 40 subbasins, composed of agriculture and
range land. Figure 1 shows the watershed with the 40
subbasins. Using the "r.watershed" tool within the GRASS GIS and
the 1:250,000 DEM, the watershed was discretized into 4, 8, 14, 20, 24, 29,
35, 40 and 54 subbasins. Measured stream flow data were available at two
locations
within the watershed. Since neither of these gages is located at the
outlet of a subbasin, the simulated flow draining into the subbasin where
the gage was located was extracted and compared with the measured flow. Statistics
used in the comparison are the coefficient of determination and the coefficient
of efficiency of Nash and Sutcliffe (1971). A
coefficient of efficiency of 1 indicates perfect agreement. If the results
are highly correlated but biased, then the coefficient of efficiency will be
less than the coefficient of determination (Aitken, 1973).
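For reference, the two statistics can be computed as in the sketch below. The coefficient of efficiency follows the usual Nash and Sutcliffe form, one minus the ratio of the residual sum of squares to the sum of squares of the observations about their mean. Taking the coefficient of determination as the squared Pearson correlation is an assumption here, since the exact form used in the study is not stated, and the function names are illustrative.

```python
def coefficient_of_efficiency(observed, simulated):
    """Nash-Sutcliffe coefficient of efficiency (1 means perfect agreement)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def coefficient_of_determination(observed, simulated):
    """Squared Pearson correlation between observed and simulated values
    (assumed definition)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    sxy = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    sxx = sum((o - mean_obs) ** 2 for o in observed)
    syy = sum((s - mean_sim) ** 2 for s in simulated)
    return sxy * sxy / (sxx * syy)


# Made-up flows: the simulation is perfectly correlated with the
# observations but biased (every value is doubled).
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
sim = [2.0 * o for o in obs]
coe = coefficient_of_efficiency(obs, sim)
r2 = coefficient_of_determination(obs, sim)
```

In this synthetic case the coefficient of determination is 1 while the coefficient of efficiency is strongly negative, which illustrates the behavior for correlated but biased results noted above.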

Figure 1: The Bosque Watershed
Simulations were made both for the dominant case, where the dominant soil
and landuse within each subbasin were considered to be the soil and landuse
of that subbasin, and for the virtual basin approach with thresholds ranging from
5% to 20% for landuse and 10% to 40% for soil.
For example, a threshold of 10% for soil and 5% for landuse indicates that
landuses which form at least 5% of the subbasin area and soils which form at
least 10% of the area within each of the selected landuses will be taken
as virtual basins. Results for all these cases are presented here. In
the tables of results the different thresholds are given as "landuse
threshold and soil threshold". For example, "10% and 20%" in a
results table indicates that 10% is the landuse threshold and 20% the soil
threshold used for the corresponding results.
Observed flow data were available for the years 1965 to 1974 and 1975 to 1984
for two
different USGS gages, 5000 and 5200, within the basin.
Simulations were made for these time periods and the
results compared. A single simulation covering both time
periods was not made since the rain gage data available changed in 1975.
No calibration whatsoever was attempted throughout the study,
so that the impact of spatial variability alone could be studied. Also,
to remove the impact of weather variability, the Thiessen polygon average of
all the weather gages present in the basin was taken to be the rainfall and
temperature data for all the subbasins within the basin.
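A Thiessen polygon average weights each gage by the area of its polygon. The sketch below applies this to daily series to produce a single basin-wide series of the kind used for every subbasin. The function name, gage labels, polygon areas and rainfall values are all hypothetical.

```python
def thiessen_series(gage_series, polygon_areas):
    """Basin-average daily series from per-gage series.

    gage_series: dict gage_id -> list of daily values (equal lengths).
    polygon_areas: dict gage_id -> area of that gage's Thiessen polygon.
    """
    total_area = sum(polygon_areas[g] for g in gage_series)
    n_days = len(next(iter(gage_series.values())))
    basin = []
    for day in range(n_days):
        weighted = sum(gage_series[g][day] * polygon_areas[g] for g in gage_series)
        basin.append(weighted / total_area)
    return basin


# Two hypothetical gages with polygon areas in km^2 and rainfall in mm.
rain = {"g1": [0.0, 10.0, 4.0], "g2": [0.0, 20.0, 8.0]}
polygons = {"g1": 3000.0, "g2": 1000.0}
basin_rain = thiessen_series(rain, polygons)
```

Because the same averaged series drives every subbasin, differences between configurations cannot be attributed to differences in the weather input.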
RESULTS
Statistics, including mean, standard deviation,
coefficient of efficiency, and coefficient of determination
were computed for the simulations described previously.
Results are presented for both USGS gages 5000
and 5200 within the study basin for years 1965-1974 and 1975 to 1984.
These gages are upstream of the outlet,
so the number of basins draining into the basin having these
gages is different from the total number of basins in the watershed.
In the result tables, the number of basins draining into these gages,
as well as the total number of basins in the watershed are presented.
The coefficient of efficiency (COE) is used as a measure of accuracy of
the simulated results and presented here, though other measures generally
followed the same trend.


Table 1 gives the coefficients
of efficiency obtained for different basin configurations using the dominant
soil and landuse approach and various soil and landuse thresholds for
the virtual basin approach for years 1965-1974 for gage 5000.
As noted before, no calibration
of the results was attempted. From the table it is clear that
as the number of basins increases so does the coefficient of efficiency. For
example, when using the dominant soil and landuse approach with just 1 basin, the
coefficient of efficiency is 0.31, while with 28 basins the coefficient
of efficiency rises to 0.72. Within the same configuration, the accuracy of the
simulation increases as more and more soil and landuse combinations are
simulated by choosing smaller soil and landuse thresholds, with the best results
obtained with the 5% landuse and 10% soil thresholds. However, the increase
in accuracy becomes minimal as more and more basins are chosen for
simulation, i.e. for more and more detailed configurations. For example,
with 28 basins there is practically no increase in accuracy between the
dominant approach and the other configurations, with all giving a COE of
around 0.72. However, with 1 basin using the dominant approach, the COE is
0.31, which increases to 0.68 with landuse and soil thresholds of 5% and
10%. The results for gage 5200 given in Table 2 are
very similar.
Results for 1975 to 1984 are given in Tables 3 and
4 for gages 5000 and 5200 respectively. The results
are relatively poor compared to the previous case (the maximum COE
achieved was 0.74 for gages 5000 and 5200 for 1965-1974, but 0.48 and
0.49 for 1975-1984 for gages 5000 and 5200 respectively).
The conclusions drawn,
however, are the same as in the previous case. As the number of basins used
in the simulations increased, so did the accuracy, for each soil and landuse
threshold chosen. Also, within each configuration, as the number of soil and
landuse combinations chosen increased, so did the accuracy. However, this
increase was minimal for detailed configurations compared to less detailed
configurations.
From these results, two conclusions can be drawn. Firstly, an
increase in accuracy can be obtained either by increasing the number of
basins used in the simulation or by increasing the number of soil and landuse
combinations within each subbasin. Secondly, there is a limit to the accuracy
that can be obtained. Increased detail in the soil and landuse combinations
simulated or in the basin configuration may not give rise to better
results, but it does increase the number of
simulations made, since simulations are needed for each soil, landuse and
basin combination selected. For example, if the configuration has 40 subbasins
and within each subbasin on average 5 soil and landuse combinations have
been selected, a total of 200 combinations need to be simulated.
Considering Table 1, using 54 subbasins (which
leads to 35 subbasins flowing into the stream gage), the number of
combinations simulated in the dominant case was 54, while using a 5% threshold
for landuse and a 10% threshold for soil the number of combinations was
362; however, there is no increase in COE (both are 0.73). Similarly,
examining the 5% and 10% threshold column in the table, there is
practically no increase in accuracy between 5 subbasins and 35 subbasins.
However, this does not indicate that using 5 subbasins with the above thresholds
will always give the best results. For example, for gage 5200 for years 1975 to
1984 (Table 4), the COE increased from 0.43 for 5 subbasins
to 0.49 for 35 subbasins. Even though the increase is not large, it shows
that 5 subbasins will not always give results as good as those of the
54-subbasin configuration.


Even though the conclusions that can be drawn are similar for both
1965-74 and 1975-84, there is a large difference in accuracy. This
may be because 1975-84 was drier (annual average 783.1 mm) than
1965-74 (annual average 859.1 mm), and SWAT is known to do better in wet
conditions. Also, the landuse might have changed between the two
simulation periods. This might have led to the relatively poor results.
This, however, leads to interesting conclusions. For example,
looking at Table 2 one might conclude that reasonable
results would be obtained using the dominant soil and landuse with 14 basins, with
no increase in accuracy from going to more basins or more detailed
soil and landuse combinations. However, from
Table 4, it is clear that this may not lead to the best
possible results. The conditions being simulated seem to have an impact
on the optimum basin configuration.
Next, some studies were made to determine, if possible, the optimal
configuration.
There are two different aspects to this problem. The first is to choose
the appropriate landuse and soil thresholds given a particular basin
configuration, and the second is to choose an appropriate basin
configuration. The first of these two problems is addressed here.
For each of the basin configurations and soil
and landuse thresholds, the curve number was plotted against
the percentage of area with the corresponding curve number. Also plotted
on the same graph was the curve number distribution obtained for the same
configuration but with zero soil and landuse thresholds. This was done for each of the
basin configurations and the various thresholds for which simulation
runs were made, in each case comparing with the
curve number distribution of the same configuration with zero soil and landuse
thresholds. In general it was found that the closer the distribution is to
this more detailed, zero-threshold distribution, the better
were the results. In order to quantify this,
the coefficient of efficiency was
calculated between the detailed distribution mentioned above and
that of each soil and landuse threshold for the same
configuration; the values are presented in Table 5.
A higher COE between the distributions in general led to a
higher COE between the observed and simulated results, and vice versa.
For example, when the watershed is composed of 4 subbasins,
the COE in Table 5 is very low for all soil and landuse thresholds except
the 5% and 10% thresholds.
Correspondingly, in Tables 1
and 3, the COEs are low for all combinations except for
the 5% and 10% thresholds. A low COE in Table 5
does not necessarily indicate poor results in all cases.
For example, for 54 subbasins, even
though the COE between the distributions increased from 0.60 in the dominant case to 0.93 for
the 5% and 10% thresholds case, the accuracy of the simulations was
identical for years 1965 to 1974 (Tables 1 and
2), with the COE between observed and simulated results
being above 0.70. However, for 1975 to 1984 there is an improvement in
the simulation results for this same configuration. For example,
in Table 3 the COE increased from 0.33 in the dominant
case to 0.48 for the 5% and 10% thresholds. In most cases there
is a big jump in the COE between the distributions from all other thresholds to the 5% and
10% landuse and soil thresholds. Corresponding to this, there is also
a jump in the COE between observed and simulated results, which is especially
noticeable in the 1975-1984 results.
Checking the COE
between the detailed (zero-threshold) curve number distribution and the
distribution for a given set of thresholds may be a convenient way
of determining the optimal thresholds within a particular configuration.
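The comparison of curve number distributions can be sketched as follows. The detailed (zero-threshold) distribution plays the role of the observations in the coefficient of efficiency. Evaluating the two distributions over the union of their curve numbers is an assumption made for this sketch, as are the function names and the areas and curve numbers in the example.

```python
def cn_distribution(combos):
    """Percentage of area per curve number.

    combos: list of (area, curve_number) pairs for the simulated
        soil and landuse combinations (hypothetical input layout).
    """
    total = sum(area for area, _ in combos)
    dist = {}
    for area, cn in combos:
        dist[cn] = dist.get(cn, 0.0) + 100.0 * area / total
    return dist


def distribution_coe(detailed, thresholded):
    """Coefficient of efficiency of a thresholded distribution against the
    detailed one. Undefined if the detailed distribution is uniform."""
    cns = sorted(set(detailed) | set(thresholded))
    ref = [detailed.get(cn, 0.0) for cn in cns]
    test = [thresholded.get(cn, 0.0) for cn in cns]
    mean_ref = sum(ref) / len(ref)
    ss_res = sum((r - t) ** 2 for r, t in zip(ref, test))
    ss_tot = sum((r - mean_ref) ** 2 for r in ref)
    return 1.0 - ss_res / ss_tot


# Made-up subbasin: three combinations in the detailed case, the two
# largest after thresholding, and only the largest in the dominant case.
detailed = cn_distribution([(200, 60), (300, 70), (500, 80)])
partial = cn_distribution([(300, 70), (500, 80)])
dominant = cn_distribution([(1000, 80)])
```

In this example the partially thresholded distribution scores higher against the detailed distribution than the dominant-only distribution does, in line with the trend reported for smaller thresholds.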

CONCLUSIONS
With the integration of GIS and distributed parameter hydrologic
models, a watershed can be divided into many
subbasins. However, the effect of discretization on the quality
of the simulated output has not been widely studied. Using the
concept of virtual basins, the different soils and landuses in a subbasin
can be simulated to the level of detail intended. Using a 4300 km^2
watershed in Texas, the present study was undertaken to examine the effect of
increasing levels of discretization and of virtual basins on the accuracy of
the output. The basin was divided into numerous configurations using
the "r.watershed" tool within the GRASS GIS. Simulations were made for
the various configurations and, within each of these configurations, for various soil
and landuse thresholds for the selection of virtual basins.
The results indicate that, in general, increasing the level
of discretization and increasing the number of soil and landuse combinations
simulated within each subbasin increase the accuracy of the simulation.
There is a level beyond which the accuracy cannot be improved, suggesting
that more detailed simulation may not always lead to better results.
It is important to determine the optimal configuration, so that reasonable
results can be obtained without the need for detailed simulations. It
was also noted that, across the different time periods considered, some
of the coarser levels of discretization may perform well for one period but
not for another, whereas the finer simulations performed
well throughout.
In order to examine the results further, the proportion of
the watershed area having different curve numbers was plotted for each of the
soil and landuse thresholds against that obtained when all soil and landuse
combinations are considered within that particular configuration.
The coefficient of
efficiency was calculated, and it was seen that in general, within a particular
configuration, as smaller and smaller thresholds are considered, the
curve number distribution better matches that obtained using zero
thresholds and the simulation results improve. This approach seems to hold
promise for determining the optimal soil and landuse thresholds that
need to be chosen within a particular configuration, though the optimal
basin configuration cannot be determined in this way.
There are some limitations to this study. One is that
for other model outputs such as sediment, soil
and topographic properties other than the curve number play a major role in
determining the output.
So in such cases, matching the curve number distribution may not indicate
better results. Also, no method to determine
the optimal basin configuration has been given.
Finally, the effect of weather variability has not been considered, with a single
weather input used to simulate all the
configurations. With a detailed configuration the weather might be better
represented than with coarser configurations. More studies need to be
conducted on other watersheds of different sizes and variability in order
to validate these results.
REFERENCES
Aitken, A.P. 1973. Assessing systematic errors in rainfall-runoff models. Journal of Hydrology, 20: 121-136.
Arnold, J.G., Engel, B.A., and Srinivasan, R. 1993. Continuous time grid cell watershed model. In Heatwole, C.D. (Ed.), Application of Advanced Information Technologies: Effective Management of Natural Resources. Information and Electrical Technologies Division of ASAE, American Society of Agricultural Engineers, 2950 Niles Rd, St. Joseph, Michigan 49085-9659 USA.
Arnold, J.G., Williams, J.R., Nicks, A., and Sammons, N.B. 1990. SWRRB: A Basin Scale Simulation Model. College Station: Texas A&M Press.
Arnold, J.G., Williams, J.R., Srinivasan, R., King, K.W., and Griggs, R.H. 1994. SWAT: Soil and Water Assessment Tool. USDA, Agricultural Research Service, Grassland, Soil and Water Research Laboratory, 808 East Blackland Rd, Temple, TX 76502.
Band, L.E. 1986. Topographic partition of watersheds with digital elevation models. Water Resources Research, 22(1): 15-24.
Beasley, D.B., Huggins, L.F., and Monke, E.J. 1980. ANSWERS: A model for watershed planning. Transactions of the ASAE, 23(4): 938-944.
Beven, K.J. 1986. Runoff production and flood frequency in catchments of order n: an alternative approach. In Gupta, V.K., Rodriguez-Iturbe, I., and Wood, E.F. (Eds.), Scale Problems in Hydrology.
Beven, K.J. and Kirkby, M.J. 1979. A physically based variable contributing area model of basin hydrology. Hydrological Sciences Bulletin, 24(1): 43-69.
Engel, B.A., Srinivasan, R., and Rewerts, C. 1993. A spatial decision support system for modeling and managing agricultural nonpoint source pollution. In Goodchild, M.F., Parks, B.O., and Steyaert, L.T. (Eds.), Environmental Modeling with GIS (pp. 231-237). Oxford University Press, New York, NY.
Maidment, D.R. 1991. GIS and hydrologic modeling. Prepared for presentation at the First International Symposium/Workshop on GIS and Environmental Modeling, Boulder, Colorado.
Nash, J.E. and Sutcliffe, J.V. 1971. River flow forecasting through conceptual models. Journal of Hydrology, 13: 297-324.
Rewerts, C.C. and Engel, B.A. 1991. ANSWERS on GRASS: Integrating a watershed simulation with GIS. ASAE Paper No. 91-2621. American Society of Agricultural Engineers, St. Joseph, MI.
Sasowsky, K.C. and Gardner, T.W. 1991. Watershed configuration and geographic information system parametrization for SPUR model for hydrologic simulations. Water Resources Bulletin, 27(1): 7-18.
Srinivasan, R. and Arnold, J.G. 1994. Integration of a basin scale water quality model with GIS. Water Resources Bulletin, 30(3): 453-462.
Williams, J.R., Nicks, A.D., and Arnold, J.G. 1985. SWRRB, a simulator for water resources in rural basins. ASCE Hydraulics Journal, 111(6): 970-986.
Wischmeier, W.H. and Smith, D.D. 1978. Predicting rainfall erosion losses - a guide to conservation planning. Agriculture Handbook No. 537, Science and Education Administration, USDA.
Wood, E.F., Sivapalan, M., Beven, K., and Band, L. 1988. Effects of spatial variability and scale with implications to hydrologic modeling. Journal of Hydrology, 102: 29-47.
Young, R.A., Onstad, C.A., Bosch, D.D., and Anderson, W.P. 1987. AGNPS, agricultural non-point-source pollution model: A watershed analysis tool. Conservation Research Report 35, U.S. Department of Agriculture.