Sudhakar Mamillapalli, R. Srinivasan, J. G. Arnold, Bernard A. Engel

Effect of Spatial Variability on Basin Scale Modeling

ABSTRACT

With the integration of GIS and distributed parameter hydrologic models, a watershed can be divided into many subbasins. However, the effect of discretization on the quality of the simulated output has not been widely studied. Using the concept of virtual basins, the different soils and landuses in a subbasin can be simulated to the level of detail desired. Using a 4300 km^2 watershed in Texas, the present study examined the effect of increasing the level of discretization and the number of virtual basins on the accuracy of the output. The results indicate that, in general, increasing the level of discretization and increasing the number of soil and landuse combinations simulated within each subbasin increases the accuracy of the simulation. There is a level beyond which the accuracy can't be improved, suggesting that more detailed simulation may not always be necessary. Preliminary investigations have been conducted to determine the optimal configuration.

INTRODUCTION

Hydrologic models can be broadly divided into lumped parameter models and distributed parameter models. The lumped parameter approach considers the whole catchment as a single entity and maps the input rainfall excess to an output hydrograph. Though computationally efficient, this approach doesn't explicitly account for the spatial variability present within the catchment. Chief among this type of model is the USLE (Wischmeier and Smith, 1978). Distributed models divide the catchment into a number of smaller areas (which could be square elements or subcatchments), each assumed to be uniform with respect to the hydrologic parameters. Hydrology is simulated within each of these elements and the output is routed to the outlet. Hence these models take the spatial variability of the watershed into consideration. Examples include the AGNPS (AGricultural Non-Point Source Pollution) model (Young et al., 1987), ANSWERS (Areal Nonpoint Source Watershed Environment Response Simulation) (Beasley et al., 1980) and SWAT (Soil and Water Assessment Tool) (Arnold et al., 1993). Considerable time and effort are required to acquire the data, run the models and interpret the resulting information. Integration with a GIS can eliminate many of these problems. Several models have been integrated with GIS, including AGNPS and GRASS GIS by Srinivasan and Engel (1994), ANSWERS and GRASS (Rewerts and Engel, 1991), and SPUR and ERDAS (Sasowsky and Gardner, 1991).

As noted before, these models discretize the watershed into smaller elements either by overlaying a square grid (ANSWERS or AGNPS) or by dividing it into subbasins (SWAT and SPUR). With the integration of these models with a GIS, it is possible to divide the watershed into a large number of elements, since the GIS automatically generates the input. Hence spatial variability can be considered to the level of detail supported by the data. However, as the number of such elements increases, so does the computation time. It is not clear from studies to date whether increasing the level of detail of the input improves the accuracy of the simulated output. For effective use of the above tools, it is necessary to be able to discretize the watershed to an appropriate level of detail. A coarse discretization may lead to poor simulation results, whereas a very fine discretization would require far more input data and significantly more computation time and storage (which may be important for large watersheds comprising hundreds of subbasins) with little or no increase in accuracy. Also, using the concept of virtual basins, different soil and landuse combinations within a subbasin can be simulated, instead of considering the dominant landuse and soil to be representative of the subbasin. It may not be necessary to consider all the combinations within a subbasin. The impact of increased detail in accounting for the various soil and landuse combinations within a subbasin, and the effect of the level of discretization on accuracy, are studied for a watershed in Texas and presented here.

OBJECTIVES

  1. Quantify the effect that level of discretization has on the accuracy of output obtained from the distributed parameter hydrologic basin model SWAT.
  2. Examine the impact of increased detail in simulating the soil and landuse combinations within a subbasin using the concept of virtual subbasins on the accuracy of output using SWAT.

    RELEVANT LITERATURE

    SWAT (Soil and Water Assessment Tool), a continuous daily time step model developed by Arnold et al. (1993), was obtained by adding a new routing structure to the SWRRB model (Arnold et al., 1990; Williams et al., 1985) so as to remove SWRRB's restriction of simulating at most 10 subwatersheds. The new routing structure of SWAT routes and adds flows down through the basin reaches and reservoirs. Apart from this, changes were incorporated to simulate lateral flow, ground water flow, reach routing transmission losses, and sediment and chemical movement through ponds, reservoirs, streams and valleys. SWAT is capable of simulating hundreds of subwatersheds for periods of 100 years or more. The major components of the model include hydrology, weather, sedimentation, soil temperature, crop growth, nutrients, pesticides, ground water and lateral flow, and agricultural management. Additional details about the model can be found in Arnold et al. (1994).

    SWAT allows for considerable flexibility in watershed discretization. The watershed can be divided into cells and/or subwatersheds, and different parts of the watershed can be divided differently. By default, the dominant soil and landuse within each subbasin are taken to be the soil and landuse of the subbasin. To account for multiple soil and landuse combinations, the concept of virtual subbasins was incorporated into SWAT. Instead of assuming the dominant soil and landuse to represent the subbasin, each subbasin is discretized into virtual areas (referred to as virtual basins), each having a unique soil and landuse combination, without reference to their spatial position within the subbasin. This is similar to the concept of Hydrologic Response Units (HRUs) given by Maidment (1991). The hydrologic response is generated within each of these virtual areas, and the area-weighted average of the responses from these virtual basins is taken to be the output of the subbasin. Since there can be a large number of such combinations, a threshold is set: only soil and landuse combinations forming a proportion of the area larger than the threshold are considered. The threshold is arbitrary and is set by the user.
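
The area weighting described above can be sketched in a few lines of Python. This is only an illustration, not SWAT code: the function name, the landuse and soil labels, and the runoff values are hypothetical, and the renormalization of the retained area fractions is an assumption, since the text does not say how the area of combinations dropped by the threshold is handled.

```python
from typing import Dict, Tuple

Combo = Tuple[str, str]  # (landuse, soil); labels below are hypothetical


def subbasin_response(area_fraction: Dict[Combo, float],
                      response: Dict[Combo, float]) -> float:
    """Area-weighted mean of the responses of the retained virtual basins.

    Assumption: fractions are renormalized over the retained combinations,
    so area dropped by the threshold does not reduce the subbasin output.
    """
    total = sum(area_fraction.values())
    return sum(area_fraction[c] * response[c] for c in area_fraction) / total


# Two retained combinations covering 80% of a subbasin (made-up numbers).
fractions = {("range", "clay"): 0.5, ("crop", "loam"): 0.3}
runoff_mm = {("range", "clay"): 10.0, ("crop", "loam"): 20.0}
weighted = subbasin_response(fractions, runoff_mm)
```

Because the virtual basins carry no spatial position, the subbasin output depends only on the area fractions and the per-combination responses.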

    The dominant soil and landuse are chosen in SWAT as follows: first the dominant landuse within the subbasin is determined, and then the dominant soil within this landuse is determined. This soil and landuse are taken to be representative of the subbasin. It should be noted that this may not be the same as the dominant soil and landuse combination. For example, consider a subbasin with two landuses, A and B, occupying 60% and 40% of the area. Landuse A has 10 different soils, each occupying 10% of the area under A, while landuse B has a single soil covering all of its area. In this case the combination chosen will be landuse A with one of its soils, even though landuse B with its single soil occupies a larger proportion of the subbasin. Since landuse affects stream flow more than soil does, this approach seems logical. In the virtual basin approach, all the landuses occupying an area greater than the landuse threshold are selected. Within these landuses, the soils forming a proportion greater than the soil threshold are selected. Both thresholds are set by the user. The effect of varying the thresholds on the output obtained is studied.
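
The two selection rules can be illustrated with the example above. The following Python sketch is not the SWAT implementation; the function names and soil labels are hypothetical, areas are given in percent of the subbasin, and a combination is kept when its share is at least the threshold (the text uses both "greater than" and "at least", so the inclusive comparison is an assumption).

```python
from collections import defaultdict

# Hypothetical subbasin, in percent of subbasin area: landuse A (60%) is
# split evenly over ten soils, landuse B (40%) lies on a single soil.
areas = {("A", f"a{i}"): 6 for i in range(10)}
areas[("B", "b0")] = 40


def landuse_totals(areas):
    """Total area of each landuse, summed over its soils."""
    totals = defaultdict(int)
    for (landuse, _soil), area in areas.items():
        totals[landuse] += area
    return dict(totals)


def dominant(areas):
    """Dominant landuse first, then the dominant soil within that landuse."""
    totals = landuse_totals(areas)
    landuse = max(totals, key=totals.get)
    soils = {s: a for (lu, s), a in areas.items() if lu == landuse}
    return landuse, max(soils, key=soils.get)


def virtual_basins(areas, landuse_pct, soil_pct):
    """Keep landuses covering >= landuse_pct of the subbasin, then soils
    covering >= soil_pct of the area of each kept landuse."""
    totals = landuse_totals(areas)
    whole = sum(totals.values())
    return [
        (lu, soil)
        for (lu, soil), area in areas.items()
        if 100 * totals[lu] >= landuse_pct * whole
        and 100 * area >= soil_pct * totals[lu]
    ]


dominant_choice = dominant(areas)              # landuse A with one of its soils
largest_combination = max(areas, key=areas.get)  # landuse B on its single soil
```

With a 5% landuse and 20% soil threshold, `virtual_basins(areas, 5, 20)` keeps only the landuse B combination, because each soil under A covers just 10% of A; lowering the soil threshold to 10% keeps all eleven combinations.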

    A GRASS GIS interface to the SWAT model was developed by Srinivasan and Arnold (1994). Given the appropriate data layers and databases, the interface extracts the data and writes the SWAT input files. A number of tools were incorporated to allow automatic extraction of the various inputs. These include tools for accessing the appropriate databases, hydrologic tools to retrieve the topographic and other attributes (including automatic generation of the routing structure), and aggregation tools to aggregate the inputs at the subwatershed level. Once the input files are generated, the model can be run and the results visualized using the output interface. The GIS interface facilitated the present study to a large extent.

    Among the studies made to determine the impact of the level of discretization on the output of basin scale models are those of Wood et al. (1988) and Sasowsky and Gardner (1991). Wood et al. introduced the concept of the representative elementary area (REA) in hydrologic modeling. They refer to the REA as a fundamental building block of catchment modeling. They argue that at smaller scales, actual patterns of variability in topography, soil or rainfall lead to differences in the output even though the underlying distribution is the same. As larger and larger scales are considered, more and more of the variability is sampled, until finally an area is obtained whose hydrologic response can be considered to be the net effect of the individual point hydrologic responses within the subbasin or basin. So a basin with all its variation in soils, topography, and weather can be represented by these REAs without much loss in quality of the output. To demonstrate the existence of the REA, Wood et al. discretized the 17 km^2 Coweeta River experimental catchment in North Carolina into 3, 19, 39 and 89 subcatchments by the method described by Band (1986). In order to emulate the point hydrologic response, which can then be averaged to form the basin hydrologic response, they applied TOPMODEL (Beven and Kirkby, 1979; Beven, 1986) within each 30 m pixel comprising the catchment. The pixel output was then aggregated to form the subbasin response. The subbasin responses were then arranged in increasing order of area, and a running average over 15 subcatchments, moving in steps of 5, was taken. The mean area within each window was plotted against the mean response. The graphs indicated that the areal response stabilized at an area of around 1 km^2. The size was the same for all the outputs studied. Thus they concluded that the REA for this catchment was 1 km^2. They made further studies and remarked that the size of the REA is governed primarily by the topography. Soil and rainfall variability didn't play a big role in determining the size of the REA. In that study, only variability in soil, rainfall and topography was studied. In general, large catchments also have landuse variability to consider.
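
The running-average step can be sketched as follows. This is a minimal Python illustration of a window of 15 subcatchments moved in steps of 5 over the area-sorted responses; the function name and the sample data are hypothetical and are not taken from Wood et al.

```python
def windowed_means(areas, responses, window=15, step=5):
    """Sort subcatchments by area, then average area and response over a
    sliding window of `window` subcatchments advanced by `step`."""
    pairs = sorted(zip(areas, responses))
    means = []
    for start in range(0, len(pairs) - window + 1, step):
        chunk = pairs[start:start + window]
        mean_area = sum(a for a, _ in chunk) / window
        mean_response = sum(r for _, r in chunk) / window
        means.append((mean_area, mean_response))
    return means


# Made-up data: 25 subcatchments whose response is proportional to area.
sample_areas = list(range(1, 26))
sample_responses = [2 * a for a in sample_areas]
curve = windowed_means(sample_areas, sample_responses)
```

Plotting the mean response against the mean area of each window gives the kind of curve from which the area where the response stabilizes is read.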

    Sasowsky and Gardner (1991) used three different configurations of the 146 km^2 Walnut Gulch watershed in Arizona (>= 2nd, >= 4th, and >= 13th stream order, with 28, 15, and 1 channel segments and 66, 37, and 3 contributing areas, respectively) and made SPUR runs for each of these configurations. The runoff at the outlet was compared with observed data. The results seemed to imply that the >= 4th order stream network gave results as good as those of the >= 2nd order network, and hence they concluded that an REA exists for the basin under consideration. However, this study again doesn't consider landuse variability, and the authors admit that the evaluation criteria used affect the conclusions that could be drawn. Also, the model had to be calibrated (through the curve number), and the model evaluation results depended on the changes made to the curve number. Due to these ambiguities, it is difficult to conclude that the study establishes the existence of the REA.

    METHODOLOGY

    A watershed in Texas with an area of 4297 km^2 was used in this study. It had originally been subdivided into 40 subbasins, composed of agricultural land and rangeland. Figure 1 shows the watershed with the 40 subbasins. Using the "r.watershed" tool within the GRASS GIS and the 1:250,000 DEM, the watershed was discretized into 4, 8, 14, 20, 24, 29, 35, 40 and 54 subbasins. Measured stream flow data were available at two locations within the watershed. Since neither gage is located at the outlet, the simulated flow draining into the subbasin containing each gage was extracted and compared with the observed flow. The statistics used in the comparison are the coefficient of determination and the coefficient of efficiency of Nash and Sutcliffe (1971). A coefficient of efficiency of 1 indicates perfect agreement. If the results are highly correlated but biased, then the coefficient of efficiency will be less than the coefficient of determination (Aitken, 1973).
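
The two statistics can be written out in a short Python sketch. It assumes the usual forms: the coefficient of efficiency as one minus the ratio of the squared simulation error to the squared deviation of the observations from their mean, and the coefficient of determination as the squared linear correlation. The function names and sample values are illustrative only.

```python
from math import fsum


def coefficient_of_efficiency(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean_o)^2)."""
    mean_obs = fsum(observed) / len(observed)
    residual = fsum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = fsum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


def coefficient_of_determination(observed, simulated):
    """Squared Pearson correlation between observed and simulated values."""
    n = len(observed)
    mean_obs = fsum(observed) / n
    mean_sim = fsum(simulated) / n
    cov = fsum((o - mean_obs) * (s - mean_sim)
               for o, s in zip(observed, simulated))
    var_obs = fsum((o - mean_obs) ** 2 for o in observed)
    var_sim = fsum((s - mean_sim) ** 2 for s in simulated)
    return cov * cov / (var_obs * var_sim)


# Made-up flows: the simulation tracks the observations but is biased by +1.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
r2 = coefficient_of_determination(observed, simulated)
coe = coefficient_of_efficiency(observed, simulated)
```

For this perfectly correlated but biased series the coefficient of determination is 1 while the coefficient of efficiency is only about 0.2, which illustrates the point attributed to Aitken (1973).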

    Figure 1: The Bosque Watershed

    Simulations were made both for the dominant case, where the dominant soil and landuse within each subbasin were taken to be the soil and landuse of that subbasin, and for the virtual basin approach, with thresholds ranging from 5% to 20% for landuse and 10% to 40% for soil. For example, a threshold of 10% for soil and 5% for landuse indicates that landuses which form at least 5% of the subbasin area, and soils which form at least 10% of the area within each of the selected landuses, will be taken as virtual basins. Results for all these cases are presented here. In the tables of results the different thresholds are given as "landuse threshold and soil threshold". For example, "10% and 20%" in a results table indicates that 10% is the landuse threshold and 20% the soil threshold used for the corresponding results. Observed flow data were available for the years 1965 to 1974 and 1975 to 1984 for two USGS gages, 5000 and 5200, within the basin. Simulations were made for these time periods and the results compared. A single simulation covering both time periods was not made, since the rain gage data available changed in 1975. No calibration whatsoever was attempted throughout the study, so that the impact of spatial variability alone could be studied. Also, to remove the impact of weather variability, the Thiessen polygon average of all the weather gages present in the basin was taken to be the rainfall and temperature data for all the subbasins within the basin.

    RESULTS

    Statistics, including mean, standard deviation, coefficient of efficiency, and coefficient of determination were computed for the simulations described previously. Results are presented for both USGS gages 5000 and 5200 within the study basin for years 1965-1974 and 1975 to 1984. These gages are upstream of the outlet, so the number of basins draining into the basin having these gages is different from the total number of basins in the watershed. In the result tables, the number of basins draining into these gages, as well as the total number of basins in the watershed are presented. The coefficient of efficiency (COE) is used as a measure of accuracy of the simulated results and presented here, though other measures generally followed the same trend.

    Table 1 gives the coefficients of efficiency obtained for the different basin configurations, using the dominant soil and landuse approach and various soil and landuse thresholds for the virtual basin approach, for the years 1965-1974 at gage 5000. As noted before, no calibration whatsoever was attempted. From the table it is clear that as the number of basins increases, so does the coefficient of efficiency. For example, using the dominant soil and landuse approach with just 1 basin, the coefficient of efficiency is 0.31, while with 28 basins it rises to 0.72. Within the same configuration, the accuracy of the simulation increases as more and more soil and landuse combinations are simulated by choosing smaller soil and landuse thresholds, with the best results obtained with the 5% landuse and 10% soil thresholds. However, this increase in accuracy becomes minimal as more and more basins are chosen for simulation, i.e. for more and more detailed configurations. For example, with 28 basins there is practically no increase in accuracy between the dominant approach and the other configurations, with all giving a COE of around 0.72. With 1 basin, however, the COE of 0.31 obtained using the dominant approach increases to 0.68 with landuse and soil thresholds of 5% and 10%. The results for gage 5200 given in Table 2 are very similar.

    Results for 1975 to 1984 are given in Tables 3 and 4 for gages 5000 and 5200 respectively. The results are relatively poor compared to the previous case (the maximum COE achieved was 0.74 for both gages 5000 and 5200 for 1965-1974, but 0.48 and 0.49 for gages 5000 and 5200 respectively for 1975-1984). The conclusions drawn, however, are the same as in the previous case. As the number of basins used in the simulations increased, so did the accuracy, for each soil and landuse threshold chosen. Also, within each configuration, as the number of soil and landuse combinations chosen increased, so did the accuracy. However, this increase was minimal for detailed configurations compared to less detailed configurations.

    From these results, two conclusions can be drawn. Firstly, an increase in accuracy can be obtained either by increasing the number of basins used in the simulation or by increasing the number of soil and landuse combinations within each subbasin. Secondly, there is a limit to the accuracy that can be obtained. Increased detail in the soil and landuse combinations simulated, or in the basin configuration, may not give rise to better results, but it does increase the number of simulations made, since a simulation is needed for each soil, landuse and basin combination selected. For example, if the configuration has 40 subbasins and on average 5 soil and landuse combinations have been selected within each subbasin, a total of 200 combinations need to be simulated. Considering Table 1, using 54 subbasins (which leads to 35 subbasins flowing into the stream gage), the number of combinations simulated in the dominant case was 54, while using a 5% threshold for landuse and a 10% threshold for soil the number of combinations was 362; however, there is no increase in COE (both are 0.73). Similarly, examining the 5% and 10% threshold column in the table, there is practically no difference in accuracy between 5 subbasins and 35 subbasins. However, this doesn't indicate that using 5 subbasins with the above thresholds will always give the best results. For example, for gage 5200 for the years 1975 to 1984 (Table 4), the COE increased from 0.43 for 5 subbasins to 0.49 for 35 subbasins. Even though the increase is not large, it shows that 5 subbasins will not always give results as good as those of the 54-subbasin configuration.

    Even though the conclusions that can be drawn are similar for both 1965-74 and 1975-84, there is a large difference in accuracy between the two periods. This may be because 1975-84 was drier (annual average 783.1 mm) than 1965-74 (annual average 859.1 mm), and SWAT is known to do better in wet conditions. Also, the landuse might have changed between the two simulation periods, which might have led to the relatively poor results. This, however, leads to interesting conclusions. For example, looking at Table 2 one might conclude that reasonable results would be obtained using the dominant soil and landuse with 14 basins, with no increase in accuracy from going to more basins or more detailed soil and landuse combinations. However, from Table 4 it is clear that this may not lead to the best possible results. The conditions being simulated seem to have an impact on the optimum basin configuration.

    Next, some studies were made to determine, if possible, the optimal configuration. There are two different aspects to this problem. The first is to choose the appropriate landuse and soil thresholds given a particular basin configuration, and the second is to choose an appropriate basin configuration. The first of these two problems is addressed here.

    For each of the basin configurations and soil and landuse thresholds, the curve number was plotted against the percentage of the area having that curve number. Also plotted on the same graph was the curve number distribution obtained for the same configuration but with 0 soil and landuse thresholds, i.e. with all combinations retained. This was done for each of the basin configurations and thresholds for which simulation runs were made, the comparison in each case being with the 0 threshold distribution of the same configuration. In general it was found that the closer the distribution was to this more detailed, 0 threshold distribution, the better were the results. In order to quantify this, the coefficient of efficiency was calculated between the detailed distribution and that of each soil and landuse threshold for the same configuration; the values are presented in Table 5.
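
A possible way to carry out this comparison is sketched below in Python. The paper does not state how the two distributions are paired, so the sketch assumes they are compared over the union of curve numbers, with zero area assigned to a curve number absent from one of them; the function names and the curve number data are hypothetical.

```python
def cn_distribution(cn_areas):
    """Percent of total area per curve number from (curve_number, area) pairs."""
    total = sum(area for _, area in cn_areas)
    dist = {}
    for cn, area in cn_areas:
        dist[cn] = dist.get(cn, 0.0) + 100.0 * area / total
    return dist


def distribution_coe(reference, candidate):
    """Coefficient of efficiency of a thresholded distribution against the
    0 threshold (reference) distribution.

    Assumption: both are evaluated on the union of their curve numbers,
    with 0% area where a curve number is missing.
    """
    cns = sorted(set(reference) | set(candidate))
    ref = [reference.get(cn, 0.0) for cn in cns]
    cand = [candidate.get(cn, 0.0) for cn in cns]
    mean_ref = sum(ref) / len(ref)
    residual = sum((r - c) ** 2 for r, c in zip(ref, cand))
    spread = sum((r - mean_ref) ** 2 for r in ref)
    return 1.0 - residual / spread


# Made-up configuration: all combinations, a thresholded subset, dominant only.
detailed = cn_distribution([(70, 50), (80, 30), (90, 20)])
thresholded = cn_distribution([(70, 50), (80, 30)])
dominant_only = cn_distribution([(70, 100)])

coe_thresholded = distribution_coe(detailed, thresholded)
coe_dominant = distribution_coe(detailed, dominant_only)
```

In this made-up case the thresholded distribution scores higher than the dominant-only one, mirroring the pattern described for Table 5.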

    A higher COE between the distributions in general led to a higher COE between the observed and simulated results, and vice versa. For example, when the watershed is divided into 4 subbasins, the COE between the distributions is very low for all soil and landuse thresholds except the 5% and 10% thresholds. Looking at the results in Tables 1 and 3, the COEs are likewise low for all combinations except the 5% and 10% thresholds. A low COE in Table 5 doesn't necessarily indicate poor results in all cases. For example, for 54 subbasins, even though the COE between the distributions increased from 0.60 in the dominant case to 0.93 for the 5% and 10% thresholds, the accuracy of the simulations was identical for the years 1965 to 1974 (Tables 1 and 2), with the COE between observed and simulated results being above 0.70. However, for 1975 to 1984 there is an improvement in the simulation results for this same configuration. For example, in Table 3 the COE increased from 0.33 in the dominant case to 0.48 for the 5% and 10% thresholds. In most cases there is a big jump in the COE between the distributions from all the other thresholds to the landuse and soil thresholds of 5% and 10%. Corresponding to this there is a jump in the COE between observed and simulated results, which is especially noticeable in the 1975-1984 results. Checking the COE between the detailed (0 threshold) distribution and the distribution for a candidate threshold may be a convenient way of determining the optimal thresholds within a particular configuration.

    CONCLUSIONS

    With the integration of GIS and distributed parameter hydrologic models, a watershed can be divided into many subbasins. However, the effect of discretization on the quality of the simulated output has not been widely studied. Using the concept of virtual basins, the different soils and landuses in a subbasin can be simulated to the level of detail intended. Using a 4300 km^2 watershed in Texas, the present study was undertaken to examine the effect of increasing the level of discretization and the number of virtual basins on the accuracy of the output. The basin was divided into numerous configurations using the "r.watershed" tool within the GRASS GIS. Simulations were made for the various configurations and, within each of these configurations, for various soil and landuse thresholds for the selection of virtual basins. The results indicate that, in general, increasing the level of discretization and increasing the number of soil and landuse combinations simulated within each subbasin increases the accuracy of the simulation. There is a level beyond which the accuracy can't be improved, suggesting that more detailed simulation may not always lead to better results. It is important to determine the optimal configuration, so that reasonable results can be obtained without the need for detailed simulations. It was also noted that, across the different time periods considered, some of the coarser levels of discretization performed well for one period but not for another, whereas the finer configurations performed well throughout.

    In order to examine the results further, the proportion of the watershed area having different curve numbers was plotted for each of the soil and landuse thresholds against that obtained when all soil and landuse combinations are considered within that particular configuration. The coefficient of efficiency between the two distributions was calculated, and it was seen that in general, within a particular configuration, as smaller and smaller thresholds are considered, the curve number distribution better matches that obtained using 0 thresholds and the simulation results improve. This approach seems to hold promise for determining the optimal soil and landuse thresholds to be chosen within a particular configuration, though it cannot determine the optimal basin configuration.

    There are some limitations to this study. One is that for other model outputs such as sediment, soil and topographic properties other than the curve number play a major role in determining the output, so in such cases matching the curve number distribution may not indicate better results. Also, no method to determine the optimal basin configuration has been given. Finally, the effect of weather variability has not been considered, with a single weather input used to simulate all the configurations. With a detailed configuration the weather might be better represented than with coarser configurations. More studies need to be conducted on other watersheds of different sizes and variability in order to validate these results.

    REFERENCES

    Aitken, A.P. 1973. Assessing systematic errors in rainfall-runoff models. Journal of Hydrology, 20: 121--136.

    Arnold, J.G., Engel, B.A., and Srinivasan, R. 1993. Continuous time grid cell watershed model. In Heatwole, C.D. (Ed.), Application of Advanced Information Technologies: Effective Management of Natural Resources, 2950 Niles Rd, St. Joseph, Michigan 49085-9659 USA. Information and Electrical Technologies Division of ASAE, American Society of Agricultural Engineers.

    Arnold, J.G., Williams, J.R., Nicks, A.D., and Sammons, N.B. 1990. SWRRB: A Basin Scale Simulation Model. College Station: Texas A&M Press.

    Arnold, J.G., Williams, J.R., Srinivasan, R., King, K.W., and Griggs, R.H. 1994. SWAT: Soil and Water Assessment Tool. USDA Agricultural Research Service, Grassland, Soil and Water Research Laboratory, 808 East Blackland Rd, Temple, TX 76502.

    Band, L.E. 1986. Topographic partition of watersheds with digital elevation models. Water Resources Research, 22(1): 15--24.

    Beasley, D.B., Huggins, L.F., and Monke, E.J. 1980. ANSWERS: A model for watershed planning. Transactions of the ASAE, 23(4): 938--944.

    Beven, K.J. 1986. Runoff production and flood frequency in catchments of order n: an alternative approach. In Gupta, V.K., Rodriguez-Iturbe, I., and Wood, E.F. (Eds.), Scale Problems in Hydrology.

    Beven, K.J. and Kirkby, M.J. 1979. A physically based variable contributing area model of basin hydrology. Hydrological Sciences Bulletin, 24(1): 43--69.

    Engel, B.A., Srinivasan, R., and Rewerts, C. 1993. A spatial decision support system for modeling and managing agricultural nonpoint source pollution. In Goodchild, M.F., Parks, B.O., and Steyaert, L.T. (Eds.), Environmental Modeling with GIS (pp. 231--237). Oxford University Press, New York, NY.

    Maidment, D.R. 1991. GIS and hydrologic modeling. In Prepared for Presentation at the First International Symposium/Workshop on GIS and Environmental Modeling, Boulder, Colorado.

    Nash, J.E. and Sutcliffe, J.V. 1971. River flow forecasting through conceptual models. Journal of Hydrology, 13: 297--324.

    Rewerts, C.C. and Engel, B.A. 1991. ANSWERS on GRASS: Integrating a watershed simulation with GIS. ASAE Paper No. 91-2621. American Society of Agricultural Engineers, St. Joseph, MI.

    Sasowsky, K.C. and Gardner, T.W. 1991. Watershed configuration and geographic information system parametrization for SPUR model for hydrologic simulations. Water Resources Bulletin, 27(1): 7--18.

    Srinivasan, R. and Arnold, J.G. 1994. Integration of a basin scale water quality model with GIS. Water Resources Bulletin, 30(3): 453--462.

    Williams, J.R., Nicks, A.D., and Arnold, J.G. 1985. SWRRB, a simulator for water resources in rural basins. ASCE Hydraulics Journal, 111(6): 970--986.

    Wischmeier, W.H. and Smith, D.D. 1978. Predicting rainfall erosion losses - a guide to conservation planning. Agriculture Handbook No. 537, Science and Education Administration, USDA.

    Wood, E.F., Sivapalan, M., Beven, K., and Band, L. 1988. Effects of spatial variability and scale with implications to hydrologic modeling. Journal of Hydrology, 102: 29--47.

    Young, R.A., Onstad, C.A., Bosch, D.D., and Anderson, W.P. 1987. AGNPS, agricultural non-point-source pollution model: A watershed analysis tool. Conservation Research Report 35, U.S. Department of Agriculture.