## E

where $q_{obs}$ is observed discharge and $q_{sim}$ is simulated discharge at each time step, $\bar{q}_{obs}$ is the mean observed discharge, the summations are over all time steps, $\sigma^2$ is the variance of the model residuals, and $\sigma_o^2$ is the variance of the observations. This measure has a range from $-\infty$ to 1. When the value is zero, the model has no more predictive power than a simple 'model' that is the mean of the observations. Negative values mean that the fit is worse than this. When the value is 1, the fit is perfect.
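The behaviour of the efficiency measure at its reference values can be checked with a short calculation. The following is a minimal sketch, assuming the usual form of the measure (one minus the ratio of the sum of squared residuals to the sum of squared deviations of the observations from their mean); the function name and the example series are illustrative only.

```python
def nash_sutcliffe(q_obs, q_sim):
    """Efficiency = 1 - (sum of squared residuals) / (sum of squared deviations from the observed mean)."""
    if len(q_obs) != len(q_sim) or len(q_obs) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(q_obs) / len(q_obs)
    sum_sq_residuals = sum((o - s) ** 2 for o, s in zip(q_obs, q_sim))
    sum_sq_observed = sum((o - mean_obs) ** 2 for o in q_obs)
    if sum_sq_observed == 0:
        raise ValueError("observed series has zero variance")
    return 1.0 - sum_sq_residuals / sum_sq_observed


# Illustrative (made-up) discharges
observed = [1.0, 3.0, 5.0, 3.0]
perfect_fit = nash_sutcliffe(observed, observed)          # simulation equals observations
mean_model = nash_sutcliffe(observed, [3.0] * 4)          # 'model' is the observed mean
poor_model = nash_sutcliffe(observed, [5.0] * 4)          # worse than the observed mean
```

Here the perfect fit gives 1, the mean 'model' gives 0 and the poorer constant prediction gives a negative value, matching the interpretation given in the text.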

Although widely used, the NSE is not an ideal measure (e.g. Beran, 1999; McCuen et al., 2006; Schaefli and Gupta, 2007). In particular, timing errors and, more generally, series of residuals showing temporal autocorrelation will reduce the value even though visually the shape of the hydrograph can appear to be good. A feature of the Nash-Sutcliffe efficiency measure is that the value is determined relative to the variance of the observed discharges. This means that, if a catchment has a relatively low observed variance (e.g. if it is dominated by a slowly changing baseflow component) the model fit must be very much better in absolute terms than if the catchment is flashy with high observed variance, to get an efficiency close to 1.

It is obvious that this form of calibration can only be carried out if there is an observed discharge record. It also soon became clear that, the simpler a model, and the fewer the parameters that had to be calibrated, the easier it was to fit the model. Rather early in the history of hydrological modelling Dave Dawdy and Terrence O'Donnell (1965) produced a model that was much simpler than the Stanford Watershed Model with a view to making automatic calibration easier, while Mike Kirkby (1975) suggested that the information content of a rainfall-discharge record was only sufficient to calibrate perhaps five or six parameters. It is also worth thinking about why it is worth calibrating a model at gauged sites when we already have some information about the catchment response. As we will see in the next section, in real applications models are most needed to predict catchment responses where we do not have data (the ungauged basin problem), but then we cannot calibrate parameter values and must try to make use of values determined at gauged sites.

### 12.7 The application of rainfall-runoff models

A basic distinction in the applications of rainfall-runoff models is between applications at gauged sites and those at ungauged sites. We will assume that we have some measurements of inputs to a model and want to predict the catchment responses to those inputs. At gauged sites we can calibrate the parameters of a model against the observed discharges; at ungauged sites we need to estimate the parameter values in order to make predictions.

Why would we need a model at a gauged site when we already have observed discharges? There are a number of reasons:

(a) We may wish, particularly for research purposes, to test a particular representation of the processes controlling how a catchment responds before using a model structure for a wider range of applications. This is often called model validation (though the word validation suggests an element of 'truthfulness' that may not be justified; model evaluation is preferable). Such model evaluation should be done with care: it is not enough simply to calibrate a model and show that it gives good results after adjusting the parameters; it is important to show that it also gives good results for an independent period of data without further adjustment.

(b) We may wish to use the model to extend the period of discharge measurements in some way. It is quite often the case in the UK (and widely elsewhere) that there are longer periods of rainfall available (at least for daily totals) than there are discharge measurements. After fitting the model to the available discharges, a longer period of discharge can be generated using the observed rainfalls. There is an implicit assumption in doing so that the catchment hydrological response has not changed during the additional period. The longer record could then be used, for example, to improve flood frequency estimates.

(c) We may wish to predict the effects of some future change to the catchment. This could be a climate change (there have been many studies of this type) in which case some scenarios of future changes to precipitation and evapotranspiration can be used to provide a new sequence of inputs to a model with parameters fitted to a period of observed discharges. Somewhat more difficult is to predict the impacts of land-use change, including urbanisation. This would require a change in the model parameters but it is not always clear how to relate changes in the catchment (which may affect only part of the catchment) to effective values of model parameters. See Chapter 19 for examples in which hydrological models have been applied in this way.

(d) We may wish to calibrate the parameters on many gauged catchments with a view to using the calibrated parameter values in a regionalisation exercise to estimate the parameters required to run the model for other ungauged catchments.

(e) We may wish to use a calibrated model in real-time flood forecasting applications with a view to providing timely flood warnings or control of flood defences.

An important issue in all types of applications is how to define the values of the model parameters. Can they be calibrated against some measured responses; can they be estimated on the basis of catchment characteristics alone; can they be measured directly in the field; or can they be estimated from some regionalisation exercise (as in point (d) above)? All of these approaches have been tried; all result in significant uncertainties in predicting the response of a catchment. This should not be unexpected. The processes controlling the form of the hydrograph in a catchment are very complex, and there is no reason why they should be necessarily well represented by a simple model construct. In addition, the inputs to a catchment are often not very well measured, and we should not expect a model to perform better than we can define the inputs. There is an increasing appreciation of such uncertainties and a recognition that model predictions should be provided with an estimate of the associated uncertainties (see Section 12.9 below).

To illustrate the use of different types of hydrological models, in what follows we will describe several different rainfall-runoff models that span this range of applications. The different models are representative of the different generic types. Examples of lumped models are the PDM (probability-distributed model) developed in the 1980s by Moore and Clarke (1981) at the Institute of Hydrology at Wallingford, UK and the DBM (data-based mechanistic) modelling approach of Young (2001, 2003). A number of the applications illustrated in later chapters of this book have used the PDM. An example of a model that is lumped for calculation purposes but where the predictions can be mapped back into space, is Topmodel (topography derived model) developed in the 1970s by Beven and Kirkby (1979) at the University of Leeds but widely used elsewhere. Examples of models that are distributed are the SHE model (Système Hydrologique Européen) of Abbott et al. (1986) and the InHM (Integrated Hydrologic Model) of VanderKwaak and Loague (2001).

There is also a variety of models developed for more specialised uses, including hydraulic models for predicting flood inundation (see Chapter 14); models of groundwater systems (see Chapter 15); and models for predicting the response of urban drainage systems (see Chapter 18).

### 12.8 Examples of rainfall-runoff models

### 12.8.1 The probability-distributed model (PDM)

The PDM, developed by Bob Moore and Robin Clarke (1981) at the Institute of Hydrology in Wallingford (now the Centre for Ecology and Hydrology, CEH), represents one of the earliest attempts to allow for the spatial heterogeneity of runoff generation (though it is worth noting that the Stanford Watershed Model also used an 'infiltration' function that effectively allowed for a range of infiltration capacities uniformly distributed throughout a catchment). In doing so, Moore and Clarke recognised that, in any application to a real catchment, it would not be possible to go out and measure the heterogeneity of the soil characteristics and that the heterogeneity of runoff generation might be produced in different ways or by different mechanisms. Accordingly, to keep things simple, they suggested representing the local storage deficits that needed to be satisfied before fast runoff generation would occur in a catchment conceptually as a probability distribution. Different catchments might then be represented by different forms of distribution. The characteristics of the distribution would then need to be calibrated, but they also had the idea that the model could be formulated so as to simplify the calibration process (Moore, 1985, 2007). A form of the PDM is implemented in the Imperial College rainfall-runoff modelling MATLAB toolbox (see Wagener et al., 2004).

Essentially, the PDM assumes that at any point and at any time step, stormflow will be produced whenever any local storage deficit is filled so that

$$q(t) = r(t) - D(t) - e_a(t) \qquad (12.11)$$

where $r(t)$ is the rainfall at time $t$, $e_a(t)$ is the actual evapotranspiration, and $D(t)$ is the local storage deficit at time $t$. It is expected that the maximum possible local storage capacity (when dry) will vary throughout a catchment area while local deficits will vary as the catchment wets and dries.

Fig. 12.8 Schematic representation of the probability-distributed model (PDM). Components shown: inputs of rainfall, P, and potential evaporation, E; probability-distributed soil moisture storage; groundwater recharge; surface storage giving surface runoff, qs; subsurface storage giving baseflow.

The approach taken by Moore and Clarke was to assume that the spatial variation takes the form of a statistical distribution function. They gave equations for the exponential, gamma and Weibull distributions, for all of which there is no maximum capacity, while Moore (1999) suggests that, from experience in the UK, a suitable function is the Pareto distribution, which does have a maximum. This has a cumulative distribution function of the form:

$$F(c) = 1 - \left(1 - \frac{c}{c_{max}}\right)^{b}, \qquad 0 \le c \le c_{max}$$

where c is the local storage capacity and cmax is the maximum storage capacity in the catchment (Fig. 12.8). For b = 1 the stores are uniformly distributed from zero to cmax; for b = 0, the storage capacity is the same everywhere in the catchment. This form allows the changing contributing area for fast runoff as the catchment wets and dries to be derived analytically (Moore, 2007).

At any time step, incoming rainfall will saturate part of the catchment and produce stormflow. The remaining net rainfall will decrease the available local storage deficit over the unsaturated part of the catchment, while any actual evapotranspiration will increase the deficits everywhere in the catchment. Integrating the local changes at each time step produces an estimate of the contributing area, stormflow and change in total storage. This allows the runoff generation in the model to be a non-linear function of the rainfall and antecedent wetness of the catchment.
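The wetting calculation described above can be sketched for a single time step. This is a minimal illustration rather than the published PDM code: it assumes the Pareto form $F(c) = 1 - (1 - c/c_{max})^b$ with $b > 0$, ignores evapotranspiration and drainage to the slow pathway, and uses hypothetical names (`pdm_step`, `c_star`). Integrating $1 - F(c)$ gives the total storage $S = S_{max}[1 - (1 - c^*/c_{max})^{b+1}]$ with $S_{max} = c_{max}/(b+1)$, where $c^*$ is the capacity below which all stores are full.

```python
def pdm_step(storage, net_rain, c_max, b):
    """One wetting step of a Pareto-distributed store (sketch; no evaporation or drainage).

    Returns (new_storage, runoff, saturated_fraction). Assumes net_rain >= 0 and b > 0.
    """
    s_max = c_max / (b + 1.0)                      # total storage when every store is full
    storage = min(max(storage, 0.0), s_max)
    # Critical capacity c*: stores with capacity below c* are already full
    c_star = c_max * (1.0 - (1.0 - storage / s_max) ** (1.0 / (b + 1.0)))
    c_new = min(c_star + net_rain, c_max)          # rainfall raises the critical capacity
    new_storage = s_max * (1.0 - (1.0 - c_new / c_max) ** (b + 1.0))
    runoff = net_rain - (new_storage - storage)    # rainfall that cannot be stored
    saturated_fraction = 1.0 - (1.0 - c_new / c_max) ** b   # F(c*), the contributing area
    return new_storage, runoff, saturated_fraction


# Two equal rainfall inputs on an initially dry catchment (illustrative values, in mm)
s1, q1, f1 = pdm_step(0.0, 20.0, c_max=100.0, b=1.0)
s2, q2, f2 = pdm_step(s1, 20.0, c_max=100.0, b=1.0)
```

With these values the second 20 mm of rain produces three times the stormflow of the first, which is the non-linear dependence on antecedent wetness described in the text.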

The model is completed by a routing component. This has been implemented in a variety of different ways. In the original paper the runoff generated from the b distribution of stores was routed into an additional linear store. Later, a constant split into a fast flow pathway and a slow flow pathway was used. The two pathways were both treated as linear stores but with different mean residence time parameters. Later still, input into the slow pathway has been calculated as a function of bulk storage in the distribution of stores, while the stormflow has been routed directly into the fast pathway (Fig. 12.8).
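The version of the routing with a constant split into two parallel linear stores can be sketched as follows. This is an illustrative discretisation, not the PDM implementation: it assumes the input is constant within each time step, and the function names, the split fraction and the residence times are all hypothetical.

```python
import math


def route_linear_store(inflow, residence_time, dt=1.0):
    """Route a series through one linear store (outflow proportional to storage)."""
    a = math.exp(-dt / residence_time)     # carry-over of outflow from one step to the next
    outflow, q = [], 0.0
    for u in inflow:
        q = a * q + (1.0 - a) * u          # exact for input held constant over the step
        outflow.append(q)
    return outflow


def route_parallel(runoff, fast_fraction, t_fast, t_slow, dt=1.0):
    """Split generated runoff in a constant ratio between a fast and a slow linear store."""
    fast = route_linear_store([fast_fraction * r for r in runoff], t_fast, dt)
    slow = route_linear_store([(1.0 - fast_fraction) * r for r in runoff], t_slow, dt)
    return [f + s for f, s in zip(fast, slow)]


# A single 10 mm pulse of generated runoff followed by a long dry period
pulse = [10.0] + [0.0] * 499
flow = route_parallel(pulse, fast_fraction=0.6, t_fast=2.0, t_slow=20.0)
```

The routed volume equals the input volume, and the two residence times give the characteristic quick peak followed by a long recession.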

A recent development of the PDM model has involved developing relationships between the model parameters and gridded soil and topographic information. Maximum soil storage capacity in a grid square is related to regional maximum gradient and storage capacities while the exponent b in (12.16) is treated as a function of the mean slope angle in a grid square (Bell and Moore, 1998; Cole and Moore, 2008). The analytical relationships of the PDM still hold at the grid scale level. The resulting grid to grid (G2G) model has been applied to predict runoff across the whole of the UK.

There have been some other models of this type, using functional forms to represent the non-linearity of stormflow generation based on an interpretation of spatially distributed storage deficits. Perhaps the most widely used has been the Xinanjiang/Arno/Variable Infiltration Capacity (VIC) model. This was originally proposed by Zhao in China in the 1970s, then used for flood forecasting on the River Arno by Todini in Italy, and later adopted for use as a representation of land surface hydrology in large-scale grid elements of general circulation models (see Beven, 2001). Topmodel (see Section 12.8.3) is also based on predicting a pattern of storage deficits in a catchment, but in that case the pattern can be mapped back into the catchment, so this will be discussed later in this chapter.

### 12.8.2 Data-based mechanistic (DBM) modelling

The DBM modelling concepts were developed by Peter Young and his collaborators as a use of his CAPTAIN time series analysis program (which is now implemented as a MATLAB toolbox). CAPTAIN provides routines for calibrating linear transfer functions to input-output data. It can be used both for rainfall-runoff modelling and routing flood waves from upstream to downstream gauging stations (see Chapter 14). The idea is to let the data give an indication of the structure of the transfer function needed by fitting many different linear model structures and choosing the structure that gives the best compromise between goodness-of-fit and simplicity. The family of models from which the structures are chosen is essentially one or more linear storages, linked in either series or parallel, and with a suitable time delay. A general linear transfer function model can be written in the form:

$$Q_t = a_1 Q_{t-1} + a_2 Q_{t-2} + \dots + a_n Q_{t-n} + b_0 U_{t-d} + b_1 U_{t-1-d} + \dots + b_m U_{t-m-d} \qquad (12.13)$$

where $U_t$ is the input at time $t$ (here effective rainfall), $Q_t$ is the predicted variable at time $t$ (here discharge), $d$ is a time delay, and the $a$ and $b$ coefficients define the model. Introducing the backwards difference operator ($z^{-1}$), which is defined by $Q_{t-1} = z^{-1} Q_t$, the transfer function can be written more simply as:

$$Q_t = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2} + \dots + b_m z^{-m}}{1 - a_1 z^{-1} - a_2 z^{-2} - \dots - a_n z^{-n}}\, U_{t-d} = \frac{B(z)}{A(z)}\, U_{t-d} \qquad (12.14)$$

where $A(z)$ and $B(z)$ are polynomials in the backward difference operator with $n$ $a$ parameters and $m + 1$ $b$ parameters, respectively.
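The general linear transfer function above is simply a recursion on past outputs and delayed inputs, which can be sketched in a few lines. This is an illustration only and is not the CAPTAIN toolbox; the function name is hypothetical, and values before the start of the record are assumed to be zero.

```python
def simulate_transfer_function(a, b, u, delay=0):
    """Simulate Q_t = sum_i a_i Q_{t-i} + sum_j b_j U_{t-j-delay}.

    a = [a_1, ..., a_n], b = [b_0, ..., b_m]; values before t = 0 are taken as zero.
    """
    q = []
    for t in range(len(u)):
        value = 0.0
        for i, a_i in enumerate(a, start=1):   # autoregressive terms on past outputs
            if t - i >= 0:
                value += a_i * q[t - i]
        for j, b_j in enumerate(b):            # terms on delayed inputs
            k = t - j - delay
            if k >= 0:
                value += b_j * u[k]
        q.append(value)
    return q


# First-order model (a single linear store) driven by a unit pulse of effective rainfall
impulse = [1.0, 0.0, 0.0, 0.0]
response = simulate_transfer_function([0.5], [0.5], impulse)
delayed = simulate_transfer_function([0.5], [0.5], impulse, delay=1)
```

The first-order case gives an exponentially decaying response, as expected for a single linear store, and the delay simply shifts that response in time.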

All transfer functions that are modelled as one or more linear stores (12.5) in series or in parallel are of this general type. The Nash cascade mentioned earlier in the discussion of the unit hydrograph is a specific example of such a model, although DBM models generally have only an integer number of elements (the model order or number of a coefficients), do not constrain the time constants to be equal and allow for a time lag d. A further requirement is that the model structure should have a mechanistic interpretation. In the rainfall-runoff case this means that the transfer function should have a sensible time delay and have ordinates that are everywhere positive (see also the discussion of the Muskingum flood routing model as a linear transfer function in Section 14.2.5). One of the features of the DBM modelling approach is that application of the general form (12.14) allows many different model structures to be tried very quickly. Thus, the model structure appropriate to a particular data set can be determined rather than a specific model structure set beforehand. However, in doing so it is necessary to be careful to avoid over-parameterisation. By adding more a and b coefficients, it will often be the case that model fit will improve but, as with fitting any polynomial function, more coefficients do not necessarily mean that the model performs better in prediction. Too many coefficients (over-parameterisation) may mean that the model is not robust in prediction. Thus, the principle of parsimony should be followed (the simplest model that will provide acceptable predictive accuracy should be chosen). Parsimonious models should be more robust in prediction in that they will capture the dominant modes of the response even if they do not give the highest values of performance measures such as (12.10) in calibration.
Over-parameterisation is an issue for many hydrological models, since trying to include more process understanding into models often means introducing more parameters. The DBM approach tries to let the data suggest how complex a model is needed.

In the same way as the unit hydrograph, this type of modelling approach is limited in its application to rainfall-runoff predictions by its assumption of linearity. Catchment rainfall-runoff relationships are not linear, so an additional model component is required to obtain a good fit to the data over a wide range of conditions. There are two ways of doing this. The first is to assume a conceptual structure for the non-linearity. This is the approach taken by the PDM model above, and by the IHACRES model, which simultaneously fits the parameters of a simple 'soil water store' and the transfer function parameters. IHACRES is an acronym of the Institute of Hydrology, Wallingford, and the Centre for Research on Environmental Systems at the Australian National University in Canberra, who jointly developed the model (see Jakeman et al., 1990). It is interesting because, by fitting the model to many different gauged catchments, it has been used as the basis for some regionalisation studies in both the UK and Australia (see Sefton and Howarth, 1998; Post and Jakeman, 1996).

A second approach takes advantage of the fact that linear models can be used both to estimate outputs from inputs and to estimate inputs from outputs. Thus, having fitted an approximate DBM transfer function to rainfall-runoff data under wet conditions, that transfer function can then be used to estimate the effective rainfall inputs that would have produced the measured discharge outputs. In the DBM approach this is achieved by recursive (time step by time step) updating of a gain parameter on the rainfall inputs. When the catchment is wet, the gain should be high; when the catchment is dry, the gain should be low. This is one way of estimating a series of effective rainfall inputs, which gives more stable results than directly inverting the transfer function. Given a functional description of how the gain varies, the transfer function can be refitted, and so on. Early application of this approach revealed that, since the most readily available index of catchment wetness is the discharge itself, an effective rainfall function of the form

provided a good representation of the non-linearity (Young and Beven, 1994). This type of approach has now been used in a variety of contexts, including real-time flood forecasting and flood routing. It has also been generalised to allow more flexible forms of the non-linearity to be used, where the shape appears to be more complex (see Young, 2003). Again, the idea is to let the data suggest the correct form of non-linearity that should be used for that particular catchment, with a realistic interpretation.
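The idea of using discharge as a wetness index for the rainfall gain can be sketched as below. The specific form is an assumption: a power law, $U_t = R_t\,Q_t^{\beta}$, chosen to be consistent with the 'power law non-linearity' referred to in the caption of Fig. 12.9. The symbols $R_t$ and $\beta$, the function name and the numbers are illustrative, and any normalising constant is omitted.

```python
def effective_rainfall(rainfall, discharge, beta):
    """Scale each rainfall value by discharge**beta (assumed power-law gain; sketch only)."""
    return [r * q ** beta for r, q in zip(rainfall, discharge)]


# The same 2 mm of rain falling on a dry catchment (low flow) and a wet one (high flow)
rain = [2.0, 2.0]
flow_index = [1.0, 4.0]
u_eff = effective_rainfall(rain, flow_index, beta=0.5)
```

The same rainfall yields twice the effective input when the discharge index is four times higher, so the gain rises with catchment wetness as required.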

In assessing model structures derived in this way, an important part of the methodology is to make an assessment of the model residuals to see whether there is any remaining structure that might be represented by an additional model component. Ideally the model residuals at the end of a modelling exercise should be 'white', that is, they should be random with a variance that does not change with time or magnitude of the prediction, and with no clear autocorrelation from time step to time step. If this is not the case, then there is some feature of the response that has yet to be explained. This can be demonstrated by the application of the DBM methodology to one of the Coweeta experimental catchments reported in Young (2001) and shown in Fig. 12.9. Having fitted the basic DBM model to the (daily time step) data (Fig. 12.9a,b), it seemed that there was a long-term seasonal component remaining in the residuals (Fig. 12.9c). This was then fitted by using mean daily temperature as an input to an additional linear transfer function component, improving the predictive capability of the model. The mechanistic interpretation is that there is an additional effect of evapotranspiration on effective rainfall that is not totally accounted for using the simple non-linear function of (12.15). Clearly, the additional component means that the model has more parameters, but the parameters are justified by the data (as revealed in the residuals) in this case (see Fig. 12.9c). The residuals of the more complex model were much closer to being 'white'. A MATLAB toolbox is available to fit DBM models to data, while the basic concepts are included in the Lancaster University TFM program or the Imperial College rainfall-runoff modelling toolbox (Wagener et al., 2004).
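One part of the residual assessment described above, the check for autocorrelation, can be sketched as follows. This is a rough diagnostic under stated assumptions, not the procedure used in the DBM toolboxes: it compares the sample autocorrelations with the approximate limits of ±2/√n expected for white noise, and it does not test whether the residual variance changes with time or with the magnitude of the prediction. The function names are hypothetical.

```python
def autocorrelation(residuals, lag):
    """Sample autocorrelation of a residual series at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    denominator = sum(d * d for d in dev)
    numerator = sum(dev[t] * dev[t + lag] for t in range(n - lag))
    return numerator / denominator


def looks_white(residuals, max_lag=10):
    """Rough check: every autocorrelation up to max_lag lies within +/- 2/sqrt(n)."""
    bound = 2.0 / len(residuals) ** 0.5
    return all(abs(autocorrelation(residuals, k)) <= bound
               for k in range(1, max_lag + 1))


# Residuals with an obvious trend are strongly autocorrelated and fail the check
trending = [float(i) for i in range(20)]
r1_alternating = autocorrelation([1.0, -1.0, 1.0, -1.0], 1)
trend_is_white = looks_white(trending, max_lag=3)
```

A slowly varying seasonal component in the residuals, such as that found for Coweeta, would show up in the same way as the trend does here, as autocorrelations well outside the limits.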

Fig. 12.9 Application of a data-based mechanistic (DBM) model to modelling daily discharges from the small catchment at Coweeta, N. Carolina: (a) fitted power law non-linearity; (b) a section of the deterministic flow prediction in comparison with the observed discharges; (c) temperature-dependent seasonal component fitted to residual errors. (From Young, 2001, with permission of John Wiley & Sons.)

The DBM approach to rainfall-runoff modelling is well suited to real-time flood forecasting (see Section 12.10 below). It has also been used to model the transport of pollutants in rivers as the aggregated dead zone (ADZ) model (Young and Wallis, 1993), and for flood routing along mainstream river channels (see Section 14.2.8).

### 12.8.3 Topmodel

Topmodel is a TOPography-based MODEL designed to simulate the runoff from hillslopes and source areas of gauged and ungauged catchments with inputs of rainfall. It was first suggested by Beven and Kirkby (1979) and has been widely used since. In Topmodel, the catchment is sub-divided into relatively homogeneous sub-catchment units based on the channel network and the separate outflows are routed downstream using a channel width function based, constant velocity, time-delay histogram to give the final catchment discharge.

The essential feature of the model is the prediction of saturated contributing areas based on the distribution of a topographic index in the catchment, previously suggested by Kirkby (1975), and the mean soil water deficit as it changes over time. As in the other models based on distribution functions described above, this greatly simplifies the calculations but, in the case of Topmodel, the topographic index can be mapped for the catchment, so that the predictions of saturated areas can be checked against any field information.

In making such a match, the assumptions are, however, critical. The three most important assumptions are as follows.

(1) The water table is nearly parallel to the soil surface, so that the hydraulic gradient is locally equal to the surface slope. This implies that the soil should not be too deep, that there should be a lower impermeable layer that is also near parallel to the surface, and that the slopes should not be flat or very steep.

(2) The water table takes up a configuration as if the storage at any point on a hillslope was being maintained by a constant uniform recharge rate over the upslope area draining through that point. This is treating the saturated zone storage as a succession of steady-state equivalent forms.

(3) The downslope transmissivity of the saturated zone can be represented as a simple function of the local storage deficit. In the original model, an exponential decline with increasing storage deficit was used, but other functions are also possible (Ambroise et al., 1996; Iorgulescu and Musy, 1997). Different transmissivity functions lead to different forms of topographic index that should be used in the Topmodel equations.

It is evident that such assumptions will not be applicable everywhere. A schematic representation of the sub-catchment model is given in Fig. 12.10.
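Assumptions (1) and (3) can be combined in a short sketch of the local downslope flow. This is an illustration only: the symbols $T_0$ (transmissivity at zero deficit) and $m$ (a scaling parameter for the exponential decline) are assumed here rather than taken from the text, and the numerical values are made up.

```python
import math


def transmissivity(deficit, t0, m):
    """Assumption (3): transmissivity declines exponentially with local storage deficit."""
    return t0 * math.exp(-deficit / m)


def downslope_flow(deficit, t0, m, tan_slope):
    """Assumption (1): hydraulic gradient equals surface slope, so flow per unit
    contour width is transmissivity multiplied by tan(slope)."""
    return transmissivity(deficit, t0, m) * tan_slope


# Illustrative values: T0 in m^2/h, deficits and m in metres
t_saturated = transmissivity(0.0, t0=5.0, m=0.02)
t_one_m = transmissivity(0.02, t0=5.0, m=0.02)
q_local = downslope_flow(0.0, t0=5.0, m=0.02, tan_slope=0.1)
```

With this form, a deficit equal to $m$ reduces the transmissivity by a factor of e, so small values of $m$ imply that downslope flow falls away rapidly as the soil drains.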

The essence of Topmodel lies in one equation that can be derived from the assumptions above. This gives the relationship between a local storage deficit resulting from gravity drainage, the sub-catchment mean storage deficit, and the distributions of topographic index and soil transmissivity in the catchment. Keeping with the assumption of an exponential decline of transmissivity with deficit, i.e.
