Abstracts
Summary
The flood QT with return period T at a site is generally estimated by fitting a statistical distribution to the annual maximum flow data of that site. However, at a site where few or no hydrological data are available, the estimate must be obtained by regional methods, which use the information available at sites that are hydrologically similar to the target site. This procedure comprises two steps:
(a) determination of the hydrologically similar sites
(b) regional estimation
For a given delineation (step a), we propose three methodological approaches for comparing the different regional estimation methods. These approaches are described in detail in this work. More specifically, they are:
- simulation by the bootstrap method
- regression analysis or empirical Bayes
- hierarchical Bayesian method
Keywords:
- Frequency analysis,
- regional estimation,
- Bayesian analysis,
- bootstrap,
- comparison
Abstract
Estimating design floods with a given return period T is a common problem in hydrologic practice. At sites where data have been recorded for a number of years, the estimate can be obtained by fitting a statistical distribution to the series of annual maximum floods and then computing the (1 - 1/T)-quantile of the fitted distribution. Often, however, few or no data are available at the site of interest, and flood estimation must then be based on regional information. In general, regional flood frequency analysis involves two major steps:
- determination of a set of gauging stations that are assumed to contain information pertinent to the site of interest. This is referred to as delineation of homogeneous regions.
- estimation of the design flood at the target site based on information from the sites of the homogeneous region.
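The at-site estimate that regional methods are meant to replace can be illustrated with a minimal sketch. It assumes a Gumbel distribution fitted by the method of moments, which is only one of many possible choices and is not prescribed by this paper. The function name `gumbel_quantile` and the flood values are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gumbel_quantile(annual_maxima, T):
    """T-year flood from a Gumbel fit by the method of moments (assumed model)."""
    if T <= 1:
        raise ValueError("return period T must exceed 1 year")
    mean = statistics.mean(annual_maxima)
    std = statistics.stdev(annual_maxima)
    # Moment estimators of the Gumbel scale and location parameters.
    scale = std * math.sqrt(6.0) / math.pi
    loc = mean - EULER_GAMMA * scale
    # Non-exceedance probability associated with return period T.
    p = 1.0 - 1.0 / T
    # Invert the Gumbel distribution function F(x) = exp(-exp(-(x - loc) / scale)).
    return loc - scale * math.log(-math.log(p))


# Hypothetical annual maximum floods (m^3/s) at one gauged site.
floods = [312.0, 455.0, 280.0, 390.0, 510.0, 340.0, 298.0, 620.0, 365.0, 410.0]
q100 = gumbel_quantile(floods, 100)
print(f"Estimated 100-year flood: {q100:.0f} m^3/s")
```

With only ten years of record, such an estimate has a large sampling variance, which is the motivation for bringing in regional information.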
The merits of regional flood frequency analysis, at ungauged sites as well as at sites where some local information is available, are increasingly being acknowledged, and many research papers have addressed the issue. New methods for delineating regions and for estimating floods based on regional information have been proposed in the last decade, but scientists tend to focus on the development of new techniques rather than on testing existing ones. The aim of this paper is to suggest methodologies for comparing different regional estimation alternatives.
The concept of homogeneous regions has been employed for a long time in hydrology, but a rigorous definition of it has never been given. Usually, homogeneity concerns dimensionless statistical characteristics of hydrological variables, such as the coefficient of variation (Cv) and the coefficient of skewness (Cs) of annual flood series. A homogeneous region can then be thought of as a collection of stations with flood series whose statistical properties, except for scale, are not significantly different from the regional mean values. Tests based on L-moments are currently widely used to validate the homogeneity of a given region. Early approaches to regional flood frequency analysis were based on geographical regions, but the recent tendency is to define homogeneous regions from the similarity of basins in the space of catchment characteristics that are related to hydrologic characteristics. Cluster analysis can be used to group similar sites, but has the disadvantage that a site in the vicinity of the cluster border may be closer to sites in other clusters than to those of its own group. Burn (1990a, b) has recently suggested a method where each site has its own homogeneous region (or region of influence) in which it is located at the centre of gravity.
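The idea of a site-specific region can be sketched as a nearest-neighbour search in the space of catchment characteristics. This is a simplified illustration and not Burn's exact formulation: it assumes unweighted Euclidean distance on standardized attributes and a fixed region size `k`. The site names and attribute values are hypothetical.

```python
import math
import statistics


def region_of_influence(attributes, target, k):
    """Return the k sites closest to `target` in standardized attribute space."""
    sites = list(attributes)
    n_attr = len(attributes[target])
    # Standardize each attribute so that its unit does not dominate the distance.
    means = [statistics.mean([attributes[s][j] for s in sites]) for j in range(n_attr)]
    stds = [statistics.stdev([attributes[s][j] for s in sites]) for j in range(n_attr)]

    def standardized(site):
        return [(attributes[site][j] - means[j]) / stds[j] for j in range(n_attr)]

    centre = standardized(target)
    distances = {
        s: math.dist(centre, standardized(s)) for s in sites if s != target
    }
    # The target sits at the centre of its own region by construction.
    return sorted(distances, key=distances.get)[:k]


# Hypothetical catchments: (area in km^2, mean annual precipitation in mm, slope in %).
catchments = {
    "A": (120.0, 900.0, 2.0),
    "B": (135.0, 950.0, 2.2),
    "C": (900.0, 600.0, 0.5),
    "D": (110.0, 880.0, 1.9),
    "E": (1000.0, 650.0, 0.4),
}
region_for_a = region_of_influence(catchments, "A", k=2)
print(region_for_a)
```

Because every target gets its own neighbourhood, no site is ever stranded at the border of a fixed cluster.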
Once a homogeneous region has been delineated, a regional estimation method must be selected. The index flood method, proposed by Dalrymple (1960), and the direct regression method are among the most commonly used procedures. Cunnane (1988) provides an overview of several other methods. The general performance of a regional estimation method depends on the amount of regional information (hydrological as well as physiographical and climatic), and the size and homogeneity of the region considered relevant to the target site. Being strongly data-dependent, comparisons of regional models will be valid on a local scale only. Hence, one cannot expect to reach a general conclusion regarding the relative performance of different models, although some insight may be gained from case studies.
Here, we present methodologies for comparing regional flood frequency procedures (combinations of homogeneous regions and estimation methods) for ungauged sites. Hydrological, physiographical and climatic data are assumed to be available at a large number of sites, because a comparison of regional models must be based on real data. The premises of these methodologies are that, at each gauged site in the collection of stations considered, one can obtain an unbiased at-site estimate of a given flood quantile, and that the variance of this estimate is known. Regional estimators, obtained by ignoring the hydrological data at the target site, are then compared to the at-site estimate. Three different methodologies are considered in this study:
A) Bootstrap simulation of hydrologic data
In order to preserve the spatial correlation of hydrologic data (which may have an important impact on regional flood frequency procedures), we suggest performing bootstrap simulation of vectors rather than scalar values. Each vector corresponds to a year for which data are available at one or more sites in the considered selection of stations; the elements of the vectors are the different sites. For a given generated data scenario, an at-site estimate and a regional estimate at each site considered can be calculated. As a performance index for a given regional model, one can use, for example, the average (over sites and bootstrap scenarios) relative deviation of the regional estimator from the at-site estimator.
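A minimal sketch of this vector bootstrap follows. It resamples whole years with replacement so that values observed in the same year stay together. The estimators are placeholders chosen for brevity and are assumptions of this sketch: the at-site estimator is the sample mean rather than a flood quantile, `naive_regional` simply averages the other sites, and the index uses the absolute relative deviation. All names and data are hypothetical.

```python
import random
import statistics


def bootstrap_years(records, rng):
    """Resample whole years with replacement, keeping each year's site vector intact."""
    years = list(records)
    return [records[rng.choice(years)] for _ in years]


def performance_index(records, at_site, regional, n_boot=200, seed=0):
    """Average absolute relative deviation of the regional from the at-site estimate."""
    rng = random.Random(seed)
    sites = sorted({s for row in records.values() for s in row})
    deviations = []
    for _ in range(n_boot):
        sample = bootstrap_years(records, rng)
        # Rebuild one series per site; years without data at a site are skipped.
        series = {s: [row[s] for row in sample if s in row] for s in sites}
        for target in sites:
            if len(series[target]) < 2:
                continue
            local = at_site(series[target])
            # The regional estimator never sees the target's own data.
            others = {s: v for s, v in series.items() if s != target and len(v) >= 2}
            if others:
                deviations.append(abs(regional(others) - local) / local)
    return statistics.mean(deviations)


def naive_regional(other_series):
    """Placeholder regional estimator: mean of the other sites' means."""
    return statistics.mean([statistics.mean(v) for v in other_series.values()])


# Hypothetical annual maxima (m^3/s); site C has no record in 1992.
records = {
    1990: {"A": 310.0, "B": 290.0, "C": 350.0},
    1991: {"A": 420.0, "B": 400.0, "C": 470.0},
    1992: {"A": 280.0, "B": 260.0},
    1993: {"A": 390.0, "B": 380.0, "C": 430.0},
    1994: {"A": 510.0, "B": 470.0, "C": 560.0},
    1995: {"A": 340.0, "B": 330.0, "C": 370.0},
}
index = performance_index(records, statistics.mean, naive_regional, n_boot=100, seed=1)
print(f"Average relative deviation: {index:.3f}")
```

Resampling years rather than individual values is what preserves the cross-site correlation: a wet year stays wet at every site in the scenario.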
B) Regression analysis
The key idea in this methodology is to perform a regression analysis with a regional estimator as an explanatory variable and the unknown quantile, estimated by the at-site method, as the dependent variable. It is reasonable to assume a linear relation between the true quantiles and the regional estimators. The estimated regression coefficients express the systematic error, or bias, of a given regional procedure, and the model error, estimated for instance by the method of moments, is a measure of its variance. It is preferable that the bias and the variance be as small as possible, suggesting that these quantities be used to order different regional procedures.
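The regression comparison can be sketched as follows, under assumptions that are not taken from the paper: ordinary least squares for the coefficients, and a simple moment estimator that subtracts the average known sampling variance of the at-site estimates from the residual variance. The function name and the data are hypothetical.

```python
def compare_by_regression(regional, at_site, sampling_var):
    """Fit at_site = intercept + slope * regional and estimate the model-error variance."""
    n = len(regional)
    x_bar = sum(regional) / n
    y_bar = sum(at_site) / n
    sxx = sum((x - x_bar) ** 2 for x in regional)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(regional, at_site))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(regional, at_site))
    # Residual variance = model error + sampling error of the at-site estimates,
    # so the known sampling variance is removed (truncated at zero).
    model_var = max(0.0, rss / (n - 2) - sum(sampling_var) / n)
    return intercept, slope, model_var


# Hypothetical 100-year flood estimates (m^3/s) at six gauged sites.
regional_est = [410.0, 650.0, 300.0, 880.0, 520.0, 700.0]
at_site_est = [450.0, 600.0, 340.0, 940.0, 500.0, 790.0]
at_site_var = [900.0, 1600.0, 400.0, 2500.0, 1200.0, 1800.0]

intercept, slope, model_var = compare_by_regression(regional_est, at_site_est, at_site_var)
print(f"intercept={intercept:.1f}, slope={slope:.2f}, model variance={model_var:.0f}")
```

An unbiased regional procedure would give an intercept near 0 and a slope near 1; competing procedures can then be ranked by how far they depart from these values and by the size of the model variance.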
C) Hierarchical Bayes analysis
The regression method employed in (B) can also be regarded as the result of an empirical Bayes analysis in which point estimates of regression coefficients and model error are obtained. For several reasons, it may be advantageous to proceed with a complete Bayesian analysis in which bias and model error are considered as uncertain quantities, described by a non-informative prior distribution. Combination of the prior distribution and the likelihood function yields, through Bayes' theorem, the posterior distribution of bias and model error. In order to compare different regional models, one can then calculate, for example, the mean or the mode of this distribution and use these values as performance indices, or one can compute the posterior loss.
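A heavily simplified sketch of such a posterior computation is given below. It is an illustration under stated assumptions, not the model of this paper: the bias is reduced to a single additive term, the deviations between at-site and regional estimates are taken as normal with variance equal to model error plus known sampling variance, and the non-informative prior is approximated by a flat prior over a bounded grid. All names and data are hypothetical.

```python
import math


def grid_posterior(deviations, sampling_var, biases, sigmas):
    """Posterior of (bias, model-error std) on a grid under a flat prior."""
    log_post = {}
    for b in biases:
        for s in sigmas:
            log_lik = 0.0
            for d, v in zip(deviations, sampling_var):
                var = s * s + v  # model error plus known sampling variance
                log_lik += -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (d - b) ** 2 / var
            log_post[(b, s)] = log_lik  # flat prior adds only a constant
    top = max(log_post.values())
    # Subtract the maximum before exponentiating for numerical stability.
    weights = {key: math.exp(val - top) for key, val in log_post.items()}
    total = sum(weights.values())
    return {key: w / total for key, w in weights.items()}


def posterior_means(posterior):
    """Posterior means of bias and model-error standard deviation."""
    mean_bias = sum(b * p for (b, _), p in posterior.items())
    mean_sigma = sum(s * p for (_, s), p in posterior.items())
    return mean_bias, mean_sigma


# Hypothetical deviations (at-site minus regional estimate) with known variances.
deviations = [3.0, 5.0, 7.0, 4.0, 6.0]
sampling_var = [1.0, 1.0, 1.0, 1.0, 1.0]
bias_grid = [i * 0.5 for i in range(21)]   # 0.0 to 10.0
sigma_grid = [i * 0.25 for i in range(21)]  # 0.0 to 5.0

posterior = grid_posterior(deviations, sampling_var, bias_grid, sigma_grid)
mean_bias, mean_sigma = posterior_means(posterior)
print(f"posterior mean bias={mean_bias:.2f}, model-error std={mean_sigma:.2f}")
```

Unlike the point estimates of the regression approach, the full posterior also conveys how uncertain the bias and model error are, which matters when only a few gauged sites are available for the comparison.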
Keywords:
- Frequency analysis,
- regional estimation,
- Bayes,
- bootstrap,
- comparison