Tony Ladson has 30 years experience in hydrology and river management and has worked on projects throughout Australia and in the US, Taiwan and PNG. He has a PhD from the University of Melbourne and a Master of Science from the University of Minnesota.
The Kuczera Symposium was held on June 12 and 13 which was a wonderful celebration of George’s contribution to hydrology. There was much discussion about Australian Rainfall and Runoff (ARR) on Day 1. Some of the things that I noted were:
More examples have been added in the ARR 2019 version of the Urban Book (book 9).
Losses in urban areas: There is advice associated with the losses available from the ARR data hub that: These losses are only for rural use and are NOT FOR DIRECT USE in urban areas. It is clear in the ARR text, that despite this advice, these loses may be used for pervious areas within urban catchments (Book 9, Chapter 6.4.2 and Book 5 Chapter 3.5.3).
There are differences in the terminology used by ARR and the Bureau of Meteorology. For example, ARR defines rare rainfall as 1 in 10 AEP to 1 in 100 AEP (Book 1, Figure 1.2.1). The BoM defines rare rainfall as 1 in 100 to 1 in 2000 (See the BoM IFD webpage).
Rural losses, available from the data hub for the whole of Australia, were estimated from only 38 small catchments. Therefore local loss estimates should be used where available.
Some ARR data are of high quality e.g. IFD data from the Bureau, others are only intended to be used when there is nothing better available e.g. losses from the data hub.
Research in NSW shows that using median pre-burst rainfall to calculate burst losses to use for modelling, results in underestimation of these losses. Work has been done in NSW to provide more accurate burst initial losses. Similar work has yet to be undertaken in other states.
The continuing losses from the data hub were shown to be too high in NSW.
There are some rogue temporal patterns in the ensembles provided on the data hub. Check and if necessary remove problem patterns.
When undertaking flood frequency analysis where there are many small floods, it may be better to use the GEV distribution fitted with L moments rather that discarding a large amount of data as suggested by the Grubbs-Beck test.
More peaked temporal patterns have a clear association with increasing temperatures (Wasko and Sharma, 2015). To take this into account we may need to weight peaked temporal patterns more highly in ensemble analysis.
There needs to be more testing of ARR procedures where all the various inputs and methods are applied. The work by WMA water on 165 catchments in NSW is a good start (link to report).
The management framework to take ARR to the next step is yet to be worked out. There was discussion around which institution should lead the next phase and who would be involved. This is yet to be resolved.
The previous post looked at the accuracy of the RFFE in NSW, here the analysis is extended to Victoria. Again, the focus is on how well the RFFE predicts the 1% AEP flood compared to at-site flood frequencies.
There isn’t a readily available independent set of flood frequency estimates to use for comparison, so an alternative approach was developed. The input data for the RFFE includes 186 gauges in Victoria and with nearby NSW gauges that are likely to be included in a region of influence for a Victoria site, the total number of gauges comes to 205. For these 205 gauges we can obtain the at-site flood frequency estimates for the 1% flood from the RFFE input data. We can also request the RFFE tool to return 1% AEP estimates at these same sites.
A comparison of the results is shown as a scatter plot in Figure 1. The points generally lie along the 1:1 line albeit with substantial variation.
Figure 1: Comparison of RFFE estimates: 1:1 line is shown dashed; blue line is a smooth (loess) through the data.
The same data are shown Figure 2 which plots the ratio of each pair of estimates against their average (a Bland-Altman plot). The approximate 95% limits of agreement are calculated as the overall average of the logs of the ratios +/- two times the standard deviation of the logs of the ratios. In this case, the limits of agreement are 5.45 and 0.22 which means, on overage, 95% of at-site estimates will be between 5.45 and 0.22 times the RFFE estimates. There is also a small under-prediction bias of 10% (the difference between the average of the ratios and a value of 1). If all the RFFE outputs were multiplied by 1.1, this would be eliminated.
Figure 2: Comparison as a Bland Altman plot, details of outliers are provided in Table 1
Note that this scatter in Figures 1 and 2 is not wholly caused by problems with the input data since we are comparing outputs where inputs are available. The uncertainty is largely due to the algorithm used to make the estimates of the 1% AEP values. This also means the uncertainty shown here is underestimate of the total error when predicting 1% AEP values at an ungauged site.
To improve the RFFE we need to include better input data but also improve the prediction approach. These was substantial research undertaken to develop the method (Rahman, et al., 2015) but further improvements may be possible.
A pointer to possible issues with input data are the outliers shown on the figures, with further details in Table 1. The errors for the arid zone sites on Mount Hope Creek and Richardson River are particularly large. Investigating these outliers may improve average performance. There are also other sites that need to be checked. For example, Reedy Creek at Wangaratta North (403209) is included in the RFFE inputs as a small rural stream, when in fact it is an anabranch of the much larger Ovens River. Similarly Tuppal Creek in NSW is included but it is an anabranch of the Murray River.
Station_id
Station name
Area
(km2)
AEP 1% flood
(RFFE input)
(m3/s)
AEP 1% flood
(RFFE out)
(m3/s)
220002
Stockyard Creek at Rocky Hall (Whitbys)
75
3483
411
223214
Tambo River u/s Smith Ck
32
8
41
235209
Aire River @ Beach Forest
21
2769
110
407236
Mount Hope Ck @ Mitiamo
1629
16600
616
415257
Richardson River @ Donald
1831
319
1780
With the RFFE being recommended by Australian Rainfall and Runoff, and being widely used in Victoria, undertaking work to improve its accuracy is important.
Code for analysis and figures is available as a gist.
The Regional Flood Frequency Estimation (RFFE) web app was developed as part of the revision to Australian Rainfall and Runoff. The app allows the estimation of design flood magnitudes anywhere in Australia using a simple web interface (Rahman et al., 2015).
Recent work in NSW allows an assessment of the accuracy of the RFFE in comparison with flood frequency estimates from stream gauge data. WMA Water analysed the flow records from 160 gauges using at-site flood frequency analysis which provides an independent dataset to compare with RFFE design flood values (Podger et al., 2019). Here, I’ve focused on the 1% AEP flood. Although there is often substantial uncertainty in estimating the magnitude of the 1% flood from the typically short records available, it is the most widely used design flood value, so it is important to quantify its reliability.
A comparison of the RFFE estimates with at-site values is shown in Figure 1. The figure shows the paired values compared to the 1:1 line. The RFFE over-predicts larger values with a smooth line through the data deviating to the right of this line. The bias (average of the differences FFA – RFFE) is -405 m3/s, the RMSE is 1460 m3/s and the Nash-Sutcliffe coefficient is -1.29. The variance increases as the magnitude of the 1% AEP values increase.
The results look better when log transformed, appearing to lie closer to the 1:1 line (Figure 2) and the Nash-Sutcliffe coefficient is 0.39. The over-prediction is still apparent above about 1000 m3/s.
Another way of comparing these data is by using a Bland-Altman plot, also known as a Tukey mean-difference plot. Where the variance increases with the mean, as is the case for this data set, the average of the paired values is plotted on the x-axis and the log of the ratio of the values is plotted on the y-axis (Bland and Altman, 1999) (Figure 3).
Figure 3: Bland Altman plot comparing at-site and RFFE values. Outliers are highlighted; the y-axis has been log transformed
The bias (average log ratio) is 0.88. If the RFFE estimates where multiple by 0.88 they would, on average, be equal to the at-site values.
The approximate limits of agreement between the datasets can be calculated as the mean of the log ratios +/- two times the standard deviation of the log ratios. For this data, the limits are 4.53 and 0.17. This means that on average, 95% of the time, the at-site values fall between 4.53 times the RFFE value, and 0.17 times the RFFE value.
These aren’t great results so its worth thinking about how the accuracy of the RFFE could be improved. One way would be to address the outliers as they have a major influence on the average accuracy. These outliers are labelled on Figure 3 as gauge numbers 210084 and 212040. Removing both outliers changes (improves) the upper and lower limits of agreement to 3.63 and 0.19 but increases the bias to 0.84. The bias increase because the over-prediction of the RFFE values becomes more apparent if the two outliers are removed.
Some information on the outliers are provided in Table 1. For Glennies Ck (210084) the RFFE underestimates the 1% AEP at-site value by a factor of 46. This site is challenging because a large dam was constructed in 1983 immediately upstream which has greatly reduced peak flows. The RFFE is not intended to provide accurate peak flow estimates where there are large upstream storages so perhaps this site should not be included. I don’t know of any particular issues with the other gauge, 212040.
Table 1: Details of outliers shown on Figure 3
Gauge No
Site
River
Catchment area
km-square
RFFE
(1% AEP)
(m3/s)
At-site
(1% AEP)
(m3/s)
210084
The Rocks No. 2
Glennies Ck
227
130
5985
212040
Pomeroy
Kialla Ck
96
138
1819
The accuracy of the RFFE would also be improved if there was better input data. The flood frequency information from the 160 sites studies by WMA Water could be included in the tool, along with information from flood studies where there has been a thorough analysis of flood behaviour and generation of accurate flood frequency information. It is important that we work to improve the RFFE so that it provides the best possible design flood estimates. Code to undertake the analysis and create the graphs is available as a gist.
Bland, J. M., & Altman, D. G. (1999). Measuring Agreement in Method Comparison Studies — A Review. Statistical Methods in Medical Research, 8, 135–160. https://doi.org/10.1007/0-8176-4422-9_13
Haddad, K., Rahman, A., Weinmann, P. E. (2010). Stream flow data preparation for regional flood frequency analysis : Lessons from southeast Australia. Australian Journal of Water Resources, 14(1)17-32. (link to abstract)
Lots of interesting things in the paper which reports on use of 7 rainfall runoff models as documented in 1,529 peer-reviewed papers.
1. Legacy rather than adequacy
Modellers use the model they are familiar with rather than looking to find the model that best fits their problem. Addor and Melsen found that in 74% of published studies, the chosen model could be predicted based solely on the affiliation of the first author. Models are usually associated with particular regions where they dominate. The same model was used for a wide variety of purposes, in various landscapes and at a range of temporal and spatial scales.
The reason models are selected for legacy reason is that “it works” i.e modelling gets done and papers get published in peer reviewed journals. Over time, hydrologists develop deep understanding of their models and develop ways to become highly productive. They also come to perceive their models are adequate to tackle the problems at hand.
2. Model evaluation
The Nash-Sutcliffe Coefficient, although widely used, isn’t necessarily a good measure of model performance.
An example of the weight of legacy in hydrology is the use of the Nash-Sutcliffe Efficiency to evaluate models. With the advent of automated calibration techniques, models can provide high NSE values in most catchments (emphasis added). Yet studies relying on more process-oriented metrics, stricter evaluation procedures and multiple data sets have demonstrated that the resulting simulations often involve unrealistic features and lack of robustness.
3. Are modellers missing opportunities?
Would better results be possible with a different choice of model? Perhaps.
First, there are large costs in setting up and using a model that is not familiar. Second, there is little guidance on selecting a model for a given purpose or landscape. Inter-comparisons of models that are applied to a specific task are rare. Third, prior to using a new model, it is not clear what the advantages would be compared business as usual. The upshot is that there are lots of risks in adopting a new model and rewards are uncertain.
However we are missing out on opportunities to improve models. Innovations tend to occur within the silo of single models rather than being adopted across the whole modelling field.
Tools, such as RRL (Rainfall Runoff Library) help tackle the problem of using and comparing a range of models. The RRL provides a modelling framework that facilitates setting up models and then using the same fitting methods and objective functions.
4. Seven widely used conceptual rainfall runoff models
The paper focuses on 7 widely used models:
HBV – Hydrologiska Byråns Vattenbalansavdelning model
VIC – the Variable Infiltration Capacity model
Sacramento – soil moisture accounting model
GR4J – Génie Rural 4 model
mHM – Mesoscale Hydrological Model
TOPMODEL – TOPography based hydrological model
PRMS – Precipitation Runoff Modelling System
The regions where these models are used are mapped in the paper. International graduate students learn about a model where they study and then apply it to local problems on their return. This has facilitated the spread of models, particularly from the US which is the destination for many international graduate students.
All these models (except mHM) are shown as being used in Australia but no model is dominant. However there was no mention of the ‘Australian’ rainfall runoff models AWBM, SimHyd or IHACRES. A quick search of Informit suggests that AWBM is the most cited model in Australian conferences and journals, followed by SimHyd then Sacramento, GR4J, IHACRES, and TOPMODEL. HBV and PRMS are not mentioned in Informit but the Addor and Melsen paper suggests there are a few Australian examples of their use.
At present there are 5 models in the RRL: AWBM, Sacramento, SimHyd, SMAR and Tank. It would be nice to think the capability could be expanded to take in more models, but its hard to see this becoming a priority. There has been little work on the RRL since 2004 during the exciting times, for hydrologists, of the CRCCH. We are probably stuck with the situation where the model that gets used, largely depends on who gets the modelling job.
Previous posts (here and here) have looked at Australian Rainfall and Runoff areal temporal patterns and identified some that appear to be unusual and may lead to problems of embedded bursts. This post reviews point patterns. Point patterns are suitable for modelling smaller catchments; those with areas less than 75 km2.
Embedded bursts occur when the rainfall associated with a period within a pattern is rarer than the event represented by the pattern as a whole. Whether embedded bursts will be an issue depends on local values of design rainfall so patterns need to be checked whenever hydrological models are applied to a new location. These checks are available in many hydrologic modelling programs.
Here I’ve used an approximate method to identify that patterns where there is potential for embedded bursts. This involves the calculation of a non-uniformity index (NUI) (explained here).
The top ten least uniform point patterns are shown in Table 1 with the top 8 graphed in figures 1 and 2.
Table 1: Least uniform point temporal patterns
EventID
Durtation (min)
Duration (hr)
Time step (min)
Region
AEP
NUI
Source Region
1208
8640
144
180
Rangelands (West)
frequent
419
Rangelands (West)
1182
10080
168
180
Rangelands (West)
intermediate
416
Rangelands (West)
1973
10080
168
180
Rangelands
frequent
375
Rangelands
1183
8640
144
180
Rangelands
intermediate
357
Rangelands
1181
8640
144
180
Rangelands (West)
intermediate
357
Rangelands (West)
2870
10080
168
180
Central Slopes
frequent
342
Central Slopes
1083
10080
168
180
Rangelands
intermediate
341
Rangelands
1079
10080
168
180
Rangelands (West)
intermediate
341
Rangelands (West)
Figure 1: Top 4 least-uniform patterns
Figure 2: Next 4 least uniform patterns
All of these are long patterns, 144 hours or 168 hours and contain extensive periods of little rain with isolated bursts. Nine of 10 patterns are from rangelands; arid areas where long rainstorms are relatively rare. Deciding on representative patterns for modelling long duration events in these areas is challenging.
Large peaks 24 hours apart
Analysis of areal patterns showed here were some that had large peaks 24 hours apart which could have been caused by accumulated rainfall in pluviograph records that had not been processed correctly. The same analysis was undertaken for point patterns. The four patterns with largest peaks 24 hours apart are shown in Figure 3. None of these appears to be particularly unusual.
Figure 3: Point patterns were large peaks are separated by 24 hours
In summary, point patterns do not appear to have the problems that were identify for areal patterns. Modellers still need to check for embedded bursts which depend on local values of design rainfall.
Code to reproduce the figures and analysis is available as a gist.
The previous post looked at some areal temporal patterns from the ARRdata hub that seemed odd, with a few large peaks. Matt Scorah pointed out that the peaks were commonly 24 hours apart so could be caused by accumulated rainfall totals in pluvio records.
This raises the question of whether there are other suspect patterns, with large peaks 24 hours apart. I searched for patterns where the largest and second largest values were greater than 15% and the difference in time between these values was 24 hours. Most of the patterns in the previous blog showed up but there were a few others. The two standouts are patterns 5829 and 5732 (Figure 1) which, as before, originate from the Murray Basin in the area between Canberra and Cooma. These patterns also occur as Central Slopes eventIDs 4049 and 3951.
Figure 1: Areal temporal patterns with large peaks 24 hours apart.
Recently, I had a task to troubleshoot some odd results from a hydrologic model. High flood discharges were being produced from very long duration events which was not expected. The issue came down to a few unusual temporal patterns.
The recommendation in Australian Rainfall and Runoff is that an ensemble of 10 temporal patterns are used for modelling. These depend on the region, catchment area and duration and are available from the ARR Data Hub. The temporal pattern regions are shown in Figure 1.
There are two types of patterns; point and areal. Areal patterns are appropriate when the catchment area exceeds 75 km2. This blog only considers areal patterns.
Figure 1: Regions used for Temporal Patterns (ARR Book 2, Figure 2.5.7)
An example of an unusual pattern is as follows. The 10 areal temporal patterns for the southern slopes (mainland) for 120 hours duration for 200 km2 catchment area are shown in Figure 2. Pattern 7 (event ID 6884) stands out because of the small number of large values. The largest peak in the pattern implies that 48.82% of the 120 hour rainfall occurs in a single 180 min (3 hour) increment. This is likely to cause an intense embedded burst that may produce misleading results from modelling.
Figure 2: Ensemble areal temporal patterns for Southern Slopes (mainland) for 120 hours duration, and catchment area of 200 km2
Embedded bursts
“Embedded bursts” occur when a period of rainfall within a temporal pattern has an annual exceedance probability rarer than the burst as whole. Embedded bursts can cause issues with modelling and modellers should consider if they should be removed or smoothed (see Book 2, Section 5.2.1 and 5.9.4; Scorah et al., 2016). We can check if pattern 7 is likely to lead to such a problem.
The areal temporal patterns in Figure 2 are for the Southern Slopes region which includes Melbourne (Figure 1), so let’s use the Melbourne CBD as an example (lat = -37.8139, lon = 144.9633). The 1 in 100, 120 hour rainfall for Melbourne is 194 mm (from the BoM). Applying pattern 7 means there would be a 3 hour period when 48.82% of this rainfall occurs i.e. 94.7 mm in 3 hours. Looking at the IFD data for Melbourne shows that an event of 94.7 mm in 3 hours has an AEP of between 1 in 500 and 1 in 1000 years. So the 1 in 100, 120 hour storm contains a very rare 3 hour embedded burst. If we run a model using this pattern, the results are likely to depend on this one 3 hour period of rain rather than being indicative of the long 120 hour storm what was of interest.
Where did this pattern come from?
The creation of design areal temporal patterns is explained in ARR Book2, Chapter 5.6. They are scaled from real storms but if there were insufficient patterns for a particular region then patterns may be donated from neighbouring regions. For example, the Southern Slopes region can take patterns from the Murray Basin (see ARR Book 2, Table 2.5.10); pattern 7 is an example of this. From the data hub, it is possible to download a file that contains information about the storms that were used to create the patterns. For the southern slopes region, this file is Areal_SSmainland_AllStats.csv. Looking at pattern 7 (Event ID 6884) shows this pattern came from Jerangle, in the Murray Basin, 100 km south of Canberra (lat = -35.8875, lon = 149.3375). IFD information for this location shows that the 1 in 100, 120 hour storm is 336 mm. Pattern 7 means that 48.82% of this rain (164 mm) falls in 3 hours. For the Jerangle region this is much larger than a 1 in 2000 year event (the 1 in 2000 year, 3 hour rainfall is 96.7 mm). This suggests that pattern 7 is based on a very unusual storm so may not be appropriate for modelling “ordinary risks” such as those associated with 1% events. Another possibility is that there is an error in the data.
Searching for other odd patterns
Finding a single strange pattern naturally leads the problem of searching for others. Whether a rare embedded burst is present depends on both the temporal pattern and the rainfall characteristics of a location but it was not practical to obtain and process pattern and IFD data for the whole of Australia. Instead a non-uniformity index (NUI) was developed which is a measure of the difference between an ARR temporal pattern and a uniform pattern (which has the same rainfall in each increment). The details of the NUI are provided below. A high value of the NUI won’t necessarily indicate an issue with embedded bursts but will identify patterns that require checking. The NUI was calculated for all 10,692 areal temporal patterns on the data hub. The top 17 least-uniform patterns are listed in Table 1.
The first 3 entries are actually the same pattern that originated in the Murray Basin and which has been used in the neighbouring regions of Central Slopes and Southern Slopes (mainland). This next two patterns are the same and also originated in the Murray Basin and there are similar examples throughout the table. Of the top 17 least-uniform patterns, 15 are from the Murray Basin and all originated in the Jerangle area in 5 separate but nearby locations as mapped on (Figure 3). The pattern listed in row 14 originated near Charleville, Qld about 600 km west of Brisbane and the final pattern in the table, from the Monsoonal North is based on a storm from Rum Jungle, 100 km south of Darwin, NT.
Table 1: Least uniform areal temporal patterns
No.
Event ID
Duration (hour)
Region
NUI
Source region
Area
Lat
Lon
1
4207
120
Central Slopes
481
Murray Basin
200
-35.89
149.34
2
5989
120
Murray Basin
481
Murray Basin
200
-35.89
149.34
3
6884
120
Southern Slopes (mainland)
481
Murray Basin
200
-35.89
149.34
4
4119
96
Central Slopes
385
Murray Basin
200
-35.89
149.34
5
5901
96
Murray Basin
385
Murray Basin
200
-35.89
149.34
6
5644
36
Murray Basin
378
Murray Basin
500
-35.91
149.46
7
3950
48
Central Slopes
347
Murray Basin
500
-35.96
149.51
8
5731
48
Murray Basin
347
Murray Basin
500
-35.96
149.51
9
4116
96
Central Slopes
335
Murray Basin
200
-35.94
149.49
10
5898
96
Murray Basin
335
Murray Basin
200
-35.94
149.49
11
6797
96
Southern Slopes (mainland)
335
Murray Basin
200
-35.94
149.49
12
4029
72
Central Slopes
307
Murray Basin
200
-35.91
149.41
13
5810
72
Murray Basin
307
Murray Basin
200
-35.91
149.41
14
4170
96
Central Slopes
252
Central Slopes
10000
-26.39
147.71
15
4043
72
Central Slopes
229
Murray Basin
500
-35.96
149.51
16
5822
72
Murray Basin
229
Murray Basin
500
-35.96
149.51
17
5004
96
Monsoonal North
210
Monsoonal North
100
-13.06
130.96
Figure 3: Map showing the origins of 15 of the 17 patterns in Table 1. This area is about 100 km south of Canberra, ACT
The patterns listed in Table 1 are plotted as follows. Murray Basin pattern 5989 (row 2) is shown in Figure 2 as pattern 7 (southern slopes eventID 6884). The next 4 Murray Basin patterns from Table 1 (rows 5, 6, 8 and 10) are shown in Figure 4 below. All have similar characteristics with one increment producing 40% to 60% of the rain for the whole pattern. Figure 5 shows the next lot of patterns (rows 13, 14, 16 and 17 in Table 1). By the time we get down to row 17 (Monsoonal North eventID 5004) the pattern appears closer to expectations.
Figure 4: Non-uniform patterns specified in Table 1 rows 5, 6, 8 and 10
Figure 5: Non-uniform patterns specified in Table 1 rows 13, 14, 16 and 17
At least to me, its not clear why a range of highly non-uniform patterns should all originate in the same area of NSW (Figure 3), but it does suggest the original data should be checked. In the interim, modellers could consider excluding these patterns and running ensemble or Monte Carlo analyses without them.
I haven’t looked at point temporal patterns but it would seem appropriate to check that they are reasonable and suitable for modelling.
Scorah, M., P. Hill, S. Lang and R. Nathan (2016). Addressing embedded bursts in design storms for flood hydrology 37th Hydrology & Water Resources Symposium. Tasmania, Engineers Australia. (link)
Appendix: NUI non-uniformity index
A non-uniformity index was used to rank patterns and find the least uniform. These were likely to contain rare embedded bursts. The approach was to use the chi-squared probability calculated using a quantile based on the difference between a particular temporal pattern and a uniform pattern. The number of degrees of freedom was the number of increments in the pattern minus one.
The quantile was calculated as:
Where
is the proportion of rainfall in a increment in the pattern
is the proportion of rainfall in an increment in a uniform pattern
The sum is over all the increments in the pattern
Degrees of freedom (df) is the number of increments less one
From the value of , and df, find the probability . I used the pchisq function in R do this.
This probability is a very small value for the highly non-uniform patterns so was transformed to be in a range that is easy to work with with maximum values about 500 for the most extreme patterns listed in Table 1.
Australian Rainfall and Runoff 2016 has an example of fitting a probability model to POT (peak over threshold) data (Section 2.8.11 in Book 3). It took me a long time to understand this example. To these similarly challenged, I hope this blog will help make it easier.
Calculations are available as an excel spreadsheet; R code is available as a gist. The R code reproduces all the results and figures from this blog.
The case study is the Styx River at Jeogla with the POT series providing 47 peaks above the threshold of 74 m3/s for a period of record of 47 years. The data are as follows:
Our goal is to fit a probability model to these data and estimate flood quantiles.
You can download the data as a csv file here. This is the same data set that was used in the 1987 edition of ARR (Book IV, Section 2.9.5). I’ve sourced the date of each flood from ARR1987 and included it in the file. Interestingly the earliest flood discharge is from 1943 and the last from 1985, a record of 43 years not 47. This is because the floods between 1939 and 1943 were all less than the 74 m3/s threshold (from 1939 to 1942 they were: 58.0, 13.0, 22.1, and 26.1 m3/s).
Start by calculating the the plotting positions for these data points. ARR2016 recommends using the inverse of the Cunnane plotting position formula to calculate recurrence intervals (see ARR 2016 equation 3.2.79, in Book 3, Chapter 2.7.1).
Where:
average interval between exceedances of the given discharge (years)
is the rank of flows in the POT series (largest flow is rank 1)
is the number of years of data.
Note that is the number of years of record, not the number of peaks.
Also, its important to recognise that equation 1 works differently for the POT series compared to the annual series. Let’s look at an example that shows this difference. Consider the case where we were analysing 94 peaks from a POT series from 47 years of data. In that case, the smallest peak would have a rank of 94. The recurrence interval would be:
Which is about half a year, or 6 months, which seems reasonable (there were 94 peaks in 47 years so there are 2 peaks per year on average and they will be about 6 months apart on average).
Interpreting the Cunnane plotting position as annual exceedance probability would give a result of:
That is, the probability is greater than 1 which doesn’t make sense. So we can’t use a plotting position formula to calculate the AEP of data from a peak-over-threshold series. Instead, the inverse of equation 1 can be interpreted as EY, the expected number of exceedances per year.
The ARR example fits the exponential distribution as the probability model. This has two parameters, and , which can be determined by calculating the first and second L moments of the POT series data.
L1 is just the mean of the pot series (266.36 m3/s). To calculate L2 we use the formulas developed by QJ Wang (Wang, 1996)
Where:
are the flows (the POT series) in order from smallest to largest
is the number of flows
L2 = 79.12
I’ve calculated L2 using this formula as shown in the companion spreadsheet but it does require a fudge. Excel doesn’t automatically recognise that 0Cn is equal to zero. This is a standard mathematical convention that Excel doesn’t seem to implement.
The parameters of the exponential distribution can be expressed in terms of the L moments (See ARR2016, Table 3.2.3.) We are using the formulas for the Generalized Pareto (GP) but setting to zero which results in the exponential distribution. The GP distribution may be useful in real world applications.
For the exponential distribution:
The example in ARR doesn’t include confidence intervals for these parameter estimates, but these can be calculated using bootstrapping.
The 95% confidence intervals are, for beta (37.4, 89.4) and (120.6, 244.7). The correlation between the parameters is about -0.5. Uncertainty in parameters is shown graphically in Figure 1.
Figure 1: Uncertainty in parameters. Ellipse shows 95% confidence limit
The exponential distribution relates the exceedance probability to the size of a flow. In this case, the probability that a peak flow, , exceeds some value in any POT event is:
The ARR example cross-references these formulas to Table 3.2.3, but the reference should be to Table 3.2.1. The final formula in Table 3.2.1 shows the cumulative distribution function for the exponential distribution. This is for the non-exceedance probability. In hydrology we usually use exceedance probabilities (the probability that a flood is greater than a certain value), therefore the required equation is one minus the equation shown.
We can convert the POT probabilities to expected number of exceedances per year, by multiplying by the average number of flood peaks above the threshold per year. The example in ARR states that the threshold is i.e. 68.12 m3/s but the threshold used to create the POT data was 74 m3/s. I’m not sure, but I think we should use the 74 m3/s threshold. There is no information available on how many peaks there are above 68.12 m3/s. It is likely the ambiguity has arise because of a typo in the example i.e. the threshold should not have been related to .
Sticking with the 74 m3/s threshold, there are there are 47 values in POT series and 47 years of data, so the average number of peaks per year is 1. Therefore, the EY can be expressed as follow:
Where latex is the average number of POT floods per year, which is 1 in this case. In real applications, is likely to be greater than 1. The discussion in ARR2016 mentions the flow threshold is typically selected so that there are 2 to 3 times more peaks in the POT series compared to the annual series (Book 3, Chapter 2.2.2.2). Although in Chapter 2.3.4 it is mentioned that there are circumstances where the POT and annual series should have the same number of peaks.
can also be related to as follows:
We can also relate to annual exceedance probability by the formula:
Therefore, a flood magnitude, can be related to an AEP, and an AEP to a flood magnitude, , through the following pair of equations (ARR2016 Equation 3.2.11).
These equations can be used to estimate quantiles at standard EY values. Confidence intervals (95%) have been estimated using bootstrapping (Table 1).
Table 1: Flood quantiles and 95% confidence limits at standard EY values
EY
AEP
AEP (1 in X)
ARI
Q
Lwr
Upr
1.00
0.63
1.6
1.0
68
39
90
0.69
0.50
2.0
1.4
127
106
150
0.50
0.39
2.5
2.0
178
151
215
0.22
0.20
5.1
4.5
308
253
404
0.20
0.18
5.5
5.0
323
267
427
0.11
0.10
9.6
9.1
417
335
565
0.05
0.05
20.5
20.0
542
434
754
0.02
0.02
50.5
50.0
687
548
965
0.01
0.01
100.5
100.0
797
627
1167
Now plot the data and the fit (Figure 2). The x values are the EY calculated from the inverse of equation 1; the y values are the flows. The fit is based on equation 6. Confidence intervals for quantiles were calculated using bootstrapping.
Figure 2: Partial series for the Styx River at Jeogla showing fit and confidence intervals
Personally I prefer the graph to be increasing to the right which is the way flood frequency curves are usually plotted, its also the way a similar example was presented in the 1987 edition of Australia Rainfall and Runoff. This just involves using the ARI as the ordinate (Figure 3).
Figure 3: Partial series for the Styx River at Jeogla showing fit and confidence intervals (same as Figure 2 except increasing to the right)
A plot of measured and predicted times of concentration using the Pilgrim McDermott method is shown in Figure 1. This is based on the 95 measurements that they used to develop their formula so, not surprisingly, the fit looks ok. The figure also shows a smooth fit through the data (the blue dashed line) which closely tracks the the 1:1 line.
An insightful comment from one of my undergraduate lectures was that “anything looks good on a log-log plot”, so lets check the fit without a log transformation (Figure 2). A noticeable feature is the increasing variance with increasing time, and a tendency to under-predict for large times. The Nash-Sutcliffe coefficient is 0.85 for the log transformed data and 0.70 without transformation.
Figure 1: Comparison of modelled and observed time of concentration values, Pilgrim McDermott formula, log scale
Figure 2: Comparison of modelled and observed time of concentration values, Pilgrim McDermott formula, linear scale
Now let’s check the Bransby Williams fit. The formula, as recommended in Australian Rainfall and Runoff, 1987 edition is:
Where:
is the time of concentration in minutes is the slope (m/km) is the main stream length measured to the catchment divide (km) is the area of the catchment (km2)
Both average slope and equal area slope are used. ARR 1987 acknowledges that Bransby Williams used the average slope but preferred the equal area slope and noted the reasonable fit to data in French et al. (1974), (although French et al. used the average slope in their analysis).
Figure 3 shows the quality of the fit using both slopes. There is a tendency for large over-prediction as time of concentration increases. The Nash Sutcliffe coefficients for average and equal area slope are 0.52 and 0.60 for log transformed data and -2.3 and -4.4 for non-transformed data.
Figure 3: Comparison of modelled and observed time of concentration values, Bransby Williams formula, linear scale
Issues with the Bransby Williams formula was one of the motivations for the work undertaken by Pilgrim and McDermott. Bransby Williams was also found to be inaccurate in comparisons with 129 catchments in the UK (Beran, 1979).
Based on the available Australian data, the Pilgrim McDermott formula is more accurate and should be used in preference unless there is local data that supports the application of Bransby Williams or some other approach. Of course, if you are using the Probabilistic Rational Method to estimate flows in ungauged catchments, the Pilgrim McDermott formula should be applied as it was used in the derivation of the runoff coefficients that are mapped in the 1987 edition of Australian Rainfall and Runoff.
Code to produce the figures and analysis is available as a gist.
Bransby Williams, G. (1922). Flood discharge and the dimensions of spillways in India. The Engineer (London) 121: 321-322. (link)
French, R., Pilgrim, D.H., Laurenson, E.M., 1974. Experimental examination of the rational method for small rural catchments. Civ. Eng. Trans. Inst. Engrs. Aust. CE16, 95–102. CE16 95-102.
McDermott, G. E. and D. H. Pilgrim (1982). Design flood estimation for small catchments in New South Wales, Department of National Development and Energy and Australian Water Resources Council. Research project No. 778/104. Technical paper No. 73. Australian Government Publishing Service.
Pilgrim, D. H. and G. E. McDermott (1982). Design floods for small rural catchments in Eastern New South Wales. Civil Engineering TransactionsCE24: 226-234.
“Time of concentration” is often assumed to be the time for runoff to flow from the furthest reaches of a catchment to the outlet. Pilgrim and McDermott take a different tack, and use the “characteristic response time of a catchment” which is the minimum time of rise of the flood hydrograph. This is the time taken for flow to increase from an initial low value to the hydrograph peak (French et al., 1974). They determine this characteristic response time from measurements at gauging stations and argue that it is a better approach because it is empirically based (McDermott and Pilgrim, 1982).
Pilgrim and McDermott also measured catchment characteristics: stream length, average slope, equal area slope and catchment area. They explored the data and ultimately adopted the formula that relates time of concentration to catchment area.
(hours) (equation 1)
where A is measured in km2.
Characteristic response times were collected for 95 stream gauging sites from Queensland, New South Wales, Victoria and Tasmania. Those sites where location information is available (62 of 95) are shown Figure 1. There are lots of sites in the Snowy Mountains and around Sydney and a scattering of sites elsewhere.
Figure 1: Locations of gauges (red dots) used to derive the Pilgrim McDermott formula for the time of concentration
Let’s have a look at the data. Figure 2 shows the relationship between response time and catchment area. The regression fit to the logged data is the same as that reported by Pilgrim and McDermott which is a good start. The fit seems reasonable and the Nash-Sutcliffe coefficient, and the adjusted r-squared, is 0.85.
However, there is substantial uncertainty. Also plotted on Figure 2 are the 95% prediction intervals which show that a large range of response times are possible for any given catchment size. For example, the fitted value for a 100 km2 catchment is 4.4 hours but the upper and lower prediction limits are 1.8 hours and 10.5 hours.
Figure 2: Catchment response time as a function of area. Figure shows the fitted equation (equation 1) and the 95% prediction intervals
The relationship between catchment response time and the other catchment characteristics is interesting (Figure 3 and Figure 4). There is a strong association with stream length and catchment slope (either equal area or average slope). Steep, short streams respond quickly so have shorter characteristic times.
Pilgrim and McDermott considered including stream length and slope in the equation to predict time of concentration but found they were highly correlated with area which we can see in figures 5 and 6. These figures show that large catchments have longer stream lengths, which is expected. Also, large catchments have lower slopes. The upshot is that the Pilgrim McDermott formula is taking account of the effect of stream length and slope because of their correlation with area. Including these variables in the formula for time of concentration only made, at best, a marginal improvement in accuracy. The adjusted r-squared increased from 0.848 to 0.854 with the inclusion of slope. There was no advantage from including stream length. An equation based solely on catchment area is also simpler to use.
Figure 3: Catchment response time as a function of slope
Figure 4: Catchment response time as a function of stream length
Figure 5: Stream length as a function of area
Figure 6: Relationship between catchment area and catchment slope
The data set used by Pilgrim and McDermott is available in a csv file. Code to reproduce the figures and analysis is available as a gist.
References
French, R., Pilgrim, D.H., Laurenson, E.M., 1974. Experimental examination of the rational method for small rural catchments. Civ. Eng. Trans. Inst. Engrs. Aust. CE16, 95–102. CE16 95-102.
McDermott, G. E. and D. H. Pilgrim (1982). Design flood estimation for small catchments in New South Wales, Department of National Development and Energy and Australian Water Resources Council. Research project No. 778/104. Technical paper No. 73. Australian Government Publishing Service.
Pilgrim, D. H. and G. E. McDermott (1982). Design floods for small rural catchments in Eastern New South Wales. Civil Engineering TransactionsCE24: 226-234.