Week Eleven: A Taste of Extreme Value Theory

This week we looked briefly at extreme value theory, the statistical tools we use when we’re more interested in extremes (maxima or minima) of our samples than in the more typical means or medians.

1 In Class

We defined the return period: the expected amount of time between extreme events of at least a certain magnitude. For example, a 100-year flood of a river is a flood level that has a 1% chance of being reached or exceeded in any given year (and so an expected frequency of occurrence of once every 100 years).

One way to think about determining the magnitude for a certain return period, or the return period for a certain magnitude of event, is to look at probabilities in some kind of distribution: what are all the things that could happen, and how far out in the tail is our event? The trick is that we need to find the right distribution.

We talked through some of the logic of why it’s not the normal distribution and collected some data (not fully successfully) to try to illustrate this. Our dataset did seem to have somewhat fatter tails than the normal distribution, though, which is what we expect when we take something like the maximum of each of our samples.
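A small simulation can stand in for the in-class data collection. The sketch below is not the class dataset: it draws simulated standard normal samples, with sample sizes and a seed chosen only for illustration, and compares the shape of the sample means with the shape of the sample maxima using a simple skewness measure.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: the mean cubed z-score (near 0 for a symmetric distribution)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


rng = random.Random(11)  # fixed seed so the run is repeatable
n_samples, sample_size = 2000, 30  # assumed sizes, chosen only for illustration

sample_means, sample_maxima = [], []
for _ in range(n_samples):
    draws = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    sample_means.append(statistics.fmean(draws))
    sample_maxima.append(max(draws))

print("skewness of sample means: ", round(skewness(sample_means), 2))
print("skewness of sample maxima:", round(skewness(sample_maxima), 2))
```

The means should come out roughly symmetric, as the central limit theorem suggests, while the maxima should be noticeably right-skewed, with a longer upper tail than a normal curve would give.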

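Okay, so what distribution is it? By the Fisher-Tippett-Gnedenko Theorem, if the suitably rescaled maxima of independent, identically distributed samples settle down to any limiting distribution at all, that limit is the Generalized Extreme Value Distribution, whose cumulative distribution function is: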

\[G(x; \mu, \sigma, \xi) = \mathrm{exp}\left[-\left(1 + \xi \frac{x-\mu}{\sigma}\right)^{-1/\xi}\right],\]

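where \(\mu\) is a location parameter (shifting the distribution left/right), \(\sigma > 0\) is a scale parameter (controlling the spread), and \(\xi\) is a shape parameter (controlling how heavy the tail is, and with it the skew). The formula holds wherever \(1 + \xi \frac{x-\mu}{\sigma} > 0\), and the case \(\xi = 0\) is read as the limit \(\xi \to 0\), which gives \(G(x) = \mathrm{exp}\left[-\mathrm{exp}\left(-\frac{x-\mu}{\sigma}\right)\right]\).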

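This is really a unified way of writing three distributions that were already known, distinguished by the sign of \(\xi\): the (reversed) Weibull distribution (\(\xi < 0\); short tails with a finite upper bound, good for wind and temperature), the Fréchet distribution (\(\xi > 0\); heavy tails, good for floods and precipitation), and the Gumbel distribution (\(\xi = 0\); thinner long tails that decay exponentially).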
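To see how the shape parameter changes the tail, the distribution function above can be written out directly. This is a minimal sketch with a function name of our own choosing; it uses the standard convention in which \(\xi < 0\) is the Weibull-type case, \(\xi = 0\) is Gumbel, and \(\xi > 0\) is the Fréchet-type case, and the parameter values are picked only for illustration. Note that some software uses the opposite sign for the shape parameter (SciPy's `genextreme`, for example, takes `c` equal to \(-\xi\)).

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """G(x; mu, sigma, xi) for the Generalized Extreme Value distribution."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if xi == 0:
        # Gumbel case, the limit of the general formula as xi -> 0
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


# Probability of exceeding x = 4 under three illustrative shapes (mu = 0, sigma = 1)
for label, xi in [("Weibull-type", -0.3), ("Gumbel", 0.0), ("Frechet-type", 0.3)]:
    print(label, 1.0 - gev_cdf(4.0, 0.0, 1.0, xi))
```

With these values the Weibull-type case gives an exceedance probability of exactly zero, because x = 4 lies above its upper bound of \(\mu - \sigma/\xi \approx 3.33\), and the Fréchet-type case gives a larger exceedance probability than the Gumbel case.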

If we have data, we can find the optimal parameter values to represent that data with a Generalized Extreme Value Distribution. And then from there we can look at probabilities in the tails to find return periods like we wanted!
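Once parameters are in hand, the last step is just inverting the distribution function. Setting \(G(x_T) = 1 - 1/T\) and solving gives \(x_T = \mu + \frac{\sigma}{\xi}\left[\left(-\ln(1 - 1/T)\right)^{-\xi} - 1\right]\), or \(x_T = \mu - \sigma \ln\left(-\ln(1 - 1/T)\right)\) when \(\xi = 0\). The sketch below codes both directions, assuming the data are annual maxima so that return periods are in years. The fitting step itself (for example by maximum likelihood) is not shown, and the parameter values are hypothetical, made up purely for illustration; the function names are our own.

```python
import math


def gev_return_level(return_period, mu, sigma, xi):
    """Magnitude exceeded with probability 1/return_period per year."""
    if return_period <= 1:
        raise ValueError("return_period must be greater than 1")
    y = -math.log(1.0 - 1.0 / return_period)
    if xi == 0:
        return mu - sigma * math.log(y)  # Gumbel limit
    return mu + sigma * (y ** (-xi) - 1.0) / xi


def gev_return_period(level, mu, sigma, xi):
    """Return period of a magnitude: 1 / P(annual maximum > level)."""
    z = (level - mu) / sigma
    if xi == 0:
        cdf = math.exp(-math.exp(-z))
    else:
        t = 1.0 + xi * z
        if t <= 0:
            # Outside the support of the distribution
            cdf = 0.0 if xi > 0 else 1.0
        else:
            cdf = math.exp(-t ** (-1.0 / xi))
    if cdf >= 1.0:
        return math.inf  # level is never exceeded under this model
    return 1.0 / (1.0 - cdf)


# Hypothetical fitted parameters for annual maximum river levels, in metres
mu, sigma, xi = 3.0, 0.5, 0.1
level_100 = gev_return_level(100, mu, sigma, xi)
print("100-year level:", round(level_100, 2))
print("return period of that level:", round(gev_return_period(level_100, mu, sigma, xi), 1))
```

The two functions are inverses of each other, so feeding the 100-year level back into `gev_return_period` recovers a return period of 100 years.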

2 Some Further Resources

A climate-based introduction to extreme value theory

A slide presentation about extreme value theory

History and intro to extreme value theory (including an additional approach we didn’t discuss)

Dive into the Weibull distribution

An article using extreme value theory to analyze swimming competitions

An extreme value analysis of Queen Elizabeth II’s reign

An article about limitations of extreme value theory in the face of climate change

Combining the idea of return periods with some spatial statistics