Generalized Extreme Value (GEV) Distribution Simulator

Q: How do the three GEV families (Gumbel / Fréchet / Weibull) differ?

They are distinguished by the sign of the shape parameter ξ. ξ=0 gives Gumbel (Type I, light tail), where the tail decays exponentially — typical of annual maximum temperatures and daily maximum rainfall. ξ>0 gives Fréchet (Type II, heavy tail), where the tail decays as a power law and has no upper bound — appropriate for stock-market crashes, insurance losses or earthquake magnitudes where extreme outliers are possible. ξ<0 gives Weibull (Type III, finite upper bound), where the distribution is truncated at a maximum value — used for physical phenomena with a hard upper limit such as wind speed. This tool lets you vary ξ from -0.5 to 0.5 so you can compare the three families intuitively.
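The sign rule above can be sketched in a few lines of Python. The function names `gev_family` and `gev_support` are hypothetical, and the bound μ − σ/ξ follows from requiring 1 + ξ(z−μ)/σ > 0 in the GEV CDF; the numeric inputs are illustrative only.

```python
def gev_family(xi, tol=1e-9):
    """Classify a GEV by the sign of its shape parameter xi."""
    if abs(xi) <= tol:
        return "Gumbel"
    return "Frechet" if xi > 0 else "Weibull"


def gev_support(mu, sigma, xi):
    """Return (lower, upper) bounds of the support; None means unbounded."""
    if xi > 0:
        return (mu - sigma / xi, None)   # Frechet: bounded below, no upper bound
    if xi < 0:
        return (None, mu - sigma / xi)   # Weibull: finite upper bound
    return (None, None)                  # Gumbel: unbounded on both sides


# Illustrative values (assumed, not taken from the tool)
print(gev_family(-0.2), gev_support(50.0, 15.0, -0.2))
```

For ξ<0 the upper bound moves closer to μ as ξ becomes more negative, which is the "truncated at a maximum value" behaviour described above.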

Q: What do "100-year probability" and "100-year return period" actually mean?

A T-year return period means the value is exceeded on average once every T years, so the annual exceedance probability is 1/T. The 100-year return value z_100 corresponds to an annual exceedance probability of 1% (0.01). The crucial point is that "100-year event" does NOT mean "safe for 100 years". The probability of at least one exceedance during a 30-year service life is 1−(1−0.01)^30 ≈ 26%. This tool surfaces that number directly as "Mission-period exceedance probability" so designers do not confuse return period with service life.

Q: How does the sample size N relate to the confidence interval?

GEV parameter estimates (maximum likelihood, L-moments) are asymptotically normal under large-sample theory, so the estimation error shrinks roughly as 1/√N. The 95% CI width reported here uses a simplified delta-method approximation with standard error of order σ·√(0.5·ln T / N). With N=50 years of data and T=100 years, the full CI width comes out at roughly 10–15 (a half-width of about ±5–8) for scale values near the tool's default, showing how uncertain extrapolation from N years of data to a T-year (≫N) return level can be. In practice POT methods or hierarchical Bayesian models are used for more rigorous CIs.
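As a rough check on the approximation above, the following Python sketch evaluates the quoted standard error and turns it into a 95% interval width. The 1.96 normal multiplier, the function name `ci_width_95` and the value σ = 15 are assumptions made for illustration; the tool's actual implementation may differ.

```python
import math


def ci_width_95(sigma, T, N):
    """Full 95% CI width from the simplified standard error sigma*sqrt(0.5*ln(T)/N)."""
    se = sigma * math.sqrt(0.5 * math.log(T) / N)
    return 2.0 * 1.96 * se   # assumed: symmetric normal interval


# Assumed inputs: sigma = 15, T = 100 years, N = 50 years of data
print(f"{ci_width_95(15.0, 100, 50):.1f}")   # about 12.6 for these inputs
print(f"{ci_width_95(15.0, 100, 200):.1f}")  # four times the data, half the width
```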

Q: When should I use Block Maxima (BM) versus Peaks-Over-Threshold (POT)?

Block Maxima takes one sample per period (typically the annual maximum) and fits a GEV to those samples. It is easy to organise and interpret, but sample efficiency is poor: even if several large events occur in the same year, only one is used. POT treats every exceedance above a chosen threshold as a sample and fits a Generalized Pareto Distribution (GPD). It dramatically increases the sample size and narrows the CI, but requires careful threshold selection and independence checking (storm declustering). POT is preferred for river discharges and typhoon intensity. This tool operates on the GEV parameters directly, corresponding to the BM approach.
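The difference in sample efficiency can be illustrated with a minimal Python sketch. The helper names `block_maxima` and `peaks_over_threshold`, the toy series, and the simple declustering rule (exceedances closer than `min_gap` steps belong to one event) are all assumptions for illustration, not the method used by this tool.

```python
def block_maxima(series, block_size):
    """One sample per complete block: the maximum of each block."""
    return [max(series[i:i + block_size])
            for i in range(0, len(series) - block_size + 1, block_size)]


def peaks_over_threshold(series, threshold, min_gap):
    """One peak per cluster of exceedances; clusters are separated by >= min_gap steps."""
    peaks = []
    cluster_max = None
    last_exceed = None
    for i, x in enumerate(series):
        if x <= threshold:
            continue
        if last_exceed is not None and i - last_exceed < min_gap:
            cluster_max = max(cluster_max, x)   # same event, keep its peak
        else:
            if cluster_max is not None:
                peaks.append(cluster_max)       # close the previous event
            cluster_max = x
        last_exceed = i
    if cluster_max is not None:
        peaks.append(cluster_max)
    return peaks


# Toy series with two "years" of six observations each (made-up numbers)
series = [1, 5, 6, 1, 1, 1, 7, 1, 2, 9, 8, 1]
print(block_maxima(series, 6))             # [6, 9]
print(peaks_over_threshold(series, 4, 3))  # [6, 7, 9]
```

In the second block the event peaking at 7 is discarded by Block Maxima but retained by POT, which is where the extra samples come from.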

Generalized Extreme Value (GEV) Distribution Simulator

Use the Generalized Extreme Value distribution to compute the "100-year event" and the probability that it is exceeded during a 30-year service life. Sliding the location μ, scale σ and shape ξ switches between the Gumbel, Fréchet and Weibull families and updates the T-year return level z_T together with its 95% confidence interval in real time.

Parameters

Location μ

Centre of the distribution (e.g. typical annual maximum)

Scale σ

Spread of the distribution (analogous to standard deviation)

Shape ξ

ξ>0: Fréchet (heavy tail) / ξ=0: Gumbel / ξ<0: Weibull (finite upper bound)

Return period T

Period in which the value is exceeded once on average (annual probability = 1/T)

Sample size N

Years of annual-maximum data. Larger N narrows the CI

Mission duration

Service life of the structure / equipment over which z_T may be exceeded

Results

Return level z_T

GEV family

Annual exceedance (%)

Mission exceedance (%)

95% CI width

Fit confidence (KS-p)

GEV PDF and return level — three-family comparison

Three curves: Gumbel (green, ξ=0), Fréchet (red, ξ>0) and Weibull (blue, ξ<0). The dashed line marks the return level z_T and the histogram at the bottom shows sampled annual maxima.

Return level z_T vs return period T (log axis)

GEV PDF — comparison of the three shapes for different ξ

Theory & Key Formulas

$$F(z) = \exp\left[-\left(1 + \xi\,\frac{z-\mu}{\sigma}\right)^{-1/\xi}\right]$$

GEV CDF, valid where 1 + ξ(z−μ)/σ > 0. μ: location, σ: scale (σ > 0), ξ: shape. As ξ → 0 the right-hand side reduces to exp(−exp(−(z−μ)/σ)), the Gumbel form.
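A minimal Python sketch of this CDF, including the Gumbel limit and the edge of the support, might look as follows. The function name `gev_cdf` and the parameter values μ = 50, σ = 15, ξ = 0.1 are assumptions chosen for illustration.

```python
import math


def gev_cdf(z, mu, sigma, xi):
    """GEV cumulative distribution function F(z)."""
    if abs(xi) < 1e-12:
        # Gumbel limit as xi -> 0
        return math.exp(-math.exp(-(z - mu) / sigma))
    t = 1.0 + xi * (z - mu) / sigma
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0) or above the upper bound (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


# At z = mu the CDF equals exp(-1) for every xi
print(round(gev_cdf(50.0, 50.0, 15.0, 0.1), 4))
```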

$$z_T = \mu + \frac{\sigma}{\xi}\left[\left(-\ln(1-1/T)\right)^{-\xi} - 1\right]$$

T-year return level z_T (T in years). For ξ=0 the expression becomes z_T = μ − σ·ln(−ln(1−1/T)).
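The return-level formula translates directly into code. In the sketch below, `gev_return_level` is a hypothetical name, and μ = 50, σ = 15, ξ = 0.1 are assumed values chosen because they land near the z_100 ≈ 137.6 quoted elsewhere on this page; they are not documented defaults.

```python
import math


def gev_return_level(T, mu, sigma, xi):
    """T-year return level z_T of a GEV(mu, sigma, xi) distribution."""
    y = -math.log1p(-1.0 / T)           # -ln(1 - 1/T), computed stably for large T
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)  # Gumbel case
    return mu + sigma / xi * (y ** (-xi) - 1.0)


print(f"{gev_return_level(100, 50.0, 15.0, 0.1):.1f}")  # about 137.6 for these assumed inputs
print(f"{gev_return_level(100, 50.0, 15.0, 0.0):.1f}")  # Gumbel with the same mu and sigma
```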

$$P_{\text{mission}} = 1 - \left(1 - \tfrac{1}{T}\right)^{D}$$

Probability of at least one exceedance of z_T in a service life of D years. For T=100, D=30 this is about 26% — the "100-year level" is far from "safe for 100 years".
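The mission-period formula is a one-liner; the sketch below (with the hypothetical name `mission_exceedance`) assumes independent years with a constant annual exceedance probability of 1/T, as the formula does.

```python
def mission_exceedance(T, D):
    """Probability of at least one exceedance of z_T during D years."""
    return 1.0 - (1.0 - 1.0 / T) ** D


print(f"{100 * mission_exceedance(100, 30):.1f}%")  # 26.0%
print(f"{100 * mission_exceedance(100, 50):.1f}%")  # 39.5%
```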

GEV Distribution — 100-Year Return Period & Gumbel/Fréchet/Weibull Families

🙋

News reports keep calling things "once-in-a-100-year floods", but it feels like they happen several years in a row. Are they really only every 100 years?

🎓

Good catch. A "100-year return period" means the value is exceeded once every 100 years on average — think of it as drawing independently each year with a 1% probability. So two years in a row is 1% × 1% = 0.01%. And the chance of at least one exceedance during a 30-year service life is 1−(1−0.01)^30 ≈ 26%. With the defaults set to T=100 and mission 30 years, the "Mission exceedance" card on the right is showing exactly that 26.0%.

🙋

A 1-in-4 chance?! I assumed 100 years meant safe. So how is the actual "100-year value" decided?

🎓

That is where GEV, the Generalized Extreme Value distribution, comes in. Take 50 years of "annual maximum rainfall": the Fisher–Tippett–Gnedenko theorem says that maxima taken over long enough blocks are approximately GEV-distributed, provided a limiting distribution exists at all. Fit the three parameters (location μ, scale σ, shape ξ) by maximum likelihood or L-moments, then plug them into z_T = μ + (σ/ξ)·[(−ln(1−1/T))^(−ξ) − 1] to get z_100. With the defaults you can see z_100 ≈ 137.6.
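To make the fitting step concrete, here is a rough Python sketch of an L-moment fit using the rational approximation for the shape parameter published by Hosking, Wallis and Wood (1985). It is not this tool's code: the function names, the synthetic data and the "true" parameters μ = 50, σ = 15, ξ = 0.1 are assumptions, and the sampler assumes ξ ≠ 0.

```python
import math
import random


def sample_gev(n, mu, sigma, xi, rng):
    """Draw n GEV variates by inverting the CDF (xi != 0 assumed)."""
    out = []
    for _ in range(n):
        u = rng.random()
        while u <= 0.0:          # guard against log(0)
            u = rng.random()
        out.append(mu + sigma / xi * ((-math.log(u)) ** (-xi) - 1.0))
    return out


def fit_gev_lmoments(data):
    """Estimate (mu, sigma, xi) from sample L-moments."""
    x = sorted(data)
    n = len(x)
    # Unbiased probability-weighted moments b0, b1, b2
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l1, l2, l3 = b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0
    t3 = l3 / l2                                  # L-skewness
    c = 2.0 / (3.0 + t3) - math.log(2) / math.log(3)
    k = 7.8590 * c + 2.9554 * c * c               # Hosking's shape, k = -xi
    if abs(k) < 1e-8:
        # Gumbel limit: sigma = l2 / ln 2, mu = l1 - (Euler constant) * sigma
        sigma = l2 / math.log(2)
        return l1 - 0.5772156649 * sigma, sigma, 0.0
    g = math.gamma(1.0 + k)
    sigma = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    mu = l1 - sigma * (1.0 - g) / k
    return mu, sigma, -k


rng = random.Random(42)
annual_maxima = sample_gev(50, mu=50.0, sigma=15.0, xi=0.1, rng=rng)  # synthetic "50 years"
mu_hat, sigma_hat, xi_hat = fit_gev_lmoments(annual_maxima)
print(f"mu={mu_hat:.1f} sigma={sigma_hat:.1f} xi={xi_hat:.2f}")
```

With only 50 synthetic values the estimates, ξ in particular, can land noticeably away from the true parameters, which is the sampling uncertainty the CI card is meant to convey.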

🙋

The family card reads "Fréchet heavy tail". What is actually different between Gumbel, Fréchet and Weibull?

🎓

They are three sister distributions selected by the sign of ξ. ξ=0 is Gumbel (light tail) — annual maximum temperature, where huge outliers are rare. ξ>0 is Fréchet (heavy tail) — power-law decay for earthquake magnitudes, stock-market crashes, insurance losses where unexpected giants exist. ξ<0 is Weibull (finite upper bound) — truncated at some maximum, like wind speed where physics caps the tail. Slide ξ in the panel and you can watch the curve morph continuously.

🙋

A CI width of 12.6 sounds huge. Does that mean the true 100-year value could really sit anywhere between about 131 and 144?

🎓

Yes, and that is the essential difficulty of extreme-value statistics. You are extrapolating a 100-year value from only 50 years of data, so estimation error is unavoidable. Push N to 200 and the CI shrinks roughly by half (1/√N scaling); push T to 1000 and the CI widens further. That is why a "conservative design level" is sometimes set at the upper 95% confidence limit. Switching from annual maxima to POT sampling can also tighten the CI by extracting more samples per year.

🙋

So in practice you have to decide both "what return period to design for" AND "what mission-period exceedance probability you can tolerate" as a pair.

🎓

Exactly. Seismic design codes for buildings often use a 475-year return period (10% exceedance in 50 years). Nuclear facilities go to 10,000 years. Insurance PML (Probable Maximum Loss) is usually 200 or 250 years. The right return period varies enormously by industry and asset. Use the sliders to hunt for the combination of return period and tolerable mission-period exceedance that fits your own design target.

Real-World Applications

Civil and hydraulic engineering (floods, rainfall, earthquakes): Levees and dams are designed by fitting GEV to historical annual maximum discharges to find the 100-year and 1000-year return flows. Major Japanese rivers commonly use a 200-year return period for the basic design flow, while European practice often relies on the Gumbel approximation (ξ=0). For seismic design the building code standard is a 475-year return period (10% exceedance in 50 years), and nuclear facilities consider events up to 10,000-year return periods.

Structural engineering (wind and wave loads): The design of high-rise buildings, bridges and offshore wind turbines fits GEV to local annual maximum wind speeds or significant wave heights. Typical shape parameters are ξ ≈ −0.1 to 0.0 for wind speed and ξ ≈ 0.0 to 0.1 for wave height. Design loads are derived from the 100-year return value multiplied by a load factor. For typhoons and rogue waves, POT is used to exploit multiple events per year and improve estimation accuracy.

Reliability engineering (component life, warranty period): The weakest-link life of products often follows a Weibull distribution, which is the mirror image of the ξ<0 GEV case applied to minima rather than maxima, and is used to set warranty periods for automotive and aerospace parts. "How many years until 1 out of 100 units fails" is estimated with Weibull to define the warranty window. In reliability engineering the extreme-value framework is therefore more often used for "minimum life (weakest link)" than for "maximum life".

Finance and insurance (VaR, PML): In market-risk management, GEV is fitted to annual maximum daily price drops to assess the tail risk of Value at Risk (VaR). Returns frequently exhibit Fréchet (ξ>0) behaviour, which is precisely why "black swans" exist. Insurers estimate Probable Maximum Loss (PML) at 200-year or 250-year return periods to guide their reinsurance purchasing decisions.

Common Misconceptions and Pitfalls

The biggest pitfall is the intuitive but completely wrong belief that "once-in-100-years = safe for 100 years". The probability of exceeding z_100 during a 30-year service life is about 26%, and about 40% over 50 years. The reason this tool surfaces "Mission-period exceedance probability" as one of its six headline numbers is precisely to break that habit of mind. In design, always decide the return period T and the service life D as a pair, and back-solve for T from the maximum acceptable exceedance probability (for example, 10% over D years).
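Back-solving for T amounts to inverting P = 1 − (1 − 1/T)^D. A minimal Python sketch, with the hypothetical helper name `required_return_period`, assuming independent years:

```python
def required_return_period(p_max, D):
    """Smallest return period T whose D-year exceedance probability is p_max."""
    return 1.0 / (1.0 - (1.0 - p_max) ** (1.0 / D))


# 10% acceptable exceedance over a 50-year service life
print(round(required_return_period(0.10, 50)))  # 475
# 10% acceptable exceedance over a 30-year service life
print(round(required_return_period(0.10, 30)))
```

The first case reproduces the 475-year figure used in seismic design (10% exceedance in 50 years).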

Next on the list is the temptation to fix the shape parameter ξ at zero. Gumbel (ξ=0) is mathematically tractable and its CDF, exp(−exp(−(z−μ)/σ)), fits in a single spreadsheet formula, so it is tempting to fit Gumbel to everything. In reality ξ is often slightly positive (heavier tail), and forcing ξ=0 under-estimates the T-year return level, increasingly so as T grows. For example, with μ and σ held fixed, using ξ=0 when the true value is ξ=0.1 lowers the 1000-year excess over μ from about 9.95σ to about 6.91σ, roughly 30% less. Always estimate ξ as well, and compare the Gumbel and full GEV fits by AIC/BIC.
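The size of the Gumbel shortfall can be explored with a short Python sketch. It holds μ and σ fixed and only switches ξ, which is a simplification: a Gumbel model refitted to the same data would adjust μ and σ and so differ by a different amount. The name `gev_return_level` and the values μ = 50, σ = 15 are assumptions for illustration.

```python
import math


def gev_return_level(T, mu, sigma, xi):
    """T-year return level; xi = 0 gives the Gumbel formula."""
    y = -math.log1p(-1.0 / T)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1.0)


mu, sigma = 50.0, 15.0                       # assumed illustrative values
for T in (100, 1000):
    z_gev = gev_return_level(T, mu, sigma, 0.1)
    z_gum = gev_return_level(T, mu, sigma, 0.0)
    shortfall = 100.0 * (z_gev - z_gum) / z_gev
    print(f"T={T}: GEV {z_gev:.1f}, Gumbel {z_gum:.1f}, shortfall {shortfall:.0f}%")
```

The gap widens with T, which is why fixing ξ=0 matters most for long return periods.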

Finally, keep the danger of extrapolating from N years of data to T (≫N) years firmly in mind. Estimating a 1000-year return level from 30 years of data means extrapolating 33× beyond the data range, and even with the simplified CI in this tool the 95% half-width is already about 0.67σ (1.96·√(0.5·ln 1000 / 30) ≈ 0.67). On top of that, climate-driven non-stationarity (μ and σ drifting in time) can break the stationary-GEV assumption entirely. Modern practice uses non-stationary GEV with time-dependent parameters or spatial extreme-value models that pool data across many sites, which is particularly important now that extreme events appear to be intensifying.
