Q: When should I prefer quantile regression?

Choose QR when (1) you care about the tails — income inequality (the 90/10 ratio), high-end housing prices, peak demand; (2) the data are heavy-tailed or contain outliers, as in financial risk where Value at Risk and Expected Shortfall live at tau = 0.95 to 0.99; (3) the variance is heteroscedastic and you want to see how slopes change across tau (OLS would average everything into a single line); and (4) in climate extremes or survival-time analysis, where the tail behaviour is the question of interest.

Quantile Regression Simulator — Compare with OLS

Q: How is quantile regression different from OLS?

OLS (ordinary least squares) estimates the conditional mean E[Y|X]; quantile regression (QR) estimates the conditional quantile Q_tau(Y|X) directly — the median at tau = 0.5, the 90th percentile at tau = 0.9, and so on. OLS minimises squared errors, while QR minimises the asymmetric pin-ball (check) loss rho_tau(u) = u(tau - 1[u<0]). That lets you see not only the centre of the conditional distribution but also its tails, and gives slope estimates that stay robust under heavy-tailed noise or contaminating outliers.

Q: What is the pin-ball (check) loss?

The pin-ball loss is rho_tau(u) = u (tau - 1[u<0]). It weights positive residuals by tau and negative residuals by (1 - tau), which makes it an asymmetric version of the absolute-error loss. At tau = 0.5 it is symmetric and equals half the absolute error, so minimising it gives least-absolute-deviations (median) regression. At tau = 0.9 it punishes positive errors (points above the fit) nine times as heavily as negative ones, which pushes the fitted line up until about 90 percent of the data lie below it. The minimisation is a linear program; the method was introduced by Koenker and Bassett (1978).
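As a minimal sketch of the definition above, the following pure-Python snippet evaluates the loss and checks, by brute force, that the constant minimising the total loss over a sample is a sample quantile. The helper names (`pinball_loss`, `total_loss`, `best_constant`) and the toy data are illustrative assumptions, not part of the simulator.

```python
def pinball_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def total_loss(c, ys, tau):
    """Sum of pin-ball losses when every y is predicted by the constant c."""
    return sum(pinball_loss(y - c, tau) for y in ys)


def best_constant(ys, tau):
    """Brute-force minimiser; some optimum always sits on a data point."""
    return min(ys, key=lambda c: total_loss(c, ys, tau))


# Toy sample (hypothetical): ten regular values plus one gross outlier.
ys = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100]

print(pinball_loss(2.0, 0.9))    # positive residual, weighted by tau
print(pinball_loss(-2.0, 0.9))   # negative residual, weighted by 1 - tau (roughly 0.2)
print(best_constant(ys, 0.5))    # 6, the sample median; the outlier is ignored
print(best_constant(ys, 0.9))    # 10, the 90th-percentile point
```

The sample mean of the same data is about 14.1, so the outlier drags the squared-error minimiser but leaves both quantiles untouched.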

Q: How efficient is quantile regression compared with OLS?

Under clean normal errors the median regression (tau = 0.5) has an asymptotic efficiency of about 64 percent of OLS (= 2/pi). In this simulator the efficiency ratio SE_OLS / SE_QR sits near 0.8 in that case, so OLS wins on precision when the noise is well behaved. The picture flips for Cauchy-like heavy tails (OLS standard errors blow up while QR stays finite) and for contaminated samples (OLS slopes are dragged by outliers while the QR slope stays close to the truth). Robustness, not raw efficiency, is the reason to reach for QR.
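The 2/pi figure can be checked with a small Monte Carlo experiment. The sketch below uses the intercept-only special case, where OLS reduces to the sample mean and median regression to the sample median; the sample size, replication count and seed are arbitrary assumptions, and the simulated ratio will only approximate the asymptotic value.

```python
import math
import random
import statistics


def draw_normal(rng):
    """One standard Normal noise draw."""
    return rng.gauss(0.0, 1.0)


def sd_of_estimator(estimator, draw, n, reps, rng):
    """Monte Carlo standard deviation of an estimator over repeated samples."""
    estimates = [estimator([draw(rng) for _ in range(n)]) for _ in range(reps)]
    return statistics.pstdev(estimates)


rng = random.Random(0)  # fixed seed, arbitrary choice

se_mean = sd_of_estimator(statistics.fmean, draw_normal, n=101, reps=2000, rng=rng)
se_median = sd_of_estimator(statistics.median, draw_normal, n=101, reps=2000, rng=rng)
ratio = se_mean / se_median

print("simulated SE ratio mean/median:", round(ratio, 3))
print("asymptotic value sqrt(2/pi):   ", round(math.sqrt(2 / math.pi), 3))
print("relative efficiency 2/pi:      ", round(2 / math.pi, 3))
print("sample-size multiplier pi/2:   ", round(math.pi / 2, 3))
```

Squaring the standard-error ratio gives the relative efficiency, and its reciprocal is the factor by which the sample must grow for the median to match the mean's precision.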

Quantile Regression Simulator

Place quantile regression (QR) side by side with ordinary least squares. Move the sample size, the quantile tau, the noise distribution and the outlier share to watch QR and OLS slopes, standard errors and the efficiency ratio update in real time — and feel why QR is the go-to tool when noise is heavy-tailed or contaminated.

Parameters

Sample size N

Larger N shrinks the standard error of the estimate

Quantile τ

τ = 0.5 gives median regression, τ = 0.9 the 90th percentile

True slope β₁

True intercept β₀

Noise std-dev σ

Noise distribution

Shape decides whether QR or OLS wins

Outlier share

Drags OLS; QR holds steady

Results

True slope β₁(τ)

OLS slope

QR slope

QR SE

OLS SE

Efficiency OLS/QR

Scatter + OLS / QR fits

Blue dots: regular data; red dots: outliers. The red line is OLS, the blue lines are QR at several τ. Raising noise or outliers tilts the OLS line while QR stays near the truth.

QR coefficient vs quantile τ

Pin-ball (check) loss vs τ

Theory & Key Formulas

$$\hat\beta(\tau) = \arg\min_\beta \sum_i \rho_\tau\!\left(y_i - x_i^{\mathsf T}\beta\right),\qquad \rho_\tau(u) = u\bigl(\tau - \mathbf{1}[u\lt 0]\bigr)$$

ρ_τ is the pin-ball / check loss. τ = 0.5 gives median regression (least absolute deviations); τ = 0.9 targets the 90th percentile. Because the loss is piecewise linear, the minimisation can be written as a linear program, which solves cleanly even at large scale.

$$\mathrm{SE}_{\hat\beta(\tau)} \;\approx\; \frac{\sqrt{\tau(1-\tau)}}{f\!\bigl(F^{-1}(\tau)\bigr)\sqrt{N}}, \qquad \mathrm{SE}_{\hat\beta_{\mathrm{OLS}}} = \frac{\sigma}{\sqrt{N}}$$

The asymptotic QR standard error is inversely proportional to the density f at the quantile. For Normal noise with τ = 0.5 the ratio OLS/QR is about 0.798 = √(2/π) (QR is ~64% as efficient as OLS); for Cauchy noise the OLS SE diverges, and for contaminated samples it inflates sharply, so QR wins.
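To make the two formulas concrete, the sketch below evaluates them for Normal noise at several τ using only the Python standard library. The function names and the sample size are assumptions chosen for illustration; the simulator's own implementation is not shown in the text.

```python
import math
from statistics import NormalDist


def qr_asymptotic_se(tau, n, sigma=1.0):
    """sqrt(tau (1 - tau)) / (f(F^{-1}(tau)) sqrt(n)) for Normal(0, sigma) noise."""
    noise = NormalDist(0.0, sigma)
    density_at_quantile = noise.pdf(noise.inv_cdf(tau))
    return math.sqrt(tau * (1 - tau)) / (density_at_quantile * math.sqrt(n))


def ols_se(n, sigma=1.0):
    """sigma / sqrt(n), the OLS counterpart quoted above."""
    return sigma / math.sqrt(n)


n = 200  # hypothetical sample size
for tau in (0.1, 0.25, 0.5, 0.75, 0.9):
    ratio = ols_se(n) / qr_asymptotic_se(tau, n)
    print(f"tau={tau:.2f}  SE_QR={qr_asymptotic_se(tau, n):.4f}  OLS/QR={ratio:.3f}")
```

The ratio peaks at τ = 0.5 and falls toward the tails, where the Normal density is thin and the quantile is harder to pin down.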

Quantile Regression — a perspective OLS cannot give

🙋

"Quantile regression" — I have honestly never heard of it. It draws a line through the data just like OLS, so what is actually different?

🎓

Good place to start, because that is the heart of it. OLS — least squares — draws the line that goes through the conditional mean E[Y|X]. Quantile regression draws the line that goes through a conditional quantile: τ = 0.5 is the median, τ = 0.9 is the 90th percentile. From one scatter plot you can fit one line for the "typical" individual and a separate line for the "top 10 percent" individual. That is exactly what economists need for inequality work, what banks need for Value at Risk, and what climatologists need for extreme rainfall.

🙋

I see — and the "QR coefficient vs τ" chart up there means the slope itself depends on τ? So we are really drawing a different line at each τ.

🎓

Exactly. It is regression that exposes the shape of the distribution. Switch the noise distribution to "Heteroscedastic" and watch the chart — slopes get steeper as τ grows. That tells you the spread of Y grows with X; a single OLS line would never reveal that. Educational researchers use this all the time: "does parental income matter more for students at the low end or at the high end of the test-score distribution?"
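One way to see why the slope depends on τ is a linear location-scale model. The scale parameter γ and the exact form of the noise are assumptions made for this sketch (the simulator's "Heteroscedastic" setting may be defined differently), with ε independent of X, ε ~ F, and 1 + γX > 0:

$$Y = \beta_0 + \beta_1 X + \sigma(1 + \gamma X)\,\varepsilon \;\;\Longrightarrow\;\; Q_\tau(Y \mid X) = \bigl(\beta_0 + \sigma F^{-1}(\tau)\bigr) + \bigl(\beta_1 + \sigma\gamma F^{-1}(\tau)\bigr)\,X$$

Under that assumption the slope is β₁(τ) = β₁ + σγF⁻¹(τ), which rises with τ when γ > 0 and collapses to the single value β₁ when γ = 0 (homoscedastic noise).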

🙋

When I raise the outlier slider, the OLS slope drifts away from 1.5 fast, but the QR slope barely moves.

🎓

That is the robustness property. OLS uses squared errors, so a faraway point matters with a quadratic weight — one outlier can twist the whole slope. Median regression (τ = 0.5) minimises absolute residuals, so an outlier counts as a single point and nothing more. The same principle that lets the median ignore the 30 percent worst values lets QR keep the slope honest. Financial VaR and insurance tail estimation simply cannot work without this kind of resistance.
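A small experiment makes the contrast visible. The sketch below is pure Python and rests on several assumptions: the true slope 1.5 comes from the dialogue, while the intercept, sample size, seed and contamination scheme (shifting the responses at the largest x values upward) are illustrative choices. The brute-force `qr_fit` relies on the fact that, with two coefficients, some minimiser of the pin-ball loss passes through two data points; it is not how production solvers work.

```python
import itertools
import random


def ols_fit(xs, ys):
    """Closed-form least-squares intercept and slope."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def qr_fit(xs, ys, tau=0.5):
    """Exact simple quantile regression by scanning all lines through two points.

    Cost grows like n^3, so this is only for small illustrative samples.
    """
    def loss(b0, b1):
        total = 0.0
        for x, y in zip(xs, ys):
            r = y - b0 - b1 * x
            total += r * (tau - (1.0 if r < 0 else 0.0))
        return total

    best = None
    for i, j in itertools.combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # vertical pair, defines no line
        b1 = (ys[j] - ys[i]) / (xs[j] - xs[i])
        b0 = ys[i] - b1 * xs[i]
        candidate = (loss(b0, b1), b0, b1)
        if best is None or candidate < best:
            best = candidate
    return best[1], best[2]


rng = random.Random(1)                 # arbitrary seed
n, beta0, beta1 = 60, 2.0, 1.5         # illustrative values
xs = sorted(rng.uniform(0, 10) for _ in range(n))
ys = [beta0 + beta1 * x + rng.gauss(0, 1) for x in xs]
for k in range(n - 6, n):              # contaminate the 10% with the largest x
    ys[k] += 40.0

_, ols_slope = ols_fit(xs, ys)
_, qr_slope = qr_fit(xs, ys, tau=0.5)
print(f"true slope {beta1}, OLS {ols_slope:.2f}, median regression {qr_slope:.2f}")
```

Each contaminated point enters the median-regression fit only through the sign of its residual, so making the shift larger than 40 changes the OLS slope further but leaves the QR slope essentially where it was.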

🙋

Then is QR strictly better than OLS? It feels like we should just use QR everywhere.

🎓

Tempting but no. Switch the noise back to "Normal" and look at the efficiency ratio OLS/QR: it sits at about 0.8, meaning the QR standard error is larger. Under clean Normal noise OLS is the asymptotically efficient estimator (100%), and median regression is only about 64 percent (= 2/π) as efficient. With QR you need roughly 1.57 times as much data (π/2) to match OLS precision. So the rule of thumb is: clean Normal data → OLS; heavy tails or outliers → QR; want to see the full shape of the conditional distribution → QR run at many τ.

🙋

That is why the efficiency ratio flips once I switch to Cauchy. The loss chart underneath looks like an upside-down parabola — is that the pin-ball loss?

🎓

It is the minimised total pin-ball loss, the sum of ρ_τ(u) = u·(τ − 1[u<0]) over the residuals of the fitted line, plotted across τ. As a function of a single residual, the loss is symmetric at τ = 0.5 (a plain absolute value, up to a factor of one half); for other τ the slopes on the two sides differ. That asymmetry is exactly what lets you target "the top 10 percent" or "the 5 percent VaR" with a single linear program. Koenker and Bassett wrote the founding paper in 1978, and quantile regression became standard equipment in econometrics, climate science and medical statistics. Once the loss function clicks, the way you look at data becomes three-dimensional.

Real-world applications

Income distribution and inequality (economics): The OECD and World Bank track the "90/10 ratio" (income at the 90th percentile divided by income at the 10th) across countries. Running QR with covariates such as education, age and gender produces a separate slope for the working poor, the median earner and the top decile, instead of an aggregated mean. Buchinsky (1994) used QR to show that the U.S. return to education is larger at upper quantiles than at lower ones.

Financial risk management (VaR and Expected Shortfall): Bank trading desks compute the τ = 0.99 (or 0.995) quantile of portfolio losses as Value at Risk every day. Engle and Manganelli's CAViaR (Conditional Autoregressive VaR) extends QR to time series and is a well-known model for the tail risk that the Basel market-risk capital rules target. OLS cannot describe the tail behaviour that regulators care about.

Climate and environmental extremes: When the question is not the average but the upper 5 percent — heatwaves, heavy rainfall, river floods — researchers fit τ = 0.95 quantile regressions on temperature or time covariates to see how warming shifts the high quantiles. Friederichs and Hense (2007) and many follow-up papers made QR a standard tool alongside GEV (generalised extreme-value) distributions in climate science.

Medical statistics and survival analysis: A new drug may be judged not by mean survival but by the 25th-percentile survival of the worst-affected patients. Birth-weight studies focus on the lowest 10 percent rather than the average baby. Quantile regression became standard in medical statistics from the early 2000s precisely because looking only at the mean hides the most at-risk groups.

Common misconceptions and pitfalls

The most common misconception is "QR is always better than OLS". As this simulator shows, under clean Normal noise the asymptotic efficiency of median regression is only about 64 percent (= 2/π) of OLS. The median regression standard error runs about 1.25× the OLS standard error (√(π/2) ≈ 1.253), which means you need roughly 1.57 times as much data (π/2) to match the precision. For clean Normal-looking samples without outliers, OLS is the right default; reach for QR when you have heavy tails, contamination, or want to inspect the full distribution.

The second is the textbook complaint that "the pin-ball loss is not differentiable, so we cannot optimise it with gradient methods". True, ρ_τ(u) has a kink at u = 0, but QR is naturally written as a linear program and solved with the simplex method or interior-point algorithms, as in R's quantreg and scikit-learn's QuantileRegressor (Python's statsmodels takes a different route, iteratively reweighted least squares). You do not need gradient methods, and convexity guarantees that any local optimum is a global one.
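For reference, one standard way to write that linear program splits each residual into its positive and negative parts. The symbols u⁺ and u⁻ are introduced here for illustration; X is the design matrix and y the response vector:

$$\min_{\beta,\,u^+,\,u^-}\; \tau\,\mathbf{1}^{\mathsf T}u^+ + (1-\tau)\,\mathbf{1}^{\mathsf T}u^- \quad\text{subject to}\quad y - X\beta = u^+ - u^-,\qquad u^+ \ge 0,\; u^- \ge 0$$

At an optimum at most one of u⁺ᵢ and u⁻ᵢ is non-zero for each observation, so the objective equals the sum of ρ_τ over the residuals, and the kink at u = 0 never has to be differentiated.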

Third, the quantile-crossing problem. Conditional quantiles are monotone in τ in theory, but in finite samples the fitted line at the 95th percentile can end up below the 90th. That is incoherent as a distribution forecast, so practitioners post-process with constrained joint estimation (Bondell et al., 2010), pointwise sorting, or Chernozhukov-style rearrangement. If you publish five or more quantiles in the same chart, always check for crossings before showing the picture to anyone.
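The simplest of those fixes, pointwise sorting, can be sketched in a few lines. The fitted values below are hypothetical numbers invented for the example, and the helper names are not from any library; constrained joint estimation and rearrangement in the full sense involve more than this.

```python
def find_crossings(quantile_preds):
    """Indices of x points whose fitted quantiles are not non-decreasing in tau."""
    return [i for i, row in enumerate(quantile_preds)
            if any(a > b for a, b in zip(row, row[1:]))]


def rearrange(quantile_preds):
    """Pointwise sorting: at each x, sort the fitted quantiles across tau."""
    return [sorted(row) for row in quantile_preds]


taus = [0.5, 0.9, 0.95]
# Hypothetical fitted values: one row per x point, one column per tau.
preds = [
    [10.0, 14.0, 15.0],
    [12.0, 17.5, 17.0],   # the 0.95 fit falls below the 0.90 fit: a crossing
    [14.0, 20.0, 21.0],
]

print(find_crossings(preds))   # [1]
fixed = rearrange(preds)
print(fixed[1])                # [12.0, 17.0, 17.5]
print(find_crossings(fixed))   # []
```

Running `find_crossings` on a grid of x values before plotting is a cheap way to apply the check recommended above.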
