Stratified Sampling Simulator

A tool for experiencing stratified sampling, a way to boost the accuracy of Monte Carlo integration at no extra sampling cost. Split the interval [0,1] into strata and draw the same number of samples from each, and the random "clumping" of plain Monte Carlo disappears, so the standard error of the estimate drops. Change the strata count, total sample size and integrand to compare against plain Monte Carlo in real time.

Parameters

Number of strata S

Number of equal-width sub-intervals (strata) the interval [0,1] is split into. S=1 equals plain Monte Carlo

Total sample size n

Total number of random points used for the estimate, allocated roughly evenly across strata
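
The page does not say how the remainder is handled when n is not a multiple of S. As a minimal sketch of one plausible "roughly even" rule (the function name `allocate_samples` and the choice to give the extra samples to the first strata are assumptions, not the tool's documented behaviour):

```python
def allocate_samples(n: int, strata: int) -> list[int]:
    """Split n samples across the strata as evenly as possible."""
    if strata < 1 or n < strata:
        raise ValueError("need 1 <= strata <= n so every stratum gets a sample")
    base, extra = divmod(n, strata)
    # Assumption: the first `extra` strata receive one additional sample.
    return [base + 1 if k < extra else base for k in range(strata)]


print(allocate_samples(10, 3))
```

The counts always sum to n and differ by at most one, which keeps the allocation close to proportional for equal-width strata.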

Random seed

Seed of the pseudo-random generator. The same value reproduces the result exactly

Integrand f(x)

The target of the integral I = ∫₀¹ f(x) dx to estimate over [0,1]

Results

Plain MC estimate

Stratified estimate

Exact value

Plain MC std. error

Stratified std. error

Variance reduction (%)
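
How the tool computes these six numbers is not stated on the page. The following is a minimal Python sketch under stated assumptions: standard errors are estimated from the drawn samples, S divides n with at least two samples per stratum (needed to estimate each within-stratum variance), and "variance reduction" is taken to mean 100 × (1 − Var_strat / Var_plain). The name `compare_estimators` is hypothetical.

```python
import math
import random
import statistics


def compare_estimators(f, n, strata, seed=0):
    """Plain and stratified estimates of the integral of f over [0, 1]."""
    if n % strata != 0 or n // strata < 2:
        raise ValueError("this sketch needs strata to divide n, with >= 2 samples each")
    rng = random.Random(seed)

    # Plain Monte Carlo: n uniform points on [0, 1).
    plain_vals = [f(rng.random()) for _ in range(n)]
    plain_est = statistics.fmean(plain_vals)
    plain_var = statistics.variance(plain_vals) / n

    # Stratified: m uniform points inside each equal-width stratum.
    m = n // strata
    w = 1.0 / strata
    strat_est = 0.0
    strat_var = 0.0
    for k in range(strata):
        vals = [f((k + rng.random()) * w) for _ in range(m)]
        strat_est += w * statistics.fmean(vals)
        strat_var += w * w * statistics.variance(vals) / m

    # Assumed definition of the "variance reduction (%)" figure.
    reduction = 100.0 * (1.0 - strat_var / plain_var) if plain_var > 0 else 0.0
    return {
        "plain": plain_est,
        "stratified": strat_est,
        "plain_se": math.sqrt(plain_var),
        "stratified_se": math.sqrt(strat_var),
        "reduction_pct": reduction,
    }


result = compare_estimators(lambda x: x * x, n=1000, strata=10, seed=1)
print(result)
```

For f(x) = x² both estimates should land near the exact value 1/3, with the stratified standard error clearly the smaller of the two.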

Strata boundaries and sample placement — coverage on the f(x) curve

The interval [0,1] is split into strata by vertical dividers, and at least one sample point (cyan) is guaranteed to land in every stratum. The point of stratification is that, unlike plain Monte Carlo, no "clump" or "empty stratum" can appear.

Convergence of the estimate — estimate vs sample count

Standard error vs number of strata

Theory & Key Formulas

$$\hat I_{strat}=\sum_{k=1}^{S} w_k\,\bar f_k,\qquad w_k=\frac{1}{S}$$

Stratified estimator. The interval [0,1] is split into S equal-width strata; each stratum mean f̄_k is weighted by the stratum width w_k and summed. With equal widths w_k = 1/S.
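
The estimator above can be sketched in a few lines of Python. This is an illustration rather than the tool's actual code: the function name, the seed handling and the rule for distributing a remainder across strata are assumptions.

```python
import random


def stratified_estimate(f, n, strata, seed=0):
    """Estimate the integral of f over [0, 1] with equal-width strata."""
    if strata < 1 or n < strata:
        raise ValueError("need 1 <= strata <= n")
    rng = random.Random(seed)
    base, extra = divmod(n, strata)
    width = 1.0 / strata  # w_k = 1/S for every stratum
    total = 0.0
    for k in range(strata):
        m_k = base + (1 if k < extra else 0)  # assumed remainder rule
        left = k * width
        # Uniform points inside stratum k, then the stratum mean f_bar_k.
        mean_k = sum(f(left + width * rng.random()) for _ in range(m_k)) / m_k
        total += width * mean_k
    return total


print(stratified_estimate(lambda x: x * x, n=1000, strata=100, seed=1))
```

With `strata=1` the loop runs once over the whole interval, which is plain Monte Carlo, matching the S = 1 case described under Parameters.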

$$\operatorname{Var}(\hat I_{strat})=\sum_k \frac{w_k^2\,\sigma_k^2}{m_k}\le \operatorname{Var}(\hat I_{plain})$$

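Variance of the stratified estimator. σ_k² is the variance of f within stratum k, m_k is the number of samples allocated to stratum k. The inequality holds under proportional allocation, m_k = n w_k (equal counts for equal-width strata, as used here); in that case the stratified variance never exceeds that of plain Monte Carlo with the same n.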
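
The inequality can be checked in three lines, assuming proportional allocation m_k = n w_k. The symbols μ_k (mean of f within stratum k) and σ² (variance of f over the whole interval) are notation introduced here for the derivation. Plain Monte Carlo has variance σ²/n, and the law of total variance splits σ² into a within-stratum and a between-stratum part:

$$\operatorname{Var}(\hat I_{plain})=\frac{\sigma^2}{n},\qquad \sigma^2=\sum_k w_k\,\sigma_k^2+\sum_k w_k\,(\mu_k-I)^2$$

Substituting m_k = n w_k into the stratified variance leaves only the within-stratum part:

$$\operatorname{Var}(\hat I_{strat})=\sum_k \frac{w_k^2\,\sigma_k^2}{n\,w_k}=\frac{1}{n}\sum_k w_k\,\sigma_k^2$$

The difference is therefore the between-stratum part, which cannot be negative:

$$\operatorname{Var}(\hat I_{plain})-\operatorname{Var}(\hat I_{strat})=\frac{1}{n}\sum_k w_k\,(\mu_k-I)^2\ \ge\ 0$$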

The gain from stratification grows with the number of strata S and with the between-strata variation of the function f.

What is the Stratified Sampling Simulator?

🙋

When you compute an integral with Monte Carlo, you just scatter lots of random numbers and average them, right? How is "stratified sampling" different from that?

🎓

Roughly speaking, it is a way to tidy up how you scatter the random numbers and gain accuracy for free. Ordinary Monte Carlo integration scatters points completely at random inside [0,1]. But fully random points can clump together by chance, or leave a sub-interval completely empty. Stratified sampling divides the interval into equal-width "strata" in advance and puts a fixed number of points into every stratum. So the accident of "no points right here" never happens.

🙋

I see... but if you line the points up, it doesn't feel like random sampling any more. Do you still get the correct integral?

🎓

Good question. You still use randomness inside each stratum. With 10 strata, for instance, stratum 1 gets a uniform random point in [0,0.1], stratum 2 a uniform random point in [0.1,0.2], and so on. Weighting each stratum mean by the stratum width and summing gives the estimator Î_strat = Σ w_k f̄_k. This is an unbiased estimator aimed at the same true integral as plain Monte Carlo: the centre of the target does not move. Only the spread changes. In the f(x) curve plot above, you can see at least one point sitting between every pair of adjacent dividers.

🙋

So the centre of the target is the same and only the spread shrinks. Why does splitting into strata reduce the spread?

🎓

The key is the "decomposition of variance". The total spread splits into two parts: the spread between strata and the spread within strata. Fully random Monte Carlo takes the full hit of the between-strata spread — whether the points happen to lean left or right. Stratified sampling, by drawing from every stratum, cleanly removes that between-strata spread, leaving only the within-stratum spread. The more the function changes from one stratum to the next, the larger the between-strata spread, and the larger the gain from stratification.
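
To put numbers on this decomposition, here is a small Python sketch for the integrand f(x) = x², chosen because its stratum moments have closed forms. The function name `variance_split` is hypothetical, and the closed-form moments are specific to x².

```python
def variance_split(strata):
    """Exact within- and between-strata variance of f(x) = x**2 on [0, 1]."""
    w = 1.0 / strata
    exact = 1.0 / 3.0  # integral of x**2 over [0, 1]
    within = 0.0
    between = 0.0
    for k in range(strata):
        a, b = k * w, (k + 1) * w
        mu_k = (b**3 - a**3) / (3 * w)    # mean of x**2 on [a, b]
        second = (b**5 - a**5) / (5 * w)  # mean of x**4 on [a, b]
        within += w * (second - mu_k**2)
        between += w * (mu_k - exact) ** 2
    return within, between


within, between = variance_split(10)
print(within, between, within + between)
```

The two parts always add up to the total variance of x² on [0,1], which is 1/5 − 1/9 = 4/45. With 10 strata the between-strata part accounts for nearly all of it, and that is the part stratification removes.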

🙋

So the more strata you use, the smaller the error? When I raise the "number of strata" slider, the standard error keeps dropping.

🎓

For smooth integrands, yes. The finer the strata, the narrower each one becomes, and within a narrow stratum the function looks nearly constant. The within-stratum variance σ_k² then shrinks, and the standard error generally falls as S grows. The drop is not guaranteed at every single step, though: a function with a jump at x = 0.5 is captured exactly by S = 2 but not by S = 3. In the extreme, dividing down to one point per stratum gives an almost regular grid, and for a smooth f the standard error then falls like n^(-3/2) instead of the n^(-1/2) of plain Monte Carlo. One caution though: if the number of strata exceeds the total sample count, some strata get no point at all and the allocation breaks. The golden rule is to keep at least one point in every stratum.
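
The slider experiment can be imitated offline by repeating the estimate many times and measuring the spread of the results. The sketch below is a rough Python illustration, not the tool's method: it assumes S divides n, uses f(x) = x², and the names `one_estimate` and `empirical_spread` are hypothetical.

```python
import random
import statistics


def one_estimate(f, n, strata, rng):
    """One stratified estimate; assumes strata divides n."""
    m = n // strata
    w = 1.0 / strata
    return sum(
        w * sum(f((k + rng.random()) * w) for _ in range(m)) / m
        for k in range(strata)
    )


def empirical_spread(f, n, strata, reps=300, seed=0):
    """Mean and standard deviation of repeated stratified estimates."""
    rng = random.Random(seed)
    estimates = [one_estimate(f, n, strata, rng) for _ in range(reps)]
    return statistics.fmean(estimates), statistics.stdev(estimates)


for s in (1, 10, 100):
    mean, spread = empirical_spread(lambda x: x * x, n=100, strata=s)
    print(s, round(mean, 4), spread)
```

The mean stays near 1/3 for every S, while the spread shrinks as S goes from 1 (plain Monte Carlo) to 100 (one point per stratum).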

🙋

If more strata is just better in one dimension, it seems like you should use this for everything. Are there pitfalls?

🎓

The biggest pitfall is the "curse of dimensionality". In one dimension you just split into 10 strata, but in ten dimensions splitting each dimension into 10 gives 10 to the 10th power, that is ten billion strata. Putting one point in every stratum is plainly impossible. So in high dimensions you use Latin hypercube sampling (LHS), which stratifies each dimension separately and pairs the strata up at random, so n points cover all n strata of every dimension without filling the full grid. It is a multidimensional relative of stratified sampling. This tool handles a one-dimensional integral, so it lets you feel the effect of stratified sampling directly. In practice, the same idea drives reliability analysis and financial scenario generation.
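
Latin hypercube sampling is not part of this tool, but the idea fits in a short Python sketch. This is a basic textbook-style construction written for illustration; the function name `latin_hypercube` is hypothetical.

```python
import random


def latin_hypercube(n, dims, seed=0):
    """n points in [0, 1)^dims with one point per stratum in every dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(dims):
        # One jittered value per stratum of this dimension ...
        column = [(k + rng.random()) / n for k in range(n)]
        # ... then a random order, so strata are paired up at random across dimensions.
        rng.shuffle(column)
        columns.append(column)
    return [tuple(column[i] for column in columns) for i in range(n)]


points = latin_hypercube(n=10, dims=3, seed=1)
for p in points[:3]:
    print(p)
```

Ten points cover all ten strata of each of the three dimensions, whereas a full grid would need 10³ = 1000 cells.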

Frequently Asked Questions

Stratified sampling is a variance-reduction technique for Monte Carlo estimation. The sampling interval [0,1] is split in advance into several non-overlapping sub-intervals (strata), and a fixed number of samples is allocated to and drawn from each stratum. With plain Monte Carlo the random points can clump together by chance or leave a sub-interval completely empty, but stratified sampling guarantees that every stratum receives samples. This gives an even picture of the integrand and lowers the standard error of the estimate.

The total variance can be decomposed into within-stratum variance and between-stratum variance (the law of total variance). Because stratified sampling draws a fixed share of the samples from every stratum, under proportional allocation it removes the between-stratum variance entirely, leaving only the within-stratum variance. The more the function f changes from one stratum to the next, the larger the between-stratum variance, and the larger the gain from stratification. Conversely, if f is nearly constant within each stratum, the within-stratum variance that remains is close to zero, so almost all of the variance is removed. Even with proportional allocation (equal counts for equal-width strata) rather than the optimal one, stratified sampling never increases variance above plain Monte Carlo.

In principle, more strata means narrower strata and smaller within-stratum variance, so the standard error generally decreases as the number of strata grows. In the extreme, splitting down to one sample per stratum gives a near-regular grid and, for a smooth integrand, the smallest error. In practice, however, if the number of strata exceeds the total sample count, some strata receive no sample at all and the allocation breaks down. In many dimensions there is also the curse of dimensionality, where the number of strata explodes exponentially. This tool is one-dimensional, so more strata is usually favourable, but keep at least one sample per stratum.

Stratified sampling is the general idea of dividing each dimension (or the target space) into strata and drawing from every stratum. In one dimension it simply means splitting the interval evenly. In many dimensions, filling every cell of the full grid makes the number of strata explode, so Latin hypercube sampling (LHS) is the clever multidimensional version: it stratifies each dimension and randomly combines the strata so that each stratum of each dimension is used exactly once. This tool handles a one-dimensional integral, so it lets you experience the most basic form of stratified sampling itself.

Real-World Applications

CAE and probabilistic design (Monte Carlo FEM): In probabilistic design for structural or fluid analysis that accounts for scatter in material properties, loads and dimensions, many samples are drawn from the input probability distributions and the analysis is run for each. Because each run is expensive, stratifying the input space and guaranteeing draws from every stratum can cut the number of runs needed for the same confidence-interval width. Stratification works especially well when the response varies monotonically with the input.

Financial engineering and risk scenario generation: When evaluating a portfolio's profit-and-loss distribution or VaR (value at risk) with Monte Carlo, the distributions of risk factors such as equity returns are split into strata and scenarios are generated evenly from each. Fully random sampling may happen to produce only a few "extreme downturn" scenarios, but stratification reliably covers the tail region too, stabilizing the tail-risk estimate.
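
As a rough illustration of stratified scenario generation, the Python sketch below stratifies the probability axis [0,1] and maps each stratum through an inverse CDF. The normal distribution for the risk factor, the parameter defaults and the name `stratified_normal_scenarios` are all assumptions made for the example; real risk models use their own distributions.

```python
import random
from statistics import NormalDist


def stratified_normal_scenarios(n, mu=0.0, sigma=1.0, seed=0):
    """One scenario from each of n equal-probability strata of a normal law."""
    rng = random.Random(seed)
    dist = NormalDist(mu, sigma)
    scenarios = []
    for k in range(n):
        u = (k + rng.random()) / n         # uniform inside stratum k of [0, 1)
        u = min(max(u, 1e-12), 1 - 1e-12)  # inv_cdf needs 0 < u < 1
        scenarios.append(dist.inv_cdf(u))
    return scenarios


scenarios = stratified_normal_scenarios(100, seed=1)
print(scenarios[0], scenarios[-1])
```

With 100 strata, the first and last scenarios always fall in the lowest and highest 1% of the distribution, so both tails are represented in every run.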

Survey and statistical sample design: Stratified sampling originated in sample selection for statistical surveys. The population is split into strata by age, region, occupation and so on, and a set number of respondents is drawn from each, preventing a particular stratum from being missed by chance and biasing the estimate. "Stratified sampling" is one of the absolute basics of survey statistics, and this tool's numerical-integration version extracts its mathematical core in one dimension.

Rendering and computer graphics: Photorealistic rendering (path tracing) uses Monte Carlo integration for sampling within a pixel and for sampling light directions. Fully random sampling makes noise (graininess) conspicuous, so "jittered sampling" — stratifying a pixel into small cells and taking one sample from each — is standard, yielding smooth images with fewer samples.
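
Jittered sampling is the two-dimensional analogue of the one-dimensional strata in this tool. A minimal Python sketch, with the unit square standing in for a pixel and the hypothetical name `jittered_pixel_samples`:

```python
import random


def jittered_pixel_samples(grid, seed=0):
    """One uniform sample in each cell of a grid x grid split of the unit square."""
    rng = random.Random(seed)
    return [
        ((i + rng.random()) / grid, (j + rng.random()) / grid)
        for j in range(grid)
        for i in range(grid)
    ]


samples = jittered_pixel_samples(4, seed=1)
print(len(samples), samples[0])
```

A 4 × 4 grid gives 16 samples, each confined to its own cell, so no region of the pixel is left unsampled.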

Common Misconceptions and Pitfalls

The biggest misconception is that stratified sampling changes the estimate itself and introduces bias. The truth is the opposite: the stratified estimator Î_strat is an unbiased estimator aimed at the same true integral as the plain Monte Carlo estimator. Each stratum mean is an unbiased estimate of the integral within that stratum, and as long as it is correctly weighted by the stratum width and summed, stratification introduces no bias. In this simulator too, the plain MC estimate and the stratified estimate both aim at the exact value (for example 1/3 for f(x)=x²); only the spread changes. What shrinks is the error, not the centre of the target.

The next misconception is that equal allocation is always optimal. This tool uses "proportional (equal) allocation", giving the same number of samples to every equal-width stratum. This is simple to implement and always at least as accurate as plain Monte Carlo, but it is not optimal. The variance-minimizing choice is "Neyman allocation", which sets m_k proportional to w_k σ_k and so gives more samples to strata with larger within-stratum spread σ_k, concentrating points where the function changes sharply. When you need that last squeeze of accuracy in practice, Neyman allocation or adaptive stratification is worth considering. That said, equal allocation over equal-width strata never does worse than plain Monte Carlo.
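
Neyman allocation is not implemented in this tool, but the rule is easy to sketch in Python. The version below is one possible rounding scheme, not a canonical one: it reserves one sample per stratum first (to respect the "at least one point" rule), shares the rest in proportion to w_k σ_k, and hands leftovers to the largest fractional parts. The within-stratum spreads σ_k are taken as given; in practice they would come from a pilot run. The name `neyman_allocation` is hypothetical.

```python
def neyman_allocation(sigmas, weights, n):
    """Integer sample counts roughly proportional to w_k * sigma_k."""
    strata = len(sigmas)
    if n < strata:
        raise ValueError("need at least one sample per stratum")
    scores = [w * s for w, s in zip(weights, sigmas)]
    if sum(scores) == 0:
        scores = list(weights)  # no spread anywhere: fall back to proportional
    total = sum(scores)
    spare = n - strata          # samples left after reserving one per stratum
    ideal = [spare * sc / total for sc in scores]
    counts = [1 + int(x) for x in ideal]
    # Hand out what rounding left over, largest fractional part first.
    leftover = n - sum(counts)
    order = sorted(range(strata), key=lambda k: ideal[k] - int(ideal[k]), reverse=True)
    for k in order[:leftover]:
        counts[k] += 1
    return counts


print(neyman_allocation([1.0, 3.0], [0.5, 0.5], 10))
```

With two equal-width strata whose spreads are 1 and 3, the second stratum receives the larger share of the ten samples.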

Finally, the misconception that stratification can be used as-is in any number of dimensions. In one dimension you just split the interval into S strata, but in d dimensions, splitting each dimension into S and filling the full grid makes the number of strata explode to S to the power of d. Ten dimensions with ten strata each gives ten billion strata, and putting one point in every stratum is realistically impossible. To avoid this curse of dimensionality, high-dimensional problems use Latin hypercube sampling (LHS) or low-discrepancy sequences (quasi-Monte Carlo). The plain effect of stratified sampling shows up most clearly on low-dimensional problems like this tool. Remember that the higher the dimension, the more design effort the stratification needs.
