Latin Hypercube Sampling (LHS)
Theory and Physics
Basic Principles of LHS
Professor, what makes Latin Hypercube Sampling better than ordinary random sampling? Just hearing the name makes it sound incredibly difficult...
The name sounds fancy, but what it does is simple. It evenly divides the range of each input variable into $N$ intervals and selects one point from each interval. With random sampling, points can coincidentally cluster in certain regions—a phenomenon called clumping—but LHS structurally prevents this.
Huh, does just dividing evenly have that much effect?
It makes a big difference. Compared to the Monte Carlo method with the same sample size, variance is often reduced by 30-50%. This means you can obtain accurate statistical information with fewer simulation runs. In CAE, a single analysis can sometimes take hours, so being able to reduce the sample count is a huge computational cost advantage.
I see! For example, in what situations is it used?
There are three typical use cases.
- Uncertainty Quantification (UQ): How variability in material properties affects stress distribution
- Sensitivity Analysis: Identifying which input parameter has the greatest effect on the output
- Initial Point Placement for DOE (Design of Experiments): Determining training data points when building surrogate models (e.g., Kriging)
For example, in automotive crash analysis, when there is variability in three variables—sheet thickness, yield stress, and friction coefficient—placing 50 sample points using LHS can estimate the distribution of the Head Injury Criterion (HIC) with accuracy comparable to 100 Monte Carlo points.
Mathematics of Stratified Sampling
How would you write "dividing evenly" in mathematical terms?
Consider generating $N$ samples for $d$-dimensional input variables $\mathbf{X} = (X_1, X_2, \ldots, X_d)$. Let $F_j$ be the cumulative distribution function (CDF) of each variable $X_j$. The core of LHS is stratifying the marginal distribution of each variable into $N$ equiprobable strata.
The $k$-th stratum ($k = 1, 2, \ldots, N$) of the $j$-th variable is defined by the interval:

$$\mathcal{S}_{j,k} = \left( F_j^{-1}\!\left(\frac{k-1}{N}\right),\; F_j^{-1}\!\left(\frac{k}{N}\right) \right]$$

Because the strata are cut at equal quantiles, each one carries the same probability mass, which is why taking exactly one point per stratum preserves the marginal distribution:

$$P\!\left(X_j \in \mathcal{S}_{j,k}\right) = \frac{k}{N} - \frac{k-1}{N} = \frac{1}{N}$$

Specifically, the position within the stratum is determined by a uniform random number $U_{j,k} \sim \text{Uniform}(0,1)$, and the $k$-th sample value is generated by:

$$x_{j,k} = F_j^{-1}\!\left( \frac{\pi_j(k) - 1 + U_{j,k}}{N} \right)$$
Here, $\pi_j$ is a random permutation of $\{1, 2, \ldots, N\}$, generated independently for each variable. This permutation randomizes the combination of strata across variables.
So it's like creating "even shelves" and then randomly picking a point within each shelf?
That image is perfect. It's easiest to see in two dimensions: picture an $N \times N$ chessboard and place exactly one point in each row and each column; this is the Latin square condition. The name comes from the Latin squares studied by the mathematician Leonhard Euler, arrangements in which each symbol appears exactly once in every row and every column.
Proof of Variance Reduction Effect
Can you explain mathematically why variance decreases with LHS?
Let's compare the variance of the expected-value estimator $\bar{Y}_N = \frac{1}{N}\sum_{i=1}^{N} g(\mathbf{x}^{(i)})$ for the output $Y = g(\mathbf{X})$. According to the classical result of McKay, Beckman & Conover (1979), with the asymptotic form made precise by Stein (1987):

$$\operatorname{Var}\!\left(\bar{Y}_N^{\mathrm{LHS}}\right) = \frac{\sigma^2}{N} \;-\; \frac{1}{N}\sum_{j=1}^{d}\operatorname{Var}\!\left[g_j(X_j)\right] \;+\; o\!\left(\frac{1}{N}\right)$$

where $\sigma^2 = \operatorname{Var}[g(\mathbf{X})]$ and $g_j$ is the additive (main-effect) component of $g$ with respect to $X_j$. The first term $\sigma^2 / N$ is exactly the variance of simple Monte Carlo. The second term is the variance reduction due to LHS; it becomes larger when $g$ has a large additive component, which is typically the case when $g$ is monotonic in each input variable. In other words:

$$\operatorname{Var}\!\left(\bar{Y}_N^{\mathrm{LHS}}\right) \;\leq\; \operatorname{Var}\!\left(\bar{Y}_N^{\mathrm{MC}}\right) = \frac{\sigma^2}{N} \quad \text{(asymptotically)}$$
Equality holds only when $g$ is completely non-additive (purely interactive) with respect to all input variables. In actual CAE problems, an additive component almost always exists, so LHS is always more advantageous than Monte Carlo.
I see, so the more "well-behaved" the input-output relationship is, the greater the benefit of LHS.
Exactly. Stress responses in structural analysis are generally monotonic with respect to variables, so the variance reduction effect is extremely effective. On the other hand, for chaotic systems (like instantaneous values in turbulence), the benefit may be smaller. However, theoretically, LHS never performs worse than Monte Carlo, so using LHS by default is a safe bet.
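A quick numerical check of this effect can be sketched with SciPy's LatinHypercube sampler (introduced in the implementation section below); the test function, dimensions, and repetition count are illustrative assumptions, not values from a real CAE model:

```python
import numpy as np
from scipy.stats.qmc import LatinHypercube

# Mostly additive test function on [0,1]^3 (an illustrative choice)
def g(x):
    return x[:, 0] + 2.0 * x[:, 1] + 0.5 * x[:, 0] * x[:, 2]

d, N, reps = 3, 50, 500
rng = np.random.default_rng(0)
mc_means, lhs_means = [], []
for r in range(reps):
    # Plain Monte Carlo estimate of the mean
    mc_means.append(g(rng.random((N, d))).mean())
    # Basic LHS estimate of the mean with the same sample size
    lhs_means.append(g(LatinHypercube(d=d, seed=r).random(N)).mean())

print("Var of MC  mean estimator:", np.var(mc_means))
print("Var of LHS mean estimator:", np.var(lhs_means))
# The LHS estimator variance is typically several times smaller here
```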
Space-Filling Criteria
Can it happen that even with LHS sampling, the points feel somewhat clustered?
Yes, it can. Basic LHS guarantees stratification of marginal distributions but does not guarantee uniformity in the multidimensional space. For example, with two variables, it's possible to have points concentrated in the upper right and lower left, leaving the middle sparse. That's why we use space-filling criteria to optimize the placement.
Representative space-filling criteria are the following three:
1. Maximin Distance Criterion
Maximizes the minimum distance among all sample point pairs:

$$\max_{\text{design}} \;\; \min_{1 \le i < k \le N} \left\| \mathbf{x}^{(i)} - \mathbf{x}^{(k)} \right\|_p$$
Here, $\| \cdot \|_p$ is the $L_p$ norm (usually Euclidean distance with $p=2$). Intuitively, it seeks a configuration that "spreads points as far apart as possible."
2. $\phi_p$ Criterion based on Maximin Distance
A continuous relaxation by Morris & Mitchell (1995):

$$\phi_p = \left( \sum_{1 \le i < k \le N} \left\| \mathbf{x}^{(i)} - \mathbf{x}^{(k)} \right\|^{-p} \right)^{1/p} \;\rightarrow\; \min$$

As $p$ increases, minimizing $\phi_p$ asymptotically approaches the maximin criterion; values of $p$ between 15 and 50 are practical. The advantage is that $\phi_p$ is differentiable, allowing gradient-based optimization.
3. Discrepancy Criterion
Measures the deviation of the point set from a uniform distribution:

$$D_N^{*} = \sup_{\mathbf{t} \in [0,1]^d} \left| \frac{\#\{\, i : \mathbf{x}^{(i)} \in [\mathbf{0}, \mathbf{t}) \,\}}{N} - \prod_{j=1}^{d} t_j \right|$$
The smaller this star discrepancy $D_N^{*}$ is, the closer the configuration is to a uniform distribution. It is directly linked to the error bound of numerical integration via the Koksma-Hlawka inequality.
In practice, which should I use, maximin distance or discrepancy?
For building surrogate models (Kriging, RBF, etc.), maximin distance is the standard because prediction accuracy depends on the distance between the prediction point and its nearest training point. On the other hand, if the goal is numerical integration (estimation of expected value), discrepancy is theoretically more correct. If in doubt, choosing maximin distance won't lead you far astray.
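Both criteria are easy to evaluate for a given design; the following sketch uses SciPy's discrepancy function and a pairwise-distance call (the design size and seed are arbitrary illustrative choices):

```python
from scipy.spatial.distance import pdist
from scipy.stats.qmc import LatinHypercube, discrepancy

# An example 30-point LHS design in 3 dimensions
design = LatinHypercube(d=3, seed=1).random(30)

# Maximin criterion: smallest pairwise Euclidean distance (larger is better)
min_dist = pdist(design).min()

# Centered L2 discrepancy (smaller is better); the sup-norm star discrepancy
# D_N^* is expensive to evaluate, so an L2-type discrepancy is the usual
# practical substitute available in SciPy
cd = discrepancy(design, method="CD")

print(f"min pairwise distance   = {min_dist:.3f}")
print(f"centered L2 discrepancy = {cd:.5f}")
```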
Correlation Control
Can unintended correlations arise between variables generated by LHS?
Sharp question. Since the random permutations are applied independently to each variable, spurious correlations can arise by chance between variables that should be independent. When $N$ is small, sample correlations of a few tenths can easily appear. The method to prevent this is the Iman & Conover (1982) correlation control method.
The core of the algorithm is as follows:
- Set the target correlation matrix $\mathbf{C}_{\text{target}}$ (if independent, use the identity matrix $\mathbf{I}$)
- For the rank matrix $\mathbf{R}$ from LHS, calculate the current rank correlation matrix $\mathbf{C}_R$
- Perform Cholesky decomposition $\mathbf{C}_{\text{target}} = \mathbf{L}\mathbf{L}^T$
- Rearrange the ranks using the correction matrix $\mathbf{Q} = \mathbf{L} \cdot \mathbf{C}_R^{-1/2}$
The rank correlation matrix after correction approximately matches the target correlation matrix:

$$\mathbf{C}_{\text{corrected}} \approx \mathbf{C}_{\text{target}}$$
So it can also be used when input variables are correlated (e.g., Young's modulus and Poisson's ratio are correlated)!
Exactly. Being able to sample while reflecting correlations in material properties is extremely important in practice. It's often used in combination with Nataf transformation or Copulas.
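As a minimal sketch of the Iman-Conover idea (the function name and the use of van der Waerden scores are illustration choices, not prescribed by the step list above):

```python
import numpy as np
from scipy.stats import norm

def iman_conover(samples, target_corr, seed=None):
    """Reorder each column of `samples` (N x d) so that its rank correlation
    approximately matches `target_corr` (after Iman & Conover, 1982)."""
    rng = np.random.default_rng(seed)
    n, d = samples.shape
    # Reference score matrix: van der Waerden scores, shuffled per column
    scores = norm.ppf(np.arange(1, n + 1) / (n + 1))
    R = np.column_stack([rng.permutation(scores) for _ in range(d)])
    # Cholesky factors of the current and target correlation matrices
    Q = np.linalg.cholesky(np.corrcoef(R, rowvar=False))
    L = np.linalg.cholesky(target_corr)
    # Transform the scores so that their correlation becomes ~ target_corr
    R_star = R @ np.linalg.inv(Q).T @ L.T
    # Reorder each sample column to follow the ranks of the transformed scores
    out = np.empty_like(samples)
    for j in range(d):
        ranks = np.argsort(np.argsort(R_star[:, j]))  # 0-based ranks
        out[:, j] = np.sort(samples[:, j])[ranks]
    return out

# Example: impose a 0.7 rank correlation between the first two of three inputs
# target = np.array([[1.0, 0.7, 0.0], [0.7, 1.0, 0.0], [0.0, 0.0, 1.0]])
# correlated = iman_conover(lhs_samples, target)
```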
Algorithms and Implementation
Basic LHS Generation Algorithm
What specific steps should I follow to actually generate LHS in a program?
It's simple. For $N$ samples, $d$ variables:
- For each variable $j = 1, \ldots, d$, generate a random permutation $\pi_j$ of $\{1, 2, \ldots, N\}$
- For each sample $i$ and each variable $j$, generate a uniform random number $u_{ij} \sim U(0,1)$
- Calculate coordinates on the unit hypercube: $s_{ij} = \frac{\pi_j(i) - 1 + u_{ij}}{N}$
- Transform to the desired distribution via inverse CDF: $x_{ij} = F_j^{-1}(s_{ij})$
Step 3 is the core of LHS, where $\pi_j(i) - 1$ determines the stratum and $u_{ij}$ randomly selects the position within the stratum.
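A minimal NumPy sketch of these four steps (the function name lhs_basic is illustrative; the ready-made SciPy generator is shown in the implementation example below):

```python
import numpy as np
from scipy.stats import norm

def lhs_basic(N, d, seed=None):
    """Basic LHS on the unit hypercube [0, 1]^d (steps 1-3 above)."""
    rng = np.random.default_rng(seed)
    # Step 1: one independent random permutation per variable
    # (0-based strata, i.e. this already encodes pi_j(i) - 1)
    strata = np.column_stack([rng.permutation(N) for _ in range(d)])
    # Step 2: uniform random offsets within each stratum
    u = rng.random((N, d))
    # Step 3: coordinates on the unit hypercube
    return (strata + u) / N

# Step 4: inverse-CDF transform to the desired marginal, e.g. standard normal
s = lhs_basic(N=50, d=3, seed=42)
x1 = norm.ppf(s[:, 0])
```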
It's simpler than I thought! What's the computational complexity?
Generating basic LHS is $O(Nd)$, the same as Monte Carlo. Optimized LHS (maximin distance criterion) adds an optimization loop, so it becomes $O(N^2 d \cdot T)$ (where $T$ is the number of iterations), but the cost of sample generation is negligible compared to the cost of a single CAE analysis run.
Optimized LHS
How do you maximize the maximin distance? Exhaustive search is impossible, right?
Exhaustive search has $(N!)^d$ combinations, so of course it's impossible. Practical methods include:
- ESE Method (Enhanced Stochastic Evolutionary): Improves the design by exchanging pairs of elements within a column, which preserves the LHS structure. Proposed by Jin et al. (2005) and relatively easy to implement.
- Simulated Annealing: Proposed by Morris & Mitchell (1995). Can escape local minima.
- Genetic Algorithm: Effective for large-scale problems but computationally expensive.
- Column Pairwise Exchange: Fast algorithm by Viana et al. (2009). Optimizes two variables at a time.
In practice, the ESE method or simulated annealing are standard; see the sketch below for the basic element-exchange idea. Ready-made optimized LHS generators are also available in libraries such as pyDOE and SciPy.
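The element-exchange idea behind these methods can be illustrated with a simple greedy local search (the function name and loop settings are illustrative assumptions; this is far simpler than the full ESE or annealing algorithms):

```python
import numpy as np
from scipy.spatial.distance import pdist

def maximin_swap_improve(X, n_iter=2000, seed=None):
    """Improve an LHS design X (N x d) by random within-column swaps that
    do not decrease the minimum pairwise distance. Swapping two entries of
    the same column preserves the LHS (one-point-per-stratum) property."""
    rng = np.random.default_rng(seed)
    X = X.copy()
    N, d = X.shape
    best = pdist(X).min()
    for _ in range(n_iter):
        j = rng.integers(d)
        i1, i2 = rng.choice(N, size=2, replace=False)
        X[[i1, i2], j] = X[[i2, i1], j]        # try swapping two entries
        dmin = pdist(X).min()
        if dmin >= best:
            best = dmin                        # keep the non-worsening swap
        else:
            X[[i1, i2], j] = X[[i2, i1], j]    # revert
    return X
```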
Python Implementation Example
How do you write it in Python?
Using SciPy's qmc.LatinHypercube is the current standard. An optimized LHS can also be generated in one line (note that SciPy's built-in options optimize a discrepancy measure or apply Lloyd's algorithm rather than the maximin criterion).
```python
import numpy as np
from scipy.stats.qmc import LatinHypercube
from scipy.stats import norm, uniform

# --- Basic LHS (5 variables, 50 samples) ---
d, N = 5, 50
sampler = LatinHypercube(d=d, seed=42, optimization="random-cd")
unit_samples = sampler.random(n=N)  # LHS on [0,1]^d

# --- Transform to target distribution via inverse CDF ---
# Variable 1: Normal distribution N(210e3, 5e3) ← Young's modulus [MPa]
# Variable 2: Normal distribution N(0.3, 0.02) ← Poisson's ratio
# Variable 3: Uniform distribution U(1.5, 2.5) ← Sheet thickness [mm]
x1 = norm.ppf(unit_samples[:, 0], loc=210e3, scale=5e3)
x2 = norm.ppf(unit_samples[:, 1], loc=0.3, scale=0.02)
x3 = uniform.ppf(unit_samples[:, 2], loc=1.5, scale=1.0)
# ... and so on
```
What is optimization="random-cd"?
It's an optimization option that lowers the centered $L_2$ discrepancy through random coordinate swaps. There is also "lloyd", which spreads the points using Lloyd's algorithm. The default (None) gives basic LHS without optimization. In practice, always include an optimization option because raw LHS can sometimes have poor space-filling properties.
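The effect of the option can be checked directly by comparing the discrepancy of an unoptimized and an optimized design of the same size (an illustrative check; the sizes and seed are arbitrary):

```python
from scipy.stats.qmc import LatinHypercube, discrepancy

plain = LatinHypercube(d=5, seed=0).random(50)
opt = LatinHypercube(d=5, seed=0, optimization="random-cd").random(50)
# The optimized design should show a lower centered discrepancy
print(discrepancy(plain), discrepancy(opt))
```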
Convergence and Determining Sample Size
Is the convergence rate of LHS faster than Monte Carlo?
In terms of order, both are $O(1/\sqrt{N})$, but the constant term differs. LHS has a smaller constant, so it requires fewer samples to reach the same error level. Let's summarize specific guidelines in a table.
| Application | Dimension $d$ | Recommended Sample Count $N$ | Notes |
|---|---|---|---|
| Mean/Standard Deviation Estimation | Any | $N \geq 10d$ | Bare-minimum guideline |
| Probability Distribution Estimation (Histogram) | $\leq 10$ | $N \geq 50d$ | Ensuring tail accuracy |
| Surrogate Model Construction (Kriging) | $\leq 20$ | $N = 10d$ to $50d$ | Optimize with space-filling criteria |
| Sobol Sensitivity Index Estimation | Any | $N \geq 100(d+1)$ | Used in conjunction with Saltelli method |
Practical Guide
LHS Application Flow in CAE Analysis
Please explain the overall flow of using LHS in an actual CAE project.
The typical flow has 5 steps:
- Selection of Input Variables: Decide which parameters to vary (material properties, geometric dimensions, load conditions, etc.). Variables with low sensitivity are screened out beforehand.