Monte Carlo Simulation — A Fundamental Method for CAE Uncertainty Quantification
Theory and Physics
Overview — What is the Monte Carlo Method?
Professor, does the Monte Carlo method involve repeating simulations with random numbers? How many times do you need to run it?
Good question. Roughly speaking, it's a method that randomly samples from the probability distributions of input parameters and repeatedly executes FEM or CFD analyses hundreds or thousands of times. The goal is to statistically determine the variation (distribution) of the output.
Thousands of times!? Isn't the computational cost huge?
Yes, and that is the biggest weakness of the Monte Carlo method. Its greatest strength, however, is that the convergence rate does not change regardless of the dimensionality of the input variables. With a full-factorial (grid-based) exploration, the cost explodes exponentially once you have 10 variables, but Monte Carlo converges at the same pace whether the problem has 10 dimensions or 100.
Independent of dimensionality... That's amazing. Specifically, when is it used?
A typical case is evaluating how the variation in the material strength of an automotive component (coefficient of variation CoV ≈ 5%) affects its fatigue life. When multiple parameters such as Young's modulus, yield stress, plate thickness, and load amplitude vary simultaneously, investigating how their combined effects propagate to the output is the forte of the Monte Carlo method.
Monte Carlo Simulation (MCS) is a method that repeatedly executes a deterministic model (FEM, CFD, etc.) with randomly sampled input variables to estimate the statistical properties (mean, variance, distribution shape, confidence interval) of the output. The name originates from the Monte Carlo district of Monaco, famous for its casinos, because the method relies on random numbers.
Basic Principles and Convergence Rate
How is the convergence rate determined? You said earlier it's "independent of dimensionality"...
The core of the Monte Carlo method lies in the Law of Large Numbers and the Central Limit Theorem. We independently draw $N$ samples $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_N$ from the input distribution and evaluate the model $Y = f(\mathbf{X})$ for each. Then the sample mean is:

$$\bar{Y} = \frac{1}{N} \sum_{i=1}^{N} f(\mathbf{X}_i) = \frac{1}{N} \sum_{i=1}^{N} Y_i$$
The standard error of this sample mean is expressed by the following formula:

$$\text{SE}(\bar{Y}) = \frac{\sigma_Y}{\sqrt{N}} \approx \frac{\hat{\sigma}_Y}{\sqrt{N}}$$
Ah, you divide by $\sqrt{N}$! So, increasing the sample number reduces the error, but only by the square root...
Exactly. This is the famous $1/\sqrt{N}$ convergence rule. The important points are:
- To halve the error, you need 4 times the samples
- To reduce the error to 1/10, you need 100 times the samples
- This convergence rate does not depend on the dimensionality $d$ of the input variables (it is not affected by the curse of dimensionality)
The sample variance estimator is similarly defined:

$$\hat{\sigma}_Y^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left( Y_i - \bar{Y} \right)^2$$
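As a minimal sketch of these estimators (assuming Python with NumPy; `model` is a hypothetical stand-in for the actual FEM/CFD run, and the input distributions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def model(E, sigma_y):
    """Toy stand-in for an expensive FEM/CFD run (hypothetical response)."""
    return 1000.0 * sigma_y / E ** 0.5

N = 1000
# Sample the uncertain inputs (normal Young's modulus, lognormal yield stress)
E = rng.normal(loc=210e3, scale=0.03 * 210e3, size=N)            # MPa, CoV = 3%
sigma_y = rng.lognormal(mean=np.log(350.0), sigma=0.07, size=N)  # MPa

Y = model(E, sigma_y)

mean_Y = Y.mean()             # sample mean
std_Y = Y.std(ddof=1)         # unbiased sample standard deviation (N - 1 divisor)
se_Y = std_Y / np.sqrt(N)     # standard error of the mean, sigma_Y / sqrt(N)

print(f"mean = {mean_Y:.3f}, std = {std_Y:.3f}, SE = {se_Y:.4f}")
```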
Confidence Interval Estimation
When writing results in a report, just saying "the mean is about this much" isn't enough, right? How do you get the confidence interval?
Thanks to the Central Limit Theorem, if $N$ is sufficiently large (in practice, from about $N \geq 30$), the sample mean can be approximated by a normal distribution. Therefore, the 95% confidence interval can be written as:

$$\bar{Y} - 1.96\,\frac{\hat{\sigma}_Y}{\sqrt{N}} \;\leq\; \mu_Y \;\leq\; \bar{Y} + 1.96\,\frac{\hat{\sigma}_Y}{\sqrt{N}}$$
1.96 is the two-sided 95% quantile of the standard normal distribution, right? Concretely, if you run it 1,000 times, what kind of accuracy do you get?
Let's calculate the relative estimation accuracy (Relative Accuracy). If the coefficient of variation of the output is $\text{CoV}_Y = \sigma_Y / \mu_Y$:

$$\text{RA} = \frac{1.96\,\sigma_Y / \sqrt{N}}{\mu_Y} = \frac{1.96\,\text{CoV}_Y}{\sqrt{N}}$$
For example, when the output's coefficient of variation is $\text{CoV}_Y = 0.10$ (10%):
- $N = 100$: $\text{RA} = 1.96 \times 0.10 / \sqrt{100} = 1.96\%$
- $N = 1{,}000$: $\text{RA} = 1.96 \times 0.10 / \sqrt{1000} \approx 0.62\%$
- $N = 10{,}000$: $\text{RA} \approx 0.20\%$
A practical guideline: with 1,000 runs, the half-width of the 95% confidence interval is about 6% of the mean value when $\text{CoV}_Y \approx 1.0$, and it shrinks in proportion to $\text{CoV}_Y$ (about 0.6% at $\text{CoV}_Y = 0.10$, as above).
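This relationship can also be inverted to estimate how many runs are needed for a target accuracy before launching the study. A small sketch (assuming Python/NumPy; the target values are illustrative):

```python
import numpy as np

def required_samples(cov_y, target_ra, z=1.96):
    """Samples needed so the 95% CI half-width is target_ra times the mean."""
    return int(np.ceil((z * cov_y / target_ra) ** 2))

# Example: output CoV of 10%, mean wanted to within +/-1% -> 385 runs
print(required_samples(cov_y=0.10, target_ra=0.01))
```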
How to Determine the Number of Samples
So in practice, how many times should you run it? Is 100 times enough, or do you need 10,000...?
It's completely different depending on the purpose. Roughly summarized, it looks like this:
| Purpose | Guideline for Required Sample Number | Reason |
|---|---|---|
| Mean / Standard Deviation Estimation | 100–1,000 | Sufficient convergence by $1/\sqrt{N}$ rule |
| 95th Percentile Estimation | 1,000–5,000 | To accurately capture the tail of the distribution |
| 99.9th Percentile (Rare Event) | 10,000–100,000 | Many samples needed for tail accuracy |
| Failure Probability $P_f = 10^{-k}$ Estimation | $\geq 10^{k+2}$ | Required for CoV($\hat{P}_f$) < 10% |
A failure probability of $10^{-6}$ requires $10^8$ runs!? That's impossible to run, isn't it...?
That's why, for rare-event estimation, special techniques such as Importance Sampling or Subset Simulation are used instead of crude Monte Carlo, or a surrogate model is substituted for the full CAE model to reduce the computational cost.
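The $\geq 10^{k+2}$ guideline in the table follows from the binomial variance of the failure indicator: $\text{Var}(\hat{P}_f) = P_f(1 - P_f)/N$, so $\text{CoV}(\hat{P}_f) = \sqrt{(1 - P_f)/(N P_f)}$. A short sketch that reproduces the rule (assuming Python/NumPy):

```python
import numpy as np

def n_for_failure_probability(p_f, target_cov=0.10):
    """Crude-MC sample size so that CoV(P_f_hat) = sqrt((1 - p_f) / (N * p_f)) <= target_cov."""
    return int(np.ceil((1.0 - p_f) / (p_f * target_cov ** 2)))

for k in (2, 4, 6):
    p_f = 10.0 ** (-k)
    print(f"P_f = 1e-{k}: N >= {n_for_failure_probability(p_f):,}")
# -> roughly 1e4, 1e6, 1e8 runs: the 10^(k+2) rule from the table
```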
Theoretical Background of Convergence Rate
- Law of Large Numbers: As $N \to \infty$, the sample mean $\bar{Y}$ converges in probability to the true expected value $\mu_Y = E[Y]$. This guarantees the consistency of the Monte Carlo estimator.
- Central Limit Theorem: $\sqrt{N}(\bar{Y} - \mu_Y) / \sigma_Y \to \mathcal{N}(0,1)$. As long as there is finite variance, regardless of the original distribution shape, the sample mean asymptotically follows a normal distribution. This is the theoretical basis for constructing confidence intervals.
- Meaning of the $1/\sqrt{N}$ Rule: The error decreases as $O(N^{-1/2})$. Compared to the convergence rate of grid-based numerical integration (e.g., trapezoidal rule) $O(N^{-2/d})$, Monte Carlo is slower in low dimensions ($d \leq 3$) but advantageous in high dimensions ($d \geq 5$).
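The $1/\sqrt{N}$ behavior is easy to verify numerically. A quick sketch (assuming Python/NumPy, using $E[e^U]$ with $U \sim \text{Uniform}(0,1)$ as a toy integrand): quadrupling $N$ should roughly halve the empirical error.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_error(N, reps=200):
    """Empirical scatter (std) of the MC estimate of E[exp(U)], U ~ Uniform(0, 1)."""
    estimates = [np.mean(np.exp(rng.random(N))) for _ in range(reps)]
    return np.std(estimates)

for N in (100, 400, 1600):
    print(f"N = {N:5d}: empirical error ~ {mc_error(N):.5f}")
# Each 4x increase in N roughly halves the error, i.e. O(N^(-1/2)) convergence
```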
Sampling Methods and Variance Reduction
Pure Random Sampling (Crude Monte Carlo)
The "random sampling" used in the earlier formulas, does that just mean generating random numbers?
In its simplest form, yes. You generate pseudo-random numbers independently from the probability distribution of each input variable to create the samples. This is called pure random sampling (Crude Monte Carlo / Simple Random Sampling).
But since it's random, there can be bias, right? Samples might concentrate in some areas, or leave blank spots...
Sharp observation. Especially when the sample number is small (around $N < 100$), pure random sampling tends to produce uneven coverage of the input space. For example, with 2 variables and 100 samples, the corners of the distribution domain might get no samples at all. That's where LHS (Latin Hypercube Sampling) comes in.
Latin Hypercube Sampling (LHS)
I've heard the name LHS. How is it different from ordinary random numbers?
The idea of LHS is simple. Partition the distribution of each input variable into $N$ equiprobable strata and take exactly one sample from each stratum. Then randomly shuffle the combinations between variables.
Ah, I see! So you "taste" each variable's distribution evenly. Like Sudoku, where every number appears exactly once in each row and column?
That's the right image. Mathematically, for each variable $X_j$ ($j = 1, \ldots, d$), divide the range $[0, 1]$ of its cumulative distribution function $F_j$ into $N$ equal parts and draw one sample per stratum:

$$X_j^{(i)} = F_j^{-1}\!\left( \frac{\pi_j(i) - 1 + U_j^{(i)}}{N} \right), \quad i = 1, \ldots, N$$
Here $\pi_j$ is a random permutation of $\{1, 2, \ldots, N\}$, and $U_j^{(i)} \sim \text{Uniform}(0, 1)$ randomizes the position within the stratum.
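A minimal LHS sketch following this formula (assuming Python with NumPy/SciPy; the two input distributions are illustrative):

```python
import numpy as np
from scipy import stats

def latin_hypercube(N, d, rng):
    """One sample per equiprobable stratum for each of the d variables."""
    u = np.empty((N, d))
    for j in range(d):
        perm = rng.permutation(N)               # random stratum order, plays the role of pi_j
        u[:, j] = (perm + rng.random(N)) / N    # (pi_j(i) - 1 + U) / N with 0-based perm
    return u

rng = np.random.default_rng(1)
u = latin_hypercube(N=100, d=2, rng=rng)

# Map unit-hypercube samples to physical inputs through the inverse CDFs F_j^{-1}
E = stats.norm.ppf(u[:, 0], loc=210e3, scale=6.3e3)        # Young's modulus [MPa], CoV ~ 3%
sigma_y = stats.lognorm.ppf(u[:, 1], s=0.07, scale=350.0)  # yield stress [MPa]
```

Recent SciPy versions also provide `scipy.stats.qmc.LatinHypercube`, which implements the same idea out of the box.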
Summarizing the benefits of LHS:
- Can uniformly cover the input space with a small number of samples
- The marginal distribution of each variable is accurately reproduced
- Variance becomes smaller compared to pure random sampling (though the improvement depends on the problem)
- In CAE practice, sufficient accuracy is often obtained with LHS of around $N = 50\text{ to }200$
| Comparison Item | Pure Random (SRS) | LHS |
|---|---|---|
| Spatial Coverage | Biased | Uniform |
| Marginal Distribution Reproduction | Guaranteed only as $N \to \infty$ | Guaranteed by stratification for any $N$ |
| Variance Reduction Effect | None (baseline) | Problem-dependent but generally favorable |
| Correlation Control | Difficult | Possible with correlation-controlled LHS |
| Ease of Adding Samples | Easy | Redesign required |
Variance Reduction Techniques
Besides LHS, are there other methods to improve accuracy with a small number of samples?
There is a group of techniques known as "Variance Reduction Techniques." They effectively reduce the $\sigma_Y$ in the standard error $\text{SE} = \sigma_Y / \sqrt{N}$, or cleverly design the estimator to improve accuracy even with the same $N$.
| Technique | Principle | Applicable Scenarios |
|---|---|---|
| Antithetic Variates | Uses pairs of samples $\mathbf{U}$ and $1-\mathbf{U}$ to introduce negative correlation | Monotonic response functions |
| Control Variates | Corrects the estimator using auxiliary variables with known expected values | When an analytical approximation exists |
| Importance Sampling | Samples densely from region of interest, corrects with likelihood ratio | Rare event / failure probability estimation |
| Stratified Sampling | Divides input space into strata and samples within each | Non-uniform response functions |
Antithetic Variates seems very simple. Does it reduce variance with almost zero additional cost?
It can be very effective if the response function is monotonic with respect to the input (e.g., if material strength increases, life also increases). When $\mathbf{U}$ produces a large output, $1-\mathbf{U}$ produces a small one, so averaging each pair cancels out much of the fluctuation. However, for non-monotonic functions it can sometimes have the opposite effect, so caution is needed.
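A sketch of antithetic variates for a monotonic toy response (assuming Python with NumPy/SciPy; the response function and distribution parameters are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def life(strength):
    """Toy monotonic response: life increases with material strength (hypothetical)."""
    return strength ** 3 / 1.0e6

N = 500                                 # number of pairs, i.e. 2N model evaluations
u = rng.random(N)
strength_a = stats.norm.ppf(u,     loc=400.0, scale=20.0)
strength_b = stats.norm.ppf(1 - u, loc=400.0, scale=20.0)   # antithetic partner

pair_means = 0.5 * (life(strength_a) + life(strength_b))    # average each pair
est = pair_means.mean()
se = pair_means.std(ddof=1) / np.sqrt(N)   # SE of the antithetic estimator
print(f"antithetic estimate = {est:.2f} +/- {1.96 * se:.2f}")
```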
Mathematical Formulation of Importance Sampling
Consider estimating the failure probability $P_f = P[g(\mathbf{X}) \leq 0]$. Under the original density function $f_{\mathbf{X}}$:
$$P_f = \int I[g(\mathbf{x}) \leq 0] \, f_{\mathbf{X}}(\mathbf{x}) \, d\mathbf{x} = \int I[g(\mathbf{x}) \leq 0] \frac{f_{\mathbf{X}}(\mathbf{x})}{h(\mathbf{x})} h(\mathbf{x}) \, d\mathbf{x}$$

Sample from a proposal distribution $h(\mathbf{x})$ and weight with the likelihood ratio $w(\mathbf{x}) = f_{\mathbf{X}}(\mathbf{x}) / h(\mathbf{x})$:
$$\hat{P}_f^{IS} = \frac{1}{N}\sum_{i=1}^{N} I[g(\mathbf{X}_i) \leq 0] \cdot w(\mathbf{X}_i), \quad \mathbf{X}_i \sim h$$

The optimal proposal distribution is one concentrated in the failure region; a common method is to place a normal distribution near the design point from FORM/SORM.
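A minimal sketch of this estimator for a toy limit state (assuming Python with NumPy/SciPy; a one-dimensional standard-normal input with the proposal $h$ centered at an assumed design point):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

def g(x):
    """Toy limit-state function: failure when g(x) <= 0 (hypothetical)."""
    return 4.0 - x            # failure if x >= 4, so P_f = Phi(-4) ~ 3.2e-5

N = 10_000
# Proposal h: standard-normal density shifted to the (assumed) design point x* = 4
x = rng.normal(loc=4.0, scale=1.0, size=N)

w = stats.norm.pdf(x, 0.0, 1.0) / stats.norm.pdf(x, 4.0, 1.0)   # likelihood ratio f/h
indicator = (g(x) <= 0.0).astype(float)

p_f = np.mean(indicator * w)
cov = np.std(indicator * w, ddof=1) / (np.sqrt(N) * p_f)        # CoV of the estimate
print(f"P_f ~ {p_f:.2e} (CoV ~ {cov:.1%}); exact: {stats.norm.cdf(-4.0):.2e}")
```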
Practical Guide
Practical Workflow
When actually running Monte Carlo analysis in CAE, what are the steps?
The basic flow is like this:
- Identify uncertain input parameters — Material properties, geometric dimensions, load conditions, etc.
- Set probability distribution for each parameter — Normal distribution, lognormal distribution, uniform distribution, etc.
- Create sampling plan — Pure random or LHS, decide sample number $N$
- Execute $N$ CAE analyses — Batch processing or parallel execution
- Statistical processing of results — Calculate mean, standard deviation, histogram, CDF, confidence intervals
- Convergence check — Plot cumulative mean / cumulative standard deviation to confirm stability
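Putting the six steps together, a minimal end-to-end sketch (assuming Python with NumPy/SciPy; `run_cae_model` is a hypothetical placeholder for the real batch FEM/CFD call, and the distributions are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Steps 1-2: uncertain inputs and their distributions (illustrative values)
dists = {
    "E":         stats.norm(loc=210e3, scale=4.2e3),     # Young's modulus, CoV ~ 2%
    "sigma_y":   stats.lognorm(s=0.07, scale=350.0),     # yield stress
    "thickness": stats.norm(loc=2.0, scale=0.02),        # plate thickness
}

# Step 3: sampling plan (crude MC here; swap in LHS for better coverage)
N = 500
samples = {name: d.rvs(size=N, random_state=rng) for name, d in dists.items()}

# Step 4: execute N analyses (placeholder for the real CAE batch run)
def run_cae_model(E, sigma_y, thickness):
    return sigma_y * thickness / np.sqrt(E)   # hypothetical response

Y = run_cae_model(samples["E"], samples["sigma_y"], samples["thickness"])

# Step 5: statistics of the output
mean, std = Y.mean(), Y.std(ddof=1)
ci = 1.96 * std / np.sqrt(N)
print(f"mean = {mean:.4f} +/- {ci:.4f} (95% CI), CoV = {std / mean:.1%}")

# Step 6: convergence check (the cumulative mean should flatten out as N grows)
cumulative_mean = np.cumsum(Y) / np.arange(1, N + 1)
```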
Setting Input Distributions
How do you decide the probability distribution for input parameters? Can you just arbitrarily set it to a normal distribution?
That's the most troublesome part; "Garbage In, Garbage Out" applies most critically to the setting of input distributions. Ideally, you fit them from manufacturer quality data or experimental data, but in practice, the following guidelines are often used:
| Parameter | Typical Distribution | Coefficient of Variation (CoV) Guideline |
|---|---|---|
| Young's Modulus | Normal Distribution | 2–5% |
| Yield Stress | Lognormal Distribution | 5–10% |
| Plate Thickness / Dimensions | Normal Distribution | 1–3% (assuming $\sigma$ = tolerance/6) |
| Fatigue Strength | Lognormal or Weibull | 10–30% |
| Load Amplitude | Lognormal or Gumbel | 10–20% |
| Temperature Conditions | Normal or Uniform Distribution | |
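When only a mean and a CoV guideline like the ones in this table are available, the distribution parameters still have to be derived from them. A minimal sketch (assuming Python with NumPy/SciPy; the numerical values are illustrative):

```python
import numpy as np
from scipy import stats

def lognormal_from_mean_cov(mean, cov):
    """Lognormal distribution parameterized by its mean and coefficient of variation."""
    sigma_ln = np.sqrt(np.log(1.0 + cov ** 2))     # log-space standard deviation
    mu_ln = np.log(mean) - 0.5 * sigma_ln ** 2     # log-space mean
    return stats.lognorm(s=sigma_ln, scale=np.exp(mu_ln))

def normal_from_tolerance(nominal, tolerance):
    """Normal distribution with sigma = tolerance / 6 (the rule of thumb in the table)."""
    return stats.norm(loc=nominal, scale=tolerance / 6.0)

yield_stress = lognormal_from_mean_cov(mean=355.0, cov=0.07)    # MPa, CoV = 7%
thickness = normal_from_tolerance(nominal=2.0, tolerance=0.1)   # mm, +/-0.05 tolerance band
print(yield_stress.mean(), yield_stress.std())   # ~355 and ~24.9, which recovers the CoV
```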