Monte Carlo Simulation — A Fundamental Method for CAE Uncertainty Quantification
Theory and Physics
Overview — What is the Monte Carlo Method?
Professor, does the Monte Carlo method involve repeating simulations with random numbers? How many times do you need to run it?
Good question. Roughly speaking, it's a method that randomly samples from the probability distributions of input parameters and repeatedly executes FEM or CFD analyses hundreds or thousands of times. The goal is to statistically determine the variation (distribution) of the output.
Thousands of times!? Isn't the computational cost huge?
Yes, and that is the biggest weakness of the Monte Carlo method. Its greatest strength, however, is that the convergence rate does not change regardless of the dimensionality of the input variables. With a full-factorial (grid-based) exploration, the cost explodes exponentially once you have 10 variables, but Monte Carlo converges at the same pace whether the problem has 10 dimensions or 100.
Independent of dimensionality... That's amazing. Specifically, when is it used?
A typical case is evaluating how the variation in the material strength of an automotive component (coefficient of variation CoV ≈ 5%) affects its fatigue life. When multiple parameters such as Young's modulus, yield stress, plate thickness, and load amplitude vary simultaneously, investigating how their combined effects propagate to the output is the forte of the Monte Carlo method.
Monte Carlo Simulation (MCS) is a method that repeatedly executes a deterministic model (FEM, CFD, etc.) with randomly sampled input variables to estimate the statistical properties (mean, variance, distribution shape, confidence interval) of the output. The name originates from the Monte Carlo district of Monaco, famous for its casinos, because the method relies on random numbers.
Basic Principles and Convergence Rate
How is the convergence rate determined? You said earlier it's "independent of dimensionality"...
The core of the Monte Carlo method lies in the Law of Large Numbers and the Central Limit Theorem. We independently draw $N$ samples $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_N$ from the input distribution and evaluate the model $Y = f(\mathbf{X})$ for each. Then the sample mean is:

$$\bar{Y} = \frac{1}{N} \sum_{i=1}^{N} f(\mathbf{X}_i) = \frac{1}{N} \sum_{i=1}^{N} Y_i$$
The standard error of this sample mean is expressed by the following formula:

$$\text{SE}(\bar{Y}) = \frac{\sigma_Y}{\sqrt{N}} \approx \frac{\hat{\sigma}_Y}{\sqrt{N}}$$
Ah, you divide by $\sqrt{N}$! So, increasing the sample number reduces the error, but only by the square root...
Exactly. This is the famous $1/\sqrt{N}$ convergence rule. The important points are:
- To halve the error, you need 4 times the samples
- To reduce the error to 1/10, you need 100 times the samples
- This convergence rate does not depend on the dimensionality $d$ of the input variables (it is not affected by the curse of dimensionality)
The sample variance estimator is similarly defined:

$$\hat{\sigma}_Y^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left( Y_i - \bar{Y} \right)^2$$
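As a minimal sketch of these estimators (assuming Python with NumPy; `model` is a hypothetical stand-in for the actual FEM/CFD run, and the input distributions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def model(E, sigma_y):
    """Toy stand-in for an expensive FEM/CFD run (hypothetical response)."""
    return 1000.0 * sigma_y / E ** 0.5

N = 1000
# Sample the uncertain inputs (normal Young's modulus, lognormal yield stress)
E = rng.normal(loc=210e3, scale=0.03 * 210e3, size=N)            # MPa, CoV = 3%
sigma_y = rng.lognormal(mean=np.log(350.0), sigma=0.07, size=N)  # MPa

Y = model(E, sigma_y)

mean_Y = Y.mean()             # sample mean
std_Y = Y.std(ddof=1)         # unbiased sample standard deviation (N - 1 divisor)
se_Y = std_Y / np.sqrt(N)     # standard error of the mean, sigma_Y / sqrt(N)

print(f"mean = {mean_Y:.3f}, std = {std_Y:.3f}, SE = {se_Y:.4f}")
```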
Confidence Interval Estimation
When writing results in a report, just saying "the mean is about this much" isn't enough, right? How do you get the confidence interval?
Thanks to the Central Limit Theorem, if $N$ is sufficiently large (in practice, from about $N \geq 30$), the sample mean can be approximated by a normal distribution. Therefore, the 95% confidence interval can be written as:

$$\bar{Y} - 1.96\,\frac{\hat{\sigma}_Y}{\sqrt{N}} \;\leq\; \mu_Y \;\leq\; \bar{Y} + 1.96\,\frac{\hat{\sigma}_Y}{\sqrt{N}}$$
1.96 is the two-sided 95% quantile of the standard normal distribution, right? Concretely, if you run it 1,000 times, what kind of accuracy do you get?
Let's calculate the relative estimation accuracy (Relative Accuracy). If the coefficient of variation of the output is $\text{CoV}_Y = \sigma_Y / \mu_Y$:

$$\text{RA} = \frac{1.96\,\sigma_Y / \sqrt{N}}{\mu_Y} = \frac{1.96\,\text{CoV}_Y}{\sqrt{N}}$$
For example, when the output's coefficient of variation is $\text{CoV}_Y = 0.10$ (10%):
- $N = 100$: $\text{RA} = 1.96 \times 0.10 / \sqrt{100} = 1.96\%$
- $N = 1{,}000$: $\text{RA} = 1.96 \times 0.10 / \sqrt{1000} \approx 0.62\%$
- $N = 10{,}000$: $\text{RA} \approx 0.20\%$
A practical guideline: with 1,000 runs, the half-width of the 95% confidence interval is about 6% of the mean value when $\text{CoV}_Y \approx 1.0$, and it shrinks in proportion to $\text{CoV}_Y$ (about 0.6% at $\text{CoV}_Y = 0.10$, as above).
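This relationship can also be inverted to estimate how many runs are needed for a target accuracy before launching the study. A small sketch (assuming Python/NumPy; the target values are illustrative):

```python
import numpy as np

def required_samples(cov_y, target_ra, z=1.96):
    """Samples needed so the 95% CI half-width is target_ra times the mean."""
    return int(np.ceil((z * cov_y / target_ra) ** 2))

# Example: output CoV of 10%, mean wanted to within +/-1% -> 385 runs
print(required_samples(cov_y=0.10, target_ra=0.01))
```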
How to Determine the Number of Samples
So in practice, how many times should you run it? Is 100 times enough, or do you need 10,000...?
It's completely different depending on the purpose. Roughly summarized, it looks like this:
| Purpose | Guideline for Required Sample Number | Reason |
|---|---|---|
| Mean / Standard Deviation Estimation | 100–1,000 | Sufficient convergence by $1/\sqrt{N}$ rule |
| 95th Percentile Estimation | 1,000–5,000 | To accurately capture the tail of the distribution |
| 99.9th Percentile (Rare Event) | 10,000–100,000 | Many samples needed for tail accuracy |
| Failure Probability $P_f = 10^{-k}$ Estimation | $\geq 10^{k+2}$ | Required for CoV($\hat{P}_f$) < 10% |
A failure probability of $10^{-6}$ requires $10^8$ runs!? That's impossible to run, isn't it...?
That's why, for rare-event estimation, special techniques such as Importance Sampling or Subset Simulation are used instead of crude Monte Carlo, or a surrogate model is substituted for the full CAE model to reduce the computational cost.
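The $\geq 10^{k+2}$ guideline in the table follows from the binomial variance of the failure indicator: $\text{Var}(\hat{P}_f) = P_f(1 - P_f)/N$, so $\text{CoV}(\hat{P}_f) = \sqrt{(1 - P_f)/(N P_f)}$. A short sketch that reproduces the rule (assuming Python/NumPy):

```python
import numpy as np

def n_for_failure_probability(p_f, target_cov=0.10):
    """Crude-MC sample size so that CoV(P_f_hat) = sqrt((1 - p_f) / (N * p_f)) <= target_cov."""
    return int(np.ceil((1.0 - p_f) / (p_f * target_cov ** 2)))

for k in (2, 4, 6):
    p_f = 10.0 ** (-k)
    print(f"P_f = 1e-{k}: N >= {n_for_failure_probability(p_f):,}")
# -> roughly 1e4, 1e6, 1e8 runs: the 10^(k+2) rule from the table
```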
Theoretical Background of Convergence Rate
- Law of Large Numbers: As $N \to \infty$, the sample mean $\bar{Y}$ converges in probability to the true expected value $\mu_Y = E[Y]$. This guarantees the consistency of the Monte Carlo estimator.
- Central Limit Theorem: $\sqrt{N}(\bar{Y} - \mu_Y) / \sigma_Y \to \mathcal{N}(0,1)$. As long as there is finite variance, regardless of the original distribution shape, the sample mean asymptotically follows a normal distribution. This is the theoretical basis for constructing confidence intervals.
- Meaning of the $1/\sqrt{N}$ Rule: The error decreases as $O(N^{-1/2})$. Compared to the convergence rate of grid-based numerical integration (e.g., trapezoidal rule) $O(N^{-2/d})$, Monte Carlo is slower in low dimensions ($d \leq 3$) but advantageous in high dimensions ($d \geq 5$).
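The $1/\sqrt{N}$ behavior is easy to verify numerically. A quick sketch (assuming Python/NumPy, using $E[e^U]$ with $U \sim \text{Uniform}(0,1)$ as a toy integrand): quadrupling $N$ should roughly halve the empirical error.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_error(N, reps=200):
    """Empirical scatter (std) of the MC estimate of E[exp(U)], U ~ Uniform(0, 1)."""
    estimates = [np.mean(np.exp(rng.random(N))) for _ in range(reps)]
    return np.std(estimates)

for N in (100, 400, 1600):
    print(f"N = {N:5d}: empirical error ~ {mc_error(N):.5f}")
# Each 4x increase in N roughly halves the error, i.e. O(N^(-1/2)) convergence
```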
Sampling Methods and Variance Reduction
Pure Random Sampling (Crude Monte Carlo)
The "random sampling" used in the earlier formulas, does that just mean generating random numbers?
In its simplest form, yes. You generate pseudo-random numbers independently from the probability distribution of each input variable to create the samples. This is called pure random sampling (Crude Monte Carlo / Simple Random Sampling).
But since it's random, there can be bias, right? Samples might concentrate in some areas, or leave blank spots...
Sharp observation. Especially when the sample number is small (around $N < 100$), pure random sampling tends to produce uneven coverage of the input space. For example, with 2 variables and 100 samples, the corners of the distribution domain might get no samples at all. That's where LHS (Latin Hypercube Sampling) comes in.
Latin Hypercube Sampling (LHS)
I've heard the name LHS. How is it different from ordinary random numbers?
The idea of LHS is simple. Partition the distribution of each input variable into $N$ equiprobable strata and take exactly one sample from each stratum. Then randomly shuffle the combinations between variables.
Ah, I see! So you "taste" each variable's distribution evenly. Like Sudoku, where every number appears exactly once in each row and column?
That's the right image. Mathematically, for each variable $X_j$ ($j = 1, \ldots, d$), divide the range $[0, 1]$ of its cumulative distribution function $F_j$ into $N$ equal parts and draw one sample per stratum:

$$X_j^{(i)} = F_j^{-1}\!\left( \frac{\pi_j(i) - 1 + U_j^{(i)}}{N} \right), \quad i = 1, \ldots, N$$
Here $\pi_j$ is a random permutation of $\{1, 2, \ldots, N\}$, and $U_j^{(i)} \sim \text{Uniform}(0, 1)$ randomizes the position within the stratum.
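A minimal LHS sketch following this formula (assuming Python with NumPy/SciPy; the two input distributions are illustrative):

```python
import numpy as np
from scipy import stats

def latin_hypercube(N, d, rng):
    """One sample per equiprobable stratum for each of the d variables."""
    u = np.empty((N, d))
    for j in range(d):
        perm = rng.permutation(N)               # random stratum order, plays the role of pi_j
        u[:, j] = (perm + rng.random(N)) / N    # (pi_j(i) - 1 + U) / N with 0-based perm
    return u

rng = np.random.default_rng(1)
u = latin_hypercube(N=100, d=2, rng=rng)

# Map unit-hypercube samples to physical inputs through the inverse CDFs F_j^{-1}
E = stats.norm.ppf(u[:, 0], loc=210e3, scale=6.3e3)        # Young's modulus [MPa], CoV ~ 3%
sigma_y = stats.lognorm.ppf(u[:, 1], s=0.07, scale=350.0)  # yield stress [MPa]
```

Recent SciPy versions also provide `scipy.stats.qmc.LatinHypercube`, which implements the same idea out of the box.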
Summarizing the benefits of LHS:
- Can uniformly cover the input space with a small number of samples
- The marginal distribution of each variable is accurately reproduced
- Variance becomes smaller compared to pure random sampling (though the improvement depends on the problem)
- In CAE practice, sufficient accuracy is often obtained with LHS of around $N = 50\text{ to }200$
| Comparison Item | Pure Random (SRS) | LHS |
|---|---|---|
| Spatial Coverage | Biased | Uniform |
| Marginal Distribution Reproduction | Guaranteed only as $N \to \infty$ | Guaranteed by stratification for any $N$ |
| Variance Reduction Effect | None (baseline) | Problem-dependent but generally favorable |
| Correlation Control | Difficult | Possible with correlation-controlled LHS |
| Ease of Adding Samples | Easy | Redesign required |
Variance Reduction Techniques
Besides LHS, are there other methods to improve accuracy with a small number of samples?
There is a group of techniques known as "Variance Reduction Techniques." They effectively reduce the $\sigma_Y$ in the standard error $\text{SE} = \sigma_Y / \sqrt{N}$, or cleverly design the estimator to improve accuracy even with the same $N$.
| Technique | Principle | Applicable Scenarios |
|---|---|---|
| Antithetic Variates | Uses pairs of samples $\mathbf{U}$ and $1-\mathbf{U}$ to introduce negative correlation | Monotonic response functions |
| Control Variates | Corrects the estimator using auxiliary variables with known expected values | When an analytical approximation exists |
| Importance Sampling | Samples densely from region of interest, corrects with likelihood ratio | Rare event / failure probability estimation |
| Stratified Sampling | Divides input space into strata and samples within each | Non-uniform response functions |
Antithetic Variates seems very simple. Does it reduce variance with almost zero additional cost?
It can be very effective if the response function is monotonic with respect to the input (e.g., if material strength increases, life also increases). When $\mathbf{U}$ produces a large output, $1-\mathbf{U}$ produces a small one, so averaging each pair cancels out much of the fluctuation. However, for non-monotonic functions it can sometimes have the opposite effect, so caution is needed.
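A sketch of antithetic variates for a monotonic toy response (assuming Python with NumPy/SciPy; the response function and distribution parameters are hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def life(strength):
    """Toy monotonic response: life increases with material strength (hypothetical)."""
    return strength ** 3 / 1.0e6

N = 500                                 # number of pairs, i.e. 2N model evaluations
u = rng.random(N)
strength_a = stats.norm.ppf(u,     loc=400.0, scale=20.0)
strength_b = stats.norm.ppf(1 - u, loc=400.0, scale=20.0)   # antithetic partner

pair_means = 0.5 * (life(strength_a) + life(strength_b))    # average each pair
est = pair_means.mean()
se = pair_means.std(ddof=1) / np.sqrt(N)   # SE of the antithetic estimator
print(f"antithetic estimate = {est:.2f} +/- {1.96 * se:.2f}")
```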
Mathematical Formulation of Importance Sampling
Consider estimating the failure probability $P_f = P[g(\mathbf{X}) \leq 0]$. Under the original density function $f_{\mathbf{X}}$:
$$P_f = \int I[g(\mathbf{x}) \leq 0] \, f_{\mathbf{X}}(\mathbf{x}) \, d\mathbf{x} = \int I[g(\mathbf{x}) \leq 0] \frac{f_{\mathbf{X}}(\mathbf{x})}{h(\mathbf{x})} h(\mathbf{x}) \, d\mathbf{x}$$

Sample from a proposal distribution $h(\mathbf{x})$ and weight with the likelihood ratio $w(\mathbf{x}) = f_{\mathbf{X}}(\mathbf{x}) / h(\mathbf{x})$:
$$\hat{P}_f^{IS} = \frac{1}{N}\sum_{i=1}^{N} I[g(\mathbf{X}_i) \leq 0] \cdot w(\mathbf{X}_i), \quad \mathbf{X}_i \sim h$$

The optimal proposal distribution is one concentrated in the failure region; a common method is to place a normal distribution near the design point from FORM/SORM.
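A minimal sketch of this estimator for a toy limit state (assuming Python with NumPy/SciPy; a one-dimensional standard-normal input with the proposal $h$ centered at an assumed design point):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

def g(x):
    """Toy limit-state function: failure when g(x) <= 0 (hypothetical)."""
    return 4.0 - x            # failure if x >= 4, so P_f = Phi(-4) ~ 3.2e-5

N = 10_000
# Proposal h: standard-normal density shifted to the (assumed) design point x* = 4
x = rng.normal(loc=4.0, scale=1.0, size=N)

w = stats.norm.pdf(x, 0.0, 1.0) / stats.norm.pdf(x, 4.0, 1.0)   # likelihood ratio f/h
indicator = (g(x) <= 0.0).astype(float)

p_f = np.mean(indicator * w)
cov = np.std(indicator * w, ddof=1) / (np.sqrt(N) * p_f)        # CoV of the estimate
print(f"P_f ~ {p_f:.2e} (CoV ~ {cov:.1%}); exact: {stats.norm.cdf(-4.0):.2e}")
```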
Practical Guide
Practical Workflow
When actually running Monte Carlo analysis in CAE, what are the steps?
The basic flow is like this:
- Identify uncertain input parameters — Material properties, geometric dimensions, load conditions, etc.
- Set probability distribution for each parameter — Normal distribution, lognormal distribution, uniform distribution, etc.
- Create sampling plan — Pure random or LHS, decide sample number $N$
- Execute $N$ CAE analyses — Batch processing or parallel execution
- Statistical processing of results — Calculate mean, standard deviation, histogram, CDF, confidence intervals
- Convergence check — Plot cumulative mean / cumulative standard deviation to confirm stability
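Putting the six steps together, a minimal end-to-end sketch (assuming Python with NumPy/SciPy; `run_cae_model` is a hypothetical placeholder for the real batch FEM/CFD call, and the distributions are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Steps 1-2: uncertain inputs and their distributions (illustrative values)
dists = {
    "E":         stats.norm(loc=210e3, scale=4.2e3),     # Young's modulus, CoV ~ 2%
    "sigma_y":   stats.lognorm(s=0.07, scale=350.0),     # yield stress
    "thickness": stats.norm(loc=2.0, scale=0.02),        # plate thickness
}

# Step 3: sampling plan (crude MC here; swap in LHS for better coverage)
N = 500
samples = {name: d.rvs(size=N, random_state=rng) for name, d in dists.items()}

# Step 4: execute N analyses (placeholder for the real CAE batch run)
def run_cae_model(E, sigma_y, thickness):
    return sigma_y * thickness / np.sqrt(E)   # hypothetical response

Y = run_cae_model(samples["E"], samples["sigma_y"], samples["thickness"])

# Step 5: statistics of the output
mean, std = Y.mean(), Y.std(ddof=1)
ci = 1.96 * std / np.sqrt(N)
print(f"mean = {mean:.4f} +/- {ci:.4f} (95% CI), CoV = {std / mean:.1%}")

# Step 6: convergence check (the cumulative mean should flatten out as N grows)
cumulative_mean = np.cumsum(Y) / np.arange(1, N + 1)
```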
Setting Input Distributions
How do you decide the probability distribution for input parameters? Can you just arbitrarily set it to a normal distribution?
That's the most troublesome part; "Garbage In, Garbage Out" applies most critically to the setting of input distributions. Ideally, you fit them from manufacturer quality data or experimental data, but in practice, the following guidelines are often used:
| Parameter | Typical Distribution | Coefficient of Variation (CoV) Guideline |
|---|---|---|
| Young's Modulus | Normal Distribution | 2–5% |
| Yield Stress | Lognormal Distribution | 5–10% |
| Plate Thickness / Dimensions | Normal Distribution | 1–3% (assuming $\sigma$ = tolerance/6) |
| Fatigue Strength | Lognormal or Weibull | 10–30% |
| Load Amplitude | Lognormal or Gumbel | 10–20% |
| Temperature Conditions | Normal or Uniform Distribution | |
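When only a mean and a CoV guideline like the ones in this table are available, the distribution parameters still have to be derived from them. A minimal sketch (assuming Python with NumPy/SciPy; the numerical values are illustrative):

```python
import numpy as np
from scipy import stats

def lognormal_from_mean_cov(mean, cov):
    """Lognormal distribution parameterized by its mean and coefficient of variation."""
    sigma_ln = np.sqrt(np.log(1.0 + cov ** 2))     # log-space standard deviation
    mu_ln = np.log(mean) - 0.5 * sigma_ln ** 2     # log-space mean
    return stats.lognorm(s=sigma_ln, scale=np.exp(mu_ln))

def normal_from_tolerance(nominal, tolerance):
    """Normal distribution with sigma = tolerance / 6 (the rule of thumb in the table)."""
    return stats.norm(loc=nominal, scale=tolerance / 6.0)

yield_stress = lognormal_from_mean_cov(mean=355.0, cov=0.07)    # MPa, CoV = 7%
thickness = normal_from_tolerance(nominal=2.0, tolerance=0.1)   # mm, +/-0.05 tolerance band
print(yield_stress.mean(), yield_stress.std())   # ~355 and ~24.9, which recovers the CoV
```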