Design of Experiments (DOE) — CAE Glossary

Category: Glossary | 2026-03-28

What is Design of Experiments (DOE)?

🧑‍🎓

I hear the term Design of Experiments (DOE) often, but in the context of CAE, what is it exactly? It's not about physical experiments, right?


🎓

Design of Experiments, or DOE for short, was originally systematized by R.A. Fisher in the 1920s as a statistical method for making agricultural experiments more efficient. In CAE, it is used to decide systematically which combinations of design variables to analyze. For example, suppose you run a crash analysis on a car bumper while varying three factors: plate thickness, material, and rib placement. With several levels per factor, enumerating every combination quickly reaches hundreds of cases. With DOE, you can capture the main effects and interactions with just dozens of analyses.


🧑‍🎓

I see—reducing the number of analyses while retaining critical information. Is it because total enumeration would cause computational costs to explode?


🎓

Exactly. CAE analyses can take hours to days per case. For example, with 5 factors at 5 levels each, a full factorial design requires $5^5 = 3{,}125$ cases. At 2 hours per case, that's 6,250 hours, or about 260 days of computation. With DOE, the same 5 factors can be studied with 50–100 cases while still capturing the main effects and major interactions.
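
The arithmetic in this example can be checked with a few lines of Python. This is only a minimal sketch: the function name `full_factorial_cost` is made up for illustration, and it assumes the cases run one after another on a single machine.

```python
def full_factorial_cost(n_factors, n_levels, hours_per_case):
    """Return (number of cases, total days) for a full factorial design,
    assuming the cases are run sequentially."""
    n_cases = n_levels ** n_factors
    total_days = n_cases * hours_per_case / 24.0
    return n_cases, total_days


# Figures from the example: 5 factors, 5 levels each, 2 hours per case.
cases, days = full_factorial_cost(n_factors=5, n_levels=5, hours_per_case=2.0)
print(f"{cases} cases, about {days:.0f} days")
```

Running cases in parallel shortens the wall-clock time but not the total compute, which is why reducing the number of cases matters.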


Why is DOE necessary in CAE?

🧑‍🎓

Can't you just run parametric studies by varying parameters arbitrarily a few times? I'm not convinced why DOE specifically is needed.


🎓

Arbitrary variation is the most dangerous approach. In practice, engineers often run only a handful of cases based on intuition and conclude something like "increasing thickness improves strength." But in reality, there may be interactions between thickness and material—perhaps increasing thickness with a certain material actually reduces fatigue life. DOE is a statistical safeguard against such oversights.


🧑‍🎓

Interactions—that's the combined effect of factors, right? Can One-Factor-At-a-Time (OFAT) not detect that?


🎓

No, it cannot. OFAT ("fix all factors but one and vary that one") is the most basic parametric study method. It cannot detect interactions, and it covers only a small fraction of the design space. DOE varies multiple factors simultaneously in a systematic way, so it can separate and estimate both main effects and interactions. It is also statistically more efficient: every run contributes to the estimate of every effect, so you learn more per analysis.
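
A small numerical experiment makes the difference concrete. The sketch below assumes a made-up response with a strong interaction term, written in coded units (-1 = low level, +1 = high level). The function `response` and its coefficients are purely illustrative and do not come from any real analysis.

```python
from itertools import product


def response(a, b):
    """Hypothetical response in coded units; 4*a*b is the interaction."""
    return 10 + 2 * a + 3 * b + 4 * a * b


# OFAT: start at the baseline (-1, -1) and change one factor at a time.
base = response(-1, -1)
ofat_a = response(+1, -1) - base
ofat_b = response(-1, +1) - base
ofat_prediction = base + ofat_a + ofat_b  # additive guess for (+1, +1)

# 2^2 full factorial: all four corners, only one run more than OFAT.
runs = {(a, b): response(a, b) for a, b in product((-1, +1), repeat=2)}


def effect(sign):
    """Mean response where sign(a, b) is +1 minus mean where it is -1."""
    hi = [y for (a, b), y in runs.items() if sign(a, b) > 0]
    lo = [y for (a, b), y in runs.items() if sign(a, b) < 0]
    return sum(hi) / len(hi) - sum(lo) / len(lo)


effect_a = effect(lambda a, b: a)
effect_b = effect(lambda a, b: b)
effect_ab = effect(lambda a, b: a * b)

print("OFAT effects:", ofat_a, ofat_b, "-> predicted corner:", ofat_prediction)
print("Factorial effects (A, B, AxB):", effect_a, effect_b, effect_ab)
print("Actual corner value:", runs[(+1, +1)])
```

In this toy case OFAT even gets the sign of both effects wrong, because it only sees the factors at the other factor's low level. The factorial design averages over both levels and also exposes the interaction.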


Representative DOE Methods

Full Factorial Design

🧑‍🎓

Please start with full factorial design. That's the most basic, right?


🎓

It's the method where you run every combination of all factor levels. With $k$ factors at $n$ levels each, you need $n^k$ experiments. Two factors at 3 levels = $3^2 = 9$ cases, but 10 factors at 3 levels = $3^{10} = 59{,}049$ cases. Practically impossible in CAE, so it's limited to cases with 3–4 factors and few levels.
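
Generating a full factorial design is a one-liner with the standard library. The factor names and level values below are hypothetical placeholders, so treat this as a sketch rather than a recommended setup.

```python
from itertools import product

# Hypothetical factors and levels, for illustration only.
factors = {
    "thickness_mm": [1.2, 1.6, 2.0],
    "rib_count": [2, 3, 4],
}

# Every combination of every level: n^k rows for k factors at n levels.
design = [dict(zip(factors, combo)) for combo in product(*factors.values())]


def n_full_factorial(k, n):
    """Number of runs for k factors at n levels each."""
    return n ** k


print(len(design), "cases; first case:", design[0])
print("10 factors at 3 levels:", n_full_factorial(10, 3), "cases")
```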


Latin Hypercube Sampling (LHS)

🧑‍🎓

Latin hypercube sampling sounds cool, but how does it actually work?


🎓

Roughly: "divide each factor's value range into $N$ equal intervals and sample exactly one point from each interval." For example, if you want 20 samples of sheet thickness between 1.0 and 5.0 mm, you create 20 intervals of 0.2 mm each and randomly pick one point from each. Do this independently for every factor, then combine the values across factors in random order to form the sample points. This ensures even coverage of each factor's range and avoids the clustering that is a common problem with pure random sampling.


🧑‍🎓

I often see LHS in CAE optimization tools like modeFRONTIER and optiSLang. How is it used in practice?


🎓

A typical flow is: "Generate initial samples with LHS → Run CAE analyses → Build a surrogate model (response surface) → Optimize on the surrogate." For example, in airfoil shape optimization, you sample 100 points in the space of 10 wing parameters using LHS, run 100 CFD cases, build a kriging model, then run a genetic algorithm on that model. The 100,000 or so evaluations the genetic algorithm would otherwise spend on CFD are made on the surrogate, so only 100 CFD runs are needed.


The mathematical definition of LHS: For $k$ factors and $N$ samples, each factor $x_i$ (with range $[a_i, b_i]$) is divided into $N$ equal intervals. The value of factor $i$ in sample $j$ is:

$$x_i^{(j)} = a_i + \frac{\pi_i(j) - u_{ij}}{N}(b_i - a_i), \quad j=1,\ldots,N$$

where $\pi_i$ is a random permutation of $\{1,2,\ldots,N\}$ and $u_{ij} \sim U(0,1)$ is a uniform random number.
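
The definition translates almost line by line into NumPy. The following is a minimal sketch, not a production sampler: the function name `latin_hypercube` and the example bounds are assumptions made for illustration.

```python
import numpy as np


def latin_hypercube(bounds, n_samples, seed=None):
    """Basic LHS. bounds is a list of (a_i, b_i); returns shape (N, k)."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    samples = np.empty((n_samples, len(bounds)))
    for i, (a, b) in enumerate(bounds):
        perm = rng.permutation(n_samples) + 1  # pi_i: permutation of 1..N
        u = rng.random(n_samples)              # u_ij drawn uniformly from [0, 1)
        samples[:, i] = a + (perm - u) / n_samples * (b - a)
    return samples


# 20 samples: sheet thickness 1.0-5.0 mm plus a second, made-up factor.
pts = latin_hypercube([(1.0, 5.0), (0.0, 100.0)], n_samples=20, seed=0)

# Each of the 20 thickness intervals (0.2 mm wide) holds exactly one sample.
interval_index = np.ceil((pts[:, 0] - 1.0) / 4.0 * 20).astype(int)
print(sorted(interval_index.tolist()))
```

Library implementations (for example SciPy's `scipy.stats.qmc.LatinHypercube`) add options such as space-filling optimization of the basic scheme shown here.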

Taguchi Method (Orthogonal Arrays & S/N Ratio)

🧑‍🎓

The Taguchi method is Japanese, right? Is it still used in CAE?


🎓

Very much so. Japanese automotive and electronics manufacturers especially value quality engineering traditions. The heart of the Taguchi method is two things: orthogonal arrays and the S/N ratio. Orthogonal arrays ($L_9$, $L_{18}$, $L_{27}$, etc.) distribute factor combinations in a balanced way with a minimal number of experiments. The S/N ratio quantifies robustness against variability and comes in three forms depending on the type of characteristic: nominal-is-best, smaller-is-better, and larger-is-better.


🧑‍🎓

Give me a concrete example. How would you apply it to, say, press forming simulation?


🎓

Good example. Suppose you want to prevent wrinkling and tearing in press forming. The control factors are blank holder force, punch speed, die radius, and lubrication (4 factors at 3 levels each). Using an $L_9$ orthogonal array, you run only 9 forming simulations instead of the $3^4 = 81$ of a full factorial. Noise factors such as material lot variation (sheet thickness tolerance, r-value change) go in an outer orthogonal array. Evaluating the results with the S/N ratio then shows conclusions such as "blank holder force has the strongest effect, but punch speed doesn't matter for tearing."
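
To make the $L_9$ layout tangible, here is a sketch that writes out the standard $L_9$ array, maps it to the four control factors, and checks its balance property. The level values and the factor-to-column assignment are invented for illustration, and the outer array for the noise factors is left out.

```python
from collections import Counter
from itertools import combinations

# Standard L9 (3^4) orthogonal array: 9 runs, 4 columns, levels 1 to 3.
L9 = [
    (1, 1, 1, 1),
    (1, 2, 2, 2),
    (1, 3, 3, 3),
    (2, 1, 2, 3),
    (2, 2, 3, 1),
    (2, 3, 1, 2),
    (3, 1, 3, 2),
    (3, 2, 1, 3),
    (3, 3, 2, 1),
]

# Hypothetical level values for the four control factors.
levels = {
    "blank_holder_force_kN": [100, 150, 200],
    "punch_speed_mm_s": [50, 100, 150],
    "die_radius_mm": [4, 6, 8],
    "friction_coefficient": [0.05, 0.10, 0.15],
}

# One dictionary of settings per forming simulation.
runs = [
    {name: vals[row[col] - 1] for col, (name, vals) in enumerate(levels.items())}
    for row in L9
]


def is_orthogonal(array, n_levels=3):
    """True if every pair of columns contains each level pair equally often."""
    expected = len(array) // n_levels ** 2
    for c1, c2 in combinations(range(len(array[0])), 2):
        counts = Counter((row[c1], row[c2]) for row in array)
        if len(counts) != n_levels ** 2 or set(counts.values()) != {expected}:
            return False
    return True


print(len(runs), "runs, orthogonal:", is_orthogonal(L9))
```

Note that with four three-level factors the $L_9$ array has no free columns left, so this layout estimates main effects only.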


S/N ratio for nominal-is-best characteristic:

$$\text{S/N ratio} = 10\log_{10}\frac{\bar{y}^2}{s^2} \quad [\text{dB}]$$

where $\bar{y}$ is the mean response and $s^2$ is the variance. A larger S/N ratio indicates a design robust against variability.
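
The formula is easy to evaluate by hand or in code. The sketch below assumes that $s^2$ is the sample variance (divisor $n-1$), which the definition above does not spell out, and the response values are made up. Each list stands for one control-factor setting evaluated under several noise conditions.

```python
import math
import statistics


def sn_nominal_is_best(responses):
    """S/N = 10 * log10(mean^2 / variance), in dB.
    Uses the sample variance (n - 1 divisor); this is an assumption."""
    mean = statistics.fmean(responses)
    variance = statistics.variance(responses)
    return 10.0 * math.log10(mean ** 2 / variance)


# Same mean response, different scatter under noise (made-up numbers).
design_a = [9.0, 10.0, 11.0]
design_b = [6.0, 10.0, 14.0]

sn_a = sn_nominal_is_best(design_a)
sn_b = sn_nominal_is_best(design_b)
print(f"design A: {sn_a:.2f} dB, design B: {sn_b:.2f} dB")
```

Both designs hit the same mean, but design A scatters less under noise and therefore has the higher S/N ratio.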

Response Surface Methodology (RSM)

🧑‍🎓

Response surface methodology is often mentioned alongside DOE. How are they related?


🎓

If DOE answers "where to compute," RSM answers "build a continuous mathematical model from the results." The two are usually paired. Run CAE at sample points chosen by DOE, then approximate the response (stress, displacement, drag coefficient) as a function of input variables. This surrogate model (proxy model) enables fast exploration.


🧑‍🎓

What types of surrogate models are there?


🎓

Three main types. The polynomial approximation (second-order response surface) is the simplest: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11}x_1^2 + \beta_{22}x_2^2 + \beta_{12}x_1 x_2$. The coefficients are fitted by least squares to results sampled with a design such as Box-Behnken or CCD (central composite design). Kriging (Gaussian process regression) passes through the sample points and gives high interpolation accuracy, which makes it a good fit for highly nonlinear responses. Neural network surrogates are increasingly common too.
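
As an illustration of what a CCD looks like, the sketch below generates the design points in coded units: the $2^k$ factorial corners, $2k$ axial points at distance $\alpha$, and center points. The function name is made up, and the default $\alpha = (2^k)^{1/4}$ is the common "rotatable" choice, which is an assumption here and not the only option.

```python
from itertools import product


def central_composite(k, alpha=None, n_center=1):
    """Coded CCD points: 2^k corners, 2k axial points, n_center centers."""
    if alpha is None:
        alpha = (2 ** k) ** 0.25  # rotatable design (assumed default)
    corners = [tuple(float(v) for v in p) for p in product((-1, 1), repeat=k)]
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(tuple(point))
    centers = [(0.0,) * k] * n_center
    return corners + axial + centers


ccd = central_composite(k=2)
for point in ccd:
    print(point)
```

For two factors this gives 9 runs, enough to fit the 6 coefficients of the second-order model with some runs to spare.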


🧑‍🎓

In practice, would that be like optimizing a heat exchanger fin shape with RSM?


🎓

Exactly. Take 4 fin parameters (pitch, height, thickness, angle), sample 60 points with LHS, and run conjugate heat transfer CFD on those 60 cases. Build kriging models for heat transfer (Nu) and pressure drop (Δp), then run a multi-objective optimization on them to obtain the Pareto front. One CFD case takes 3 hours, so 60 cases = 180 hours, about a week. Total enumeration would take years.
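
The Pareto front in such a study is simply the set of designs that no other design beats on both objectives at once. A minimal sketch of that filter, assuming Nu is to be maximized and Δp minimized, and using made-up candidate values:

```python
def pareto_front(designs):
    """Return the non-dominated (nu, dp) pairs: higher nu and lower dp win."""
    front = []
    for i, (nu_i, dp_i) in enumerate(designs):
        dominated = any(
            nu_j >= nu_i and dp_j <= dp_i and (nu_j > nu_i or dp_j < dp_i)
            for j, (nu_j, dp_j) in enumerate(designs)
            if j != i
        )
        if not dominated:
            front.append((nu_i, dp_i))
    return sorted(front)


# Made-up (Nu, pressure drop in Pa) predictions for five candidate fins.
candidates = [
    (30.0, 120.0),
    (35.0, 150.0),
    (28.0, 140.0),
    (40.0, 210.0),
    (35.0, 170.0),
]

front = pareto_front(candidates)
print(front)
```

In a real study the candidates would come from evaluating the surrogate models at many points, and a dedicated multi-objective optimizer would replace this brute-force comparison.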


General form of a second-order polynomial response surface:

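$$\hat{y}(\mathbf{x}) = \beta_0 + \sum_{i=1}^{k}\beta_i x_i + \sum_{i=1}^{k}\beta_{ii}x_i^2 + \sum_{i<j}\beta_{ij}x_i x_j$$

The coefficient vector $\boldsymbol{\beta}$ is estimated by least squares: $\hat{\boldsymbol{\beta}} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$, where $\mathbf{X}$ is the design matrix whose rows hold the model terms evaluated at each sample point and $\mathbf{y}$ is the vector of computed responses.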
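
The least-squares fit can be sketched in NumPy for the two-factor case. Everything specific below is an assumption for illustration: the 3×3 grid of sample points, the coefficient values in `beta_true`, and the use of synthetic responses in place of real CAE results.

```python
import numpy as np


def quadratic_design_matrix(x):
    """Columns 1, x1, x2, x1^2, x2^2, x1*x2 for each row of x (shape (n, 2))."""
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([np.ones(len(x)), x1, x2, x1**2, x2**2, x1 * x2])


# Sample points: a 3x3 grid in coded units (illustrative choice).
grid = np.array([(a, b) for a in (-1.0, 0.0, 1.0) for b in (-1.0, 0.0, 1.0)])

# Synthetic responses from known coefficients, standing in for CAE output.
# Order: beta_0, beta_1, beta_2, beta_11, beta_22, beta_12.
beta_true = np.array([5.0, 1.0, -2.0, 0.5, 1.5, -0.8])
X = quadratic_design_matrix(grid)
y = X @ beta_true

# lstsq solves the same least-squares problem as (X^T X)^{-1} X^T y,
# but avoids forming the inverse explicitly, which is numerically safer.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta_hat, 6))
```

Because the synthetic data contain no noise, the fit recovers the coefficients exactly. With real simulation results, the residuals indicate how well a quadratic describes the response.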

DOE Workflow in CAE

🧑‍🎓

Can you walk me through the actual project workflow for applying DOE, step by step?


🎓

A typical workflow has 5 steps.
Step 1: Problem Definition — Clarify the objective function (response to minimize/maximize) and design variables (factors) with their ranges. Example: "Minimize chest acceleration during crash; design variables: A-pillar thickness (1.2–2.0 mm) and front member section (3 types)."
Step 2: Choose DOE Method — Select a technique based on factor count, levels, and budget. Many continuous factors → LHS. Many discrete factors → orthogonal array. High-precision nonlinear modeling needed → CCD.
Step 3: Run Analyses — Execute CAE at all DOE-prescribed cases. Automation scripts (Python + Abaqus/OpenFOAM) are typical.
Step 4: Response Analysis — Use main effect plots, interaction plots, and ANOVA to quantify factor contributions. Build surrogate model if needed.
Step 5: Optimize & Validate — Search the surrogate for optimal points and run confirmation analyses to verify predictions.
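
The five steps above can be sketched end to end in a short script. This is a toy under stated assumptions: a single continuous design variable, a quadratic surrogate, and an analytic stand-in function `run_cae` in place of a real solver call. A real project would launch Abaqus, OpenFOAM, or another solver at that point.

```python
import numpy as np

rng = np.random.default_rng(42)

# Step 1: problem definition (hypothetical variable and range).
t_min, t_max = 1.2, 2.0  # A-pillar thickness in mm


def run_cae(thickness):
    """Stand-in for a solver run; returns a made-up acceleration value."""
    return (thickness - 1.7) ** 2 + 40.0


# Step 2: choose the DOE method (one-dimensional LHS with 8 samples).
n = 8
perm = rng.permutation(n) + 1
samples = t_min + (perm - rng.random(n)) / n * (t_max - t_min)

# Step 3: run the analyses at every sample point.
responses = np.array([run_cae(t) for t in samples])

# Step 4: response analysis, here a quadratic surrogate via least squares.
surrogate = np.poly1d(np.polyfit(samples, responses, deg=2))

# Step 5: optimize on the surrogate, then confirm with one more "analysis".
candidates = np.linspace(t_min, t_max, 801)
t_opt = candidates[np.argmin(surrogate(candidates))]
predicted = surrogate(t_opt)
confirmed = run_cae(t_opt)

print(f"optimum at t = {t_opt:.3f} mm")
print(f"surrogate: {predicted:.4f}, confirmation run: {confirmed:.4f}")
```

The confirmation run matters most in practice: if prediction and confirmation disagree, the surrogate needs more samples or a different model form.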


🧑‍🎓

In Step 4, what exactly does ANOVA (analysis of variance) tell us?


🎓

ANOVA shows "what percent of total response variation does Factor A explain? Factor B? Their interaction? The residual?" In a crash analysis, you might find that thickness explains 65%, material 20%, their interaction 10%, and the residual 5%. This tells you to optimize thickness first for maximum impact, and it shows where a limited analysis budget is best spent.
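
For a balanced two-level design, these percentages follow directly from sums of squares. The sketch below uses made-up responses from a $2^2$ design with two replicates per corner. The numbers are illustrative and are not meant to reproduce the 65/20/10/5 split quoted above.

```python
# Made-up responses from a replicated 2^2 design in coded units.
# Each entry is (A, B, response); think of A as thickness, B as material.
data = [
    (-1, -1, 45.0), (-1, -1, 46.0),
    (+1, -1, 50.0), (+1, -1, 51.0),
    (-1, +1, 46.0), (-1, +1, 47.0),
    (+1, +1, 57.0), (+1, +1, 58.0),
]

n_runs = len(data)
grand_mean = sum(y for _, _, y in data) / n_runs
ss_total = sum((y - grand_mean) ** 2 for _, _, y in data)


def sum_of_squares(sign):
    """Balanced two-level design: SS = (sum of sign * y)^2 / N."""
    contrast = sum(sign(a, b) * y for a, b, y in data)
    return contrast ** 2 / n_runs


ss = {
    "A": sum_of_squares(lambda a, b: a),
    "B": sum_of_squares(lambda a, b: b),
    "AxB": sum_of_squares(lambda a, b: a * b),
}
ss["residual"] = ss_total - sum(ss.values())

contribution = {name: 100.0 * value / ss_total for name, value in ss.items()}
for name, pct in contribution.items():
    print(f"{name:>8}: {pct:5.1f} %")
```

A full ANOVA would go on to compare each sum of squares with the residual through an F-test. The contribution percentages are the part most often quoted in CAE studies.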


Method Comparison Table

🧑‍🎓

So which method should I choose? What's the decision rule?


🎓

Three criteria: (1) Factor count and budget: up to 3–4 factors with few levels → full factorial. 5+ factors → LHS or orthogonal array. (2) Purpose: Robust design → Taguchi. Continuous optimization → LHS + surrogate. Quadratic model sufficient → CCD. (3) Factor nature: Mostly continuous (dimensions, temperature) → LHS. Many discrete (material types, shapes) → orthogonal array. When in doubt, LHS is the most versatile choice.


🧑‍🎓

I've heard of Bayesian optimization and adaptive DOE lately. How do they differ from classical DOE?


🎓

Classical DOE is batch-mode: "decide all sample points upfront, then run everything." Adaptive DOE (sequential DOE) is iterative: "analyze a few points → update surrogate → auto-select next point → repeat." Bayesian optimization is a representative example: it uses an acquisition function such as Expected Improvement (EI) to balance "likely optimal" regions against "uncertain" ones. With the same budget, it often reaches the optimum more efficiently.
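
Expected Improvement has a closed form when the surrogate's prediction at a point is Gaussian with mean $\mu$ and standard deviation $\sigma$. The sketch below implements the standard formula for a minimization problem. The candidate values are invented to show the trade-off, and no surrogate is actually fitted here.

```python
import math


def expected_improvement(mu, sigma, f_best):
    """EI for minimization, given a Gaussian prediction N(mu, sigma^2)
    and the best (lowest) response observed so far, f_best."""
    if sigma <= 0.0:
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (f_best - mu) * cdf + sigma * pdf


f_best = 10.0  # best response found so far (made-up)

# Three hypothetical candidate points: (predicted mean, predicted std).
ei_likely_better = expected_improvement(mu=9.5, sigma=0.1, f_best=f_best)
ei_uncertain = expected_improvement(mu=10.5, sigma=2.0, f_best=f_best)
ei_known_worse = expected_improvement(mu=10.5, sigma=0.1, f_best=f_best)

print(f"likely better: {ei_likely_better:.4f}")
print(f"uncertain:     {ei_uncertain:.4f}")
print(f"known worse:   {ei_known_worse:.6f}")
```

The point with a worse predicted mean but high uncertainty still scores well, which is how EI steers sampling toward unexplored regions while a confidently bad point is ignored.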


Written by NovaSolver Contributors
Anonymous Engineers & AI