Numerical Methods for CAE
FEM, FVM, FDM Theory and Solver Selection
Every CAE software — Abaqus, Ansys, OpenFOAM, COMSOL — is ultimately a large numerical computation engine. Understanding the mathematical methods beneath the GUI doesn't just satisfy academic curiosity: it determines whether you interpret results correctly, recognize when a simulation is going wrong, and know how to fix it efficiently.
1. PDE Classification and Numerical Methods Overview
I keep hearing "parabolic," "elliptic," and "hyperbolic" PDEs. Do these classifications actually matter for choosing a solver, or is it just academic?
It absolutely matters — the classification determines the mathematical character of the solution, which in turn dictates appropriate numerical methods. Elliptic problems (like steady-state heat conduction) have smooth solutions where every point depends on every other point simultaneously — you must solve the whole domain at once. Parabolic problems (like transient heat) propagate smoothly forward in time — you can march step by step. Hyperbolic problems (like shock waves) propagate at finite speed with possible discontinuities — you need upwind schemes that respect wave direction. Use the wrong method for the problem type and you get instability, false diffusion, or wrong wave speeds.
1.1 PDE Classification for a 2nd-Order PDE
For $A u_{xx} + 2B u_{xy} + C u_{yy} + \ldots = 0$, the discriminant $\Delta = B^2 - AC$ determines:
| Discriminant | Type | Prototype equation | Physical analogy |
|---|---|---|---|
| Δ < 0 (B² < AC) | Elliptic | $\nabla^2 u = f$ (Poisson) | Steady-state heat, electrostatics, elastic equilibrium |
| Δ = 0 | Parabolic | $u_t = \alpha\nabla^2 u$ (diffusion) | Transient heat, mass diffusion, viscous flow |
| Δ > 0 (B² > AC) | Hyperbolic | $u_{tt} = c^2\nabla^2 u$ (wave) | Sound waves, structural dynamics, compressible shocks |
1.2 Methods Overview
| Method | Acronym | Mesh type | Conservation | Primary domain |
|---|---|---|---|---|
| Finite Difference | FDM | Structured (Cartesian) | Not automatic | Simple geometry, wave propagation, research codes |
| Finite Element | FEM | Unstructured (any) | Weak form | Structural, electromagnetic, coupled physics |
| Finite Volume | FVM | Unstructured (any) | Cell-exact | CFD (Fluent, OpenFOAM, STAR-CCM+) |
| Boundary Element | BEM | Surface only | — | Acoustics, unbounded EM, potential flow |
| Spectral Element | SEM | Structured | Spectral accuracy | Computational fluid dynamics research, seismic |
2. Finite Difference Method (FDM)
FDM seems simpler than FEM — why don't CFD codes just use it instead?
FDM is indeed simpler and gives very efficient codes on structured Cartesian grids — that's why FDTD (finite difference time domain) is still king in electromagnetic wave simulation, and many research CFD codes use it. The problem is that most engineering geometries — car bodies, pipe junctions, turbine blades — are not nice rectangular boxes. Fitting a structured Cartesian grid to an arbitrary geometry requires body-fitted coordinates or immersed boundary methods, which add complexity. FVM naturally handles unstructured meshes with triangles, tetrahedra, and arbitrary polyhedra, making it far easier to mesh complex CAD geometries. That's the main reason CFD uses FVM.
2.1 Finite Difference Approximations
From Taylor series expansion, the first derivative:
Second derivative:
2.2 Stability Analysis: Von Neumann Method
For the 1D diffusion equation $u_t = \alpha u_{xx}$ with explicit FDM (FTCS):
Von Neumann stability analysis: assume $u_i^n = g^n e^{i\xi x_i}$. The amplification factor must satisfy $|g| \leq 1$:
This CFL-like stability condition is the fundamental constraint of explicit time integration: halving the grid spacing requires quartering the time step — a severe restriction for 3D fine meshes.
2.3 1D Advection Equation — Upwind Schemes
For $u_t + c\,u_x = 0$ (wave speed c > 0), central differences give an unstable scheme. The first-order upwind scheme is stable:
The CFL number ν must be ≤ 1 for stability — information cannot travel more than one cell per time step.
3. Finite Element Method (FEM) — Mathematical Foundations
I understand the basic concept of FEM — divide into elements, solve locally, assemble globally. But what's the mathematical justification? Why does this actually work and give the right answer?
The justification is the Galerkin method: instead of satisfying the PDE exactly at every point, we require it to be satisfied on average — weighted by test functions. The beautiful result from functional analysis is that among all possible approximations within our finite-dimensional function space (spanned by the shape functions), the Galerkin solution is the best possible in terms of the energy norm. It minimizes the projection error orthogonally. As you add more elements or use higher-order polynomials, the solution space grows and the approximation error provably converges to zero at predictable rates. That's the rigorous foundation.
3.1 The Variational (Ritz-Galerkin) Foundation
For the model problem $-\nabla\cdot(k\nabla u) = f$ with $u = 0$ on boundary:
Minimization form: The FEM solution minimizes the potential energy functional:
Galerkin (weak) form: Find $u^h \in V^h$ such that $\forall v^h \in V^h$:
3.2 Galerkin Orthogonality and Best Approximation
The Galerkin error $e = u - u^h$ satisfies:
This "Galerkin orthogonality" means the error is orthogonal to the approximation space in the energy inner product. Cea's lemma then gives the best approximation property:
3.3 Convergence Rates
For polynomial order p elements and smooth solutions, the h-convergence rate:
where h is the mesh size and s is the smoothness of the exact solution. For smooth problems with linear elements (p=1): energy error converges as h¹, L₂ error as h².
3.4 Element Types and Polynomial Orders
| Element | Nodes | Polynomial order | Best for |
|---|---|---|---|
| Linear tetrahedral (Tet4) | 4 | p=1 | Complex geometry; low accuracy per DOF; avoid for bending |
| Quadratic tetrahedral (Tet10) | 10 | p=2 | Industry standard for curved geometries |
| Linear hexahedral (Hex8) | 8 | p=1 | Structured regions; beware shear locking |
| Serendipity hex (Hex20) | 20 | p=2 | High accuracy; best efficiency for structural |
| Linear wedge (Penta6) | 6 | p=1 | Thin layers (boundary layers, laminates) |
| Linear shell (S4R) | 4 | p=1 (in-plane) | Thin-walled structures; efficient |
3.5 Locking Phenomena and Remedies
Low-order elements can exhibit artificial stiffness ("locking") in special cases:
| Locking type | When it occurs | Remedy |
|---|---|---|
| Shear locking | Linear elements in bending-dominated problems | Reduced integration (RI), incompatible modes (Q6), higher-order elements |
| Volumetric locking | Nearly incompressible materials (ν → 0.5), plasticity | Mixed formulation (u-p elements), reduced integration, B-bar method |
| Membrane locking | Curved shell elements | Assumed Natural Strain (ANS), MITC elements |
4. Finite Volume Method (FVM)
I've heard FVM is "conservative" while FDM might not be. What does conservation mean in this context and why does it matter for CFD?
Conservation means the numerical method exactly balances fluxes across each cell — whatever flows out of one cell flows exactly into the adjacent cell, with no numerical leakage. This is critical in CFD because it ensures total mass, momentum, and energy are preserved to machine precision, regardless of mesh size. In a simulation of a flow through a duct, a non-conservative scheme might accumulate a net mass error that grows over time, eventually giving completely wrong answers. FVM achieves this automatically because it integrates the conservation law in integral form over each cell — the cell-face fluxes cancel exactly when you sum over adjacent cells.
4.1 FVM Discretization
Integrating a conservation law $\partial_t(\rho\phi) + \nabla\cdot(\mathbf{F}) = S$ over a control volume $V_P$:
Approximating as:
The challenge: computing the face flux $\mathbf{F}_f$ from cell-center values requires interpolation — the choice of interpolation scheme determines accuracy, stability, and conservation.
4.2 Gradient Reconstruction
Cell-center gradients are needed for the viscous fluxes and for second-order interpolation to faces. Least-squares gradient reconstruction:
where sum is over all face neighbors N, and $\mathbf{w}_{PN}$ are weights. This is more accurate than the Green-Gauss theorem approach on skewed meshes.
4.3 Flux Limiters for Convection
High-order (2nd+ order) convection schemes can produce oscillations (Gibbs phenomenon) near sharp gradients. TVD (Total Variation Diminishing) limiters enforce monotonicity:
Common limiters: Minmod ($\psi = \max(0, \min(1, r))$), Van Leer ($\psi = (r+|r|)/(1+|r|)$), Superbee. Each offers different tradeoffs between accuracy and monotonicity.
5. Linear System Solvers (Direct vs. Iterative)
My Abaqus analysis says it's spending most of its time in the "direct solver" phase. What is a direct solver doing, and should I switch to an iterative solver?
A direct solver performs LU or Cholesky factorization — essentially Gaussian elimination with pivoting — and gives the exact solution (within floating-point precision) in one pass. It's robust and doesn't need tuning, but memory and time scale badly: for n unknowns, cost is roughly O(n^1.5) for 2D problems and O(n^2) for 3D — that's why a 10 million DOF model can take hours and 100 GB of RAM. Iterative solvers work differently: start with an initial guess and improve it iteratively using matrix-vector products, which only cost O(n) memory. For very large 3D models (millions of DOFs), preconditioned iterative solvers (PCG, GMRES, AMG) are much faster. Switch to iterative if: you have millions of DOFs, the problem is well-conditioned, and the solver supports it.
5.1 Direct Solvers: Cholesky Factorization
For symmetric positive definite (SPD) systems (elastic FEM without rigid body motion): $\mathbf{K} = \mathbf{L}\mathbf{L}^T$.
Steps: (1) Analyze fill-in pattern → optimal reordering (AMD, nested dissection). (2) Symbolic factorization → allocate sparse storage. (3) Numerical factorization O(n_fill). (4) Forward/back substitution for each RHS.
| Direct Solver | Matrix type | Used in |
|---|---|---|
| MUMPS | General sparse, parallel | OpenFOAM (optional), Code_Aster |
| PARDISO | SPD and general, parallel | Abaqus, Ansys, COMSOL |
| SuperLU | General sparse, sequential/distributed | CalculiX, research codes |
| MA57/MA97 | Symmetric, parallel | Abaqus (classic) |
5.2 Iterative Solvers: Krylov Methods
Krylov methods build an approximation from the Krylov subspace $\mathcal{K}_k = \text{span}\{r_0, \mathbf{A}r_0, \mathbf{A}^2r_0, \ldots, \mathbf{A}^{k-1}r_0\}$:
- CG (Conjugate Gradient): For SPD systems. Converges in at most n steps; with good preconditioning, much fewer. Computational cost: 1 matrix-vector product per iteration.
- GMRES: For general (non-symmetric) systems. Minimizes residual over Krylov subspace. More memory intensive (stores full Krylov basis), often restarted as GMRES(m).
- BiCGSTAB: For non-symmetric systems; fixed memory cost per iteration; less stable than GMRES but cheaper.
5.3 Preconditioning
Convergence rate depends on the condition number $\kappa(\mathbf{A}) = \lambda_{\max}/\lambda_{\min}$. Preconditioning replaces $\mathbf{A}\mathbf{x} = \mathbf{b}$ with:
| Preconditioner | Quality | Cost per iteration | Parallel scaling |
|---|---|---|---|
| Jacobi (diagonal) | Poor (for structured meshes) | Very cheap | Perfect |
| ILU(0) | Moderate | Cheap | Moderate |
| ILU(k) | Good (larger fill k) | Moderate | Moderate |
| AMG (Algebraic Multigrid) | Excellent (mesh-size independent) | Moderate | Good |
| ILU + domain decomposition | Good | Moderate | Excellent |
Algebraic Multigrid (AMG) is the state-of-the-art preconditioner for FEM structural problems — it achieves nearly O(n) computational complexity. Ansys uses MEBA (Mechanical AMG), OpenFOAM uses GAMG for pressure equations.
6. Eigenvalue Solvers
For modal analysis of a 5 million DOF model, the solver finds 100 modes in a few minutes. How does it do that without computing the full n×n eigenproblem?
The key insight is that you only need the lowest eigenvalues — the first few dozen modes that actually have physical significance. The Lanczos algorithm does this beautifully: it builds a small tridiagonal matrix (say 200×200) by iterating with matrix-vector products involving K and M. The eigenvalues of this small matrix converge to the lowest eigenvalues of the original n×n problem. The accuracy improves as you extend the Lanczos sequence. For a 5-million DOF model, each Lanczos step costs one K-factorization (done once) plus vector operations — so computing 100 modes might cost the equivalent of 150 linear solves. That's exactly what Abaqus/Ansys do internally.
6.1 The Lanczos Algorithm
For $(\mathbf{K} - \lambda\mathbf{M})\boldsymbol{\phi} = 0$, the shift-and-invert form avoids instability for clustered or zero eigenvalues:
After m steps, the original eigenvalues near shift σ are approximated by eigenvalues of the m×m tridiagonal matrix. Convergence of the extreme Ritz values is rapid (super-linear for well-separated eigenvalues).
6.2 AMLS and Component Mode Synthesis
For models with millions of DOFs where even Lanczos is too expensive, AMLS (Automated Multi-Level Substructuring) partitions the model into substructures, computes local eigenmodes, and assembles a reduced-order model. This can reduce a 10M DOF problem to a 50,000 DOF eigenvalue problem, with less than 1% error for the lowest modes. Abaqus, Nastran, and Ansys all offer this option.
7. Numerical Accuracy and Error Management
How do I know if my FEM result is accurate enough? My manager always asks "is this mesh converged?" but I'm not sure how to answer rigorously.
The rigorous answer is: run the analysis at least three times with systematically refined meshes (say, element size h, h/2, h/4) and check that your quantity of interest (peak stress, natural frequency, displacement) is converging. If the change from level 2 to level 3 is less than your required accuracy (say 5%), you're done. For a more formal estimate, Richardson extrapolation gives you the exact solution in the limit — and the Grid Convergence Index (GCI) gives a confidence interval. Checking a single mesh is not sufficient; you need at least two levels of refinement to demonstrate convergence.
7.1 Richardson Extrapolation and GCI
Given results on three successively refined meshes with refinement ratio r:
Safety factor $F_s = 1.25$ (recommended by ASME V&V standards). GCI provides a reliable error bound for the discretization error.
7.2 Sources of Error in CAE
| Error source | Type | Mitigation |
|---|---|---|
| Mesh discretization error | Truncation (goes to 0 with refinement) | Mesh convergence study, higher-order elements |
| Time discretization error | Truncation (goes to 0 with smaller Δt) | Time step convergence study |
| Round-off / floating-point | Grows with n for ill-conditioned systems | Check condition number; use double precision |
| Model idealization error | Not reduced by refinement | Geometry simplification validation; compare to test |
| Material parameter uncertainty | Input error | Sensitivity study; use measured data, not handbook values |
| Convergence error (iterative) | Solver residual | Reduce tolerance; monitor physical quantity, not just residual |
7.3 Condition Number and Ill-Conditioned Systems
The condition number of the stiffness matrix:
For structural FEM with linear elements: $\kappa \sim (L/h)^2$ — scales with the square of the mesh fineness. For double precision (64-bit), significant digits lost = $\log_{10}(\kappa)$. With κ = 10⁸ and 15 significant digits, you have 7 digits of accuracy — usually fine. With κ = 10¹² (very fine mesh or near-singular problem), only 3 digits remain — dangerous.
8. Solver Selection Guide
When I open Abaqus, I see Abaqus/Standard and Abaqus/Explicit. And OpenFOAM has like 20 different solver options. How do I choose?
The fundamental split in structural analysis: implicit (Standard) vs. explicit (Explicit). Implicit is better for: static, quasi-static, low-frequency dynamics, anything requiring large time steps — Abaqus/Standard, Ansys Mechanical. Explicit is better for: high-speed impact, crash, metal forming, any problem with severe contact or very short event duration — Abaqus/Explicit, LS-DYNA. For CFD: steady incompressible → SIMPLE-based (OpenFOAM simpleFoam, Fluent pressure-based steady). Transient incompressible → PISO or PIMPLE (OpenFOAM pimpleFoam). Compressible → density-based (OpenFOAM rhoCentralFoam, Fluent density-based).
8.1 Structural Analysis Solver Selection
| Analysis type | Recommended approach | Solver examples |
|---|---|---|
| Linear static | Direct sparse (Pardiso, MA57) | Abaqus/Standard, Ansys Mechanical, CalculiX |
| Nonlinear static | N-R + direct solver | Abaqus/Standard, Marc, Code_Aster |
| Modal analysis | Lanczos (Abaqus), PCG Lanczos (Ansys) | Abaqus, Ansys, Nastran |
| Transient, low frequency | Implicit HHT-α; AMG preconditioned | Abaqus/Standard Dynamic, Ansys Transient |
| Impact/crash | Explicit central difference; small Δt | LS-DYNA, Abaqus/Explicit, PAM-CRASH |
| Large-scale (10M+ DOF) | Iterative + AMG; consider domain decomp. | Ansys Mechanical (MEBA), OpenRadioss |
8.2 CFD Solver Selection
| Flow type | Algorithm | Solver |
|---|---|---|
| Steady incompressible RANS | SIMPLE/SIMPLEC | simpleFoam, Fluent steady pressure-based |
| Transient incompressible | PISO/PIMPLE | pimpleFoam, pisoFoam |
| Natural convection | SIMPLE + Boussinesq | buoyantSimpleFoam, Fluent steady |
| Subsonic compressible (Ma<0.3) | Density-based or low-Ma correction | rhoPimpleFoam, Fluent |
| Transonic/supersonic | Density-based (Roe, AUSM) | rhoCentralFoam, Fluent density-based |
| LES/DES | Explicit or low-storage Runge-Kutta | pimpleFoam (large Δt), Fluent LES |
- PDE type (elliptic/parabolic/hyperbolic) determines appropriate numerical methods
- FEM: variational foundation; converges provably at h^p rate; best for structural/EM
- FVM: exact cell conservation; handles arbitrary meshes; industry standard for CFD
- Direct solvers: O(n^1.5–2) cost, robust; iterative + AMG: O(n) cost, requires tuning
- Mesh convergence study (GCI) is the only rigorous way to quantify discretization error
- Implicit integration for low-frequency dynamics; explicit for impact/crash
Related Interactive Tools
- Mesh Convergence (GCI) Calculator — Richardson extrapolation and grid convergence index
- Truncation Error Visualizer — Compare FD scheme accuracy orders
- 1D Heat Diffusion (FTCS) — Explore stability limit with CFL number
- Uncertainty Propagation — Monte Carlo error propagation for CAE inputs