Polynomial Chaos Expansion (PCE) Simulator

Q: Why use Hermite polynomials for a Gaussian input?

The probabilist Hermite polynomials He_k(xi) are orthogonal with respect to the standard normal weight e^(-xi^2/2)/sqrt(2*pi). That gives E[He_i He_j] = i! * delta_ij and lets us compute each coefficient as y_k = E[Y He_k] / k! independently. The orthogonality also means the output variance reduces to Var[Y] = sum_{k>=1} y_k^2 k! with no cross terms. Change the input distribution and the optimal basis changes too — the Wiener-Askey scheme is the catalogue that pairs distributions with their natural orthogonal bases.

Q: How do I get Sobol sensitivity indices from a PCE?

Each PCE coefficient y_k contributes y_k^2 * k! to the output variance (Hermite case). Group the basis functions that involve a given input (or combination of inputs), divide their variance contribution by the total variance, and you have the Sobol sensitivity index for that subset — no extra samples required. For the 1-input model Y = a1*X + a2*X^2 the ratios y_1^2 and 2*y_2^2 to the total variance are the linear and quadratic shares respectively. Compared with MC-based Sobol estimation, PCE turns it into a pure post-processing step.

For a Gaussian input X ~ N(μ, σ²) and the model Y = a₁X + a₂X², this tool builds a Hermite Polynomial Chaos Expansion in real time and reads off the mean, variance and Sobol indices analytically. Watch the coefficients y_k change and see Monte Carlo samples line up with the PCE distribution.

Parameters

Input mean μ

Mean of the input Gaussian X ~ N(μ, σ²)

Input std. deviation σ

Input spread. Larger σ amplifies the non-linear (quadratic) contribution.

PCE order P

Truncation order. P ≥ 2 is exact here because Y is quadratic.

Model linear coefficient a₁

Linear term of the response Y = a₁X + a₂X².

Model quadratic coefficient a₂

Non-linearity. Setting a₂ = 0 makes the response purely linear.

Results

—

PCE order P

—

Output mean E[Y]

—

Output variance Var[Y]

—

Output std. dev. σ_Y

—

1st-order Sobol (%)

—

2nd-order Sobol (%)

—

Input distribution → response → output distribution

Particles sampled from the input Gaussian flow through Y = f(X) and land in the output distribution on the right. The smooth curve is the PCE-built analytic distribution; the bars are a Monte Carlo histogram for comparison.

PCE coefficients y_k vs order k

Response curve Y(ξ) for ξ ∈ [-3, 3]

Theory & Key Formulas

$$Y(\xi) = \sum_{k=0}^{P} y_k\,He_k(\xi),\quad E[Y]=y_0,\quad Var[Y]=\sum_{k=1}^{P} y_k^2\,k!$$

He_k are the probabilist Hermite polynomials (He₀=1, He₁=ξ, He₂=ξ²-1, He₃=ξ³-3ξ, He₄=ξ⁴-6ξ²+3). Coefficients y_k come from an orthogonal projection against the standard Gaussian weight; the mean is y₀ and the variance is a weighted sum of y_k² with weights k!.

$$y_0 = a_1\mu + a_2(\mu^2+\sigma^2),\quad y_1 = (a_1 + 2 a_2\mu)\sigma,\quad y_2 = a_2\sigma^2$$

Closed-form result for Y = a₁X + a₂X² with X = μ + σξ, using ξ² = He₂(ξ) + 1. All coefficients with k ≥ 3 vanish exactly.

$$S_1 = \frac{y_1^2}{\mathrm{Var}[Y]},\qquad S_2 = \frac{2\,y_2^2}{\mathrm{Var}[Y]}$$

Sobol sensitivity indices directly from the Hermite-PCE: S₁ is the share of variance carried by the linear term, S₂ the share carried by the quadratic (non-linear) term.

Polynomial Chaos Expansion (PCE) and Uncertainty Propagation

🙋

"Uncertainty quantification" is just "if the input wobbles, the output wobbles", right? Why represent that with polynomials?

🎓

Exactly — when the input X is a random variable, the output Y is also random, and UQ is about propagating that randomness efficiently. PCE has an elegant trick: it writes the output as a sum of polynomials of a standard random variable ξ. Think of ξ as the input standardised to N(0,1). Then Y(ξ) = y₀ He₀ + y₁ He₁(ξ) + y₂ He₂(ξ) + …, and once you have the coefficients y_k, the mean, variance and even the full distribution shape pop right out.

🙋

Looking at the "PCE coefficients y_k" chart on the right, y₃ onwards really is zero. Is that a coincidence?

🎓

Not at all, it's the theory talking. Our model is Y = a₁X + a₂X², which only goes up to second power in X. Plug in X = μ + σξ and the highest power of ξ you can get is ξ². Rewrite ξ² as He₂(ξ) + 1 and you get y₀, y₁, y₂ as the only non-zero coefficients; k ≥ 3 is exactly zero. So in this toy problem, truncating at P = 2 is "zero truncation error". In real CAE use cases a smooth response usually converges by P = 3 to 5.

🙋

The Sobol indices read 98% and 2% — what do those numbers actually mean?

🎓

Think of them as a breakdown of the output variance. S₁ = 98% says "98% of Var[Y] is explained by the linear response to X (the a₁X part)", and S₂ = 2% says "the remaining 2% is the non-linear contribution from a₂X²". Because each PCE order contributes independently to the variance, you just take the ratio of y₁² to 2·y₂² to get those numbers. That's the big advantage over MC-based Sobol estimation, which would need tens of thousands of extra samples.

🙋

When I push σ up the S₂ share grows quickly. Why?

🎓

Good observation. y₁ = (a₁ + 2 a₂μ)σ is linear in σ, but y₂ = a₂σ² is quadratic. So if you double σ, y₁ doubles but y₂ quadruples, and the quadratic variance share (2 y₂²) pulls ahead. In CAE speak: "while input spread is small, a linear surrogate is fine, but once spread grows, you cannot ignore the non-linear terms." That intuition is exactly what you need when you're sanity-checking a design tolerance.

🙋

In a real CAE workflow the solver is a black box, not a tiny polynomial. How do you actually get the y_k?

🎓

The standard recipe is "non-intrusive PCE". You leave the solver alone, run it at a small set of carefully chosen samples (Gauss-quadrature nodes, Sobol or Latin Hypercube), and recover the y_k by least-squares regression or Galerkin projection. Twenty to a couple of hundred runs is enough, so it is feasible even when each FEM evaluation takes hours. Once the coefficients are fixed, mean, variance and Sobol indices fall out analytically — this simulator visualises that "final step".

Frequently Asked Questions

PCE is an uncertainty-quantification technique that represents an output as a finite sum of orthogonal polynomials of standard random variables. We write Y(ξ) = Σ y_k He_k(ξ) and obtain the coefficients y_k by Galerkin projection, regression or sparse sampling. The optimal basis depends on the input distribution (the Wiener-Askey scheme): Hermite for Gaussian input, Legendre for uniform, Laguerre for gamma. For smooth responses PCE delivers accurate output statistics with far fewer model evaluations (typically tens to hundreds) than Monte Carlo.

The probabilist Hermite polynomials He_k(ξ) are orthogonal with respect to the standard normal weight e^(-ξ²/2)/√(2π). That gives E[He_i He_j] = i! δ_ij and lets us compute each coefficient as y_k = E[Y He_k]/k! independently. The orthogonality also means the output variance reduces to Var[Y] = Σ_{k≥1} y_k² k! with no cross terms. Change the input distribution and the optimal basis changes too — the Wiener-Askey scheme is the catalogue that pairs distributions with their natural orthogonal bases.

Monte Carlo (MC) is simple and dimension-agnostic but converges only at the 1/√N rate, so quantifying uncertainty for an expensive CAE solver typically demands thousands of runs. If the response is smooth (well approximated by a low-degree polynomial of the inputs), a low-order PCE (P = 3 to 5) reaches the accuracy of tens of thousands of MC samples with only tens of solver calls. PCE struggles with discontinuous responses, hard thresholds or shocks; in those cases switch to MC, stratified sampling or sparse grids.

Each PCE coefficient y_k contributes y_k²·k! to the output variance (Hermite case). Group the basis functions that involve a given input (or combination of inputs), divide their variance contribution by the total variance, and you have the Sobol sensitivity index for that subset — no extra samples required. For the 1-input model Y = a₁X + a₂X² the ratios y₁² and 2·y₂² to the total variance are the linear and quadratic shares respectively. Compared with MC-based Sobol estimation, PCE turns it into a pure post-processing step.

Real-World Applications

Stochastic Finite Element Methods (SFEM): When material constants (Young's modulus, yield stress, damping), boundary loads or contact stiffnesses are uncertain, engineers want the distribution of responses such as peak stress, maximum displacement or natural frequency. The standard approach is "non-intrusive PCE": reuse the deterministic FEM solver, evaluate it at Gauss-quadrature nodes only, and regress the PCE coefficients. Replacing 10,000 MC runs with around 50 PCE evaluations is routine in automotive, aerospace and nuclear reliability assessments.

CFD and thermofluid uncertainty propagation: Engineers propagate uncertain turbulence-model coefficients (the Cμ of k-ε, for example), inlet velocities and surface roughness through to drag coefficients and temperature distributions. Because each CFD run takes hours or days, MC is essentially impossible — sparse PCE combined with Smolyak sparse grids can characterise the output distribution in 20 to 100 runs. NASA, ONERA and other agencies have many published cases in the aircraft-certification workflow.

Robust design of control and electronics: Sensor noise, resistor tolerances, temperature drift — designers need to know how badly these wobble the control performance or circuit response, without resorting to brutally pessimistic worst-case bounds. Once you have a PCE surrogate, computing Sobol indices on top is cheap, so you can pinpoint which tolerance dominates the variance and feed that into a cost-optimal tolerance allocation. Semiconductor and power-electronics teams have adopted this in recent years.

Climate, geoscience and financial engineering: Sensitivity studies of climate models, uncertain subsurface flow simulations, parameter-uncertain option pricing — large simulations with multi-dimensional random inputs all rely on PCE. In high dimension the tensor-product basis explodes, so practitioners use sparse PCE, adaptive PCE or low-rank tensor decompositions to keep only the basis functions that actually matter.

Common Misconceptions and Pitfalls

The biggest trap is using low-order PCE on a non-smooth response. PCE converges exponentially when Y(ξ) is smooth in ξ; but for buckling, impact or contact problems with discontinuities or hard thresholds, the low-order Hermite expansion oscillates like a Gibbs phenomenon. The mean may be right but the variance and tail probability can be badly wrong. The practical test: increase P by 1 and check whether the coefficient norm decays geometrically. If it doesn't, polynomial PCE is the wrong tool and you should switch to sparse grids, hybrid MC, or domain-decomposed PCE.

Next, mismatching the basis to the input distribution. The Wiener-Askey scheme tells you the optimal basis for each distribution (Gaussian → Hermite, uniform → Legendre, exponential → Laguerre, beta → Jacobi). Using Hermite on a uniform input breaks the orthogonality, mixes coefficients and ruins both convergence and accuracy. When several different distributions are involved, transform each input to its standard random variable (Rosenblatt or Nataf transform) before applying the matching basis.

Finally, extrapolating a PCE outside its training range. A PCE is an approximation around the sample cloud (typically ±3σ). Push the input five times further out and Y(ξ) — being a polynomial of ξ — will blow up, but that blow-up is mathematical, not physical. For tail probabilities and extreme-value studies, declare the PCE's region of validity explicitly and supplement it with MC or importance sampling outside that region.

Polynomial Chaos Expansion (PCE) and Uncertainty Propagation

Frequently Asked Questions

Real-World Applications

Common Misconceptions and Pitfalls

How to Use

Worked Example

Practical Notes