LU Decomposition Simulator — Direct Solver for Ax=b
Factor a 3x3 matrix into L and U and solve Ax=b by forward substitution Ly=b and back substitution Ux=y. Vary the diagonals and the right-hand side to see determinant, solution and residual.
Determinant and residual. det(A) is the signed product of U's diagonal; the residual measures numerical accuracy:
$$\det(A) = (-1)^{s}\prod_i U_{ii}, \qquad r = \|A\,x - b\|_2$$
Cost is O(n^3) for the factorization but only O(n^2) per right-hand side once L and U are formed — the practical advantage in FEM multi-load-case analyses.
What is the LU Decomposition Simulator
🙋
Solving a system of linear equations is just like the elimination method we learned in school, right? Does a computer really do the same thing?
🎓
Fundamentally yes — it is Gaussian elimination. In practice we save the intermediate work as two matrices L and U, which is LU decomposition. Look at the second canvas above. The green L records the "multiplier used for each row during elimination", and the blue U is "the upper-triangular shape after elimination". Their product reconstructs A.
🙋
Why bother decomposing? If you can solve it once, isn't that enough?
🎓
That is exactly the trick. In FEM linear static analysis you often want to solve the same stiffness matrix K against 10 or 100 different load vectors. Once you have L and U, each new right-hand side only needs forward and back substitution. The factorization is O(n^3), but each subsequent solve is O(n^2). For a 1000-equation system you can solve roughly 1000 right-hand sides for the price of one factorization.
🙋
What is "partial pivoting"? I see it mentioned under the canvas.
🎓
If the pivot you divide by is zero, elimination breaks; if it is tiny, round-off errors explode. So at each step we swap rows to bring the largest absolute value to the diagonal, then eliminate. Try setting A[1,1] to 1 and b[1] to -20 in the simulator — a row swap kicks in, and the solver still produces a correct answer.
🙋
If the residual card is almost zero, does that prove the calculation is right?
🎓
Yes — the residual is the norm of A times x minus b. In exact arithmetic it is zero; in double precision you typically see something around 10^-10. If you ever see 1 or 10, either the matrix is nearly singular (huge condition number) or there is a bug. Always check the residual as a self-test of your numerical solution.
Frequently Asked Questions
Both are LU decomposition variants. Doolittle fixes the diagonal of L to 1 (so U holds the pivots), while Crout fixes the diagonal of U to 1. This tool uses Doolittle, which is the convention in most textbooks. The numerical results are essentially equivalent; the choice is mostly about memory layout and the order in which the entries are computed.
The condition number cond(A)=||A||*||A^-1|| measures how much input error is amplified into the solution. A rough rule is that for a condition number of 10^k, you lose about k digits of accuracy in double precision (which has roughly 16). When the condition number exceeds 10^16 you lose all accuracy. In practice people use iterative refinement, scaling, or reformulate the problem to control conditioning.
Large FEM and CFD matrices are sparse (more than 99% zero). A naive LU produces "fill-in" — new non-zeros at positions that were zero — and quickly exhausts memory. In practice we use sparse LU libraries such as SuperLU, PARDISO, or MUMPS combined with fill-in reducing orderings (AMD, METIS) that reorder the rows and columns first. Beyond a few million unknowns, iterative solvers (CG, GMRES, algebraic multigrid) typically win.
For linear elasticity the stiffness matrix K is symmetric positive definite, and the first choice is Cholesky (K=L*L^T), which costs roughly half of a general LU in both memory and operations. For contact, plasticity, or coupled fluid-structure problems where K is non-symmetric, general LU is used. In every case the win of the direct method is reuse: factor once, then solve cheaply for many right-hand sides.
Real-World Applications
FEM linear static analysis: The core of structural solvers (NASTRAN, Abaqus, ANSYS, Marc, etc.) is a direct solve against the stiffness matrix. Cholesky is used when symmetric positive definite, general LU otherwise. Reuse pays off when the same K is solved against many load cases (gravity, wind, seismic, thermal). Sparse implementations such as SuperLU, PARDISO, and MUMPS are the industry standard and scale to tens of millions of degrees of freedom.
Circuit analysis (SPICE): Nodal equations G*v=i, where G is a sparse symmetric nodal admittance matrix, are commonly solved with LU. SPICE-class simulators solve a linearized system at each integration step, and reuse of the LU factor across steps where the matrix pattern is unchanged gives huge speedups.
Normal equations in optimization and machine learning: The normal equations (X^T X)*beta = X^T y for linear least squares and (K + sigma^2 I)*alpha = y for Gaussian process regression involve symmetric positive definite matrices and are solved with Cholesky (a specialized LU). Efficient factorization implementations directly drive performance in batch training and Bayesian inference.
Lyapunov equations in control: The continuous-time Lyapunov equation A^T P + P A = -Q used in stability analysis is solved by Bartels-Stewart: first reduce A to Schur form, then substitute through a triangular system — essentially the same idea as LU. Control toolboxes such as MATLAB Control Toolbox implement this routinely.
Common Misconceptions and Cautions
The most common misconception is to view LU decomposition as a way to compute the inverse A^-1. In fact you almost never need the explicit inverse; LU is enough to solve Ax=b. Forming the inverse is roughly three times more expensive in memory and operations and is less accurate. This is the reason MATLAB recommends A\b (the backslash operator, internally LU) over inv(A)*b. The tool above never forms an inverse — it gets the answer purely from two triangular solves Ly = Pb and Ux = y.
The next common error is to believe that partial pivoting alone guarantees numerical stability. Partial pivoting prevents the diagonal from becoming too small, but if the matrix itself is poorly conditioned (almost singular), the accuracy of the solution still degrades. In the simulator, try setting A[1,1]=A[2,2]=A[3,3]=1: the determinant shrinks and the residual tends to grow. In production work people estimate the condition number (rcond) in advance and use iterative refinement to recover accuracy.
Finally, beware of missing the O(n^3) cost and applying direct methods to large problems. A 3x3 system is instantaneous, but factoring a million-equation FEM matrix as dense LU would consume astronomical memory and time. Even sparse libraries such as SuperLU and PARDISO are often outperformed by iterative solvers (CG, GMRES, algebraic multigrid) beyond a few million unknowns. Always check "symmetric?", "positive definite?", "sparse?", and "how many unknowns?" before choosing a solver. This tool stops at a 3x3 system, but behind it lies more than half a century of numerical-linear-algebra know-how.