What is LU decomposition?

LU decomposition factors a square matrix A into a unit lower triangular L and an upper triangular U. In Doolittle's variant the diagonal of L is one, while Crout's variant fixes the diagonal of U. Once L and U are computed, multiple right-hand sides Ax=b sharing the same A can be solved cheaply by forward substitution Ly=b and back substitution Ux=y, without rerunning Gaussian elimination. The factorization itself costs O(n^3), but each subsequent solve costs only O(n^2).

Why is partial pivoting necessary?

If the pivot (the diagonal entry used to eliminate the column) is zero or very small, dividing by it produces overflow or huge round-off error. Partial pivoting swaps the current row with the row that has the largest absolute value in the current column before eliminating, written as P*A=L*U with a permutation matrix P. Full pivoting (swapping both rows and columns) is more stable but costlier, so partial pivoting is the practical standard in production solvers.

How is it different from Cholesky decomposition?

Cholesky decomposition is a specialized LU for symmetric positive definite matrices, writing A=L*L^T. It uses about half the memory and half the operations of a general LU and is numerically very stable. Stiffness matrices in linear elasticity, normal equations in least squares and covariance matrices in Gaussian processes are typical use cases. The 3x3 matrices in this tool are not necessarily symmetric positive definite, so we use general LU with partial pivoting.

Is LU decomposition used in the finite element method (FEM)?

Yes. Linear static analysis solves K*u=f, where K is the stiffness matrix. For symmetric positive definite K, Cholesky is the direct solver of choice; for non-symmetric K, LU is used. For very large problems the matrix is sparse, so sparse LU implementations (SuperLU, PARDISO, MUMPS) combined with fill-in reducing orderings (AMD, METIS) are used, and for even larger problems iterative solvers (CG, GMRES, algebraic multigrid) take over. The big win of LU and Cholesky in FEM is reuse: factor once, solve cheaply for many load cases.

LU Decomposition Simulator — Free Online Calculator

Parameters

A[1,1] (diagonal)

—

A[2,2] (diagonal)

—

A[3,3] (diagonal)

—

b[1] (right-hand side)

—

Fixed off-diagonals: A[1,2]=3, A[1,3]=2, A[2,1]=1, A[2,3]=1, A[3,1]=2, A[3,2]=1. Fixed b[2]=8, b[3]=13.

Results

—

Determinant det(A)

—

Solution x_1

—

Solution x_2

—

Residual ‖Ax-b‖_2

Matrix A and right-hand side b

Left = coefficient matrix A / Right = right-hand side b (blue cells = off-diagonal, green cells = diagonal/variable)

L*U Factors and Forward/Back Substitution

Green = lower triangular L (unit diagonal) / Blue = upper triangular U / Yellow = intermediate y = L^-1*b / Orange = solution x = U^-1*y

Theory & Key Formulas

LU decomposition factors a square matrix A into a unit lower triangular L and an upper triangular U. With partial pivoting we write P*A = L*U.

Factorization (Doolittle variant, L has unit diagonal):

$$P \cdot A = L \cdot U, \quad L_{ii} = 1$$

Ax=b is then solved by two triangular systems: forward substitution for y, back substitution for x:

$$L\,y = P\,b \quad\Longrightarrow\quad U\,x = y$$

Determinant and residual. det(A) is the signed product of U's diagonal; the residual measures numerical accuracy:

$$\det(A) = (-1)^{s}\prod_i U_{ii}, \qquad r = \|A\,x - b\|_2$$

Cost is O(n^3) for the factorization but only O(n^2) per right-hand side once L and U are formed — the practical advantage in FEM multi-load-case analyses.

What is the LU Decomposition Simulator

🙋

Solving a system of linear equations is just like the elimination method we learned in school, right? Does a computer really do the same thing?

🎓

Fundamentally yes — it is Gaussian elimination. In practice we save the intermediate work as two matrices L and U, which is LU decomposition. Look at the second canvas above. The green L records the "multiplier used for each row during elimination", and the blue U is "the upper-triangular shape after elimination". Their product reconstructs A.

🙋

Why bother decomposing? If you can solve it once, isn't that enough?

🎓

That is exactly the trick. In FEM linear static analysis you often want to solve the same stiffness matrix K against 10 or 100 different load vectors. Once you have L and U, each new right-hand side only needs forward and back substitution. The factorization is O(n^3), but each subsequent solve is O(n^2). For a 1000-equation system you can solve roughly 1000 right-hand sides for the price of one factorization.

🙋

What is "partial pivoting"? I see it mentioned under the canvas.

🎓

If the pivot you divide by is zero, elimination breaks; if it is tiny, round-off errors explode. So at each step we swap rows to bring the largest absolute value to the diagonal, then eliminate. Try setting A[1,1] to 1 and b[1] to -20 in the simulator — a row swap kicks in, and the solver still produces a correct answer.

🙋

If the residual card is almost zero, does that prove the calculation is right?

🎓

Yes — the residual is the norm of A times x minus b. In exact arithmetic it is zero; in double precision you typically see something around 10^-10. If you ever see 1 or 10, either the matrix is nearly singular (huge condition number) or there is a bug. Always check the residual as a self-test of your numerical solution.

Frequently Asked Questions

Both are LU decomposition variants. Doolittle fixes the diagonal of L to 1 (so U holds the pivots), while Crout fixes the diagonal of U to 1. This tool uses Doolittle, which is the convention in most textbooks. The numerical results are essentially equivalent; the choice is mostly about memory layout and the order in which the entries are computed.

The condition number cond(A)=||A||*||A^-1|| measures how much input error is amplified into the solution. A rough rule is that for a condition number of 10^k, you lose about k digits of accuracy in double precision (which has roughly 16). When the condition number exceeds 10^16 you lose all accuracy. In practice people use iterative refinement, scaling, or reformulate the problem to control conditioning.

Large FEM and CFD matrices are sparse (more than 99% zero). A naive LU produces "fill-in" — new non-zeros at positions that were zero — and quickly exhausts memory. In practice we use sparse LU libraries such as SuperLU, PARDISO, or MUMPS combined with fill-in reducing orderings (AMD, METIS) that reorder the rows and columns first. Beyond a few million unknowns, iterative solvers (CG, GMRES, algebraic multigrid) typically win.

For linear elasticity the stiffness matrix K is symmetric positive definite, and the first choice is Cholesky (K=L*L^T), which costs roughly half of a general LU in both memory and operations. For contact, plasticity, or coupled fluid-structure problems where K is non-symmetric, general LU is used. In every case the win of the direct method is reuse: factor once, then solve cheaply for many right-hand sides.

Real-World Applications

FEM linear static analysis: The core of structural solvers (NASTRAN, Abaqus, ANSYS, Marc, etc.) is a direct solve against the stiffness matrix. Cholesky is used when symmetric positive definite, general LU otherwise. Reuse pays off when the same K is solved against many load cases (gravity, wind, seismic, thermal). Sparse implementations such as SuperLU, PARDISO, and MUMPS are the industry standard and scale to tens of millions of degrees of freedom.

Circuit analysis (SPICE): Nodal equations G*v=i, where G is a sparse symmetric nodal admittance matrix, are commonly solved with LU. SPICE-class simulators solve a linearized system at each integration step, and reuse of the LU factor across steps where the matrix pattern is unchanged gives huge speedups.

Normal equations in optimization and machine learning: The normal equations (X^T X)*beta = X^T y for linear least squares and (K + sigma^2 I)*alpha = y for Gaussian process regression involve symmetric positive definite matrices and are solved with Cholesky (a specialized LU). Efficient factorization implementations directly drive performance in batch training and Bayesian inference.

Lyapunov equations in control: The continuous-time Lyapunov equation A^T P + P A = -Q used in stability analysis is solved by Bartels-Stewart: first reduce A to Schur form, then substitute through a triangular system — essentially the same idea as LU. Control toolboxes such as MATLAB Control Toolbox implement this routinely.

Common Misconceptions and Cautions

The most common misconception is to view LU decomposition as a way to compute the inverse A^-1. In fact you almost never need the explicit inverse; LU is enough to solve Ax=b. Forming the inverse is roughly three times more expensive in memory and operations and is less accurate. This is the reason MATLAB recommends A\b (the backslash operator, internally LU) over inv(A)*b. The tool above never forms an inverse — it gets the answer purely from two triangular solves Ly = Pb and Ux = y.

The next common error is to believe that partial pivoting alone guarantees numerical stability. Partial pivoting prevents the diagonal from becoming too small, but if the matrix itself is poorly conditioned (almost singular), the accuracy of the solution still degrades. In the simulator, try setting A[1,1]=A[2,2]=A[3,3]=1: the determinant shrinks and the residual tends to grow. In production work people estimate the condition number (rcond) in advance and use iterative refinement to recover accuracy.

Finally, beware of missing the O(n^3) cost and applying direct methods to large problems. A 3x3 system is instantaneous, but factoring a million-equation FEM matrix as dense LU would consume astronomical memory and time. Even sparse libraries such as SuperLU and PARDISO are often outperformed by iterative solvers (CG, GMRES, algebraic multigrid) beyond a few million unknowns. Always check "symmetric?", "positive definite?", "sparse?", and "how many unknowns?" before choosing a solver. This tool stops at a 3x3 system, but behind it lies more than half a century of numerical-linear-algebra know-how.

LU Decomposition Simulator — Direct Solver for Ax=b

What is the LU Decomposition Simulator

Frequently Asked Questions

Real-World Applications

Common Misconceptions and Cautions

Related Tools