Newton-Raphson Simulator — Roots of Nonlinear Equations
Visualize the Newton iteration x_{n+1}=x_n-omega*f(x_n)/f'(x_n) on f(x)=x^3-2x-5. Watch tangent lines, the log10 error curve, and tune the relaxation factor.
Convergence criterion. epsilon (tolerance) = 1e^-n where n is the tolerance slider.
What is the Newton-Raphson Method
It solves f(x)=0 by linearizing the function at the current estimate and using that tangent line to pick the next estimate. It is the workhorse iteration inside virtually every nonlinear CAE solver.
🙋
When I set the initial guess to 2 it converges in three iterations, but moving it to -2 makes the simulator blow up. Same equation, totally different behavior — what's going on?
🎓
Newton-Raphson follows the tangent line. Where the slope f'(x) is small, that tangent becomes nearly horizontal and the next x-intercept can fly far away. For f(x)=x^3-2x-5 the derivative f'(x)=3x^2-2 vanishes at x = plus-minus sqrt(2/3), about 0.816. Starting from x_0=-2 the iteration crosses that unstable strip, so the trajectory ricochets. Drag the initial-value slider and watch the log10 error curve — you'll cleanly see the difference between a monotonic decay and a chaotic-looking spike.
🙋
So if I pull the omega slider down to 0.5, does that rescue a bad initial guess?
🎓
Often yes. Damping each step by omega slows the runaway updates and lets the iteration recover. The same idea is built into Abaqus, ANSYS, and OpenSees as "damped Newton" or "line search". The catch: small omega kills the quadratic convergence. Compare omega=1.0 versus omega=0.3 on the log10 error plot — at 1.0 you see the curve bend (each iteration roughly doubles the number of correct digits), while at 0.3 the slope is almost linear.
🙋
If I crank the tolerance down to 1e-12, do iterations explode? Is it even meaningful to be that strict?
🎓
Almost free, actually. Quadratic convergence means going from 1e-6 to 1e-12 typically costs one or two extra iterations. The real wall is double-precision machine epsilon, about 2.2e-16, and modeling errors. In practice CAE engineers pick 1e-6 for linear FEA and 1e-4 to 1e-5 (relative residual) for nonlinear runs, because anything tighter is invisible behind discretization and material-law uncertainty.
Physical Model and Key Equations
Newton-Raphson is a first-order Taylor linearization of f(x) about the current iterate x_n, solved for the point where the tangent crosses zero.
Near the root the number of correct digits roughly doubles every iteration — the headline feature of Newton's method.
Real-World Applications
Nonlinear FEA (CAE): Each load increment solves the equilibrium R(u)=F_int(u)-F_ext=0 by Newton-Raphson. The tangent stiffness K=dR/du plays the role of f'(x). Plasticity, contact, and large-deformation problems all live or die by this iteration.
SPICE circuit simulation: Diode and MOSFET nonlinearities make the DC operating-point equations g(V)=0 nonlinear. Newton-Raphson on the Jacobian (the small-signal admittance matrix) is the default. Bad initial guesses cause the famous "Newton failed to converge" message.
Optimization and ML: Setting the gradient grad f(x)=0 turns optimization into root finding. Pure Newton with the Hessian gives quadratic convergence; quasi-Newton methods (BFGS, L-BFGS) approximate the Hessian to scale to many variables.
Orbital mechanics: Kepler's equation M = E - e*sin(E) is solved for the eccentric anomaly E by Newton-Raphson every time a satellite track or planetary ephemeris is computed.
Common Misconceptions and Pitfalls
The biggest myth is "quadratic convergence means the initial guess doesn't matter". It absolutely does — Newton is only locally convergent. In production code engineers wrap it in a globalization strategy: a bisection or golden-section search brackets the root first, then Newton takes over. Brent's method is the classical hybrid that bundles bisection, secant, and inverse-quadratic interpolation into one robust routine.
Second, watch for f'(x) approximately 0. When the tangent is nearly flat the update f/f' explodes. Push x_0 toward 0.8 on the simulator: f'(0.8)=3(0.64)-2=-0.08 and the next iterate leaps far across the axis. Production solvers either floor the denominator, scale down omega, or apply a trust region to cap the step size.
Finally, "converged" is not the same as "right". A polynomial like x^3-2x-5 has only one real root, but in general the basin of attraction depends sensitively on the initial guess. In post-buckling structural analysis multiple equilibrium paths coexist, and engineers switch to arc-length (Riks) methods so the iteration follows the physically correct branch instead of snapping onto a different one.
Frequently Asked Questions
Replace f'(x) with a central-difference quotient (f(x+h)-f(x-h))/(2h). That gives the secant method (when you reuse the previous point) or Broyden's method (for systems). In CAE, when assembling the tangent stiffness matrix is expensive, modified-Newton reuses an earlier Jacobian across several load steps to amortize the cost.
At a multiple root x* with multiplicity m, Newton degrades from quadratic to linear convergence. The fix is the modified update x_{n+1}=x_n - m*f/f', which restores quadratic order. If m is unknown, Schroder's iteration x_{n+1}=x_n - f*f'/((f')^2 - f*f'') automatically achieves the same rate.
Yes. For F(x)=0 with F a vector function, the update is Delta_x = -J(x_n)^-1 F(x_n) where J is the Jacobian. Nonlinear FEA is exactly this scheme with J replaced by the tangent stiffness K. In practice we never invert J explicitly — we solve K * Delta_u = -R with a direct or iterative linear solver.
Not really. Double-precision floats only resolve about 15-16 significant digits, and roundoff in evaluating f(x) further limits achievable accuracy. Engineering tolerances are typically relative (e.g. 1e-6 on the initial residual) because the modeling error from discretization and material data already swamps anything tighter.