When should I prefer the secant method over Newton-Raphson?

Use Newton-Raphson when f'(x) is available analytically and cheap to evaluate. Use the secant method when the derivative is unavailable (e.g. f comes from interpolated experimental data or expensive numerical integration), or when f' costs roughly the same as f. The secant method evaluates f only once per iteration (versus twice for Newton's f and f'), so the effective convergence rate per function evaluation is competitive.

Why does the secant method sometimes diverge or oscillate?

The secant method is locally convergent: bad initial points (x_0, x_1) easily cause divergence. In particular, when f(x_n) is approximately f(x_{n-1}) the denominator becomes near zero and the next iterate flies far away. Picking initial points that bracket the root (f(x_0)*f(x_1)<0) helps stability, but unlike bisection the secant method does not guarantee bracket preservation. Brent's method combines bisection and secant to get the robustness of one with the speed of the other.

Where is the secant method used inside CAE?

In CAE, the multidimensional generalization of the secant method — quasi-Newton methods like Broyden, BFGS, and DFP — is widely used when assembling the Jacobian J=dF/dx is too expensive (contact analysis, large nonlinear FEA). They asymptotically approximate J^-1 from past updates, avoiding full reassembly each iteration. The 1D secant method is also a key building block of Brent's method, the default univariate root finder in scipy and MATLAB.

Secant Method Simulator — Derivative-Free Root Finding

Q: What is the secant method?

The secant method finds a root of f(x)=0 by replacing the derivative in Newton-Raphson with a finite-difference quotient (f(x_n)-f(x_{n-1}))/(x_n-x_{n-1}). The iteration is x_{n+1} = x_n - f(x_n)*(x_n-x_{n-1})/(f(x_n)-f(x_{n-1})). It needs two initial points and converges with order phi = (1+sqrt(5))/2 approximately 1.618 (the golden ratio) — slightly slower than Newton's quadratic order but with the major advantage of not requiring a derivative.

Q: Where is the secant method used inside CAE?

In CAE, the multidimensional generalization of the secant method — quasi-Newton methods like Broyden, BFGS, and DFP — is widely used when assembling the Jacobian J=dF/dx is too expensive (contact analysis, large nonlinear FEA). They asymptotically approximate J^-1 from past updates, avoiding full reassembly each iteration. The 1D secant method is also a key building block of Brent's method, the default univariate root finder in scipy and MATLAB.

Parameters

Initial x_01.00

Initial x_13.00

Max iterations30

Tolerance 1e^-n6

Results

—

Converged root

—

Iterations

—

Final |f(x)|

—

Convergence

Function and secants (f(x)=x^3-2x-5)

Error |f(x_n)| convergence (log10)

Theory & Key Formulas

$$x_{n+1} = x_n - f(x_n)\,\frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})}$$

Secant method iteration. The derivative f'(x) in Newton-Raphson is replaced by a finite-difference quotient.

$$f(x) = x^{3} - 2x - 5$$

Test function (Newton's original example). Real root: x approximately 2.094551482.

$$|e_{n+1}| \approx C\,|e_{n}|^{\varphi}, \quad \varphi = \frac{1+\sqrt{5}}{2} \approx 1.618$$

Superlinear convergence (order = golden ratio phi). Slower than Newton's quadratic order but no derivative needed.

$$|f(x_n)| < \varepsilon \quad \text{or} \quad |x_{n+1} - x_n| < \varepsilon$$

Convergence criterion. epsilon (tolerance) = 1e^-n where n is the tolerance slider.

What is the Secant Method

It solves f(x)=0 by drawing the line (the "secant") through the previous two iterates and using its x-intercept as the next estimate. It is Newton-Raphson with the analytic derivative replaced by a finite difference of past evaluations — the simplest derivative-free root finder.

🙋

Newton-Raphson and the secant method look almost identical in formula. When should I use which?

🎓

They share the same DNA — the secant method just swaps Newton's f'(x_n) for the slope (f(x_n)-f(x_{n-1}))/(x_n-x_{n-1}) built from the last two evaluations. So you can use it when f'(x) is unavailable (interpolated lab data, complicated numerical integrals) or just expensive. Convergence is superlinear with order phi approximately 1.618 (the golden ratio) — slower than Newton's quadratic 2, but the secant method costs only one f-evaluation per iteration (Newton needs both f and f'), so per function call the two are nearly tied.

🙋

When I set x_0=1, x_1=3 it converges in seven iterations. But with x_0=-2, x_1=-1 it shoots off in a weird direction. Why?

🎓

Good catch. The polynomial f(x)=x^3-2x-5 has a single real root near x=2.0946. Starting from two negative points means f(x_0) and f(x_1) are both negative, so the secant slants steeply upward and intersects the x-axis far to the right — overshooting wildly. That is the price of locality. A safe rule of thumb is to bracket the root with f(x_0)*f(x_1)<0; even better, use Brent's method (scipy.optimize.brentq), which mixes bisection's safety with the secant method's speed.

🙋

Will tightening the tolerance to 1e-12 explode the iteration count? You said the secant method is slower than Newton.

🎓

Surprisingly cheap. Superlinear order phi approximately 1.618 is still very strong. Going from 1e-6 to 1e-12 typically adds only two or three more iterations (Newton would add one or two). In CAE, the multidimensional cousin of the secant method — quasi-Newton methods like Broyden, BFGS, L-BFGS — is exactly how Abaqus and ANSYS avoid reassembling the tangent stiffness matrix every step in nonlinear FEA.

Physical Model and Key Equations

The secant method takes Newton-Raphson, x_{n+1}=x_n-f(x_n)/f'(x_n), and approximates the derivative by a finite-difference slope from the last two iterates.

$$f'(x_n) \approx \frac{f(x_n) - f(x_{n-1})}{x_n - x_{n-1}}$$

Substituting and rearranging gives the secant update:

$$x_{n+1} = x_n - f(x_n)\,\frac{x_n - x_{n-1}}{f(x_n) - f(x_{n-1})}$$

Geometrically, the line through (x_{n-1}, f(x_{n-1})) and (x_n, f(x_n)) — the secant — crosses the x-axis at x_{n+1}.

Convergence rate (with e_n = x_n - x* the error):

$$|e_{n+1}| \approx \left|\frac{f''(x^{*})}{2f'(x^{*})}\right|\,|e_{n}|\,|e_{n-1}|, \quad \text{order} = \varphi = \frac{1+\sqrt{5}}{2} \approx 1.618$$

The golden ratio shows up naturally — that is the signature of superlinear convergence (faster than linear but slower than quadratic).

Real-World Applications

Quasi-Newton optimization (CAE): Multidimensional F(x)=0 or grad f(x)=0 problems make assembling the Jacobian or Hessian expensive. Broyden's method (root finding) and BFGS / L-BFGS (optimization) approximate J^-1 or H^-1 from past updates — the natural multidimensional generalization of the 1D secant method.

Brent's method (general-purpose root finder): scipy.optimize.brentq, MATLAB's fzero, and Boost's brent_find_minima all default to Brent's method, which combines bisection's guaranteed convergence with the speed of the secant method and inverse quadratic interpolation.

Modified Newton in nonlinear FEA: Abaqus and ANSYS often reuse the tangent stiffness K=dR/du across several load steps to amortize the cost of LU factorization. Internally, the equilibrium iteration then behaves like a secant-style update; BFGS-style corrections are also common in implicit nonlinear solvers.

Implied volatility in finance: Inverting the Black-Scholes formula for the volatility sigma that matches the market price avoids computing dC/dsigma (Vega) analytically by using the secant method. Trading systems run this loop thousands of times per second.

Common Misconceptions and Pitfalls

The most frequent myth is "the secant method is just a worse Newton-Raphson". Although its order is 1.618 versus Newton's 2, the effective rate per function evaluation is phi approximately 1.618 for the secant versus sqrt(2) approximately 1.414 for Newton (when f and f' are evaluated separately). If f' costs as much as f, the secant method is actually faster in wall-clock terms.

Second, "the two initial points must bracket the root" is wrong. The secant method does not require f(x_0)*f(x_1)<0. But without that bracket, it can easily overshoot and diverge, as you can see by setting x_0=-2, x_1=-1 in this simulator: both f-values are negative, the secant slants up, and the next iterate leaps far to the right. For safety, bracket the root with bisection first, then switch to the secant — the philosophy behind Brent's method.

Third, watch out for division by f(x_n)-f(x_{n-1}). When two consecutive function values nearly coincide, the denominator approaches zero and the update explodes. A robust implementation aborts the iteration (or restarts from fresh points) once |f(x_n)-f(x_{n-1})| falls below a small threshold, e.g. 1e-14. Quasi-Newton solvers in CAE include similar restart logic when their Jacobian approximations stale out.

Frequently Asked Questions

The secant method replaces Newton's analytic derivative f'(x_n) with a finite-difference quotient (f(x_n)-f(x_{n-1}))/(x_n-x_{n-1}) built from the last two iterates. No symbolic or numerical differentiation is required — only function evaluations. Convergence is superlinear with order phi approximately 1.618, slightly slower than Newton's quadratic 2, but per function call the methods are competitive because the secant method needs only one f evaluation per step.

Bisection requires a bracket f(a)*f(b)<0 and converges linearly (the error halves every step) — slow but unconditionally safe. The secant method needs two starting points but no bracket, and converges superlinearly (order phi approximately 1.618) — fast but without convergence guarantees. Brent's method combines both to get safety and speed; that is what scipy.optimize.brentq uses internally.

Yes. The multidimensional generalization is Broyden's method, which approximates the Jacobian J of F(x)=0 using rank-one updates from past iterates. For optimization (where grad f(x)=0), the analogous BFGS and L-BFGS schemes approximate the Hessian H. Both are workhorses in scipy.optimize.minimize and in deep-learning second-order optimizers.

When the absolute value of the denominator drops below a small threshold close to machine epsilon (about 2.2e-16), the standard practice is to halt the iteration and report failure. Alternatives include restarting from two fresh points or falling back to bisection. This simulator stops the iteration once |f(x_n)-f(x_{n-1})| is below 1e-14.