
Improving Function Approximations using Newton–Raphson Iterations

2024-10-15, C. V. Pines

Suppose we want to evaluate a function y = f(x) that is either expensive or impossible to compute directly. If we also have (and can efficiently evaluate) all of:

  • f^{-1}(y), the function’s inverse
  • \left[f^{-1}\right]'(y), the derivative of the function’s inverse
  • f^\ast(x), an approximant to f(x)

then we can use the Newton–Raphson method to iteratively compute more accurate approximations.

Derivation

The Newton–Raphson method finds successively better approximations to a real-valued function's roots. By constructing a new function g(y) whose root lies at the desired value y = f(x),

\begin{split} g(y) &= f^{-1}(y) - x \\ g'(y) &= \left[f^{-1}\right]'(y) \end{split}

we can produce successive approximations y^\ast_n to f(x) by evaluating

\begin{split} y^\ast_{0} &= f^\ast(x) \\ y^\ast_{n+1} &= y^\ast_n - \frac{g(y^\ast_n)}{g'(y^\ast_n)} \\ &= y^\ast_n - \frac{f^{-1}(y^\ast_n) - x}{\left[f^{-1}\right]'(y^\ast_n)} \end{split}

Note that x is treated as a constant when differentiating and evaluating.
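In code, the general recipe looks something like the following sketch. The name refineNR and its callable parameters inv, invPrime, and approx are illustrative placeholders for f^{-1}, [f^{-1}]', and f^\ast; they are not taken from any library.

// Sketch of the general Newton-Raphson refinement, assuming the caller
// supplies f^-1, its derivative, and an initial approximant f*.
template <typename Inv, typename InvPrime, typename Approx>
float refineNR(float x, Inv inv, InvPrime invPrime, Approx approx, int iterations) {
    float y = approx(x);                     // y*_0 = f*(x)
    for (int n = 0; n < iterations; ++n)
        y -= (inv(y) - x) / invPrime(y);     // y*_{n+1} = y*_n - g(y*_n)/g'(y*_n)
    return y;
}

For the square-root example below, inv(y) would return y * y, invPrime(y) would return 2.0f * y, and approx would supply the bit-level initial guess.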

Example: Square Root

#include <bit>      // std::bit_cast (C++20)
#include <cmath>    // NAN
#include <cstdint>  // uint32_t

// Evaluates a Newton-Raphson approximation to f(x) = sqrt(x)
float sqrtApproxNR(float x) {
    if (x < 0.0f) return NAN;
    
    // Avoid a division-by-zero later on
    if (x == 0.0f) return 0.0f;

    // Initial rough approximation to sqrt(x) using bit-level manipulation
    float y = std::bit_cast<float>((std::bit_cast<uint32_t>(x) + 0x3F800000) >> 1);
    
    // Compute NR step
    // f^-1(y) = y^2; [f^-1]'(y) = 2y
    // y = y - (y * y - x)/(2.0f * y);
    
    // After simplifying,
    y = 0.5f * (y + x/y);
    
    // Repeat as many times as desired; each iteration increases accuracy at
    //   the expense of execution time.
    y = 0.5f * (y + x/y);
    
    return y;
}
Results after each iteration, compared with std::sqrt and the exact value:

y = \sqrt{x}      x = (2\pi)^{-1}   x = 2        x = 224
y^\ast_0          0.4091549         1.500000     15.00000
y^\ast_1          0.3990697         1.416666     14.96666
y^\ast_2          0.3989422         1.414215     14.96662
std::sqrt(x)      0.3989422         1.414213     14.96662
\sqrt{x}          0.3989422…        1.414213…    14.96662…
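A minimal usage sketch (assuming the sqrtApproxNR definition above) prints the two-iteration approximation next to std::sqrt for the same inputs:

#include <cmath>
#include <cstdio>

int main() {
    // Sample inputs from the table above
    const float inputs[] = { 1.0f / (2.0f * 3.14159265f), 2.0f, 224.0f };
    for (float x : inputs) {
        std::printf("x = %-12g  sqrtApproxNR = %.7g  std::sqrt = %.7g\n",
                    x, sqrtApproxNR(x), std::sqrt(x));
    }
    return 0;
}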

Function Transforms

Sometimes, the performance or stability can be improved by applying an additional transform h(y) to both terms in g(y):

\begin{split} y^\ast_{0} &= f^\ast(x) \\ y^\ast_{n+1} &= y^\ast_n - \frac{h(f^{-1}(y^\ast_n)) - h(x)}{h'(f^{-1}(y^\ast_n))\cdot\left[f^{-1}\right]'(y^\ast_n)} \end{split}
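Here the numerator and denominator are simply the transformed function \tilde{g}(y) and its derivative: applying h to both terms of g(y) preserves the root at y = f(x) (provided h is injective), and by the chain rule

\begin{split} \tilde{g}(y) &= h\left(f^{-1}(y)\right) - h(x) \\ \tilde{g}'(y) &= h'\left(f^{-1}(y)\right)\cdot\left[f^{-1}\right]'(y) \end{split}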

The choice of h(y) is not always obvious; in general, one should look for transforms that cancel terms in g(y) or expand its domain.

Example: Wright Omega

Seeking to approximate the Wright omega function f(x) = \omega(x), defined implicitly by \omega(x) + \ln{\omega(x)} = x, leads to

\begin{split} f^{-1}(y) &= y + \ln{y} \\ \left[f^{-1}\right]'(y) &= 1 + y^{-1} \\ y^\ast_{n+1} &= y^\ast_n - \frac{y^\ast_n + \ln{y^\ast_n} - x}{1 + \frac{1}{y^\ast_n}} \end{split}

which can be problematic as y \rightarrow 0 due to limited floating point precision. By applying the transform h(y) = \exp{y}, we may instead calculate

\begin{split} h(f^{-1}(y)) &= \exp{\left(y + \ln{y}\right)} \\ &= y \exp{(y)} \\ h'(f^{-1}(y))\cdot\left[f^{-1}\right]'(y) &= (1 + y^{-1}) \exp{\left(y + \ln{y}\right)} \\ &= (y + 1) \exp{(y)} \\ y^\ast_{n+1} &= y^\ast_n - \frac{y^\ast_n \exp{(y^\ast_n)} - \exp{(x)}}{\left(y^\ast_n + 1\right)\exp{(y^\ast_n) }} \\ &= y^\ast_n - \frac{y^\ast_n - \exp{(x - y^\ast_n)}}{y^\ast_n + 1} \end{split}

which both removes the singularity at y = 0 and reduces the number of floating point operations performed each iteration.
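A minimal C++ sketch of this transformed iteration follows; the initial approximant and the crossover at x = 1 are assumptions of the sketch rather than part of the derivation above.

#include <cmath>

// Transformed NR iteration for y = omega(x) (illustrative name and
// initial guess; the approximant below is an assumed rough f*):
//   omega(x) ~ exp(x)      as x -> -inf
//   omega(x) ~ x - ln(x)   as x -> +inf
float omegaApproxNR(float x) {
    float y = (x < 1.0f) ? std::exp(x) : x - std::log(x);

    // Transformed NR step: y_{n+1} = y_n - (y_n - exp(x - y_n)) / (y_n + 1)
    // Repeat as many times as desired; more iterations may be needed near
    // the crossover between the two initial-guess branches.
    for (int n = 0; n < 4; ++n)
        y -= (y - std::exp(x - y)) / (y + 1.0f);

    return y;
}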
