
Improving Function Approximations using Newton–Raphson Iterations

2024-10-15, C. V. Pines

Suppose we want to evaluate a function y = f(x) that is either expensive or impossible to compute directly. If we also have (and can efficiently evaluate) all of:

  • f^{-1}(y), the function’s inverse
  • \left[f^{-1}\right]'(y), the derivative of the function’s inverse
  • f^\ast(x), an approximant to f(x)

then we can use the Newton–Raphson method to iteratively compute more accurate approximations.

Derivation

The Newton–Raphson method finds successively better approximations to a real-valued function's roots. By constructing a new function g(y) whose root lies at the desired value y = f(x),

\begin{split} g(y) &= f^{-1}(y) - x \\ g'(y) &= \left[f^{-1}\right]'(y) \end{split}

we can produce successive approximations y^\ast_n to f(x) by evaluating

\begin{split} y^\ast_{0} &= f^\ast(x) \\ y^\ast_{n+1} &= y^\ast_n - \frac{g(y^\ast_n)}{g'(y^\ast_n)} \\ &= y^\ast_n - \frac{f^{-1}(y^\ast_n) - x}{\left[f^{-1}\right]'(y^\ast_n)} \end{split}

Note that x is treated as a constant when differentiating and evaluating.
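In code, the general recipe looks something like the following sketch. The name refineNR and its callable parameters inv, invPrime, and approx are illustrative placeholders for f^{-1}, [f^{-1}]', and f^\ast; they are not taken from any library.

// Sketch of the general Newton-Raphson refinement, assuming the caller
// supplies f^-1, its derivative, and an initial approximant f*.
template <typename Inv, typename InvPrime, typename Approx>
float refineNR(float x, Inv inv, InvPrime invPrime, Approx approx, int iterations) {
    float y = approx(x);                     // y*_0 = f*(x)
    for (int n = 0; n < iterations; ++n)
        y -= (inv(y) - x) / invPrime(y);     // y*_{n+1} = y*_n - g(y*_n)/g'(y*_n)
    return y;
}

For the square-root example below, inv(y) would return y * y, invPrime(y) would return 2.0f * y, and approx would supply the bit-level initial guess.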

Example: Square Root

#include <bit>      // std::bit_cast (C++20)
#include <cmath>    // NAN
#include <cstdint>  // uint32_t

// Evaluates a Newton-Raphson approximation to f(x) = sqrt(x)
float sqrtApproxNR(float x) {
    if (x < 0.0f) return NAN;
    
    // Avoid a division-by-zero later on
    if (x == 0.0f) return 0.0f;

    // Initial rough approximation to sqrt(x) using bit-level manipulation
    float y = std::bit_cast<float>((std::bit_cast<uint32_t>(x) + 0x3F800000) >> 1);
    
    // Compute NR step
    // f^-1(y) = y^2; [f^-1]'(y) = 2y
    // y = y - (y * y - x)/(2.0f * y);
    
    // After simplifying,
    y = 0.5f * (y + x/y);
    
    // Repeat as many times as desired; each iteration increases accuracy at
    //   the expense of execution time.
    y = 0.5f * (y + x/y);
    
    return y;
}
Results after each iteration, compared with std::sqrt and the exact value:

y = \sqrt{x}      x = (2\pi)^{-1}   x = 2        x = 224
y^\ast_0          0.4091549         1.500000     15.00000
y^\ast_1          0.3990697         1.416666     14.96666
y^\ast_2          0.3989422         1.414215     14.96662
std::sqrt(x)      0.3989422         1.414213     14.96662
\sqrt{x}          0.3989422…        1.414213…    14.96662…
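A minimal usage sketch (assuming the sqrtApproxNR definition above) prints the two-iteration approximation next to std::sqrt for the same inputs:

#include <cmath>
#include <cstdio>

int main() {
    // Sample inputs from the table above
    const float inputs[] = { 1.0f / (2.0f * 3.14159265f), 2.0f, 224.0f };
    for (float x : inputs) {
        std::printf("x = %-12g  sqrtApproxNR = %.7g  std::sqrt = %.7g\n",
                    x, sqrtApproxNR(x), std::sqrt(x));
    }
    return 0;
}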

Function Transforms

Sometimes, the performance or stability can be improved by applying an additional transform h(y) to both terms in g(y):

\begin{split} y^\ast_{0} &= f^\ast(x) \\ y^\ast_{n+1} &= y^\ast_n - \frac{h(f^{-1}(y^\ast_n)) - h(x)}{h'(f^{-1}(y^\ast_n))\cdot\left[f^{-1}\right]'(y^\ast_n)} \end{split}
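Here the numerator and denominator are simply the transformed function \tilde{g}(y) and its derivative: applying h to both terms of g(y) preserves the root at y = f(x) (provided h is injective), and by the chain rule

\begin{split} \tilde{g}(y) &= h\left(f^{-1}(y)\right) - h(x) \\ \tilde{g}'(y) &= h'\left(f^{-1}(y)\right)\cdot\left[f^{-1}\right]'(y) \end{split}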

The choice of h(y) is not always obvious; in general, one should look for transforms that cancel terms in g(y) or expand its domain.

Example: Wright Omega

Seeking to approximate the Wright omega function f(x) = \omega(x), defined implicitly by \omega(x) + \ln{\omega(x)} = x, leads to

\begin{split} f^{-1}(y) &= y + \ln{y} \\ \left[f^{-1}\right]'(y) &= 1 + y^{-1} \\ y^\ast_{n+1} &= y^\ast_n - \frac{y^\ast_n + \ln{y^\ast_n} - x}{1 + \frac{1}{y^\ast_n}} \end{split}

which can be problematic as y \rightarrow 0 due to limited floating point precision. By applying the transform h(y) = \exp{y}, we may instead calculate

\begin{split} h(f^{-1}(y)) &= \exp{\left(y + \ln{y}\right)} \\ &= y \exp{(y)} \\ h'(f^{-1}(y))\cdot\left[f^{-1}\right]'(y) &= (1 + y^{-1}) \exp{\left(y + \ln{y}\right)} \\ &= (y + 1) \exp{(y)} \\ y^\ast_{n+1} &= y^\ast_n - \frac{y^\ast_n \exp{(y^\ast_n)} - \exp{(x)}}{\left(y^\ast_n + 1\right)\exp{(y^\ast_n) }} \\ &= y^\ast_n - \frac{y^\ast_n - \exp{(x - y^\ast_n)}}{y^\ast_n + 1} \end{split}

which both removes the singularity at y = 0 and reduces the number of floating point operations performed each iteration.
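A minimal C++ sketch of this transformed iteration follows; the initial approximant and the crossover at x = 1 are assumptions of the sketch rather than part of the derivation above.

#include <cmath>

// Transformed NR iteration for y = omega(x) (illustrative name and
// initial guess; the approximant below is an assumed rough f*):
//   omega(x) ~ exp(x)      as x -> -inf
//   omega(x) ~ x - ln(x)   as x -> +inf
float omegaApproxNR(float x) {
    float y = (x < 1.0f) ? std::exp(x) : x - std::log(x);

    // Transformed NR step: y_{n+1} = y_n - (y_n - exp(x - y_n)) / (y_n + 1)
    // Repeat as many times as desired; more iterations may be needed near
    // the crossover between the two initial-guess branches.
    for (int n = 0; n < 4; ++n)
        y -= (y - std::exp(x - y)) / (y + 1.0f);

    return y;
}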
