Solving Problems By Trying Over And Over Again: the Newton-Raphson Method

Wednesday 3 February 2016 at 08:00 GMT

I was re-reading Slash Slash Massive Hack as I wrote yesterday's blog post, and was reminded of the awesomeness of the Newton-Raphson method. Wikipedia explains it better than I:

In numerical analysis, Newton's method (also known as the Newton–Raphson method), named after Isaac Newton and Joseph Raphson, is a method for finding successively better approximations to the roots (or zeroes) of a real-valued function.

Well, at least, it explains it using more complicated words than I. Let's break it down a little. I will be pulling information liberally from Wikipedia, so you might find it easier to follow along there.

The root of a function has a description as a mathematical formula:

x:f(x)=0.x : f(x) = 0.

What this means is that the Newton-Raphson method is a method for finding some input, xx, such that the output of the function ff is 00. This allows us to solve some problems of the form "easy to do, hard to undo".

The classic example of a use of the method is the square root function. The method doesn't allow us to solve this directly, but it does allow us to solve a related problem.

The square root function, sqrt, is defined as x:x2=yx : x^2 = y. Unfortunately, if we want to find the square root of 99, the method can't find x:x2=9x : x^2 = 9. However, if we subtract 99 from both sides, we can solve for this function:

f(x)=x29f(x) = x^2 - 9

The first thing we do is find the derivative of this function. I'm not going to go into calculus, so if you're not sure why, just trust me when I say that the derivative, f(x)=2xf^\prime(x) = 2x.

We then pick a starting value, x_0x\_0. This should be a value that's a good guess for the number we're going for. The actual number, yy, can be used in this case as something that's obviously not the right answer (unless y=1y = 1, of course), but isn't well out.

Then we iterate using the method to find x_1x\_1, then x_2x\_2, then x_3x\_3, and so on. In general, we can calculate x_(n+1)x\_(n + 1) as:

x_(n+1)=xn(f(xn))/(f(xn))x\_(n + 1) = x_n - (f(x_n)) / (f^\prime(x_n))

In the case of the square root function, this is:

x_(n+1)=xn(:xn:2y)/(2xn)x\_(n + 1) = x_n - ({:x_n:}^2 - y) / (2 x_n)

We keep iterating until the change is 0, or so close to it as to be negligible.

So let's try it with y=9y = 9, starting from x_0=1x\_0 = 1, which is probably not the right answer, but is probably not too far off in the grand scheme of things:

x0=1x1=x0(:x0:29)/(2x0)=5x2=x1(:x1:29)/(2x1)=3.4x3=x2(:x2:29)/(2x2)=3.02352941176471...x4=x3(:x3:29)/(2x3)=3.00009155413138...x5=x4(:x4:29)/(2x4)=3.00000000139698...x6=x5(:x5:29)/(2x5)=3.0x7=x6(:x6:29)/(2x6)=3.0\begin{align} x_0 &= 1 \\ x_1 &= x_0 - ({:x_0:}^2 - 9) / (2 x_0) = 5 \\ x_2 &= x_1 - ({:x_1:}^2 - 9) / (2 x_1) = 3.4 \\ x_3 &= x_2 - ({:x_2:}^2 - 9) / (2 x_2) = 3.02352941176471... \\ x_4 &= x_3 - ({:x_3:}^2 - 9) / (2 x_3) = 3.00009155413138... \\ x_5 &= x_4 - ({:x_4:}^2 - 9) / (2 x_4) = 3.00000000139698... \\ x_6 &= x_5 - ({:x_5:}^2 - 9) / (2 x_5) = 3.0 \\ x_7 &= x_6 - ({:x_6:}^2 - 9) / (2 x_6) = 3.0 \end{align}

And we're done. The square root of 9 is 3.0 exactly.

You can use this for any (positive) number, not just square numbers. I've got a working Scala version below which works in very much the same way (and it's on GitHub too).

class NewtonRaphson(
    f: Double => Double,
    fDerivative: Double => Double,
    epsilon: Double = 0.000000000000001
) {
  def iterate(x: Double): Double =
    x - f(x) / fDerivative(x)

  def apply(start: Double): Double =
    Stream.iterate(start)(iterate)
      .sliding(2)
      .map { case Stream(a, b) => (a, b) }
      .dropWhile { case (a, b) => scala.math.abs(a - b) >= epsilon }
      .next
      ._2
}

def squareRoot(n: Double): Double = {
  if (n < 0)
    return Double.NaN

  new NewtonRaphson(x => x * x - n, x => 2 * x).apply(n)
}

The squareRoot function primes the method with ff and f^\\prime, then immediately invokes the apply method with 11 in order to calculate sqrt(n)sqrt(n). The iterate method performs a single iteration, and so is fairly simple and boring. The apply method, however, is interesting, and so I'd like to explain how it works.

First of all, it constructs a lazy stream of subsequent iterations of x_0x\_0:

x0,x1,x2,...x_0, x_1, x_2, ...

It then creates a sliding iterator of two items on top of that, and converts it from a Stream[Stream[Double]], where the inner stream always has two items, to a Stream[(Double, Double)].

(x0,x1),(x1,x2),(x2,x3),...(x_0, x_1), (x_1, x_2), (x_2, x_3), ...

We want the first pair where the two values are the same (or close enough), so it drops elements from this stream while the difference is greater than epsilon. It then grabs the first element from the resulting stream, and then the second part of the tuple. It gets the second, rather than the first, because it's one more iteration and will therefore be the more accurate answer if they are still different.

And that, folks, is how you calculate the square root of a number.


If you enjoyed this post, you can subscribe to this blog using Atom.

Maybe you have something to say. You can email me or toot at me. I love feedback. I also love gigantic compliments, so please send those too.

Please feel free to share this on any and all good social networks.

This article is licensed under the Creative Commons Attribution 4.0 International Public License (CC-BY-4.0).