#import "@local/handout:0.1.0": *

= The Fast Inverse Square Root


The following code is present in _Quake III Arena_ (1999):

#v(5mm)

```c
float Q_rsqrt( float number ) {
    long i = * ( long * ) &number;
    i = 0x5f3759df - ( i >> 1 );
    return * ( float * ) &i;
}
```

#v(5mm)

This code defines a function `Q_rsqrt` that consumes a float `number` and approximates its inverse square root.
If we rewrite this using notation we're familiar with, we get the following:
$
  #text[`Q_sqrt`] (n_f) =
  #h(5mm)
  6240089 - (n_i div 2)
  #h(5mm)
  approx 1 / sqrt(n_f)
$

#note[
  `0x5f3759df` is $6240089$ in hexadecimal. \
  It is a magic number hard-coded into `Q_sqrt`.
]

#v(2mm)

Our goal in this section is to understand why this works:
- How does Quake approximate $1 / sqrt(x)$ by simply subtracting and dividing by two?
- What's special about $6240089$?


#v(1fr)

#remark()
For those that are interested, here are the details of the "code-to-math" translation:

- "`long i = * (long *) &number`" is C magic that tells the compiler \
  to set `i` to the `uint` value of the bits of `number`. \
  #note[
    "long" refers to a "long integer", which has 32 bits. \
    Normal `int`s have 16 bits, `short int`s have 8.
  ] \
  In other words, `number` is $n_f$ and `i` is $n_i$.
#v(2mm)


- Notice the right-shift in the second line of the function. \
  We translated `(i >> i)` into $(n_i div 2)$.
#v(2mm)

- "`return * (float *) &i`" is again C magic. \
  Much like before, it tells us to return the value of the bits of `i` as a float.
#pagebreak()

#generic("Setup:")
We are now ready to show that $#text[`Q_sqrt`] (x) approx 1/sqrt(x)$. \
For convenience, let's call the bit string of the inverse square root $r$. \
In other words,
$
  r_f := 1 / (sqrt(n_f))
$
This is the value we want to approximate.

#problem(label: "finala")
Find an approximation for $log_2(r_f)$ in terms of $n_i$ and $epsilon$ \
#note[Remember, $epsilon$ is the correction constant in our approximation of $log_2(1 + a)$.]

#solution[
  $
    log_2(r_f)
    = log_2(1 / sqrt(n_f))
    = (-1) / 2 log_2(n_f)
    approx (-1) / 2 ( (n_i) / (2^23) + epsilon - 127 )
  $
]

#v(1fr)

#problem(label: "finalb")
Let's call the "magic number" in the code above $kappa$, so that
$
  #text[`Q_sqrt`] (n_f) = kappa - (n_i div 2)
$
Use @convert and @finala to show that $#text[`Q_sqrt`] (n_f) approx r_i$

#solution[
  From @convert, we know that
  $
    log_2(r_f) approx (r_i) / (2^23) + epsilon - 127
  $

  Combining this with the result from @finala, we get:
  $
    (r_i) / (2^23) + epsilon - 127
    &approx (-1) / (2) ( (n_i) / (2^23) + epsilon - 127) \
    (r_i) / (2^23)
    &approx (-1) / (2) ( (n_i) / (2^23)) + 3 / 2 (127 - epsilon) \
    r_i
    &approx (-1) / 2 (n_i) + 2^23 3 / 2(127 - epsilon)
    = 2^23 3 / 2 (127 - epsilon) - (n_i) / 2
  $

  #v(2mm)

  This is exactly what we need! If we set $kappa$ to $(3 times 2^22) (127-epsilon)$, then
  $
    r_i approx kappa - (n_i div 2) = #text[`Q_sqrt`] (n_f)
  $
]

#v(1fr)

#problem(label: "finalc")
What is the exact value of $kappa$ in terms of $epsilon$? \
#hint[Look at @finalb. We already found it!]

#solution[
  This problem makes sure our students see that
  $kappa = (3 times 2^22) (127 - epsilon)$. \
  See the solution to @finalb.
]

#v(2cm)

#pagebreak()

#remark()
In @finalc we saw that $kappa = (3 times 2^22) (127 - epsilon)$. \
Looking at the code again, we see that $kappa = #text[`0x5f3759df`]$ in _Quake_:

#v(2mm)

```c
float Q_rsqrt( float number ) {
    long i = * ( long * ) &number;
    i = 0x5f3759df - ( i >> 1 );
    return * ( float * ) &i;
}
```

#v(2mm)
Using a calculator and some basic algebra, we can find the $epsilon$ this code uses: \
#note[Remember, #text[`0x5f3759df`] is $6240089$ in hexadecimal.]
$
  (3 times 2^22) (127 - epsilon) &= 6240089 \
  (127 - epsilon) &= 126.955 \
  epsilon &= 0.0450466
$

So, $0.045$ is the $epsilon$ used by Quake. \
Online sources state that this constant was generated by trial-and-error, \
though it is fairly close to the ideal $epsilon$.

#remark()
And now, we're done! \
We've shown that `Q_sqrt(x)` approximates $1/sqrt(x)$ fairly well, \
thanks to the approximation $log(1+a) = a + epsilon$.

#v(2mm)

Notably, `Q_sqrt` uses _zero_ divisions or multiplications (`>>` doesn't count). \
This makes it _very_ fast when compared to more traditional approximation techniques (i.e, Taylor series).

#v(2mm)

In the case of _Quake_, this is very important. 3D graphics require thousands of inverse-square-root calculations to render a single frame#footnote[e.g, to generate normal vectors], which is not an easy task for a Playstation running at 300MHz.

#instructornote[
  Let $x$ be a bit string. If we assume $x_f$ is positive and $E$ is even, then
  $
    (x #text[`>>`] 1)_f = 2^((E div 2) - 127) times (1 + (F div 2) / (2^(23)))
  $
  Notably: a right-shift divides the exponent of $x_f$ by two, \
  which is, of course, a square root!

  #v(2mm)

  This intuition is hand-wavy, though: \
  If $E$ is odd, its lowest-order bit becomes the highest-order bit of $F$ when we shift $x$ right. \
  Also, a right shift doesn't divide the _entire_ exponent, skipping the $-127$ offset. \

  #v(2mm)

  Remarkably, this intuition is still somewhat correct. \
  The bits align _just so_, and our approximation still works.

  #v(8mm)

  One can think of the fast inverse root as a "digital slide rule": \
  The integer representation of $x_f$ already contains $log_2(x_f)$, offset and scaled. \
  By subtracting and dividing in "log space", we effectively invert and root $x_f$!

  After all,
  $
    - 1 / 2 log_2(n_f) = 1 / sqrt(n_f)
  $
]