2025-02-13 15:34:06 -08:00

211 lines
5.7 KiB
Typst

#import "@local/handout:0.1.0": *
= The Fast Inverse Square Root
A simplified version of the _Quake_ routine we are studying is reproduced below.
#v(2mm)
```c
float Q_rsqrt( float number ) {
long i = * ( long * ) &number;
i = 0x5f3759df - ( i >> 1 );
return * ( float * ) &i;
}
```
#v(2mm)
This code defines a function `Q_rsqrt` that consumes a float `number` and approximates its inverse square root.
If we rewrite this using notation we're familiar with, we get the following:
$
#text[`Q_sqrt`] (n_f) =
6240089 - (n_i div 2)
#h(10mm)
approx 1 / sqrt(n_f)
$
#note[
`0x5f3759df` is $6240089$ in hexadecimal. \
Ask an instructor to explain if you don't know what this means. \
It is a magic number hard-coded into `Q_sqrt`.
]
#v(2mm)
Our goal in this section is to understand why this works:
- How does Quake approximate $1 / sqrt(x)$ by simply subtracting and dividing by two?
- What's special about $6240089$?
#v(1fr)
#remark()
For those that are interested, here are the details of the "code-to-math" translation:
- "`long i = * (long *) &number`" is C magic that tells the compiler \
to set `i` to the `uint` value of the bits of `number`. \
#note[
"long" refers to a "long integer", which has 32 bits. \
Normal `int`s have 16 bits, `short int`s have 8.
] \
In other words, `number` is $n_f$ and `i` is $n_i$.
#v(2mm)
- Notice the right-shift in the second line of the function. \
We translated `(i >> 1)` into $(n_i div 2)$.
#v(2mm)
- "`return * (float *) &i`" is again C magic. \
Much like before, it tells us to return the value of the bits of `i` as a float.
#pagebreak()
#generic("Setup:")
We are now ready to show that $#text[`Q_sqrt`] (x)$ effectively approximates $1/sqrt(x)$. \
For convenience, let's call the bit string of the inverse square root $r$. \
In other words,
$
r_f := 1 / (sqrt(n_f))
$
This is the value we want to approximate. \
#problem(label: "finala")
Find an approximation for $log_2(r_f)$ in terms of $n_i$ and $epsilon$ \
#note[Remember, $epsilon$ is the correction constant in our approximation of $log_2(1 + x)$.]
#solution[
$
log_2(r_f)
= log_2(1 / sqrt(n_f))
= (-1) / 2 log_2(n_f)
approx (-1) / 2 ( (n_i) / (2^23) + epsilon - 127 )
$
]
#v(1fr)
#problem(label: "finalb")
Let's call the "magic number" in the code above $kappa$, so that
$
#text[`Q_sqrt`] (n_f) = kappa - (n_i div 2)
$
Use @convert and @finala to show that $#text[`Q_sqrt`] (n_f) approx r_i$ \
#note(type: "Note")[
If we know $r_i$, we know $r_f$. \
We don't even need to convert between the two---the underlying bits are the same!
]
#solution[
From @convert, we know that
$
log_2(r_f) approx (r_i) / (2^23) + epsilon - 127
$
Combining this with the result from @finala, we get:
$
(r_i) / (2^23) + epsilon - 127
&approx (-1) / (2) ( (n_i) / (2^23) + epsilon - 127) \
(r_i) / (2^23)
&approx (-1) / (2) ( (n_i) / (2^23)) + 3 / 2 (127 - epsilon) \
r_i
&approx (-1) / 2 (n_i) + 2^23 3 / 2(127 - epsilon)
= 2^23 3 / 2 (127 - epsilon) - (n_i) / 2
$
#v(2mm)
This is exactly what we need! If we set $kappa$ to $(3 times 2^22) (127-epsilon)$, then
$
r_i approx kappa - (n_i div 2) = #text[`Q_sqrt`] (n_f)
$
]
#v(1fr)
#problem(label: "finalc")
What is the exact value of $kappa$ in terms of $epsilon$? \
#hint[Look at @finalb. We already found it!]
#solution[
This problem makes sure our students see that
$kappa = (3 times 2^22) (127 - epsilon)$. \
See the solution to @finalb.
]
#v(2cm)
#pagebreak()
#remark()
In @finalc we saw that $kappa = (3 times 2^22) (127 - epsilon)$. \
Looking at the code again, we see that $kappa = #text[`0x5f3759df`]$ in _Quake_:
#v(2mm)
```c
float Q_rsqrt( float number ) {
long i = * ( long * ) &number;
i = 0x5f3759df - ( i >> 1 );
return * ( float * ) &i;
}
```
#v(2mm)
Using a calculator and some basic algebra, we can find the $epsilon$ this code uses: \
#note[Remember, #text[`0x5f3759df`] is $6240089$ in hexadecimal.]
$
(3 times 2^22) (127 - epsilon) &= 6240089 \
(127 - epsilon) &= 126.955 \
epsilon &= 0.0450466
$
So, $0.045$ is the $epsilon$ used by Quake. \
Online sources state that this constant was generated by trial-and-error, \
though it is fairly close to the ideal $epsilon$.
#remark()
And now, we're done! \
We've shown that `Q_sqrt(x)` approximates $1/sqrt(x)$ fairly well. \
#v(2mm)
Notably, `Q_sqrt` uses _zero_ divisions or multiplications (`>>` doesn't count). \
This makes it _very_ fast when compared to more traditional approximation techniques (i.e, Taylor series).
#v(2mm)
In the case of _Quake_, this is very important. 3D graphics require thousands of inverse-square-root calculations to render a single frame#footnote[e.g, to generate normal vectors], which is not an easy task for a Playstation running at 300MHz.
#instructornote[
Let $x$ be a bit string. If we assume $x_f$ is positive and $E$ is even, then
$
(x #text[`>>`] 1)_f = 2^((E div 2) - 127) times (1 + (F div 2) / (2^(23)))
$
Notably: a right-shift divides the exponent of $x_f$ by two, \
which is, of course, a square root!
#v(2mm)
This intuition is hand-wavy, though: \
If $E$ is odd, its lowest-order bit becomes the highest-order bit of $F$ when we shift $x$ right. \
Also, a right shift doesn't divide the _entire_ exponent, skipping the $-127$ offset. \
#v(2mm)
Remarkably, this intuition is still somewhat correct. \
The bits align _just so_, and our approximation still works.
#v(8mm)
One can think of the fast inverse root as a "digital slide rule": \
The integer representation of $x_f$ already contains $log_2(x_f)$, offset and scaled. \
By subtracting and dividing in "log space", we effectively invert and root $x_f$!
After all,
$
- 1 / 2 log_2(n_f) = 1 / sqrt(n_f)
$
]