#import "@local/handout:0.1.0": *

= The Fast Inverse Square Root

The following code (lightly simplified) is present in _Quake III Arena_ (1999):

#v(5mm)

```c
float Q_rsqrt( float number ) {
    long i = * ( long * ) &number;
    i = 0x5f3759df - ( i >> 1 );
    return * ( float * ) &i;
}
```

#v(5mm)

This code defines a function `Q_rsqrt` that consumes a float `number` and approximates its inverse square root.
If we rewrite this using notation we're familiar with, we get the following:
$
  #text[`Q_rsqrt`] (n_f) =
  #h(5mm)
  1597463007 - (n_i div 2)
  #h(5mm)
  approx 1 / sqrt(n_f)
$

#note[
  `0x5f3759df` is $1597463007$ written in hexadecimal. \
  It is a magic number hard-coded into `Q_rsqrt`.
]

#v(2mm)

Our goal in this section is to understand why this works:
- How does Quake approximate $1 / sqrt(x)$ by simply subtracting and dividing by two?
- What's special about $1597463007$?

#v(1fr)

#remark()
For those that are interested, here are the details of the "code-to-math" translation:
- "`long i = * (long *) &number`" is C magic that tells the compiler \
  to set `i` to the `uint` value of the bits of `number`. \
  #note[
    "long" refers to a "long integer", which has 32 bits here. \
    C only guarantees minimum sizes: at least 16 bits for `short` and `int`, and at least 32 for `long`.
  ] \
  In other words, `number` is $n_f$ and `i` is $n_i$.
  #v(2mm)

- Notice the right-shift in the second line of the function. \
  We translated `(i >> 1)` into $(n_i div 2)$.
  #v(2mm)

- "`return * (float *) &i`" is again C magic. \
  Much like before, it tells us to return the value of the bits of `i` as a float.

#pagebreak()

#generic("Setup:")
We are now ready to show that $#text[`Q_rsqrt`] (x) approx 1/sqrt(x)$. \
For convenience, let's call the bit string of the inverse square root $r$. \
In other words,
$
  r_f := 1 / (sqrt(n_f))
$
This is the value we want to approximate.

#problem(label: "finala")
Find an approximation for $log_2(r_f)$ in terms of $n_i$ and $epsilon$. \
#note[Remember, $epsilon$ is the correction constant in our approximation of $log_2(1 + a)$.]
#solution[
  $
    log_2(r_f)
    = log_2(1 / sqrt(n_f))
    = (-1) / 2 log_2(n_f)
    approx (-1) / 2 ( (n_i) / (2^23) + epsilon - 127 )
  $
]

#v(1fr)

#problem(label: "finalb")
Let's call the "magic number" in the code above $kappa$, so that
$
  #text[`Q_rsqrt`] (n_f) = kappa - (n_i div 2)
$
Use @convert and @finala to show that $#text[`Q_rsqrt`] (n_f) approx r_i$.

#solution[
  From @convert, we know that
  $
    log_2(r_f) approx (r_i) / (2^23) + epsilon - 127
  $

  Combining this with the result from @finala, we get:
  $
    (r_i) / (2^23) + epsilon - 127
    &approx (-1) / (2) ( (n_i) / (2^23) + epsilon - 127) \
    (r_i) / (2^23)
    &approx (-1) / (2) ( (n_i) / (2^23)) + 3 / 2 (127 - epsilon) \
    r_i
    &approx (-1) / 2 (n_i) + 2^23 times 3 / 2 (127 - epsilon)
    = 2^23 times 3 / 2 (127 - epsilon) - (n_i) / 2
  $

  #v(2mm)

  This is exactly what we need! If we set $kappa$ to $(3 times 2^22) (127-epsilon)$, then
  $
    r_i approx kappa - (n_i div 2) = #text[`Q_rsqrt`] (n_f)
  $
]

#v(1fr)

#problem(label: "finalc")
What is the exact value of $kappa$ in terms of $epsilon$? \
#hint[Look at @finalb. We already found it!]

#solution[
  This problem makes sure our students see that
  $kappa = (3 times 2^22) (127 - epsilon)$. \
  See the solution to @finalb.
]

#v(2cm)

#pagebreak()

#remark()
In @finalc we saw that $kappa = (3 times 2^22) (127 - epsilon)$. \
Looking at the code again, we see that $kappa = #text[`0x5f3759df`]$ in _Quake_:

#v(2mm)

```c
float Q_rsqrt( float number ) {
    long i = * ( long * ) &number;
    i = 0x5f3759df - ( i >> 1 );
    return * ( float * ) &i;
}
```

#v(2mm)

Using a calculator and some basic algebra, we can find the $epsilon$ this code uses: \
#note[Remember, #text[`0x5f3759df`] is $1597463007$ written in hexadecimal.]
$
  (3 times 2^22) (127 - epsilon) &= 1597463007 \
  (127 - epsilon) &approx 126.955 \
  epsilon &approx 0.0450466
$

So, roughly $0.045$ is the $epsilon$ used by Quake. \
Online sources state that this constant was generated by trial-and-error, \
though it is fairly close to the ideal $epsilon$.

#remark()
And now, we're done!
\
We've shown that `Q_rsqrt(x)` approximates $1/sqrt(x)$ fairly well, \
thanks to the approximation $log_2(1+a) approx a + epsilon$.

#v(2mm)

Notably, `Q_rsqrt` uses _zero_ divisions or multiplications (`>>` doesn't count). \
This makes it _very_ fast when compared to more traditional approximation techniques (e.g., Taylor series).

#v(2mm)

In the case of _Quake_, this is very important. 3D graphics require thousands of inverse-square-root calculations to render a single frame#footnote[e.g., to generate normal vectors], which is not an easy task for a PlayStation running at 300 MHz.

#instructornote[
  Let $x$ be a bit string. If we assume $x_f$ is positive and $E$ is even, then
  $
    (x #text[`>>`] 1)_f = 2^((E div 2) - 127) times (1 + (F div 2) / (2^(23)))
  $
  Notably: a right-shift divides the exponent of $x_f$ by two, \
  which is, of course, a square root!

  #v(2mm)

  This intuition is hand-wavy, though: \
  If $E$ is odd, its lowest-order bit becomes the highest-order bit of $F$ when we shift $x$ right. \
  Also, a right shift doesn't divide the _entire_ exponent, skipping the $-127$ offset.

  #v(2mm)

  Remarkably, this intuition is still somewhat correct. \
  The bits align _just so_, and our approximation still works.

  #v(8mm)

  One can think of the fast inverse root as a "digital slide rule": \
  The integer representation of $x_f$ already contains $log_2(x_f)$, offset and scaled. \
  By subtracting and dividing in "log space", we effectively invert and root $x_f$!

  After all,
  $
    - 1 / 2 log_2(n_f) = log_2(1 / sqrt(n_f))
  $
]