All checks were successful
		
		
	
	CI / Typos (pull_request) Successful in 9s
				
			CI / Typst formatting (pull_request) Successful in 4s
				
			CI / Build (pull_request) Successful in 6m3s
				
			CI / Typos (push) Successful in 10s
				
			CI / Typst formatting (push) Successful in 4s
				
			CI / Build (push) Successful in 7m25s
				
			
		
			
				
	
	
		
			211 lines
		
	
	
		
			5.8 KiB
		
	
	
	
		
			Typst
		
	
	
	
	
	
			
		
		
	
	
			211 lines
		
	
	
		
			5.8 KiB
		
	
	
	
		
			Typst
		
	
	
	
	
	
| #import "@local/handout:0.1.0": *
 | |
| 
 | |
| = The Fast Inverse Square Root
 | |
| 
 | |
| A simplified version of the _Quake_ routine we are studying is reproduced below.
 | |
| 
 | |
| #v(2mm)
 | |
| 
 | |
| ```c
 | |
| float Q_rsqrt( float number ) {
 | |
|     long i = * ( long * ) &number;
 | |
|     i = 0x5f3759df - ( i >> 1 );
 | |
|     return * ( float * ) &i;
 | |
| }
 | |
| ```
 | |
| 
 | |
| #v(2mm)
 | |
| 
 | |
| This code defines a function `Q_rsqrt` that consumes a float `number` and approximates its inverse square root.
 | |
| If we rewrite this using notation we're familiar with, we get the following:
 | |
| $
 | |
|   #text[`Q_sqrt`] (n_f) =
 | |
|   6240089 - (n_i div 2)
 | |
|   #h(10mm)
 | |
|   approx 1 / sqrt(n_f)
 | |
| $
 | |
| 
 | |
| #note[
 | |
|   `0x5f3759df` is $6240089$ in hexadecimal. \
 | |
|   Ask an instructor to explain if you don't know what this means. \
 | |
|   It is a magic number hard-coded into `Q_sqrt`.
 | |
| ]
 | |
| 
 | |
| #v(2mm)
 | |
| 
 | |
| Our goal in this section is to understand why this works:
 | |
| - How does Quake approximate $1 / sqrt(x)$ by simply subtracting and dividing by two?
 | |
| - What's special about $6240089$?
 | |
| 
 | |
| 
 | |
| 
 | |
| #v(1fr)
 | |
| 
 | |
| #remark()
 | |
| For those that are interested, here are the details of the "code-to-math" translation:
 | |
| 
 | |
| - "`long i = * (long *) &number`" is C magic that tells the compiler \
 | |
|   to set `i` to the `uint` value of the bits of `number`. \
 | |
|   #note[
 | |
|     "long" refers to a "long integer", which has 32 bits. \
 | |
|     Normal `int`s have 16 bits, `short int`s have 8.
 | |
|   ] \
 | |
|   In other words, `number` is $n_f$ and `i` is $n_i$.
 | |
| #v(2mm)
 | |
| 
 | |
| 
 | |
| - Notice the right-shift in the second line of the function. \
 | |
|   We translated `(i >> 1)` into $(n_i div 2)$.
 | |
| #v(2mm)
 | |
| 
 | |
| - "`return * (float *) &i`" is again C magic. \
 | |
|   Much like before, it tells us to return the value of the bits of `i` as a float.
 | |
| #pagebreak()
 | |
| 
 | |
| #generic("Setup:")
 | |
| We are now ready to show that $#text[`Q_sqrt`] (x)$ effectively approximates $1/sqrt(x)$. \
 | |
| For convenience, let's call the bit string of the inverse square root $r$. \
 | |
| In other words,
 | |
| $
 | |
|   r_f := 1 / (sqrt(n_f))
 | |
| $
 | |
| This is the value we want to approximate. \
 | |
| 
 | |
| #problem(label: "finala")
 | |
| Find an approximation for $log_2(r_f)$ in terms of $n_i$ and $epsilon$ \
 | |
| #note[Remember, $epsilon$ is the correction constant in our approximation of $log_2(1 + x)$.]
 | |
| 
 | |
| #solution[
 | |
|   $
 | |
|     log_2(r_f)
 | |
|     = log_2(1 / sqrt(n_f))
 | |
|     = (-1) / 2 log_2(n_f)
 | |
|     approx (-1) / 2 ( (n_i) / (2^23) + epsilon - 127 )
 | |
|   $
 | |
| ]
 | |
| 
 | |
| #v(1fr)
 | |
| 
 | |
| #problem(label: "finalb")
 | |
| Let's call the "magic number" in the code above $kappa$, so that
 | |
| $
 | |
|   #text[`Q_sqrt`] (n_f) = kappa - (n_i div 2)
 | |
| $
 | |
| Use @convert and @finala to show that $#text[`Q_sqrt`] (n_f) approx r_i$ \
 | |
| #note(type: "Note")[
 | |
|   If we know $r_i$, we know $r_f$. \
 | |
|   We don't even need to convert between the two---the underlying bits are the same!
 | |
| ]
 | |
| 
 | |
| #solution[
 | |
|   From @convert, we know that
 | |
|   $
 | |
|     log_2(r_f) approx (r_i) / (2^23) + epsilon - 127
 | |
|   $
 | |
| 
 | |
|   Combining this with the result from @finala, we get:
 | |
|   $
 | |
|     (r_i) / (2^23) + epsilon - 127
 | |
|     &approx (-1) / (2) ( (n_i) / (2^23) + epsilon - 127) \
 | |
|     (r_i) / (2^23)
 | |
|     &approx (-1) / (2) ( (n_i) / (2^23)) + 3 / 2 (127 - epsilon) \
 | |
|     r_i
 | |
|     &approx (-1) / 2 (n_i) + 2^23 3 / 2(127 - epsilon)
 | |
|     = 2^23 3 / 2 (127 - epsilon) - (n_i) / 2
 | |
|   $
 | |
| 
 | |
|   #v(2mm)
 | |
| 
 | |
|   This is exactly what we need! If we set $kappa$ to $(3 times 2^22) (127-epsilon)$, then
 | |
|   $
 | |
|     r_i approx kappa - (n_i div 2) = #text[`Q_sqrt`] (n_f)
 | |
|   $
 | |
| ]
 | |
| 
 | |
| #v(1fr)
 | |
| 
 | |
| #problem(label: "finalc")
 | |
| What is the exact value of $kappa$ in terms of $epsilon$? \
 | |
| #hint[Look at @finalb. We already found it!]
 | |
| 
 | |
| #solution[
 | |
|   This problem makes sure our students see that
 | |
|   $kappa = (3 times 2^22) (127 - epsilon)$. \
 | |
|   See the solution to @finalb.
 | |
| ]
 | |
| 
 | |
| #v(2cm)
 | |
| 
 | |
| #pagebreak()
 | |
| 
 | |
| #remark()
 | |
| In @finalc we saw that $kappa = (3 times 2^22) (127 - epsilon)$. \
 | |
| Looking at the code again, we see that $kappa = #text[`0x5f3759df`]$ in _Quake_:
 | |
| 
 | |
| #v(2mm)
 | |
| 
 | |
| ```c
 | |
| float Q_rsqrt( float number ) {
 | |
|     long i = * ( long * ) &number;
 | |
|     i = 0x5f3759df - ( i >> 1 );
 | |
|     return * ( float * ) &i;
 | |
| }
 | |
| ```
 | |
| 
 | |
| #v(2mm)
 | |
| Using a calculator and some basic algebra, we can find the $epsilon$ this code uses: \
 | |
| #note[Remember, #text[`0x5f3759df`] is $6240089$ in hexadecimal.]
 | |
| $
 | |
|   (3 times 2^22) (127 - epsilon) & = 6240089 \
 | |
|                  (127 - epsilon) & = 126.955 \
 | |
|                          epsilon & = 0.0450466
 | |
| $
 | |
| 
 | |
| So, $0.045$ is the $epsilon$ used by Quake. \
 | |
| Online sources state that this constant was generated by trial-and-error, \
 | |
| though it is fairly close to the ideal $epsilon$.
 | |
| 
 | |
| #remark()
 | |
| And now, we're done! \
 | |
| We've shown that `Q_sqrt(x)` approximates $1/sqrt(x)$ fairly well. \
 | |
| 
 | |
| #v(2mm)
 | |
| 
 | |
| Notably, `Q_sqrt` uses _zero_ divisions or multiplications (`>>` doesn't count). \
 | |
| This makes it _very_ fast when compared to more traditional approximation techniques (i.e, Taylor series).
 | |
| 
 | |
| #v(2mm)
 | |
| 
 | |
| In the case of _Quake_, this is very important. 3D graphics require thousands of inverse-square-root calculations to render a single frame#footnote[e.g, to generate normal vectors], which is not an easy task for a Playstation running at 300MHz.
 | |
| 
 | |
| #instructornote[
 | |
|   Let $x$ be a bit string. If we assume $x_f$ is positive and $E$ is even, then
 | |
|   $
 | |
|     (x #text[`>>`] 1)_f = 2^((E div 2) - 127) times (1 + (F div 2) / (2^(23)))
 | |
|   $
 | |
|   Notably: a right-shift divides the exponent of $x_f$ by two, \
 | |
|   which is, of course, a square root!
 | |
| 
 | |
|   #v(2mm)
 | |
| 
 | |
|   This intuition is hand-wavy, though: \
 | |
|   If $E$ is odd, its lowest-order bit becomes the highest-order bit of $F$ when we shift $x$ right. \
 | |
|   Also, a right shift doesn't divide the _entire_ exponent, skipping the $-127$ offset. \
 | |
| 
 | |
|   #v(2mm)
 | |
| 
 | |
|   Remarkably, this intuition is still somewhat correct. \
 | |
|   The bits align _just so_, and our approximation still works.
 | |
| 
 | |
|   #v(8mm)
 | |
| 
 | |
|   One can think of the fast inverse root as a "digital slide rule": \
 | |
|   The integer representation of $x_f$ already contains $log_2(x_f)$, offset and scaled. \
 | |
|   By subtracting and dividing in "log space", we effectively invert and root $x_f$!
 | |
| 
 | |
|   After all,
 | |
|   $
 | |
|     - 1 / 2 log_2(n_f) = 1 / sqrt(n_f)
 | |
|   $
 | |
| ]
 |