Tom edits

Mark 2025-02-12 16:37:14 -08:00
parent 6d4127f5a5
commit b4852e7fcd
7 changed files with 97 additions and 34 deletions

View File

@@ -13,16 +13,19 @@
by: "Mark",
)
-#include "parts/00 int.typ"
+#include "parts/00 intro.typ"
#pagebreak()
-#include "parts/01 float.typ"
+#include "parts/01 int.typ"
#pagebreak()
-#include "parts/02 approx.typ"
+#include "parts/02 float.typ"
#pagebreak()
-#include "parts/03 quake.typ"
+#include "parts/03 approx.typ"
#pagebreak()
-#include "parts/04 bonus.typ"
+#include "parts/04 quake.typ"
+#pagebreak()
+#include "parts/05 bonus.typ"

View File

@@ -0,0 +1,45 @@
#import "@local/handout:0.1.0": *
= Introduction
In 2005, id Software published the source code of _Quake III Arena_, a popular game released in 1999. \
This caused quite a stir: id Software was responsible for many games popular among old-school engineers (most notably _Doom_, which has a place in programmer humor even today).
#v(2mm)
Naturally, this community immediately began dissecting _Quake_'s source. \
One particularly interesting function is reproduced below, with original comments: \
#v(3mm)
```c
float Q_rsqrt( float number ) {
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // [redacted]
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
return y;
}
```
#v(3mm)
This code defines a function `Q_rsqrt`, which was used as a fast approximation of the inverse square root in graphics routines (in other words, `Q_rsqrt` efficiently approximates $1 div sqrt(x)$).
#v(3mm)
The key word here is "fast": _Quake_ ran on very limited hardware, and traditional approximation techniques (like Taylor series)#footnote[Taylor series still aren't used for this today, for the same reason: there are better ways.] were too computationally expensive to be viable.
#v(3mm)
Our goal today is to understand how `Q_sqrt` works. \
To do that, we'll first need to understand how computers represent numbers. \
We'll start with simple binary integers---turn the page.

View File

@@ -5,7 +5,8 @@
#definition()
A _bit string_ is a string of binary digits. \
In this handout, we'll denote bit strings with the prefix `0b`. \
-That is, $1010 =$ "one thousand and one," while $#text([`0b1001`]) = 2^3 + 2^0 = 9$
+#note[This prefix is only notation---it is _not_ part of the string itself.] \
+For example, $1001$ is the number "one thousand and one," while $#text([`0b1001`])$ is the string of bits "1 0 0 1".
#v(2mm)
We will separate long bit strings with underscores for readability. \
@@ -40,7 +41,7 @@ The value of a `uint` is simply its value as a binary number:
What is the largest number we can represent with a 32-bit `uint`?
#solution([
-$#text([`0b01111111_11111111_11111111_11111111`]) = 2^(31)$
+$#text([`0b11111111_11111111_11111111_11111111`]) = 2^(32)-1$
])
#v(1fr)
@@ -53,6 +54,10 @@ Find the value of each of the following 32-bit unsigned integers:
- `0b00000000_00000000_00000100_10110000`
#hint([The third conversion is easy---look carefully at the second.])
+#instructornote[
+Consider making a list of the powers of two $<= 1024$ on the board.
+]
#solution([
- $#text([`0b00000000_00000000_00000101_00111001`]) = 1337$
- $#text([`0b00000000_00000000_00000001_00101100`]) = 300$
@@ -64,20 +69,20 @@ Find the value of each of the following 32-bit unsigned integers:
#definition()
-In general, division of `uints` is nontrivial#footnote([One may use repeated subtraction, but that isn't efficient.]). \
+In general, fast division of `uints` is difficult#footnote([One may use repeated subtraction, but this isn't efficient.]). \
Division by powers of two, however, is incredibly easy: \
To divide by two, all we need to do is shift the bits of our integer right.
#v(2mm)
For example, consider $#text[`0b0000_0110`] = 6$. \
-If we insert a zero at the left end of this bit string and delete the digit at the right \
+If we insert a zero at the left end of this string and delete the zero at the right \
(thus "shifting" each bit right), we get `0b0000_0011`, which is 3. \
#v(2mm)
-Of course, we loose the remainder when we left-shift an odd number: \
-$9 div 2 = 4$, since `0b0000_1001` shifted right is `0b0000_0100`.
+Of course, we lose the remainder when we right-shift an odd number: \
+$9$ shifted right is $4$, since `0b0000_1001` shifted right is `0b0000_0100`.
#problem()
Right shifts are denoted by the `>>` symbol: \
@@ -86,6 +91,7 @@ Find the value of the following:
- $12 #text[`>>`] 1$
- $27 #text[`>>`] 3$
- $16 #text[`>>`] 8$
+#note[Naturally, you'll have to convert these integers to binary first.]
#solution[
- $12 #text[`>>`] 1 = 6$

View File

@@ -3,7 +3,7 @@
= Floats
#definition()
-_Binary decimals_#footnote["decimal" is a misnomer, but that's ok.] are very similar to base-10 decimals. \
+_Binary decimals_#footnote([Note that "binary decimal" is a misnomer---"deci" means "ten"!]) are very similar to base-10 decimals. \
In base 10, we interpret place value as follows:
- $0.1 = 10^(-1)$
- $0.03 = 3 times 10^(-2)$
@@ -107,11 +107,13 @@ Floats represent a subset of the real numbers, and are interpreted as follows: \
- The next eight bits represent the _exponent_ of this float.
#note([(we'll see what that means soon)]) \
We'll call the value of this eight-bit binary integer $E$. \
-Naturally, $0 <= E <= 255$ #note([(since $E$ consist of eight bits.)])
+Naturally, $0 <= E <= 255$ #note([(since $E$ consists of eight bits)])
-- The remaining 23 bits represent the _fraction_ of this float, which we'll call $F$. \
-These 23 bits are interpreted as the fractional part of a binary decimal. \
-For example, the bits `0b10100000_00000000_00000000` represents $0.5 + 0.125 = 0.625$.
+- The remaining 23 bits represent the _fraction_ of this float. \
+They are interpreted as the fractional part of a binary decimal. \
+For example, the bits `0b10100000_00000000_00000000` represent $0.5 + 0.125 = 0.625$. \
+We'll call the value of these bits, read as a binary integer, $F$. \
+Their value as a binary decimal is then $F div 2^(23)$. #note([(convince yourself of this)])
#problem(label: "floata") #problem(label: "floata")
@@ -135,12 +137,17 @@ $
(-1)^s times 2^(E - 127) times (1 + F / (2^(23)))
$
-Notice that this is very similar to decimal scientific notation, which is written as
+Notice that this is very similar to base-10 scientific notation, which is written as
$
(-1)^s times 10^(e) times (f)
$
+#note[
+We subtract 127 from $E$ so we can represent both positive and negative exponents. \
+$E$ is an eight-bit binary integer, so $0 <= E <= 255$ and thus $-127 <= (E - 127) <= 128$.
+]
#problem()
Consider `0b01000001_10101000_00000000_00000000`. \
This is the same bit string we used in @floata. \

View File

@@ -5,7 +5,7 @@
= Integers and Floats
#generic("Observation:")
-For small values of $x$, $log_2(1 + x)$ is approximately equal to $x$. \
+If $x$ is between $0$ and $1$, $log_2(1 + x)$ is approximately equal to $x$. \
Note that this approximation is exact for $x = 0$ and $x = 1$, since $log_2(1) = 0$ and $log_2(2) = 1$.
#v(5mm)
@@ -18,7 +18,7 @@ This allows us to improve the average error of our linear approximation:
align: center,
columns: (1fr, 1fr),
inset: 5mm,
-[$log(1+x)$ and $x + 0$]
+[$log_2(1+x)$ and $x + 0$]
+ cetz.canvas({
import cetz.draw: *
@@ -64,7 +64,7 @@ This allows us to improve the average error of our linear approximation:
Max error: 0.086 \
Average error: 0.0573
],
-[$log(1+x)$ and $x + 0.045$]
+[$log_2(1+x)$ and $x + 0.045$]
+ cetz.canvas({
import cetz.draw: *
@@ -125,7 +125,7 @@ We won't bother with this---we'll simply leave the correction term as an opaque
[
"Average error" above is simply the area of the region between the two graphs:
$
-integral_0^1 abs( #v(1mm) log(1+x) - (x+epsilon) #v(1mm))
+integral_0^1 abs( #v(1mm) log_2(1+x) - (x+epsilon) #v(1mm))
$
Feel free to ignore this note; it isn't a critical part of this handout.
],

View File

@@ -2,10 +2,9 @@
= The Fast Inverse Square Root
-The following code is present in _Quake III Arena_ (1999):
-#v(5mm)
+A simplified version of the _Quake_ routine we are studying is reproduced below.
+#v(2mm)
```c
float Q_rsqrt( float number ) {
@@ -15,20 +14,20 @@ float Q_rsqrt( float number ) {
}
```
-#v(5mm)
+#v(2mm)
This code defines a function `Q_rsqrt` that consumes a float `number` and approximates its inverse square root.
If we rewrite this using notation we're familiar with, we get the following:
$
#text[`Q_sqrt`] (n_f) =
-#h(5mm)
1597463007 - (n_i div 2)
-#h(5mm)
+#h(10mm)
approx 1 / sqrt(n_f)
$
#note[
`0x5f3759df` is $1597463007$ in decimal. \
+Ask an instructor to explain if you don't know what this means. \
It is a magic number hard-coded into `Q_sqrt`.
]
@@ -56,7 +55,7 @@ For those that are interested, here are the details of the "code-to-math" transl
- Notice the right-shift in the second line of the function. \
-We translated `(i >> i)` into $(n_i div 2)$.
+We translated `(i >> 1)` into $(n_i div 2)$.
#v(2mm)
- "`return * (float *) &i`" is again C magic. \
@@ -64,17 +63,17 @@ For those that are interested, here are the details of the "code-to-math" transl
#pagebreak()
#generic("Setup:")
-We are now ready to show that $#text[`Q_sqrt`] (x) approx 1/sqrt(x)$. \
+We are now ready to show that $#text[`Q_sqrt`] (x)$ effectively approximates $1/sqrt(x)$. \
For convenience, let's call the bit string of the inverse square root $r$. \
In other words,
$
r_f := 1 / (sqrt(n_f))
$
-This is the value we want to approximate.
+This is the value we want to approximate. \
#problem(label: "finala")
Find an approximation for $log_2(r_f)$ in terms of $n_i$ and $epsilon$. \
-#note[Remember, $epsilon$ is the correction constant in our approximation of $log_2(1 + a)$.]
+#note[Remember, $epsilon$ is the correction constant in our approximation of $log_2(1 + x)$.]
#solution[
$
@@ -92,7 +91,11 @@ Let's call the "magic number" in the code above $kappa$, so that
$
#text[`Q_sqrt`] (n_f) = kappa - (n_i div 2)
$
-Use @convert and @finala to show that $#text[`Q_sqrt`] (n_f) approx r_i$
+Use @convert and @finala to show that $#text[`Q_sqrt`] (n_f) approx r_i$ \
+#note(type: "Note")[
+If we know $r_i$, we know $r_f$. \
+We don't even need to convert between the two---the underlying bits are the same!
+]
#solution[
From @convert, we know that
@@ -164,8 +167,7 @@ though it is fairly close to the ideal $epsilon$.
#remark()
And now, we're done! \
-We've shown that `Q_sqrt(x)` approximates $1/sqrt(x)$ fairly well, \
-thanks to the approximation $log(1+a) = a + epsilon$.
+We've shown that `Q_sqrt(x)` approximates $1/sqrt(x)$ fairly well. \
#v(2mm)