Finish main sections
parent
ba374a5ee2
commit
2259ce1bcb
@ -1,11 +1,5 @@
|
||||
#import "@local/handout:0.1.0": *
|
||||
|
||||
// Intro:
|
||||
// - we don't need signed ints
|
||||
// - Why is division expensive?
|
||||
// Add a few problems add/multiplying/dividing floats
|
||||
// - Spend more time on left-shift
|
||||
//
|
||||
// Another look:
|
||||
// - Highlight that left-shift divides the exponent by two
|
||||
// - Highlight that log(ri) is already in its integer representation
|
||||
@ -19,10 +13,8 @@
|
||||
#show: doc => handout(
|
||||
doc,
|
||||
group: "Advanced 2",
|
||||
|
||||
title: [Fast Inverse Root],
|
||||
by: "Mark",
|
||||
subtitle: "Based on a handout by Bryant Mathews",
|
||||
)
|
||||
|
||||
#include "parts/00 int.typ"
|
||||
|
@ -22,57 +22,32 @@ What is the value of the following bit strings, if we interpret them as integers
|
||||
])
|
||||
|
||||
#v(1fr)
|
||||
#pagebreak()
|
||||
|
||||
#definition()
|
||||
We can interpret a bit string in any number of ways. \
|
||||
One such interpretation is the _signed integer_, or `int` for short. \
|
||||
`int`s allow us to represent negative and positive integers using 32-bit strings.
|
||||
One such interpretation is the _unsigned integer_, or `uint` for short. \
|
||||
`uint`s allow us to represent nonnegative (hence "unsigned") integers using 32-bit strings.
|
||||
|
||||
#v(2mm)
|
||||
|
||||
The first bit of an `int` tells us its sign:
|
||||
- if the first bit is `1`, the _int_ represents a negative number;
|
||||
- if the first bit is `0`, it represents a positive number.
|
||||
|
||||
We do not need negative numbers today, so we will assume that the first bit is always zero. \
|
||||
#note([If you'd like to know how negative integers are written, look up "two's complement" after class.])
|
||||
|
||||
#v(2mm)
|
||||
|
||||
The value of a positive signed `long` is simply the value of its binary digits:
|
||||
The value of a `uint` is simply its value as a binary number:
|
||||
- $#text([`0b00000000_00000000_00000000_00000000`]) = 0$
|
||||
- $#text([`0b00000000_00000000_00000000_00000011`]) = 3$
|
||||
- $#text([`0b00000000_00000000_00000000_00100000`]) = 32$
|
||||
- $#text([`0b00000000_00000000_00000000_10000010`]) = 130$
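
As a rough illustration (not part of the original handout), the C sketch below reads a string of binary digits as a `uint`. The helper name `bits_to_uint` is made up for this example. It assumes the string holds at most 32 binary digits, has no `0b` prefix, and may use underscores as separators.

```c
#include <stdint.h>

// Hypothetical helper: interpret a string of '0'/'1' characters as an
// unsigned 32-bit integer. Underscores are skipped so that the digits can
// be grouped into bytes, as in the examples above.
uint32_t bits_to_uint(const char *bits) {
    uint32_t value = 0;
    for (const char *p = bits; *p != '\0'; p++) {
        if (*p == '_') continue;                       // visual separator only
        value = (value << 1) | (uint32_t)(*p == '1');  // append one bit on the right
    }
    return value;
}
```

For instance, `bits_to_uint("00000000_00000000_00000000_00100000")` evaluates to 32, matching the third example above.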
|
||||
|
||||
#problem()
|
||||
What is the largest number we can represent with a 32-bit `int`?
|
||||
What is the largest number we can represent with a 32-bit `uint`?
|
||||
|
||||
#solution([
|
||||
$#text([`0b01111111_11111111_11111111_11111111`]) = 2^(31) - 1$
|
||||
])
|
||||
|
||||
#v(1fr)
|
||||
#pagebreak()
|
||||
|
||||
#problem()
|
||||
What is the smallest possible number we can represent with a 32-bit `int`? \
|
||||
#hint([
|
||||
You do not need to know _how_ negative numbers are represented. \
|
||||
Assume that we do not skip any integers, and don't forget about zero.
|
||||
])
|
||||
|
||||
#solution([
|
||||
There are $2^(32)$ possible 32-bit patterns,
of which 1 represents zero and $2^(31) - 1$ represent positive numbers.
We therefore have access to $2^(32) - 1 - (2^(31) - 1) = 2^(31)$ negative numbers,
giving us a minimum representable value of $-2^(31)$.
|
||||
])
|
||||
|
||||
#v(1fr)
|
||||
|
||||
#problem()
|
||||
Find the value of each of the following 32-bit `int`s:
|
||||
Find the value of each of the following 32-bit unsigned integers:
|
||||
- `0b00000000_00000000_00000101_00111001`
|
||||
- `0b00000000_00000000_00000001_00101100`
|
||||
- `0b00000000_00000000_00000100_10110000`
|
||||
@ -82,8 +57,40 @@ Find the value of each of the following 32-bit `int`s:
|
||||
- $#text([`0b00000000_00000000_00000101_00111001`]) = 1337$
|
||||
- $#text([`0b00000000_00000000_00000001_00101100`]) = 300$
|
||||
- $#text([`0b00000000_00000000_00000100_10110000`]) = 1200$
|
||||
])
|
||||
Notice that the third int is the second shifted left twice (i.e., multiplied by 4)
|
||||
Notice that the third int is the second shifted left twice (i.e., multiplied by 4)
|
||||
])
|
||||
|
||||
#v(2fr)
|
||||
#v(1fr)
|
||||
|
||||
|
||||
#definition()
|
||||
In general, division of `uint`s is nontrivial#footnote([One may use repeated subtraction, but that isn't efficient.]). \
|
||||
Division by powers of two, however, is incredibly easy: \
|
||||
To divide by two, all we need to do is shift the bits of our integer right.
|
||||
|
||||
#v(2mm)
|
||||
|
||||
For example, consider $#text[`0b0000_0110`] = 6$. \
|
||||
If we insert a zero at the left end of this bit string and delete the digit at the right \
|
||||
(thus "shifting" each bit right), we get `0b0000_0011`, which is 3. \
|
||||
|
||||
#v(2mm)
|
||||
|
||||
Of course, we lose the remainder when we right-shift an odd number: \
|
||||
$9 div 2 = 4$, since `0b0000_1001` shifted right is `0b0000_0100`.
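
The C sketch below is one way to check this claim on a computer. The function names `shift_right` and `divide_by_power_of_two` are chosen for this example only, and the code assumes the shift count `n` is less than 32.

```c
#include <stdint.h>

// Shift x right n times. For unsigned integers, each shift halves the
// value and discards the remainder. n must be less than 32.
uint32_t shift_right(uint32_t x, unsigned n) {
    return x >> n;
}

// The same quantity computed with ordinary integer division by 2^n.
uint32_t divide_by_power_of_two(uint32_t x, unsigned n) {
    return x / ((uint32_t)1 << n);
}
```

Under these assumptions the two functions agree for every input: `shift_right(6, 1)` is 3 and `shift_right(9, 1)` is 4, just like the examples above.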
|
||||
|
||||
#problem()
|
||||
Right shifts are denoted by the `>>` symbol: \
|
||||
$#text[`0b0110`] #text[`>>`] n$ means "shift `0b0110` right $n$ times." \
|
||||
Find the value of the following:
|
||||
- $12 #text[`>>`] 1$
|
||||
- $27 #text[`>>`] 3$
|
||||
- $16 #text[`>>`] 8$
|
||||
|
||||
#solution[
|
||||
- $12 #text[`>>`] 1 = 6$
|
||||
- $27 #text[`>>`] 3 = 3$
|
||||
- $16 #text[`>>`] 8 = 0$
|
||||
]
|
||||
|
||||
#v(1fr)
|
||||
|
@ -159,3 +159,15 @@ $
|
||||
])
|
||||
|
||||
#v(1fr)
|
||||
|
||||
|
||||
#problem()
|
||||
Using basic log rules, rewrite $log_2(1 / sqrt(x))$ in terms of $log_2(x)$.
|
||||
|
||||
#solution([
|
||||
$
|
||||
log_2(1 / sqrt(x)) = (-1) / (2)log_2(x)
|
||||
$
|
||||
])
|
||||
|
||||
#v(1fr)
|
||||
|
@ -20,14 +20,17 @@ float Q_rsqrt( float number ) {
|
||||
This code defines a function `Q_rsqrt` that consumes a float `number` and approximates its inverse square root.
|
||||
If we rewrite this using notation we're familiar with, we get the following:
|
||||
$
|
||||
#text[`Q_rsqrt`] (n_f) = 1597463007 - (n_i div 2)
#text[`Q_rsqrt`] (n_f) =
#h(5mm)
1597463007 - (n_i div 2)
#h(5mm)
approx 1 / sqrt(n_f)
$

#note([
#note[
`0x5f3759df` is $1597463007$ written in hexadecimal. \
It is a magic number hard-coded into `Q_rsqrt`.
|
||||
])
|
||||
]
|
||||
|
||||
#v(2mm)
|
||||
|
||||
@ -36,16 +39,28 @@ Our goal in this section is to understand why this works:
|
||||
- What's special about $1597463007$ (that is, `0x5f3759df`)?
|
||||
|
||||
|
||||
#problem()
|
||||
Using basic log rules, rewrite $log_2(1 / sqrt(x))$ in terms of $log_2(x)$.
|
||||
|
||||
#solution([
|
||||
$
|
||||
log_2(1 / sqrt(x)) = (-1) / (2)log_2(x)
|
||||
$
|
||||
])
|
||||
|
||||
#v(1fr)
|
||||
|
||||
#remark()
|
||||
For those that are interested, here are the details of the "code-to-math" translation:
|
||||
|
||||
- "`long i = * (long *) &number`" is C magic that tells the compiler \
|
||||
to set `i` to the `uint` value of the bits of `number`. \
|
||||
#note[
|
||||
"long" refers to a "long integer", which has 32 bits on the machines this code was written for. \
The C standard only guarantees minimum sizes: `short` and `int` have at least 16 bits, `long` has at least 32.
|
||||
] \
|
||||
In other words, `number` is $n_f$ and `i` is $n_i$.
|
||||
#v(2mm)
|
||||
|
||||
|
||||
- Notice the right-shift in the second line of the function. \
|
||||
We translated `(i >> 1)` into $(n_i div 2)$.
|
||||
#v(2mm)
|
||||
|
||||
- "`return * (float *) &i`" is again C magic. \
|
||||
Much like before, it tells us to return the value of the bits of `i` as a float.
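
For readers who want to try the "C magic" themselves, here is a minimal sketch of the two conversions. It is not the code from _Quake_: the names `float_to_bits` and `bits_to_float` are invented for this example, and it uses `memcpy` instead of pointer casts, because modern C compilers are not required to honour that kind of cast (the "strict aliasing" rule). It also assumes that `float` is a 32-bit IEEE 754 value.

```c
#include <stdint.h>
#include <string.h>

// Read the bits of a float as an unsigned 32-bit integer (n_f -> n_i).
// Assumes float is a 32-bit IEEE 754 value.
uint32_t float_to_bits(float f) {
    uint32_t i;
    memcpy(&i, &f, sizeof i);  // copy the raw bytes, no numeric conversion
    return i;
}

// Read the bits of an unsigned 32-bit integer as a float (n_i -> n_f).
float bits_to_float(uint32_t i) {
    float f;
    memcpy(&f, &i, sizeof f);  // copy the raw bytes, no numeric conversion
    return f;
}
```

Under these assumptions, `float_to_bits(1.0f)` gives `0x3f800000`, and converting back with `bits_to_float` recovers the original float exactly.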
|
||||
#pagebreak()
|
||||
|
||||
#generic("Setup:")
|
||||
@ -144,6 +159,22 @@ $
|
||||
$
|
||||
|
||||
So, $0.045$ is the $epsilon$ used by Quake. \
|
||||
This constant was likely generated by trial-and-error, and is fairly close to the ideal $epsilon$.
|
||||
Online sources state that this constant was generated by trial-and-error, \
|
||||
though it is fairly close to the ideal $epsilon$.
|
||||
|
||||
#v(4mm)
|
||||
|
||||
#remark()
|
||||
And now, we're done! \
|
||||
We've shown that `Q_rsqrt(x)` approximates $1/sqrt(x)$ fairly well, \
thanks to the approximation $log_2(1+a) approx a + epsilon$.
|
||||
|
||||
#v(2mm)
|
||||
|
||||
Notably, `Q_rsqrt` uses _zero_ divisions or multiplications (`>>` doesn't count). \
|
||||
This makes it _very_ fast when compared to more traditional approximation techniques like Fourier series.
|
||||
|
||||
#v(2mm)
|
||||
|
||||
In the case of _Quake_, this is very important. 3D graphics require thousands of inverse-square-root calculations to render a single frame#footnote[e.g., to generate normal vectors], which is not an easy task for a PlayStation running at 300 MHz.
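
To see the whole approximation in action, the sketch below mirrors the three-line version of the function discussed in this section. It is an illustration under stated assumptions, not the _Quake_ source: the name `q_rsqrt_sketch` is made up, `memcpy` replaces the pointer casts, `float` is assumed to be a 32-bit IEEE 754 value, and the input is assumed to be positive.

```c
#include <stdint.h>
#include <string.h>

// Simplified fast inverse square root: subtract half of the integer
// representation from the magic constant, then read the result as a float.
// Assumes 32-bit IEEE 754 floats and number > 0.
float q_rsqrt_sketch(float number) {
    uint32_t i;
    memcpy(&i, &number, sizeof i);       // n_i: the bits of n_f as an integer
    i = 0x5f3759dfu - (i >> 1);          // magic constant minus (n_i div 2)
    float result;
    memcpy(&result, &i, sizeof result);  // read the new bits as a float
    return result;
}
```

With these assumptions, `q_rsqrt_sketch(4.0f)` comes out at roughly 0.48, within a few percent of the exact value 0.5.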
|
||||
|
||||
#if_no_solutions(v(2cm))
|
||||
|