Tom edits
207
src/Advanced/Fast Inverse Root/parts/02 float.typ
Normal file
@ -0,0 +1,207 @@
#import "@local/handout:0.1.0": *
#import "@preview/cetz:0.3.1"

= Floats
#definition()
_Binary decimals_#footnote([Note that "binary decimal" is a misnomer---"deci" means "ten"!]) are very similar to base-10 decimals.\
In base 10, we interpret place value as follows:
- $0.1 = 10^(-1)$
- $0.03 = 3 times 10^(-2)$
- $0.0208 = 2 times 10^(-2) + 8 times 10^(-4)$

#v(5mm)

We can do the same in base 2:
- $#text([`0.1`]) = 2^(-1) = 0.5$
- $#text([`0.011`]) = 2^(-2) + 2^(-3) = 0.375$
- $#text([`101.01`]) = 2^2 + 2^0 + 2^(-2) = 5.25$
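
The place-value rule above can also be checked in code. The Python snippet below is only a minimal sketch for illustration; the function name `binary_decimal_value` is made up for this example and is not part of the handout.

```python
def binary_decimal_value(bits: str) -> float:
    """Evaluate a binary decimal such as "101.01" using place values."""
    whole, _, frac = bits.partition(".")
    value = 0.0
    # Digits left of the point carry weights 2^0, 2^1, 2^2, ... (right to left).
    for power, digit in enumerate(reversed(whole)):
        value += int(digit) * 2.0 ** power
    # Digits right of the point carry weights 2^(-1), 2^(-2), ... (left to right).
    for power, digit in enumerate(frac, start=1):
        value += int(digit) * 2.0 ** -power
    return value


print(binary_decimal_value("0.1"))     # 0.5
print(binary_decimal_value("0.011"))   # 0.375
print(binary_decimal_value("101.01"))  # 5.25
```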

#v(5mm)

#problem()
Rewrite the following binary decimals in base 10: \
#note([You may leave your answer as a fraction.])
- `1011.101`
- `110.1101`


#v(1fr)
#pagebreak()

#definition()
Another way we can interpret a bit string is as a _signed floating-point decimal_, or a `float` for short. \
Floats represent a subset of the real numbers, and are interpreted as follows: \
#note([The following only applies to floats that consist of 32 bits. We won't encounter any others today.])

#align(
  center,
  box(
    inset: 2mm,
    cetz.canvas({
      import cetz.draw: *

      let chars = (
        `0`, `b`,
        `0`, `_`,
        `0`, `0`, `0`, `0`, `0`, `0`, `0`, `0`, `_`,
        `0`, `0`, `0`, `0`, `0`, `0`, `0`, `_`,
        `0`, `0`, `0`, `0`, `0`, `0`, `0`, `0`, `_`,
        `0`, `0`, `0`, `0`, `0`, `0`, `0`, `0`,
      )

      let x = 0
      for c in chars {
        content((x, 0), c)
        x += 0.25
      }

      let y = -0.4
      line((0.3, y), (0.65, y))
      content((0.45, y - 0.2), [s])

      line((0.85, y), (2.9, y))
      content((1.9, y - 0.2), [exponent])

      line((3.10, y), (9.4, y))
      content((6.3, y - 0.2), [fraction])
    }),
  ),
)

- The first bit denotes the sign of the float's value. \
  We'll label it $s$. \
  If $s = #text([`1`])$, this float is negative; if $s = #text([`0`])$, it is positive.

- The next eight bits represent the _exponent_ of this float.
  #note([(we'll see what that means soon)]) \
  We'll call the value of this eight-bit binary integer $E$. \
  Naturally, $0 <= E <= 255$ #note([(since $E$ consists of eight bits)])

- The remaining 23 bits represent the _fraction_ of this float. \
  They are interpreted as the fractional part of a binary decimal. \
  For example, the bits `0b1010000_00000000_00000000` represent $0.5 + 0.125 = 0.625$. \
  We'll call the value of these bits as a binary integer $F$. \
  Their value as a binary decimal is then $F div 2^23$. #note([(convince yourself of this)])
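
The three fields described above can be pulled out of a 32-bit pattern with shifts and masks. The following Python sketch is only an illustration; the helper name `split_float_bits` and the example bit string are assumptions made for this example, not something the handout defines.

```python
def split_float_bits(x: int):
    """Split a 32-bit pattern into the fields (s, E, F)."""
    s = (x >> 31) & 0x1   # first bit: sign
    E = (x >> 23) & 0xFF  # next eight bits: exponent, as a binary integer
    F = x & 0x7FFFFF      # remaining 23 bits: fraction, as a binary integer
    return s, E, F


# Illustrative pattern: sign 1, exponent 01111111, fraction 1000000_00000000_00000000
x = 0b1_01111111_1000000_00000000_00000000
s, E, F = split_float_bits(x)
print(s, E, F)      # 1 127 4194304
print(F / 2**23)    # 0.5, the fraction read as a binary decimal
```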


#problem(label: "floata")
Consider `0b01000001_10101000_00000000_00000000`. \
Find the $s$, $E$, and $F$ we get if we interpret this bit string as a `float`. \
#note([Leave $F$ as a sum of powers of two.])

#solution([
  $s = 0$ \
  $E = 2^7 + 2^1 + 2^0 = 131$ \
  $F = 2^21 + 2^19 = 2,621,440$
])

#v(1fr)


#definition(label: "floatdef")
The final value of a float with sign $s$, exponent $E$, and fraction $F$ is

$
  (-1)^s times 2^(E - 127) times (1 + F / (2^(23)))
$

Notice that this is very similar to base-10 scientific notation, which is written as

$
  (-1)^s times 10^(e) times (f)
$

#note[
  We subtract 127 from $E$ so that the exponent can be either positive or negative. \
  $E$ is an eight-bit binary integer, so $0 <= E <= 255$ and thus $-127 <= (E - 127) <= 128$.
]
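
As a sanity check, the formula above can be compared against what a computer does with the same bits. The Python sketch below is a minimal illustration, with made-up helper names `float_value` and `hardware_value`; it relies on the standard `struct` module to reinterpret bits. The two results are only expected to agree when $E$ is between 1 and 254, since the standard 32-bit format reserves $E = 0$ and $E = 255$ for special values that this handout does not cover.

```python
import struct


def float_value(s: int, E: int, F: int) -> float:
    """Apply the formula (-1)^s * 2^(E - 127) * (1 + F / 2^23)."""
    return (-1) ** s * 2.0 ** (E - 127) * (1 + F / 2**23)


def hardware_value(x: int) -> float:
    """Reinterpret the 32-bit pattern x as a float, using the machine's own rules."""
    return struct.unpack(">f", struct.pack(">I", x))[0]


# Illustrative fields: s = 1, E = 128, F = 2^22 (0.5 as a binary decimal)
s, E, F = 1, 128, 2**22
x = (s << 31) | (E << 23) | F  # glue the three fields back into one bit string
print(float_value(s, E, F))    # -3.0
print(hardware_value(x))       # -3.0
```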

#problem()
Consider `0b01000001_10101000_00000000_00000000`. \
This is the same bit string we used in @floata. \

#v(2mm)

What value do we get if we interpret this bit string as a float? \
#hint([$21 div 16 = 1.3125$])

#solution([
  This is 21:
  $
    2^(131 - 127) times (1 + (2^(21) + 2^(19)) / (2^(23)))
    = 2^(4) times (1 + 0.25 + 0.0625)
    = 16 times (1.3125)
    = 21
  $
])

#v(1fr)
#pagebreak()

#problem()
Encode $12.5$ as a float. \
#hint([$12.5 div 8 = 1.5625$])

#solution([
  $
    12.5
    = 8 times 1.5625
    = 2^(3) times (1 + (0.5 + 0.0625))
    = 2^(130 - 127) times (1 + (2^(22) + 2^(19)) / (2^(23)))
  $

  which is `0b01000001_01001000_00000000_00000000`. \
])


#v(1fr)

#definition()
Say we have a bit string $x$. \
We'll let $x_f$ denote the value we get if we interpret $x$ as a float, \
and we'll let $x_i$ denote the value we get if we interpret $x$ as an integer.
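
The two readings of one bit string can be tried out in code. The Python sketch below is only an illustration: the helper names `as_float` and `as_int` are made up for this example, and it assumes "integer" means an unsigned 32-bit integer.

```python
import struct


def as_float(x: int) -> float:
    """The float reading: interpret the 32 bits of x as a float."""
    return struct.unpack(">f", struct.pack(">I", x))[0]


def as_int(value: float) -> int:
    """The reverse direction: the unsigned integer whose bits encode value."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


x = 0b0_01111111_0000000_00000000_00000000  # illustrative bit string
print(x)                 # integer reading: 1065353216
print(as_float(x))       # float reading: 1.0
print(as_int(1.0) == x)  # True: same bits, two readings
```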

#problem()
Let $x = #text[`0b01000001_01001000_00000000_00000000`]$. \
What are $x_f$ and $x_i$? #note([As always, you may leave big numbers as powers of two.])
#solution([
  $x_f = 12.5$

  #v(2mm)

  $x_i = 2^30 + 2^24 + 2^22 + 2^19 = 1,095,237,632$
])

#v(1fr)