Polish
		@@ -2,8 +2,7 @@
% use [solutions] flag to show solutions.
\documentclass[
	solutions,
	singlenumbering,
	unfinished
	singlenumbering
]{../../resources/ormc_handout}
\usepackage{../../resources/macros}
@@ -27,8 +27,9 @@ How many bits will we need? \par
\problem{}<naivelen>
Similarly, we can use a na\"ive coding scheme to encode an $n$-symbol string over an alphabet of size $k$ \par
using $n \times \lceil \log_2k \rceil$ bits. Convince yourself that this is true.
Similarly, we can encode an $n$-symbol string over an alphabet of size $k$ \par
using $n \times \lceil \log_2k \rceil$ bits. Show that this is true. \par
\note[Note]{We'll call this the \textit{na\"ive coding scheme}.}
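As a quick sanity check of the bit count above, here is a minimal sketch in Python. The function name `naive_bits` is our own, chosen for illustration; it is not part of the handout.

```python
def naive_bits(n: int, k: int) -> int:
    """Bits used by the naive scheme for an n-symbol string over k symbols."""
    if n < 0 or k < 1:
        raise ValueError("need n >= 0 and k >= 1")
    # (k - 1).bit_length() equals ceil(log2(k)) for every integer k >= 1,
    # and avoids floating-point rounding in math.log2.
    bits_per_symbol = (k - 1).bit_length()
    return n * bits_per_symbol


if __name__ == "__main__":
    print(naive_bits(8, 4))   # 8 symbols, 2 bits each
    print(naive_bits(5, 26))  # 5 symbols, 5 bits each
```

Each symbol gets its own fixed-width codeword, so the total is simply the string length times the codeword width.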
\vfill
@@ -49,8 +49,13 @@ Using a na\"ive coding scheme, encode \texttt{AAAA$\cdot$AAAA$\cdot$BCD$\cdot$AA
\vfill
In \ref{runlenone}---and often, in the real world---the strings we want to encode have fairly low \textit{entropy}. \par
They have predictable patterns, sequences of symbols that don't contain a lot of information. \par
We can exploit this fact to develop efficient encoding schemes.
That is, they have predictable patterns, sequences of symbols that don't contain a lot of information. \par
\note{
	For example, consider the text in this document. \par
	The symbols \texttt{e}, \texttt{t}, and \texttt{<space>} are much more common than any others. \par
	Also, certain subsequences are repeated: \texttt{th}, \texttt{and}, \texttt{encode}, and so on.
}
We can exploit this fact to develop encoding schemes that need relatively few bits per letter.
\example{}
A simple example of such a coding scheme is \textit{run-length encoding}. Instead of simply listing letters of a string
@@ -88,10 +93,18 @@ We'll encode our string into a sequence of 6-bit blocks, interpreted as follows:
\end{center}
So, the sequence \texttt{BBB} will be encoded as \texttt{[0011-01]}. \par
\note[Notation]{
	Just like dots, dashes and spaces are added for readability. \par
	Just like dots, dashes and spaces are added for readability. Pretend they don't exist. \par
	Encoded binary sequences will always be written in square brackets. \texttt{[]}.
}
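The 6-bit block scheme can be sketched as a small decoder. The block layout used here is an assumption inferred from the examples in this section (\texttt{BBB} becomes \texttt{[0011-01]}): a 4-bit run length followed by a 2-bit symbol code, with \texttt{A}, \texttt{B}, \texttt{C}, \texttt{D} mapped to \texttt{00}, \texttt{01}, \texttt{10}, \texttt{11}. The name `rle_decode` is ours.

```python
# Assumed symbol table: A=00, B=01, C=10, D=11 (inferred, not from the handout's table).
SYMBOLS = "ABCD"


def rle_decode(encoded: str) -> str:
    """Decode a string of 6-bit blocks: 4 bits of run length, then 2 bits of symbol."""
    # Brackets, dashes, dots and spaces are only there for readability: drop them.
    bits = "".join(c for c in encoded if c in "01")
    if len(bits) % 6 != 0:
        raise ValueError("encoded length must be a multiple of 6 bits")
    out = []
    for i in range(0, len(bits), 6):
        block = bits[i:i + 6]
        count = int(block[:4], 2)            # run length, 0 to 15
        symbol = SYMBOLS[int(block[4:], 2)]  # which letter is repeated
        out.append(symbol * count)
    return "".join(out)


if __name__ == "__main__":
    print(rle_decode("[0011-01]"))
```

With a 4-bit length field, a single block can describe a run of at most 15 symbols, so longer runs need more than one block.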
\problem{}
Decode \texttt{[010000001111]} using this scheme.
\begin{solution}
	\texttt{AAAADDD}
\end{solution}
\vfill
\problem{}
Encode \texttt{AAAA$\cdot$AAAA$\cdot$BCD$\cdot$AAAA$\cdot$AAAA} using this scheme. \par
Is this more or less efficient than \ref{runlenone}?
@@ -109,12 +122,26 @@ Is this more or less efficient than \ref{runlenone}?
\problem{}
Give an example of a message on $\{\texttt{A}, \texttt{B}, \texttt{C}, \texttt{D}\}$
that uses $n$ bits when encoded with a na\"ive scheme, and \textit{fewer} than $\nicefrac{n}{2}$ bits
when encoded using the scheme described on the previous page.
\vfill
\problem{}
Give an example of a message on $\{\texttt{A}, \texttt{B}, \texttt{C}, \texttt{D}\}$
that uses $n$ bits when encoded with a na\"ive scheme, and \textit{more} than $2n$ bits
when encoded using the scheme described on the previous page.
\vfill
\problem{}
Is run-length coding always efficient? When does it work well, and when does it fail?
Is run-length coding always more efficient than na\"ive coding? \par
When does it work well, and when does it fail?
\vfill
@@ -21,7 +21,8 @@ Pointers take the form \texttt{<pos, len>}, where \texttt{pos} is the position o
For example, we can encode the string \texttt{ABRACADABRA} as \texttt{[ABRACAD<7, 4>]}. \par
The pointer \texttt{<7, 4>} tells us to look back 7 positions (to the first \texttt{A}), and copy the next 4 symbols. \par
Note that pointers refer to the partially decoded output---\textit{not} to the encoded string. \par
This allows pointers to reference other pointers, and ensures that codes like \texttt{A<1,9>} are valid.
This allows pointers to reference other pointers, and ensures that codes like \texttt{A<1,9>} are valid. \par
\note{For example, \texttt{[B<1,2>]} decodes to \texttt{BBB}.}
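To see why pointers may overlap the text they are still producing, here is a minimal decoding sketch. The token representation (plain strings for literals, `(pos, len)` tuples for pointers) and the name `lz_decode` are our own assumptions for illustration; the handout only fixes the \texttt{<pos, len>} notation.

```python
def lz_decode(tokens):
    """Decode a list of literals (str) and pointers ((pos, length) tuples).

    A pointer looks back `pos` symbols in the output decoded so far and
    copies `length` symbols, one at a time.
    """
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            pos, length = tok
            if pos < 1 or pos > len(out):
                raise ValueError("pointer reaches before the start of the output")
            for _ in range(length):
                # Copying one symbol at a time lets a pointer read symbols
                # that the same pointer has just written.
                out.append(out[-pos])
        else:
            out.extend(tok)
    return "".join(out)


if __name__ == "__main__":
    print(lz_decode(["ABRACAD", (7, 4)]))
    print(lz_decode(["B", (1, 2)]))
```

Because each copied symbol is appended before the next one is read, a code like \texttt{A<1,9>} keeps re-reading the symbol it just wrote and expands to ten copies of \texttt{A}.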
\problem{}
Encode \texttt{ABCD$\cdot$ABCD$\cdot$BABABA$\cdot$ABCD$\cdot$ABCD} using this scheme. \par