Polish
@@ -3,7 +3,7 @@

\example{}
Now consider the alphabet $\{\texttt{A}, \texttt{B}, \texttt{C}, \texttt{D}, \texttt{E}\}$. \par
With a na\"ive coding scheme, we can encode a length-$n$ string with $3n$ bits, by mapping...
With a na\"ive coding scheme, we can encode a length $n$ string with $3n$ bits, by mapping...
\begin{itemize}
\item $\texttt{A}$ to $\texttt{000}$
\item $\texttt{B}$ to $\texttt{001}$
@@ -12,12 +12,12 @@ With a na\"ive coding scheme, we can encode a length-$n$ string with $3n$ bits,
\item $\texttt{E}$ to $\texttt{100}$
\end{itemize}
For example, this encodes \texttt{ADEBCE} as \texttt{[000 011 100 001 010 100]}. \par
To encoding strings over $\{\texttt{A}, \texttt{B}, \texttt{C}, \texttt{D}, \texttt{E}\}$ with this scheme, we
To encode strings over $\{\texttt{A}, \texttt{B}, \texttt{C}, \texttt{D}, \texttt{E}\}$ with this scheme, we
need an average of three bits per symbol.

\vspace{2mm}

One could argue that this coding scheme is wasteful: \par
However, one could argue that this coding scheme is wasteful: \par
we're not using three of the eight possible three-bit sequences!
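
The na\"ive scheme above can be sketched in a few lines of Python. This is only an illustration: the codes for \texttt{C} and \texttt{D} are not listed in this excerpt, so they are inferred from the \texttt{ADEBCE} example, and the names \texttt{NAIVE\_CODE}, \texttt{encode\_naive} and \texttt{unused\_codewords} are made up here.

```python
from itertools import product

# Fixed-width code from the example. The entries for C and D are inferred
# from the worked example ADEBCE -> 000 011 100 001 010 100.
NAIVE_CODE = {"A": "000", "B": "001", "C": "010", "D": "011", "E": "100"}


def encode_naive(message: str) -> str:
    """Encode a string over {A, B, C, D, E} using three bits per symbol."""
    return "".join(NAIVE_CODE[symbol] for symbol in message)


def unused_codewords() -> list[str]:
    """List the three-bit sequences that no symbol is mapped to."""
    all_words = ("".join(bits) for bits in product("01", repeat=3))
    return [word for word in all_words if word not in NAIVE_CODE.values()]


print(encode_naive("ADEBCE"))  # 000011100001010100
print(unused_codewords())      # ['101', '110', '111']
```

The second function makes the waste concrete: three of the eight three-bit words are never used.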

\example{}
@@ -86,9 +86,8 @@ Is this a good way to encode five-letter strings?


\remark{}
The code from the previous page can be visualized as a tree which we traverse while decoding our sequence.
Starting from the topmost node, we take the left edge if we see a \texttt{0} and the right edge if we see a \texttt{1}.
Once we reach a letter, we return to the top node and repeat the process.
The code from the previous page can be visualized as a full binary tree: \par
\note{Every node in a \textit{full binary tree} has either zero or two children.}

\vspace{-5mm}
\null\hfill
@@ -135,10 +134,19 @@ Once we reach a letter, we return to the top node and repeat the process.
\end{center}
\end{minipage}
\hfill\null
You can think of each symbol's code as its \say{address} in this tree.
When decoding a string, we start at the topmost node. Reading the binary sequence
bit by bit, we move down the tree, taking a left edge if we see a \texttt{0}
and a right edge if we see a \texttt{1}.
Once we reach a letter, we return to the top node and repeat the process.
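
The decoding walk described above can be sketched in Python as follows. The tree shape is an assumption: the figure is not part of this excerpt, so the nested pairs below are inferred from the code words used in the later problems (\texttt{A}=\texttt{00}, \texttt{B}=\texttt{01}, \texttt{C}=\texttt{10}, \texttt{D}=\texttt{110}, \texttt{E}=\texttt{111}). The names \texttt{TREE} and \texttt{decode} are illustrative only.

```python
# A full binary tree as nested pairs: (left subtree, right subtree).
# Leaves are single-character strings. The shape is inferred, not taken
# from the figure: A=00, B=01, C=10, D=110, E=111.
TREE = (("A", "B"), ("C", ("D", "E")))


def decode(bits: str, tree=TREE) -> str:
    """Walk the tree bit by bit: 0 takes the left edge, 1 the right edge."""
    decoded = []
    node = tree
    for bit in bits:
        node = node[0] if bit == "0" else node[1]
        if isinstance(node, str):  # reached a letter
            decoded.append(node)
            node = tree            # return to the top node
    if node is not tree:
        raise ValueError("input ends in the middle of a code word")
    return "".join(decoded)


print(decode("100001"))  # CAB
```

No separator between code words is needed: the walk itself tells us where each code word ends.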


\definition{}
We say a coding scheme is \textit{prefix-free} if no whole code word is a prefix of another code word. \par

\problem{}
Convince yourself that trees like the one above always produce a prefix-free code.

\problem{}<treedecode>
Decode \texttt{[110111001001110110]} using the tree above.
@@ -149,6 +157,18 @@ Decode \texttt{[110111001001110110]} using the tree above.

\vfill

\problem{}
Encode \texttt{ABDECBE} using this tree. \par
How many bits do we save over a na\"ive scheme?

\begin{solution}
This is \texttt{[00 01 110 111 10 01 111]}, and saves four bits.
\end{solution}


\vfill
\pagebreak

\problem{}
In \ref{treedecode}, we needed 18 bits to encode \texttt{DEACBDD}. \par
\note{Note that we'd need $3 \times 7 = 21$ bits to encode this string na\"ively.}
@@ -236,13 +256,19 @@ Now, do the opposite: draw a tree that encodes \texttt{DEACBDD} \textit{less} ef
\vfill

\remark{}
We say a coding scheme is \textit{prefix-free} if no whole code word is a prefix of another code word. \par
As we've seen, it is fairly easy to construct a prefix-free variable-length code using a binary tree. \par
As we just saw, constructing a prefix-free code is fairly easy. \par
Constructing the \textit{most efficient} prefix-free code for a given message is a bit more difficult. \par
We'll spend the rest of this section solving this problem.

\pagebreak

\remark{}
Let's restate our problem. \par
Given an alphabet $A$ and a frequency function $f$, we want to construct a binary tree $T$ that minimizes
@@ -270,16 +296,13 @@ Where...

\vspace{2mm}

Also, notice that $\mathcal{B}_f(T)$ is the \say{average bits per symbol} metric we saw in previous problems.
Also notice that $\mathcal{B}_f(T)$ is the \say{average bits per symbol} metric we saw in previous problems.


\problem{}<hufptone>
Let $f$ be a fixed frequency function over an alphabet $A$. \par
Let $T$ be an arbitrary tree for $A$, and let $a, b$ be two symbols in $A$. \par

\vspace{2mm}

Now, construct $T'$ by swapping $a$ and $b$ in $T$. Show that \par
Construct $T'$ by swapping $a$ and $b$ in $T$. Show that \par
\begin{equation*}
\mathcal{B}_f(T) - \mathcal{B}_f(T') = \Bigl(f(a) - f(b)\Bigr) \times \Bigl(d_T(a) - d_T(b)\Bigr)
\end{equation*}
@@ -300,8 +323,8 @@ Now, construct $T'$ by swapping $a$ and $b$ in $T$. Show that \par
\pagebreak

\problem{}<hufpttwo>
Show that is an optimal tree in which the two symbols with the lowest frequencies have the same parent.
\hint{You may assume that an optimal tree exists. Check three nontrivial cases.}
Show that there is an optimal tree in which the two symbols with the lowest frequencies have the same parent.
\hint{You may assume that an optimal tree exists. There are a few cases.}

\begin{solution}
Let $T$ be an optimal tree, and let $a, b$ be the two symbols with the lowest frequency. \par
@@ -356,7 +379,7 @@ Then, use the previous two problems to show that your algorithm indeed produces

\vspace{2mm}
In plain English: pick the two nodes with the smallest frequency, combine them,
and add that into the alphabet as a \say{compound symbol}. Repeat until you're done.
and replace them with a \say{compound symbol}. Repeat until you're done.
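
The plain-English procedure above can be sketched in Python with a priority queue. This is one possible implementation under stated assumptions, not necessarily the one the handout's solution has in mind: the function name \texttt{huffman\_code}, the use of character counts as the frequency function, and the tie-breaking counter are all choices made for this illustration.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(message: str) -> dict[str, str]:
    """Build a prefix-free code by repeatedly merging the two rarest nodes."""
    frequencies = Counter(message)
    tiebreak = count()  # unique counter, so trees themselves are never compared
    # Each heap entry is (frequency, tiebreak, tree). A tree is either a
    # single symbol or a (left, right) pair, i.e. a "compound symbol".
    heap = [(freq, next(tiebreak), symbol) for symbol, freq in frequencies.items()]
    heapq.heapify(heap)

    while len(heap) > 1:
        # Pick the two nodes with the smallest frequency and combine them.
        freq_a, _, tree_a = heapq.heappop(heap)
        freq_b, _, tree_b = heapq.heappop(heap)
        heapq.heappush(heap, (freq_a + freq_b, next(tiebreak), (tree_a, tree_b)))

    codes = {}

    def assign(tree, prefix):
        if isinstance(tree, str):
            codes[tree] = prefix or "0"  # a one-symbol alphabet still needs a bit
        else:
            assign(tree[0], prefix + "0")
            assign(tree[1], prefix + "1")

    if heap:
        assign(heap[0][2], "")
    return codes


codes = huffman_code("DEACBDD")
print(sum(len(codes[symbol]) for symbol in "DEACBDD"))  # 15
```

The exact code words depend on how ties are broken, but the total length does not: for \texttt{DEACBDD} it is 15 bits, compared with 18 for the tree used earlier and 21 for the na\"ive scheme.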

\linehack{}