\section{Huffman Codes}
\remark{}
As a first example, consider the alphabet $\{\texttt{A}, \texttt{B}, \texttt{C}, \texttt{D}, \texttt{E}\}$. \par
With a na\"ive coding scheme, we can encode a length-$n$ string with $3n$ bits by mapping each letter to a fixed three-bit codeword:
\begin{itemize}
\item $\texttt{A}$ to $\texttt{000}$
\item $\texttt{B}$ to $\texttt{001}$
\item $\texttt{C}$ to $\texttt{010}$
\item $\texttt{D}$ to $\texttt{011}$
\item $\texttt{E}$ to $\texttt{100}$
\end{itemize}
With this scheme, the string \texttt{ADEBCE} becomes \texttt{[000 011 100 001 010 100]}. \par
This matches what we computed in \ref{naivelen}: ~ $6 \times \lceil \log_2(5) \rceil = 6 \times 3 = 18$. \par
\note[Notation]{
The spaces in \texttt{[000 011 100 001 010 100]} are provided for convenience. \par
This is equivalent to \texttt{[000011100001010100]}, but is easier to read. \par
In this handout, encoded binary blobs will always be written in square brackets.
}

\vspace{2mm}
You could argue that this coding scheme is wasteful: we're not using three of the eight possible three-bit sequences!
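To make that count concrete, here is a quick tally using only the numbers above:
\[
	2^3 - 5 = 8 - 5 = 3
\]
so the three-bit sequences \texttt{101}, \texttt{110}, and \texttt{111} are never assigned to a letter and never appear as a codeword in an encoded string.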
\vfill
\pagebreak