2024-04-24 15:33:33 -07:00

\section{Introduction}
\definition{}
An \textit{alphabet} is a set of symbols. Two examples are
$\{\texttt{A}, \texttt{B}, \texttt{C}, \texttt{D}\}$ and $\{\texttt{0}, \texttt{1}\}$.
\definition{}
A \textit{string} is a sequence of symbols from an alphabet. \par
For example, \texttt{CBCAADDD} is a string over the alphabet $\{\texttt{A}, \texttt{B}, \texttt{C}, \texttt{D}\}$.
\problem{}
Say we want to store a length-$n$ string over the alphabet $\{\texttt{A}, \texttt{B}, \texttt{C}, \texttt{D}\}$ as a binary sequence. \par
How many bits will we need? \par
\hint{
Our alphabet has four symbols, so we can encode each symbol using two bits, \par
mapping $\texttt{A} \rightarrow \texttt{00}$,
$\texttt{B} \rightarrow \texttt{01}$,
$\texttt{C} \rightarrow \texttt{10}$, and
$\texttt{D} \rightarrow \texttt{11}$.
}
\begin{solution}
$2n$ bits.
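	\par
	As a quick check (an illustrative example, using the mapping from the hint):
	the length-$8$ string \texttt{CBCAADDD} becomes
	\[
		\texttt{10}\ \texttt{01}\ \texttt{10}\ \texttt{00}\ \texttt{00}\ \texttt{11}\ \texttt{11}\ \texttt{11},
	\]
	which is $2 \times 8 = 16$ bits.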
\end{solution}
\vfill
\problem{}<naivelen>
Similarly, we can encode an $n$-symbol string over an alphabet of size $k$ \par
using $n \times \lceil \log_2k \rceil$ bits. Show that this is true. \par
\note[Note]{We'll call this the \textit{na\"ive coding scheme}.}
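\begin{solution}
	One possible argument (a sketch, writing $b = \lceil \log_2 k \rceil$, a symbol introduced here only for convenience): \par
	There are $2^b \geq k$ binary words of length $b$, so we can assign each of the $k$ symbols its own $b$-bit word. \par
	Writing these words one after another encodes an $n$-symbol string in $n \times b$ bits. \par
	Since every word has the same length, we can decode by reading $b$ bits at a time. \par
	For instance, if $k = 5$ then $b = 3$, and a string of $n = 4$ symbols takes $4 \times 3 = 12$ bits.
\end{solution}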
\vfill
As you might expect, this isn't ideal: in many cases, we can do much better than $n \times \lceil \log_2k \rceil$ bits.
We will spend the rest of this handout exploring more efficient ways of encoding such sequences of symbols.
\pagebreak