2024-04-12 13:11:24 -07:00

\section{Introduction}
\definition{}
An \textit{alphabet} is a set of symbols. Two examples are
$\{\texttt{A}, \texttt{B}, \texttt{C}, \texttt{D}\}$ and $\{\texttt{0}, \texttt{1}\}$.
\definition{}
A \textit{string} is a sequence of symbols from an alphabet. \par
For example, \texttt{CBCAADDD} is a string over the alphabet $\{\texttt{A}, \texttt{B}, \texttt{C}, \texttt{D}\}$.
\problem{}
Say we want to store a length-$n$ string over the alphabet $\{\texttt{A}, \texttt{B}, \texttt{C}, \texttt{D}\}$ as a binary blob. \par
How many bits will we need? \par
\hint{
Our alphabet has four symbols, so we can encode each symbol using two bits, \par
mapping $\texttt{A} \rightarrow \texttt{00}$,
$\texttt{B} \rightarrow \texttt{01}$,
$\texttt{C} \rightarrow \texttt{10}$, and
$\texttt{D} \rightarrow \texttt{11}$.
}
\begin{solution}
$2n$ bits.
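\par
As a quick sanity check, here is the example string from above encoded with the mapping given in the hint
(bits are grouped in pairs only for readability; any other assignment of two-bit codes would work equally well):
\[
    \texttt{CBCAADDD} \rightarrow \texttt{10 01 10 00 00 11 11 11}
\]
This string has $n = 8$ symbols and its encoding has $16 = 2 \times 8$ bits.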
\end{solution}
\vfill
\problem{}<naivelen>
Similarly, we can use a na\"ive coding scheme to encode an $n$-symbol string over an alphabet of size $k$ \par
using $n \times \lceil \log_2k \rceil$ bits. Convince yourself that this is true.
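\begin{solution}
One possible argument, as a sketch: a block of $b$ bits can take $2^b$ distinct values,
so we can give each of the $k$ symbols its own $b$-bit code as long as $2^b \geq k$.
The smallest such $b$ is $\lceil \log_2 k \rceil$, and writing the codes of all $n$ symbols
one after another takes $n \times \lceil \log_2 k \rceil$ bits. \par
For instance, taking $k = 5$ and $n = 10$ (values chosen only for illustration), we get
\[
    n \times \lceil \log_2 k \rceil = 10 \times \lceil \log_2 5 \rceil = 10 \times 3 = 30
\]
so 30 bits suffice.
\end{solution}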
\vfill
Of course, this isn't ideal: for many strings, we can do much better than $n \times \lceil \log_2 k \rceil$ bits.
We will spend the rest of this handout exploring more efficient ways of encoding such sequences of symbols.
\pagebreak