New build system

This commit is contained in:
2025-01-21 18:35:58 -08:00
parent a14656d94f
commit a82cc8c79f
354 changed files with 870 additions and 1658 deletions

@ -0,0 +1,56 @@
\section{Introduction}
\example{}<lockproblem>
A certain electronic lock has two buttons: \texttt{0} and \texttt{1}.
It opens as soon as the correct two-digit code is entered, completely ignoring
previous inputs. For example, if the correct code is \texttt{10}, the lock will open
once the sequence \texttt{010} is entered.
\vspace{2mm}
Naturally, there are $2^2 = 4$ possible combinations that open this lock. \par
If we don't know the lock's combination, we could try to guess it by trying all four combinations. \par
This would require eight key presses: \texttt{00011011}.
\problem{}
There is, of course, a better way. \par
Unlock this lock with only 5 keypresses.
\begin{solution}
The sequence \texttt{00110} is guaranteed to unlock this lock.
\end{solution}
\vfill
Now, consider the same lock, set with a three-digit binary code.
\problem{}
How many codes are possible?
\vfill
\problem{}
Show that there is no solution with fewer than three keypresses.
\vfill
\problem{}
What is the shortest sequence that is guaranteed to unlock the lock? \par
\hint{You'll need 10 digits.}
\begin{solution}
\texttt{0001110100} will do.
\end{solution}
%\problem{}
%How about a four-digit code? How many digits do we need? \par
%
%\begin{instructornote}
% Don't spend too much time here.
% Provide a solution at the board once everyone has had a few
% minutes to think about this problem.
%\end{instructornote}
%
%\begin{solution}
% One example is \texttt{0000 1111 0110 0101 000}
%\end{solution}
\vfill
\pagebreak

@ -0,0 +1,225 @@
\section{Words}
\definition{}
An \textit{alphabet} is a set of symbols. \par
For example, $\{\texttt{0}, \texttt{1}\}$ is an alphabet of two symbols,
and $\{\texttt{a}, \texttt{b}, \texttt{c}\}$ is an alphabet of three.
\definition{}
A \textit{word} over an alphabet $A$ is a sequence of symbols in that alphabet. \par
For example, $\texttt{00110}$ is a word over the alphabet $\{\texttt{0}, \texttt{1}\}$. \par
We'll let $\varnothing$ denote the empty word, which is a valid word over any alphabet.
\definition{}
Let $v$ and $w$ be words over the same alphabet. \par
We say $v$ is a \textit{subword} of $w$ if $v$ is contained in $w$. \par
\note{
In other words, $v$ is a subword of $w$ if we can construct $v$ \par
by removing zero or more characters from the start and end of $w$.
}
For example, \texttt{11} is a subword of \texttt{011}, but \texttt{00} is not.
\definition{}
Recall \ref{lockproblem}. Let's generalize this to the \textit{$n$-subword problem}: \par
Given an alphabet $A$ and a positive integer $n$,
we want a word over $A$ that contains all possible length-$n$ subwords.
The shortest word that solves a given $n$-subword problem is called the \textit{optimal solution}.
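As a small worked check (added for illustration, not part of the original problem set), the word \texttt{00110} solves the $2$-subword problem over $\{\texttt{0}, \texttt{1}\}$. Sliding a length-2 window across it produces each of the four possible words exactly once. The 1-indexed position numbering is an assumption made only for this table.

```latex
% Length-2 windows of 00110, read left to right (positions are 1-indexed).
\begin{center}
    \begin{tabular}{c|cccc}
        start position & 1 & 2 & 3 & 4 \\
        \hline
        window & \texttt{00} & \texttt{01} & \texttt{11} & \texttt{10}
    \end{tabular}
\end{center}
```

Since a word of length 4 only has three such windows, no shorter word can contain all four.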
\problem{}
List all subwords of \texttt{110}. \par
\hint{There are six.}
\begin{solution}
They are $\varnothing$, \texttt{0}, \texttt{1}, \texttt{10}, \texttt{11}, and \texttt{110}.
\end{solution}
\vfill
\definition{}
Let $\mathcal{S}_n(w)$ be the number of distinct subwords of length $n$ in a word $w$.
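For instance, a hand count for the arbitrarily chosen word $w = \texttt{0010}$ might look like the sketch below. Repeated subwords are counted once, which is the convention the solutions in this handout use. The block assumes the \texttt{amsmath} package is available for \texttt{align*}.

```latex
% Distinct subwords of w = 0010, grouped by length n.
\begin{align*}
    \mathcal{S}_0(w) &= 1 && \{\varnothing\} \\
    \mathcal{S}_1(w) &= 2 && \{\texttt{0},\ \texttt{1}\} \\
    \mathcal{S}_2(w) &= 3 && \{\texttt{00},\ \texttt{01},\ \texttt{10}\} \\
    \mathcal{S}_3(w) &= 2 && \{\texttt{001},\ \texttt{010}\} \\
    \mathcal{S}_4(w) &= 1 && \{\texttt{0010}\}
\end{align*}
```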
\problem{}
Find the following:
\begin{itemize}
\item $\mathcal{S}_n(\texttt{101001})$ for $n \in \{0, 1, ..., 6\}$
\item $\mathcal{S}_n(\texttt{abccac})$ for $n \in \{0, 1, ..., 6\}$
\end{itemize}
\begin{solution}
In order from $\mathcal{S}_0$ to $\mathcal{S}_6$:
\begin{itemize}
\item 1, 2, 3, 4, 3, 2, 1
\item 1, 3, 5, 4, 3, 2, 1
\end{itemize}
\end{solution}
\vfill
\pagebreak
\problem{}<sbounds>
Let $w$ be a word over an alphabet of size $k$. \par
Prove the following:
\begin{itemize}
\item $\mathcal{S}_n(w) \leq k^n$
\item $\mathcal{S}_n(w) \geq \mathcal{S}_{n-1}(w) - 1$
\item $\mathcal{S}_n(w) \leq k \times \mathcal{S}_{n-1}(w)$
\end{itemize}
\begin{solution}
\begin{itemize}
\item There are $k$ choices for each of $n$ letters in the subword.
So, there are $k^n$ possible words of length $n$, and $\mathcal{S}_n(w) \leq k^n$.
\item For almost every distinct subword counted by $\mathcal{S}_{n-1}$,
concatenating the next letter creates a distinct length $n$ subword.
The only exception is the last subword with length $n-1$, so
$\mathcal{S}_n(w) \geq \mathcal{S}_{n-1}(w) - 1$.
\item For each subword counted by $\mathcal{S}_{n-1}$, there are $k$ possibilities
for the letter that follows in $w$. Each element in the count $\mathcal{S}_n$ comes from
one of $k$ different length $n$ words starting with an element counted by $\mathcal{S}_{n-1}$.
Thus, $\mathcal{S}_n(w) \leq k \times \mathcal{S}_{n-1}(w)$.
\end{itemize}
\end{solution}
\vfill
\pagebreak
\definition{}
Let $v$ and $w$ be words over the same alphabet. \par
The word $vw$ is the word formed by writing $w$ after $v$. \par
For example, if $v = \texttt{1001}$ and $w = \texttt{10}$, $vw$ is $\texttt{100110}$.
\problem{}
Let $F_k$ denote the word over the alphabet $\{\texttt{0}, \texttt{1}\}$ obtained from the following relation:
\begin{equation*}
F_0 = \texttt{0}; ~~ F_1 = \texttt{1}; ~~ F_k = F_{k-1}F_{k-2}
\end{equation*}
We'll call this the \textit{Fibonacci word} of order $k$.
\begin{itemize}
\item What are $F_3$, $F_4$, and $F_5$?
\item Compute $\mathcal{S}_0$ through $\mathcal{S}_5$ for $F_5$.
\item Show that the length of $F_k$ is the $(k + 2)^\text{th}$ Fibonacci number. \par
\hint{Induction.}
\end{itemize}
\begin{solution}
\begin{itemize}
\item $F_3 = \texttt{101}$
\item $F_4 = \texttt{10110}$
\item $F_5 = \texttt{10110101}$
\end{itemize}
\linehack{}
\begin{itemize}
\item $\mathcal{S}_0 = 1$
\item $\mathcal{S}_1 = 2$
\item $\mathcal{S}_2 = 3$
\item $\mathcal{S}_3 = 4$
\item $\mathcal{S}_4 = 5$
\item $\mathcal{S}_5 = 4$
\end{itemize}
\linehack{}
As stated, use induction. \par
Let $N_k$ represent the Fibonacci numbers, with $N_0 = 0$, $N_1 = 1$, and $N_{k} = N_{k-1} + N_{k-2}$. \par
Counting $N_0$ as the first Fibonacci number, the $(k+2)^\text{th}$ Fibonacci number is $N_{k+1}$,
so we want to show that $|F_k| = N_{k+1}$.
\vspace{2mm}
The base cases hold, since $|F_0| = 1 = N_1$ and $|F_1| = 1 = N_2$. \par
Now fix $k \geq 2$ and assume that $|F_j| = N_{j+1}$ for all $j < k$. \par
Since $F_{k} = F_{k-1}F_{k-2}$, it has the length $|F_{k-1}| + |F_{k-2}|$. \par
By our assumption, $|F_{k-1}| = N_{k}$ and $|F_{k-2}| = N_{k-1}$. \par
So, $|F_{k}| = |F_{k-1}| + |F_{k-2}| = N_{k} + N_{k-1} = N_{k + 1}$.
\end{solution}
\vfill
\pagebreak
% C_k is called the "Champernowne word" of order k.
\problem{}<cword>
Let $C_k$ denote the word over the alphabet $\{\texttt{0}, \texttt{1}\}$ obtained by \par
concatenating the binary representations of the integers $0,~...,~2^k -1$. \par
For example, $C_1 = \texttt{01}$, $C_2 = \texttt{011011}$, and $C_3 = \texttt{011011100101110111}$.
\begin{itemize}
% Good bonus problem, hard to find a closed-form solution
% \item How many symbols does the word $C_k$ contain?
\item Compute $\mathcal{S}_0$, $\mathcal{S}_1$, $\mathcal{S}_2$, and $\mathcal{S}_3$ for $C_3$.
\item Show that $\mathcal{S}_k(C_k) = 2^k - 1$ for $k \geq 2$.
\item Show that $\mathcal{S}_n(C_k) = 2^n$ for $n < k$.
\end{itemize}
\hint{
If $v$ is a subword of $w$ and $w$ is a subword of $u$, $v$ must be a subword of $u$. \par
In other words, the \say{subword} relation is transitive.
}
\begin{solution}
$\mathcal{S}_0 = 1$, $\mathcal{S}_1 = 2$, $\mathcal{S}_2 = 4$, and $\mathcal{S}_3 = 7$.
\linehack{}
First, we show that $\mathcal{S}_k(C_k) = 2^k - 1$. \par
Consider an arbitrary word $w$ of length $k$. We'll consider three cases:
\begin{itemize}
\item If $w$ consists only of zeros, $w$ does not appear in $C_k$.
\item If $w$ starts with a \texttt{1}, $w$ must appear in $C_k$ by construction.
\item If $w$ starts with a \texttt{0} and contains a \texttt{1}, $w$ has the form
$\texttt{0}^x\texttt{1}\overline{\texttt{y}}$ \par
\note{
That is, $x$ copies of \texttt{0} followed by a \texttt{1}, followed by \par
an arbitrary sequence $\overline{\texttt{y}}$ with length $(k-x-1)$.
} \par
Now consider the word $\texttt{1}\overline{\texttt{y}}\texttt{0}^x\texttt{1}\overline{\texttt{y}}\texttt{0}^{(x-1)}\texttt{1}$. \par
This is the concatenation of two consecutive binary numbers with $k$ digits, and thus appears in $C_k$.
$w$ is a subword of this word, and therefore also appears in $C_k$.
\end{itemize}
\linehack{}
We can use the above result to conclude that $\mathcal{S}_n(C_k) = 2^n$ for $n < k$: \par
If we take any word of length $n < k$ and repeatedly append \texttt{1} to create a word of length $k$, \par
we end up with a subword of $C_k$ by the reasoning above. \par
Thus, every word of length $n$ is a subword of $C_k$, and there are $2^n$ such words.
\end{solution}
\vfill
\problem{}
Convince yourself that $C_{n+1}$ provides a solution to the $n$-subword problem over $\{\texttt{0}, \texttt{1}\}$. \par
\note[Note]{$C_{n+1}$ may or may not be an \textit{optimal} solution---but it is a \textit{valid} solution} \par
Which part of \ref{cword} shows that this is true?
\pagebreak

@ -0,0 +1,392 @@
\section{De Bruijn Words}
Before we continue, we'll need to review some basic
graph theory.
\definition{}
A \textit{directed graph} consists of nodes and directed edges. \par
An example is shown below. It consists of three vertices (labeled $a, b, c$), \par
and five edges (labeled $0, ... , 4$).
\begin{center}
\begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[main] (a) at (0, 0) {$a$};
\node[main] (b) at (2, 0) {$b$};
\node[main] (c) at (4, 0) {$c$};
\end{scope}
\draw[->]
(a) edge node[label] {$0$} (b)
(a) edge[loop above] node[label] {$1$} (a)
(b) edge[bend left] node[label] {$2$} (c)
(b) edge[loop above] node[label] {$3$} (b)
(c) edge[bend left] node[label] {$4$} (b)
;
\end{tikzpicture}
\end{center}
\definition{}
A \textit{path} in a graph is a sequence of adjacent edges. \par
In a directed graph, edges $a$ and $b$ are adjacent if $a$ ends at the node which $b$ starts at. \par
\vspace{2mm}
For example, consider the graph above. \par
The edges $1$ and $0$ (in that order) are adjacent, since you can take edge $0$ after taking edge $1$: \par
$0$ starts where $1$ ends. \par
$0$ and $1$, however, are not: $1$ does not start at the node at which $0$ ends.
\definition{}
An \textit{Eulerian path} is a path that visits each edge of a graph exactly once. \par
An \textit{Eulerian cycle} is an Eulerian path that starts and ends on the same node.
\problem{}
Find the single unique Eulerian cycle in the graph below.
\begin{center}
\begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[main] (a) at (0, 0) {$a$};
\node[main] (b) at (2, 0) {$b$};
\node[main] (c) at (4, 0) {$c$};
\end{scope}
\draw[->]
(a) edge[bend left] node[label] {$0$} (b)
(b) edge[bend left] node[label] {$1$} (a)
(b) edge[bend left] node[label] {$2$} (c)
(c) edge[bend left] node[label] {$3$} (b)
(c) edge[loop right] node[label] {$4$} (c)
;
\end{tikzpicture}
\end{center}
\begin{solution}
$24310$ is one way to write this cycle. \par
There are other options, but they're all the same.
\end{solution}
\vfill
\theorem{}<eulerexists>
A directed graph contains an Eulerian cycle iff...
\begin{itemize}
\item There is a path between every pair of nodes, and
\item every node has as many \say{in} edges as it has \say{out} edges.
\end{itemize}
If a graph contains an Eulerian cycle, it must contain an Eulerian path. \note{(why?)} \par
Some graphs contain an Eulerian path, but not a cycle. In this case, both conditions above must
still hold, but the following exceptions are allowed:
\begin{itemize}
\item There may be at most one node where $(\text{number in} - \text{number out}) = 1$
\item There may be at most one node where $(\text{number in} - \text{number out}) = -1$
\end{itemize}
\note[Note]{Either both exceptions occur, or neither occurs. Bonus problem: why?}
We won't provide a proof of this theorem today. However, you should convince yourself that it is true:
if any of these conditions are violated, why do we know that an Eulerian cycle (or path) cannot exist?
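To see the degree condition in action, here is a hedged worked check on the three-node graph from the Eulerian cycle problem above. The table tallies incoming and outgoing edges per node. The loop at $c$ counts once as an \say{in} edge and once as an \say{out} edge.

```latex
% In/out tally for the graph with edges
% 0: a -> b,  1: b -> a,  2: b -> c,  3: c -> b,  4: c -> c.
\begin{center}
    \begin{tabular}{c|c|c}
        node & \say{in} edges & \say{out} edges \\
        \hline
        $a$ & $\{1\}$, count 1    & $\{0\}$, count 1 \\
        $b$ & $\{0, 3\}$, count 2 & $\{1, 2\}$, count 2 \\
        $c$ & $\{2, 4\}$, count 2 & $\{3, 4\}$, count 2
    \end{tabular}
\end{center}
```

Every node is balanced and every node can reach every other, so the theorem predicts an Eulerian cycle in this graph.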
\pagebreak
\definition{}
Now, consider the $n$-subword problem over $\{\texttt{0}, \texttt{1}\}$. \par
We'll call the optimal solution to this problem a \textit{De Bruijn\footnotemark{} word} of order $n$. \par
\footnotetext{Dutch. Rhymes with \say{De Grown.}}
\problem{}<dbbounds>
Let $w$ be an order-$n$ De Bruijn word, and denote its length with $|w|$. \par
Show that the following bounds always hold:
\begin{itemize}
\item $|w| \leq n2^n$
\item $|w| \geq 2^n + n - 1$
\end{itemize}
\begin{solution}
\begin{itemize}
\item There are $2^n$ binary words with length $n$. \par
Concatenate these to get a word with length $n2^n$.
\item A word must have at least $2^n + n - 1$ letters to have $2^n$ subwords with length $n$.
\end{itemize}
\end{solution}
\remark{}
Now, we'd like to show that the length of a De Bruijn word is always $2^n + n - 1$. \par
That is, that the optimal solution to the subword problem always has $2^n + n - 1$ letters. \par
We'll do this by construction: for a given $n$, we want to build a word with length $2^n + n - 1$
that solves the binary $n$-subword problem.
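The target length comes from counting windows. The worked equation below is a sketch of that count. The symbol $\ell$ for the length of the word is introduced here only for illustration.

```latex
% A word of length \ell has one length-n window per starting position,
% that is, \ell - n + 1 windows. To fit all 2^n binary words we need
\begin{align*}
    \ell - n + 1 \geq 2^n
    \quad\Longrightarrow\quad
    \ell \geq 2^n + n - 1.
\end{align*}
% For n = 2, 3, 4 this lower bound is 5, 10, and 19.
```

These values agree with the 5-keypress and 10-digit answers from the lock problems in the introduction.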
\definition{}
Consider an $n$-length word $w$. \par
The \textit{prefix} of $w$ is the word formed by the first $n-1$ letters of $w$. \par
The \textit{suffix} of $w$ is the word formed by the last $n-1$ letters of $w$. \par
For example, the prefix of the word \texttt{1101} is \texttt{110}, and its suffix is \texttt{101}.
The prefix and suffix of any one-letter word are both $\varnothing$.
\definition{}
A \textit{De Bruijn graph} of order $n$, denoted $G_n$, is constructed as follows:
\begin{itemize}
\item Nodes are created for each word of length $n - 1$.
\item A directed edge is drawn from $a$ to $b$ if the suffix of
$a$ matches the prefix of $b$. \par
Note that a node may have an edge to itself.
\item We label each edge with the last letter of $b$.
\end{itemize}
$G_2$ and $G_3$ are shown below.
\null\hfill
\begin{minipage}{0.48\textwidth}
\begin{center}
$G_2$
\begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[main] (0) at (0, 0) {\texttt{0}};
\node[main] (1) at (2, 0) {\texttt{1}};
\end{scope}
\draw[->]
(0) edge[loop left] node[label] {$0$} (0)
(1) edge[loop right] node[label] {$1$} (1)
(1) edge[bend left] node[label] {$0$} (0)
(0) edge[bend left] node[label] {$1$} (1)
;
\end{tikzpicture}
\end{center}
\end{minipage}
\hfill
\begin{minipage}{0.48\textwidth}
\begin{center}
$G_3$
\begin{tikzpicture}[scale = 0.9]
\begin{scope}[layer = nodes]
\node[main] (00) at (0, 0) {\texttt{00}};
\node[main] (01) at (2, 1) {\texttt{01}};
\node[main] (10) at (2, -1) {\texttt{10}};
\node[main] (11) at (4, 0) {\texttt{11}};
\end{scope}
\draw[->]
(00) edge[loop left] node[label] {$0$} (00)
(11) edge[loop right] node[label] {$1$} (11)
(00) edge[bend left] node[label] {$1$} (01)
(01) edge[bend left] node[label] {$0$} (10)
(10) edge[bend left] node[label] {$1$} (01)
(10) edge[bend left] node[label] {$0$} (00)
(01) edge[bend left] node[label] {$1$} (11)
(11) edge[bend left] node[label] {$0$} (10)
;
\end{tikzpicture}
\end{center}
\end{minipage}
\hfill\null
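As a hedged illustration of the edge rule, the table below works out the outgoing edges of each node of $G_3$ by hand. Here the suffix of a two-letter node is its last letter, and each target is that suffix followed by one new letter, which becomes the edge label.

```latex
% Outgoing edges of G_3: from node a, move to (suffix of a) + (new letter).
\begin{center}
    \begin{tabular}{c|c|c|c}
        node $a$ & suffix of $a$ & target via label \texttt{0} & target via label \texttt{1} \\
        \hline
        \texttt{00} & \texttt{0} & \texttt{00} & \texttt{01} \\
        \texttt{01} & \texttt{1} & \texttt{10} & \texttt{11} \\
        \texttt{10} & \texttt{0} & \texttt{00} & \texttt{01} \\
        \texttt{11} & \texttt{1} & \texttt{10} & \texttt{11}
    \end{tabular}
\end{center}
```

The eight table entries match the eight edges drawn in $G_3$.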
\vfill
\pagebreak
\problem{}
Draw $G_4$.
\begin{solution}
\begin{center}
\begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[main] (7) at (0, 0) {\texttt{111}};
\node[main] (3) at (0, -2) {\texttt{011}};
\node[main] (6) at (2, -2) {\texttt{110}};
\node[main] (4) at (4, -2) {\texttt{100}};
\node[main] (1) at (-4, -4) {\texttt{001}};
\node[main] (5) at (0, -4) {\texttt{101}};
\node[main] (2) at (-2, -4) {\texttt{010}};
\node[main] (0) at (-2, -6) {\texttt{000}};
\end{scope}
\draw[->]
(0) edge[loop left, looseness = 7] node[label] {\texttt{0}} (0)
(7) edge[loop above, looseness = 7] node[label] {\texttt{1}} (7)
(0) edge[out=90,in=-90] node[label] {\texttt{1}} (1)
(1) edge node[label] {\texttt{0}} (2)
(1) edge[out=45,in=-135] node[label] {\texttt{1}} (3)
(2) edge[bend left] node[label] {\texttt{1}} (5)
(3) edge node[label] {\texttt{0}} (6)
(3) edge node[label] {\texttt{1}} (7)
(5) edge[bend left] node[label] {\texttt{0}} (2)
(5) edge node[label] {\texttt{1}} (3)
(6) edge[bend left] node[label] {\texttt{0}} (4)
(6) edge[out=-90,in=0] node[label] {\texttt{1}} (5)
(7) edge[out=0,in=90] node[label] {\texttt{0}} (6)
;
\draw[->, rounded corners = 10mm]
(4) to (4, 2) to node[label] {\texttt{1}} (-4, 2) to (1)
;
\draw[->, rounded corners = 10mm]
(4) to (4, -6) to node[label] {\texttt{0}} (0)
;
\draw[->, rounded corners = 5mm]
(2) to (-2, -5) to node[label] {\texttt{0}} (3, -5) to (3, -2) to (4)
;
\end{tikzpicture}
\end{center}
\begin{instructornote}
This graph also appears as a solution to a different
problem in the DFA handout.
\end{instructornote}
\end{solution}
\vfill
\pagebreak
\problem{}
\begin{itemize}
\item Show that $G_n$ has $2^{n-1}$ nodes and $2^n$ edges;
\item that each node has two outgoing edges;
\item and that there are as many edges labeled $0$ as are labeled $1$.
\end{itemize}
\begin{solution}
\begin{itemize}
\item There are $2^{n-1}$ binary words of length $n-1$.
\item The suffix of a given word is the prefix of two words, \par
so there are two edges leaving each node.
\item One of those words will end with one, and the other will end with zero.
\item Our $2^{n-1}$ nodes each have $2$ outgoing edges---we thus have $2^n$ edges in total.
\end{itemize}
\end{solution}
\vfill
\problem{}<dbpath>
Show that $G_n$ always contains an Eulerian path. \par
\hint{\ref{eulerexists}}
\vfill
\theorem{}<dbeuler>
We can now easily construct De Bruijn words for a given $n$: \par
\begin{itemize}
\item Construct $G_n$,
\item find an Eulerian cycle in $G_n$,
\item then, construct a De Bruijn word by writing the label of our starting vertex,
then appending the label of every edge we travel.
\end{itemize}
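The steps above can be traced by hand on the smallest case. The following sketch walks one possible Eulerian cycle of $G_2$, starting at node \texttt{1}. The choice of starting node and edge order is arbitrary, and other cycles give other valid words.

```latex
% One Eulerian cycle in G_2, starting and ending at node 1.
\begin{center}
    \begin{tabular}{c|c|c}
        edge taken & label appended & word so far \\
        \hline
        (start at node \texttt{1})            &             & \texttt{1} \\
        $\texttt{1} \rightarrow \texttt{1}$   & \texttt{1}  & \texttt{11} \\
        $\texttt{1} \rightarrow \texttt{0}$   & \texttt{0}  & \texttt{110} \\
        $\texttt{0} \rightarrow \texttt{0}$   & \texttt{0}  & \texttt{1100} \\
        $\texttt{0} \rightarrow \texttt{1}$   & \texttt{1}  & \texttt{11001}
    \end{tabular}
\end{center}
```

The resulting word \texttt{11001} has length 5, and its length-2 windows are \texttt{11}, \texttt{10}, \texttt{00}, and \texttt{01}.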
\problem{}
Find De Bruijn words of orders $2$, $3$, and $4$.
\begin{solution}
\begin{itemize}
\item
One Eulerian cycle in $G_2$ starts at node \texttt{0}, and takes the edges labeled $[1, 1, 0, 0]$. \par
We thus have the word \texttt{01100}.
\item
In $G_3$, we have an Eulerian cycle that visits nodes in the following order: \par
$
\texttt{00}
\rightarrow \texttt{01}
\rightarrow \texttt{11}
\rightarrow \texttt{11}
\rightarrow \texttt{10}
\rightarrow \texttt{01}
\rightarrow \texttt{10}
\rightarrow \texttt{00}
\rightarrow \texttt{00}
$\par
This gives us the word \texttt{0011101000}.
\item Similarly, $G_4$ gives us the word \texttt{0001 0011 0101 1110 000}. \par
\note{Spaces have been added for convenience.}
\end{itemize}
\end{solution}
\vfill
\pagebreak
Let's quickly show that the process described in \ref{dbeuler}
indeed produces a valid De Bruijn word.
\problem{}<dblength>
How long will a word generated by the above process be?
\begin{solution}
A De Bruijn graph has $2^n$ edges, each of which is traversed exactly once.
The starting node consists of $n - 1$ letters.
\vspace{2mm}
Thus, the resulting word contains $2^n + n - 1$ symbols.
\end{solution}
\vfill
\problem{}<dbsubset>
Show that a word generated by the process in \ref{dbeuler}
contains every possible length-$n$ subword. \par
In other words, show that $\mathcal{S}_n(w) = 2^n$ for a generated word $w$.
\begin{solution}
Any length-$n$ subword of $w$ is the concatenation of a vertex label and an edge label.
By construction, the next length-$n$ subword is the concatenation of the next vertex and edge
in the Eulerian cycle.
\vspace{2mm}
This cycle traverses each edge exactly once, so each length-$n$ subword is distinct. \par
Since $w$ has length $2^n + n - 1$, there are $2^n$ total subwords. \par
These are all different, so $\mathcal{S}_n \geq 2^n$. \par
However, $\mathcal{S}_n \leq 2^n$ by \ref{sbounds}, so $\mathcal{S}_n = 2^n$.
\end{solution}
\vfill
\remark{}
\begin{itemize}
\item We found that \ref{dbeuler} generates a word with length $2^n + n - 1$ in \ref{dblength}, \par
\item and we showed that this word always solves the $n$-subword problem in \ref{dbsubset}.
\item From \ref{dbbounds}, we know that any solution to the binary $n$-subword problem \par
must have at least $2^n + n - 1$ letters.
\item Finally, \ref{dbpath} guarantees that it is possible to generate such a word in any $G_n$.
\end{itemize}
Thus, we have shown that the process in \ref{dbeuler} generates optimal solutions
to the $n$-subword problem, and that such solutions always exist.
We can now conclude that for any $n$, the binary $n$-subword problem may be solved with a word of length $2^n + n - 1$.
\pagebreak

@ -0,0 +1,110 @@
\section{Line Graphs}
\definition{}
Given a graph $G$, we can construct a graph called the \par
\textit{line graph} of $G$ (\hspace{0.3ex}denoted $\mathcal{L}(G)$\hspace{0.3ex}) by doing the following: \par
\begin{itemize}
\item Creating a node in $\mathcal{L}(G)$ for each edge in $G$
\item Drawing a directed edge from node $a$ to node $b$ in $\mathcal{L}(G)$ \par
if the corresponding edges $a$, $b$ in $G$ are adjacent. \par
\note{That is, if edge $b$ in $G$ starts at the node at which edge $a$ ends.}
\end{itemize}
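A tiny worked example may help here. The graph $H$ below is hypothetical and introduced only for illustration: it has nodes $u, v$ and edges $e: u \rightarrow v$, $f: v \rightarrow v$, and $g: v \rightarrow u$. The table lists, for each edge of $H$, which edges can follow it. These are exactly the outgoing edges of the matching node in $\mathcal{L}(H)$.

```latex
% Line graph of the hypothetical graph H with edges
% e: u -> v,  f: v -> v,  g: v -> u.
\begin{center}
    \begin{tabular}{c|c|c}
        edge of $H$ & ends at & edges of $\mathcal{L}(H)$ leaving it \\
        \hline
        $e$ & $v$ & $e \rightarrow f$, \ $e \rightarrow g$ \\
        $f$ & $v$ & $f \rightarrow f$, \ $f \rightarrow g$ \\
        $g$ & $u$ & $g \rightarrow e$
    \end{tabular}
\end{center}
```

So $\mathcal{L}(H)$ has three nodes and five edges, including a loop at $f$.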
\problem{}
Draw the line graph for the graph below. \par
Have an instructor check your solution.
\begin{center}
\begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[main] (a) at (0, 0) {$a$};
\node[main] (b) at (2, 0) {$b$};
\node[main] (c) at (4, 0) {$c$};
\end{scope}
\draw[->]
(a) edge[bend left] node[label] {$0$} (b)
(b) edge[bend left] node[label] {$1$} (a)
(b) edge[bend left] node[label] {$2$} (c)
(c) edge[bend left] node[label] {$3$} (b)
(c) edge[loop right] node[label] {$4$} (c)
;
\end{tikzpicture}
\end{center}
\begin{solution}
\begin{center}
\begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[main] (1) at (0, 0) {$1$};
\node[main] (4) at (3.5, 1) {$4$};
\node[main] (3) at (2, 0) {$3$};
\node[main] (2) at (2, 2) {$2$};
\node[main] (0) at (0, 2) {$0$};
\end{scope}
\draw[->]
(0) edge[bend left] (1)
(1) edge[bend left] (0)
(0) edge (2)
(2) edge[bend left] (3)
(2) edge (4)
(3) edge[bend left] (2)
(3) edge (1)
(4) edge (3)
(4) edge[loop right] (4)
;
\end{tikzpicture}
\end{center}
\end{solution}
\vfill
\definition{}
We say a graph $G$ is \textit{connected} if there is a path
between any two vertices of $G$.
\problem{}
Show that if $G$ is connected, $\mathcal{L}(G)$ is connected.
\begin{solution}
Let $a \rightarrow b$ and $x \rightarrow y$ be edges in a connected graph $G$.
Since $G$ is connected, we can find a path from $b$ to $x$.
Adding $a \rightarrow b$ to the start and $x \rightarrow y$ to the end gives a path from $a$ to $y$ in $G$,
which corresponds to a path in $\mathcal{L}(G)$ from the node $a \rightarrow b$ to the node $x \rightarrow y$.
\end{solution}
\vfill
\pagebreak
\definition{}
Consider $\mathcal{L}(G_n)$, where $G_n$ is the $n^\text{th}$ order De Bruijn graph. \par
\vspace{2mm}
We'll need to label the vertices of $\mathcal{L}(G_n)$. To do this, do the following:
\begin{itemize}
\item Let $a$ and $b$ be nodes in $G_n$ joined by an edge from $a$ to $b$.
\item Let \texttt{x} be the first letter of $a$.
\item Let \texttt{y} be the last letter of $b$.
\item Let $\overline{\texttt{p}}$ be the prefix/suffix that $a$ and $b$ share. \par
Note that $a = \texttt{x}\overline{\texttt{p}}$ and $b = \overline{\texttt{p}}\texttt{y}$.
\end{itemize}
Now, relabel the edge from $a$ to $b$ as $\texttt{x}\overline{\texttt{p}}\texttt{y}$. \par
Use these new labels to name nodes in $\mathcal{L}(G_n)$.
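The relabeling rule can be checked on a few edges by hand. The rows below are a sketch using edges picked arbitrarily from $G_2$ and $G_3$. For $G_2$ the shared part $\overline{\texttt{p}}$ is the empty word.

```latex
% Relabeling sample edges a -> b as x p y.
\begin{center}
    \begin{tabular}{c|c|c|c|c|c|c}
        graph & $a$ & $b$ & \texttt{x} & $\overline{\texttt{p}}$ & \texttt{y} & new label \\
        \hline
        $G_2$ & \texttt{0}  & \texttt{1}  & \texttt{0} & $\varnothing$ & \texttt{1} & \texttt{01} \\
        $G_3$ & \texttt{01} & \texttt{11} & \texttt{0} & \texttt{1}    & \texttt{1} & \texttt{011} \\
        $G_3$ & \texttt{10} & \texttt{00} & \texttt{1} & \texttt{0}    & \texttt{0} & \texttt{100}
    \end{tabular}
\end{center}
```

Each new label has length $n$, one letter longer than the node labels of $G_n$.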
\problem{}
Construct $\mathcal{L}(G_2)$ and $\mathcal{L}(G_3)$. What do you notice? \par
\hint{
What are $\mathcal{L}(G_2)$ and $\mathcal{L}(G_3)$? We've seen them before! \par
You may need to re-label a few edges.
}
\begin{solution}
After fixing edge labels, we find that
$\mathcal{L}(G_2) \cong G_3$ and $\mathcal{L}(G_3) \cong G_4$.
\end{solution}
\vfill
\pagebreak

@ -0,0 +1,434 @@
\section{Sturmian Words}
A De Bruijn word is the shortest word that contains all subwords
of a given length. \par
Let's now solve a similar problem: given an alphabet, we want to
construct a word that contains exactly $m$ distinct subwords of
length $n$.
\vspace{2mm}
% TODO: better, intuitive description
In general, this is a difficult problem. We'll restrict ourselves
to a special case: \par
We'd like to find a word that contains exactly $m + 1$ distinct subwords
of length $m$ for all $m \leq n$.
\definition{}
We say a word $w$ is a \textit{Sturmian word} of order $n$
if $\mathcal{S}_m(w) = m + 1$ for all $m \leq n$. \par
We say $w$ is a \textit{minimal} Sturmian word if there is no shorter
Sturmian word of that order.
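To make the definition concrete, here is a hedged check of two arbitrarily chosen words of length 4 against the condition $\mathcal{S}_m(w) = m + 1$ for $m \leq 2$.

```latex
% Two length-4 words over {0, 1}, tested for the order-2 Sturmian condition.
\begin{center}
    \begin{tabular}{c|ccc|c}
        $w$ & $\mathcal{S}_0$ & $\mathcal{S}_1$ & $\mathcal{S}_2$ & Sturmian of order 2? \\
        \hline
        \texttt{0100} & 1 & 2 & 3 & yes \\
        \texttt{0101} & 1 & 2 & 2 & no
    \end{tabular}
\end{center}
```

The word \texttt{0101} fails because its only length-2 subwords are \texttt{01} and \texttt{10}.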
\problem{}
Show that the length of a Sturmian word of order $n$ is at least $2n$.
\begin{solution}
In order to have $n + 1$ subwords of length $n$, a word must have at
least $(n+1) + (n-1) = 2n$ letters.
\end{solution}
\vfill
\pagebreak
\problem{}
Construct $R_3$ by removing four edges from $G_3$. \par
Show that each of the following is possible:
\begin{itemize}[itemsep=2mm ]
\item $R_3$ does not contain an Eulerian path.
\item $R_3$ contains an Eulerian path, and this path \par
constructs a word $w$ with $\mathcal{S}_3(w) = 4$
and $\mathcal{S}_2(w) = 4$.
\item $R_3$ contains an Eulerian path, and this path \par
constructs a word $w$ that is a minimal Sturmian word
of order 3.
\end{itemize}
\begin{solution}
Remove the edges $\texttt{00} \rightarrow \texttt{01}$,
$\texttt{01} \rightarrow \texttt{10}$,
$\texttt{10} \rightarrow \texttt{00}$, and
$\texttt{11} \rightarrow \texttt{11}$:
\begin{center}
\begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[main] (00) at (0, 0) {\texttt{00}};
\node[main] (01) at (2, 1) {\texttt{01}};
\node[main] (10) at (2, -1) {\texttt{10}};
\node[main] (11) at (4, 0) {\texttt{11}};
\end{scope}
\draw[->]
(00) edge[loop left] node[label] {$0$} (00)
(10) edge[bend left] node[label] {$1$} (01)
(01) edge[bend left] node[label] {$1$} (11)
(11) edge[bend left] node[label] {$0$} (10)
;
\end{tikzpicture}
\end{center}
\linehack{}
Remove the edges $\texttt{00} \rightarrow \texttt{00}$,
$\texttt{01} \rightarrow \texttt{10}$,
$\texttt{10} \rightarrow \texttt{01}$, and
$\texttt{11} \rightarrow \texttt{11}$. \par
The Eulerian path starting at \texttt{00} produces \texttt{001100},
where $\mathcal{S}_2 = \mathcal{S}_3 = 4$.
\begin{center}
\begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[main] (00) at (0, 0) {\texttt{00}};
\node[main] (01) at (2, 1) {\texttt{01}};
\node[main] (10) at (2, -1) {\texttt{10}};
\node[main] (11) at (4, 0) {\texttt{11}};
\end{scope}
\draw[->]
(00) edge[bend left] node[label] {$1$} (01)
(10) edge[bend left] node[label] {$0$} (00)
(01) edge[bend left] node[label] {$1$} (11)
(11) edge[bend left] node[label] {$0$} (10)
;
\end{tikzpicture}
\end{center}
\linehack{}
Remove the edges $\texttt{01} \rightarrow \texttt{11}$,
$\texttt{10} \rightarrow \texttt{00}$,
$\texttt{11} \rightarrow \texttt{10}$, and
$\texttt{11} \rightarrow \texttt{11}$. \par
The Eulerian path starting at \texttt{00} produces \texttt{000101},
where $\mathcal{S}_0 = 1$, $\mathcal{S}_1 = 2$, $\mathcal{S}_2 = 3$,
and $\mathcal{S}_3 = 4$. \par
\texttt{000101} has length $2 \times 3 = 6$, and is thus minimal.
\begin{center}
\begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[main] (00) at (0, 0) {\texttt{00}};
\node[main] (01) at (2, 1) {\texttt{01}};
\node[main] (10) at (2, -1) {\texttt{10}};
\node[main] (11) at (4, 0) {\texttt{11}};
\end{scope}
\draw[->]
(00) edge[loop left] node[label] {$0$} (00)
(00) edge[bend left] node[label] {$1$} (01)
(01) edge[bend left] node[label] {$0$} (10)
(10) edge[bend left] node[label] {$1$} (01)
;
\end{tikzpicture}
\end{center}
Note that this graph contains an Eulerian path even though
\texttt{11} is disconnected. \par
An Eulerian path needs to visit all \textit{edges}, not all \textit{nodes}!
\end{solution}
\vfill
\pagebreak
\problem{}<trysturmian>
Construct $R_2$ by removing one edge from $G_2$, then construct $\mathcal{L}(R_2)$. \par
\begin{itemize}
\item If this line graph has four edges, set $R_3 = \mathcal{L}(R_2)$. \par
\item If not, remove one edge from $\mathcal{L}(R_2)$ so that an Eulerian path still exists
and set $R_3$ to the resulting graph.
\end{itemize}
Label each edge in $R_3$ with the last letter of its target node. \par
Let $w$ be the word generated by an Eulerian path in this graph, as before.
\vspace{2mm}
Attempt the above construction a few times. Is $w$ a minimal Sturmian word?
\begin{solution}
If $R_2$ is constructed by removing the edge $\texttt{0} \rightarrow \texttt{1}$,
$\mathcal{L}(R_2)$ is the graph shown below.
\begin{center}
\begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[main] (00) at (0, 0) {\texttt{00}};
\node[main] (01) at (2, 1) {\texttt{01}};
\node[main] (10) at (2, -1) {\texttt{10}};
\node[main] (11) at (4, 0) {\texttt{11}};
\end{scope}
\draw[->]
(00) edge[loop left] node[label] {$0$} (00)
(10) edge[bend left] node[label] {$0$} (00)
(11) edge[bend left] node[label] {$0$} (10)
(11) edge[loop right] node[label] {$1$} (11)
;
\end{tikzpicture}
\end{center}
We obtain the Sturmian word \texttt{111000} via the Eulerian path through the nodes
$\texttt{11} \rightarrow \texttt{11} \rightarrow \texttt{10}
\rightarrow \texttt{00} \rightarrow \texttt{00}$.
\linehack{}
If $R_2$ is constructed by removing the edge $\texttt{0} \rightarrow \texttt{0}$,
$\mathcal{L}(R_2)$ is the graph pictured below.
\begin{center}
\begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[main] (00) at (0, 0) {\texttt{00}};
\node[main] (01) at (2, 1) {\texttt{01}};
\node[main] (10) at (2, -1) {\texttt{10}};
\node[main] (11) at (4, 0) {\texttt{11}};
\end{scope}
\draw[->]
(01) edge[bend left] node[label] {$0$} (10)
(10) edge[bend left] node[label] {$1$} (01)
(11) edge[bend left] node[label] {$0$} (10)
(01) edge[bend left] node[label] {$1$} (11)
(11) edge[loop right] node[label] {$1$} (11)
;
\end{tikzpicture}
\end{center}
This graph contains five edges, so we need to remove one. \par
To keep an Eulerian path, we can remove any of the following:
\begin{itemize}
\item $\texttt{01} \rightarrow \texttt{10}$ to produce \texttt{011101}
\item $\texttt{01} \rightarrow \texttt{11}$ to produce \texttt{111010}
\item $\texttt{11} \rightarrow \texttt{10}$ to produce \texttt{010111}
\item $\texttt{11} \rightarrow \texttt{11}$ to produce \texttt{011010}
\end{itemize}
Each of these is a minimal Sturmian word.
\linehack{}
If we remove $\texttt{1} \rightarrow \texttt{0}$ from $G_2$, we get the minimal
Sturmian word produced by removing $\texttt{0} \rightarrow \texttt{1}$,
with \texttt{0} and \texttt{1} interchanged.
\vspace{2mm}
Similarly, removing $\texttt{1} \rightarrow \texttt{1}$ produces the minimal
Sturmian words found by removing $\texttt{0} \rightarrow \texttt{0}$,
with \texttt{0} and \texttt{1} interchanged.
\end{solution}
\vfill
\pagebreak
\theorem{}<sturmanthm>
We can construct a minimal Sturmian word of order $n \geq 3$ as follows:
\begin{itemize}
\item Start with $G_2$, and create $R_2$ by removing one edge.
\item Construct $\mathcal{L}(R_2)$, and remove an edge if necessary. \par
The resulting graph must have 4 edges and an Eulerian path. Call this $R_3$.
\item Repeat the previous step to construct a sequence of graphs $R_n$. \par
$R_{n-1}$ is used to create $R_n$, which has $n + 1$ edges and an Eulerian path. \par
Label edges with the last letter of their target vertex.
\item Construct a word $w$ using the Eulerian path, as before. \par
This is a minimal Sturmian word.
\end{itemize}
For now, assume this theorem holds. We'll prove it in the next few problems.
\problem{}<sturmianfour>
Construct a minimal Sturmian word of order 4.
\begin{solution}
Let $R_3$ be the graph below (see \ref{trysturmian}).
\begin{center}
\begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[main] (00) at (0, 0) {\texttt{00}};
\node[main] (01) at (2, 1) {\texttt{01}};
\node[main] (10) at (2, -1) {\texttt{10}};
\node[main] (11) at (4, 0) {\texttt{11}};
\end{scope}
\draw[->]
(00) edge[loop left] node[label] {$0$} (00)
(10) edge[bend left] node[label] {$0$} (00)
(11) edge[bend left] node[label] {$0$} (10)
(11) edge[loop right] node[label] {$1$} (11)
;
\end{tikzpicture}
\end{center}
$R_4 = \mathcal{L}(R_3)$ is then as shown below, producing the
order $4$ minimal Sturmian word \texttt{11110000}. Disconnected
nodes are omitted.
\begin{center}
\begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[main] (000) at (0, 0) {\texttt{000}};
\node[main] (100) at (2, 1) {\texttt{100}};
\node[main] (110) at (2, -1) {\texttt{110}};
\node[main] (111) at (4, 0) {\texttt{111}};
\end{scope}
\draw[->]
(000) edge[loop left] node[label] {$0$} (000)
(100) edge[bend right] node[label] {$0$} (000)
(110) edge[bend left] node[label] {$0$} (100)
(111) edge[bend left] node[label] {$0$} (110)
(111) edge[loop right] node[label] {$1$} (111)
;
\end{tikzpicture}
\end{center}
\end{solution}
\vfill
\pagebreak
\problem{}
Construct a minimal Sturmian word of order 5.
\begin{solution}
Use $R_4$ from \ref{sturmianfour} to construct $R_5$, shown below. \par
Disconnected nodes are omitted.
\begin{center}
\begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[main] (0000) at (0, 0) {\texttt{0000}};
\node[main] (1000) at (2, 0) {\texttt{1000}};
\node[main] (1100) at (4, 0) {\texttt{1100}};
\node[main] (1110) at (6, 0) {\texttt{1110}};
\node[main] (1111) at (8, 0) {\texttt{1111}};
\end{scope}
\draw[->]
(1111) edge[loop right] node[label] {$1$} (1111)
(1111) edge[bend right] node[label] {$0$} (1110)
(1110) edge[bend left] node[label] {$0$} (1100)
(1100) edge[bend right] node[label] {$0$} (1000)
(1000) edge[bend left] node[label] {$0$} (0000)
(0000) edge[loop left] node[label] {$0$} (0000)
;
\end{tikzpicture}
\end{center}
This graph generates the minimal Sturmian word \texttt{1111100000}.
\end{solution}
\vfill
\pagebreak
\problem{}
Argue that the words we get by \ref{sturmanthm} are minimal Sturmian words. \par
That is, the word $w$ has length $2n$ and $\mathcal{S}_m(w) = m + 1$ for all $m \leq n$.
\begin{solution}
We proceed by induction. \par
First, show that we can produce a minimal order 3 Sturmian word: \par
\vspace{2mm}
Since $R_3$ is guaranteed to have four edges and length-$2$ node labels,
the length of $w$ is $2 + 4 = 2 \times 3 = 6$. \par
Trivially, we also have $\mathcal{S}_0 = 1$ and $\mathcal{S}_1 = 2$. \par
\vspace{2mm}
The vertices of $R_3$ are given by the three remaining edges of $R_2$, so $R_3$ has three vertices.
Each length-2 subword of $w$ will be represented by the label of one of these
three nodes. Thus, $\mathcal{S}_2(w) \leq 3$. The line graph of a connected graph
is connected, so an Eulerian path on $R_3$ reaches every node. We thus have that
$\mathcal{S}_2(w) = 3$.
\vspace{2mm}
By construction, the length 3 subwords of $w$ are all distinct, so $\mathcal{S}_3(w) = 4$.
We thus conclude that $w$ is a minimal order 3 Sturmian word.
\linehack{}
Now, we prove our inductive step: \par
Assume that the process above produces an order $n-1$ minimal Sturmian word $w_{n-1}$. \par
We want to show that $w_n$ is also a minimal Sturmian word. \par
\vspace{2mm}
By construction, $R_n$ has node labels of length $n-1$ and $n+1$ edges. \par
Thus, $w_n$ has length $2n$.
\vspace{2mm}
The only possible length-$m$ subwords of $w_n$ are those of $w_{n-1}$ for $m < n$. \par
The line graph of a connected graph is connected, so an Eulerian path on $R_n$ reaches each node.
Thus, all length-$m$ subwords of $w_{n-1}$ appear in $w_n$.
\vspace{2mm}
By our inductive hypothesis, $\mathcal{S}_m(w_n) = m + 1$ for $m < n$. \par
The length-$n$ subwords of $w_n$ are distinct by construction, and there are
$n+1$ such subwords.
\vspace{2mm}
Thus, $\mathcal{S}_n(w_n) = n + 1$.
\end{solution}
\vfill
\pagebreak