\section{DFAs} This week, we will study computational devices called \textit{deterministic finite automata}. \par A DFA has a simple job: it will either \say{accept} or \say{reject} a string of letters. \vspace{2mm} Consider the automaton $A$ shown below: \begin{center} \begin{tikzpicture} \begin{scope}[layer = nodes] \node[main] (a) at (0, 0) {$a$}; \node[accept] (b) at (2, 0) {$b$}; \node[main] (c) at (5, 0) {$c$}; \node[start] (s) at (-2, 0) {\texttt{start}}; \end{scope} \draw[->] (s) edge (a) (a) edge node[label] {$1$} (b) (a) edge[loop above] node[label] {$0$} (a) (b) edge[bend left] node[label] {$0$} (c) (b) edge[loop above] node[label] {$1$} (b) (c) edge[bend left] node[label] {$0,1$} (b) ; \end{tikzpicture} \end{center} $A$ takes strings of letters in the alphabet $\{0, 1\}$ and reads them left to right, one letter at a time. \par Starting in the state $a$, the automaton $A$ will move between states along the edge marked by each letter. \par \vspace{2mm} Note that node $b$ has a \say{double edge} in the diagram above. This means that the state $b$ is \textit{accepting}. Any string that makes $A$ end in state $b$ is \textit{accepted}. Similarly, strings that end in states $a$ or $c$ are \textit{rejected}. \par \vspace{2mm} For example, consider the string \texttt{1011}. \par $A$ will go through the states $a - b - c - b - b$ while processing this string. \par \problem{} Which of the following strings are accepted by $A$? \par \begin{itemize} \item \texttt{1} \item \texttt{1010} \item \texttt{1110010} \item \texttt{1000100} \end{itemize} \vfill \problem{} Describe the general form of a string accepted by $A$. \par \hint{Work backwards from the accepting state, and decide what all the strings must look like at the end in order to be accepted.} \begin{solution} $A$ will accept strings that contain at least one $1$ and end with an even (possibly 0) number of zeroes. 
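\par \vspace{2mm}
As a quick check of this description, here is a hand trace of the four strings from the previous problem, read directly off the diagram of $A$ and written in the same state-sequence notation as above:
\begin{itemize}
	\item \texttt{1}: $a - b$, accepted (one \texttt{1}, no trailing zeroes).
	\item \texttt{1010}: $a - b - c - b - c$, rejected (one trailing zero).
	\item \texttt{1110010}: $a - b - b - b - c - b - b - c$, rejected (one trailing zero).
	\item \texttt{1000100}: $a - b - c - b - c - b - c - b$, accepted (two trailing zeroes).
\end{itemize}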
\end{solution} \vfill \pagebreak Now consider the automaton $B$, which uses the alphabet $\{a, b\}$. \par It starts in the state $s$ and has two accepting states $a_1$ and $b_1$. \begin{center} \begin{tikzpicture} \begin{scope}[layer = nodes] \node[main] (s) at (0, 0) {$s$}; \node[accept] (a1) at (-2, -0.5) {$a_1$}; \node[main] (a2) at (-2, -2.5) {$a_2$}; \node[accept] (b1) at (2, -0.5) {$b_1$}; \node[main] (b2) at (2, -2.5) {$b_2$}; \node[start] (start) at (0, 1) {\texttt{start}}; \end{scope} \clip (-4, -3.5) rectangle (4, 1); \draw[->] (start) edge (s) (s) edge node[label] {\texttt{a}} (a1) (a1) edge[loop left] node[label] {\texttt{a}} (a1) (a1) edge[bend left] node[label] {\texttt{b}} (a2) (a2) edge[bend left] node[label] {\texttt{a}} (a1) (a2) edge[loop left] node[label] {\texttt{b}} (a2) (s) edge node[label] {\texttt{b}} (b1) (b1) edge[loop right] node[label] {\texttt{b}} (b1) (b1) edge[bend left] node[label] {\texttt{a}} (b2) (b2) edge[bend left] node[label] {\texttt{b}} (b1) (b2) edge[loop right] node[label] {\texttt{a}} (b2) ; \end{tikzpicture} \end{center} \problem{} Which of the following strings are accepted by $B$? \begin{itemize} \item \texttt{aa} \item \texttt{abba} \item \texttt{abbba} \item \texttt{baabab} \end{itemize} \vfill \problem{} Describe the strings accepted by $B$. \begin{solution} $B$ accepts strings that start and end with the same letter. \end{solution} \vfill \pagebreak \problem{} How many strings of length $n$ are accepted by the automaton $C$? 
\begin{center} \begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[main] (0) at (0, 0) {$0$};
\node[accept] (1) at (3, 0) {$1$};
\node[main] (2) at (5, 0) {$2$};
\node[start] (s) at (-2, 0) {\texttt{start}};
\end{scope}
\draw[->]
(s) edge (0)
(0) edge[loop above] node[label] {\texttt{b}} (0)
(0) edge[bend left] node[label] {\texttt{a}} (1)
(1) edge[bend left] node[label] {\texttt{b}} (0)
(1) edge node[label] {\texttt{a}} (2)
(2) edge[loop above] node[label] {\texttt{a,b}} (2)
;
\end{tikzpicture} \end{center}
\begin{solution}
If $A_n$ is the number of accepted strings of length $n$, then $A_n = A_{n-1}+A_{n-2}$ for $n \ge 2$. \par
Computing initial conditions, we get $A_0 = 0$ and $A_1 = 1$, so $A_n$ is the $n$-th Fibonacci number (indexed so that $F_0 = 0$ and $F_1 = 1$).
\end{solution}
%\begin{remark}
%Note that all the states in our DFAs $A$, $B$ and $C$ from figures 1, 2, 3 have outgoing symbols for each letter of the alphabet.
%Do the same for your DFAs.
%\end{remark}
\vfill
\definition{} An \textit{alphabet} is a finite set of symbols. \par
\definition{} A \textit{string} over an alphabet $Q$ is a finite sequence of symbols from $Q$. \par
We denote the empty string by $\varepsilon$. \par \vspace{2mm}
$Q^*$ is the set of all possible strings over $Q$. \par
For example, $\{\texttt{0}, \texttt{1}\}^*$ is the set $\{\varepsilon, \texttt{0}, \texttt{1}, \texttt{00}, \texttt{01}, \texttt{10}, \texttt{11}, \texttt{000}, \dots\}$. \par
Note that this set contains the empty string.
\definition{} A \textit{language} over an alphabet $Q$ is a subset of $Q^*$. \par
For example, the language \say{strings of length 2} over $\{\texttt{0}, \texttt{1}\}$ is $\{\texttt{00}, \texttt{01}, \texttt{10}, \texttt{11}\}$.
\definition{} We say a language $L$ is \textit{recognized} by a DFA if that DFA accepts a string $w$ if and only if $w \in L$.
\pagebreak
\problem{} Draw DFAs that recognize the following languages.
In all parts, the alphabet is $\{0, 1\}$:
\begin{itemize}
	\item $\{w~ | ~w~ \text{begins with a \texttt{1} and ends with a \texttt{0}}\}$
	\item $\{w~ | ~w~ \text{contains at least three \texttt{1}s}\}$
	\item $\{w~ | ~w~ \text{contains the substring \texttt{0101} (i.e., $w = x\texttt{0101}y$ for some $x$ and $y$)}\}$
	\item $\{w~ | ~w~ \text{has length at least three and its third symbol is a \texttt{0}}\}$
	\item $\{w~ | ~w~ \text{starts with \texttt{0} and has odd length, or starts with \texttt{1} and has even length}\}$
	\item $\{w~ | ~w~ \text{doesn't contain the substring \texttt{110}}\}$
\end{itemize}
\begin{solution}
$\{w~ | ~w~ \text{begins with a \texttt{1} and ends with a \texttt{0}}\}$
\begin{center} \begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[accept] (0) at (0, 2) {$\phantom{0}$};
\node[main] (1) at (3, 2) {$\phantom{0}$};
\node[main] (2) at (0, 0) {$\phantom{0}$};
\node[main] (3) at (3, 0) {$\phantom{0}$};
\node[start] (s) at (-2, 0) {\texttt{start}};
\end{scope}
\clip (-2, -1) rectangle (4.5, 3);
\draw[->]
(s) edge (2)
(0) edge[loop left] node[label] {\texttt{0}} (0)
(0) edge[bend left] node[label] {\texttt{1}} (1)
(1) edge[loop right] node[label] {\texttt{1}} (1)
(1) edge[bend left] node[label] {\texttt{0}} (0)
(2) edge[out=90, in=270] node[label] {\texttt{1}} (1)
(2) edge node[label] {\texttt{0}} (3)
(3) edge[loop right] node[label] {\texttt{1,0}} (3)
;
\end{tikzpicture} \end{center}
\linehack{}
$\{w~ | ~w~ \text{contains at least three \texttt{1}s}\}$
\begin{center} \begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[start] (s) at (-2, 0) {\texttt{start}};
\node[main] (0) at (0, 0) {$\phantom{0}$};
\node[main] (1) at (2, 0) {$\phantom{0}$};
\node[main] (2) at (4, 0) {$\phantom{0}$};
\node[accept] (3) at (6, 0) {$\phantom{0}$};
\end{scope}
\draw[->]
(s) edge (0)
(0) edge[loop above] node[label] {\texttt{0}} (0)
(1) edge[loop above] node[label] {\texttt{0}} (1)
(2) edge[loop above] node[label] {\texttt{0}} (2)
(3) edge[loop above]
node[label] {\texttt{0,1}} (3)
(0) edge node[label] {\texttt{1}} (1)
(1) edge node[label] {\texttt{1}} (2)
(2) edge node[label] {\texttt{1}} (3)
;
\end{tikzpicture} \end{center}
\linehack{}
$\{w~ | ~w~ \text{contains the substring \texttt{0101}}\}$
\begin{center} \begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[start] (s) at (-2, 0) {\texttt{start}};
\node[main] (0) at (0, 0) {$\phantom{0}$};
\node[main] (1) at (2, 1) {$\phantom{0}$};
\node[main] (2) at (4, 1) {$\phantom{0}$};
\node[main] (3) at (0, 3) {$\phantom{0}$};
\node[accept] (4) at (2, 3) {$\phantom{0}$};
\end{scope}
% Tikz includes invisible handles in picture size.
% This crops the image to fix sizing.
\clip (-2, -1.75) rectangle (5, 5.25);
\draw[->]
(s) edge (0)
(0) edge[loop above] node[label] {\texttt{1}} (0)
(0) edge[bend right] node[label] {\texttt{0}} (1)
(1) edge[loop above] node[label] {\texttt{0}} (1)
(1) edge node[label] {\texttt{1}} (2)
(3) edge[bend right] node[label] {\texttt{0}} (1)
(3) edge node[label] {\texttt{1}} (4)
(4) edge[loop above] node[label] {\texttt{0,1}} (4)
;
\draw[->, rounded corners = 10mm] (2) to (4, 5) to node[label] {\texttt{0}} (0, 5) to (3) ;
\draw[->, rounded corners = 10mm] (2) to (4, -1.5) to node[label] {\texttt{1}} (0, -1.5) to (0) ;
\end{tikzpicture} \end{center}
Notice that after getting two 0's in a row we don't reset to the initial state.
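\par \vspace{2mm}
To illustrate, here is a short trace of the string \texttt{00101}. The states in the diagram are unlabeled, so as an assumption made only for this sketch we name each state by the longest prefix of \texttt{0101} matched so far ($\varepsilon$ is the start state, \texttt{0101} the accepting one):
\[
	\varepsilon
	\stackrel{\texttt{0}}{\longrightarrow} \texttt{0}
	\stackrel{\texttt{0}}{\longrightarrow} \texttt{0}
	\stackrel{\texttt{1}}{\longrightarrow} \texttt{01}
	\stackrel{\texttt{0}}{\longrightarrow} \texttt{010}
	\stackrel{\texttt{1}}{\longrightarrow} \texttt{0101}
\]
The second \texttt{0} keeps us in state \texttt{0} instead of sending us back to $\varepsilon$, which is why \texttt{00101} is accepted.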
\pagebreak
$\{w~ | ~w~ \text{has length at least three and its third symbol is a \texttt{0}}\}$
\begin{center} \begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[start] (s) at (-2, 0) {\texttt{start}};
\node[main] (0) at (0, 0) {$\phantom{0}$};
\node[main] (1) at (2, 0) {$\phantom{0}$};
\node[main] (2) at (4, 0) {$\phantom{0}$};
\node[accept] (3) at (6, 1) {$\phantom{0}$};
\node[main] (4) at (6, -1) {$\phantom{0}$};
\end{scope}
\clip (-2, -2.5) rectangle (7, 2.5);
\draw[->]
(s) edge (0)
(0) edge node[label] {\texttt{0,1}} (1)
(1) edge node[label] {\texttt{0,1}} (2)
(2) edge node[label] {\texttt{0}} (3)
(2) edge node[label] {\texttt{1}} (4)
(3) edge[loop above] node[label] {\texttt{0,1}} (3)
(4) edge[loop below] node[label] {\texttt{0,1}} (4)
;
\end{tikzpicture} \end{center}
\linehack{}
$\{w~ | ~w~ \text{starts with \texttt{0} and has odd length, or starts with \texttt{1} and has even length}\}$
\begin{center} \begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[start] (s) at (-2, 0) {\texttt{start}};
\node[main] (0) at (0, 0) {$\phantom{0}$};
\node[accept] (1) at (2, 1) {$\phantom{0}$};
\node[main] (2) at (4, 1) {$\phantom{0}$};
\end{scope}
\draw[->]
(s) edge (0)
(0) edge node[label] {\texttt{0}} (1)
(1) edge[bend left] node[label] {\texttt{0,1}} (2)
(2) edge[bend left] node[label] {\texttt{0,1}} (1)
;
\draw[->, rounded corners = 5mm] (0) to node[label] {\texttt{1}} (4, 0) to (2) ;
\end{tikzpicture} \end{center}
\linehack{}
$\{w~ | ~w~ \text{doesn't contain the substring \texttt{110}}\}$
\begin{center} \begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[start] (s) at (-2, 0){\texttt{start}};
\node[accept] (0) at (0, 0) {$\phantom{0}$};
\node[accept] (1) at (2, 0) {$\phantom{0}$};
\node[accept] (2) at (4, 0) {$\phantom{0}$};
\node[main] (3) at (6, 0) {$\phantom{0}$};
\end{scope}
\draw[->]
(s) edge (0)
(0) edge[loop above] node[label] {\texttt{0}} (0)
(2) edge[loop above] node[label] {\texttt{1}} (2)
(3) edge[loop above] node[label] {\texttt{0,1}} (3)
(0)
edge[bend left] node[label] {\texttt{1}} (1) (1) edge[bend left] node[label] {\texttt{0}} (0) (1) edge node[label] {\texttt{1}} (2) (2) edge node[label] {\texttt{0}} (3) ; \end{tikzpicture} \end{center} Notice that after getting three 1's in a row we don't reset to the initial state. \end{solution} \vfill \pagebreak \problem{} Draw a DFA over an alphabet $\{\texttt{a}, \texttt{b}, \texttt{@}, \texttt{.}\}$ recognizing the language of strings of the form \texttt{user@website.domain}, where \texttt{user}, \texttt{website} and \texttt{domain} are nonempty strings over $\{\texttt{a}, \texttt{b}\}$ and \texttt{domain} has length 2 or 3. \begin{solution} \begin{center} \begin{tikzpicture} \begin{scope}[layer = nodes] \node[start] (start) at (-2, 0) {\texttt{start}}; \node[main] (0) at (0, 0) {$\phantom{0}$}; \node[main] (1) at (0, 2) {$\phantom{0}$}; \node[main] (2) at (0, 4) {$\phantom{0}$}; \node[main] (3) at (0, 6) {$\phantom{0}$}; \node[main] (4) at (0, 8) {$\phantom{0}$}; \node[main] (5) at (0, 10) {$\phantom{0}$}; \node[accept] (6) at (0, 12) {$\phantom{0}$}; \node[accept] (7) at (0, 14) {$\phantom{0}$}; \node[main] (8) at (5, 7) {$\phantom{0}$}; \end{scope} \draw[->] (start) edge (0) (0) edge node[label] {\texttt{a,b}} (1) (1) edge node[label] {\texttt{@}} (2) (2) edge node[label] {\texttt{a,b}} (3) (3) edge node[label] {\texttt{.}} (4) (4) edge node[label] {\texttt{a,b}} (5) (5) edge node[label] {\texttt{a,b}} (6) (6) edge node[label] {\texttt{a,b}} (7) (1) edge[loop left] node[label] {\texttt{a,b}} (1) (3) edge[loop left] node[label] {\texttt{a,b}} (3) (0) edge[out=0, in=270] node[label] {\texttt{@,.}} (8) (1) edge[out=0, in=245] node[label] {\texttt{.}} (8) (2) edge[out=0, in=220] node[label] {\texttt{@,.}} (8) (3) edge[out=0, in=195] node[label] {\texttt{@}} (8) (4) edge[out=0, in=170] node[label] {\texttt{@,.}} (8) (5) edge[out=0, in=145] node[label] {\texttt{@,.}} (8) (6) edge[out=0, in=120] node[label] {\texttt{@,.}} (8) (7) edge[out=0, in=95] node[label] 
{\texttt{a,b,@,.}} (8)
;
\draw[->, rounded corners = 5mm] (8) to +(1.5, 1) to node[label] {\texttt{a,b,@,.}} +(1.5, -1) to (8) ;
\end{tikzpicture} \end{center}
\end{solution}
\vfill
\pagebreak
\problem{} Draw a state diagram for a DFA over an alphabet of your choice that accepts exactly $f(n)$ strings of length $n$ if \par
\begin{itemize}
	\item $f(n) = n$
	\item $f(n) = n+1$
	\item $f(n) = 3^n$
	\item $f(n) = n^2$
	\item $f(n)$ is a Tribonacci number. \par
	Tribonacci numbers are defined by the sequence $f(0) = 0$, $f(1) = 1$, $f(2) = 1$, and $f(n) = f(n-1)+f(n-2)+f(n-3)$ for $n \ge 3$. \par
	\hint{Fibonacci numbers are given by the automaton prohibiting two \texttt{`a'}s in a row.}
\end{itemize}
\begin{solution}
\textbf{Part 4:} $f(n) = n^2$ \par
Consider the language of words over $\{0, 1, 2\}$ that contain exactly one \texttt{2} and at most one \texttt{1}. \par
There are $n$ such words of length $n$ with no \texttt{1} and $n(n-1)$ with exactly one \texttt{1}, for a total of $n + n(n-1) = n^2$:
\begin{center} \begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[start] (start) at (-2, 0) {\texttt{start}};
\node[main] (0) at (0, 0) {$\phantom{0}$};
\node[accept] (1) at (2, 2.5) {$\phantom{0}$};
\node[main] (2) at (2, -2.5) {$\phantom{0}$};
\node[accept] (4) at (4, 0) {$\phantom{0}$};
\node[main] (3) at (7, 0) {$\phantom{0}$};
\end{scope}
\clip (-2, 4) rectangle (9, -4);
\draw[->]
(start) edge (0)
(0) edge[loop above] node[label] {\texttt{0}} (0)
(1) edge[loop above] node[label] {\texttt{0}} (1)
(2) edge[loop below] node[label] {\texttt{0}} (2)
(4) edge[loop above] node[label] {\texttt{0}} (4)
(3) edge[loop right] node[label] {\texttt{0,1,2}} (3)
(0) edge node[label] {\texttt{2}} (1)
(0) edge node[label] {\texttt{1}} (2)
(1) edge node[label] {\texttt{1}} (4)
(2) edge node[label] {\texttt{2}} (4)
(4) edge node[label] {\texttt{1,2}} (3)
(1) edge[bend left] node[label] {\texttt{2}} (3)
(2) edge[bend right] node[label] {\texttt{1}} (3)
;
\end{tikzpicture} \end{center}
\linehack{}
\textbf{Part 5:} Tribonacci numbers \par
Using the hint, we get the following automaton. \par
It rejects all strings with three \texttt{'a'}s in a row.
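\par \vspace{2mm}
To see why forbidding three \texttt{a}s in a row leads to a Tribonacci-style recurrence, here is a short counting sketch. The name $t(n)$ is our own notation, not part of the problem: it denotes the number of strings of length $n$ over $\{\texttt{a}, \texttt{b}\}$ with no three \texttt{a}s in a row. For $n \ge 3$, every such string ends in exactly one of \texttt{b}, \texttt{ba}, or \texttt{baa}, and removing that ending leaves a shorter string of the same kind. Therefore
\[
	t(n) = t(n-1) + t(n-2) + t(n-3),
	\qquad
	t(0) = 1,\ t(1) = 2,\ t(2) = 4.
\]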
\begin{center} \begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[start] (start) at (-2, 0) {\texttt{start}};
\node[accept] (0) at (0, 0) {$\phantom{0}$};
\node[accept] (1) at (0, 2) {$\phantom{0}$};
\node[accept] (2) at (2, 0) {$\phantom{0}$};
\node[main] (3) at (4, 0) {$\phantom{0}$};
\end{scope}
\draw[->]
(start) edge (0)
(0) edge[loop below] node[label] {\texttt{b}} (0)
(3) edge[loop above] node[label] {\texttt{a,b}} (3)
(0) edge[bend left] node[label] {\texttt{a}} (1)
(1) edge[bend left] node[label] {\texttt{b}} (0)
(1) edge[bend left] node[label] {\texttt{a}} (2)
(2) edge node[label] {\texttt{a}} (3)
(2) edge node[label] {\texttt{b}} (0)
;
\end{tikzpicture} \end{center}
If we count the strings accepted by this automaton, we get the Tribonacci numbers with an offset: there are $1$, $2$, $4$, $7$, ... accepted strings of length $0$, $1$, $2$, $3$, ..., which is $f(n+2)$ rather than $f(n)$. \par
\pagebreak
We can fix this by adding a node and changing the start state:
\begin{center} \begin{tikzpicture}
\begin{scope}[layer = nodes]
\node[start] (start) at (1, -2) {\texttt{start}};
\node[accept] (0) at (0, 0) {$\phantom{0}$};
\node[accept] (1) at (0, 2) {$\phantom{0}$};
\node[accept] (2) at (2, 0) {$\phantom{0}$};
\node[main] (3) at (4, 0) {$\phantom{0}$};
\node[main] (4) at (3, -2) {$\phantom{0}$};
\end{scope}
\draw[->]
(start) edge (4)
(4) edge node[label] {\texttt{b}} (2)
(4) edge node[label] {\texttt{a}} (3)
(0) edge[loop below] node[label] {\texttt{b}} (0)
(3) edge[loop above] node[label] {\texttt{a,b}} (3)
(0) edge[bend left] node[label] {\texttt{a}} (1)
(1) edge[bend left] node[label] {\texttt{b}} (0)
(1) edge[bend left] node[label] {\texttt{a}} (2)
(2) edge node[label] {\texttt{a}} (3)
(2) edge node[label] {\texttt{b}} (0)
;
\end{tikzpicture} \end{center}
This automaton accepts exactly $f(n)$ strings of length $n$: the single string \texttt{b}, and every string \texttt{bb}$w$ where $w$ has no three \texttt{a}s in a row.
\end{solution}
\vfill
\pagebreak
% \problem{}
% Draw a DFA over an alphabet $\{a, b, c\}$, accepting all the suffixes of the string $abbc$ (including $\varepsilon$) and only them.
\problem{} Draw a DFA recognizing the language of strings over $\{\texttt{0}, \texttt{1}\}$ in which \texttt{0} is the third digit from the end. \par Prove that any such DFA must have at least 8 states. \begin{solution} \textbf{Part 1:} \par Index the states by three-letter suffixes \texttt{000}, \texttt{001}, ..., \texttt{111}. All strings that end with letters $d_1d_2d_3$ will end up in the state $d_1d_2d_3$. We accept all states that start with a \texttt{0}. \par Note that we can start at any node if we ignore strings with fewer than three letters. \begin{center} \begin{tikzpicture} \begin{scope}[layer = nodes] \node[start] (start) at (-2, 0) {\texttt{start}}; \node[main] (7) at (0, 0) {\texttt{111}}; \node[accept] (3) at (0, -2) {\texttt{011}}; \node[main] (6) at (2, -2) {\texttt{110}}; \node[main] (4) at (4, -2) {\texttt{100}}; \node[accept] (1) at (-4, -4) {\texttt{001}}; \node[main] (5) at (0, -4) {\texttt{101}}; \node[accept] (2) at (-2, -4) {\texttt{010}}; \node[accept] (0) at (-2, -6) {\texttt{000}}; \end{scope} \draw[->] (0) edge[loop left, looseness = 7] node[label] {\texttt{0}} (0) (7) edge[loop above, looseness = 7] node[label] {\texttt{1}} (7) (start) edge (7) (0) edge[out=90,in=-90] node[label] {\texttt{1}} (1) (1) edge node[label] {\texttt{0}} (2) (1) edge[out=45,in=-135] node[label] {\texttt{1}} (3) (2) edge[bend left] node[label] {\texttt{1}} (5) (3) edge node[label] {\texttt{0}} (6) (3) edge node[label] {\texttt{1}} (7) (5) edge[bend left] node[label] {\texttt{0}} (2) (5) edge node[label] {\texttt{1}} (3) (6) edge[bend left] node[label] {\texttt{0}} (4) (6) edge[out=-90,in=0] node[label] {\texttt{1}} (5) (7) edge[out=0,in=90] node[label] {\texttt{0}} (6) ; \draw[->, rounded corners = 10mm] (4) to (4, 2) to node[label] {\texttt{1}} (-4, 2) to (1) ; \draw[->, rounded corners = 10mm] (4) to (4, -6) to node[label] {\texttt{0}} (0) ; \draw[->, rounded corners = 5mm] (2) to (-2, -5) to node[label] {\texttt{0}} (3, -5) to (3, -2) to (4) ; 
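\end{tikzpicture} \end{center}
\linehack{}
\textbf{Part 2:} \par
We claim that the strings \texttt{000}, \texttt{001}, ..., \texttt{111} must lead to pairwise different states in any DFA recognizing this language. \par
\vspace{2mm}
For example, assume \texttt{101} and \texttt{010} lead to the same state. Append a \texttt{1} to the end of each string. \par
The last three letters of \texttt{1011} are \texttt{011}, and the last three letters of \texttt{0101} are \texttt{101}. The DFA must therefore accept \texttt{1011} and reject \texttt{0101}, so these two strings must end in different states. But both started in the same state and then read the same letter. We now have a contradiction: one edge cannot lead to two states!
\vspace{2mm}
\texttt{101} and \texttt{010} must thus correspond to distinct states. \par
We can repeat this argument for any other pair of strings: if the two strings differ in position $k$, append any $k-1$ letters to both, so that position $k$ becomes the third digit from the end. Exactly one of the two resulting strings is accepted, so the original strings cannot lead to the same state. Since there are 8 strings of length three, the DFA has at least 8 states. \par
\end{solution}
\vfill
\pagebreak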