\section{DFAs}

This week, we will study computational devices called \textit{deterministic finite automata}. \par
A DFA has a simple job: it will either \say{accept} or \say{reject} a string of letters.

\vspace{2mm}

Consider the automaton $A$ shown below:

\begin{center}
	\begin{tikzpicture}
		\begin{scope}[layer = nodes]
			\node[main] (a) at (0, 0) {$a$};
			\node[accept] (b) at (2, 0) {$b$};
			\node[main] (c) at (5, 0) {$c$};
		\end{scope}

		\draw[->]
			(a) edge node[label] {$1$} (b)
			(a) edge[loop above] node[label] {$0$} (a)
			(b) edge[bend left] node[label] {$0$} (c)
			(b) edge[loop above] node[label] {$1$} (b)
			(c) edge[bend left] node[label] {$0,1$} (b)
		;
	\end{tikzpicture}
\end{center}

$A$ always starts in the state $a$. This is called the \textit{start state}. \par
It takes strings over the alphabet $\{0, 1\}$ and reads them left to right, moving between states along the edge marked by each letter.

For example, consider the string \texttt{1011}.
Processing this string, $A$ will go through the states $a \to b \to c \to b \to b$. \par
Note that $b$ has a circle in the diagram above.
This means that the state $b$ is \textit{accepting}, and that every string which ends up in it is \textit{accepted}.
Similarly, the states $a$ and $c$ are \textit{rejecting}, and the strings which end up there are \textit{rejected}.

\problem{}
Which of the following strings are accepted by $A$?
\begin{itemize}
	\item \texttt{1}
	\item \texttt{1010}
	\item \texttt{1110010}
	\item \texttt{1000100}
\end{itemize}

\vfill

\problem{}
Describe the general form of a string accepted by $A$.
\hint{Work backwards from the accepting state, and decide what all the strings must look like at the end in order to be accepted.}

\begin{solution}
	$A$ accepts exactly the strings that contain at least one \texttt{1} and end with an even (possibly zero) number of \texttt{0}s.
\end{solution}

\vfill
\pagebreak

Now consider the automaton $B$, which uses the alphabet $\{\texttt{a}, \texttt{b}\}$.
\par It starts in the state $s$ and has two accepting states, $a_1$ and $b_1$.

\begin{center}
	\begin{tikzpicture}
		\begin{scope}[layer = nodes]
			\node[main] (s) at (0, 0) {$s$};
			\node[accept] (a1) at (-2, -0.5) {$a_1$};
			\node[main] (a2) at (-2, -2.5) {$a_2$};
			\node[accept] (b1) at (2, -0.5) {$b_1$};
			\node[main] (b2) at (2, -2.5) {$b_2$};
		\end{scope}

		\draw[->]
			(s) edge node[label] {\texttt{a}} (a1)
			(a1) edge[loop left] node[label] {\texttt{a}} (a1)
			(a1) edge[bend left] node[label] {\texttt{b}} (a2)
			(a2) edge[bend left] node[label] {\texttt{a}} (a1)
			(a2) edge[loop left] node[label] {\texttt{b}} (a2)
			(s) edge node[label] {\texttt{b}} (b1)
			(b1) edge[loop right] node[label] {\texttt{b}} (b1)
			(b1) edge[bend left] node[label] {\texttt{a}} (b2)
			(b2) edge[bend left] node[label] {\texttt{b}} (b1)
			(b2) edge[loop right] node[label] {\texttt{a}} (b2)
		;
	\end{tikzpicture}
\end{center}

\problem{}
Which of the following strings are accepted by $B$?
\begin{itemize}
	\item \texttt{aa}
	\item \texttt{abba}
	\item \texttt{abbba}
	\item \texttt{baabab}
\end{itemize}

\vfill

\problem{}
Describe the strings accepted by $B$.

\begin{solution}
	They are the nonempty strings that start and end with the same letter.
\end{solution}

\vfill
\pagebreak

\definition{}
An \textit{alphabet} is a finite set of symbols. \par

\definition{}
A \textit{string} over an alphabet $Q$ is a finite sequence of symbols from $Q$. \par
We denote the empty string by $\varepsilon$. \par

\vspace{2mm}

$Q^*$ is the set of all possible strings over $Q$. \par
For example, $\{\texttt{0}, \texttt{1}\}^*$ is the set $\{\varepsilon, \texttt{0}, \texttt{1}, \texttt{00}, \texttt{01}, \texttt{10}, \texttt{11}, \texttt{000}, \dots \}$. \par
Note that this set contains the empty string.

\definition{}
A \textit{language} over an alphabet $Q$ is a subset of $Q^*$.
\\ For example, the language \say{strings of length 2} over $\{\texttt{0}, \texttt{1}\}$ is $\{\texttt{00}, \texttt{01}, \texttt{10}, \texttt{11}\}$.

\definition{}
We say a language $L$ is \textit{recognized} by a DFA $A$ if $A$ accepts a string $w$ if and only if $w \in L$.

%\begin{remark}
%A machine, such as DFA or Turing machine, may accept several strings, but it always recognizes only one language. If the machine
%accepts no strings, it still recognizes one language, namely the empty language $\emptyset$.
%\end{remark}

\vspace{8mm}

\problem{}
How many strings of length $n$ are accepted by the automaton $C$?

\begin{center}
	\begin{tikzpicture}
		\begin{scope}[layer = nodes]
			\node[main] (0) at (0, 0) {$0$};
			\node[accept] (1) at (3, 0) {$1$};
			\node[main] (2) at (5, 0) {$2$};
		\end{scope}

		\draw[->]
			(0) edge[loop above] node[label] {\texttt{b}} (0)
			(0) edge[bend left] node[label] {\texttt{a}} (1)
			(1) edge[bend left] node[label] {\texttt{b}} (0)
			(1) edge node[label] {\texttt{a}} (2)
			(2) edge[loop above] node[label] {\texttt{a, b}} (2)
		;
	\end{tikzpicture}
\end{center}

\begin{solution}
	Assume that $C$ starts in the state $0$.
	Then $C$ accepts exactly the strings that end in \texttt{a} and contain no two consecutive \texttt{a}s.
	If $A_n$ is the number of accepted strings of length $n$, then $A_n = A_{n-1} + A_{n-2}$ for $n \ge 3$.
	Together with the initial conditions $A_1 = A_2 = 1$, we see that $A_n$ is the $n$-th Fibonacci number.
\end{solution}

%\begin{remark}
%Note that all the states in our DFAs $A$, $B$ and $C$ from figures 1, 2, 3 have outgoing symbols for each letter of the alphabet.
%Do the same for your DFAs.
%\end{remark}

\vfill
\pagebreak

\problem{}
Draw DFAs that recognize the following languages.
In all parts, the alphabet is $\{0,1\}$:
\begin{itemize}
	\item $\{w~ | ~w~ \text{begins with a \texttt{1} and ends with a \texttt{0}}\}$
	\item $\{w~ | ~w~ \text{contains at least three \texttt{1}s}\}$
	\item $\{w~ | ~w~ \text{contains the substring \texttt{0101} (i.e., $w = x\texttt{0101}y$ for some $x$ and $y$)}\}$
	\item $\{w~ | ~w~ \text{has length at least three and its third symbol is a \texttt{0}}\}$
	\item $\{w~ | ~w~ \text{starts with \texttt{0} and has odd length, or starts with \texttt{1} and has even length}\}$
	\item $\{w~ | ~w~ \text{does not contain the substring \texttt{110}}\}$
\end{itemize}

\begin{solution}
	%\part{a} \includegraphics[width=0.3\linewidth]{6a.png}
	%\part{b} \includegraphics[width=0.4\linewidth]{6b.png}
	%\part{c} \includegraphics[width=0.3\linewidth]{6c.png}

	\medskip
	Notice that after getting two \texttt{0}s in a row we don't reset to the initial state.

	%\part{d} \includegraphics[width=0.4\linewidth]{6d.png}
	%\part{e} \includegraphics[width=0.3\linewidth]{6e.png}
	%\part{f} \includegraphics[width=0.4\linewidth]{6f.png}

	\medskip
	Notice that after getting three \texttt{1}s in a row we don't reset to the initial state.
\end{solution}

\vfill

\problem{}
Draw a DFA over the alphabet $\{\texttt{a}, \texttt{b}, \texttt{@}, \texttt{.}\}$ that recognizes the language of strings of the form \texttt{user@website.domain},
where \texttt{user}, \texttt{website} and \texttt{domain} are nonempty strings over $\{\texttt{a}, \texttt{b}\}$ and \texttt{domain} has length 2 or 3.

\begin{solution}
	%\includegraphics[width=0.9\linewidth]{Email.png}
\end{solution}

\vfill
\pagebreak

\problem{}
Draw a state diagram for a DFA, over an alphabet of your choice, that accepts exactly $f(n)$ strings of length $n$, where
\begin{itemize}
	\item $f(n) = n$
	\item $f(n) = n+1$
	\item $f(n) = 3^n$
	\item $f(n) = n^2$
	\item $f(n)$ is a Tribonacci number.
	\par \textit{Tribonacci numbers} are defined by the sequence $f(0) = 0$, $f(1) = 1$, $f(2) = 1$, and $f(n) = f(n-1)+f(n-2)+f(n-3)$ for $n \ge 3$. \par
	\hint{Fibonacci numbers are given by the automaton prohibiting two \texttt{a}s in a row.}
\end{itemize}

\begin{solution}
	\begin{itemize}
		\item For $f(n) = 3^n$, use an alphabet with three letters and a DFA that accepts every string.

		\item For $f(n) = n^2$, consider the language of words over $\{0, 1, 2, 3\}$ that consist of zeros together with either exactly one $1$ and exactly one $2$ (in either order), or exactly one $3$.
		There are $n(n-1) + n = n^2$ such words of length $n$.
		(Digit sum $2$ over $\{0, 1, 2\}$ does not work: it gives only $\binom{n}{2} + n = n(n+1)/2$ words.)
		%\includegraphics[width=0.5\linewidth]{NSqrd.png}

		\item For the Tribonacci numbers, following the hint gives the automaton that prohibits three \texttt{a}s in a row:
		%\includegraphics[width=0.5\linewidth]{Trib1.png}

		\item For this automaton, $f(n)$ gives the Tribonacci numbers with a shift: $f(0)=1$, $f(1)=2$, $f(2)=4$, $f(3)=7$.
		To account for the shift, one can move the start state, e.g., in this fashion:
		%\includegraphics[width=0.5\linewidth]{Trib2.png}
	\end{itemize}
\end{solution}

\vfill

% \problem{}
% Draw a DFA over an alphabet $\{a, b, c\}$, accepting all the suffixes of the string $abbc$ (including $\varepsilon$) and only them.
%
% \com{TD}{Something suffix automaton}

\problem{}
Draw a DFA recognizing the language of strings over $\{\texttt{0}, \texttt{1}\}$ in which \texttt{0} is the third digit from the end. \par
Prove that any such DFA must have at least 8 states.

\begin{solution}
	\textbf{Part 1:} \par
	Index the states by triples of digits \texttt{000}, \texttt{001}, ..., \texttt{111}.
	Every string that ends with the three digits $d_1d_2d_3$ will end up in the state $d_1d_2d_3$.
	The start state is \texttt{111}.
	The transitions from $d_1d_2d_3$ by \texttt{0} and \texttt{1} lead to $d_2d_3\texttt{0}$ and $d_2d_3\texttt{1}$, respectively.
	The accepting states are the states whose index starts with \texttt{0}.
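
	\vspace{2mm}

	As a quick check of this construction, take \texttt{1011} and \texttt{0110} as arbitrary test strings (they are chosen only for illustration).
	Starting from \texttt{111} and applying the transition rule one digit at a time gives
	\[
		\texttt{111}
		\stackrel{\texttt{1}}{\longrightarrow} \texttt{111}
		\stackrel{\texttt{0}}{\longrightarrow} \texttt{110}
		\stackrel{\texttt{1}}{\longrightarrow} \texttt{101}
		\stackrel{\texttt{1}}{\longrightarrow} \texttt{011}
	\]
	for \texttt{1011}, and
	\[
		\texttt{111}
		\stackrel{\texttt{0}}{\longrightarrow} \texttt{110}
		\stackrel{\texttt{1}}{\longrightarrow} \texttt{101}
		\stackrel{\texttt{1}}{\longrightarrow} \texttt{011}
		\stackrel{\texttt{0}}{\longrightarrow} \texttt{110}
	\]
	for \texttt{0110}.
	The first run ends in \texttt{011}, whose index starts with \texttt{0}, so \texttt{1011} is accepted; its third digit from the end is indeed \texttt{0}.
	The second run ends in \texttt{110}, so \texttt{0110} is rejected; its third digit from the end is \texttt{1}.
	The same rule suggests why \texttt{111} is a convenient start state: after reading fewer than three digits, the first digit of the index is still \texttt{1}, so strings with no third digit from the end are rejected.
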
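	%\includegraphics[width=0.7\linewidth]{9.png}

	\linehack{}

	\textbf{Part 2:} \par
	Any two distinct strings among \texttt{000}, \texttt{001}, ..., \texttt{111} differ in some position $i$, counted from the left.
	Appending the same $i-1$ digits to both strings makes position $i$ the third digit from the end, so exactly one of the two extended strings is accepted.
	If the two original strings led to the same state, the two extended strings would also lead to the same state, which is impossible.
	Hence these eight strings lead to pairwise different states, and the DFA has at least 8 states.
\end{solution}

\vfill
\pagebreak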