From 4bb7cd7c5df7988e03765a58ce0155e59528159b Mon Sep 17 00:00:00 2001
From: Mark
Date: Tue, 31 Jan 2023 14:45:34 -0800
Subject: [PATCH] Edits to regex warmup
---
 Misc/Warm-Ups/regex.tex | 51 ++++++++++++++++++++++-------------------
 1 file changed, 27 insertions(+), 24 deletions(-)
diff --git a/Misc/Warm-Ups/regex.tex b/Misc/Warm-Ups/regex.tex
index 4eb6be7..833b616 100644
--- a/Misc/Warm-Ups/regex.tex
+++ b/Misc/Warm-Ups/regex.tex
@@ -58,10 +58,10 @@
  The pattern \htexttt{linea?r} will match only \texttt{linear} and \texttt{liner} \\
  \vspace{2mm}
- Patterns with brackets \htexttt{\{min, max\}} are the most flexible quantifier. \\
+ Brackets \htexttt{\{min, max\}} are the most flexible quantifier. \\
  They specify exactly how many tokens to match: \\
  \htexttt{ab\{2\}a} will match only \texttt{abba}. \\
- \htexttt{ab\{1,3\}a} will match \texttt{aba}, \texttt{abba}, \texttt{abbba}. \\
+ \htexttt{ab\{1,3\}a} will match \texttt{aba}, \texttt{abba}, and \texttt{abbba}. \\
  \htexttt{ab\{2,\}a} will match any \texttt{ab...ba} with at least two \texttt{b}s.
  \vspace{5mm}
@@ -81,11 +81,8 @@
- \textbf{Characters, Sets, and Groups}
-
- Characters tell us what to match.
-
- Usually we specify them literally, as shown above: \\
+ \textbf{Characters, Sets, and Groups} \\
+ We specify characters literally, as shown above: \\
  \texttt{a+} means \say{one or more \texttt{a} character} \\
  \vspace{2mm}
@@ -97,6 +94,8 @@
  The first such way is the \textit{set}, denoted \htexttt{[ ]}. A set can pretend to be any character inside it. \\
  For example, \htexttt{m[aoy]th} will match \texttt{math}, \texttt{moth}, or \texttt{myth}. \\
  \htexttt{a[01]+b} will match \texttt{a0b}, \texttt{a111b}, \texttt{a1100110b}, and any other similar string. \\
+ You may negate a set with a \htexttt{\textasciicircum}. \\
+ \htexttt{[\textasciicircum abc]} will match any character except \texttt{a}, \texttt{b}, or \texttt{c}, including symbols and spaces.
  \vspace{2mm}
@@ -110,16 +109,29 @@
  \problem{}
  You are now familiar with most of the tools regex has to offer. \\
- Match the following strings:
- \begin{enumerate}
+ Write patterns that match the following strings:
+ \begin{enumerate}[itemsep=1mm]
  \item An ISO-8601 date, like \texttt{2022-10-29}. \\
- Invalid dates like \texttt{2022-13-29} should also be matched. \\
+ \hint{Invalid dates like \texttt{2022-13-29} should also be matched.}
+
+ \item An email address. \\
+ \hint{Don't forget about subdomains, like \texttt{math.ucla.edu}.}
+
+ \item A UCLA room number, like \texttt{MS 5118} or \texttt{Kinsey 1220B}.
- \item A hexadecimal integer of any length.
- \item A UCLA room number, like \texttt{MS 5118} or \texttt{Kinsey 1220B}
  \item Any ISBN-10 of the form \texttt{0-316-00395-7}. \\
- Remember that the check digit can be an \texttt{X}. \\
- Dashes are optional.
+ \hint{Remember that the check digit may be an \texttt{X}. Dashes are optional.}
+
+ \item A word of even length. \\
+ \hint{The set \texttt{[A-Za-z]} matches every English letter, capitalized and lowercase. \\
+ \texttt{[a-z]} will only match lowercase letters.}
+
+ \item A word with exactly 3 vowels. \\
+ \hint{The special token \texttt{\textbackslash w} will match any word character. It is equivalent to \texttt{[A-Za-z0-9\_]}.}
+
+ \item A word that has even length and exactly 3 vowels.
+
+ \item A sentence that does not start with a capital letter.
  \end{enumerate}
@@ -131,15 +143,6 @@
  \problem{}
- If you'd like to know more, check out \texttt{regexr.com}. There's an interative regex prompt that provices explanations, as well as a cheatsheet that explains every regex token there is. You can find a nice set of challenges at \texttt{http://regex.alf.nu} \\
+ If you'd like to know more, check out \texttt{regexr.com}. It offers an interactive regex prompt, as well as a cheatsheet that explains every other regex token there is. You can find a nice set of challenges at \texttt{http://regex.alf.nu}.
  \\ I especially encourage you to look into this if you are interested in computer science.
- \pagebreak
-
- \problem{}
- Draw a DFA for each of the following regex strings. \\
- \begin{itemize}
- \item Your solution to \ref{regex}, Part 2
- \item Your solution to \ref{regex}, Part 3
- \item Your solution to \ref{regex}, Part 4
- \end{itemize}
  \end{document}
\ No newline at end of file
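
Review note: the patterns quoted in this patch can be sanity-checked with a short script. The sketch below assumes Python's re module as the engine (the handout does not name one), and the helper name full is made up for illustration. It checks the bounded quantifier, the set, and the negated set from the hunks above, and it also lists which characters a range written as [A-z] accepts beyond the letters, since that range runs through the ASCII punctuation between Z and a.

```python
import re

# Assumption: Python's re module stands in for the regex engine;
# the handout itself does not name one.

def full(pattern: str, text: str) -> bool:
    """Return True when the whole string matches the pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# Bounded quantifier: between one and three b's.
candidates = ["aa", "aba", "abba", "abbba", "abbbba"]
bounded = [s for s in candidates if full(r"ab{1,3}a", s)]

# A set matches exactly one of the characters inside it.
myth_like = [s for s in ["math", "moth", "myth", "meth"] if full(r"m[aoy]th", s)]

# A negated set matches any single character not listed, including symbols and spaces.
not_abc = [c for c in "a b_c!" if full(r"[^abc]", c)]

# [A-z] covers ASCII codes 65 through 122, so it accepts more than letters.
extras = [
    chr(i)
    for i in range(128)
    if full(r"[A-z]", chr(i)) and not full(r"[A-Za-z]", chr(i))
]

print(bounded)
print(myth_like)
print(not_abc)
print(extras)
```

The last list is the reason to prefer [A-Za-z] when only letters are wanted. Other engines may differ in details such as how \w treats non-ASCII text, so treat this as a check of the examples rather than a statement about every regex flavor.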