
CSC 250

Theory of Computation

Smith Computer Science



Lecture Notes 04:
Regular Languages and Regular Expressions


Outline

This class we'll discuss:




A Map of what we'll be doing in class

What are we going to do?


As we mentioned last class, we are going to analyze what types of problems can be solved with minimal "machinery".


We are going to work our way up from basic machines to the modern computer.



How are we going to do it?


Since we are not really going to build those machines in hardware, we will need to "represent" them symbolically.

We will be building symbolic machines to solve problems of greater and greater complexity.



A candidate "simple" problem:

\[ \begin{align*} & 1001011 \\ & + \\ & 0111011 \end{align*} \]

While this might look simple, it is really a very advanced problem. It requires that the machine know:

In other words, performing arithmetic operations is a few "machines" away.
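To get a feel for how much is hidden inside this "simple" sum, here is a minimal sketch of schoolbook binary addition in Python. The function name `add_binary` is made up for this illustration; it is not part of the course material. Note that even this tiny procedure has to walk the digits from right to left and remember a carry between steps.

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary numerals digit by digit, right to left, with a carry."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad the shorter numeral with 0s
    carry = 0
    digits = []
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry
        digits.append(str(total % 2))  # digit written in this column
        carry = total // 2             # carry remembered for the next column
    if carry:
        digits.append("1")
    return "".join(reversed(digits))


print(add_binary("1001011", "0111011"))  # 10000110
```

In decimal this is 75 + 59 = 134.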



Step 1: recognizing a pattern


You can think of a machine that recognizes things and can classify them as being inside a set of known elements or outside that set.

With our symbolic replicas, a first task is to specify the structure or pattern of the set of symbols that we would like to recognize.

How to build a "recognizing" machine


Activity 1 [1 minute]:
Can you think of a super simple (mechanical) machine that "recognizes" things?
Think before revealing some examples:
(Wait; then Click)

Think of a sieve: it recognizes small vs. big!
It "recognizes" small particles and lets them through!



Activity 2 [1 minute]:
Can you think of a super simple (mechanical) machine that "recognizes" a pattern (sequence of things)?
Think before revealing some examples:
(Wait; then Click)

Think of a locking mechanism:
It "recognizes" the right key (a specific sequence of cuts) and opens only for it!



Regular Expressions are like lock mechanisms that can be built to recognize one or many "keys".






Regular Expressions

An expression is a combination of objects and operators that can be "resolved" into a value.

In Arithmetic, the objects are numbers, and the operators are \( + \), \( - \), \( * \), \( \div \), etc.
A (correct) arithmetic expression looks like this: \[ (27 - 6)* 2\]

The value of a resolved arithmetic expression is a number, in this case \(42\).


A Regular Expression (RE or RegEx) is an algebraic way to describe a set of words.

In regular expressions, we also have objects (symbols) and operators, which can be combined to form an expression that might look like this: \[ 0^*(101 + 11011)0^* \]
The value of a resolved regular expression is a Language (a set of possible words), that is,
"the set of words that follow the pattern of the regular expression".

In this case:

"a sequence that starts with any number of consecutive zeros (zero or more 0s), followed by either the exact sequence 101 or the exact sequence 11011, and concluding with any number of zeros (zero or more 0s)"
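As a quick sanity check of that description, here is a small sketch using Python's `re` module. Two assumptions to keep in mind: Python writes union as `|` where these notes write `+`, and the helper name `in_language` is invented for this example.

```python
import re

# The notes' 0^*(101 + 11011)0^*, with "+" (union) written as "|" in Python.
PATTERN = re.compile(r"0*(101|11011)0*")


def in_language(word: str) -> bool:
    """Return True if the WHOLE word follows the pattern."""
    return PATTERN.fullmatch(word) is not None


for word in ["101", "0011011000", "000", "1011"]:
    print(word, in_language(word))
# 101 True
# 0011011000 True
# 000 False
# 1011 False
```

`fullmatch` is used on purpose: the whole word has to follow the pattern, not just some piece of it.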


Another way you'll see this written is \( \mathrm{L}(R) \) (for some RE \(R\)),
which refers to the Language (set of words) described by the expression \(R\) inside the parentheses.

We sometimes refer to the Language with a symbol like \(\mathrm{L}_A\), meaning the language whose words have some property \(A\).
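Since \( \mathrm{L}(R) \) is a set of words, we can list part of it. The sketch below is one way to do that in Python for the example expression \( 0^*(101 + 11011)0^* \). The helper `language_up_to` is a made-up name, and because the language is infinite it only lists the words up to a chosen length.

```python
import re
from itertools import product


def language_up_to(pattern, max_len, alphabet="01"):
    """List every word over `alphabet` of length <= max_len that the pattern describes."""
    compiled = re.compile(pattern)
    words = []
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            word = "".join(symbols)
            if compiled.fullmatch(word):
                words.append(word)
    return words


# Python writes the union "+" from the notes as "|".
print(language_up_to(r"0*(101|11011)0*", 5))
# ['101', '0101', '1010', '00101', '01010', '10100', '11011']
```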



Defining a Regular Expression

We will follow a sequence of steps to understand how to build regular expressions.
First, we'll need some basic definitions.

RE Definitions



The following are some examples:



Regular Operations



Definition of Regular Expressions (Recursively)

Basic (axiomatic) definitions:


Recursive extensions:

Precedence rules are: parentheses first, then star, then concatenation, then union.
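Python's `re` module happens to use the same precedence for these three operators, so we can sketch the rule in code. The expression \( 0 + 10^* \) (written `0|10*` in Python) is an arbitrary example chosen for this illustration, and the helper names are made up. The code compares the unparenthesized expression against three fully parenthesized readings on all binary strings up to length 4.

```python
import re
from itertools import product


def binary_strings(max_len):
    """Yield every binary string of length 0..max_len."""
    for n in range(max_len + 1):
        for symbols in product("01", repeat=n):
            yield "".join(symbols)


def language(pattern, max_len=4):
    """Set of binary strings (up to max_len) that the pattern describes."""
    compiled = re.compile(pattern)
    return {w for w in binary_strings(max_len) if compiled.fullmatch(w)}


implicit = language(r"0|10*")         # no parentheses: precedence decides
explicit = language(r"0|(1(0*))")     # star, then concatenation, then union
star_last = language(r"0|(10)*")      # wrong reading: star applied to "10"
union_first = language(r"(0|1)0*")    # wrong reading: union done before concatenation

print(implicit == explicit)      # True
print(implicit == star_last)     # False
print(implicit == union_first)   # False
```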



Let's try some exercises:

Activity 4 [2 minutes]:
What is the expression that gives us all binary strings?
Describe it in words or with any notation you find useful. We'll develop the rigorous notation later.
Hints:
1) What is the alphabet \(\Sigma\)?
2) Which operator might help us expand this into the correct language \(\mathrm{L}_b\)?



Activity 5 [2 minutes]:
What is the expression that gives us all binary strings that begin with a 1?



Activity 6 [2 minutes]:
What is the expression that gives us all binary strings that begin with a 1, end with a 0, and have an even number of digits?



The RegEx Minigolf!



Activity 7 [5-10 minutes!!]:
Make teams of 2 or 3 and complete the following challenges.
Let me know when your team is done.

Hole 1:
What is the expression that gives us:

\(\{w \mid w \text{ contains a single 1}\}\)



Hole 2:
What is the expression that gives us:

\(\{w \mid w \text{ contains at least one 1}\}\)



Hole 3:
What is the expression that gives us:

\(\{w \mid w \text{ contains 001 as a substring}\}\)



Hole 4:
What is the expression that gives us:

\(\{w \mid \text{every 0 in } w \text{ is followed by at least one 1}\}\)



Hole 5:
What is the expression that gives us:

\(\{w \mid w \text{ is a string of even length}\}\)



Hole 6:
What is the expression that gives us:

\(\{w \mid w \text{ is a string of odd length}\}\)



Hole 7:
What is the expression that gives us:

\(\{w \mid \text{the length of } w \text{ is a multiple of 3}\}\)



Hole 8:
What is the expression that gives us exactly this set:

\( \{ 01 \text{, } 10\} \)



Hole 9:
If \( \Sigma=\{0,1\} \), what is the expression that gives us the following language?

\( \{w \mid w \text{ starts and ends with the same symbol} \} \)
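If your team wants to test a candidate answer before calling it done, a brute-force checker is one option. The sketch below is only a suggestion: `find_mismatches` is a made-up helper, the candidate is written in Python's `re` syntax (so union is `|` instead of `+`), and the target language is given as an ordinary Python predicate. The demo uses a deliberately wrong attempt at Hole 2 so that no answer is given away.

```python
import re
from itertools import product


def find_mismatches(candidate, predicate, max_len=6, alphabet="01"):
    """Return the words (up to max_len) where the regex and the predicate disagree."""
    compiled = re.compile(candidate)
    mismatches = []
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            word = "".join(symbols)
            accepted = compiled.fullmatch(word) is not None
            if accepted != predicate(word):
                mismatches.append(word)
    return mismatches


# Deliberately wrong attempt at Hole 2 ("contains at least one 1"):
# this candidate also forces the word to START with a 1.
print(find_mismatches(r"1(0|1)*", lambda w: "1" in w, max_len=3))
# ['01', '001', '010', '011']
```

An empty list only means the candidate agrees with the description on every word up to `max_len`; it is evidence, not a proof.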






Regular Languages

A regular language is one that can be generated using a regular expression.

Activity 8 [1 minute]:
Answer the following questions:
(Wait; then Click)

Regular Languages are more powerful!
more descriptive power!



Properties

The following properties are true about regular languages:



Practice Proof!

Claim: Regular languages are closed under union

Tips: Use the axioms defined above!
In particular:
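A sketch of the idea behind the claim, assuming the relevant rule is the union case of the recursive definition (if \(R_1\) and \(R_2\) are regular expressions, then so is \(R_1 + R_2\), and \( \mathrm{L}(R_1 + R_2) = \mathrm{L}(R_1) \cup \mathrm{L}(R_2) \)). The two expressions below are arbitrary examples, the helper names are invented, and the check only covers words up to length 5, so it illustrates the claim rather than proving it.

```python
import re
from itertools import product


def language_up_to(pattern, max_len, alphabet="01"):
    """Set of words over `alphabet` (length <= max_len) described by the pattern."""
    compiled = re.compile(pattern)
    words = set()
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            word = "".join(symbols)
            if compiled.fullmatch(word):
                words.add(word)
    return words


def union_regex(r1, r2):
    """Build the expression R1 + R2 (written with "|" in Python)."""
    return f"(?:{r1})|(?:{r2})"


R1 = r"0*"        # example: words made only of 0s
R2 = r"1(0|1)*"   # example: words that begin with a 1

lhs = language_up_to(union_regex(R1, R2), 5)
rhs = language_up_to(R1, 5) | language_up_to(R2, 5)  # set union of the two languages
print(lhs == rhs)  # True
```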




Next Class: Overleaf + LaTeX, Regex Recap, and Intro to Finite Automata!!




Homework


Review today's class and keep working on Problem Set 1!
Fill out the When2Meet Form today!!

Fill out the Team Forming Formy Form Today!
Today at ~4, I will assign teams for the next homework.

[Optional] How would you build a "flowchart" that recognizes the language of this regex?
Given \(\Sigma = \{0,1\}\), find the regex that describes the language \( \{w \mid w \text{ contains at least one 1} \}\)