Chapter 3: Lexical Analysis

Notation ....

Suppose r and s are regular expressions denoting the languages L(r) and L(s). Then,

. (r)|(s) is a regular expression denoting L(r) ∪ L(s).

. (r)(s) is a regular expression denoting L(r)L(s).

. (r)* is a regular expression denoting (L(r))*.

. (r) is a regular expression denoting L(r).
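
The three operations in these rules can be tried out on small finite languages. The following is a minimal sketch, not a definitive implementation: languages are modelled as Python sets of strings, and the function names union, concat and star are hypothetical names chosen for this illustration. Since (L(r))* is infinite whenever L(r) contains a nonempty string, star only returns the members up to an assumed length bound max_len.

```python
def union(L1, L2):
    """L(r) ∪ L(s): every string that is in either language."""
    return L1 | L2


def concat(L1, L2):
    """L(r)L(s): a string of the first language followed by a string of the second."""
    return {x + y for x in L1 for y in L2}


def star(L, max_len):
    """Members of (L)* of length at most max_len (the full closure may be infinite)."""
    result = {""}      # zero repetitions always give the empty string
    frontier = {""}    # strings found in the previous round
    while frontier:
        # Append one more string of L, keeping only new strings within the bound.
        frontier = {x + y for x in frontier for y in L
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result


A, B = {"a"}, {"b"}
print(sorted(union(A, B)))                       # ['a', 'b']
print(sorted(concat(union(A, B), union(A, B))))  # ['aa', 'ab', 'ba', 'bb']
print(sorted(star(A, 3)))                        # ['', 'a', 'aa', 'aaa']
```

The fourth rule needs no function of its own: putting parentheses around r leaves the denoted set unchanged.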

Let us take an example to illustrate. Let Σ = {a, b}.

1. The regular expression a|b denotes the set {a, b}.

2. The regular expression (a | b) (a | b) denotes {aa, ab, ba, bb}, the set of all strings of a's and b's of length two. Another regular expression for this same set is aa | ab | ba | bb.

3. The regular expression a* denotes the set of all strings of zero or more a's, i.e., {ε, a, aa, aaa, ...}.

4. The regular expression (a | b)* denotes the set of all strings containing zero or more instances of an a or b, that is, the set of all strings of a's and b's. Another regular expression for this set is (a*b*)*.

5. The regular expression a | a*b denotes the set containing the string a and all strings consisting of zero or more a's followed by a b.
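
The five examples above can be checked mechanically. The sketch below assumes Python's standard re module as a stand-in for the notation of this chapter; re supports much more than the three basic operators, but the patterns used here need only alternation, concatenation and closure. The helper name language and the bound max_len are hypothetical choices for this illustration. It lists every string over {a, b} up to the bound that a pattern matches in full, with the empty string '' playing the role of ε.

```python
import re
from itertools import product


def language(pattern, alphabet="ab", max_len=3):
    """Strings over `alphabet` of length <= max_len that `pattern` matches entirely."""
    regex = re.compile(pattern)
    matches = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            s = "".join(letters)
            if regex.fullmatch(s):   # the whole string must match, not just a prefix
                matches.append(s)
    return matches


print(language("a|b"))          # ['a', 'b']
print(language("(a|b)(a|b)"))   # ['aa', 'ab', 'ba', 'bb']
print(language("a*"))           # ['', 'a', 'aa', 'aaa']
print(language("a|a*b"))        # ['a', 'b', 'ab', 'aab']
```

For (a|b)* the same call returns all 15 strings of length at most three, since every string of a's and b's belongs to that language.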

If two regular expressions r and s denote the same language, we say r and s are equivalent and write r = s. For example, (a | b) = (b | a).
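
Equivalence can be probed, though not proved, by comparing two expressions on every string up to some length. The sketch below again assumes Python's re module as a stand-in for the notation here; the function name first_difference and the bound max_len are hypothetical. It returns the first string on which the two patterns disagree, or None if they agree on everything up to the bound. Agreement up to a bound is only evidence of equivalence, while a returned string is a definite counterexample.

```python
import re
from itertools import product


def first_difference(r, s, alphabet="ab", max_len=6):
    """First string of length <= max_len matched by exactly one of r and s, else None."""
    regex_r, regex_s = re.compile(r), re.compile(s)
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            in_r = regex_r.fullmatch(w) is not None
            in_s = regex_s.fullmatch(w) is not None
            if in_r != in_s:
                return w
    return None


print(first_difference("a|b", "b|a"))          # None
print(first_difference("(a|b)*", "(a*b*)*"))   # None
print(first_difference("a|a*b", "a*b"))        # a
```

The last call shows that a | a*b and a*b are not equivalent: the string a belongs to the first language but not to the second.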