Chapter 3: Lexical Analysis

How to recognize tokens

. Consider the following token definitions

relop < | <= | = | <> | >= | >

id letter (letter | digit)*

num digit+ ('.' digit+)? (E ('+' | '-')? digit+)?

delim blank | tab | newline

ws delim+

. Construct an analyzer that will return <token, attribute> pairs

We now consider the following grammar and try to construct an analyzer that will return <token, attribute> pairs.

relop < | <= | = | <> | >= | >

id letter (letter | digit)*

num digit+ ('.' digit+)? (E ('+' | '-')? digit+)?

delim blank | tab | newline

ws delim+
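
As a rough illustration of what such an analyzer might look like, the sketch below transcribes the definitions into Python regular expressions and returns (token, attribute) pairs. The names TOKEN_SPEC, MASTER and tokenize are hypothetical, "letter" and "digit" are assumed to mean ASCII letters and digits, and whitespace (ws) is assumed to be skipped rather than returned. Python's re module tries alternatives in order instead of choosing the longest match, so the two-character operators are listed before the one-character ones.

```python
import re

# Patterns transcribed from the definitions above (names are illustrative).
# Assumption: letter = ASCII letter, digit = ASCII digit.
TOKEN_SPEC = [
    ("WS",    r"[ \t\n]+"),                              # ws = delim+
    ("NUM",   r"[0-9]+(?:\.[0-9]+)?(?:E[+-]?[0-9]+)?"),  # num
    ("ID",    r"[A-Za-z][A-Za-z0-9]*"),                  # id
    ("RELOP", r"<=|<>|>=|<|=|>"),                        # longer operators first
]

# One combined pattern with a named group per token class.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token, attribute) pairs; whitespace is dropped."""
    pairs = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup != "WS":
            # The attribute here is simply the matched lexeme.
            pairs.append((m.lastgroup, m.group()))
        pos = m.end()
    return pairs


if __name__ == "__main__":
    print(tokenize("x1 <= 3.5E+2"))
```

Here the attribute is just the lexeme; a real analyzer might instead store a symbol-table entry for an id or a numeric value for a num.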

Using the set of rules given above, we can recognize the tokens. Given a regular expression R and an input string x, there are two methods for determining whether x is in L(R). One approach is to construct an NFA N from R and simulate it on x; the other is to use a DFA. We will study both approaches in detail in later slides.
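
To make the DFA approach concrete, here is a minimal sketch of a hand-built, table-driven DFA for the id definition, used to test whether a string x is in L(id). The state numbering, the names char_class, DELTA, ACCEPTING and accepts_id, and the restriction to ASCII letters and digits are all assumptions made for illustration; this is not the construction algorithm itself, only the membership test on an automaton that is already built.

```python
def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"


# DFA for: id = letter (letter | digit)*
# State 0 is the start state; state 1 is the only accepting state.
# Missing entries mean "no transition", i.e. the input is rejected.
DELTA = {
    (0, "letter"): 1,
    (1, "letter"): 1,
    (1, "digit"): 1,
}
ACCEPTING = {1}


def accepts_id(x):
    """Return True if the string x is in L(id)."""
    state = 0
    for ch in x:
        state = DELTA.get((state, char_class(ch)))
        if state is None:
            return False  # dead state: no valid transition
    return state in ACCEPTING


if __name__ == "__main__":
    for s in ["x1", "1x", "", "count"]:
        print(repr(s), accepts_id(s))
```

The loop reads each input character exactly once, which is why a DFA gives a membership test that runs in time linear in the length of x.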