Chapter 3: Lexical Analysis

How does LEX work?

. Regular expressions describe the languages that can be recognized by finite automata

. Translate each token's regular expression into a non-deterministic finite automaton (NFA)

. Convert the NFA into an equivalent DFA

. Minimize the DFA to reduce the number of states

. Emit code driven by the DFA tables
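
The NFA-to-DFA step listed above is commonly done with the subset construction, in which each DFA state stands for a set of NFA states. The sketch below is only an illustration under stated assumptions: the NFA for a(b|c)* is written out by hand, the empty string is used as the epsilon label, and names such as NFA, epsilon_closure and subset_construction are hypothetical. It is not taken from LEX's own source.

```python
from collections import deque

EPS = ""  # label used for epsilon moves (an assumption of this sketch)

# Hand-built NFA for the regular expression a(b|c)*
# state -> {symbol -> set of next states}
NFA = {
    0: {"a": {1}},
    1: {EPS: {2, 4, 6}},
    2: {"b": {3}},
    3: {EPS: {1}},
    4: {"c": {5}},
    5: {EPS: {1}},
    6: {},
}
NFA_START = 0
NFA_ACCEPT = {6}


def epsilon_closure(states):
    """Return every NFA state reachable from `states` by epsilon moves alone."""
    stack = list(states)
    closure = set(states)
    while stack:
        s = stack.pop()
        for t in NFA[s].get(EPS, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def move(states, symbol):
    """Return the NFA states reachable from `states` on one `symbol`."""
    out = set()
    for s in states:
        out |= NFA[s].get(symbol, set())
    return out


def subset_construction(alphabet):
    """Build a DFA whose states are frozensets of NFA states."""
    start = epsilon_closure({NFA_START})
    dfa = {}            # DFA state -> {symbol -> DFA state}
    accepting = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current in dfa:
            continue    # already expanded
        dfa[current] = {}
        if current & NFA_ACCEPT:
            accepting.add(current)
        for ch in alphabet:
            target = move(current, ch)
            if target:
                nxt = epsilon_closure(target)
                dfa[current][ch] = nxt
                queue.append(nxt)
    return start, dfa, accepting


def dfa_accepts(start, dfa, accepting, text):
    """Run the DFA over `text`; a missing transition means rejection."""
    state = start
    for ch in text:
        state = dfa[state].get(ch)
        if state is None:
            return False
    return state in accepting


start, dfa, accepting = subset_construction("abc")
print(len(dfa), "DFA states,", len(accepting), "accepting")
```

For this NFA the construction produces four DFA states, three of them accepting. Those three have identical outgoing transitions, so they are the kind of redundancy the minimization step is meant to merge.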

This section describes how lexical analyzer generators such as LEX work. LEX builds on the theory of regular expressions and finite automata (NFAs and DFAs). It first reads the regular expressions that define the tokens; the languages they describe are exactly those that finite automata can recognize. Each token's regular expression is translated into a corresponding non-deterministic finite automaton (NFA). The NFA is then converted into an equivalent deterministic finite automaton (DFA), and the DFA is minimized to reduce the number of states. Finally, LEX emits scanner code that is driven by the DFA's tables.
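
To make the last step concrete, here is a minimal sketch of what a scanner driven by DFA tables might look like. Everything in it is an assumption made for illustration: the table is written by hand rather than generated, the token names ID, NUM and WS are hypothetical, and characters are grouped into classes to keep the table small. The loop remembers the last accepting state it passed through, so it returns the longest match, which is the usual convention for generated scanners.

```python
def char_class(ch):
    """Map a character to a column of the transition table."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    if ch in " \t\n":
        return "space"
    return "other"


# Transition table: state -> {character class -> next state}
# State 0 is the start state; a missing entry means "no transition".
TRANSITIONS = {
    0: {"letter": 1, "digit": 2, "space": 3},
    1: {"letter": 1, "digit": 1},   # identifier
    2: {"digit": 2},                # integer literal
    3: {"space": 3},                # whitespace run
}

# Accepting state -> token name
ACCEPTING = {1: "ID", 2: "NUM", 3: "WS"}


def tokenize(text):
    """Split `text` into (token name, lexeme) pairs using the tables above."""
    tokens = []
    pos = 0
    while pos < len(text):
        state = 0
        last_accept = None          # (end offset, token name) of longest match
        i = pos
        while i < len(text):
            state = TRANSITIONS[state].get(char_class(text[i]))
            if state is None:
                break               # the DFA is stuck; fall back to last_accept
            i += 1
            if state in ACCEPTING:
                last_accept = (i, ACCEPTING[state])
        if last_accept is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        end, name = last_accept
        if name != "WS":            # whitespace is recognized but not reported
            tokens.append((name, text[pos:end]))
        pos = end
    return tokens


print(tokenize("x1 42 foo"))
```

The scanning loop itself never changes; only the tables do. That separation is why a generator can emit a fixed driver and fill in tables computed from the regular expressions.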