Chapter 4: Syntax Analysis

Limitations of regular languages

. How can language syntax be described precisely and conveniently? Can regular expressions be used?

. Many languages are not regular, for example, strings of balanced parentheses

- ((((...))))

- { (^i )^i | i ≥ 0 }

- There is no regular expression for this language

. A finite automaton may repeat states; however, it cannot remember the number of times it has visited a particular state

. A more powerful language is needed to describe a valid string of tokens
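
The point about repeated states can be illustrated with a small sketch. The function names below (make_bounded_dfa, dfa_accepts, counter_accepts) are hypothetical and chosen only for this example. A finite automaton with a fixed number of states can track nesting depth only up to some bound, whereas a recognizer with an unbounded counter, which is not a finite automaton, handles any depth.

```python
def make_bounded_dfa(max_depth):
    """Build a DFA over the alphabet {'(', ')'} that tracks depth 0..max_depth.

    States 0..max_depth record the current nesting depth; DEAD is a trap state
    entered on an unmatched ')' or when the depth would exceed max_depth.
    """
    dead = max_depth + 1
    delta = {}
    for depth in range(max_depth + 1):
        delta[(depth, "(")] = depth + 1 if depth < max_depth else dead
        delta[(depth, ")")] = depth - 1 if depth > 0 else dead
    delta[(dead, "(")] = dead
    delta[(dead, ")")] = dead
    return delta, 0, {0}  # transition table, start state, accepting states


def dfa_accepts(dfa, text):
    """Run the DFA on text (assumed to contain only '(' and ')')."""
    delta, state, accepting = dfa
    for ch in text:
        state = delta[(state, ch)]
    return state in accepting


def counter_accepts(text):
    """Recognize balanced parentheses with an unbounded integer counter."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' with no matching '('
                return False
        else:
            return False
    return depth == 0


dfa = make_bounded_dfa(3)
print(dfa_accepts(dfa, "((()))"))      # depth 3 fits within the DFA's states
print(dfa_accepts(dfa, "(((())))"))    # depth 4 exceeds them, so it is rejected
print(counter_accepts("(((())))"))     # the counter has no such bound
```

Whatever bound is chosen for max_depth, some balanced string nests one level deeper, so no single finite automaton accepts the whole language.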

Regular expressions cannot describe language syntax precisely and conveniently, because many languages are not regular. For example, consider the language consisting of all strings of balanced parentheses: there is no regular expression for it. Regular expressions cannot be used for syntax analysis (specification of the grammar) because:

. By the pumping lemma for regular languages, constructs such as strings of balanced parentheses, where there is no limit on the number of parentheses, cannot be represented by a regular expression. Most programming languages allow such constructs.

. The underlying reason is that a finite automaton may repeat states, but it has no way to remember how many times a state has been reached.

. Many programming languages have an inherently recursive structure that can be defined rather intuitively by context-free grammars (CFGs).

So a more powerful language is needed to describe valid strings of tokens.
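
As an illustration of how a context-free grammar captures this recursive structure, one common grammar for balanced parentheses is S -> ( S ) S | ε. This grammar is an assumption brought in for the example; it does not appear in the notes above. The sketch below is a minimal recursive recognizer for it, with hypothetical function names parse_s and is_balanced.

```python
def parse_s(text, pos=0):
    """Recognize S -> ( S ) S | epsilon starting at pos.

    Returns the position just after the recognized S, or raises SyntaxError
    when a '(' has no matching ')'.
    """
    if pos < len(text) and text[pos] == "(":
        pos = parse_s(text, pos + 1)  # the inner S
        if pos >= len(text) or text[pos] != ")":
            raise SyntaxError(f"expected ')' at position {pos}")
        return parse_s(text, pos + 1)  # the trailing S
    return pos  # epsilon: consume nothing


def is_balanced(text):
    """Return True when the whole of text is derivable from S."""
    try:
        return parse_s(text) == len(text)
    except SyntaxError:
        return False


print(is_balanced("(()())"))
print(is_balanced("(()"))
```

The recursion in parse_s mirrors the recursion in the grammar: the call stack plays the role of the memory that a finite automaton lacks. Because this sketch relies on Python's call stack, very long inputs can exceed the interpreter's recursion limit. A production parser would typically use an explicit stack or a table-driven method instead.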