snippetModerate
Does the language of Regular Expressions need a push down automata to parse it?
Viewed 0 times
theneedregularautomataexpressionslanguageparsepushdowndoes
Problem
I want to convert a user entered regular expression into an NFA so that I can then run the NFA against a string for matching purposes. What is the minimum machine that can be used to parse regular expresssions?
I assume it must be a push down automaton because the presense of brackets means the need to count and a DFA/NFA cannot perform arbitrary counting. Is this assumption correct? For example, the expression a(bc*)d would require a PDA so that the sub-expression in brackets is handled correctly.
I assume it must be a push down automaton because the presense of brackets means the need to count and a DFA/NFA cannot perform arbitrary counting. Is this assumption correct? For example, the expression a(bc*)d would require a PDA so that the sub-expression in brackets is handled correctly.
Solution
You are correct. It is easy to show that the syntax of regular expressions is not regular using standard techniques.
One possibility is to use a homomorphism (which $\mathrm{REG}$ is closed against) to get rid of all symbols but the parentheses, which leaves you with the Dyck language which is well-known to be non-regular. If in doubt, use the Pumping lemma on $(^p)^p$.
That said, you probably do not want to code a PDA by hand. Consider using a parser generator like ANTLR or byacc. If, on the other hand, you want to investigate the parsing of languages by programming parsers yourself, you should continue with other basic parsing algorithms such as CYK, Earley, recursive descent and LR.
One possibility is to use a homomorphism (which $\mathrm{REG}$ is closed against) to get rid of all symbols but the parentheses, which leaves you with the Dyck language which is well-known to be non-regular. If in doubt, use the Pumping lemma on $(^p)^p$.
That said, you probably do not want to code a PDA by hand. Consider using a parser generator like ANTLR or byacc. If, on the other hand, you want to investigate the parsing of languages by programming parsers yourself, you should continue with other basic parsing algorithms such as CYK, Earley, recursive descent and LR.
Context
StackExchange Computer Science Q#1939, answer score: 10
Revisions (0)
No revisions yet.