HiveBrain v1.2.0
Get Started
← Back to all entries
snippetModerate

Does the language of Regular Expressions need a push down automata to parse it?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
theneedregularautomataexpressionslanguageparsepushdowndoes

Problem

I want to convert a user entered regular expression into an NFA so that I can then run the NFA against a string for matching purposes. What is the minimum machine that can be used to parse regular expresssions?

I assume it must be a push down automaton because the presense of brackets means the need to count and a DFA/NFA cannot perform arbitrary counting. Is this assumption correct? For example, the expression a(bc*)d would require a PDA so that the sub-expression in brackets is handled correctly.

Solution

You are correct. It is easy to show that the syntax of regular expressions is not regular using standard techniques.

One possibility is to use a homomorphism (which $\mathrm{REG}$ is closed against) to get rid of all symbols but the parentheses, which leaves you with the Dyck language which is well-known to be non-regular. If in doubt, use the Pumping lemma on $(^p)^p$.

That said, you probably do not want to code a PDA by hand. Consider using a parser generator like ANTLR or byacc. If, on the other hand, you want to investigate the parsing of languages by programming parsers yourself, you should continue with other basic parsing algorithms such as CYK, Earley, recursive descent and LR.

Context

StackExchange Computer Science Q#1939, answer score: 10

Revisions (0)

No revisions yet.