HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Recursive descent parser with backtracking for the grammar $S \rightarrow aSa\ |\ aa$

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
theparserwithrecursiveasagrammarforrightarrowdescentbacktracking

Problem

Can someone enlighten me why a recursive descent parser with backtracking that tries the productions $S \rightarrow aSa$ and $S \rightarrow aa$ (in that order) does not recognize the language formed by the grammar $S \rightarrow aSa\ |\ aa$.

It appears to only parse words from the language $\{a^{2^n}\ |\ n \ge 1 \}$.

I generated such a parser using this ABNF Parser Generator with the production rule S = "a" S "a" / "aa" and the parser does not recognize the word aaaaaa, for example.

I would expect it to use the production $S \rightarrow aSa$ until the concatenation of the parse tree's terminal nodes from the left starts with 7 a's, and then go up the parse tree choosing the production $S \rightarrow aa$ instead until the tree looks like this:

S 
 / | \
a  S  a
 / | \
a  S  a
  / \
 a   a

Solution

This is not much of an answer, but the parse trees do not fit the
normal comments.

Your grammar $S \rightarrow aSa\ |\ aa$ should parse the string
$aaaaaa$.

But the parse tree has the following form:

S 
     /|\
    / S \
   / /|\ \
  / / S \ \
 / / / \ \ \
a a a   a a a


or if you prefer this presentation, with the terminals on different
lines

S 
   / | \
  a  S  a
   / | \
  a  S  a
    / \
   a   a


I checked that the ABNF parser generator does not seem to work, but I
do not know how to trace what it does.

It indeed seems to recongnize the set $\{a^{2^n}\ |\ n \ge 1 \}$ wich
is not what the grammar defines.

It is a bit surprising to have such an elaborate site around a buggy parser,
which furthermore uses a totally uninteresting parsing technique.

After a further look at it:

I think I found one source of the problem. The square brackets mean
optional.

So your grammar should be written either
S = "a" S "a" / "aa"
or S = "a" [S] "a". Then it seems to work correctly.

But the system is apparently lost when having twice the same rule in
different forms. I am not sure why.

I did not find a page explaining these syntactic issues for specifying
the grammar.

I still consider that buggy.

Code Snippets

S 
     /|\
    / S \
   / /|\ \
  / / S \ \
 / / / \ \ \
a a a   a a a
S 
   / | \
  a  S  a
   / | \
  a  S  a
    / \
   a   a

Context

StackExchange Computer Science Q#32749, answer score: 6

Revisions (0)

No revisions yet.