HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Arithmetic expressions grammar transformation

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
transformationgrammarexpressionsarithmetic

Problem

In the article Parsing Expressions by Recursive Descent by Theodore Norvell (1999) the author starts with the following grammar for arithmetic expressions:

E --> E "+" E | E "-" E | "-" E | E "*" E | E "/" E | E "^" E | "(" E ")" | v


which is quite bad, because it's ambiguous and left-recursive. So he starts from removing the left recursion from it, and his result is as such:

E --> P {B P}
P --> v | "(" E ")" | U P
B --> "+" | "-" | "*" | "/" | "^"
U --> "-"


But I can't figure out how did he get to this result. When I try to remove the left recursion myself, I'm doing it the following way:

-
Firs, I group together the productions which doesn't have left recursion in one group, and other (left-recursive) in another group:

E --> E "+" E | E "-" E | E "*" E | E "/" E | E "^" E     // L-recursive
E --> v | "(" E ")" | "-" E


-
Next, I name them and factor for easier manipulations:

E --> E B E  // L-recursive; B stands for "Binary operator"
E --> P  // not L-recursive; P stands for "Primary Expression"
P --> v | "(" E ")" | U E   // U stands for "Unary operator"
B --> "+" | "-" | "*" | "/" | "^"
P --> "-"


Now I need to deal only with the first two productions, which are now easier to deal with.

-
I rewrite those first two productions by starting from the non-L-recursive production (which is simply P, the Primary expression) and following it by the optional Tail T, which I define as the rest of the original production less the first left-recursive nonterminal (that is, just B E) followed by the Tail T, or which could be empty:

E --> P T
T --> B E T |


(note the empty alternative for the tail).

-
These two productions I can now rewrite in EBNF like this:

E --> P {B E}


which is nearly what the author get, but I have E instead of P there inside the zero-or-more repetition pattern (the Tail). The other productions I get quite the same as he have got:

`P --> v | "(" E ")" | U E
B -> "+" | "-" | "*" |

Solution

The missing step:

E --> P T
T --> B E T |


rewrite E in T:

E --> P T
T --> B P T T |


Simplify T:

E --> P T
T --> B P T |


Equivalent to:

E --> P T
T --> {B P}


And there you are.

Code Snippets

E --> P T
T --> B E T |
E --> P T
T --> B P T T |
E --> P T
T --> B P T |
E --> P T
T --> {B P}

Context

StackExchange Computer Science Q#3053, answer score: 8

Revisions (0)

No revisions yet.