patternModerate
Correct name for a recursive descent parser that uses loops to handle left recursion?
Viewed 0 times
leftrecursionhandleparserloopsusesrecursivenameforcorrect
Problem
This grammar is left recursive:
So in theory, recursive descent won't work. But by exploiting the properties of the grammar that each left-recursive rule corresponds to a specific precedence level, and that lookahead of a single token is enough to choose the correct production, the left-recursive rules can be individually parsed with while loops.
For example, to parse the AdditionExpression non-terminal, this pseudocode suffices:
What is the correct name for this type of parser? This informative article only refers to it as the "Classic Solution": https://www.engr.mun.ca/~theo/Misc/exp_parsing.htm
There must be a proper name for this type of parser.
Expression ::= AdditionExpression
AdditionExpression ::=
MultiplicationExpression
| AdditionExpression '+' MultiplicationExpression
| AdditionExpression '-' MultiplicationExpression
MultiplicationExpression ::=
Term
| MultiplicationExpression '*' Term
| MultiplicationExpression '/' Term
Term ::=
Number
| '(' AdditionExpression ')'
Number ::=
[+-]?[0-9]+(\.[0-9]+)?So in theory, recursive descent won't work. But by exploiting the properties of the grammar that each left-recursive rule corresponds to a specific precedence level, and that lookahead of a single token is enough to choose the correct production, the left-recursive rules can be individually parsed with while loops.
For example, to parse the AdditionExpression non-terminal, this pseudocode suffices:
function parse_addition_expression() {
num = parse_multiplication_expression()
while (has_token()) {
get_token()
if (current_token == PLUS)
num += parse_multiplication_expression()
else if (current_token == MINUS)
num -= parse_multiplication_expression()
else {
unget_token()
return num
}
}
return num
}What is the correct name for this type of parser? This informative article only refers to it as the "Classic Solution": https://www.engr.mun.ca/~theo/Misc/exp_parsing.htm
There must be a proper name for this type of parser.
Solution
It is just an LL(1) parser implemented with recursive descent.
Starts with:
apply left-recursion removal to get an LL(1) grammar:
write the corresponding functions:
remove tail recursion:
inline:
and you have just to add the semantic processing to get your function.
Starts with:
AdditionExpression ::=
MultiplicationExpression
| AdditionExpression '+' MultiplicationExpression
| AdditionExpression '-' MultiplicationExpressionapply left-recursion removal to get an LL(1) grammar:
AdditionExpression ::=
MultiplicationExpression AdditionExpressionTail
AdditionExpressionTail ::=
| '+' MultiplicationExpression AdditionExpressionTail
| '-' MultiplicationExpression AdditionExpressionTailwrite the corresponding functions:
function parse_AdditionExpression() {
parse_MultiplicationExpression()
parse_AdditionExpressionTail()
}
function parse_AdditionExpressionTail() {
if (has_token()) {
get_token()
if (current_token == PLUS) {
parse_MultiplicationExpression()
parse_AdditionExpressionTail()
} else if (current_token == MINUS) {
parse_MultiplicationExpression()
parse_AdditionExpressionTail()
} else {
unget_token()
}
}
}remove tail recursion:
function parse_AdditionExpressionTail() {
while (has_token()) {
get_token()
if (current_token == PLUS)
parse_MultiplicationExpression()
else if (current_token == MINUS)
parse_MultiplicationExpression()
else {
unget_token()
return
}
}
}inline:
function parse_AdditionExpression() {
parse_MultiplicationExpression()
while (has_token()) {
get_token()
if (current_token == PLUS)
parse_MultiplicationExpression()
else if (current_token == MINUS)
parse_MultiplicationExpression()
else {
unget_token()
return
}
}
}and you have just to add the semantic processing to get your function.
Code Snippets
AdditionExpression ::=
MultiplicationExpression
| AdditionExpression '+' MultiplicationExpression
| AdditionExpression '-' MultiplicationExpressionAdditionExpression ::=
MultiplicationExpression AdditionExpressionTail
AdditionExpressionTail ::=
| '+' MultiplicationExpression AdditionExpressionTail
| '-' MultiplicationExpression AdditionExpressionTailfunction parse_AdditionExpression() {
parse_MultiplicationExpression()
parse_AdditionExpressionTail()
}
function parse_AdditionExpressionTail() {
if (has_token()) {
get_token()
if (current_token == PLUS) {
parse_MultiplicationExpression()
parse_AdditionExpressionTail()
} else if (current_token == MINUS) {
parse_MultiplicationExpression()
parse_AdditionExpressionTail()
} else {
unget_token()
}
}
}function parse_AdditionExpressionTail() {
while (has_token()) {
get_token()
if (current_token == PLUS)
parse_MultiplicationExpression()
else if (current_token == MINUS)
parse_MultiplicationExpression()
else {
unget_token()
return
}
}
}function parse_AdditionExpression() {
parse_MultiplicationExpression()
while (has_token()) {
get_token()
if (current_token == PLUS)
parse_MultiplicationExpression()
else if (current_token == MINUS)
parse_MultiplicationExpression()
else {
unget_token()
return
}
}
}Context
StackExchange Computer Science Q#74460, answer score: 11
Revisions (0)
No revisions yet.