patternjavaMinor
Numeric expression parser - calculator
Viewed 0 times
expressionparsercalculatornumeric
Problem
In particular, if someone could describe a better way to go from the tokenized list to the Expression tree, it would be super helpful. I would like to get rid of the casting in the parser but am not sure how.
Full code here
```
package org.bws.calc;
import java.util.ArrayList;
import org.bws.calc.exception.ParseException;
import org.bws.calc.expression.Expression;
import org.bws.calc.expression.ValueExpression;
import org.bws.calc.tokens.OpToken;
import org.bws.calc.tokens.Token;
public class SimpleParser implements Parsable{
public Expression parse(ArrayList tokens){
if(tokens.size() == 1){
Token t = tokens.get(0);
try{
int intVal = Integer.parseInt(t.getValue());
return new ValueExpression(intVal);
}catch(NumberFormatException nfe){
throw new ParseException("expected int but found: "+t.getValue());
}
}
Expression left;
Expression right;
int index=0;
Token firstToken = tokens.get(index++);
if("(".equals(firstToken.getValue())){
int openParens =1;
while(openParens > 0 ){
if(index>tokens.size()-1){throw new ParseException("Missing right Parenthesis");}
Token token = tokens.get(index);
if("(".equals(token.getValue())){ ++openParens;}
if(")".equals(token.getValue())){ --openParens;}
++index;
}
if(index == tokens.size()){
return parse(new ArrayList(tokens.subList(1, index-1) ) );
}
left = parse(new ArrayList(tokens.subList(1,index-1)));
}else{
int tokenVal = Integer.parseInt(firstToken.getValue());
left = new ValueExpression(tokenVal);
}
Token op = tokens.get(index);
if(! (op instanceof OpToken) ){
throw new ParseException("Invalid Syntax. expected Operator but found:"+
Full code here
```
package org.bws.calc;
import java.util.ArrayList;
import org.bws.calc.exception.ParseException;
import org.bws.calc.expression.Expression;
import org.bws.calc.expression.ValueExpression;
import org.bws.calc.tokens.OpToken;
import org.bws.calc.tokens.Token;
public class SimpleParser implements Parsable{
public Expression parse(ArrayList tokens){
if(tokens.size() == 1){
Token t = tokens.get(0);
try{
int intVal = Integer.parseInt(t.getValue());
return new ValueExpression(intVal);
}catch(NumberFormatException nfe){
throw new ParseException("expected int but found: "+t.getValue());
}
}
Expression left;
Expression right;
int index=0;
Token firstToken = tokens.get(index++);
if("(".equals(firstToken.getValue())){
int openParens =1;
while(openParens > 0 ){
if(index>tokens.size()-1){throw new ParseException("Missing right Parenthesis");}
Token token = tokens.get(index);
if("(".equals(token.getValue())){ ++openParens;}
if(")".equals(token.getValue())){ --openParens;}
++index;
}
if(index == tokens.size()){
return parse(new ArrayList(tokens.subList(1, index-1) ) );
}
left = parse(new ArrayList(tokens.subList(1,index-1)));
}else{
int tokenVal = Integer.parseInt(firstToken.getValue());
left = new ValueExpression(tokenVal);
}
Token op = tokens.get(index);
if(! (op instanceof OpToken) ){
throw new ParseException("Invalid Syntax. expected Operator but found:"+
Solution
Before I begin, let me say, it looks like a well engineered project. You've separated the work into a tokenizer, parser, and calculator, which is excellent. The unit tests were helpful as well. Congratulations.
Before you begin to write a parser, you should define the grammar, which you should state as a long comment in your parser's JavaDoc. From my interpretation of your code, I see the following productions:
You don't support unary minus (negative numbers). That's OK; we'll keep it simple for now.
It would be a good idea to add the following unit test:
If you had defined the grammar carelessly, such as the following…
… then it would be ambiguous whether
The main problem is your handling of parentheses. Scanning the entire token list in advance to check for matching parentheses is inelegant; you should be able to detect such syntax errors as part of your parsing routine. Also, having four if-statements checking for opening/closing parentheses is a sign of confusion.
To enforce some structure to your parser, you should have a helper function for each production listed above.
Instead of taking
Before you begin to write a parser, you should define the grammar, which you should state as a long comment in your parser's JavaDoc. From my interpretation of your code, I see the following productions:
EXPRESSION = BINARY_EXPRESSION |
TERM
BINARY_EXPRESSION = EXPRESSION BINARY_OPERATOR TERM
BINARY_OPERATOR = "+" | "-"
TERM = VALUE_EXPRESSION |
PARENTHESIZED_EXPRESSION
VALUE_EXPRESSION = number
PARENTHESIZED_EXPRESSION = "(" EXPRESSION ")"You don't support unary minus (negative numbers). That's OK; we'll keep it simple for now.
It would be a good idea to add the following unit test:
@Test
public void evaluateThreeOperatorWithNoParens(){
int result = c.evaluate("5-2+1");
Assert.assertEquals(4,result);
}If you had defined the grammar carelessly, such as the following…
EXPRESSION = VALUE_EXPRESSION |
BINARY_EXPRESSION |
PARENTHESIZED_EXPRESSION
VALUE_EXPRESSION = number
BINARY_EXPRESSION = EXPRESSION BINARY_OPERATOR EXPRESSION
BINARY_OPERATOR = "+" | "-"
PARENTHESIZED_EXPRESSION = "(" EXPRESSION ")"… then it would be ambiguous whether
"5-2+1" should be interpreted as (5-2)+1 or 5-(2+1). The evaluateThreeOperatorWithNoParens() test enforces the former interpretation.The main problem is your handling of parentheses. Scanning the entire token list in advance to check for matching parentheses is inelegant; you should be able to detect such syntax errors as part of your parsing routine. Also, having four if-statements checking for opening/closing parentheses is a sign of confusion.
To enforce some structure to your parser, you should have a helper function for each production listed above.
Instead of taking
.subList()s, I think it would be more elegant to use a ListIterator to keep track of your position. Also, there's no reason why your parser has to take an ArrayList, when any List would be acceptable.Parsable is an odd name for the interface. I'd consider a string to be parsable. Therefore, I suggest breaking with the "-able" suffix convention and just call it a Parser. (Plenty of interfaces in the standard Java API aren't named with an "-able" suffix either.)/**
* (JavaDoc describing the grammar goes here.)
*/
public class SimpleParser implements Parser {
public Expression parse(List tokens) {
ListIterator tokenIter = tokens.listIterator();
Expression expr = parseExpression(tokenIter);
if (tokenIter.hasNext()) {
throw new ParseException("Extra text after expression: " + tokenIter.next().getValue());
}
return expr;
}
private static Expression parseExpression(ListIterator tokenIter) {
if (!tokenIter.hasNext()) {
throw new ParseException("Premature end of expression");
}
Expression expr = parseTerm(tokenIter);
while (tokenIter.hasNext()) {
Token op = tokenIter.next();
if (op instanceof OpToken) {
expr = parseBinaryExpression(expr, (OpToken)op, tokenIter);
} else {
tokenIter.previous();
break;
}
}
return expr;
}
private static Expression parseBinaryExpression(Expression leftExpr, OpToken op, ListIterator tokenIter) {
return op.toExpression(leftExpr, parseTerm(tokenIter));
}
private static Expression parseTerm(ListIterator tokenIter) {
Token t = tokenIter.next();
if ("(".equals(t.getValue())) {
return parseParenthesizedExpression(tokenIter);
} else {
return parseValueExpression(t);
}
}
private static Expression parseValueExpression(Token t) {
try {
int intVal = Integer.parseInt(t.getValue());
return new ValueExpression(intVal);
} catch (NumberFormatException nfe) {
throw new ParseException("expected int but found: " + t.getValue());
}
}
private static Expression parseParenthesizedExpression(ListIterator tokenIter) {
Expression innerExpr = parseExpression(tokenIter);
if (!tokenIter.hasNext() || !")".equals(tokenIter.next().getValue())) {
throw new ParseException("Missing right parenthesis");
}
return innerExpr;
}
}Code Snippets
EXPRESSION = BINARY_EXPRESSION |
TERM
BINARY_EXPRESSION = EXPRESSION BINARY_OPERATOR TERM
BINARY_OPERATOR = "+" | "-"
TERM = VALUE_EXPRESSION |
PARENTHESIZED_EXPRESSION
VALUE_EXPRESSION = number
PARENTHESIZED_EXPRESSION = "(" EXPRESSION ")"@Test
public void evaluateThreeOperatorWithNoParens(){
int result = c.evaluate("5-2+1");
Assert.assertEquals(4,result);
}EXPRESSION = VALUE_EXPRESSION |
BINARY_EXPRESSION |
PARENTHESIZED_EXPRESSION
VALUE_EXPRESSION = number
BINARY_EXPRESSION = EXPRESSION BINARY_OPERATOR EXPRESSION
BINARY_OPERATOR = "+" | "-"
PARENTHESIZED_EXPRESSION = "(" EXPRESSION ")"/**
* (JavaDoc describing the grammar goes here.)
*/
public class SimpleParser implements Parser {
public Expression parse(List<Token> tokens) {
ListIterator<Token> tokenIter = tokens.listIterator();
Expression expr = parseExpression(tokenIter);
if (tokenIter.hasNext()) {
throw new ParseException("Extra text after expression: " + tokenIter.next().getValue());
}
return expr;
}
private static Expression parseExpression(ListIterator<Token> tokenIter) {
if (!tokenIter.hasNext()) {
throw new ParseException("Premature end of expression");
}
Expression expr = parseTerm(tokenIter);
while (tokenIter.hasNext()) {
Token op = tokenIter.next();
if (op instanceof OpToken) {
expr = parseBinaryExpression(expr, (OpToken)op, tokenIter);
} else {
tokenIter.previous();
break;
}
}
return expr;
}
private static Expression parseBinaryExpression(Expression leftExpr, OpToken op, ListIterator<Token> tokenIter) {
return op.toExpression(leftExpr, parseTerm(tokenIter));
}
private static Expression parseTerm(ListIterator<Token> tokenIter) {
Token t = tokenIter.next();
if ("(".equals(t.getValue())) {
return parseParenthesizedExpression(tokenIter);
} else {
return parseValueExpression(t);
}
}
private static Expression parseValueExpression(Token t) {
try {
int intVal = Integer.parseInt(t.getValue());
return new ValueExpression(intVal);
} catch (NumberFormatException nfe) {
throw new ParseException("expected int but found: " + t.getValue());
}
}
private static Expression parseParenthesizedExpression(ListIterator<Token> tokenIter) {
Expression innerExpr = parseExpression(tokenIter);
if (!tokenIter.hasNext() || !")".equals(tokenIter.next().getValue())) {
throw new ParseException("Missing right parenthesis");
}
return innerExpr;
}
}Context
StackExchange Code Review Q#41769, answer score: 4
Revisions (0)
No revisions yet.