HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavaMinor

Numeric expression parser - calculator

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
expressionparsercalculatornumeric

Problem

In particular, if someone could describe a better way to go from the tokenized list to the Expression tree, it would be super helpful. I would like to get rid of the casting in the parser but am not sure how.

Full code here

```
package org.bws.calc;

import java.util.ArrayList;

import org.bws.calc.exception.ParseException;
import org.bws.calc.expression.Expression;
import org.bws.calc.expression.ValueExpression;
import org.bws.calc.tokens.OpToken;
import org.bws.calc.tokens.Token;

public class SimpleParser implements Parsable{

public Expression parse(ArrayList tokens){
if(tokens.size() == 1){
Token t = tokens.get(0);
try{
int intVal = Integer.parseInt(t.getValue());
return new ValueExpression(intVal);
}catch(NumberFormatException nfe){
throw new ParseException("expected int but found: "+t.getValue());
}
}

Expression left;
Expression right;

int index=0;
Token firstToken = tokens.get(index++);

if("(".equals(firstToken.getValue())){
int openParens =1;
while(openParens > 0 ){
if(index>tokens.size()-1){throw new ParseException("Missing right Parenthesis");}
Token token = tokens.get(index);
if("(".equals(token.getValue())){ ++openParens;}
if(")".equals(token.getValue())){ --openParens;}
++index;
}
if(index == tokens.size()){
return parse(new ArrayList(tokens.subList(1, index-1) ) );
}
left = parse(new ArrayList(tokens.subList(1,index-1)));

}else{
int tokenVal = Integer.parseInt(firstToken.getValue());
left = new ValueExpression(tokenVal);
}

Token op = tokens.get(index);
if(! (op instanceof OpToken) ){
throw new ParseException("Invalid Syntax. expected Operator but found:"+

Solution

Before I begin, let me say, it looks like a well engineered project. You've separated the work into a tokenizer, parser, and calculator, which is excellent. The unit tests were helpful as well. Congratulations.

Before you begin to write a parser, you should define the grammar, which you should state as a long comment in your parser's JavaDoc. From my interpretation of your code, I see the following productions:

EXPRESSION = BINARY_EXPRESSION |
             TERM

BINARY_EXPRESSION = EXPRESSION BINARY_OPERATOR TERM

BINARY_OPERATOR = "+" | "-"

TERM = VALUE_EXPRESSION |
       PARENTHESIZED_EXPRESSION

VALUE_EXPRESSION = number

PARENTHESIZED_EXPRESSION = "(" EXPRESSION ")"


You don't support unary minus (negative numbers). That's OK; we'll keep it simple for now.

It would be a good idea to add the following unit test:

@Test
public void evaluateThreeOperatorWithNoParens(){
        int result = c.evaluate("5-2+1");
        Assert.assertEquals(4,result);
}


If you had defined the grammar carelessly, such as the following…

EXPRESSION = VALUE_EXPRESSION |
             BINARY_EXPRESSION |
             PARENTHESIZED_EXPRESSION

VALUE_EXPRESSION = number

BINARY_EXPRESSION = EXPRESSION BINARY_OPERATOR EXPRESSION

BINARY_OPERATOR = "+" | "-"

PARENTHESIZED_EXPRESSION = "(" EXPRESSION ")"


… then it would be ambiguous whether "5-2+1" should be interpreted as (5-2)+1 or 5-(2+1). The evaluateThreeOperatorWithNoParens() test enforces the former interpretation.

The main problem is your handling of parentheses. Scanning the entire token list in advance to check for matching parentheses is inelegant; you should be able to detect such syntax errors as part of your parsing routine. Also, having four if-statements checking for opening/closing parentheses is a sign of confusion.

To enforce some structure to your parser, you should have a helper function for each production listed above.

Instead of taking .subList()s, I think it would be more elegant to use a ListIterator to keep track of your position. Also, there's no reason why your parser has to take an ArrayList, when any List would be acceptable.

Parsable is an odd name for the interface. I'd consider a string to be parsable. Therefore, I suggest breaking with the "-able" suffix convention and just call it a Parser. (Plenty of interfaces in the standard Java API aren't named with an "-able" suffix either.)

/**
 * (JavaDoc describing the grammar goes here.)
 */
public class SimpleParser implements Parser {

    public Expression parse(List tokens) {
        ListIterator tokenIter = tokens.listIterator();
        Expression expr = parseExpression(tokenIter);
        if (tokenIter.hasNext()) {
            throw new ParseException("Extra text after expression: " + tokenIter.next().getValue());
        }
        return expr;
    }

    private static Expression parseExpression(ListIterator tokenIter) {
        if (!tokenIter.hasNext()) {
            throw new ParseException("Premature end of expression");
        }

        Expression expr = parseTerm(tokenIter);
        while (tokenIter.hasNext()) {
            Token op = tokenIter.next();
            if (op instanceof OpToken) {
                expr = parseBinaryExpression(expr, (OpToken)op, tokenIter);
            } else {
                tokenIter.previous();
                break;
            }
        }
        return expr;
    }

    private static Expression parseBinaryExpression(Expression leftExpr, OpToken op, ListIterator tokenIter) {
        return op.toExpression(leftExpr, parseTerm(tokenIter));
    }

    private static Expression parseTerm(ListIterator tokenIter) {
        Token t = tokenIter.next();
        if ("(".equals(t.getValue())) {
            return parseParenthesizedExpression(tokenIter);
        } else {
            return parseValueExpression(t);
        }
    }

    private static Expression parseValueExpression(Token t) {
        try {
            int intVal = Integer.parseInt(t.getValue());
            return new ValueExpression(intVal);
        } catch (NumberFormatException nfe) {
            throw new ParseException("expected int but found: " + t.getValue());
        }
    }

    private static Expression parseParenthesizedExpression(ListIterator tokenIter) {
        Expression innerExpr = parseExpression(tokenIter);
        if (!tokenIter.hasNext() || !")".equals(tokenIter.next().getValue())) {
            throw new ParseException("Missing right parenthesis");
        }
        return innerExpr;
    }
}

Code Snippets

EXPRESSION = BINARY_EXPRESSION |
             TERM

BINARY_EXPRESSION = EXPRESSION BINARY_OPERATOR TERM

BINARY_OPERATOR = "+" | "-"

TERM = VALUE_EXPRESSION |
       PARENTHESIZED_EXPRESSION

VALUE_EXPRESSION = number

PARENTHESIZED_EXPRESSION = "(" EXPRESSION ")"
@Test
public void evaluateThreeOperatorWithNoParens(){
        int result = c.evaluate("5-2+1");
        Assert.assertEquals(4,result);
}
EXPRESSION = VALUE_EXPRESSION |
             BINARY_EXPRESSION |
             PARENTHESIZED_EXPRESSION

VALUE_EXPRESSION = number

BINARY_EXPRESSION = EXPRESSION BINARY_OPERATOR EXPRESSION

BINARY_OPERATOR = "+" | "-"

PARENTHESIZED_EXPRESSION = "(" EXPRESSION ")"
/**
 * (JavaDoc describing the grammar goes here.)
 */
public class SimpleParser implements Parser {

    public Expression parse(List<Token> tokens) {
        ListIterator<Token> tokenIter = tokens.listIterator();
        Expression expr = parseExpression(tokenIter);
        if (tokenIter.hasNext()) {
            throw new ParseException("Extra text after expression: " + tokenIter.next().getValue());
        }
        return expr;
    }

    private static Expression parseExpression(ListIterator<Token> tokenIter) {
        if (!tokenIter.hasNext()) {
            throw new ParseException("Premature end of expression");
        }

        Expression expr = parseTerm(tokenIter);
        while (tokenIter.hasNext()) {
            Token op = tokenIter.next();
            if (op instanceof OpToken) {
                expr = parseBinaryExpression(expr, (OpToken)op, tokenIter);
            } else {
                tokenIter.previous();
                break;
            }
        }
        return expr;
    }

    private static Expression parseBinaryExpression(Expression leftExpr, OpToken op, ListIterator<Token> tokenIter) {
        return op.toExpression(leftExpr, parseTerm(tokenIter));
    }

    private static Expression parseTerm(ListIterator<Token> tokenIter) {
        Token t = tokenIter.next();
        if ("(".equals(t.getValue())) {
            return parseParenthesizedExpression(tokenIter);
        } else {
            return parseValueExpression(t);
        }
    }

    private static Expression parseValueExpression(Token t) {
        try {
            int intVal = Integer.parseInt(t.getValue());
            return new ValueExpression(intVal);
        } catch (NumberFormatException nfe) {
            throw new ParseException("expected int but found: " + t.getValue());
        }
    }

    private static Expression parseParenthesizedExpression(ListIterator<Token> tokenIter) {
        Expression innerExpr = parseExpression(tokenIter);
        if (!tokenIter.hasNext() || !")".equals(tokenIter.next().getValue())) {
            throw new ParseException("Missing right parenthesis");
        }
        return innerExpr;
    }
}

Context

StackExchange Code Review Q#41769, answer score: 4

Revisions (0)

No revisions yet.