patterncsharpModerate
Math expression parser in C#
Viewed 0 times
expressionparsermath
Problem
EDIT: Added refactored version 2.0 to the end!
I have a parser that is supposed to take a string similar to a math expression and return a Tuple of:
I have honestly no idea how Buffer like readers are usually implemented so I'm open to suggestions on how to make my class a little less messy (I particularly don't like the fact I'm using
```
using System;
using System.Collections.Generic;
using System.Text;
namespace MathExpressionSolver.Parser
{
class ExpressionParser
{
const int TKNLGHT = 4;
public bool SkipInvalidChars { get; set; } = true;
private StringBuilder charBuffer;
private int bufferPointer = 0;
private List listOfParsedExpressions;
private List listOfParsedTypes;
private string stringExpression;
public string StringExpression
{
set
{
listOfParsedExpressions.Clear();
listOfParsedExpressions.Capacity = stringExpression.Length / TKNLGHT;
listOfParsedTypes.Clear();
listOfParsedTypes.Capacity = stringExpression.Length / TKNLGHT;
stringExpression = value;
}
}
public ExpressionParser()
{
charBuffer = new StringBuilder(TKNLGHT);
listOfParsedExpressions = new List();
listOfParsedTypes = new List();
stringExpression = string.Empty;
}
public ExpressionParser(string expression) : this()
{
StringExpression = expression;
}
public Tuple ParseExpression()
I have a parser that is supposed to take a string similar to a math expression and return a Tuple of:
- An array of strings where each string corresponds to one 'element' (number, operator, func name).
- An array of custom enum indicating whether corresponding string in array is Name, Number, etc.
I have honestly no idea how Buffer like readers are usually implemented so I'm open to suggestions on how to make my class a little less messy (I particularly don't like the fact I'm using
bufferPointer in five different functions. The way I'm testing it I'm not out of my array in two places also doesn't seem to me as a good practice (though I couldn't find any good solution)).```
using System;
using System.Collections.Generic;
using System.Text;
namespace MathExpressionSolver.Parser
{
class ExpressionParser
{
const int TKNLGHT = 4;
public bool SkipInvalidChars { get; set; } = true;
private StringBuilder charBuffer;
private int bufferPointer = 0;
private List listOfParsedExpressions;
private List listOfParsedTypes;
private string stringExpression;
public string StringExpression
{
set
{
listOfParsedExpressions.Clear();
listOfParsedExpressions.Capacity = stringExpression.Length / TKNLGHT;
listOfParsedTypes.Clear();
listOfParsedTypes.Capacity = stringExpression.Length / TKNLGHT;
stringExpression = value;
}
}
public ExpressionParser()
{
charBuffer = new StringBuilder(TKNLGHT);
listOfParsedExpressions = new List();
listOfParsedTypes = new List();
stringExpression = string.Empty;
}
public ExpressionParser(string expression) : this()
{
StringExpression = expression;
}
public Tuple ParseExpression()
Solution
Naming and Style
Ternary expressions
You shouldn't use them like you do.
can be expressed as
which in turn is just
which is more readable.
Simplification
can be reduced to
ExpressionParser
-
Instead of incrementing the
-
Instead of returning a
-
in the currents state there is no advantage of having a
-
you should add a
Refactoring
I am not 100% happy with this, but it is a start
First we need a method to get the
as you can see, I have already added
Next we add a method to check if it is a possible compound expression (a name or num)
but, keeping in mind there could be the possibility (because the input is faulted), that a number would be followed by a name or opposite, we need to check if the lasttype is the same as the current.
I have added another value to the enum:
The
and the former
```
private void ParseExpression(String expression)
{
StringBuilder charBuffer = new StringBuilder(expression.Length);
listOfParsedExpressions.Clear();
listOfParsedTypes.Clear();
Boolean lastExpressionWasCompound = false;
ParsedSubstringType lastType = ParsedSubstringType.Invalid;
foreach (Char token in expression)
{
ParsedSubstringType currentType = GetTokenType(token);
if (IsPossibleCompoundExpression(currentType, lastType))
{
if (lastType == ParsedSubstringType.NotSet || currentType != lastType)
{
listOfParsedExpressions.Add(charBuffer.ToString());
listOfParsedTypes.Add(lastType);
charBuffer.Clear();
}
charBuffer.Append(token);
lastExpressionWasCompound = true;
lastType = currentType;
continue;
}
if (lastExpressionWasCompound)
{
listOfParsedExpressions.Add(charBuffer.ToString());
listOfParsedTypes.Add(lastType);
charBuffer.Clear();
lastExpressionWasCompound = false;
}
if (currentType == ParsedSubstringType.Invalid && SkipInvalidChars)
{
continue;
}
listOfParsedTypes.Add(currentType);
String parsedExpression = currentType == Parsed
- Based on the naming guidlines method names should be using
PascalCasecasing.
- you should not prefix the variablenames with the datatype ->
listOfParsedExpressions
- variable names should be meaningful so you or Mr.Maintainer will understand them in 6 months also ->
const int TKNLGHT = 4;
- you should use braces
{}for singleifstatements also. This will make your code less errorprone.
Ternary expressions
You shouldn't use them like you do.
public static bool IsSeparator(char a)
{
return (a == ';') ? true : false;
}can be expressed as
public static bool IsSeparator(char a)
{
if (a == ';')
{
return true;
}
else
{
return false;
}
}which in turn is just
public static bool IsSeparator(char a)
{
return (a == ';');
}which is more readable.
Simplification
public static bool IsOperator (char a)
{
switch (a)
{
case '+':
return true;
case '-':
return true;
case '*':
return true;
case '/':
return true;
case '%':
return true;
case '=':
return true;
default:
return false;
}
}can be reduced to
public static bool IsOperator(char a)
{
return "+-*/=".Contains(a);
}ExpressionParser
-
Instead of incrementing the
bufferPointer variable in multiple places, you should consider to either di it at one place (one method) only or change the parseNextToken() method to take a char as input parameter, where I prefer the later. -
Instead of returning a
Tuple you should add 2 properties where the lists can be accessed. -
in the currents state there is no advantage of having a
IsLeftBracket and a IsRightBracket check in the parseNextToken() method. -
you should add a
ParsedSubstringType.InvalidToken for the result beeing more clear Refactoring
I am not 100% happy with this, but it is a start
First we need a method to get the
ParsedSubstringType of a given char private ParsedSubstringType GetTokenType(char c)
{
if (ParserHelper.IsNameChar(c))
{
return ParsedSubstringType.Name;
}
if (ParserHelper.IsNum(c))
{
return ParsedSubstringType.Num;
}
if (ParserHelper.IsWhiteSpace(c))
{
return ParsedSubstringType.WhiteSpace;
}
if (ParserHelper.IsLeftBracket(c) || ParserHelper.IsRightBracket(c))
{
return ParsedSubstringType.Bracket;
}
if (ParserHelper.IsOperator(c))
{
return ParsedSubstringType.Operator;
}
if (ParserHelper.IsSeparator(c))
{
return ParsedSubstringType.Separator;
}
return ParsedSubstringType.Invalid;
}as you can see, I have already added
Invalid to the enum. Next we add a method to check if it is a possible compound expression (a name or num)
private bool IsPossibleCompoundExpression(ParsedSubstringType currentType)
{
return (
currentType == ParsedSubstringType.Name ||
currentType == ParsedSubstringType.Num
);
}but, keeping in mind there could be the possibility (because the input is faulted), that a number would be followed by a name or opposite, we need to check if the lasttype is the same as the current.
I have added another value to the enum:
NotSet indicating especial the same The
StringBuilder charBuffer is now local to the method.and the former
parseExpression() method where I added an input parameter ```
private void ParseExpression(String expression)
{
StringBuilder charBuffer = new StringBuilder(expression.Length);
listOfParsedExpressions.Clear();
listOfParsedTypes.Clear();
Boolean lastExpressionWasCompound = false;
ParsedSubstringType lastType = ParsedSubstringType.Invalid;
foreach (Char token in expression)
{
ParsedSubstringType currentType = GetTokenType(token);
if (IsPossibleCompoundExpression(currentType, lastType))
{
if (lastType == ParsedSubstringType.NotSet || currentType != lastType)
{
listOfParsedExpressions.Add(charBuffer.ToString());
listOfParsedTypes.Add(lastType);
charBuffer.Clear();
}
charBuffer.Append(token);
lastExpressionWasCompound = true;
lastType = currentType;
continue;
}
if (lastExpressionWasCompound)
{
listOfParsedExpressions.Add(charBuffer.ToString());
listOfParsedTypes.Add(lastType);
charBuffer.Clear();
lastExpressionWasCompound = false;
}
if (currentType == ParsedSubstringType.Invalid && SkipInvalidChars)
{
continue;
}
listOfParsedTypes.Add(currentType);
String parsedExpression = currentType == Parsed
Code Snippets
public static bool IsSeparator(char a)
{
return (a == ';') ? true : false;
}public static bool IsSeparator(char a)
{
if (a == ';')
{
return true;
}
else
{
return false;
}
}public static bool IsSeparator(char a)
{
return (a == ';');
}public static bool IsOperator (char a)
{
switch (a)
{
case '+':
return true;
case '-':
return true;
case '*':
return true;
case '/':
return true;
case '%':
return true;
case '=':
return true;
default:
return false;
}
}public static bool IsOperator(char a)
{
return "+-*/=".Contains(a);
}Context
StackExchange Code Review Q#73308, answer score: 13
Revisions (0)
No revisions yet.