patternpythonMinor
Parsing mathematical expressions in Reverse Polish Notation
Viewed 0 times
polishreverseexpressionsparsingnotationmathematical
Problem
I've wanted to make a calculator for a long time now, so I wrote one of my first parsers, where the input is in reverse polish notation, \$1 \ 1 +\$ becomes \$2\$.
It supports to following operators;
While it's the best calculator I've written, the tokens are limited to only ones that take two operators. And I don't really like the idea of having a separate dictionary for ones that take more or less. This is a shame as now I can't add functions like
Personally I dislike the error you get when you
```
TOKENS = {
'+': lambda a, b: a + b,
'-': lambda a, b: a - b,
'': lambda a, b: a b,
'/': lambda a, b: a / b,
'^': lambda a, b: a ** b,
'': lambda a, b: a b,
'%': lambda a, b: a % b,
'mod': lambda a, b: a % b,
'//': lambda a, b: a // b
}
class CalculatorError(Exception):
pass
def make_pop(stack):
def pop():
try:
return stack.pop()
except IndexError:
raise CalculatorError('Not enough arguments for function')
return pop
def call():
stack = []
pop = make_pop(stack)
buf = []
calculation = iter(input('> ') + ' ')
for char in calculation:
if char not in ' \t':
buf.append(char)
continue
token = ''.join(buf)
if not token:
continue
if token in TOKENS:
a = pop()
b = pop()
function = TOKENS[token]
try:
return_value = function(b, a)
except ZeroDivisionError:
raise Calcul
It supports to following operators;
+, -, *, /, // floor division, ^ or ** exponent, and % or mod modulo.While it's the best calculator I've written, the tokens are limited to only ones that take two operators. And I don't really like the idea of having a separate dictionary for ones that take more or less. This is a shame as now I can't add functions like
sqrt or sin, or constants like pi or e, without having to have a different group for each.Personally I dislike the error you get when you
^C out of the program, and so I silenced it. I also made a new error CalculatorError to not mask unknown errors, to easily exit out of the current equation if more than one function is called, make_pop, and to have a human readable error message.```
TOKENS = {
'+': lambda a, b: a + b,
'-': lambda a, b: a - b,
'': lambda a, b: a b,
'/': lambda a, b: a / b,
'^': lambda a, b: a ** b,
'': lambda a, b: a b,
'%': lambda a, b: a % b,
'mod': lambda a, b: a % b,
'//': lambda a, b: a // b
}
class CalculatorError(Exception):
pass
def make_pop(stack):
def pop():
try:
return stack.pop()
except IndexError:
raise CalculatorError('Not enough arguments for function')
return pop
def call():
stack = []
pop = make_pop(stack)
buf = []
calculation = iter(input('> ') + ' ')
for char in calculation:
if char not in ' \t':
buf.append(char)
continue
token = ''.join(buf)
if not token:
continue
if token in TOKENS:
a = pop()
b = pop()
function = TOKENS[token]
try:
return_value = function(b, a)
except ZeroDivisionError:
raise Calcul
Solution
I'm not quite sure what you're asking for, but I was bored so I typed this up anyway. Hopefully it helps. ;-)
Working from the outside in:
Invocation
This is all fine, but it probably should be in the
Mostly, this is a matter of convention, but also, it's handy to have a function that can be run from another Python script that will just do everything.
Organizing the driver code
The process of evaluating an RPN expression is something that makes a lot of sense to put as its own function. That's a self-contained task you could easily want to reuse elsewhere. So I would suggest making a function
Evaluating
This new
Let the stack just be a stack
The only real purpose of
Do tokenization the easy way
Your input strings consist of tokens (numbers, operators, functions) separated by whitespace, right? There's a built-in method to split a whitespace-separated string:
Don't overuse variables
You don't really have to save
If different parts of this expression could raise the same type of error, then that would be a good reason to break it down, so that you could tell exactly where a given error comes from. But that's not an issue here. You'll only ever get
Same goes for the float-parsing part:
For complicated sequences of code, you have to make a subjective judgement about where to use variables and where not to in order to make the code as clear as possible. But in this case, I think most programmers would agree that the variables are fairly unnecessary.
Let output formatting be controlled by the formatting routines
Evaluating an RPN expression doesn't include formatting - in particular, deciding whether the result is an integer or not is a separate task from just getting the result in the first place. So at the end of
and it will automatically strip trailing zeros as needed. Of course, this does display large numbers in scientific notation (
Use descriptive variable names
Your
Allowing arbitrary numbers of arguments
This is really more about adding a feature than reviewing your existing code, but I might as well point you in the right direction. All you have to do i
Working from the outside in:
Invocation
if __name__ == '__main__':
print('Press ^C to exit')
try:
main()
except KeyboardInterrupt:
pass
# For some reason my Windows raises EOF on ^C...
except EOFError:
print()
print('Goodbye!')This is all fine, but it probably should be in the
main() function, and you just writeif __name__ == '__main__':
main()Mostly, this is a matter of convention, but also, it's handy to have a function that can be run from another Python script that will just do everything.
Organizing the driver code
The process of evaluating an RPN expression is something that makes a lot of sense to put as its own function. That's a self-contained task you could easily want to reuse elsewhere. So I would suggest making a function
evaluate_rpn(expr) which does exactly that. You can then use this function in your main loop as follows:def main():
print(...)
try:
while True:
try:
print(evaluate_rpn(input('> ')))
except CalculatorError as e:
print(e)
except KeyboardInterrupt:
# and so onEvaluating
This new
evaluate_rpn() function is going to do mostly the same thing as your call() function currently does, except that instead of prompting using input(), it will take a string argument, and instead of printing the result, it will return it. Let me pick on a few things in that function, though.Let the stack just be a stack
stack = []
pop = make_pop(stack)The only real purpose of
make_pop is to make it so that you get a CalculatorError instead of an IndexError when a function or operator doesn't have enough arguments. But I don't think it makes sense to do that. See, stack is just a stack. (Well, a list that is serving as a stack.) It knows nothing of numbers or arguments or calculators and really has no business raising a CalculatorError. The code in evaluate_rpn() is the thing that knows it is a calculator. That's where you should be raising CalculatorErrors. So instead of using make_pop, just watch for IndexErrors during the evaluation, catch them, and raise a CalculatorError in the handling code. As a side effect you won't need make_pop() anymore; you can just dispense with it entirely.try:
a = pop()
b = pop()
except IndexError:
raise CalculatorError(...)Do tokenization the easy way
Your input strings consist of tokens (numbers, operators, functions) separated by whitespace, right? There's a built-in method to split a whitespace-separated string:
string.split(). You don't need to bother with a buffer, or with iterating through the string's characters.for token in calculation.split():
# and so onDon't overuse variables
You don't really have to save
a and b and return_value separately; you only use them once each. if token in TOKENS:
try:
stack.append(TOKENS[token](stack.pop(), stack.pop())
except ZeroDivisionError:
raise CalculatorError(...)
except IndexError: # from a couple paragraphs ago
# and so onIf different parts of this expression could raise the same type of error, then that would be a good reason to break it down, so that you could tell exactly where a given error comes from. But that's not an issue here. You'll only ever get
ZeroDivisionErrors for one reason.Same goes for the float-parsing part:
else:
try:
stack.append(float(token))
except ValueError:
raise CalculatorError(...)For complicated sequences of code, you have to make a subjective judgement about where to use variables and where not to in order to make the code as clear as possible. But in this case, I think most programmers would agree that the variables are fairly unnecessary.
Let output formatting be controlled by the formatting routines
Evaluating an RPN expression doesn't include formatting - in particular, deciding whether the result is an integer or not is a separate task from just getting the result in the first place. So at the end of
evaluate_rpn, just return a float. When you print it, instead of print(result) you can doprint('{:g}'.format(result))and it will automatically strip trailing zeros as needed. Of course, this does display large numbers in scientific notation (
2.89146e+07); if that's a problem, then you might have to do your own formatting.Use descriptive variable names
Your
TOKENS list doesn't actually contain tokens; it contains operators, and could be generalized to contain functions. So I'd suggest naming it OPERATORS or some such thing.Allowing arbitrary numbers of arguments
This is really more about adding a feature than reviewing your existing code, but I might as well point you in the right direction. All you have to do i
Code Snippets
if __name__ == '__main__':
print('Press ^C to exit')
try:
main()
except KeyboardInterrupt:
pass
# For some reason my Windows raises EOF on ^C...
except EOFError:
print()
print('Goodbye!')if __name__ == '__main__':
main()def main():
print(...)
try:
while True:
try:
print(evaluate_rpn(input('> ')))
except CalculatorError as e:
print(e)
except KeyboardInterrupt:
# and so onstack = []
pop = make_pop(stack)try:
a = pop()
b = pop()
except IndexError:
raise CalculatorError(...)Context
StackExchange Code Review Q#106407, answer score: 5
Revisions (0)
No revisions yet.