Simple Parsing Calculator

Ali Erbay
5 min readAug 24, 2021

--

Photo by Mert Kahveci on Unsplash

A couple of months ago I was looking for some code challenges to solve and found a problem that seems to be straightforward.

Here it is,

Create a function that calculates and returns the value of an integer expression given as a space separated string.

calculate(“2 + 3”)

Support the 4 basic operators: addition (+), subtraction (-), multiplication (*) and division

calculate("3 * 2 + 1")

Support both positive and negative integers.

calculate("3 * -2 + 6")

The first and easiest solution to this problem could be using the built-in eval() function of Javascript. Yes, it parses the input string and runs as a Javascript code. But we don’t like eval() and it is considered dangerous to use anyway, so better not get accustomed to using it. We will discover some other options but first, let’s write some test code with Jest, I will do some assertions also covering some edge cases for the function input.

I will add some input sanitation to throw errors when some irregular inputs are provided, this is not necessary but I took it as a part of the challenge to handle edge cases.

Iterator Solution

This solution proposal consists of two nested for loops and solves the calculation string obeying mathematical operation precedence rules.

At the top, I check for input validation with an ugly regex expression which could have been done maybe a bit better, but it still works for our purposes. I also need to point out that it will hurt the performance of the function if we are putting large inputs with very long strings because Regex calls are generally expensive in the aspect of computational resources. Then I had to define possible math operands for our use in their precedence order.

As a student normally would do, the function first selects the highest operational precedence math operators which in this case Multiplication and Division, and searches through the input string from left to right(still obeying the operation precedence). Until it encounters a multiplication or division operand it continues to push original calculation string elements to an intermediate array called NyCalc, when it encounters a multiplication or division it makes the calculation with the previous element and the element after the operand and pushes the result to the NyCalc intermediate array.

2 + 4 * 4 / 2 - 1 + 3 * 3

becomes,

2 + 8 - 1 + 9

Now, the function’s outer loop will switch to the second operand couple, and reduce the original array from left to right to a single number result.

18

This implementation passes the tests, but looking for a better solution or proper solution led me to research parsers. I am not a computer science graduate so I just had a fairly weak understanding of the parser concept.

Parser Solution

The term parsing comes from Latin pars (orationis), meaning part (of speech). Within computational linguistics the term is used to refer to the formal analysis by a computer of a sentence or other string of words into its constituents, resulting in a parse tree showing their syntactic relation to each other. Our parser will read the input left to right and try to make connections between the elements according to our grammar rule.

The parser should create an Abstract Syntax Tree where each node of the tree denotes a mathematical operation such as addition, multiplication etc…

Abstract Syntax Tree example

In order to get this tree, we need to define grammar rules to follow when we parse through the input string. One of the solutions to recursive-descent parsing of expressions is to create a new nonterminal for each level of precedence. Following productions are needed to define the grammar rule ;

Pnum : Number
Pmuldiv : Pnum{('*'|'/')Pnum}
Paddsub : Pmuldiv{('+'|'-')Pmuldiv}

{ and } enclose parts of the productions that may be repeated 0 or more times, and | separates alternatives. According to these rules, we can formulate the parsing by applying a simple split() lexer to do the tokenization of the input and check input validity then process functions for precedence levels.

splitString function is the tokenizer and happens to do an input validation at the same time, this is not ideal since the separation of concerns should be applied here, we will deal with that later.

In the parse function definition, there are checkPos and jumpNext function definitions which let us move forward on the input string and return the actual token we are on. parseNum, parseAddSub, parseMulDiv functions help us to follow the input string tokens one by one and return objects containing the parse tree structure.

The parser will do a single left-to-right pass over checkPos with no backtracking, pos is the index of the next token, so we start at 0, we increment it as we go. jumpNext consumes one token moving the pos to the next one.

The entrance point of the parsing process starts at parseAddSub since it is the lowest precedence in our calculator model.

Let’s take a look at the abstract syntax tree again;

Abstract syntax tree

The integer values would appear at the leaf nodes, while the interior nodes represent the operators.
To evaluate the syntax tree, a recursive approach can be followed,

Recursive function for final calculation from the parse tree

Some improvements

If you are still here, we can make a simple improvement to get rid of that long RegEx expression to check the input. Here is what we are going to do, we will give our parser to check the input on the go. We will not try to check the whole input before even start parsing.

I just added isNumber function to do input checks in parseNumber function.

Adding a new operand

What would we need to do if we want to add Exponentiation to our calculator?

First, we need to re-evaluate our grammar rule then we need to add a new production at the correct precedence level. In the usual computer science jargon, exponentiation in mathematics is right-associative, we must reflect this in our production.

Pnum : Number
Pexp : Pnum{'^'Pexp}
Pmuldiv : Pexp{('*'|'/')Pexp}
Paddsub : Pmuldiv{('+'|'-')Pmuldiv}

Note that the left-associative and the right-associative operators are treated differently; left-associative operators are consumed in a loop, while right-associative operators are handled with right-recursive productions.

parseExp function is provided for the exponential operand. Now let’s update the final calculation recursive function.

Final Words

Here it is, we have written a parser that obeys grammar rules and constructs an abstract syntax tree with enough information to solve the calculation by a recursive function.

It has been great practice and opportunity for me to learn more about computer science concepts. My next goal would be to understand and read more about parsers and compilers in general. I just barely scratched the surface but everybody needs a place to start!

--

--

Ali Erbay
Ali Erbay

Written by Ali Erbay

Full Stack Developer BDD/TDD || Ruby on Rails || React || React Native || NodeJS || Express https://github.com/kermit-klein