Parsing Expressions of Interest

Part I : The Early Years


Most projects split the code that has to be written, between the time you read the spec and the time the fat lady sings, into two major parts: the boring code and the interesting code. If you are lucky there will be more of the latter than the former. I have been reasonably lucky, in that I have recently been working on a project which included quite a large amount of interesting code (unfortunately, the total amount of code to write was enormous). Much of it involved metadata and layers of abstraction and doesn't translate very well into other contexts, so I won't talk about that here.

However, one section of interest involved storing calculations in a database that would then be applied to figures entered in a spreadsheet control. Sure, the spreadsheet control could have been given predefined formulas in most cases, but the spec required that calculations be stored in the database and that, in some ways, simplified maintenance. All I had to do then was write a parser that would take a string representing a valid mathematical expression and resolve it to an answer. I used a simple context-based parser that evaluates strictly left to right, so 2 + 3 * 4 resolves to 20 rather than 14. I chose left-to-right evaluation for simplicity: implementing correct operator precedence was a little trickier, and we needed to get at least the basic parser up and running quickly.

The code for this simple parser is included in the downloadable zip file (22 KB); it is too long and complex to include here. The pseudocode looks a little like this:

  while there is still an unresolved portion of the expression
      if we have no left operand
          determine what the left operand is - store in LOp
      determine what the operator is - store in Op
      determine what the right operand is - store in ROp
      evaluate the simple expression LOp Op ROp
      store the result in the left operand
      clear the current operator
      clear the current right operand
  return the Left Operand as the result

To explain further, as each character is read from the string we check to see if it fits into the current context. If it does, we evaluate it in that context. If it does not, we clear the current context and let the character we have just read establish what the new context is. For example, if we are in the context of reading the left operand and we encounter a digit, then we keep the current context and add that digit to the left operand. If we encounter something which is not a digit (or decimal point and so on), then we clear the context and allow the character to determine what the context should be. If the character was a +, we would shift to operator context and add that character to the current operator. Because operators are only ever one character, we immediately clear the context, ready for the fresh evaluation of the next character. And so on. Once we have a left operand, an operator and a right operand, we perform a calculation.
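The character-by-character approach described above can be sketched in a few lines. This is only an illustration in Python, not the code from the zip file: the function name `evaluate_left_to_right`, the choice of four operators and the error handling are all assumptions made for the example. Like the parser described here, it keeps the operands as strings and ignores operator precedence.

```python
def evaluate_left_to_right(expression: str) -> float:
    """Resolve +, -, * and / strictly left to right (no precedence).

    Hypothetical sketch: operands are accumulated as text, one
    character at a time, and converted only when a calculation is due.
    """
    operations = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    left = ""   # left operand, held as a string
    op = ""     # current operator (only ever one character)
    right = ""  # right operand, held as a string

    for ch in expression:
        if ch.isspace():
            continue
        if ch.isdigit() or ch == ".":
            # Operand context: the character extends whichever
            # operand we are currently reading.
            if op:
                right += ch
            else:
                left += ch
        elif ch in operations:
            # Operator context.
            if not left:
                raise ValueError("operator found before any left operand")
            if op:
                if not right:
                    raise ValueError("two operators in a row")
                # We have LOp Op ROp: evaluate and store the result
                # back in the left operand, then clear the right operand.
                left = repr(operations[op](float(left), float(right)))
                right = ""
            op = ch
        else:
            raise ValueError(f"unexpected character: {ch!r}")

    if op:
        if not right:
            raise ValueError("expression ends with an operator")
        left = repr(operations[op](float(left), float(right)))
    if not left:
        raise ValueError("empty expression")
    return float(left)


print(evaluate_left_to_right("2 + 3 * 4"))   # left to right: (2 + 3) * 4
print(evaluate_left_to_right("10 / 4 - 1"))
```

Note that the sketch has no notion of a unary minus or of brackets; an expression such as "-3 + 2" is rejected rather than guessed at.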

As you can see, what we do is resolve the string from left to right. It works pretty well. However, because the function deals with its variables as strings, and returns a string, we ran into a problem with very small and very large numbers. When they were rendered as strings they would be rendered in scientific notation. When this was converted back to a number the exponent was ignored and we invariably ended up with a very silly number. For example, 0.0000000000124564 is a very small number that came out as 1.24564 because of the incorrect evaluation of the string '1.24564E-11'. It's understandable, but annoying. This is why the code forces a format when converting numbers to strings. We lose decimal places after a certain point, but it otherwise gives correct answers to that level of significance. The level of significance was deemed suitable for the application in question. You may want to adjust how that is done if you use the same code in another application.
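To make the problem concrete, here is a small Python sketch of the round trip. It is not the original code: `naive_leading_number` is a hypothetical stand-in for a conversion that stops reading at the exponent, and `to_fixed_string` with its default of 15 decimal places is an assumed example of forcing a format, not the setting used in the project.

```python
def naive_leading_number(text: str) -> float:
    """Read digits and '.' from the start of text and ignore the rest.

    Hypothetical stand-in for a string-to-number conversion that
    does not understand an exponent such as 'E-11'.
    """
    digits = ""
    for ch in text:
        if ch.isdigit() or ch == ".":
            digits += ch
        else:
            break
    return float(digits) if digits else 0.0


def to_fixed_string(value: float, places: int = 15) -> str:
    """Render value in plain decimal notation with a fixed number of
    decimal places, so no exponent ever appears in the string."""
    return format(value, f".{places}f")


tiny = 0.0000000000124564

as_text = str(tiny)                       # default rendering uses an exponent
print(as_text)
print(naive_leading_number(as_text))      # the exponent is lost here

fixed = to_fixed_string(tiny)             # forced fixed-point rendering
print(fixed)
print(naive_leading_number(fixed))        # small again, minus some digits
```

The trade-off is the one described above: anything beyond the chosen number of decimal places is rounded away, so the number of places needs to suit the application.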

