Binary Expression Trees


Expressions are made up of values, on which binary operations may be performed. Each node of a binary tree may have at most two children; therefore a simple binary expression can be represented as a two-level binary tree. The root node contains the operator, and the two children contain the two operands.



Recall that a tree node (doubleLinkNode) has the following instance variables:


   Private leftNode As doubleLinkNode
   Private info As Variant
   Private rightNode As doubleLinkNode

Because the info field is Variant, each node is capable of storing either an operand (numeric) or an operator (character).
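
As an illustration, the two-level tree for the simple expression 3 + 4 could be assembled by hand as sketched below. The function name buildSimpleTree is hypothetical; the sketch assumes only the doubleLinkNode class and its setInfo, setLeftNode, and setRightNode methods as listed in this section.

```vb
' Sketch: build the two-level tree for 3 + 4 by hand.
' buildSimpleTree is a hypothetical name used only for illustration.
Public Function buildSimpleTree() As doubleLinkNode
   Dim rootNode As New doubleLinkNode
   Dim leftOperand As New doubleLinkNode
   Dim rightOperand As New doubleLinkNode

   Call rootNode.setInfo("+")      ' the operator is stored as a character
   Call leftOperand.setInfo(3)     ' the operands are stored as numbers
   Call rightOperand.setInfo(4)

   ' The operator goes in the root; the operands become its children.
   Call rootNode.setLeftNode(leftOperand)
   Call rootNode.setRightNode(rightOperand)

   Set buildSimpleTree = rootNode
End Function
```

The same Variant field holds a string in the root and numbers in the leaves, which is why a single node class is enough for the whole tree.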


The various parts of a complicated expression have lower and higher precedence of evaluation. For instance, in the expression (A + B) * C, the part (A + B) is evaluated first. A binary tree can represent a complex expression by using the levels of the tree to indicate the precedence of evaluation.


To evaluate the tree, begin at the root. In the figure above the root contains the operator, *, so its children are examined to determine the two operands. The subtrees to the left and right of the root contain the two operands. Since the node to the left of the root contains another operator, -, it is apparent that the left subtree consists of an expression. The left subtree is a simple expression because both children are operands. The left subtree must be evaluated before the multiplication specified by the root is performed. This is true of the right subtree as well. From this example, it can be seen that operations at higher levels of the tree are evaluated later than those below them. The operation at the root of the tree will always be the last to be performed.

The value of the complete tree is equal to

<operand 1> <bin operator> <operand 2>

where <bin operator> is the binary operator in the root node, <operand 1> is the value of its left subtree, and <operand 2> is the value of its right subtree.

The value of a subtree can be determined as follows: 

 


   Definition of problem: Evaluate the expression represented by the binary tree.

   Size of problem: The entire tree. 

   Base case: If the node contains an operand, then return the value of
      the operand.

   General case: If the node contains an operator, then call the
      evaluateTree method on the left and right subtrees, and then apply the
      operator to the two values returned.  That is,

   evaluateTree = evaluateTree (left subtree) <bin operator> _ 
                             evaluateTree (right subtree)

 


   '--------------------------------------------------------------------------------------------
   ' Evaluates the expression represented by the binary expression
   ' tree pointed to by node.  A Variant is returned because the
   ' operands can be any valid numeric data type.
   '--------------------------------------------------------------------------------------------
   Public Function evaluateTree(ByRef node As doubleLinkNode) As Variant

      ' Base case: Node contains an operand
      If Not isOperator(node.getInfo( )) Then
            evaluateTree = Val(node.getInfo( ))
      Else
            ' General case: Node contains an operator. Call the method on
            ' the left and right subtrees, and then apply the operator to the
            ' two values returned.
            Select Case node.getInfo( )
                Case "+"
                   evaluateTree = evaluateTree(node.getLeftNode( )) + _
                                  evaluateTree(node.getRightNode( ))
                Case "-"
                   evaluateTree = evaluateTree(node.getLeftNode( )) - _
                                  evaluateTree(node.getRightNode( ))
                Case "*"
                   evaluateTree = evaluateTree(node.getLeftNode( )) * _
                                  evaluateTree(node.getRightNode( ))
                Case "/"
                   evaluateTree = evaluateTree(node.getLeftNode( )) / _
                                  evaluateTree(node.getRightNode( ))
                Case "%"
                   evaluateTree = evaluateTree(node.getLeftNode( )) Mod _
                                  evaluateTree(node.getRightNode( ))
                Case "^"
                   evaluateTree = evaluateTree(node.getLeftNode( )) ^ _
                                  evaluateTree(node.getRightNode( ))
            End Select
      End If
   End Function


   '---------------------------------------------------------------------------------
   ' Returns True if the character is an arithmetic operator
   '---------------------------------------------------------------------------------
   Private Function isOperator(ByVal character As String) As Boolean
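      ' getChar is not listed in this section; it is assumed to return the
      ' symbol as a one-character string.
      character = getChar(character)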
      isOperator = (character = "+") Or (character = "-") Or _
                            (character = "*") Or (character = "/") Or _
                            (character = "^") Or (character = "%")
   End Function

 


 


   '---------------------------------------------------------------------------------------------    
   ' doubleLinkNode class
   '---------------------------------------------------------------------------------------------    

   ' instance variables
   Private leftNode As doubleLinkNode
   Private info As Variant
   Private rightNode As doubleLinkNode

   ' class methods
   '---------------------------------------------------------------------------------------------    
   ' Constructor
   '---------------------------------------------------------------------------------------------    
   Private Sub Class_Initialize( )
      Set leftNode = Nothing
      Set rightNode = Nothing
   End Sub

   '---------------------------------------------------------------------------------------------    
   ' Set node value
   '---------------------------------------------------------------------------------------------    
   Public Sub setInfo(ByVal newValue As Variant)
      info = newValue
   End Sub

   '---------------------------------------------------------------------------------------------    
   ' Return node value
   '---------------------------------------------------------------------------------------------    
   Public Function getInfo( ) As Variant
      getInfo = info
   End Function

   '---------------------------------------------------------------------------------------------    
   ' Reset rightNode reference
   '---------------------------------------------------------------------------------------------    
   Public Sub setRightNode(ByRef followingNode As doubleLinkNode)
      Set rightNode = followingNode
   End Sub

   '---------------------------------------------------------------------------------------------    
   ' Reset leftNode reference
   '---------------------------------------------------------------------------------------------    
   Public Sub setLeftNode(ByRef previousNode As doubleLinkNode)
      Set leftNode = previousNode
   End Sub

   '---------------------------------------------------------------------------------------------    
   ' Return rightNode reference
   '---------------------------------------------------------------------------------------------    
   Public Function getRightNode( ) As doubleLinkNode
      Set getRightNode = rightNode
   End Function

   '---------------------------------------------------------------------------------------------    
   ' Return leftNode reference     
   '---------------------------------------------------------------------------------------------    
   Public Function getLeftNode( ) As doubleLinkNode
      Set getLeftNode = leftNode
   End Function

 

Building a Binary Expression Tree

This section will explain an algorithm for building a binary expression tree from an expression in prefix notation. For simplicity, single letters will be used as operands.

The basic format of the prefix expression is

BinOperator Operand1 Operand2

For a simple prefix expression, like + A B, the operator, +, will go in the root node, and the operands, A and B, will go in its left and right child nodes, respectively. 

In more complex cases, one or more of the operands may also be expressions.  For example, to represent + * A Y B [equivalent to (A * Y) + B] in a tree:



To insert nodes into the tree, the algorithm keeps moving left until an operand has been inserted. It then backtracks to the most recent operator that does not yet have a right child, and inserts the next node to its right. The pattern continues: if an operator node has just been inserted, the next node is inserted to its left; if an operand node has just been inserted, the algorithm backtracks and puts the next node to the right of the last operator that is still waiting for its second operand.

In addition to the tree that is being created, there must also be a temporary data structure in which to store pointers to the operator nodes (to support the backtracking described above). Backtracking indicates that the data structure should be a stack. A flag, leftMove, will be used to indicate whether the next node should be attached to the left or the right, based on whether the current node contains an operator or an operand.

A more detailed version of the algorithm appears below:


   ' Create the root node.
   Get first operator
   Create newNode 
   newNode.setInfo ( operator)
   Set Root = newNode

   ' Loop through remainder of expression 
   leftMove = True 
   Clear stack of pointers 
   Get next Symbol

   ' Add Symbols to the tree. 
   While more Symbols 
      Set lastNode = newNode ' Keep pointer to the previous node.
      Create newNode 
      newNode.setInfo ( Symbol )
   
      ' Attach newNode to the tree. 
      If leftMove Then
            Attach newNode to the left of lastNode
            Push pointer lastNode onto pointer stack
      Else ' leftMove = False (or Right)
            Pop pointer stack to get pointer lastNode
            Attach newNode to the right of lastNode
      End If

      ' Reset leftMove according to the type of symbol.
      If Symbol is an operator Then
            leftMove = True
      Else ' Symbol is an operand.
            Set newNode.Left = Nothing
            Set newNode.Right = Nothing
            leftMove = False
      End If

      Get next Symbol
   Wend


To better understand the algorithm, the expression * + A - B C D, which is the same as ((A + (B - C)) * D), will be traced.

The algorithm begins by getting the first operator, *, and creating the root node. 

Get the next symbol, +, and because it is not the last symbol, enter the loop.

In the loop, set lastNode (a back pointer) to newNode. Then a new node is allocated and the symbol is stored in it. The flag leftMove is still True, so the new node is inserted to the left of lastNode. The algorithm then pushes lastNode onto the stack so that it can eventually return to this node in order to attach its right child node (the other operand). Before the iteration ends, leftMove is reset according to the symbol type: the + symbol is another operator, so leftMove is reset to True, which means that the next node will be attached to the left of newNode. (Whenever a node contains an operator, it must have first a left child and then a right child containing its operands.) The next symbol in the expression is then read. At this point the data resembles the figure below.


As indicated above, the next symbol is A. Since there is more of the expression to process, the loop is reentered. A new node is created and set to the value A. Because leftMove is still True, this node is inserted in the tree as before, with lastNode being pushed on the stack. Next, leftMove is reset, but because the symbol is an operand, the Else clause is taken, setting leftMove to False, indicating that the next node will be attached to the right of the node most recently pushed on the stack. The first operand was just inserted to the left of the last operator, so the second operand must be placed on its right. Then the next symbol is retrieved. At this point the data can be represented as follows:

 

As indicated above, the next symbol is -. Since there is more of the expression to process, the loop is reentered. A new node is created and set to the value -, and then it is inserted in the tree. The flag leftMove now equals False, indicating insertion to the right, so the Else branch is taken. Note that the new node will be linked to the last operator node, not the last node built. The pointer to this node is recovered by popping it from the stack. Then the right child pointer of the last operator node is set to newNode. Next, because the current symbol is another operator, the next insertion should be to the left of this new node, so leftMove is reset to True and the next symbol is retrieved. The figure below shows the current status.



A trace of the next three iterations of the loop produces the data pictured in the figures below.



 

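The input into buildTree will be a valid prefix expression, made up of single-digit operands and the binary operators +, *, -, /, % and ^. Because buildTree treats every character of the string as a symbol, the expression must not contain spaces. The stack methods clearStack, push, and pop are assumed to be available.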
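
The clsStack class itself is not listed in this section. As a minimal sketch, assuming the stack only needs to hold doubleLinkNode references, it could be built on a Collection as shown below. The member name items and the whole implementation are illustrative assumptions, not the class the text relies on.

```vb
' clsStack (class module): a minimal, hypothetical sketch.
' The newest entry is kept at the end of the collection.
Private items As Collection

Private Sub Class_Initialize()
   Set items = New Collection
End Sub

' Discards every entry by starting over with an empty collection.
Public Sub clearStack()
   Set items = New Collection
End Sub

' Places a node reference on top of the stack.
Public Sub push(ByRef node As doubleLinkNode)
   items.Add node
End Sub

' Removes and returns the most recently pushed node reference.
' Calling pop on an empty stack raises a run-time error.
Public Function pop() As doubleLinkNode
   Set pop = items.Item(items.Count)
   items.Remove items.Count
End Function
```

Because a Collection is 1-based, the last entry sits at index items.Count, so both push and pop work at the same end and the structure behaves as last in, first out.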

Procedure buildTree takes a prefix expression and creates the equivalent binary expression tree.


   '---------------------------------------------------------------------
   ' Builds the binary expression tree that corresponds to the
   ' valid prefix expression prefixExpression.
   '---------------------------------------------------------------------
   Public Sub buildTree(ByVal prefixExpression As String) 
      Dim leftMove As Boolean ' direction in which the next node will be attached
      Dim ptrStack As New clsStack ' stack of pointers
      Dim lastNode As doubleLinkNode
      Dim newNode As doubleLinkNode
      Dim symbol As String
      Dim charIndex As Integer

      symbol = Mid(prefixExpression, 1, 1) ' Get the first symbol.

      ' Build the root node.
      Set newNode = New doubleLinkNode
      Call newNode.setInfo(symbol)
      Set root = newNode
      leftMove = True

      Call ptrStack.clearStack

      ' Add the next symbol to the tree.
      For charIndex = 2 To Len(prefixExpression)
            symbol = Mid(prefixExpression, charIndex, 1)

            ' Save pointer to the previous node.
            Set lastNode = newNode

            ' Build the new node.
            Set newNode = New doubleLinkNode
            Call newNode.setInfo(symbol)
            
            ' Attach the new node to the tree.
            If leftMove Then ' Attach to the left of lastNode.
                  Call lastNode.setLeftNode(newNode)
                  Call ptrStack.push(lastNode) ' Save lastNode pointer.
            Else ' Attach to the right of the last operator node.
                  Set lastNode = ptrStack.pop( )
                  Call lastNode.setRightNode(newNode)
            End If

            ' Reset leftMove according to symbol type
            If isOperator(symbol) Then ' symbol is an operator.
                  leftMove = True
            Else ' symbol is an operand.
                  Call newNode.setLeftNode(Nothing)
                  Call newNode.setRightNode(Nothing)
                  leftMove = False
            End If

      Next charIndex
   End Sub ' buildTree
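
buildTree assumes that its input is valid. A caller that cannot guarantee this might check the expression first. The following is a minimal sketch of one way to do so; isValidPrefix is a hypothetical helper that is not part of the class described here, and it accepts single letters as well as single digits as operands.

```vb
' Hypothetical helper: returns True if expr is a well-formed prefix
' expression made of single-character operands and binary operators.
Public Function isValidPrefix(ByVal expr As String) As Boolean
   Dim needed As Long      ' number of subexpressions still required
   Dim i As Long
   Dim symbol As String

   needed = 1              ' the whole expression is one subexpression
   For i = 1 To Len(expr)
      If needed = 0 Then
         ' Symbols remain after the expression is already complete.
         isValidPrefix = False
         Exit Function
      End If

      symbol = Mid$(expr, i, 1)
      If InStr("+-*/^%", symbol) > 0 Then
         needed = needed + 1    ' operator: fills one slot, opens two
      ElseIf symbol Like "[0-9A-Za-z]" Then
         needed = needed - 1    ' operand: fills one slot
      Else
         isValidPrefix = False  ' space or other unexpected character
         Exit Function
      End If
   Next i

   isValidPrefix = (needed = 0)
End Function
```

The counter mirrors the shape of the tree: every operator needs two operands, so a valid expression ends exactly when no operator is left waiting for one.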

Note that both buildTree and evaluateTree are public methods in a class called clsExprTree.


Binary expression trees can be used to evaluate expressions or print them in several different notations. The tree format can also be used to do more complicated processing, such as differentiating the expression with respect to one of its variables.
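
As one illustration of printing the tree in another notation, an inorder traversal can produce the fully parenthesized infix form. The sketch below is an example under stated assumptions: toInfix is a hypothetical method name, and it relies only on the doubleLinkNode accessors getInfo, getLeftNode, and getRightNode.

```vb
' Hypothetical helper: returns the fully parenthesized infix form
' of the expression tree rooted at node.
Public Function toInfix(ByRef node As doubleLinkNode) As String
   If node Is Nothing Then
      toInfix = ""
   ElseIf node.getLeftNode() Is Nothing And node.getRightNode() Is Nothing Then
      ' Leaf node: an operand, printed as is.
      toInfix = CStr(node.getInfo())
   Else
      ' Operator node: left operand, operator, right operand.
      toInfix = "(" & toInfix(node.getLeftNode()) & " " & _
                CStr(node.getInfo()) & " " & _
                toInfix(node.getRightNode()) & ")"
   End If
End Function
```

For the tree traced earlier for * + A - B C D, this traversal yields ((A + (B - C)) * D). Visiting the operator before its subtrees instead would reproduce the prefix form.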


 

 

Calling routine


   Private Sub cmdEvaluate_Click()
      Dim prefixExpression As String
      Dim tree As New clsExprTree

      prefixExpression = txtPreFix.Text
      Call tree.buildTree(prefixExpression)
      txtResult.Text = tree.evaluateTree(tree.returnRoot)

   End Sub
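
The same calls can be exercised without a form, for example from the Immediate window. This is a sketch only: demoEvaluate is a hypothetical name, and it assumes that clsExprTree exposes returnRoot as used in the calling routine above.

```vb
' Hypothetical test routine: builds and evaluates one expression.
Public Sub demoEvaluate()
   Dim tree As New clsExprTree

   ' (8 + (5 - 2)) * 3 written in prefix form, with no spaces
   Call tree.buildTree("*+8-523")
   Debug.Print tree.evaluateTree(tree.returnRoot)
End Sub
```

The expression (8 + (5 - 2)) * 3 has the value 33, so that is the result to expect if the class behaves as described in this section.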