Binary Trees
INTRODUCTION
One of the major drawbacks of linked lists is the time it takes to search a long list. To find the node containing the value 10 in the linked list below, the entire list must be traversed. Because the search time grows in direct proportion to the length of the list, this becomes a serious problem when the list contains many nodes. Organizing the nodes into a binary search tree solves many of these problems. A binary search tree provides a structure that retains the flexibility of a linked list while allowing quicker access to any node.
The binary search tree gets its tree structure by allowing each node to point to two other nodes: one that precedes it in the list, and one that follows it. These nodes may be any nodes in the list, as long as they satisfy the basic rule: the node to the left contains a value smaller than the value in the node pointing to it, and the node to the right contains a larger value.
The figure below shows a binary search tree that could have been created from the nodes in the preceding figure. For any given node, the nodes to the left contain smaller values and the nodes to the right contain larger values. The first node in the tree is pointed to by an external pointer, called the root of the tree.
To search for a value, such as 10, in the binary search tree, the root is examined. The value in the root is smaller than 10, so we know by the basic rule that the node being searched for is located somewhere to the right. The value in the node immediately to the right is compared to 10. It is smaller, so the search moves to the right. The process continues until it arrives at the node containing 10. By following the path from the root to the node containing 10, the search required only four comparisons. Searching for the same value in the linked list would have required ten comparisons.
Duplicate nodes in trees can be handled in a variety of ways, depending on the application. In some applications duplicates are not essential, so they can be ignored. In other situations duplicates are recorded by a special flag field or counter field within each node. Other applications require that duplicate nodes be included in the tree, either to the right or to the left of the original node. In this discussion assume that all values are unique.
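As an illustration of the counter-field approach mentioned above, here is a minimal sketch in Python (chosen only for brevity). The names CountedNode and insert_counted are hypothetical and are not part of the classes used in this discussion.

```python
class CountedNode:
    """A tree node that records how many times its value was inserted."""

    def __init__(self, info):
        self.info = info
        self.count = 1        # counter field: number of copies of this value
        self.left = None
        self.right = None


def insert_counted(root, value):
    """Insert value; a duplicate only increments the existing node's counter."""
    if root is None:
        return CountedNode(value)
    current = root
    while True:
        if value == current.info:
            current.count += 1            # duplicate: no new node is created
            return root
        if value < current.info:
            if current.left is None:
                current.left = CountedNode(value)
                return root
            current = current.left
        else:
            if current.right is None:
                current.right = CountedNode(value)
                return root
            current = current.right


root = None
for v in [5, 3, 5, 8, 5]:
    root = insert_counted(root, v)
```

After these insertions the tree holds three nodes (5, 3 and 8), and the node for 5 carries a count of 3.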
The following figure illustrates the relationship among the nodes in a binary tree.

Each binary tree has a unique first element called the root. The node to the left is called the left child, the node to the right is called the right child. Any node pointing to other nodes is called the parent of those nodes. Any node may have 0, 1, or 2 children. A node with no children is called a leaf. Two nodes are siblings if they have the same parent. A node is an ancestor of another node if it is the parent of the node, or the parent of some other ancestor of that node. The root is an ancestor of every other node in the tree. A node is a descendant of another node if it is the child of the node or the child of some other descendant of that node. All nodes are descendants of the root. Descendants to the left of a node comprise its left subtree, whose root is the left child of the node. Descendants to the right of a node comprise the right subtree, whose root is the right child of the node.
The level of a node refers to its distance from the root. The root is level 0, the next level is level 1, etc. The maximum number of nodes at any level N is 2^N.
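The level rule can be checked on a small tree. The following Python sketch is only an illustration: Node and nodes_per_level are hypothetical names, and the five-node tree is made up for the example. It counts the nodes found at each level by visiting the tree level by level.

```python
from collections import deque


class Node:
    def __init__(self, info, left=None, right=None):
        self.info = info
        self.left = left
        self.right = right


def nodes_per_level(root):
    """Return a list whose entry N is the number of nodes at level N."""
    counts = []
    queue = deque()
    if root is not None:
        queue.append((root, 0))           # the root is at level 0
    while queue:
        node, level = queue.popleft()
        if level == len(counts):
            counts.append(0)              # first node seen at this level
        counts[level] += 1
        for child in (node.left, node.right):
            if child is not None:
                queue.append((child, level + 1))
    return counts


# 5 at the root; 3 and 8 at level 1; 1 and 4 at level 2.
tree = Node(5, Node(3, Node(1), Node(4)), Node(8))
counts = nodes_per_level(tree)
```

Here counts is [1, 2, 2], and each entry is no larger than the corresponding power of two (1, 2, 4).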
The tree will be accessed through the external pointer ROOT. Nodes are accessed through their pointers. The nodes in the following examples will contain three fields:
'------------------------------------------------------------
' doubleLinkNode class
'------------------------------------------------------------
' instance variables
Private leftNode As doubleLinkNode
Private info As Variant
Private rightNode As doubleLinkNode

' class methods
'------------------------------------------------------------
' Constructor
'------------------------------------------------------------
Private Sub Class_Initialize()
    Set leftNode = Nothing
    Set rightNode = Nothing
End Sub

'------------------------------------------------------------
' Set node value
'------------------------------------------------------------
Public Sub setInfo(ByVal newValue As Variant)
    info = newValue
End Sub

'------------------------------------------------------------
' Return node value
'------------------------------------------------------------
Public Function getInfo() As Variant
    getInfo = info
End Function

'------------------------------------------------------------
' Reset rightNode reference
'------------------------------------------------------------
Public Sub setRightNode(ByRef followingNode As doubleLinkNode)
    Set rightNode = followingNode
End Sub

'------------------------------------------------------------
' Reset leftNode reference
'------------------------------------------------------------
Public Sub setLeftNode(ByRef previousNode As doubleLinkNode)
    Set leftNode = previousNode
End Sub

'------------------------------------------------------------
' Return rightNode reference
'------------------------------------------------------------
Public Function getRightNode() As doubleLinkNode
    Set getRightNode = rightNode
End Function

'------------------------------------------------------------
' Return leftNode reference
'------------------------------------------------------------
Public Function getLeftNode() As doubleLinkNode
    Set getLeftNode = leftNode
End Function
SEARCHING
Executing a binary search on a tree involves moving a pointer to the left or right until the desired value is found. The following search routine returns a pointer to the node containing a given value known to be in the tree. It uses an external pointer P to search the tree.
P is first set to the root. Then the INFO field is compared to the value being searched for. If the INFO field is equal to the value, the desired node has been found and the routine is exited, returning the current value of P. If the INFO field is greater than the value, P is set to P.LEFTNODE, otherwise P is set to P.RIGHTNODE. The comparisons continue until the correct node is found. The algorithm (in pseudocode) is:
P = ROOT
WHILE P.INFO <> VAL
    IF P.INFO > VAL THEN
        P = P.LEFTNODE
    ELSE
        P = P.RIGHTNODE
    END IF
WEND
The maximum number of comparisons in a binary search on a tree equals the level of the lowest node in the tree plus 1. For the tree in the figure above, the maximum number of comparisons needed to find any node in the tree is four.
For the same information ordered in a linear linked list, the maximum number of comparisons equals the number of nodes in the list, and on average half of the nodes must be examined. In the worst case -- searching for the last node in a linear linked list -- a list containing 1000 nodes requires 1000 comparisons. If the nodes were arranged in a binary tree, and the tree was evenly balanced (more on balancing later), a maximum of 10 comparisons would be required, because a balanced tree with 10 levels can hold up to 2^10 - 1 = 1023 nodes.
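The worst-case figures can be worked out with a few lines of Python. This is a sketch under the assumption that the tree is as evenly balanced as possible, so that n nodes occupy floor(log2(n)) + 1 levels. The function names are hypothetical.

```python
def max_comparisons_list(n):
    """Worst case for a linear linked list: every node is examined."""
    return n


def max_comparisons_balanced_tree(n):
    """Worst case for an evenly balanced binary search tree of n >= 1 nodes.

    Such a tree has floor(log2(n)) + 1 levels, which is exactly the
    number of binary digits needed to write n.
    """
    return n.bit_length()


list_worst = max_comparisons_list(1000)            # 1000
tree_worst = max_comparisons_balanced_tree(1000)   # 10
```

Doubling the number of nodes adds 1000 comparisons to the worst case for the list but only one comparison for the balanced tree.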
'------------------------------------------------------------
' If an item is found in the tree, returns a ptr to the node
'------------------------------------------------------------
Public Function searchTree(ByVal keyValue As Variant) As doubleLinkNode
    Dim p As doubleLinkNode
    Dim valueInTree As Boolean

    Set p = root
    valueInTree = False
    While Not p Is Nothing And Not valueInTree
        If p.getInfo() = keyValue Then
            valueInTree = True
        ElseIf p.getInfo() > keyValue Then
            Set p = p.getLeftNode()
        Else
            Set p = p.getRightNode()
        End If
    Wend
    If Not p Is Nothing Then
        Set searchTree = p
    Else
        Set searchTree = Nothing
    End If
End Function
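For comparison, the same search can be sketched in Python. This is an illustration rather than a translation of the class above: Node and search_tree are hypothetical names, and the tree is passed in as a parameter instead of being held in a class-level root.

```python
class Node:
    def __init__(self, info, left=None, right=None):
        self.info = info
        self.left = left
        self.right = right


def search_tree(root, key):
    """Return the node holding key, or None if key is not in the tree."""
    p = root
    while p is not None and p.info != key:
        # Smaller keys live in the left subtree, larger keys in the right.
        p = p.left if key < p.info else p.right
    return p


root = Node(5, Node(3, Node(1), Node(4)), Node(8, None, Node(10)))
found = search_tree(root, 10)     # the node containing 10
missing = search_tree(root, 7)    # None: 7 is not in the tree
```

As in the routine above, the loop stops either on a match or when it runs off the bottom of the tree, so a missing value yields an empty result rather than an error.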
INSERTION
To create and maintain a binary tree, it is necessary to have a routine that will insert new nodes into the tree. A new node will always be inserted into its appropriate position in the tree as a leaf.
The linked figure shows a series of insertions into a binary tree.
Given the root of a binary tree and a value to be added to the tree, there are several tasks to be performed:
Steps 2 and 3 can be combined. The complete process can be outlined as follows:
- Allocate space for the node.
- Set the INFO field equal to VAL.
- Set the left and right pointers to Nothing.
In the following code segment, assume the following declarations:
Private root As doubleLinkNode
'------------------------------------------------------------
' I. Create a new node.
'    A. Set the INFO field equal to VAL.
'    B. Set the left and right pointers to Nothing.
' II. Insert new node.
'    A. If root is nothing, set root to newnode and stop, otherwise
'       examine nodes beginning with the root.
'    B. With the current node
'       1. If newValue is less than TREENODE.INFO, then move left.
'          a. If TREENODE.LEFT = Nothing then insert NEWNODE and
'             stop, otherwise move left and repeat step B.
'       2. If newValue is greater than TREENODE.INFO, then move right.
'          a. If TREENODE.RIGHT = Nothing then insert NEWNODE and stop,
'             otherwise move right and repeat step B.
'------------------------------------------------------------
Public Sub insertNode(ByVal newValue As Variant)
    Dim newNode As doubleLinkNode
    Dim treeNode As doubleLinkNode
    Dim inserted As Boolean

    ' Create and initialize a new node.
    Set newNode = New doubleLinkNode
    Call newNode.setInfo(newValue)
    Call newNode.setLeftNode(Nothing)
    Call newNode.setRightNode(Nothing)

    If root Is Nothing Then
        Set root = newNode
    Else ' search the root's descendants
        inserted = False
        Set treeNode = root
        While Not inserted ' move down tree
            If newValue < treeNode.getInfo() Then ' move left
                If treeNode.getLeftNode() Is Nothing Then
                    Call treeNode.setLeftNode(newNode)
                    inserted = True
                Else
                    Set treeNode = treeNode.getLeftNode()
                End If ' move left
            Else ' move right
                If treeNode.getRightNode() Is Nothing Then
                    Call treeNode.setRightNode(newNode)
                    inserted = True
                Else
                    Set treeNode = treeNode.getRightNode()
                End If ' move right
            End If
        Wend ' move down tree
    End If ' search the root's descendants
End Sub
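The insertion outline can also be sketched in Python. The names Node and insert_node are hypothetical. Unlike the routine above, this sketch receives the root as a parameter and returns it, since it has no class-level root to update.

```python
class Node:
    def __init__(self, info):
        self.info = info
        self.left = None
        self.right = None


def insert_node(root, value):
    """Insert value as a new leaf and return the (possibly new) root."""
    new_node = Node(value)
    if root is None:
        return new_node                   # empty tree: new node is the root
    current = root
    while True:
        if value < current.info:          # move left
            if current.left is None:
                current.left = new_node
                return root
            current = current.left
        else:                             # move right
            if current.right is None:
                current.right = new_node
                return root
            current = current.right


root = None
for v in [5, 3, 8, 1, 4]:
    root = insert_node(root, v)
```

Every new node ends up as a leaf: 5 becomes the root, 3 and 8 become its children, and 1 and 4 become the children of 3.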
The order in which the nodes are inserted determines the shape of the tree. The following figures illustrate how the same data, inserted in different orders, will produce differently-shaped trees.


- or -

Since the height of the tree determines the maximum number of comparisons in a binary search, the number of levels in a tree is important. Minimizing the number of levels in the tree will maximize search efficiency.
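The effect of insertion order on the number of levels can be seen in a short Python sketch. The helper names (Node, insert_node, build, height) and the seven sample values are assumptions made for the illustration.

```python
class Node:
    def __init__(self, info):
        self.info = info
        self.left = None
        self.right = None


def insert_node(root, value):
    """Insert value as a leaf and return the root."""
    if root is None:
        return Node(value)
    current = root
    while True:
        side = "left" if value < current.info else "right"
        child = getattr(current, side)
        if child is None:
            setattr(current, side, Node(value))
            return root
        current = child


def build(values):
    """Build a tree by inserting the values in the order given."""
    root = None
    for v in values:
        root = insert_node(root, v)
    return root


def height(node):
    """Number of levels in the tree rooted at node (0 for an empty tree)."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


sorted_height = height(build([1, 2, 3, 4, 5, 6, 7]))   # 7 levels
mixed_height = height(build([4, 2, 6, 1, 3, 5, 7]))    # 3 levels
```

Inserting the values in sorted order produces a chain that behaves like a linked list, while inserting the middle value first produces a tree with the minimum number of levels for seven nodes.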
DELETION
Deletion can be performed on isolated nodes or on entire subtrees. This discussion will focus on deletion of nodes. This operation varies depending on the position of the node in the tree. It is simpler to delete a leaf than it is to delete the root of the tree.
The deletion algorithm consists of three cases, depending on the number of children linked to the node to be deleted.








See this figure for additional examples.
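To locate and delete the target node, the DELETE routine below first searches the tree with a loop similar to the one in searchTree, keeping a second pointer, back, one step behind ptr.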
'------------------------------------------------------------
' The node with the value findValue will be found and deleted
' from the binary tree. Assumes the node is in the tree.
'------------------------------------------------------------
Public Sub delete(ByRef findValue As Variant)
    Dim back As doubleLinkNode, ptr As doubleLinkNode

    ' Search tree for node containing findValue
    Set ptr = root
    Set back = Nothing
    While ptr.getInfo() <> findValue
        Set back = ptr
        If ptr.getInfo() > findValue Then
            Set ptr = ptr.getLeftNode()
        Else
            Set ptr = ptr.getRightNode()
        End If
    Wend
    Call deleteNode(ptr, back)
End Sub ' delete
Notice that DELETE calls DELETENODE to perform the actual deletion. It passes as a parameter the pointer to the node within the tree, ptr, as well as a pointer to the parent node, back. Since the node to be deleted has been located, the algorithm must determine which of the three cases it satisfies.
==========
The first case, in which the node to be deleted is a leaf, can be detected by the statement If ptr.getLeftNode() Is Nothing And ptr.getRightNode() Is Nothing. Note that ptr points to the node to be deleted, and back points to its parent. If back is Nothing, the leaf is the only node in the tree, so the root is set to Nothing.
    If back Is Nothing Then ' it is the only node in the tree
        Set root = Nothing
Otherwise, whichever pointer field of the back node currently points to the same node as ptr must be set to Nothing.
    Else ' delete the leaf
        If back.getRightNode Is ptr Then
            Call back.setRightNode(Nothing)
        Else ' back.getLeftNode Is ptr
            Call back.setLeftNode(Nothing)
        End If
    End If
==========
In the second case the node to be deleted has one child.
If the condition Not ptr.getRightNode() Is Nothing is true, the node to be deleted has a right child.
If back is Nothing, the node to be deleted is the root, and the root pointer must be reset to that child.
    If back Is Nothing Then
        Set root = ptr.getRightNode() ' delete root
For a non-root node, the algorithm must then determine whether the node to be deleted is a right child or a left child of the back node.
If it is a right child (back.getRightNode() Is ptr), the right pointer of the back node must be set to the ptr node's right pointer (because ptr has a right child), thereby bypassing the ptr node.
    ElseIf back.getRightNode() Is ptr Then ' delete nonroot node
        Call back.setRightNode(ptr.getRightNode())
Otherwise it is a left child, and the left pointer of the back node must be set to the ptr node's right pointer (because ptr has a right child), thereby bypassing the ptr node.
    Else
        Call back.setLeftNode(ptr.getRightNode())
    End If
------
The Else condition in this case (the node has one child) is that the node to be deleted has a left child. Again, if back is Nothing, the node to be deleted is the root, and the root pointer must be reset to that child.
    If back Is Nothing Then
        Set root = ptr.getLeftNode() ' delete root
For a non-root node, the algorithm must then determine whether the node to be deleted is a right child or a left child of the back node. If it is a right child (back.getRightNode() Is ptr), the right pointer of the back node must be set to the ptr node's left pointer (because ptr has a left child), thereby bypassing the ptr node.
    ElseIf back.getRightNode() Is ptr Then ' delete nonroot node
        Call back.setRightNode(ptr.getLeftNode())
Otherwise it is a left child, and the left pointer of the back node must be set to the ptr node's left pointer (because ptr has a left child), thereby bypassing the ptr node.
    Else
        Call back.setLeftNode(ptr.getLeftNode())
    End If
==========
In the third case, the node has two children. It can be detected by the statement If Not ptr.getLeftNode( ) Is Nothing And Not ptr.getRightNode( ) Is Nothing.
Deleting a node with two children involves searching the tree for the key value that is closest to the key value of the node to be deleted (immediately before or immediately after). The ptr node will not actually be deleted in this case. Instead, its contents will be replaced by the contents of the node with the closest key value, and then the node whose value was moved will be deleted. The algorithm guarantees that the node that is ultimately deleted (the one containing the replacement value) will have at most one child, so its deletion is not overly complex.
This algorithm will use the value immediately preceding the value to be deleted. Therefore the left subtree will be searched in order to find the replacement value. This value will be located in one of two places. If the node to the left of Ptr has no right child, then this node contains the replacement value.

Otherwise, the replacement value is found in the rightmost descendant of the node to the left of Ptr.

Locate the node containing the replacement value.
    ' find the node containing the closest value that is less than the
    ' value being deleted
    Set back = ptr
    Set temp = ptr.getLeftNode()
    While Not temp.getRightNode() Is Nothing
        Set back = temp
        Set temp = temp.getRightNode()
    Wend
When the node containing the replacement value is found, the value is copied into the info field of ptr:
    ' copy replacement value into ptr info field
    ptr.setInfo (temp.getInfo())
... and then the node from which the replacement value was extracted is deleted.
    ' delete the node from the tree
    If back Is ptr Then
        Call back.setLeftNode(temp.getLeftNode())
    Else
        Call back.setRightNode(temp.getLeftNode())
    End If
'------------------------------------------------------------
' Removes a node from a binary tree
'------------------------------------------------------------
Private Sub deleteNode(ByRef ptr As doubleLinkNode, _
                       ByRef back As doubleLinkNode)
    Dim temp As doubleLinkNode

    ' case of no children
    If ptr.getLeftNode() Is Nothing And ptr.getRightNode() Is Nothing Then
        If back Is Nothing Then ' it is the only node in the tree
            Set root = Nothing
        Else ' delete the leaf
            If back.getRightNode Is ptr Then
                Call back.setRightNode(Nothing)
            Else
                Call back.setLeftNode(Nothing)
            End If
        End If

    ' case of deleting node with two children
    ElseIf Not ptr.getLeftNode() Is Nothing And _
           Not ptr.getRightNode() Is Nothing Then
        ' find the node containing the closest value that is less than the
        ' value being deleted
        Set back = ptr
        Set temp = ptr.getLeftNode()
        While Not temp.getRightNode() Is Nothing
            Set back = temp
            Set temp = temp.getRightNode()
        Wend
        ' copy replacement value into ptr info field
        ptr.setInfo (temp.getInfo())
        ' delete the node from the tree
        If back Is ptr Then
            Call back.setLeftNode(temp.getLeftNode())
        Else
            Call back.setRightNode(temp.getLeftNode())
        End If

    Else ' node has only one child
        ' reset one of the pointer fields of back according to whether
        ' the node being deleted has a right or left child
        If Not ptr.getRightNode() Is Nothing Then ' there is a right child
            If back Is Nothing Then
                Set root = ptr.getRightNode() ' delete root
            ElseIf back.getRightNode() Is ptr Then ' delete nonroot node
                Call back.setRightNode(ptr.getRightNode())
            Else
                Call back.setLeftNode(ptr.getRightNode())
            End If
        Else ' there is a left child
            If back Is Nothing Then
                Set root = ptr.getLeftNode() ' delete root
            ElseIf back.getRightNode() Is ptr Then ' delete nonroot node
                Call back.setRightNode(ptr.getLeftNode())
            Else
                Call back.setLeftNode(ptr.getLeftNode())
            End If
        End If
    End If
End Sub
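The three deletion cases can be exercised together in a compact Python sketch. It is an illustration only: the names Node, insert_node, in_order and delete_value are hypothetical, the root is passed in and returned rather than stored in a class, and the leaf and one-child cases are merged because both amount to replacing the deleted node with its only child (or with nothing). Like the routine above, it assumes the value is in the tree.

```python
class Node:
    def __init__(self, info):
        self.info = info
        self.left = None
        self.right = None


def insert_node(root, value):
    if root is None:
        return Node(value)
    if value < root.info:
        root.left = insert_node(root.left, value)
    else:
        root.right = insert_node(root.right, value)
    return root


def in_order(node):
    """Return the values in ascending order."""
    if node is None:
        return []
    return in_order(node.left) + [node.info] + in_order(node.right)


def delete_value(root, value):
    """Remove value (assumed present) and return the possibly new root."""
    back, ptr = None, root
    while ptr.info != value:              # locate ptr and its parent, back
        back = ptr
        ptr = ptr.left if value < ptr.info else ptr.right

    if ptr.left is not None and ptr.right is not None:
        # Two children: copy in the closest smaller value, then unlink
        # the node that value came from (it has no right child).
        back, temp = ptr, ptr.left
        while temp.right is not None:
            back, temp = temp, temp.right
        ptr.info = temp.info
        if back is ptr:
            back.left = temp.left
        else:
            back.right = temp.left
        return root

    # Leaf or one child: put the child (or None) in ptr's place.
    child = ptr.left if ptr.left is not None else ptr.right
    if back is None:
        return child                      # ptr was the root
    if back.right is ptr:
        back.right = child
    else:
        back.left = child
    return root


root = None
for v in [5, 3, 8, 1, 4, 7, 10]:
    root = insert_node(root, v)
root = delete_value(root, 5)              # two children: 4 replaces 5
remaining = in_order(root)
```

After deleting 5, the root holds 4 (the value immediately preceding 5), and an in-order listing of the tree is still sorted.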
PRINTING THE TREE (TREE TRAVERSAL)
Printing each of the nodes in a tree involves traversing the tree, or operating on every node in the tree. Traversals were simple with linked lists, but when attempting to print out all of the data stored in a binary tree, an algorithm cannot proceed linearly from one end to another. Rather, from any particular node, it may have to move left for some data and then right for more data. We must keep track of what has been printed at a node and on its left and right, and the code can become quite involved.
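The algorithm to print the information in order involves three steps:
traverse the left subtree to print the smaller values
print the value of the current node
traverse the right subtree to print the larger values
One way of keeping track of which nodes have been printed is to use a stack. A recursive solution can also be used, in which case VB keeps track of all of this information automatically on its call stack.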
'------------------------------------------------------------
' Prints the binary tree in order from smallest
' to largest. This is a recursive procedure.
'------------------------------------------------------------
Public Sub inOrderPrint(ByRef p As doubleLinkNode)
    ' Base case: if P is Nothing then do nothing.
    If Not p Is Nothing Then ' general case
        ' Traverse the left subtree to print the smaller values.
        Call inOrderPrint(p.getLeftNode)
        ' Print the value of current node.
        Debug.Print (p.getInfo())
        ' Traverse the right subtree to print the larger values.
        Call inOrderPrint(p.getRightNode)
    End If ' general case
End Sub ' inOrder
This will be invoked initially by the statement Call inOrderPrint(root)
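The stack-based alternative mentioned earlier can be sketched in Python as follows. The names Node and in_order_iterative are hypothetical, and the values are collected in a list instead of being printed so that the result is easy to inspect.

```python
class Node:
    def __init__(self, info, left=None, right=None):
        self.info = info
        self.left = left
        self.right = right


def in_order_iterative(root):
    """Return the values in ascending order using an explicit stack."""
    result = []
    stack = []                    # nodes whose left subtree is being visited
    p = root
    while stack or p is not None:
        while p is not None:      # go as far left as possible
            stack.append(p)
            p = p.left
        p = stack.pop()           # left subtree done: visit this node
        result.append(p.info)
        p = p.right               # then traverse its right subtree
    return result


tree = Node(5, Node(3, Node(1), Node(4)), Node(8, None, Node(10)))
values = in_order_iterative(tree)
```

The explicit stack plays the role that the chain of pending recursive calls plays in the recursive version: it remembers which nodes still have to be visited once their left subtrees are finished.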
PREORDER AND POSTORDER TRAVERSALS
Sometimes a tree must be printed in different orders. A preorder traversal of a binary tree:
visits the root
visits the left subtree preorder
visits the right subtree preorder
The results of a preorder traversal are shown above.
The preorder print procedure can be written recursively by changing the order of the statements in the previous routine.
'------------------------------------------------------------
' Prints the binary tree in preorder.
' This is a recursive sub.
'------------------------------------------------------------
Private Sub preOrderPrint(ByRef p As doubleLinkNode)
    ' Base case: if P is Nothing then do nothing.
    If Not p Is Nothing Then ' general case
        ' Print the value of current node.
        Debug.Print (p.getInfo())
        ' Traverse the left subtree.
        Call preOrderPrint(p.getLeftNode())
        ' Traverse the right subtree.
        Call preOrderPrint(p.getRightNode())
    End If ' general case
End Sub ' preorder
A postorder traversal of a binary tree:
visits the left subtree postorder
visits the right subtree postorder
visits the root
The results of a postorder traversal are shown above.
A procedure to print out the elements in a binary tree in postorder follows. It also rearranges the order of the three statements in the general case to change the order of printing.
'------------------------------------------------------------
' Prints the binary tree in postorder.
' This is a recursive sub.
'------------------------------------------------------------
Private Sub postOrderPrint(ByRef p As doubleLinkNode)
    ' Base case: if P is Nothing then do nothing.
    If Not p Is Nothing Then ' general case
        ' Traverse the left subtree.
        Call postOrderPrint(p.getLeftNode())
        ' Traverse the right subtree.
        Call postOrderPrint(p.getRightNode())
        ' Print the value of current node.
        Debug.Print (p.getInfo())
    End If ' general case
End Sub ' postorder
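To compare the three orders side by side, the following Python sketch applies each traversal to the same small tree. The function names and the sample tree are assumptions made for the illustration, and each function returns a list instead of printing.

```python
class Node:
    def __init__(self, info, left=None, right=None):
        self.info = info
        self.left = left
        self.right = right


def pre_order(p):
    """Root first, then the left subtree, then the right subtree."""
    if p is None:
        return []
    return [p.info] + pre_order(p.left) + pre_order(p.right)


def in_order(p):
    """Left subtree, then the root, then the right subtree."""
    if p is None:
        return []
    return in_order(p.left) + [p.info] + in_order(p.right)


def post_order(p):
    """Left subtree, then the right subtree, then the root last."""
    if p is None:
        return []
    return post_order(p.left) + post_order(p.right) + [p.info]


tree = Node(5, Node(3, Node(1), Node(4)), Node(8, None, Node(10)))
pre = pre_order(tree)      # [5, 3, 1, 4, 8, 10]
ordered = in_order(tree)   # [1, 3, 4, 5, 8, 10]
post = post_order(tree)    # [1, 4, 3, 10, 8, 5]
```

The three functions differ only in where the current node's value is placed relative to the two recursive calls, which mirrors the way the three print routines differ.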
Note: You may need a method to return a pointer to the tree root:
'------------------------------------------------------------
' Returns a pointer to the root of the tree. The return type is
' doubleLinkNode.
'------------------------------------------------------------
Public Function returnRoot() As doubleLinkNode
    Set returnRoot = root
End Function
Note II: The declaration of Root belongs in clsTree.
APPLICATIONS OF BINARY TREES
Considerable use is made of tree data structures in representing the structure of computer programs, written in languages such as VB, and in the actual writing of compilers. Trees offer a convenient structure for recording syntactical information about a program and then using this information in the translation of the program to machine language.