Tree Balancing
A balanced tree is one that has, as much as possible, an equal number of descendants in each node's left and right subtrees. Because items are inserted "root first," a tree will be balanced if the item that falls in the "middle" of the items is inserted first. (If the items are listed from smallest to largest, the "middle" item is the one halfway through the list, so there will be an (approximately) equal number of smaller and larger items on either side of it.)
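The effect of insertion order can be seen in a small experiment. The sketch below is in Python rather than VB6, and the names Node, insert, build and levels are illustrative assumptions, not routines from this text. It inserts the same seven values twice, once in ascending order and once "middle first", and reports how many levels each tree has.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(root, value):
    """Insert value into the binary search tree rooted at root; return the root."""
    if root is None:
        return Node(value)
    current = root
    while True:
        if value < current.value:
            if current.left is None:
                current.left = Node(value)
                return root
            current = current.left
        else:
            if current.right is None:
                current.right = Node(value)
                return root
            current = current.right


def build(values):
    """Build a tree by inserting the values one at a time, in the order given."""
    root = None
    for value in values:
        root = insert(root, value)
    return root


def levels(node):
    """Number of levels in the tree (0 for an empty tree)."""
    if node is None:
        return 0
    return 1 + max(levels(node.left), levels(node.right))


sorted_order = [1, 2, 3, 4, 5, 6, 7]
middle_first = [4, 2, 6, 1, 3, 5, 7]

print(levels(build(sorted_order)))   # 7: every node hangs off the right
print(levels(build(middle_first)))   # 3: the tree is balanced
```

The values are identical in both runs; only the order of insertion changes the shape of the tree.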
If the middle item is inserted first, it becomes the root of the tree. It will have (approximately) as many items in its left subtree as it has in its right subtree. The "root" of the left subtree should be the "middle" item of all the items that are less than the main root. That item is inserted next. As the remaining items that are less than the main root are inserted, about half will be placed in that node's left subtree, and about half will be placed in its right subtree. The same reasoning applies to the items that are greater than the main root. This process can most easily be coded as a recursive procedure. Recall that the discussion of recursion stated that "When data structures are used, the recursive case is often in terms of a smaller structure rather than a smaller value. The base case occurs when there are no values left to process in the structure."
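The recursive idea can be illustrated by computing only the order in which the items should be inserted. This is a minimal Python sketch, not the VB6 routine used in this text; the function name middle_first_order is an assumption made for the example. The recursive case works on a smaller list, and the base case is an empty list.

```python
def middle_first_order(sorted_items):
    """Return the items of an ascending list in 'middle first' insertion order."""
    if not sorted_items:                      # base case: nothing left to process
        return []
    mid = (len(sorted_items) - 1) // 2        # index of the middle item
    return ([sorted_items[mid]]                           # insert the middle item first
            + middle_first_order(sorted_items[:mid])      # then the smaller items
            + middle_first_order(sorted_items[mid + 1:])) # then the larger items


print(middle_first_order([10, 20, 30, 40, 50, 60, 70]))
# [40, 20, 10, 30, 60, 50, 70]
```

Here 40 becomes the root, 20 becomes the root of the left subtree, and 60 becomes the root of the right subtree.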
The process will be accomplished by dumping the contents of the unbalanced tree into an array, accessing the items in the order described above, and recreating the tree by inserting the items individually.
The algorithm for balancing a tree consists of two parts: one iterative and one recursive. The iterative part, balanceTree, creates the array and invokes the recursive part, rebuildTree, that rebuilds the tree. The balanceTree method first creates a dynamic array and redimensions it to the number of nodes in the tree, as determined by function countTreeNodes. It then stores the node values of the tree into the array using an inorder traversal, inOrderCopy. Therefore they are stored in ascending order, from smallest to largest. It then resets the root of the tree to Nothing, in effect discarding the original tree, and then calls the recursive routine, rebuildTree, passing it the bounds of the array.
The rebuildTree method checks the array bounds it is passed. If the low and high bounds are the same (base case 1) it simply inserts the corresponding array element into the tree. If the bounds only differ by one location (base case 2) then it inserts both elements into the tree. Otherwise, it computes the "middle" element of the subarray, inserts it into the tree, and then makes two recursive calls to itself: one to process the elements less than the middle element, and one to process the elements greater than the middle.
'------------------------------------------------------------------------------
' This method balances a binary tree by copying the
' nodes inOrder into an array and then recreating the
' tree in a methodical manner so it is balanced.
'------------------------------------------------------------------------------
Public Sub balanceTree()
    Dim nodeArray() As Variant
    Dim nodeCount As Integer
    Dim nextIndex As Integer

    nodeCount = countTreeNodes(root)
    If nodeCount = 0 Then Exit Sub      ' an empty tree needs no balancing

    ReDim nodeArray(nodeCount - 1)      ' subtract 1 because base is 0, not 1
    nextIndex = 0                       ' next free element of nodeArray
    Call inOrderCopy(root, nodeArray, nextIndex)
    Set root = Nothing
    Call rebuildTree(nodeArray, LBound(nodeArray), UBound(nodeArray))
End Sub

'------------------------------------------------------------------------------
' Recursive method to recreate a tree so it is balanced.
' If the low and high bounds are the same (base case 1)
' it simply inserts the corresponding array element into
' the tree. If the bounds only differ by one location (base
' case 2) then it inserts both elements into the tree.
' Otherwise, it computes the "middle" element of the
' subarray, inserts it into the tree, and then makes two
' recursive calls to itself: one to process the elements
' less than the middle element, and one to process the
' elements greater than the middle.
'------------------------------------------------------------------------------
Private Sub rebuildTree(ByRef nodeArray() As Variant, _
                        ByVal lowIndex As Integer, ByVal highIndex As Integer)
    Dim midIndex As Integer

    If lowIndex = highIndex Then                ' base case 1
        Call insertNode(nodeArray(lowIndex))
    ElseIf (lowIndex + 1) = highIndex Then      ' base case 2
        Call insertNode(nodeArray(lowIndex))
        Call insertNode(nodeArray(highIndex))
    Else                                        ' general case
        midIndex = (lowIndex + highIndex) \ 2   ' \ is integer division
        Call insertNode(nodeArray(midIndex))
        Call rebuildTree(nodeArray, lowIndex, midIndex - 1)
        Call rebuildTree(nodeArray, midIndex + 1, highIndex)
    End If
End Sub

'------------------------------------------------------------------------------
' Counts the nodes in a binary tree.
' This is a recursive function.
'------------------------------------------------------------------------------
Public Function countTreeNodes(ByRef p As doubleLinkNode) _
        As Integer
    ' Base case: if p is Nothing then do nothing
    ' (the function returns its default value, 0).
    If Not p Is Nothing Then    ' general case
        ' Count current node.
        countTreeNodes = 1
        ' Traverse the left subtree.
        countTreeNodes = countTreeNodes + _
                         countTreeNodes(p.getLeftNode())
        ' Traverse the right subtree.
        countTreeNodes = countTreeNodes + _
                         countTreeNodes(p.getRightNode())
    End If                      ' general case
End Function ' countTreeNodes

'------------------------------------------------------------------------------
' This recursive method copies the contents of a binary
' tree into an array in ascending order. Calls insertIntoArray.
' nextIndex is the next free array element; it is passed ByRef
' so that every level of the recursion shares the same counter.
'------------------------------------------------------------------------------
Public Sub inOrderCopy(ByRef ptr As doubleLinkNode, _
                       ByRef nodeArray() As Variant, _
                       ByRef nextIndex As Integer)
    If Not ptr Is Nothing Then
        Call inOrderCopy(ptr.getLeftNode(), nodeArray, nextIndex)
        Call insertIntoArray(nodeArray, nextIndex, ptr.getInfo())
        Call inOrderCopy(ptr.getRightNode(), nodeArray, nextIndex)
    End If
End Sub

'------------------------------------------------------------------------------
' This method inserts an item into the next available array
' element and advances the index. The index belongs to the
' caller (rather than being a Static variable) so that each
' call to balanceTree starts filling the array at element 0.
'------------------------------------------------------------------------------
Private Sub insertIntoArray(ByRef nodeArray() As Variant, _
                            ByRef nextAvailableIndex As Integer, _
                            ByVal insertValue As Variant)
    nodeArray(nextAvailableIndex) = insertValue
    nextAvailableIndex = nextAvailableIndex + 1
End Sub
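To see the whole procedure at work on a small tree, the following is a rough Python rendering of the same steps: copy the values in order, discard the tree, and rebuild it middle first. It is a sketch for illustration only. The class and method names (Tree, insert_node, in_order_copy, rebuild, balance, levels) are assumptions that loosely mirror the VB6 routines, and a Python list takes the place of the dynamic array, so no separate node count is needed.

```python
class Node:
    def __init__(self, info):
        self.info = info
        self.left = None
        self.right = None


class Tree:
    def __init__(self):
        self.root = None

    def insert_node(self, info):
        """Ordinary binary-search-tree insertion; duplicates go to the right."""
        new_node = Node(info)
        if self.root is None:
            self.root = new_node
            return
        current = self.root
        while True:
            if info < current.info:
                if current.left is None:
                    current.left = new_node
                    return
                current = current.left
            else:
                if current.right is None:
                    current.right = new_node
                    return
                current = current.right

    def in_order_copy(self, node, values):
        """Append the values of the subtree to the list in ascending order."""
        if node is not None:
            self.in_order_copy(node.left, values)
            values.append(node.info)
            self.in_order_copy(node.right, values)

    def rebuild(self, values, low, high):
        """Insert values[low..high] so that the middle item goes in first."""
        if low == high:                       # base case 1: one item
            self.insert_node(values[low])
        elif low + 1 == high:                 # base case 2: two items
            self.insert_node(values[low])
            self.insert_node(values[high])
        else:                                 # general case
            mid = (low + high) // 2
            self.insert_node(values[mid])
            self.rebuild(values, low, mid - 1)
            self.rebuild(values, mid + 1, high)

    def balance(self):
        values = []
        self.in_order_copy(self.root, values)
        if not values:                        # empty tree: nothing to do
            return
        self.root = None                      # discard the original tree
        self.rebuild(values, 0, len(values) - 1)

    def levels(self, node):
        """Number of levels in the subtree (0 for an empty subtree)."""
        if node is None:
            return 0
        return 1 + max(self.levels(node.left), self.levels(node.right))


tree = Tree()
for value in range(1, 11):        # ascending inserts give a 10-level chain
    tree.insert_node(value)

print(tree.levels(tree.root))     # 10
tree.balance()
print(tree.levels(tree.root))     # 4
```

Ten nodes cannot fit in fewer than four levels, so the rebuilt tree is as shallow as possible.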
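If a tree is balanced,
the number of levels, or depth, of a tree with N nodes will be around log2(N);
a perfectly balanced tree has exactly Int(log2(N)) + 1 levels.
VB6 does not provide a log2 function (its Log function returns the natural
logarithm), but log2(N) = Log(N) / Log(2), since the ratio of two logarithms
is the same in any base.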
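As a quick numeric check of this relationship, the Python sketch below computes log2(N) by a change of base and the smallest number of levels that can hold N nodes. The function names are assumptions made for the example. The level count uses integer arithmetic (int.bit_length) rather than the floating-point quotient, to avoid rounding trouble when N is an exact power of two.

```python
import math


def log2_by_change_of_base(n):
    """log2(n) computed from natural logarithms; any base gives the same ratio."""
    return math.log(n) / math.log(2)


def optimal_levels(node_count):
    """Fewest levels a binary tree with node_count nodes can have."""
    if node_count <= 0:
        return 0
    # For a positive integer n, n.bit_length() equals floor(log2(n)) + 1.
    return node_count.bit_length()


print(round(log2_by_change_of_base(1000), 3))   # 9.966
print(optimal_levels(1000))                     # 10
print(optimal_levels(7), optimal_levels(8))     # 3 4
```

A tree of 1000 nodes therefore needs at least 10 levels, and a full 3-level tree holds at most 7 nodes.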
The code segment below contains two routines: one to determine the number of
levels in a tree, and another to determine if the tree is balanced. The
following statement can be used to balance the tree if it is determined that it
is not balanced:
If Not tree.optimalLevels(tree.returnRoot, 0) Then Call tree.balanceTree
'------------------------------------------------------------------------------
' When called as tree.optimalLevels (root, 0) this will
' return True if the tree has the optimal number of levels
' (depth). Otherwise it will return False. The second
' argument specifies a tolerance of how close the current
' tree must be to balanced. Zero tolerance indicates that it
' must be perfectly balanced, a tolerance of 1 indicates
' that it can be within one level of the optimal number of
' levels, etc.
'------------------------------------------------------------------------------
Public Function optimalLevels ( ByVal treeNode As doubleLinkNode, _
'------------------------------------------------------------------------------
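The idea behind such a check, comparing the actual number of levels with the optimal number and allowing a tolerance, might be sketched in Python as follows. This is not the VB6 routine from this text; count_nodes, count_levels and has_optimal_levels are hypothetical names, and the comparison rule is an assumption based only on the comment block above.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def count_nodes(node):
    """Number of nodes in the subtree."""
    if node is None:
        return 0
    return 1 + count_nodes(node.left) + count_nodes(node.right)


def count_levels(node):
    """Number of levels in the subtree (0 for an empty subtree)."""
    if node is None:
        return 0
    return 1 + max(count_levels(node.left), count_levels(node.right))


def has_optimal_levels(root, tolerance=0):
    """True if the tree is within `tolerance` levels of the optimal depth."""
    # bit_length() of a positive integer n is floor(log2(n)) + 1; it is 0 for 0.
    optimal = count_nodes(root).bit_length()
    return count_levels(root) - optimal <= tolerance


chain = Node(1, right=Node(2, right=Node(3)))       # 3 nodes on 3 levels
balanced = Node(2, left=Node(1), right=Node(3))     # 3 nodes on 2 levels

print(has_optimal_levels(chain, 0))      # False
print(has_optimal_levels(chain, 1))      # True
print(has_optimal_levels(balanced, 0))   # True
```

With zero tolerance only the two-level tree passes; allowing one extra level also accepts the three-node chain.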