2. Operations on Binary Search Trees

BSTs are binary trees, so all the operations we've defined for binary trees can be applied to BSTs. However, one of our tree operations does not preserve the special properties of a BST. The problematic operation is JOIN. Recall that JOIN was a problem with sorted lists - we have the same problem, for the same reasons, when we JOIN two BSTs: the result may not be a BST.

Our solution for sorted lists was to recognize that there is always has a unique correct position in a sorted list for any new value. The same is true of a BST - if you give me a BST and an value, there is a unique position for that new value in the BST - assuming that we are not allowed/willing to re-arrange the existing values. If the existing values must stay where they are, we have no choice about where to put a new value. If it is smaller than the root value, it must go in the left subtree; if larger it must go in the right subtree. This reasoning applies recursively until we reach a node where the required subtree does not exist. And that is where we place the new value.

Example: Where to place `2' in our example tree:


Answer: It must go in 6's left subtree, 3's left subtree, 1's right subtree. 1 has no right subtree, so we make a singleton with `2' and it becomes 1's right subtree.

Where to place `7'? `10'?

Here is the general pseudocode for BSTINSERT(V,T) (assume V is not in T):

        BSTINSERT(V,T) {
          if T is empty
          then T = create_singleton(V)
          else if V > rootvalue(T)
               then if T's right subtree exists
                    then BSTINSERT(V,T's right subtree)
                    else T's right subtree = create_singleton(V)
               else if T's left subtree exists
                    then BSTINSERT(V,T's left subtree)
                    else T's left subtree = create_singleton(V) }
I hope you can see that this can very easily be written using our tree operations. I also hope you see that this is exactly the same as the SEARCH(V,T) operation described above, except for
  1. the processing we do in the base case (empty tree or subtree)
  2. the assumption that V is not in the tree for INSERT.

Special Precondition: Things can go wrong if we allow the user to INSERT a value directly into a subtree. If he tried to insert 99 into the subtree rooted at 3, it would end up in the wrong part of the whole tree (it ought to be in `6's right subtree, but the user started the process out in its left subtree). So we must add a precondition: argument T must not be a subtree. But then, strictly speaking, this procedure may not call itself recursively, because this violates the precondition. What we actually need is a procedure called INSERT_IN_SUBTREE which is not available to the user and which does not have this precondition.

There is one final operation on BSTs to discuss. And that is, how to DELETE a value from a BST. Although we have not included DELETE as a primitive operation on general binary trees, many people do. Furthermore DELETE is particularly important for BSTs because DELETE is a standard operation on Collections and the primary use of BSTs is as a very efficient way to implement Collections.

It is not perfectly obvious how we can delete a node from a BST and have the resulting tree still have the special properties of a BST.