# 14.1 Properties of red-black trees

Each node of the tree now contains the fields color, key, left, right, and p. If a child or the parent of a node does not exist, the corresponding pointer field of the node contains the value NIL. We shall regard these NIL'S as being pointers to external nodes (leaves) of the binary search tree and the normal, key-bearing nodes as being internal nodes of the tree.

A binary search tree is a red-black tree if it satisfies the following red-black properties:

1. Every node is either red or black.

2. Every leaf (NIL) is black.

3. If a node is red, then both its children are black.

4. Every simple path from a node to a descendant leaf contains the same number of black nodes.

An example of a red-black tree is shown in Figure 14.1.

#### Figure 14.1 A red-black tree with black nodes darkened and red nodes shaded. Every node in a red-black tree is either red or black, every leaf (NIL) is black, the children of a red node are both black, and every simple path from a node to a descendant leaf contains the same number of black nodes. Each non-NIL node is marked with its black-height; NIL'S have black-height 0.

The following lemma shows why red-black trees make good search trees.

A red-black tree with n internal nodes has height at most 21g(n + 1).

Proof We first show that the subtree rooted at any node x contains at least 2bh(x) -1 internal nodes. We prove this claim by induction on the height of x. If the height of x is 0, then x must be a leaf (NIL), and the subtree rooted at x indeed contains at least 2bh(x)-1 = 20 - 1 = 0 internal nodes. For the inductive step, consider a node x that has positive height and is an internal node with two children. Each child has a black-height of either bh(x) or bh(x) - 1, depending on whether its color is red or black, respectively. Since the height of a child of x is less than the height of x itself, we can apply the inductive hypothesis to conclude that each child has at least 2bh(x) -1 -1 internal nodes. Thus, the subtree rooted at x contains at least (2bh(x)-1 -1) + (2bh(x)-1 -1) + 1 = 2bh(x)- 1 internal nodes, which proves the claim.

To complete the proof of the lemma, let h be the height of the tree. According to property 3, at least half the nodes on any simple path from the root to a leaf, not including the root, must be black. Consequently, the black-height of the root must be at least h/2; thus,

`n  2h/2 - 1.`

Moving the 1 to the left-hand side and taking logarithms on both sides yields lg (n + 1) h/2, or h 21g (n + 1).

## Exercises

Draw the complete binary search tree of height 3 on the keys {1, 2, . . . , 15}. Add the NIL leaves and color the nodes in three different ways such that the black-heights of the resulting red-black trees are 2, 3, and 4.

Suppose that the root of a red-black tree is red. If we make it black, does the tree remain a red-black tree?

Show that the longest simple path from a node x in a red-black tree to a descendant leaf has length at most twice that of the shortest simple path from node x to a descendant leaf.

What is the largest possible number of internal nodes in a red-black tree with black-height k? What is the smallest possible number?

Describe a red-black tree on n keys that realizes the largest possible ratio of red internal nodes to black internal nodes. What is this ratio? What tree has the smallest possible ratio, and what is the ratio?

# 14.2 Rotations

#### Figure 14.2 The rotation operations on a binary search tree. The operation RIGHT-ROTATE(T,x) transforms the configuration of the two nodes on the left into the configuration on the right by changing a constant number of pointers. The configuration on the right can be transformed into the configuration on the left by the inverse operation LEFT-ROTATE(T,y). The two nodes might occur anywhere in a binary search tree. The letters , , and represent arbitrary subtrees. A rotation operation preserves the inorder ordering of keys: the keys in precede key[x], which precedes the keys in , which precede key[y], which precedes the keys in .

We change the pointer structure through rotation, which is a local operation in a search tree that preserves the inorder key ordering. Figure 14.2 shows the two kinds of rotations: left rotations and right rotations. When we do a left rotation on a node x, we assume that its right child y is non-NIL. The left rotation "pivots" around the link from x to y. It makes y the new root of the subtree, with x as y's left child and y's left child as x's right child.

## Exercises

Draw the red-black tree that results after TREE-INSERT is called on the tree in Figure 14.1 with key 36. If the inserted node is colored red, is the resulting tree a red-black tree? What if it is colored black?

Write pseudocode for RIGHT-ROTATE.

Argue that rotation preserves the inorder key ordering of a binary tree.

Show that any arbitrary n-node tree can be transformed into any other arbitrary n-node tree using O(n) rotations. (Hint: First show that at most n -1 right rotations suffice to transform any tree into a right-going chain.)

# 14.3 Insertion

The code for RB-INSERT is less imposing than it looks. We shall break our examination of the code into three major steps. First, we shall determine what violations of the red-black properties are introduced in lines 1-2 when the node x is inserted and colored red. Second, we shall examine the overall goal of the while loop in lines 3-17. Finally, we shall explore each of the three cases into which the while loop is broken and see how they accomplish the goal. Figure 14.4 shows how RB-INSERT operates on a sample red-black tree.

Which of the red-black properties can be violated after lines 1-2? Property 1 certainly continues to hold, as does property 2, since the newly inserted red node has NIL'S for children. Property 4, which says that the number of blacks is the same on every path from a given node, is satisfied as well, because node x replaces a (black) NIL, and node x is red with NIL children. Thus, the only property that might be violated is property 3 which says that a red node cannot have a red child. Specifically, property 3 is violated if x's parent is red, since x is itself colored red in line 2. Figure 14.4(a) shows such a violation after the node x has been inserted.

#### Figure 14.4 The operation of RB-INSERT. (a) A node x after insertion. Since x and its parent p[x] are both red, a violation of property 3 occurs. Since x's uncle y is red, case 1 in the code can be applied. Nodes are recolored and the pointer x is moved up the tree, resulting in the tree shown in (b). Once again, x and its parent are both red, but x's uncle y is black. Since x is the right child of p[x], case 2 can be applied. A left rotation is performed, and the tree that results is shown in (c). Now x is the left child of its parent, and case 3 can be applied. A right rotation yields the tree in (d), which is a legal red-black tree.

The goal of the while loop in lines 3-17 is to move the one violation of property 3 up the tree while maintaining property 4 as an invariant. At the beginning of each iteration of the loop, x points to a red node with a red parent--the only violation in the tree. There are two possible outcomes of each iteration of the loop: the pointer x moves up the tree, or some rotations are performed and the loop terminates.

There are actually six cases to consider in the while loop, but three of them are symmetric to the other three, depending on whether x's parent p[x] is a left child or a right child of x's grandparent p[p[x]], which is determined in line 4. We have given the code only for the situation in which p[x] is a left child. We have made the important assumption that the root of the tree is black--a property we guarantee in line 18 each time we terminate--so that p[x] is not the root and p[p[x]] exists.

Case 1 is distinguished from cases 2 and 3 by the color of x's parent's sibling, or "uncle." Line 5 makes y point to x's uncle right [p[p[x]]], and a test is made in line 6. If y is red, then case 1 is executed. Otherwise, control passes to cases 2 and 3. In all three cases, x's grandparent p[p[x]] is black, since its parent p[x] is red, and property 3 is violated only between x and p[x].

The situation for case 1 (lines 7-10) is shown in Figure 14.5. Case 1 is executed when both p[x] and y are red. Since p[p[x]] is black, we can color both p[x] and y black, thereby fixing the problem of x and p[x] both being red, and color p[p[x]] red, thereby maintaining property 4. The only problem that might arise is that p[p[x]] might have a red parent; hence, we must repeat the while loop with p[p[x]] as the new node x.

In cases 2 and 3, the color of x's uncle y is black. The two cases are distinguished by whether x is a right or left child of p[x]. Lines 12-13 constitute case 2, which is shown in Figure 14.6 together with case 3. In case 2, node x is a right child of its parent. We immediately use a left rotation to transform the situation into case 3 (lines 14-16), in which node x is a left child. Because both x and p[x] are red, the rotation affects neither the black-height of nodes nor property 4. Whether we enter case 3 directly or through case 2, x's uncle y is black, since otherwise we would have executed case 1. We execute some color changes and a right rotation, which preserve property 4, and then, since we no longer have two red nodes in a row, we are done. The body of the while loop is not executed another time, since p[x] is now black.

What is the running time of RB-INSERT? Since the height of a red-black tree on n nodes is O(lg n), the call to TREE-INSERT takes O(lg n) time. The while loop only repeats if case 1 is executed, and then the pointer x moves up the tree. The total number of times the while loop can be executed is therefore O(lg n). Thus, RB-INSERT takes a total of O(lg n) time. Interestingly, it never performs more than two rotations, since the while loop terminates if case 2 or case 3 is executed.

## Exercises

In line 2 of RB-INSERT, we set the color of the newly inserted node x to red. Notice that if we had chosen to set x's color to black, then property 3 of a red-black tree would not be violated. Why didn't we choose to set x's color to black?

In line 18 of RB-INSERT, we set the root's color to black. What is the advantage of doing so?

Show the red-black trees that result after successively inserting the keys 41, 38, 31, 12, 19, 8 into an initially empty red-black tree.

Suppose that the black-height of each of the subtrees , , , , in Figures 14.5 and 14.6 is k. Label each node in each figure with its black-height to verify that property 4 is preserved by the indicated transformation.

Consider a red-black tree formed by inserting n nodes with RB-INSERT. Argue that if n > 1, the tree has at least one red node.

Suggest how to implement RB-INSERT efficiently if the representation for red-black trees includes no storage for parent pointers.

# 14.4 Deletion

We use sentinels so that we can treat a NIL child of a node x as an ordinary node whose parent is x. We could add a distinct sentinel node for each NIL in the tree, so that the parent of each NIL is well defined, but that would waste space. Instead, we use the one sentinel nil[T] to represent all the NIL'S. When we wish to manipulate a child of a node x, however, we must be careful to set p[nil[T]] to x first.

`RB-DELETE (T, z)`

`1 if left[z] = nil[T] or right[z] = nil[T]`

`2      then y  z`

`3      else y  TREE-SUCCESSOR(z)`

`4 if left[y]  nil[T]     `

`5     then x  left[y]`

`6     else x  right[y]`

`7 p[x]  p[y]`

`8 if p[y] = nil[T]`

`9    then root[T]  x`

`10    else if y = left[p[y]]`

`11            then left[p[y]]  x`

`12            else right[p[y]]  x`

`13 if y  z`

`14     then key[z]  key[y]`

`15           If y has other fields, copy them, too.`

`16 if color[y] = BLACK`

`17     then RB-DELETE-FIXUP (T,x)`

`18 return y`

There are three differences between the procedures TREE-DELETE and RB-DELETE. First, all references to NIL in TREE-DELETE have been replaced by references to the sentinel nil[T] in RB-DELETE. Second, the test for whether x is NIL in line 7 of TREE-DELETE has been removed, and the assignment p[x] p[y] is performed unconditionally in line 7 of RB-DELETE. Thus, if x is the sentinel nil[T], its parent pointer points to the parent of the spliced-out node y. Third, a call to RB-DELETE-FIXUP is made in lines 16-17 if y is black. If y is red, the red-black properties still hold when y is spliced out, since no black-heights in the tree have changed and no red nodes have been made adjacent. The node x passed to RB-DELETE-FIXUP is the node that was y's sole child before y was spliced out if y had a non-NIL child, or the sentinel nil[T] if y had no children. In the latter case, the unconditional assignment in line 7 guarantees that x's parent is now the node that was previously y's parent, whether x is a key-bearing internal node or the sentinel nil[T].

We can now examine how the procedure RB-DELETE-FIXUP restores the red-black properties to the search tree.

If the spliced-out node y in RB-DELETE is black, its removal causes any path that previously contained node y to have one fewer black node. Thus, property 4 is now violated by any ancestor of y in the tree. We can correct this problem by thinking of node x as having an "extra" black. That is, if we add 1 to the count of black nodes on any path that contains x, then under this interpretation, property 4 holds. When we splice out the black node y, we "push" its blackness onto its child. The only problem is that now node x may be "doubly black," thereby violating property 1.

The procedure RB-DELETE-FIXUP attempts to restore property 1. The goal of the while loop in lines 1-22 is to move the extra black up the tree until (1) x points to a red node, in which case we color the node black in line 23, (2) x points to the root, in which case the extra black can be simply "removed," or (3) suitable rotations and recolorings can be performed.

Within the while loop, x always points to a nonroot black node that has the extra black. We determine in line 2 whether x is a left child or a right child of its parent p[x]. (We have given the code for the situation in which x is a left child; the situation in which x is a right child--line 22--is symmetric.) We maintain a pointer w to the sibling of x. Since node x is doubly black, node w cannot be nil[T]; otherwise, the number of blacks on the path from p[x] to the NIL leaf w would be smaller than the number on the path from p[x] to x.

The four cases in the code are illustrated in Figure 14.7. Before examining each case in detail, let's look more generally at how we can verify that the transformation in each of the cases preserves property 4. The key idea is that in each case the number of black nodes from (and including) the root of the subtree shown to each of the subtrees , , . . ., is preserved by the transformation. For example, in Figure 14.7(a), which illustrates case 1, the number of black nodes from the root to either subtree or is 3, both before and after the transformation. (Remember, the pointer x adds an extra black.) Similarly, the number of black nodes from the root to any of , , , and is 2, both before and after the transformation. In Figure 14.7(b), the counting must involve the color c, which can be either red or black. If we define count(RED) = 0 and count(BLACK) = 1, then the number of black nodes from the root to is 2 + count(c), both before and after the transformation. The other cases can be verified similarly (Exercise 14.4-5).

Case 1 (lines 5-8 of RB-DELETE-FIXUP and Figure 14.7(a)) occurs when node w, the sibling of node x, is red. Since w must have black children, we can switch the colors of w and p[x] and then perform a left-rotation on p[x] without violating any of the red-black properties. The new sibling of x, one of w's children, is now black, and thus we have converted case 1 into case 2, 3, or 4.

Cases 2, 3, and 4 occur when node w is black; they are distinguished by the colors of w's children. In case 2 (lines 10-11 of RB-DELETE-FIXUP and Figure 14.7(b)), both of w's children are black. Since w is also black, we take one black off both x and w, leaving x with only one black and leaving w red, and add an extra black to p[x]. We then repeat the while loop with p[x] as the new node x. Observe that if we enter case 2 through case 1, the color c of the new node x is red, since the original p[x] was red, and thus the loop terminates when it tests the loop condition.

Case 3 (lines 13-16 and Figure 14.7(c)) occurs when w is black, its left child is red, and its right child is black. We can switch the colors of w and its left child left[w] and then perform a right rotation on w without violating any of the red-black properties. The new sibling w of x is now a black node with a red right child, and thus we have transformed case 3 into case 4.

Case 4 (lines 17-21 and Figure 14.7(d)) occurs when node x's sibling w is black and w's right child is red. By making some color changes and performing a left rotation on p[x], we can remove the extra black on x without violating any of the red-black properties. Setting x to be the root causes the while loop to terminate when it tests the loop condition.

#### Figure 14.7 The cases in the while loop of the procedure RB-DELETE. Darkened nodes are black, heavily shaded nodes are red, and lightly shaded nodes, which may be either red or black, are represented by c and c'. The letters , , . . ., represent arbitrary subtrees. In each case, the configuration on the left is transformed into the configuration on the right by changing some colors and/or performing a rotation. A node pointed to by x has an extra black. The only case that causes the loop to repeat is case 2. (a) Case I is transformed to case 2, 3, or 4 by exchanging the colors of nodes B and D and performing a left rotation. (b) In case 2, the extra black represented by the pointer x is moved up the tree by coloring node D red and setting x to point to node B. If we enter case 2 through case 1, the while loop terminates, since the color c is red. (c) Case 3 is transformed to case 4 by exchanging the colors of nodes C and D and performing a right rotation. (d) In case 4, the extra black represented by x can be removed by changing some colors and performing a left rotation (without violating the red-black properties), and the loop terminates.

What is the running time of RB-DELETE? Since the height of a red-black tree of n nodes is O(lg n), the total cost of the procedure without the call to RB-DELETE-FIXUP takes O(lg n) time. Within RB-DELETE-FIXUP, cases 1, 3, and 4 each terminate after performing a constant number of color changes and at most three rotations. Case 2 is the only case in which the while loop can be repeated, and then the pointer x moves up the tree at most O(lg n) times and no rotations are performed. Thus, the procedure RB-DELETE-FIXUP takes O(lg n) time and performs at most three rotations, and the overall time for RB-DELETE is therefore also O(lg n).

## Exercises

Argue that the root of the red-black tree is always black after RB-DELETE executes.

In Exercise 14.3-3, you found the red-black tree that results from successively inserting the keys 41, 38, 31, 12,19, 8 into an initially empty tree. Now show the red-black trees that result from the successive deletion of the keys in the order 8, 12, 19, 31, 38, 41.

In each of the cases of Figure 14.7, give the count of black nodes from the root of the subtree shown to each of the subtrees ,, . . ., , and verify that each count remains the same after the transformation. When a node has a color c or c', use the notation count(c) or count(c') symbolically in your count.

Suppose that a node x is inserted into a red-black tree with RB-INSERT and then immediately deleted with RB-DELETE. Is the resulting red-black tree the same as the initial red-black tree? Justify your answer.

# Problems

Consider a persistent set S with the operations INSERT, DELETE, and SEARCH, which we implement using binary search trees as shown in Figure 14.8(a). We maintain a separate root for every version of the set. In order to insert the key 5 into the set, we create a new node with key 5. This node becomes the left child of a new node with key 7, since we cannot modify the existing node with key 7. Similarly, the new node with key 7 becomes the left child of a new node with key 8 whose right child is the existing node with key 10. The new node with key 8 becomes, in turn, the right child of a new root r' with key 4 whose left child is the existing node with key 3. We thus copy only part of the tree and share some of the nodes with the original tree, as shown in Figure 14.8(b).

a. For a general persistent binary search tree, identify the nodes that need to be changed to insert a key k or delete a node y.

c. If the height of the persistent binary search tree T is h, what are the time and space requirements of your implementation of PERSISTENT-TREE-INSERT? (The space requirement is proportional to the number of new nodes allocated.)

d. Suppose that we had included the parent field in each node. In this case, PERSISTENT-TREE-INSERT would need to perform additional copying. Prove that PERSISTENT-TREE-INSERT would then require (n) time and space, where n is the number of nodes in the tree.

e. Show how to use red-black trees to guarantee that the worst-case running time and space is O(lg n) per insertion or deletion.

#### Figure 14.8 (a) A binary search tree with keys 2, 3, 4, 7, 8, 10. (b) The persistent binary search tree that results from the insertion of key 5. The most recent version of the set consists of the nodes reachable from the root r', and the previous version consists of the nodes reachable from r. Heavily shaded nodes are added when key 5 is inserted.

a. Given a red-black tree T, we store its black-height as the field bh[T]. Argue that this field can be maintained by RB-INSERT and RB-DELETE without requiring extra storage in the tree and without increasing the asymptotic running times. Show that while descending through T, we can determine the black-height of each node we visit in O(1) time per node visited.

We wish to implement the operation RB-JOIN(T1,x,T2), which destroys T1 and T2 and returns a red-black tree T = T1 {x} T2. Let n be the total number of nodes in T1 and T2.

b. Assume without loss of generality that bh[T1] bh[T2]. Describe an O(lg n)-time algorithm that finds a black node y in T1 with the largest key from among those nodes whose black-height is bh[T2].

c. Let Ty be the subtree rooted at y. Describe how Ty can be replaced by Ty {x} T2 in O(1) time without destroying the binary-search-tree property.

d. What color should we make x so that red-black properties 1, 2, and 4 are maintained? Describe how property 3 can be enforced in O(lg n) time.

e. Argue that the running time of RB-JOIN is O(lg n).

# Chapter notes

Red-black trees were invented by Bayer [17] under the name "symmetric binary B-trees." Guibas and Sedgewick [93] studied their properties at length and introduced the red/black color convention.