These algorithms share an interesting property: *the sorted order they determine is based only on comparisons between the input elements*. We call such sorting algorithms *comparison sorts*. All the sorting algorithms introduced thus far are comparison sorts.

In a comparison sort, we use only comparisons between elements to gain order information about an input sequence *a*_{1}, *a*_{2}, . . . , *a*_{n}. That is, given two elements *a*_{i} and *a*_{j}, we perform one of the tests *a*_{i} < *a*_{j}, *a*_{i} ≤ *a*_{j}, *a*_{i} = *a*_{j}, *a*_{i} ≥ *a*_{j}, or *a*_{i} > *a*_{j} to determine their relative order.

Comparison sorts can be viewed abstractly in terms of * decision trees*. A decision tree represents the comparisons performed by a sorting algorithm when it operates on an input of a given size. Control, data movement, and all other aspects of the algorithm are ignored. Figure 9.1 shows the decision tree corresponding to the insertion sort algorithm from Section 1.1 operating on an input sequence of three elements.

Any decision tree that sorts *n* elements has height Ω(*n* lg *n*).

*n*! ≤ 2^{h},

which, by taking logarithms, implies

*h* ≥ lg(*n*!),

since the lg function is monotonically increasing. From Stirling's approximation (2.11), we have

*n*! > (*n*/*e*)^{n},

where *e* = 2.71828 . . . is the base of natural logarithms; thus

*h* ≥ lg(*n*!) > *n* lg(*n*/*e*) = *n* lg *n* - *n* lg *e* = Ω(*n* lg *n*).

Heapsort and merge sort are asymptotically optimal comparison sorts.
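The Stirling-based bound can be checked numerically; the following sketch (illustrative only, not part of the text's argument) compares lg(*n*!) against *n* lg *n* - *n* lg *e* for a few values of *n*:

```python
import math

# Compare the exact value lg(n!) with the Stirling-based lower bound
# n*lg(n) - n*lg(e); the exact value always dominates the bound.
for n in [10, 100, 1000]:
    exact = math.log2(math.factorial(n))
    bound = n * math.log2(n) - n * math.log2(math.e)
    print(n, round(exact, 1), round(bound, 1))
```

For *n* = 10 this prints roughly 21.8 versus 18.8, and the relative gap shrinks as *n* grows, as the approximation suggests.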

What is the smallest possible depth of a leaf in a decision tree for a sorting algorithm?

*Counting sort* assumes that each of the *n* input elements is an integer in the range 1 to *k*, for some integer *k*. When *k* = *O*(*n*), the sort runs in *O*(*n*) time.

Counting sort is illustrated in Figure 9.2. After the initialization in lines 1-2, we inspect each input element in lines 3-4. If the value of an input element is *i*, we increment *C*[*i*]. Thus, after lines 3-4, *C*[*i*] holds the number of input elements equal to *i* for each integer *i* = 1, 2, . . . , *k*. In lines 6-7, we determine for each *i *= 1, 2, . . . , *k*, how many input elements are less than or equal to *i*; this is done by keeping a running sum of the array *C*.
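The steps above can be sketched in Python (an illustrative sketch rather than the book's procedure verbatim; the book's arrays are 1-indexed, while Python lists are 0-indexed):

```python
def counting_sort(A, k):
    """Sort a list A of integers in the range 1..k."""
    n = len(A)
    C = [0] * (k + 1)          # lines 1-2: initialize the counts to 0
    for a in A:                # lines 3-4: C[i] counts elements equal to i
        C[a] += 1
    for i in range(2, k + 1):  # lines 6-7: running sum, so C[i] is the
        C[i] += C[i - 1]       # number of input elements <= i
    B = [None] * n
    for a in reversed(A):      # place each element at its final position;
        B[C[a] - 1] = a        # scanning right to left keeps equal
        C[a] -= 1              # elements in their input order (stability)
    return B

print(counting_sort([3, 1, 4, 1, 5], 5))  # [1, 1, 3, 4, 5]
```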

An important property of counting sort is that it is *stable*: numbers with the same value appear in the output array in the same order as they do in the input array. That is, ties between two numbers are broken by the rule that whichever number appears first in the input array appears first in the output array. Of course, the property of stability is important only when satellite data are carried around with the element being sorted. We shall see why stability is important in the next section.

Prove that COUNTING-SORT is stable.

Suppose that the **for** loop in line 9 of the COUNTING-SORT procedure is rewritten:

9  **for** *j* ← 1 **to** *length*[*A*]

Show that the algorithm still works properly. Is the modified algorithm stable?

* Radix sort* is the algorithm used by the card-sorting machines you now find only in computer museums. The cards are organized into 80 columns, and in each column a hole can be punched in one of 12 places. The sorter can be mechanically "programmed" to examine a given column of each card in a deck and distribute the card into one of 12 bins depending on which place has been punched. An operator can then gather the cards bin by bin, so that cards with the first place punched are on top of cards with the second place punched, and so on.

The columns below show the operation of radix sort on a list of seven 3-digit numbers. The first column is the input; the remaining columns show the list after successive stable sorts on increasingly significant digit positions.

329    720    720    329
457    355    329    355
657    436    436    436
839    457    839    457
436    657    355    657
720    329    457    720
355    839    657    839

The code for radix sort is straightforward. The following procedure assumes that each element in the *n*-element array *A* has *d *digits, where digit 1 is the lowest-order digit and digit *d* is the highest-order digit.

RADIX-SORT(A, d)

1  **for** *i* ← 1 **to** *d*
2       **do** use a stable sort to sort array *A* on digit *i*
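RADIX-SORT can be sketched in Python as follows; the procedure leaves the choice of stable sort open, and here Python's built-in `sorted`, which is guaranteed stable, stands in for it:

```python
def radix_sort(A, d):
    """Sort a list A of non-negative integers with at most d decimal
    digits, sorting on digit 1 (lowest-order) through digit d."""
    for i in range(d):
        # Stable sort on the (i+1)-st lowest-order decimal digit.
        A = sorted(A, key=lambda x: (x // 10**i) % 10)
    return A

print(radix_sort([329, 457, 657, 839, 436, 720, 355], 3))
# [329, 355, 436, 457, 657, 720, 839]
```

Stability of each pass is essential: it preserves the order established by the lower-order digits while sorting on a higher-order digit.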

Which of the following sorting algorithms are stable: insertion sort, merge sort, heapsort, and quicksort? Give a simple scheme that makes any sorting algorithm stable. How much additional time and space does your scheme entail?

Show how to sort* n* integers in the range 1 to *n*^{2} in *O*(*n*) time.

* Bucket sort* runs in linear time on the average. Like counting sort, bucket sort is fast because it assumes something about the input. Whereas counting sort assumes that the input consists of integers in a small range, bucket sort assumes that the input is generated by a random process that distributes elements uniformly over the interval [0,1). (See Section 6.2 for a definition of uniform distribution.)

The idea of bucket sort is to divide the interval [0, 1) into *n* equal-sized subintervals, or *buckets*, and then distribute the *n* input numbers into the buckets. Since the inputs are uniformly distributed over [0, 1), we don't expect many numbers to fall into each bucket. To produce the output, we simply sort the numbers in each bucket and then go through the buckets in order, listing the elements in each.

BUCKET-SORT(A)

1  *n* ← *length*[*A*]
2  **for** *i* ← 1 **to** *n*
3       **do** insert *A*[*i*] into list *B*[⌊*n A*[*i*]⌋]
4  **for** *i* ← 0 **to** *n* - 1
5       **do** sort list *B*[*i*] with insertion sort
6  concatenate the lists *B*[0], *B*[1], . . . , *B*[*n* - 1] together in order
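A Python sketch of BUCKET-SORT (the sample input is arbitrary; Python's `list.sort` stands in for the insertion sort the procedure applies to each bucket):

```python
import math

def bucket_sort(A):
    """Sort a list A of numbers in the range [0, 1)."""
    n = len(A)                           # line 1
    B = [[] for _ in range(n)]           # one bucket per subinterval
    for x in A:                          # lines 2-3: element x goes into
        B[math.floor(n * x)].append(x)   # bucket floor(n * x)
    for bucket in B:                     # lines 4-5: sort each bucket
        bucket.sort()                    # (insertion sort in the text)
    result = []
    for bucket in B:                     # line 6: concatenate in order
        result.extend(bucket)
    return result

print(bucket_sort([0.78, 0.17, 0.39, 0.26, 0.72,
                   0.94, 0.21, 0.12, 0.23, 0.68]))
```

Under the uniform-distribution assumption each bucket holds *O*(1) elements on average, so the per-bucket sorts cost linear expected time overall.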

Figure 9.4 shows the operation of bucket sort on an input array of 10 numbers.

To see that this algorithm works, consider two elements *A*[*i*] and *A*[*j*]. If these elements fall in the same bucket, they appear in the proper relative order in the output sequence because their bucket is sorted by insertion sort. Suppose they fall into different buckets, however. Let these buckets be *B*[*i*'] and *B*[*j*'], respectively, and assume without loss of generality that *i*' < *j*'. When the lists of *B* are concatenated in line 6, elements of bucket *B*[*i*'] come before elements of *B*[*j*'], and thus *A*[*i*] precedes *A*[*j*] in the output sequence. Hence, we must show that *A*[*i*] ≤ *A*[*j*]. Assuming the contrary, we have

*i*' = ⌊*n A*[*i*]⌋
     ≥ ⌊*n A*[*j*]⌋
     = *j*',

which is a contradiction, since i' < j'. Thus, bucket sort works.

We are given *n* points in the unit circle, *p*_{i} = (*x*_{i}, *y*_{i}), such that 0 < *x*_{i}^{2} + *y*_{i}^{2} ≤ 1 for *i* = 1, 2, . . . , *n*. Suppose that the points are uniformly distributed; that is, the probability of finding a point in any region of the circle is proportional to the area of that region. Design a Θ(*n*) expected-time algorithm to sort the *n* points by their distances from the origin.

A *probability distribution function* *P*(*x*) for a random variable *X* is defined by *P*(*x*) = Pr{*X* ≤ *x*}. Suppose that a list of *n* random variables *X*_{1}, *X*_{2}, . . . , *X*_{n} is drawn from a continuous probability distribution function *P* that is computable in *O*(1) time. Show how to sort these numbers in linear expected time.

9-1 Average-case lower bounds on comparison sorting

In this problem, we prove an Ω(*n* lg *n*) lower bound on the expected running time of any deterministic or randomized comparison sort on *n* inputs. We begin by examining a deterministic comparison sort *A* with decision tree *T*_{A}. We assume that every permutation of *A*'s inputs is equally likely.

9-2 Sorting in place in linear time

*a.* Suppose that we have an array of *n* data records to sort and that the key of each record has the value 0 or 1. Give a simple, linear-time algorithm for sorting the *n* data records in place.