Mergesort
To sort an array, divide it into two halves, sort the two halves (recursively), and then merge the results.
Goal. Given two sorted subarrays a[lo] to a[mid] and a[mid+1] to a[hi], replace with sorted subarray a[lo] to a[hi].
Accordingly, we use the method signature merge(a, lo, mid, hi) to specify a merge method that puts the result of merging the subarrays a[lo…mid] with a[mid+1…hi] into a single ordered array, leaving the result in a[lo…hi].
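A minimal sketch of such a merge, assuming the caller supplies an auxiliary array aux[] of the same length (the class name MergeSketch, the aux parameter, and the generic less() helper are illustrative choices, not part of the signature above):

```java
class MergeSketch {
    // Returns true if v is strictly smaller than w.
    static <T extends Comparable<T>> boolean less(T v, T w) {
        return v.compareTo(w) < 0;
    }

    // Merges sorted a[lo..mid] with sorted a[mid+1..hi], leaving the result in a[lo..hi].
    // aux[] is scratch space supplied by the caller (an assumption of this sketch).
    static <T extends Comparable<T>> void merge(T[] a, T[] aux, int lo, int mid, int hi) {
        for (int k = lo; k <= hi; k++)          // Copy a[lo..hi] to aux[lo..hi].
            aux[k] = a[k];
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {        // Merge back to a[lo..hi].
            if      (i > mid)              a[k] = aux[j++];  // Left half exhausted.
            else if (j > hi)               a[k] = aux[i++];  // Right half exhausted.
            else if (less(aux[j], aux[i])) a[k] = aux[j++];  // Right item is smaller.
            else                           a[k] = aux[i++];  // Left item is smaller or equal.
        }
    }

    public static void main(String[] args) {
        Integer[] a = {1, 4, 7, 2, 3, 9};       // a[0..2] and a[3..5] are each sorted.
        Integer[] aux = new Integer[a.length];
        merge(a, aux, 0, 2, 5);
        System.out.println(java.util.Arrays.toString(a));  // prints [1, 2, 3, 4, 7, 9]
    }
}
```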
Assertions. Statement to test assumptions about your program.
Helps detect logic bugs.
Documents code.
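As an illustration, a merge can state its preconditions and postcondition with Java's assert statement. This is a sketch on int[] arrays; the class name AssertSketch and the isSorted() helper are assumptions made for the example. Java disables assertions by default, so they cost nothing in production; run with java -ea to enable them.

```java
class AssertSketch {
    // Returns true if a[lo..hi] is in nondecreasing order.
    static boolean isSorted(int[] a, int lo, int hi) {
        for (int i = lo + 1; i <= hi; i++)
            if (a[i] < a[i - 1]) return false;
        return true;
    }

    // Merges sorted a[lo..mid] with sorted a[mid+1..hi] into a[lo..hi].
    static void merge(int[] a, int[] aux, int lo, int mid, int hi) {
        assert isSorted(a, lo, mid);       // Precondition: left half is sorted.
        assert isSorted(a, mid + 1, hi);   // Precondition: right half is sorted.

        for (int k = lo; k <= hi; k++) aux[k] = a[k];
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)         a[k] = aux[j++];
            else if (j > hi)          a[k] = aux[i++];
            else if (aux[j] < aux[i]) a[k] = aux[j++];
            else                      a[k] = aux[i++];
        }

        assert isSorted(a, lo, hi);        // Postcondition: whole range is sorted.
    }
}
```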
Proposition. Mergesort uses between ½ N lg N and N lg N compares and at most 6 N lg N array accesses to sort any array of length N.
Proposition. If D(N) satisfies D(N) = 2 D(N/2) + N for N > 1, with D(1) = 0, then D(N) = N lg N.
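One way to see this, sketched under the assumption that N is a power of 2: divide both sides of the recurrence by N and telescope.

```latex
\begin{align*}
\frac{D(N)}{N} &= \frac{2\,D(N/2)}{N} + 1 = \frac{D(N/2)}{N/2} + 1 \\
               &= \frac{D(N/4)}{N/4} + 1 + 1 \\
               &\;\;\vdots \\
               &= \frac{D(1)}{1} + \underbrace{1 + \cdots + 1}_{\lg N \text{ terms}} = \lg N
\end{align*}
```

Multiplying through by N gives D(N) = N lg N.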
Practical improvements.
Use insertion sort for small subarrays.
Mergesort has too much overhead for tiny subarrays.
Cutoff to insertion sort for ≈ 7 items.
private static void sort(Comparable[] a, int lo, int hi)
{  // Sort a[lo..hi].
   if (hi <= lo + CUTOFF - 1)
   {  // Small subarray: hand it to insertion sort instead of recursing.
      Insertion.sort(a, lo, hi);
      return;
   }
   int mid = lo + (hi - lo)/2;
   sort(a, lo, mid);       // Sort left half.
   sort(a, mid+1, hi);     // Sort right half.
   merge(a, lo, mid, hi);  // Merge results (code on page 271).
}
Stop if already sorted.
Is biggest item in first half <= smallest item in second half?
Helps for partially ordered arrays.
private static void sort(Comparable[] a, int lo, int hi)
{  // Sort a[lo..hi].
   if (hi <= lo) return;
   int mid = lo + (hi - lo)/2;
   sort(a, lo, mid);       // Sort left half.
   sort(a, mid+1, hi);     // Sort right half.
   if (!less(a[mid+1], a[mid])) return;  // Halves already in order: skip the merge.
   merge(a, lo, mid, hi);  // Merge results (code on page 271).
}
Eliminate the copy to the auxiliary array. Save time (but not space) by switching the roles of the input and auxiliary arrays in each recursive call.
public static void merge(Comparable[] a, Comparable[] aux, int lo, int mid, int hi)
{  // Merge a[lo..mid] with a[mid+1..hi] into aux[lo..hi]; no copy step.
   int i = lo, j = mid+1;
   for (int k = lo; k <= hi; k++)
      if      (i > mid)          aux[k] = a[j++];
      else if (j > hi )          aux[k] = a[i++];
      else if (less(a[j], a[i])) aux[k] = a[j++];  // Merge from a[] to aux[].
      else                       aux[k] = a[i++];
}
private static void sort(Comparable[] a, Comparable[] aux, int lo, int hi)
{  // Sort a[lo..hi].
   if (hi <= lo) return;
   int mid = lo + (hi - lo)/2;
   sort(aux, a, lo, mid);       // Sort left half; switch roles of aux[] and a[].
   sort(aux, a, mid+1, hi);     // Sort right half; switch roles of aux[] and a[].
   merge(a, aux, lo, mid, hi);  // Merge results (code on page 271).
}
Note: sort(a) initializes aux[] and sets aux[i] = a[i] for each i.
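To make the role switching concrete, here is a self-contained sketch. The class name MergeSwapSketch and the parameter names src and dst are assumptions chosen for clarity: each call sorts the items into dst, using src as the merge source. Because the merge writes into its second array, the top-level call passes the copy first and the original second, so that the sorted result lands in a[].

```java
class MergeSwapSketch {
    static <T extends Comparable<T>> boolean less(T v, T w) {
        return v.compareTo(w) < 0;
    }

    // Merges sorted src[lo..mid] and src[mid+1..hi] into dst[lo..hi]; no copy step.
    static <T extends Comparable<T>> void merge(T[] src, T[] dst, int lo, int mid, int hi) {
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)              dst[k] = src[j++];
            else if (j > hi)               dst[k] = src[i++];
            else if (less(src[j], src[i])) dst[k] = src[j++];
            else                           dst[k] = src[i++];
        }
    }

    // Leaves dst[lo..hi] sorted. Requires src[lo..hi] and dst[lo..hi]
    // to hold the same items on entry, which the initial copy guarantees.
    static <T extends Comparable<T>> void sort(T[] src, T[] dst, int lo, int hi) {
        if (hi <= lo) return;             // One item: dst[lo] already holds it.
        int mid = lo + (hi - lo) / 2;
        sort(dst, src, lo, mid);          // Roles switched: left half sorted into src.
        sort(dst, src, mid + 1, hi);      // Roles switched: right half sorted into src.
        merge(src, dst, lo, mid, hi);     // Merge the sorted halves from src into dst.
    }

    // Top-level entry point: copy once, then sort with a[] as the destination.
    static <T extends Comparable<T>> void sort(T[] a) {
        T[] aux = a.clone();              // aux[i] = a[i] for each i.
        sort(aux, a, 0, a.length - 1);
    }
}
```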
Bottom-up mergesort
Another way to implement mergesort is to organize the merges so that we do all the merges of tiny subarrays on one pass, then do a second pass to merge those subarrays in pairs, and so forth, continuing until we do a merge that encompasses the whole array.
Pass through array, merging subarrays of size 1.
Repeat for subarrays of size 2, 4, 8, 16, ...
No recursion needed.
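The passes described above can be sketched as two nested loops. This version works on int[] for brevity; the class name MergeBUSketch and the variable sz (the current subarray size) are illustrative assumptions.

```java
class MergeBUSketch {
    // Merges sorted a[lo..mid] with sorted a[mid+1..hi] into a[lo..hi].
    static void merge(int[] a, int[] aux, int lo, int mid, int hi) {
        for (int k = lo; k <= hi; k++) aux[k] = a[k];
        int i = lo, j = mid + 1;
        for (int k = lo; k <= hi; k++) {
            if      (i > mid)         a[k] = aux[j++];
            else if (j > hi)          a[k] = aux[i++];
            else if (aux[j] < aux[i]) a[k] = aux[j++];
            else                      a[k] = aux[i++];
        }
    }

    static void sort(int[] a) {
        int n = a.length;
        int[] aux = new int[n];
        for (int sz = 1; sz < n; sz += sz)              // sz: subarray size 1, 2, 4, 8, ...
            for (int lo = 0; lo < n - sz; lo += sz + sz) // lo: start of each pair of subarrays.
                merge(a, aux, lo, lo + sz - 1, Math.min(lo + sz + sz - 1, n - 1));
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 9, 1, 7, 3, 8};
        sort(a);
        System.out.println(java.util.Arrays.toString(a));  // prints [1, 2, 3, 5, 7, 8, 9]
    }
}
```

The Math.min call handles the last subarray of a pass, which is shorter than sz when the array length is not a power of 2.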
The complexity of sorting
Computational complexity. Framework to study efficiency of algorithms for solving a particular problem X.
Model of computation. Allowable operations.
Cost model. Operation counts.
Upper bound. Cost guarantee provided by some algorithm for X.
Lower bound. Proven limit on cost guarantee of all algorithms for X.
Optimal algorithm. Algorithm with best possible cost guarantee for X.
Proposition. Any compare-based sorting algorithm must use at least lg(N!) ~ N lg N compares in the worst case.
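A sketch of the standard decision-tree argument, assuming distinct keys: model the algorithm as a binary tree in which each internal node is one compare and each leaf is one final ordering. The symbol h (the tree height, which equals the worst-case number of compares) is introduced here for the sketch. The tree needs at least one leaf for each of the N! possible orderings, and a binary tree of height h has at most 2^h leaves.

```latex
\begin{align*}
N! \;\le\; \text{number of leaves} \;\le\; 2^{h}
\quad\Longrightarrow\quad
h \;\ge\; \lg(N!) \;\sim\; N \lg N
\end{align*}
```

The final step uses Stirling's approximation.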
Lower bound may not hold if the algorithm has information about:
The initial order of the input.
The distribution of key values.
The representation of the keys.
Partially ordered arrays. Depending on the initial order of the input, we may not need N lg N compares. (Insertion sort requires only N-1 compares if the input array is sorted.)
Duplicate keys. Depending on the input distribution of duplicates, we may not need N lg N compares. (Stay tuned for 3-way quicksort.)
Digital properties of keys. We can use digit/character compares instead of key compares for numbers and strings. (Stay tuned for radix sorts.)
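The N-1 figure for insertion sort on sorted input can be checked by counting compares directly. The following is a small sketch; the class name CompareCountSketch and the counter are assumptions made for the example.

```java
class CompareCountSketch {
    private static int compares;

    // Compares two keys and records that a compare happened.
    static boolean less(int v, int w) {
        compares++;
        return v < w;
    }

    // Insertion-sorts a[] in place and returns the number of compares used.
    static int insertionSortCompares(int[] a) {
        compares = 0;
        for (int i = 1; i < a.length; i++)
            for (int j = i; j > 0 && less(a[j], a[j - 1]); j--) {
                int t = a[j]; a[j] = a[j - 1]; a[j - 1] = t;
            }
        return compares;
    }

    public static void main(String[] args) {
        int[] sorted   = {1, 2, 3, 4, 5, 6, 7, 8};
        int[] reversed = {8, 7, 6, 5, 4, 3, 2, 1};
        System.out.println(insertionSortCompares(sorted));    // prints 7  (N-1)
        System.out.println(insertionSortCompares(reversed));  // prints 28 (N(N-1)/2)
    }
}
```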
Stability
A stable sort preserves the relative order of items with equal keys.
Insertion sort and mergesort (but not selection sort or shellsort) are stable.
Insertion sort: Equal items never move past each other.
Selection sort: A long-distance exchange might move an item past some equal item.
Shellsort: Long-distance exchanges might move an item past some equal item.
Mergesort: Suffices to verify that merge operation is stable. Takes from left subarray if equal keys.
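A small experiment illustrates the difference. In this sketch (the class name StabilitySketch and the item encoding are assumptions), each item is a string whose first character is the key and whose digit records the original position; only the key is compared.

```java
class StabilitySketch {
    // Compares by key only (the first character); the digit is just a tag.
    static boolean less(String v, String w) {
        return v.charAt(0) < w.charAt(0);
    }

    static void exch(String[] a, int i, int j) {
        String t = a[i]; a[i] = a[j]; a[j] = t;
    }

    // Stable: an item moves left only past strictly larger keys.
    static void insertionSort(String[] a) {
        for (int i = 1; i < a.length; i++)
            for (int j = i; j > 0 && less(a[j], a[j - 1]); j--)
                exch(a, j, j - 1);
    }

    // Not stable: the exchange can jump an item over an equal key.
    static void selectionSort(String[] a) {
        for (int i = 0; i < a.length; i++) {
            int min = i;
            for (int j = i + 1; j < a.length; j++)
                if (less(a[j], a[min])) min = j;
            exch(a, i, min);
        }
    }

    public static void main(String[] args) {
        String[] x = {"B1", "B2", "A3"};
        String[] y = x.clone();
        insertionSort(x);
        selectionSort(y);
        System.out.println(java.util.Arrays.toString(x));  // prints [A3, B1, B2]
        System.out.println(java.util.Arrays.toString(y));  // prints [A3, B2, B1]
    }
}
```

Selection sort exchanges B1 with A3 in its first step, which carries B1 past the equal key B2.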