算法导论Lecture 5:线性时间排序

How fast can we sort? (Depends on the sorting model: what you can do with the elements)

 

Comparison sorts: only use comparisons to determine relative order of elements:

 

quicksort - Theta(nlgn) randomized version

heapsort - Theta(nlgn)

merge sort - Theta(nlgn)

insertion sort - Theta(n^2)

 

Any comparison sorting algorithms require at least Omega(nlgn) comparisons in the worst case: proof by decision tree model, height h of the decision tree corresponds to the worst case time, each of the n! permutations of the input appears as some leaf, n! < 2^h, thus h = Omega(nlgn).

 

Heap sort and merge sort are asymptotically optimal comparison sorts (worst-case also have upper bound nlgn).

 

Counting sort

 

Assume each of the n input elements is an integer in the range 0 to k. If k=O(n), the sort runs in Theta(n) time.

 

Basic idea: for each input element x determine number of elements less than x.

 

Input: A[1..n], each A[i] is in the range 0 to k.

Output: B[1..n] = sorted A

Temp storage: C[0..k] for counting, k is the range.

 

COUNTING-SORT(A, B, k)
for i := 0 to k             // O(k)
    do C[i] := 0
for j := 1 to length[A]       //O(n)
    do C[A[i]] := C[A[j]] + 1
for i := 1 to k             // O(k)
    do C[i] := C[i] + C[i - 1]
for j := length[A] downto 1  //O(n)
    do B[C[A[j]]] := A[j]
       C[A[j]] := C[A[j]] - 1

 

So the running time is O(n+k), if k = O(n) then O(n) time. Counting sort is only effective to integers with small range, meaning that k is small is a requirement.

 

Another important property of counting sort: Stability, which is used in the following algorithm: radix sort.

 

Stable sort preserves the relative order of equal elements.

 

[Q] Which other sorts are stable? http://en.wikipedia.org/wiki/Stable_sort#Stability

 

 

Radix sort [Herman Hollerith -1890], effective to large range integers. Idea is to sort on the least significant digits of those integers with a stable sorting algorithm, say, the counting sort.

 

[Ex].             3 2 9                       7 2 0                       7 2 0                      3 2 9 

4 5 7                       3 5 5                       3 2 9                      3 5 5 

6 5 7                       4 3 6                       4 3 6                      4 3 6 

8 3 9         =>         4 5 7         =>         8 3 9         =>         4 5 7

4 3 6                       6 5 7                       3 5 5                      6 5 7 

7 2 0                       3 2 9                       4 5 7                      7 2 0 

3 5 5                       8 3 9                       6 5 7                      8 3 9 

 

Proof the correctness: use induction on digit t.

 

1. assume by induction the array is sorted on low-order t-1 digits.

2. sort on digit t:

    - if two elements have same digits on the position t, after sorting they have

      the same order   because of stability.

    - if no same digits, they are just sorted.

So the induction hypothesis is true.

 

In practice, integers or other stuffs in computer world are binary bits, so group them as digits, for example, 32 bits integers can be grouped as four bits digits or bytes, etc.

 

Analysis: 

 

- if using counting sort as the stability requirement, we need O(k+n) for each round (digit).

- suppose sorting n integers, each is b bits, so the range is [0, 2^b - 1].

- split integer into b/r "digits", each r bits, thus each digit ranges in [0, 2^r - 1], k = 2^b.

  The counting sort runs b/r rounds.

 

Time: O(b/r * (n + k))

     = O(b/r * (n + 2^r))

 

so r is choosed properly to minimize the time: r = lgn  =>

Time: O(bn/lgn).

你可能感兴趣的:(C++,c,算法,C#,J#)