1. Average Running Time of QuickSort:
For every input array of length n, the average running time of QuickSort (with random pivots) is O(n log n).
Note: This holds for every input; no assumption is made about the input data. The "average" is over the random pivot choices made by the algorithm.
2. Fix an input array A of length n.
Sample space Ω = all possible outcomes of the random pivot choices in QuickSort. (A sample point is a pivot sequence.)
3. Random variable C defined on Ω:
For any s in Ω, C(s) = number of comparisons between two input elements made by QuickSort (given the random pivot choices s).
The running time of QuickSort is dominated by comparisons, so the goal is to show that E[C] = O(n log n). (E[C] means the expectation of the random variable C.)
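The random variable C can be observed directly by instrumenting a randomized QuickSort. The following is a minimal sketch, not a tuned implementation: the names quicksort_count and average_comparisons are hypothetical, the input elements are assumed to be distinct, and each non-pivot element is counted as one comparison against the pivot.

```python
import math
import random


def quicksort_count(a, rng):
    """Sort `a` with uniformly random pivots.

    Returns (sorted_list, comparisons), where comparisons is one sample
    of the random variable C for the pivot choices drawn from `rng`.
    Assumes the elements of `a` are distinct.
    """
    if len(a) <= 1:
        return list(a), 0
    k = rng.randrange(len(a))          # pivot position, uniform over the subarray
    pivot = a[k]
    smaller, larger = [], []
    comparisons = 0
    for idx, x in enumerate(a):
        if idx == k:
            continue                   # the pivot is not compared with itself
        comparisons += 1               # one comparison of x against the pivot
        if x < pivot:
            smaller.append(x)
        else:
            larger.append(x)
    left, c_left = quicksort_count(smaller, rng)
    right, c_right = quicksort_count(larger, rng)
    return left + [pivot] + right, comparisons + c_left + c_right


def average_comparisons(a, trials, seed=0):
    """Empirical estimate of E[C] for the fixed input `a`."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        _, c = quicksort_count(a, rng)
        total += c
    return total / trials


if __name__ == "__main__":
    a = list(range(100, 0, -1))        # a fixed input; only the pivots are random
    print("estimated E[C]:", average_comparisons(a, trials=200))
    print("2 n ln n      :", 2 * len(a) * math.log(len(a)))
```

The input array stays fixed across trials; only the pivot sequence changes, which matches the sample space Ω defined above.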
4. Let Zi = the i-th smallest element of A.
For s in Ω and indices i < j, let Xij(s) = the number of times Zi and Zj get compared in QuickSort with pivot sequence s.
Two elements get compared only when one of them is the pivot, and the pivot is excluded from all future recursive calls, so any pair is compared at most once. So Xij is an "indicator" random variable (its value is either 0 or 1).
5. For s in Ω, C(s) = Sum(i=1 ~ n-1) { Sum(j=i+1 ~ n) { Xij(s) } }.
By linearity of expectation: E[C] = Sum(i=1 ~ n-1) { Sum(j=i+1 ~ n) { E[Xij] } }
= Sum(i=1 ~ n-1) { Sum(j=i+1 ~ n) { P[Zi, Zj get compared] } }
(The last step uses E[Xij] = 0 * P[Xij = 0] + 1 * P[Xij = 1] = P[Xij = 1].)
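The decomposition of C into the Xij can be checked on a single run by recording which pairs are compared. This is a small sketch under the assumption that all elements are distinct; the function name compared_pairs is hypothetical, and pairs are keyed by value (smaller value first), which corresponds to (Zi, Zj) with i < j.

```python
import random
from collections import Counter


def compared_pairs(a, rng, counts=None):
    """Run random-pivot QuickSort on `a` and count comparisons per pair.

    Returns a Counter mapping (smaller_value, larger_value) to the number
    of times that pair was compared. Assumes distinct elements.
    """
    if counts is None:
        counts = Counter()
    if len(a) <= 1:
        return counts
    pivot = a[rng.randrange(len(a))]
    smaller, larger = [], []
    for x in a:
        if x == pivot:
            continue                   # skip the pivot itself (values are distinct)
        counts[(min(x, pivot), max(x, pivot))] += 1
        if x < pivot:
            smaller.append(x)
        else:
            larger.append(x)
    compared_pairs(smaller, rng, counts)
    compared_pairs(larger, rng, counts)
    return counts


if __name__ == "__main__":
    counts = compared_pairs([5, 3, 8, 1, 9, 2, 7], random.Random(0))
    # Every recorded pair has count 1, so each Xij is 0 or 1,
    # and the total number of comparisons equals the number of compared pairs.
    print(max(counts.values()), sum(counts.values()), len(counts))
```

Pairs that never appear in the Counter are exactly those with Xij(s) = 0 for this run.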
6. A General Decomposition Principle
a) Identify the random variable Y that you really care about.
b) Express Y as a sum of indicator random variables:
Y = Sum(i=1 ~ n) { Xi }
c) Apply linearity of expectation: E[Y] = Sum(i=1 ~ n) { E[Xi] } = Sum(i=1 ~ n) { P[Xi = 1] }
7. For specific Zi, Zj with i < j, consider the set Zi, Zi+1, Zi+2, ..., Zj-1, Zj.
As long as none of these is chosen as a pivot, all of them are passed to the same recursive call.
a) If Zi or Zj is the first of these to be chosen as a pivot, then Zi and Zj get compared.
b) If one of Zi+1, ..., Zj-1 is the first of these to be chosen as a pivot, then Zi and Zj are split into different recursive calls and are never compared.
Since pivots are always chosen uniformly at random, each of the j - i + 1 elements Zi, Zi+1, ..., Zj-1, Zj is equally likely to be the first one chosen as a pivot. So P[Zi, Zj get compared] = 2 / (j - i + 1).
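So: E[C] = Sum(i=1 ~ n-1) { Sum(j=i+1 ~ n) { 2 / (j - i + 1) } } = 2 Sum(i=1 ~ n-1) { Sum(j=i+1 ~ n) { 1 / (j - i + 1) } }
For any fixed i, the inner sum is: Sum(j=i+1 ~ n) { 1 / (j - i + 1) } = 1/2 + 1/3 + ... + 1/(n - i + 1) <= Sum(k=2 ~ n) { 1/k }
The outer sum has at most n choices of i, and Sum(k=2 ~ n) { 1/k } <= ln n (compare the sum with the integral of 1/x from 1 to n), so E[C] <= 2 n Sum(k=2 ~ n) { 1/k } <= 2 n ln n = O(n log n).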
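Both the pairwise probability 2 / (j - i + 1) and the final bound can be checked numerically. The sketch below makes a simplifying assumption: it works on ranks only (the input is treated as 1..n, so Zi = i) and follows just the recursive call that still contains both Zi and Zj. All function names are hypothetical.

```python
import math
import random
from fractions import Fraction


def zi_zj_compared(n, i, j, rng):
    """Simulate pivot choices on ranks 1..n; True if Zi and Zj get compared."""
    lo, hi = 1, n                      # rank range of the call containing both
    while True:
        p = rng.randint(lo, hi)        # uniform pivot rank in the current call
        if p == i or p == j:
            return True                # case a): Zi or Zj is the pivot first
        if i < p < j:
            return False               # case b): Zi and Zj are separated
        if p < i:
            lo = p + 1                 # both lie in the right recursive call
        else:
            hi = p - 1                 # both lie in the left recursive call


def estimate_probability(n, i, j, trials=20000, seed=0):
    """Empirical estimate of P[Zi, Zj get compared]."""
    rng = random.Random(seed)
    hits = sum(zi_zj_compared(n, i, j, rng) for _ in range(trials))
    return hits / trials


def expected_comparisons(n):
    """Exact value of the double sum for E[C], as a Fraction."""
    total = Fraction(0)
    for i in range(1, n):
        for j in range(i + 1, n + 1):
            total += Fraction(2, j - i + 1)
    return total


if __name__ == "__main__":
    print(estimate_probability(20, 3, 7), "vs", 2 / (7 - 3 + 1))
    for n in (10, 100):
        print(n, float(expected_comparisons(n)), 2 * n * math.log(n))
```

Using Fraction keeps the double sum exact, so the comparison against 2 n ln n is only affected by the floating-point logarithm.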