Introduction to Algorithms, Third Edition: Reference Solutions

1.1-1

Give a real-world example that requires sorting or a real-world example that requires computing a convex hull.

  • Sorting: listing the restaurants on NTU street in ascending order of price.
  • Convex hull: computing the diameter of a set of points.

1.1-2

Other than speed, what other measures of efficiency might one use in a real-world setting?

Memory efficiency and coding efficiency.

1.1-3

Select a data structure that you have seen previously, and discuss its strengths and limitations.

Linked list:

  • Strengths: constant-time insertion and deletion once the position is known.
  • Limitations: random access takes linear time.

1.1-4

How are the shortest-path and traveling-salesman problems given above similar? How are they different?

  • Similar: both look for a path with the shortest total distance.
  • Different: the traveling-salesman problem has more constraints, since the tour must visit every point and return to the start.

1.1-5

Come up with a real-world problem in which only the best solution will do. Then come up with one in which a solution that is "approximately" the best is good enough.

  • Best: find the GCD of two positive integers.
  • Approximately: find a numerical solution of a differential equation.

1.2-1

Give an example of an application that requires algorithmic content at the application level, and discuss the function of the algorithms involved.

Driving navigation: the application uses a shortest-path algorithm to compute a route between two locations.

1.2-2

Suppose we are comparing implementations of insertion sort and merge sort on the same machine. For inputs of size $n$, insertion sort runs in $8n^2$ steps, while merge sort runs in $64n\lg n$ steps. For which values of $n$ does insertion sort beat merge sort?

\begin{align}
8n^2 & < 64n\lg n \\
n & < 8\lg n \\
2^n & < n^8 \\
2 \le n & \le 43.
\end{align}
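The bound can be checked numerically. The following is a minimal sketch; the function name `insertion_beats_merge` and the search range of 200 are illustrative choices, not part of the exercise.

```python
import math

def insertion_beats_merge(n: int) -> bool:
    """Return True when 8n^2 < 64 n lg n for input size n."""
    return 8 * n * n < 64 * n * math.log2(n)

# Collect every n in a small range where insertion sort uses fewer steps.
winners = [n for n in range(1, 200) if insertion_beats_merge(n)]
print(winners[0], winners[-1])  # 2 43
```

Note that n = 1 is excluded: merge sort takes 0 steps by this formula, so insertion sort cannot beat it there.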

1.2-3

What is the smallest value of $n$ such that an algorithm whose running time is $100n^2$ runs faster than an algorithm whose running time is $2^n$ on the same machine?

\begin{align}
100n^2 & < 2^n \\
n & \ge 15.
\end{align}
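A direct search confirms the threshold. This is only a sketch; `smallest_n` and its `limit` parameter are hypothetical names introduced for illustration.

```python
def smallest_n(limit: int = 100) -> int:
    """Return the smallest n >= 1 with 100 n^2 < 2^n, searching below limit."""
    for n in range(1, limit):
        if 100 * n * n < 2 ** n:
            return n
    raise ValueError("no such n below limit")

print(smallest_n())  # 15
```

At n = 14 the comparison is 19600 against 16384, so the quadratic algorithm is still slower; at n = 15 it is 22500 against 32768.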

For each function $f(n)$ and time $t$ in the following table, determine the largest size $n$ of a problem that can be solved in time $t$, assuming that the algorithm to solve the problem takes $f(n)$ microseconds.

2-1

Although merge sort runs in $\Theta(n\lg n)$ worst-case time and insertion sort runs in $\Theta(n^2)$ worst-case time, the constant factors in insertion sort can make it faster in practice for small problem sizes on many machines. Thus, it makes sense to coarsen the leaves of the recursion by using insertion sort within merge sort when subproblems become sufficiently small. Consider a modification to merge sort in which $n / k$ sublists of length $k$ are sorted using insertion sort and then merged using the standard merging mechanism, where $k$ is a value to be determined.

a. Show that insertion sort can sort the $n / k$ sublists, each of length $k$, in $\Theta(nk)$ worst-case time.

b. Show how to merge the sublists in $\Theta(n\lg(n / k))$ worst-case time.

c. Given that the modified algorithm runs in $\Theta(nk + n\lg(n / k))$ worst-case time, what is the largest value of $k$ as a function of $n$ for which the modified algorithm has the same running time as standard merge sort, in terms of $\Theta$-notation?

d. How should we choose $k$ in practice?

a. Insertion sort takes $\Theta(k^2)$ time per $k$-element list in the worst case. Therefore, sorting $n / k$ lists of $k$ elements each takes $\Theta(k^2 n / k) = \Theta(nk)$ worst-case time.

b. Just extending the 2-list merge to merge all the lists at once would take $\Theta(n \cdot (n / k)) = \Theta(n^2 / k)$ time ($n$ from copying each element once into the result list, $n / k$ from examining $n / k$ lists at each step to select the next item for the result list).

To achieve $\Theta(n\lg(n / k))$-time merging, we merge the lists pairwise, then merge the resulting lists pairwise, and so on, until there’s just one list. The pairwise merging requires $\Theta(n)$ work at each level, since we are still working on $n$ elements, even if they are partitioned among sublists. The number of levels, starting with $n / k$ lists (with $k$ elements each) and finishing with 1 list (with $n$ elements), is $\lceil \lg(n / k) \rceil$. Therefore, the total running time for the merging is $\Theta(n\lg(n / k))$.

c. The modified algorithm has the same asymptotic running time as standard merge sort when $\Theta(nk + n\lg(n / k)) = \Theta(n\lg n)$. The largest asymptotic value of $k$ as a function of $n$ that satisfies this condition is $k = \Theta(\lg n)$.

To see why, first observe that $k$ cannot be more than $\Theta(\lg n)$ (i.e., it can’t have a higher-order term than $\lg n$), for otherwise the left-hand expression wouldn’t be $\Theta(n\lg n)$ (because it would have a higher-order term than $n\lg n$). So all we need to do is verify that $k = \Theta(\lg n)$ works, which we can do by plugging $k = \lg n$ into

$$\Theta(nk + n\lg(n / k)) = \Theta(nk + n\lg n - n\lg k)$$

to get

$$\Theta(n\lg n + n\lg n - n\lg\lg n) = \Theta(2n\lg n - n\lg\lg n),$$

which, by taking just the high-order term and ignoring the constant coefficient, equals $\Theta(n\lg n)$.

d. In practice, $k$ should be the largest list length on which insertion sort is faster than merge sort.
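The idea of coarsening the leaves can be sketched in Python. This is an illustrative top-down variant, not the exact scheme of the problem: it switches to insertion sort once a subarray has at most `k` elements, rather than forming exactly n/k sublists of length k. The names `insertion_sort`, `hybrid_merge_sort`, and the default `k=16` are assumptions; in practice the cutoff would be tuned by measurement, and `k` must be at least 1.

```python
def insertion_sort(a, lo, hi):
    """Sort the slice a[lo:hi] in place with insertion sort."""
    for j in range(lo + 1, hi):
        key = a[j]
        i = j - 1
        while i >= lo and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key

def hybrid_merge_sort(a, lo=0, hi=None, k=16):
    """Sort a[lo:hi] in place; subarrays of at most k elements use insertion sort."""
    if hi is None:
        hi = len(a)
    if hi - lo <= k:          # small subproblem: hand over to insertion sort
        insertion_sort(a, lo, hi)
        return
    mid = (lo + hi) // 2
    hybrid_merge_sort(a, lo, mid, k)
    hybrid_merge_sort(a, mid, hi, k)
    left, right = a[lo:mid], a[mid:hi]
    i = j = 0
    for idx in range(lo, hi):
        # Take from left while it still holds the smaller (or equal) front element.
        if j == len(right) or (i < len(left) and left[i] <= right[j]):
            a[idx] = left[i]
            i += 1
        else:
            a[idx] = right[j]
            j += 1
```

Calling `hybrid_merge_sort(data, k=1)` behaves like plain merge sort, while a `k` larger than `len(data)` reduces to a single insertion sort.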

2-2

Bubblesort is a popular, but inefficient, sorting algorithm. It works by repeatedly swapping adjacent elements that are out of order.

BUBBLESORT(A)
    for i = 1 to A.length - 1 
        for j = A.length downto i + 1 
            if A[j] < A[j - 1]
                exchange A[j] with A[j - 1]

a. Let $A'$ denote the output of $\text{BUBBLESORT}(A)$. To prove that $\text{BUBBLESORT}$ is correct, we need to prove that it terminates and that

$$A'[1] \le A'[2] \le \cdots \le A'[n], \tag{2.3}$$

where $n = A.length$. In order to show that $\text{BUBBLESORT}$ actually sorts, what else do we need to prove?

The next two parts will prove inequality $\text{(2.3)}$.

b. State precisely a loop invariant for the for loop in lines 2–4, and prove that this loop invariant holds. Your proof should use the structure of the loop invariant proof presented in this chapter.

c. Using the termination condition of the loop invariant proved in part (b), state a loop invariant for the for loop in lines 1–4 that will allow you to prove inequality $\text{(2.3)}$. Your proof should use the structure of the loop invariant proof presented in this chapter.

d. What is the worst-case running time of bubblesort? How does it compare to the running time of insertion sort?

a. We need to show that the elements of $A'$ form a permutation of the elements of $A$.

b. Loop invariant: At the start of each iteration of the for loop of lines 2–4, $A[j] = \min\{A[k] : j \le k \le n\}$ and the subarray $A[j..n]$ is a permutation of the values that were in $A[j..n]$ at the time that the loop started.

Initialization: Initially, $j = n$, and the subarray $A[j..n]$ consists of the single element $A[n]$. The loop invariant trivially holds.

Maintenance: Consider an iteration for a given value of $j$. By the loop invariant, $A[j]$ is the smallest value in $A[j..n]$. Lines 3–4 exchange $A[j]$ and $A[j - 1]$ if $A[j]$ is less than $A[j - 1]$, and so $A[j - 1]$ will be the smallest value in $A[j - 1..n]$ afterward. Since the only change to the subarray $A[j - 1..n]$ is this possible exchange, and the subarray $A[j..n]$ is a permutation of the values that were in $A[j..n]$ at the time that the loop started, we see that $A[j - 1..n]$ is a permutation of the values that were in $A[j - 1..n]$ at the time that the loop started. Decrementing $j$ for the next iteration maintains the invariant.

Termination: The loop terminates when $j$ reaches $i$. By the statement of the loop invariant, $A[i] = \min\{A[k] : i \le k \le n\}$ and $A[i..n]$ is a permutation of the values that were in $A[i..n]$ at the time that the loop started.

c. Loop invariant: At the start of each iteration of the for loop of lines 1–4, the subarray $A[1..i - 1]$ consists of the $i - 1$ smallest values originally in $A[1..n]$, in sorted order, and $A[i..n]$ consists of the $n - i + 1$ remaining values originally in $A[1..n]$.

Initialization: Before the first iteration of the loop, $i = 1$. The subarray $A[1..i - 1]$ is empty, and so the loop invariant vacuously holds.

Maintenance: Consider an iteration for a given value of $i$. By the loop invariant, $A[1..i - 1]$ consists of the $i - 1$ smallest values in $A[1..n]$, in sorted order. Part (b) showed that after executing the for loop of lines 2–4, $A[i]$ is the smallest value in $A[i..n]$, and so $A[1..i]$ is now the $i$ smallest values originally in $A[1..n]$, in sorted order. Moreover, since the for loop of lines 2–4 permutes $A[i..n]$, the subarray $A[i + 1..n]$ consists of the $n - i$ remaining values originally in $A[1..n]$.

Termination: The for loop of lines 1–4 terminates when $i = n$, so that $i - 1 = n - 1$. By the statement of the loop invariant, $A[1..i - 1]$ is the subarray $A[1..n - 1]$, and it consists of the $n - 1$ smallest values originally in $A[1..n]$, in sorted order. The remaining element must be the largest value in $A[1..n]$, and it is in $A[n]$. Therefore, the entire array $A[1..n]$ is sorted.

Note: In the second edition, the for loop of lines 1–4 had an upper bound of $A.length$. The last iteration of the outer for loop would then result in no iterations of the inner for loop of lines 2–4, but the termination argument would simplify: $A[1..i - 1]$ would be the entire array $A[1..n]$, which, by the loop invariant, is sorted.

d. The running time depends on the number of iterations of the for loop of lines 2–4. For a given value of $i$, this loop makes $n - i$ iterations, and $i$ takes on the values $1, 2, \ldots, n - 1$. The total number of iterations, therefore, is

\begin{align}
\sum_{i = 1}^{n - 1} (n - i)
& = \sum_{i = 1}^{n - 1} n - \sum_{i = 1}^{n - 1} i \\
& = n(n - 1) - \frac{n(n - 1)}{2} \\
& = \frac{n(n - 1)}{2} \\
& = \frac{n^2}{2} - \frac{n}{2}.
\end{align}

Thus, the running time of bubblesort is $\Theta(n^2)$ in all cases. The worst-case running time is the same as that of insertion sort.
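The iteration count can be observed directly. Below is a minimal 0-indexed Python sketch of the bubblesort pseudocode; returning the inner-loop iteration count is an addition for illustration only.

```python
def bubblesort(a):
    """Sort a in place and return the number of inner-loop iterations."""
    n = len(a)
    iterations = 0
    for i in range(n - 1):               # i = 1 .. n-1 in the pseudocode
        for j in range(n - 1, i, -1):    # j = n downto i+1 in the pseudocode
            iterations += 1
            if a[j] < a[j - 1]:
                a[j], a[j - 1] = a[j - 1], a[j]
    return iterations
```

For a 5-element list the function returns 10, that is n(n - 1)/2, whether the input is already sorted or reversed, which matches the claim that the running time is quadratic in all cases.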

2-3

The following code fragment implements Horner’s rule for evaluating a polynomial

\begin{align}
P(x) & = \sum_{k = 0}^n a_k x^k \\
& = a_0 + x(a_1 + x (a_2 + \cdots + x(a_{n - 1} + x a_n) \cdots)),
\end{align}

given the coefficients $a_0, a_1, \ldots, a_n$ and a value of $x$:

y = 0
for i = n downto 0
    y = a[i] + x * y

a. In terms of $\Theta$-notation, what is the running time of this code fragment for Horner’s rule?

b. Write pseudocode to implement the naive polynomial-evaluation algorithm that computes each term of the polynomial from scratch. What is the running time of this algorithm? How does it compare to Horner’s rule?

c. Consider the following loop invariant: At the start of each iteration of the for loop of lines 2–3,

$$y = \sum_{k = 0}^{n - (i + 1)} a_{k + i + 1} x^k.$$

Interpret a summation with no terms as equaling 0. Following the structure of the loop invariant proof presented in this chapter, use this loop invariant to show that, at termination, $y = \sum_{k = 0}^n a_k x^k$.

d. Conclude by arguing that the given code fragment correctly evaluates a polynomial characterized by the coefficients $a_0, a_1, \ldots, a_n$.

a. $\Theta(n)$.

b.

NAIVE-HORNER()
    y = 0
    for k = 0 to n
        temp = 1
        for i = 1 to k
            temp = temp * x
        y = y + a[k] * temp

The running time is $\Theta(n^2)$ because of the nested loops, so it is asymptotically slower than Horner’s rule, which takes $\Theta(n)$ time.

c. Initialization: Before the first iteration, $i = n$, so the summation has no terms, which implies $y = 0$. This matches the initial value of $y$.

Maintenance: By the loop invariant, at the end of the iteration for a given value of $i$ we have

\begin{align}
y & = a_i + x \sum_{k = 0}^{n - (i + 1)} a_{k + i + 1} x^k \\
& = a_i x^0 + \sum_{k = 0}^{n - i - 1} a_{k + i + 1} x^{k + 1} \\
& = a_i x^0 + \sum_{k = 1}^{n - i} a_{k + i} x^k \\
& = \sum_{k = 0}^{n - i} a_{k + i} x^k.
\end{align}

Termination: The loop terminates at $i = -1$. Substituting $i = -1$ into the invariant gives

$$y = \sum_{k = 0}^{n - i - 1} a_{k + i + 1} x^k = \sum_{k = 0}^n a_k x^k.$$

d. By the loop invariant, at termination $y$ equals a sum that is exactly the polynomial with the given coefficients, so the code fragment correctly evaluates it.
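Both evaluation strategies can be compared side by side. The sketch below assumes the coefficients are passed as a Python list `a` with `a[k]` holding the coefficient of x^k; the function names `horner` and `naive_eval` are illustrative.

```python
def horner(a, x):
    """Evaluate sum(a[k] * x**k) with Horner's rule in Theta(n) time."""
    y = 0
    for i in range(len(a) - 1, -1, -1):   # i = n downto 0
        y = a[i] + x * y
    return y

def naive_eval(a, x):
    """Evaluate the same polynomial, recomputing each power from scratch."""
    y = 0
    for k in range(len(a)):
        term = 1
        for _ in range(k):                # k multiplications to build x**k
            term *= x
        y += a[k] * term
    return y

print(horner([1, 2, 3], 2), naive_eval([1, 2, 3], 2))  # 17 17
```

The two functions return the same value; only the number of multiplications differs (linear against quadratic in the degree).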

2-4

Let $A[1..n]$ be an array of $n$ distinct numbers. If $i < j$ and $A[i] > A[j]$, then the pair $(i, j)$ is called an inversion of $A$.

a. List the five inversions in the array $\langle 2, 3, 8, 6, 1 \rangle$.

b. What array with elements from the set $\{1, 2, \ldots, n\}$ has the most inversions? How many does it have?

c. What is the relationship between the running time of insertion sort and the number of inversions in the input array? Justify your answer.

d. Give an algorithm that determines the number of inversions in any permutation of $n$ elements in $\Theta(n\lg n)$ worst-case time. (Hint: Modify merge sort.)

a. The inversions are $(1, 5)$, $(2, 5)$, $(3, 4)$, $(3, 5)$, $(4, 5)$. (Remember that inversions are specified by indices rather than by the values in the array.)

b. The array with elements from $\{1, 2, \ldots, n\}$ with the most inversions is $\langle n, n - 1, n - 2, \ldots, 2, 1 \rangle$. For all $1 \le i < j \le n$, there is an inversion $(i, j)$. The number of such inversions is $\binom{n}{2} = n(n - 1) / 2$.

c. Suppose that the array $A$ starts out with an inversion $(k, j)$. Then $k < j$ and $A[k] > A[j]$. At the time that the outer for loop of lines 1–8 sets $key = A[j]$, the value that started in $A[k]$ is still somewhere to the left of $A[j]$. That is, it’s in $A[i]$, where $1 \le i < j$, and so the inversion has become $(i, j)$. Some iteration of the while loop of lines 5–7 moves $A[i]$ one position to the right. Line 8 will eventually drop $key$ to the left of this element, thus eliminating the inversion. Because line 5 moves only elements that are greater than $key$, it moves only elements that correspond to inversions. In other words, each iteration of the while loop of lines 5–7 corresponds to the elimination of one inversion.

d. We follow the hint and modify merge sort to count the number of inversions in $\Theta(n\lg n)$ time.

To start, let us define a merge-inversion as a situation within the execution of merge sort in which the $\text{MERGE}$ procedure, after copying $A[p..q]$ to $L$ and $A[q + 1..r]$ to $R$, has values $x$ in $L$ and $y$ in $R$ such that $x > y$. Consider an inversion $(i, j)$, and let $x = A[i]$ and $y = A[j]$, so that $i < j$ and $x > y$. We claim that if we were to run merge sort, there would be exactly one merge-inversion involving $x$ and $y$. To see why, observe that the only way in which array elements change their positions is within the $\text{MERGE}$ procedure. Moreover, since $\text{MERGE}$ keeps elements within $L$ in the same relative order to each other, and correspondingly for $R$, the only way in which two elements can change their ordering relative to each other is for the greater one to appear in $L$ and the lesser one to appear in $R$. Thus, there is at least one merge-inversion involving $x$ and $y$. To see that there is exactly one such merge-inversion, observe that after any call of $\text{MERGE}$ that involves both $x$ and $y$, they are in the same sorted subarray and will therefore both appear in $L$ or both appear in $R$ in any given call thereafter. Thus, we have proven the claim.

We have shown that every inversion implies one merge-inversion. In fact, the correspondence between inversions and merge-inversions is one-to-one. Suppose we have a merge-inversion involving values $x$ and $y$, where $x$ originally was $A[i]$ and $y$ was originally $A[j]$. Since we have a merge-inversion, $x > y$. And since $x$ is in $L$ and $y$ is in $R$, $x$ must be within a subarray preceding the subarray containing $y$. Therefore $x$ started out in a position $i$ preceding $y$'s original position $j$, and so $(i, j)$ is an inversion.

Having shown a one-to-one correspondence between inversions and merge-inversions, it suffices for us to count merge-inversions.

Consider a merge-inversion involving $y$ in $R$. Let $z$ be the smallest value in $L$ that is greater than $y$. At some point during the merging process, $z$ and $y$ will be the “exposed” values in $L$ and $R$, i.e., we will have $z = L[i]$ and $y = R[j]$ in line 13 of $\text{MERGE}$. At that time, there will be merge-inversions involving $y$ and $L[i], L[i + 1], L[i + 2], \ldots, L[n_1]$, and these $n_1 - i + 1$ merge-inversions will be the only ones involving $y$. Therefore, we need to detect the first time that $z$ and $y$ become exposed during the $\text{MERGE}$ procedure and add the value of $n_1 - i + 1$ at that time to our total count of merge-inversions.

The following pseudocode, modeled on merge sort, works as we have just described. It also sorts the array $A$.

COUNT-INVERSIONS(A, p, r)
    inversions = 0
    if p < r
        q = floor((p + r) / 2)
        inversions = inversions + COUNT-INVERSIONS(A, p, q)
        inversions = inversions + COUNT-INVERSIONS(A, q + 1, r)
        inversions = inversions + MERGE-INVERSIONS(A, p, q, r)
    return inversions

MERGE-INVERSIONS(A, p, q, r)
    n[1] = q - p + 1
    n[2] = r - q
    let L[1..n[1] + 1] and R[1..n[2] + 1] be new arrays
    for i = 1 to n[1]
        L[i] = A[p + i - 1]
    for j = 1 to n[2]
        R[j] = A[q + j]
    L[n[1] + 1] = ∞
    R[n[2] + 1] = ∞
    i = 1
    j = 1
    inversions = 0
    for k = p to r
        if R[j] < L[i]
            inversions = inversions + n[1] - i + 1
            A[k] = R[j]
            j = j + 1
        else A[k] = L[i]
            i = i + 1
    return inversions

The initial call is $\text{COUNT-INVERSIONS}(A, 1, n)$.

In $\text{MERGE-INVERSIONS}$, whenever $R[j]$ is exposed and a value greater than $R[j]$ becomes exposed in the $L$ array, we increase $inversions$ by the number of remaining elements in $L$. Then because $R[j + 1]$ becomes exposed, $R[j]$ can never be exposed again. We don’t have to worry about merge-inversions involving the sentinel $\infty$ in $R$, since no value in $L$ will be greater than $\infty$.

Since we have added only a constant amount of additional work to each procedure call and to each iteration of the last for loop of the merging procedure, the total running time of the above pseudocode is the same as for merge sort: $\Theta(n\lg n)$.
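The counting scheme translates to Python fairly directly. The sketch below is an assumption-laden variant rather than a line-by-line port: it avoids sentinels, works on list slices, and returns a sorted copy together with the count instead of sorting in place. The name `count_inversions` is illustrative.

```python
def count_inversions(a):
    """Return (sorted copy of a, number of inversions in a)."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, inv_left = count_inversions(a[:mid])
    right, inv_right = count_inversions(a[mid:])
    merged = []
    inversions = inv_left + inv_right
    i = j = 0
    while i < len(left) and j < len(right):
        if right[j] < left[i]:
            # right[j] is smaller than every remaining element of left.
            inversions += len(left) - i
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inversions

print(count_inversions([2, 3, 8, 6, 1])[1])  # 5
```

The example array is the one from part (a), and a fully reversed array of 5 elements yields 10 inversions, matching n(n - 1)/2 from part (b).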

2.1-1

Using Figure 2.2 as a model, illustrate the operation of $\text{INSERTION-SORT}$ on the array $A = \langle 31, 41, 59, 26, 41, 58 \rangle$.

\begin{align}
A & = \langle 31, 41, 59, 26, 41, 58 \rangle \\
A & = \langle 31, 41, 59, 26, 41, 58 \rangle \\
A & = \langle 31, 41, 59, 26, 41, 58 \rangle \\
A & = \langle 26, 31, 41, 59, 41, 58 \rangle \\
A & = \langle 26, 31, 41, 41, 59, 58 \rangle \\
A & = \langle 26, 31, 41, 41, 58, 59 \rangle
\end{align}

2.1-2

Rewrite the $\text{INSERTION-SORT}$ procedure to sort into nonincreasing instead of nondecreasing order.

INSERTION-SORT(A) 
    for j = 2 to A.length
        key = A[j]
        i = j - 1
        while i > 0 and A[i] < key
            A[i + 1] = A[i]
            i = i - 1
        A[i + 1] = key

2.1-3

Consider the searching problem:

Input: A sequence of $n$ numbers $A = \langle a_1, a_2, \ldots, a_n \rangle$ and a value $v$.

Output: An index $i$ such that $v = A[i]$ or the special value $\text{NIL}$ if $v$ does not appear in $A$.

Write pseudocode for linear search, which scans through the sequence, looking for $v$. Using a loop invariant, prove that your algorithm is correct. Make sure that your loop invariant fulfills the three necessary properties.

LINEAR-SEARCH(A, v)
    for i = 1 to A.length
        if A[i] == v
            return i
    return NIL

Loop invariant: At the start of each iteration of the for loop, the subarray $A[1..i - 1]$ consists of elements that are different from $v$.

Initialization: Initially the subarray is empty, so the invariant holds trivially.

Maintenance: On each step, we know that $A[1..i - 1]$ does not contain $v$. We compare $v$ with $A[i]$. If they are the same, we return $i$, which is a correct result. Otherwise, we continue to the next step. We have already ensured that $A[1..i - 1]$ does not contain $v$ and that $A[i]$ is different from $v$, so this step preserves the invariant.

Termination: The loop terminates when $i > A.length$. Since $i$ increases by 1 on each iteration and $i > A.length$, we know that all the elements in $A$ have been checked and that $v$ is not among them. Thus, we return $\text{NIL}$.

2.1-4

Consider the problem of adding two $n$-bit binary integers, stored in two $n$-element arrays $A$ and $B$. The sum of the two integers should be stored in binary form in an $(n + 1)$-element array $C$. State the problem formally and write pseudocode for adding the two integers.

Input: An array of booleans $A = \langle a_1, a_2, \ldots, a_n \rangle$ and an array of booleans $B = \langle b_1, b_2, \ldots, b_n \rangle$, each representing an integer stored in binary format (each digit is a number, either 0 or 1, least-significant digit first) and each of length $n$.

Output: An array $C = \langle c_1, c_2, \ldots, c_{n + 1} \rangle$ such that $C' = A' + B'$, where $A'$, $B'$ and $C'$ are the integers represented by $A$, $B$ and $C$.

ADD-BINARY(A, B)
    C = new integer[A.length + 1]
    carry = 0
    for i = 1 to A.length
        C[i] = (A[i] + B[i] + carry) % 2          // remainder
        carry = floor((A[i] + B[i] + carry) / 2)  // quotient
    C[A.length + 1] = carry
    return C
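A runnable counterpart of the pseudocode, as a sketch: bits are stored least-significant first in 0-indexed Python lists, and the function name `add_binary` mirrors the pseudocode but is otherwise an illustrative choice.

```python
def add_binary(a, b):
    """Add two n-bit numbers given as lists of bits, least-significant bit first."""
    n = len(a)
    c = [0] * (n + 1)
    carry = 0
    for i in range(n):
        total = a[i] + b[i] + carry
        c[i] = total % 2        # remainder becomes the output bit
        carry = total // 2      # quotient is carried to the next position
    c[n] = carry                # final carry fills the extra (n+1)-th bit
    return c

print(add_binary([1, 1], [1, 0]))  # [0, 0, 1]
```

The example adds 3 and 1 (both written low bit first) and produces 4.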

2.2-1

Express the function $n^3 / 1000 - 100n^2 - 100n + 3n$ in terms of $\Theta$-notation.

$\Theta(n^3)$.

2.2-2

Consider sorting $n$ numbers stored in array $A$ by first finding the smallest element of $A$ and exchanging it with the element in $A[1]$. Then find the second smallest element of $A$, and exchange it with $A[2]$. Continue in this manner for the first $n - 1$ elements of $A$. Write pseudocode for this algorithm, which is known as selection sort. What loop invariant does this algorithm maintain? Why does it need to run for only the first $n - 1$ elements, rather than for all $n$ elements? Give the best-case and worst-case running times of selection sort in $\Theta$-notation.

SELECTION-SORT(A)
    n = A.length
    for j = 1 to n - 1
        smallest = j
        for i = j + 1 to n
            if A[i] < A[smallest]
                smallest = i
        exchange A[j] with A[smallest]

The algorithm maintains the loop invariant that at the start of each iteration of the outer for loop, the subarray $A[1..j - 1]$ consists of the $j - 1$ smallest elements in the array $A[1..n]$, and this subarray is in sorted order. After the first $n - 1$ elements, the subarray $A[1..n - 1]$ contains the smallest $n - 1$ elements, sorted, and therefore element $A[n]$ must be the largest element.

The running time of the algorithm is $\Theta(n^2)$ in all cases, so both the best-case and worst-case running times are $\Theta(n^2)$.
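To see that the input order does not change the amount of work, here is a minimal 0-indexed Python sketch of selection sort. Returning the comparison count is an addition for illustration, and the name `selection_sort` is hypothetical.

```python
def selection_sort(a):
    """Sort a in place and return the number of element comparisons made."""
    n = len(a)
    comparisons = 0
    for j in range(n - 1):              # only the first n-1 positions are filled
        smallest = j
        for i in range(j + 1, n):
            comparisons += 1
            if a[i] < a[smallest]:
                smallest = i
        a[j], a[smallest] = a[smallest], a[j]
    return comparisons
```

On any 4-element input the function performs 6 comparisons, that is n(n - 1)/2, regardless of whether the list is sorted or reversed.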

2.2-3

Consider linear search again (see Exercise 2.1-3). How many elements of the input sequence need to be checked on the average, assuming that the element being searched for is equally likely to be any element in the array? How about in the worst case? What are the average-case and worst-case running times of linear search in $\Theta$-notation? Justify your answers.

If the element is present in the sequence, about half of the elements are checked before it is found in the average case. In the worst case, all of them are checked. That is, about $n / 2$ checks (more precisely, $(n + 1) / 2$) for the average case and $n$ for the worst case. Both of them are $\Theta(n)$.

2.2-4

How can we modify almost any algorithm to have a good best-case running time?

Modify the algorithm so it tests whether the input satisfies some special-case condition and, if it does, output a pre-computed answer. The best-case running time is generally not a good measure of an algorithm.

2.3-1

Using Figure 2.4 as a model, illustrate the operation of merge sort on the array $A = \langle 3, 41, 52, 26, 38, 57, 9, 49 \rangle$.

$$[3] \quad [41] \quad [52] \quad [26] \quad [38] \quad [57] \quad [9] \quad [49]$$

$$\downarrow$$

$$[3|41] \quad [26|52] \quad [38|57] \quad [9|49]$$

$$\downarrow$$

$$[3|26|41|52] \quad [9|38|49|57]$$

$$\downarrow$$

$$[3|9|26|38|41|49|52|57]$$

2.3-2

Rewrite the $\text{MERGE}$ procedure so that it does not use sentinels, instead stopping once either array $L$ or $R$ has had all its elements copied back to $A$ and then copying the remainder of the other array back into $A$.

MERGE(A, p, q, r)
    n[1] = q - p + 1
    n[2] = r - q
    let L[1..n[1]] and R[1..n[2]] be new arrays
    for i = 1 to n[1]
        L[i] = A[p + i - 1]
    for j = 1 to n[2]
        R[j] = A[q + j]
    i = 1
    j = 1
    for k = p to r
        if i > n[1]
            A[k] = R[j]
            j = j + 1
        else if j > n[2]
            A[k] = L[i]
            i = i + 1
        else if L[i] ≤ R[j]
            A[k] = L[i]
            i = i + 1
        else
            A[k] = R[j]
            j = j + 1
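A Python sketch of a sentinel-free merge follows. It takes the exercise wording literally (stop as soon as one side is exhausted, then copy the remainder), which differs slightly in structure from the single for loop in the pseudocode. Indices are 0-based and inclusive, an assumption made for this example.

```python
def merge(a, p, q, r):
    """Merge sorted runs a[p..q] and a[q+1..r] (inclusive, 0-based) in place."""
    left = a[p:q + 1]
    right = a[q + 1:r + 1]
    i = j = 0
    k = p
    # Main loop stops as soon as either run is exhausted.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1
        k += 1
    # Exactly one of these slices is non-empty; copy the remainder back.
    a[k:r + 1] = left[i:] + right[j:]
```

For instance, merging positions 1 through 6 of `[9, 1, 4, 7, 2, 3, 8, 0]` with `q = 3` leaves the elements outside that range untouched.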

2.3-3

Use mathematical induction to show that when $n$ is an exact power of 2, the solution of the recurrence

$$T(n) = \begin{cases} 2 & \text{if } n = 2, \\ 2T(n / 2) + n & \text{if } n = 2^k, \text{ for } k > 1 \end{cases}$$

is $T(n) = n\lg n$.

The base case is when $n = 2$, and we have $n\lg n = 2\lg 2 = 2 \cdot 1 = 2$.

For the inductive step, our inductive hypothesis is that $T(n / 2) = (n / 2)\lg(n / 2)$. Then

\begin{align}
T(n) & = 2T(n / 2) + n \\
& = 2(n / 2) \lg(n / 2) + n \\
& = n(\lg n - 1) + n \\
& = n\lg n - n + n \\
& = n\lg n,
\end{align}

which completes the inductive proof for exact powers of 2.
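The closed form can also be spot-checked by evaluating the recurrence used in the inductive step, T(n) = 2T(n/2) + n with T(2) = 2. This is a sanity check under that assumption, not a substitute for the proof.

```python
def T(n):
    """Evaluate T(2) = 2, T(n) = 2*T(n/2) + n for n an exact power of 2."""
    return 2 if n == 2 else 2 * T(n // 2) + n

# For n = 2^k we have lg n = k, so n lg n = n * k exactly.
for k in range(1, 11):
    n = 2 ** k
    assert T(n) == n * k
```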

2.3-4

We can express insertion sort as a recursive procedure as follows. In order to sort $A[1..n]$, we recursively sort $A[1..n - 1]$ and then insert $A[n]$ into the sorted array $A[1..n - 1]$. Write a recurrence for the running time of this recursive version of insertion sort.

Since it takes $\Theta(n)$ time in the worst case to insert $A[n]$ into the sorted array $A[1..n - 1]$, we get the recurrence

$$T(n) = \begin{cases} \Theta(1) & \text{if } n = 1, \\ T(n - 1) + \Theta(n) & \text{if } n > 1. \end{cases}$$

Although the exercise does not ask you to solve this recurrence, its solution is $T(n) = \Theta(n^2)$.

2.3-5

Referring back to the searching problem (see Exercise 2.1-3), observe that if the sequence $A$ is sorted, we can check the midpoint of the sequence against $v$ and eliminate half of the sequence from further consideration. The binary search algorithm repeats this procedure, halving the size of the remaining portion of the sequence each time. Write pseudocode, either iterative or recursive, for binary search. Argue that the worst-case running time of binary search is $\Theta(\lg n)$.

Procedure $\text{BINARY-SEARCH}$ takes a sorted array $A$, a value $v$, and a range $[low..high]$ of the array, in which we search for the value $v$. The procedure compares $v$ to the array entry at the midpoint of the range and decides to eliminate half the range from further consideration. We give both iterative and recursive versions, each of which returns either an index $i$ such that $A[i] = v$, or $\text{NIL}$ if no entry of $A[low..high]$ contains the value $v$. The initial call to either version should have the parameters $A$, $v$, $1$, $n$.

ITERATIVE-BINARY-SEARCH(A, v, low, high)
    while low ≤ high
        mid = floor((low + high) / 2)
        if v == A[mid]
            return mid
        else if v > A[mid]
            low = mid + 1
        else high = mid - 1
    return NIL

RECURSIVE-BINARY-SEARCH(A, v, low, high)
    if low > high
        return NIL
    mid = floor((low + high) / 2)
    if v == A[mid]
        return mid
    else if v > A[mid]
        return RECURSIVE-BINARY-SEARCH(A, v, mid + 1, high)
    else return RECURSIVE-BINARY-SEARCH(A, v, low, mid - 1)

Both procedures terminate the search unsuccessfully when the range is empty (i.e., $low > high$) and terminate it successfully if the value $v$ has been found. Based on the comparison of $v$ to the middle element in the searched range, the search continues with the range halved. The recurrence for these procedures is therefore $T(n) = T(n / 2) + \Theta(1)$, whose solution is $T(n) = \Theta(\lg n)$.
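For readers who want something runnable, here is a minimal Python sketch of the iterative version. It assumes 0-based indexing and returns `None` in place of NIL; the name `binary_search` is chosen for this illustration only.

```python
def binary_search(a, v):
    """Return an index i with a[i] == v in the sorted list a, or None."""
    low, high = 0, len(a) - 1
    while low <= high:
        mid = (low + high) // 2      # floor of the midpoint
        if a[mid] == v:
            return mid
        elif v > a[mid]:
            low = mid + 1            # discard the left half
        else:
            high = mid - 1           # discard the right half
    return None                      # empty range: v is not present
```

Each iteration halves the range `[low, high]`, so the loop body runs about $\lg n$ times in the worst case.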

2.3-6

Observe that the while loop of lines 5–7 of the $\text{INSERTION-SORT}$ procedure in Section 2.1 uses a linear search to scan (backward) through the sorted subarray $A[1..j - 1]$. Can we use a binary search (see Exercise 2.3-5) instead to improve the overall worst-case running time of insertion sort to $\Theta(n\lg n)$?

The while loop of lines 5–7 of procedure $\text{INSERTION-SORT}$ scans backward through the sorted array $A[1..j - 1]$ to find the appropriate place for $A[j]$. The hitch is that the loop not only searches for the proper place for $A[j]$, but also moves each of the array elements that are bigger than $A[j]$ one position to the right (line 6). These movements can take as much as $\Theta(j)$ time, which occurs when all the $j - 1$ elements preceding $A[j]$ are larger than $A[j]$. We can use binary search to improve the running time of the search to $\Theta(\lg j)$, but binary search will have no effect on the running time of moving the elements. Therefore, binary search alone cannot improve the worst-case running time of $\text{INSERTION-SORT}$ to $\Theta(n\lg n)$.
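The argument can be checked empirically with a small Python sketch. It uses the standard-library `bisect_right` for the search and counts element moves; the function name `binary_insertion_sort` and the shift counter are assumptions made for this illustration.

```python
from bisect import bisect_right

def binary_insertion_sort(a):
    """Return (sorted copy of a, number of element shifts performed)."""
    a = list(a)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        # Binary search in the sorted prefix a[0..j-1]: O(lg j) comparisons.
        pos = bisect_right(a, key, 0, j)
        # Moving a[pos..j-1] one slot to the right is still linear in j - pos.
        for i in range(j, pos, -1):
            a[i] = a[i - 1]
            shifts += 1
        a[pos] = key
    return a, shifts
```

On a reverse-sorted input of length $n$, every element is shifted past all of its predecessors, so the counter reaches $n(n - 1) / 2$ even though each search is logarithmic.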

2.3-7 $\star$

Describe a $\Theta(n\lg n)$-time algorithm that, given a set $S$ of $n$ integers and another integer $x$, determines whether or not there exist two elements in $S$ whose sum is exactly $x$.

The following algorithm solves the problem:

  1. Sort the elements in $S$.
  2. Form the set $S' = \{z : z = x - y \text{ for some } y \in S\}$.
  3. Sort the elements in $S'$.
  4. Merge the two sorted sets $S$ and $S'$.
  5. There exist two elements in $S$ whose sum is exactly $x$ if and only if the same value appears in consecutive positions in the merged output.

To justify the claim in step 5, first observe that if any value appears twice in the merged output, it must appear in consecutive positions. Thus, we can restate the condition in step 5 as: there exist two elements in $S$ whose sum is exactly $x$ if and only if the same value appears twice in the merged output.

Suppose that some value $w$ appears twice. Then $w$ appeared once in $S$ and once in $S'$. Because $w$ appeared in $S'$, there exists some $y \in S$ such that $w = x - y$, or $x = w + y$. Since $w \in S$, the elements $w$ and $y$ are in $S$ and sum to $x$.

Conversely, suppose that there are values $w, y \in S$ such that $w + y = x$. Then, since $x - y = w$, the value $w$ appears in $S'$. Thus, $w$ is in both $S$ and $S'$, and so it will appear twice in the merged output.

Steps 1 and 3 require $\Theta(n\lg n)$ steps. Steps 2, 4, and 5 require $O(n)$ steps. Thus the overall running time is $O(n\lg n)$.

A reader submitted a simpler solution that also runs in $\Theta(n\lg n)$ time. First, sort the elements in $S$, taking $\Theta(n\lg n)$ time. Then, for each element $y$ in $S$, perform a binary search in $S$ for $x - y$. Each binary search takes $O(\lg n)$ time, and there are at most $n$ of them, and so the time for all the binary searches is $O(n\lg n)$. The overall running time is $\Theta(n\lg n)$.

Another reader pointed out that since $S$ is a set, if the value $x / 2$ appears in $S$, it appears in $S$ just once, and so $x / 2$ cannot be a solution. Either algorithm must therefore ignore a match that pairs $x / 2$ with itself.
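The sort-then-binary-search approach can be sketched in Python as below. The function name `has_pair_with_sum` is an assumption of this sketch; the `j != i` test is one way to handle the $x / 2$ remark, by refusing to pair an element with itself.

```python
from bisect import bisect_left

def has_pair_with_sum(s, x):
    """Return True if two distinct elements of the set s sum to x."""
    a = sorted(set(s))                 # Theta(n lg n); set() enforces distinctness
    for i, y in enumerate(a):
        target = x - y
        j = bisect_left(a, target)     # O(lg n) binary search for x - y
        # A hit at j == i would pair y with itself (the x / 2 case), so skip it.
        if j < len(a) and a[j] == target and j != i:
            return True
    return False
```

Because the elements are distinct after `set()`, a hit at any index other than `i` is guaranteed to be a different element of $S$.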

3.1-1

Let $f(n)$ and $g(n)$ be asymptotically nonnegative functions. Using the basic definition of $\Theta$-notation, prove that $\max(f(n), g(n)) = \Theta(f(n) + g(n))$.

First, let’s clarify what the function $\max(f(n), g(n))$ is. Let’s define the function $h(n) = \max(f(n), g(n))$. Then

$$h(n) = \begin{cases} f(n) & \text{ if } f(n) \ge g(n), \\ g(n) & \text{ if } f(n) < g(n). \end{cases}$$

Since $f(n)$ and $g(n)$ are asymptotically nonnegative, there exists $n_0$ such that $f(n) \ge 0$ and $g(n) \ge 0$ for all $n \ge n_0$. Thus for $n \ge n_0$, $f(n) + g(n) \ge f(n) \ge 0$ and $f(n) + g(n) \ge g(n) \ge 0$. Since for any particular $n$, $h(n)$ is either $f(n)$ or $g(n)$, we have $f(n) + g(n) \ge h(n) \ge 0$, which shows that

$$h(n) = \max(f(n), g(n)) \le c_2(f(n) + g(n))$$

for all $n \ge n_0$ (with $c_2 = 1$ in the definition of $\Theta$).

Similarly, since for any particular $n$, $h(n)$ is the larger of $f(n)$ and $g(n)$, we have for all $n \ge n_0$, $0 \le f(n) \le h(n)$ and $0 \le g(n) \le h(n)$. Adding these two inequalities yields $0 \le f(n) + g(n) \le 2h(n)$, or equivalently $0 \le (f(n) + g(n)) / 2 \le h(n)$, which shows that

$$h(n) = \max(f(n), g(n)) \ge c_1(f(n) + g(n))$$

for all $n \ge n_0$ (with $c_1 = 1 / 2$ in the definition of $\Theta$).

3.1-2

Show that for any real constants $a$ and $b$, where $b > 0$,

$$(n + a)^b = \Theta(n^b). \tag{3.2}$$

To show that $(n + a)^b = \Theta(n^b)$, we want to find constants $c_1, c_2, n_0 > 0$ such that $0 \le c_1 n^b \le (n + a)^b \le c_2 n^b$ for all $n \ge n_0$.

Note that

\begin{align}
n + a & \le n + |a| & \\
& \le 2n & \text{ when } |a| \le n,
\end{align}

and

\begin{align}
n + a & \ge n - |a| & \\
& \ge \frac{1}{2}n & \text{ when } |a| \le \frac{1}{2}n.
\end{align}

Thus, when $n \ge 2|a|$,

$$0 \le \frac{1}{2}n \le n + a \le 2n.$$

Since $b > 0$, the inequality still holds when all parts are raised to the power $b$:

\begin{align}
0 \le \Big(\frac{1}{2}n\Big)^b & \le (n + a)^b \le (2n)^b, \\
0 \le \Big(\frac{1}{2}\Big)^b n^b & \le (n + a)^b \le 2^b n^b.
\end{align}

Thus, $c_1 = (1 / 2)^b$, $c_2 = 2^b$, and $n_0 = 2|a|$ satisfy the definition (take $n_0 = 1$ if $a = 0$, since $n_0$ must be positive).
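As a sanity check rather than a proof, the constants can be tested numerically. The helper `sandwich_holds` below is a name invented for this sketch; it evaluates the three-part inequality for given $a$, $b$, and $n$.

```python
def sandwich_holds(a, b, n):
    """Check 0 <= (1/2)^b n^b <= (n + a)^b <= 2^b n^b for one value of n."""
    c1, c2 = 0.5 ** b, 2.0 ** b
    return 0 <= c1 * n ** b <= (n + a) ** b <= c2 * n ** b

# Example parameters (arbitrary choices): a = -50, b = 2 gives n_0 = 2|a| = 100.
ok_negative_a = all(sandwich_holds(-50, 2, n) for n in range(100, 2000))
# a = 30, b = 0.5 gives n_0 = 60.
ok_positive_a = all(sandwich_holds(30, 0.5, n) for n in range(60, 2000))
```

For negative $a$ the lower bound is tight at $n = n_0$, which shows that $n_0 = 2|a|$ cannot be lowered for this choice of $c_1$.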

3.1-3

Explain why the statement, "The running time of algorithm $A$ is at least $O(n^2)$," is meaningless.

Let the running time be $T(n)$. $T(n) \ge O(n^2)$ means that $T(n) \ge f(n)$ for some function $f(n)$ in the set $O(n^2)$. This statement holds for any running time $T(n)$, since the function $g(n) = 0$ for all $n$ is in $O(n^2)$, and running times are always nonnegative. Thus, the statement tells us nothing about the running time.

3.1-4

Is $2^{n + 1} = O(2^n)$? Is $2^{2n} = O(2^n)$?

$2^{n + 1} = O(2^n)$, but $2^{2n} \ne O(2^n)$.

  • To show that $2^{n + 1} = O(2^n)$, we must find constants $c, n_0 > 0$ such that

    $$0 \le 2^{n + 1} \le c \cdot 2^n \text{ for all } n \ge n_0.$$

    Since $2^{n + 1} = 2 \cdot 2^n$ for all $n$, we can satisfy the definition with $c = 2$ and $n_0 = 1$.

  • To show that $2^{2n} \ne O(2^n)$, assume there exist constants $c, n_0 > 0$ such that

    $$0 \le 2^{2n} \le c \cdot 2^n \text{ for all } n \ge n_0.$$

    Then $2^{2n} = 2^n \cdot 2^n \le c \cdot 2^n \Rightarrow 2^n \le c$. But no constant is greater than all $2^n$, and so the assumption leads to a contradiction.

3.1-5

Prove Theorem 3.1.

The theorem states:

For any two functions $f(n)$ and $g(n)$, we have $f(n) = \Theta(g(n))$ if and only if $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$.

From $f(n) = \Theta(g(n))$, we have that

$$0 \le c_1 g(n) \le f(n) \le c_2 g(n) \text{ for all } n \ge n_0.$$

Using $c_2$ as the constant in the definition of $O$ and $c_1$ as the constant in the definition of $\Omega$, with the same $n_0$, shows that both hold.

Conversely, from $f(n) = \Omega(g(n))$ and $f(n) = O(g(n))$, we have that

\begin{align}
& 0 \le c_3g(n) \le f(n) & \text{ for all } n \ge n_1 \\
\text{and } & 0 \le f(n) \le c_4g(n) & \text{ for all } n \ge n_2.
\end{align}

If we let $n_3 = \max(n_1, n_2)$ and merge the inequalities, we get

$$0 \le c_3 g(n) \le f(n) \le c_4 g(n) \text{ for all } n \ge n_3,$$

which is the definition of $\Theta$.

3.1-6

Prove that the running time of an algorithm is $\Theta(g(n))$ if and only if its worst-case running time is $O(g(n))$ and its best-case running time is $\Omega(g(n))$.

If $T_w$ is the worst-case running time and $T_b$ is the best-case running time, we know that

\begin{align}
& 0 \le c_1g(n) \le T_b(n) & \text{ for } n > n_b \\
\text{and } & 0 \le T_w(n) \le c_2g(n) & \text{ for } n > n_w.
\end{align}

Combining them we get

$$0 \le c_1 g(n) \le T_b(n) \le T_w(n) \le c_2 g(n) \text{ for } n > \max(n_b, n_w).$$

Since the running time on every input is bounded between $T_b$ and $T_w$, the above is exactly the definition of $\Theta$-notation, and the running time is $\Theta(g(n))$. Conversely, if the running time is $\Theta(g(n))$ on every input, then in particular the worst case is $O(g(n))$ and the best case is $\Omega(g(n))$.

3.1-7

Prove that $o(g(n)) \cap \omega(g(n))$ is the empty set.

Suppose some function $f(n)$ belongs to both sets. Then for any constant $c > 0$,

\begin{align}
& \exists n_1 > 0: 0 \le f(n) < cg(n) \text{ for all } n \ge n_1 \\
\text{and } & \exists n_2 > 0: 0 \le cg(n) < f(n) \text{ for all } n \ge n_2.
\end{align}

If we pick $n_0 = \max(n_1, n_2)$, then for all $n \ge n_0$ both inequalities hold at once, and we get

$$f(n) < cg(n) < f(n),$$

which is impossible. No function satisfies both definitions, which means that the intersection is the empty set.

3.1-8

We can extend our notation to the case of two parameters $n$ and $m$ that can go to infinity independently at different rates. For a given function $g(n, m)$ we denote by $O(g(n, m))$ the set of functions:

\begin{align}
O(g(n, m)) = \{f(n, m):
& \text{ there exist positive constants } c, n_0, \text{ and } m_0 \\
& \text{ such that } 0 \le f(n, m) \le cg(n, m) \\
& \text{ for all } n \ge n_0 \text{ or } m \ge m_0.\}
\end{align}

Give corresponding definitions for $\Omega(g(n, m))$ and $\Theta(g(n, m))$.

\begin{align}
\Omega(g(n, m)) = \{f(n, m):
& \text{ there exist positive constants } c, n_0, \text{ and } m_0 \\
& \text{ such that } 0 \le cg(n, m) \le f(n, m) \\
& \text{ for all } n \ge n_0 \text{ or } m \ge m_0.\}
\end{align}

\begin{align}
\Theta(g(n, m)) = \{f(n, m):
& \text{ there exist positive constants } c_1, c_2, n_0, \text{ and } m_0 \\
& \text{ such that } 0 \le c_1 g(n, m) \le f(n, m) \le c_2 g(n, m) \\
& \text{ for all } n \ge n_0 \text{ or } m \ge m_0.\}
\end{align}

3.2-1

Show that if $f(n)$ and $g(n)$ are monotonically increasing functions, then so are the functions $f(n) + g(n)$ and $f(g(n))$, and if $f(n)$ and $g(n)$ are in addition nonnegative, then $f(n) \cdot g(n)$ is monotonically increasing.

\begin{align}
f(m) \le f(n) \quad \text{ for } m \le n \\
g(m) \le g(n) \quad \text{ for } m \le n, \\
\to f(m) + g(m) \le f(n) + g(n),
\end{align}

which proves the claim for $f(n) + g(n)$.

For the composition, we need

$$f(g(m)) \le f(g(n)) \text{ for } m \le n.$$

This is true, since $g(m) \le g(n)$ and $f$ is monotonically increasing.

If both functions are nonnegative, then we can multiply the two inequalities and we get

$$f(m) \cdot g(m) \le f(n) \cdot g(n).$$

3.2-2

Prove equation (3.16).

\begin{align}
a^{\log_b c} = a^\frac{\log_a c}{\log_a b} = (a^{\log_a c})^{\frac{1}{\log_a b}} = c^{\frac{1}{\log_a b}} = c^{\log_b a}
\end{align}
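A quick numeric spot check of the identity, using arbitrary sample values chosen for this sketch ($a = 3$, $b = 2$, $c = 5$); the helper names `log_base` and `both_sides` are illustrative only.

```python
import math

def log_base(b, x):
    """Logarithm of x to base b via the change-of-base rule."""
    return math.log(x) / math.log(b)

def both_sides(a, b, c):
    """Return (a^(log_b c), c^(log_b a)) so the two can be compared."""
    return a ** log_base(b, c), c ** log_base(b, a)

lhs, rhs = both_sides(3.0, 2.0, 5.0)
```

The two values agree up to floating-point rounding, which is why the comparison should use a tolerance such as `math.isclose` rather than `==`.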

3.2-3

Prove equation (3.19). Also prove that $n! = \omega(2^n)$ and $n! = o(n^n)$.

$$\lg(n!) = \Theta(n\lg n) \tag{3.19}$$

We use Stirling’s approximation:

\begin{align}
\lg(n!)
& = \lg\Bigg(\sqrt{2\pi n}\Big(\frac{n}{e}\Big)^n\Big(1 + \Theta(\frac{1}{n})\Big)\Bigg) \\
& = \lg\sqrt{2\pi n } + \lg\Big(\frac{n}{e}\Big)^n + \lg\Big(1+\Theta(\frac{1}{n})\Big) \\
& = \Theta(\lg n) + n\lg{\frac{n}{e}} + \lg\Big(1 + \Theta(\frac{1}{n})\Big) \\
& = \Theta(\lg n) + \Theta(n\lg n) + \Theta(\frac{1}{n}) \\
& = \Theta(n\lg n).
\end{align}

For the other two claims, compare the products factor by factor. For $n \ge 2$,

$$\frac{n!}{2^n} = \frac{1}{2} \cdot \frac{2}{2} \cdot \frac{3}{2} \cdots \frac{n}{2} \ge \frac{1}{2} \cdot \frac{n}{2} = \frac{n}{4},$$

because every factor $i / 2$ with $2 \le i \le n - 1$ is at least $1$. The ratio grows without bound, so $n! = \omega(2^n)$.

Likewise, for $n \ge 1$,

$$\frac{n!}{n^n} = \frac{1}{n} \cdot \frac{2}{n} \cdots \frac{n}{n} \le \frac{1}{n},$$

because every factor is at most $1$. The ratio tends to $0$, so $n! = o(n^n)$.
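Equation (3.19) can also be observed numerically. The sketch below relies on the standard-library fact that `math.lgamma(n + 1)` equals $\ln(n!)$; the function names `lg_factorial` and `ratio` are assumptions of this illustration.

```python
import math

def lg_factorial(n):
    """Base-2 logarithm of n!, computed without forming n! itself."""
    return math.lgamma(n + 1) / math.log(2)

def ratio(n):
    """lg(n!) divided by n lg n; bounded between positive constants for n >= 2."""
    return lg_factorial(n) / (n * math.log2(n))

samples = [ratio(n) for n in (10, 1000, 10**6)]
```

The ratio stays between fixed positive constants and creeps up toward $1$, consistent with $\lg(n!) = \Theta(n\lg n)$; the slow convergence comes from the $-n\lg e$ term in Stirling's approximation.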

3.2-4 $\star$

Is the function $\lceil \lg n \rceil!$ polynomially bounded? Is the function $\lceil \lg\lg n \rceil!$ polynomially bounded?

$\lceil \lg n \rceil!$ is not polynomially bounded, but $\lceil \lg\lg n \rceil!$ is.

Proving that a function $f(n)$ is polynomially bounded is equivalent to proving that $\lg(f(n)) = O(\lg n)$ for the following reasons.

  • If $f$ is polynomially bounded, then there exist constants $c$, $k$, $n_0$ such that for all $n \ge n_0$, $f(n) \le cn^k$. Hence, $\lg(f(n)) \le \lg c + k\lg n$, which, since $c$ and $k$ are constants, means that $\lg(f(n)) = O(\lg n)$.
  • Similarly, if $\lg(f(n)) = O(\lg n)$, then $f$ is polynomially bounded.

In the following proofs, we will make use of the following two facts:

  1. $\lg(n!) = \Theta(n\lg n)$ (by equation (3.19)).
  2. $\lceil \lg n \rceil = \Theta(\lg n)$, because
    • $\lceil \lg n \rceil \ge \lg n$
    • $\lceil \lg n \rceil < \lg n + 1 \le 2\lg n \text{ for all } n \ge 2$

\begin{align}
\lg(\lceil \lg n \rceil!) & = \Theta(\lceil \lg n \rceil \lg \lceil \lg n \rceil) \\
& = \Theta(\lg n\lg\lg n) \\
& = \omega(\lg n).
\end{align}

Therefore, $\lg(\lceil \lg n \rceil!) \ne O(\lg n)$, and so $\lceil \lg n \rceil!$ is not polynomially bounded.

\begin{align}
\lg(\lceil \lg\lg n \rceil!) & = \Theta(\lceil \lg\lg n \rceil \lg \lceil \lg\lg n \rceil) \\
& = \Theta(\lg\lg n\lg\lg\lg n) \\
& = o((\lg\lg n)^2) \\
& = o(\lg^2(\lg n)) \\
& = o(\lg n).
\end{align}

3.2-5 $\star$

Which is asymptotically larger: $\lg(\lg^* n)$ or $\lg^*(\lg n)$?

$\lg^*(\lg n)$ is asymptotically larger because $\lg^*(\lg n) = \lg^* n - 1 = \Theta(\lg^* n)$, while $\lg(\lg^* n)$ grows only logarithmically in $\lg^* n$.

3.2-6

Show that the golden ratio $\phi$ and its conjugate $\hat\phi$ both satisfy the equation $x^2 = x + 1$.

\begin{align}
\phi^2 - \phi - 1
& = \big(\frac{1 + \sqrt 5}{2}\big)^2 - \frac{1 + \sqrt 5}{2} - 1 \\
& = \frac{1 + 2\sqrt 5 + 5 - 2 - 2\sqrt 5 - 4}{4} \\
& = 0.
\end{align}
\begin{align}
\hat\phi^2 - \hat\phi - 1
& = \big(\frac{1 - \sqrt 5}{2}\big)^2 - \frac{1 - \sqrt 5}{2} - 1 \\
& = \frac{1 - 2\sqrt 5 + 5 - 2 + 2\sqrt 5 - 4}{4} \\
& = 0.
\end{align}

3.2-7

Prove by induction that the $i$th Fibonacci number satisfies the equality

$$F_i = \frac{\phi^i - \hat\phi^i}{\sqrt 5},$$

where $\phi$ is the golden ratio and $\hat\phi$ is its conjugate.

We have two base cases: $i = 0$ and $i = 1$. For $i = 0$, we have

\begin{align}
\frac{\phi^0 - \hat\phi^0}{\sqrt 5}
& = \frac{1 - 1}{\sqrt 5} \\
& = 0 \\
& = F_0,
\end{align}

and for $i = 1$, we have

\begin{align}
\frac{\phi^1 - \hat\phi^1}{\sqrt 5}
& = \frac{(1 + \sqrt 5) - (1 - \sqrt 5)}{2\sqrt 5} \\
& = \frac{2\sqrt 5}{2\sqrt 5} \\
& = 1 \\
& = F_1.
\end{align}

For the inductive case, the inductive hypothesis is that $F_{i - 1} = (\phi^{i - 1} - \hat\phi^{i - 1}) / \sqrt 5$ and $F_{i - 2} = (\phi^{i - 2} - \hat\phi^{i - 2}) / \sqrt 5$. We have

\begin{align}
F_i & = F_{i - 1} + F_{i - 2} & \text{(equation (3.22))} \\
& = \frac{\phi^{i - 1} - \hat\phi^{i - 1}}{\sqrt 5} + \frac{\phi^{i - 2} - \hat\phi^{i - 2}}{\sqrt 5} & \text{(inductive hypothesis)} \\
& = \frac{\phi^{i - 2}(\phi + 1) - \hat\phi^{i - 2}(\hat\phi + 1)}{\sqrt 5} \\
& = \frac{\phi^{i - 2}\phi^2 - \hat\phi^{i - 2}\hat\phi^2}{\sqrt 5} & \text{(Exercise 3.2-6)} \\
& = \frac{\phi^i - \hat\phi^i}{\sqrt 5}.
\end{align}
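The closed form can be compared against the recurrence with a short Python sketch. The names `fib_closed_form` and `fib_iterative` are chosen for this illustration, and the rounding step is an assumption needed only because floating-point arithmetic is inexact; it is reliable for small $i$ but not for arbitrarily large ones.

```python
import math

PHI = (1 + math.sqrt(5)) / 2        # golden ratio
PHI_HAT = (1 - math.sqrt(5)) / 2    # its conjugate

def fib_closed_form(i):
    """F_i from the closed form, rounded to absorb floating-point error."""
    return round((PHI ** i - PHI_HAT ** i) / math.sqrt(5))

def fib_iterative(i):
    """F_i from the recurrence F_i = F_{i-1} + F_{i-2}, with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(i):
        a, b = b, a + b
    return a
```

For instance, comparing the two functions over `range(40)` exercises both base cases and the inductive step many times over.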

3.2-8

Show that $k\ln k = \Theta(n)$ implies $k = \Theta(n / \ln n)$.

From the symmetry of $\Theta$,

$$k\ln k = \Theta(n) \Rightarrow n = \Theta(k\ln k).$$

Taking logarithms of both sides,

$$\ln n = \Theta(\ln(k\ln k)) = \Theta(\ln k + \ln\ln k) = \Theta(\ln k).$$

Dividing the first relation by the second,

$$\frac{n}{\ln n} = \frac{\Theta(k\ln k)}{\Theta(\ln k)} = \Theta\Big(\frac{k\ln k}{\ln k}\Big) = \Theta(k),$$

and so, by the symmetry of $\Theta$ again, $k = \Theta(n / \ln n)$.

Returning to Exercise 3.2-4: the last step of the bound on $\lg(\lceil \lg\lg n \rceil!)$ follows from the property that any polylogarithmic function grows more slowly than any positive polynomial function, i.e., that for constants $a, b > 0$, we have $\lg^b n = o(n^a)$. Substitute $\lg n$ for $n$, $2$ for $b$, and $1$ for $a$, giving $\lg^2(\lg n) = o(\lg n)$.

Therefore, $\lg(\lceil \lg\lg n \rceil!) = O(\lg n)$, and so $\lceil \lg\lg n \rceil!$ is polynomially bounded.

3-1

Let

$$p(n) = \sum_{i = 0}^d a_i n^i,$$

where $a_d > 0$, be a degree-$d$ polynomial in $n$, and let $k$ be a constant. Use the definitions of the asymptotic notations to prove the following properties.

a. If $k \ge d$, then $p(n) = O(n^k)$.

b. If $k \le d$, then $p(n) = \Omega(n^k)$.

c. If $k = d$, then $p(n) = \Theta(n^k)$.

d. If $k > d$, then $p(n) = o(n^k)$.

e. If $k < d$, then $p(n) = \omega(n^k)$.

Let’s see that $p(n) = O(n^d)$. We need to pick $c = a_d + b$, such that

$$\sum_{i = 0}^d a_i n^i = a_d n^d + a_{d - 1}n^{d - 1} + \cdots + a_1n + a_0 \le cn^d.$$

When we divide by $n^d$, we get

$$c = a_d + b \ge a_d + \frac{a_{d - 1}}n + \frac{a_{d - 2}}{n^2} + \cdots + \frac{a_0}{n^d},$$

and

$$b \ge \frac{a_{d - 1}}n + \frac{a_{d - 2}}{n^2} + \cdots + \frac{a_0}{n^d}.$$

If we choose $b = 1$, then we can choose $n_0$,

$$n_0 = \max(da_{d - 1}, d\sqrt{a_{d - 2}}, \ldots, d\sqrt[d]{a_0}),$$

so that each of the $d$ terms $a_{d - i} / n^i$ is at most $1 / d$ for $n \ge n_0$ and their sum is at most $b = 1$ (negative coefficients only make the sum smaller, so they can be dropped from the maximum).

Now we have $n_0$ and $c$, such that

$$p(n) \le cn^d \quad \text{for } n \ge n_0,$$

which is the definition of $O(n^d)$.

By choosing $b = -1$ we can prove the $\Omega(n^d)$ inequality and thus the $\Theta(n^d)$ inequality.

The other properties can be proved in a very similar way.
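The choice of $c$ and $n_0$ for the $O(n^d)$ bound can be tried out on a concrete polynomial. This sketch assumes $d \ge 1$ and nonnegative lower-order coefficients; the helper names `poly` and `upper_bound_constants` and the sample polynomial $3n^2 + 5n + 7$ are illustrative choices, not part of the problem.

```python
import math

def poly(coeffs, n):
    """Evaluate p(n), where coeffs[i] is the coefficient a_i of n^i."""
    return sum(a * n ** i for i, a in enumerate(coeffs))

def upper_bound_constants(coeffs):
    """Return (c, n0) with c = a_d + 1 and n0 = max_i d * a_{d-i}^(1/i).

    Assumes d >= 1 and a_0, ..., a_{d-1} >= 0.
    """
    d = len(coeffs) - 1
    c = coeffs[d] + 1
    n0 = max(d * coeffs[d - i] ** (1 / i) for i in range(1, d + 1))
    return c, max(1, math.ceil(n0))

coeffs = [7, 5, 3]                    # p(n) = 3n^2 + 5n + 7
c, n0 = upper_bound_constants(coeffs)
bound_holds = all(poly(coeffs, n) <= c * n ** 2 for n in range(n0, 2000))
```

The resulting $n_0$ is sufficient but not minimal: the bound $p(n) \le cn^d$ may already hold for smaller $n$.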

3-2

Indicate for each pair of expressions $(A, B)$ in the table below, whether $A$ is $O$, $o$, $\Omega$, $\omega$, or $\Theta$ of $B$. Assume that $k \ge 1$, $\epsilon > 0$, and $c > 1$ are constants. Your answer should be in the form of the table with "yes" or "no" written in each box.

\begin{array}{ccccccc}
A & B & O & o & \Omega & \omega & \Theta \\
\hline
\lg^k n & n^\epsilon & yes & yes & no & no & no \\
n^k & c^n & yes & yes & no & no & no \\
\sqrt n & n^{\sin n} & no & no & no & no & no \\
2^n & 2^{n / 2} & no & no & yes & yes & no \\
n^{\lg c} & c^{\lg n} & yes & no & yes & no & yes \\
\lg(n!) & \lg(n^n) & yes & no & yes & no & yes
\end{array}
