Let X[1..n] and Y[1..n] be two arrays, each containing n numbers already in sorted order. Give an O(lg n)-time algorithm to find the median of all 2n elements in arrays X and Y.
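#include <stdio.h>

/*
 * Find the (lower) median of two sorted arrays, each of size n (n >= 1).
 */
#define NOT_FOUND (-1)   /* index sentinel; a value such as '\0' could be a real median */

/* Returns the 0-based index in A of the lower median of A and B combined,
 * or NOT_FOUND if the median is not in A. */
int FindMedian(const int A[], const int B[], int n, int low, int high);

int TWO_ARRAY_MEDIAN(const int A[], const int B[], int n)
{
    int i = FindMedian(A, B, n, 0, n - 1);
    if (i != NOT_FOUND)
        return A[i];
    i = FindMedian(B, A, n, 0, n - 1);   /* otherwise the median is in B */
    return B[i];
}

int FindMedian(const int A[], const int B[], int n, int low, int high)
{
    if (low > high)
        return NOT_FOUND;
    int i = low + (high - low) / 2;      /* A[i] has rank k = i + 1 within A */
    if (i == n - 1 && A[i] <= B[0])      /* boundary case k = n */
        return i;
    else if (i < n - 1 && B[n - i - 2] <= A[i] && A[i] <= B[n - i - 1])
        return i;
    else if (A[i] > B[n - i - 1])        /* A[i] is above the median */
        return FindMedian(A, B, n, low, i - 1);
    else                                 /* A[i] is below the median */
        return FindMedian(A, B, n, i + 1, high);
}

int main(void)
{
    int A[] = {1, 3, 5};
    int B[] = {8, 9, 10};
    printf("%d\n", TWO_ARRAY_MEDIAN(A, B, 3));   /* prints 5 */
    return 0;
}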
Algorithm analysis:
Let's start out by supposing that the median (the lower median, since we know we
have an even number of elements) is in X. Let's call the median value m, and let's
suppose that it's in X[k]. Then k elements of X (namely X[1..k]) are less than or
equal to m, and n − k elements of X (namely X[k+1..n]) are greater than or equal
to m. We know that in the two arrays combined, there must be n elements less than
or equal to m and n elements greater than or equal to m, and so there must be
n − k elements of Y that are less than or equal to m and n − (n − k) = k elements
of Y that are greater than or equal to m.
Thus, we can check that X[k] is the lower median by checking whether
Y[n − k] ≤ X[k] ≤ Y[n − k + 1]. A boundary case occurs for k = n. Then n − k = 0,
and there is no array entry Y[0]; we only need to check that X[n] ≤ Y[1].
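The test above can be sketched as a small C function. This is only an illustration: the name is_lower_median is hypothetical, k is kept 1-based as in the text, and the arrays are ordinary 0-based C arrays, so the text's X[k] becomes X[k - 1].

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if X[k] (1-based k, 1 <= k <= n) passes the
 * lower-median test from the text, 0 otherwise. X and Y are sorted 0-based
 * C arrays of length n. */
int is_lower_median(const int X[], const int Y[], int n, int k)
{
    assert(n >= 1 && k >= 1 && k <= n);
    int xk = X[k - 1];                /* the text's X[k] */
    if (k == n)                       /* boundary case: there is no Y[0] */
        return xk <= Y[0];            /* text: X[n] <= Y[1] */
    /* text: Y[n-k] <= X[k] <= Y[n-k+1], shifted down by one for C indexing */
    return Y[n - k - 1] <= xk && xk <= Y[n - k];
}
```

For X = {1, 4, 7} and Y = {2, 5, 8}, the merged order is 1, 2, 4, 5, 7, 8, so the lower median is 4 = X[2]; is_lower_median(X, Y, 3, 2) returns 1, while k = 1 and k = 3 return 0.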
Now, if the median is in X but is not in X[k], then the above condition will not
hold. If the median is in X[k'], where k' < k, then X[k] is above the median, and
Y[n − k + 1] < X[k]. Conversely, if the median is in X[k'], where k' > k, then
X[k] is below the median, and X[k] < Y[n − k].
Thus, we can use a binary search to determine whether there is an X[k] such that
either k < n and Y[n − k] ≤ X[k] ≤ Y[n − k + 1] or k = n and X[k] ≤ Y[n − k + 1];
if we find such an X[k], then it is the median. Otherwise, we know that the median
is in Y, and we use a binary search to find a Y[k] such that either k < n and
X[n − k] ≤ Y[k] ≤ X[n − k + 1] or k = n and Y[k] ≤ X[n − k + 1]; such a
Y[k] is the median. Since each binary search takes O(lg n) time, we spend a total
of O(lg n) time.
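The two binary searches described above can also be written as loops instead of recursion. The following is a minimal sketch under some assumptions: the names find_median_index, two_array_median and the NOT_FOUND sentinel are hypothetical, the arrays are 0-based (index i corresponds to k = i + 1 in the text), and n is at least 1.

```c
#include <assert.h>

#define NOT_FOUND (-1)   /* hypothetical sentinel: no index in A qualifies */

/* Binary search over the 0-based index i (rank k = i + 1 within A).
 * Returns an index whose element passes the lower-median test, or NOT_FOUND. */
static int find_median_index(const int A[], const int B[], int n)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int i = low + (high - low) / 2;
        /* text: A[k] <= B[n-k+1] */
        int upper_ok = A[i] <= B[n - i - 1];
        /* text: B[n-k] <= A[k]; nothing to check when k = n */
        int lower_ok = (i == n - 1) || B[n - i - 2] <= A[i];
        if (lower_ok && upper_ok)
            return i;
        if (!upper_ok)
            high = i - 1;   /* A[i] is above the median: search to the left */
        else
            low = i + 1;    /* A[i] is below the median: search to the right */
    }
    return NOT_FOUND;
}

/* Lower median of the 2n values in the sorted arrays X and Y. */
int two_array_median(const int X[], const int Y[], int n)
{
    assert(n >= 1);
    int i = find_median_index(X, Y, n);
    if (i != NOT_FOUND)
        return X[i];
    i = find_median_index(Y, X, n);   /* the median must then be in Y */
    assert(i != NOT_FOUND);
    return Y[i];
}
```

Each loop halves the range [low, high] on every iteration, so the two searches together take O(lg n) comparisons. For X = {8, 9, 10} and Y = {1, 3, 5} the first search fails and the second returns 5.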
Here's how we write the algorithm in pseudocode:
TWO-ARRAY-MEDIAN(X, Y)
    n ← length[X]        ▷ n also equals length[Y]
    median ← FIND-MEDIAN(X, Y, n, 1, n)
    if median = NOT-FOUND
        then median ← FIND-MEDIAN(Y, X, n, 1, n)
    return median

FIND-MEDIAN(A, B, n, low, high)
    if low > high
        then return NOT-FOUND
        else k ← ⌊(low + high)/2⌋
             if k = n and A[n] ≤ B[1]
                 then return A[n]
             elseif k < n and B[n − k] ≤ A[k] ≤ B[n − k + 1]
                 then return A[k]
             elseif A[k] > B[n − k + 1]
                 then return FIND-MEDIAN(A, B, n, low, k − 1)
             else return FIND-MEDIAN(A, B, n, k + 1, high)
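When turning the pseudocode above into real code, it can help to have a slow but obviously correct reference to compare against. The sketch below is not the O(lg n) algorithm: it walks the merge of the two arrays and returns the n-th smallest value in O(n) time. The function name lower_median_by_merge is hypothetical, and the arrays are assumed to be sorted, 0-based and of length n >= 1.

```c
#include <assert.h>

/* Reference implementation for cross-checking only: take the n smallest
 * values of X and Y in merge order and return the last one taken, which is
 * the lower median of the 2n values. Runs in O(n) time. */
int lower_median_by_merge(const int X[], const int Y[], int n)
{
    assert(n >= 1);
    int i = 0, j = 0, last = 0;
    for (int taken = 0; taken < n; taken++) {
        /* i + j == taken < n here, so both X[i] and Y[j] are in bounds */
        if (X[i] <= Y[j])
            last = X[i++];
        else
            last = Y[j++];
    }
    return last;
}
```

For X = {1, 3, 5} and Y = {8, 9, 10} the loop takes 1, 3, 5 and returns 5, which is the value a correct translation of TWO-ARRAY-MEDIAN should return on the same input.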