Problem
There are two sorted arrays nums1 and nums2 of size m and n respectively.
Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).
You may assume nums1 and nums2 cannot both be empty.
Example 1:
nums1 = [1, 3]
nums2 = [2]
The median is 2.0
Example 2:
nums1 = [1, 2]
nums2 = [3, 4]
The median is (2 + 3)/2 = 2.5
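To pin down what "median" means in the two examples above, here is a minimal baseline sketch that simply merges both arrays and reads off the middle. It runs in O(m+n), so it does not meet the required O(log (m+n)) bound; it is only a reference to check faster solutions against. The name medianByMerging is a hypothetical helper, not part of the problem statement.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical reference helper: merge the two sorted arrays, then take the
// middle element (odd total) or the mean of the two middle elements (even).
// Assumes the two arrays are not both empty.
double medianByMerging(const std::vector<int>& nums1, const std::vector<int>& nums2) {
    std::vector<int> merged;
    merged.reserve(nums1.size() + nums2.size());
    std::merge(nums1.begin(), nums1.end(), nums2.begin(), nums2.end(),
               std::back_inserter(merged));
    const std::size_t total = merged.size();
    if (total % 2 == 1) {
        return merged[total / 2];
    }
    // Convert to double before adding so the sum cannot overflow int.
    return (static_cast<double>(merged[total / 2 - 1]) + merged[total / 2]) / 2.0;
}
```

Calling medianByMerging({1, 3}, {2}) and medianByMerging({1, 2}, {3, 4}) reproduces the two examples.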
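Solution
The final answer must lie between x1 and x2, the medians of nums1 and nums2. The idea that follows is worth writing down (because I did not think of it at first). Compare x1 and x2. If x1 = x2, return x1. Otherwise assume x1 > x2 (the other case is symmetric) and let n1 be the number of elements to the right of x1 and n2 the number of elements to the left of x2. Take n = min{n1, n2} and delete n elements from the right of x1 and n elements from the left of x2. The median of nums1 and nums2 together is unchanged by this (the key is taking the smaller of n1 and n2, which I kept missing, unfortunately). Repeat until len(nums1) = 1 or len(nums2) = 1, then handle the remaining cases one by one; writing them all out is tedious, so I will skip it.
That was my first approach. Why is it wrong? It breaks down on inputs such as [1,4] and [2,3]. The code is in my LeetCode submission history: 1538 / 2085 test cases passed. So, a different approach: handling the odd and even cases of the median directly is too messy, so turn the problem into finding the k-th smallest element of nums1 and nums2 combined. For a given k, look at the (k/2)-th element of each array, compare the two, and discard the smaller one together with everything before it in its array. Iterate (shrink the array, reduce k by the number of discarded elements) until k = 1, at which point the answer is the smaller of the two front elements.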
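#include <algorithm>
#include <cmath>
#include <vector>
using namespace std;

class Solution {
public:
    // Returns the k-th smallest element (1-based) of the two sorted arrays.
    double findK(vector<int>& nums1, vector<int>& nums2, int k) {
        int len1 = nums1.size();
        int len2 = nums2.size();
        int half_k = k / 2;
        // long long so that the "+ 1" sentinels below cannot overflow.
        long long nk1, nk2;
        // Probe position k/2 - 1 in each array (position 0 when k == 1).
        int index = (k == 1 ? 0 : half_k - 1);
        if (len2 <= 0) {
            // nums2 is exhausted: make nk2 larger so the nums1 branch is taken.
            nk1 = nums1[index];
            nk2 = nk1 + 1;
        }
        else if (len1 <= 0) {
            // nums1 is exhausted: make nk1 larger so the nums2 branch is taken.
            nk2 = nums2[index];
            nk1 = nk2 + 1;
        }
        else {
            // Clamp the probe to the last element of a short array.
            nk1 = nums1[min(index, len1 - 1)];
            nk2 = nums2[min(index, len2 - 1)];
        }
        if (nk1 <= nk2) {
            if (k == 1) {
                return nk1;
            }
            // Drop the probed prefix of nums1 and reduce k by its length.
            vector<int>::iterator i1 = nums1.begin() + min(index + 1, len1);
            vector<int>::iterator i2 = nums1.end();
            vector<int> nums1_new(i1, i2);
            k = k - min(index + 1, len1);
            return findK(nums1_new, nums2, k);
        }
        else {
            if (k == 1) {
                return nk2;
            }
            // Drop the probed prefix of nums2 and reduce k by its length.
            vector<int>::iterator i1 = nums2.begin() + min(index + 1, len2);
            vector<int>::iterator i2 = nums2.end();
            vector<int> nums2_new(i1, i2);
            k = k - min(index + 1, len2);
            return findK(nums1, nums2_new, k);
        }
    }
    double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) {
        if ((nums1.size() + nums2.size()) % 2 == 1) {
            // Odd total: the median is the single middle element.
            return findK(nums1, nums2, ceil((nums1.size() + nums2.size()) / 2.0));
        }
        else {
            // Even total: average the two middle elements (as doubles, so the sum cannot overflow int).
            double a = findK(nums1, nums2, (nums1.size() + nums2.size()) / 2);
            double b = findK(nums1, nums2, (nums1.size() + nums2.size()) / 2 + 1);
            return (a + b) / 2.0;
        }
    }
};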
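Improved solution
The algorithm stays the same, but the implementation needs work:
- Building a new vector at every step is unnecessary. Constructing it, copying the elements and passing it along all cost time, and each copy is linear in the remaining length, which also breaks the O(log (m+n)) bound. Passing indices instead avoids this. In my submissions the runtime dropped from 52 ms to 20 ms.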
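#include <algorithm>
#include <cmath>
#include <vector>
using namespace std;

class Solution {
public:
    // Returns the k-th smallest element (1-based) of nums1[begin1..end1] and
    // nums2[begin2..end2]. Both ranges are inclusive; a range counts as empty
    // once its begin index has moved past its end index.
    double findK(vector<int>& nums1, vector<int>& nums2, int begin1, int end1, int begin2, int end2, int k) {
        int len1 = end1 - begin1 + 1;
        int len2 = end2 - begin2 + 1;
        int half_k = k / 2;
        // long long so that the "+ 1" sentinels below cannot overflow.
        long long nk1, nk2;
        int index1 = begin1, index2 = begin2;
        if (len2 <= 0) {
            // nums2 range is exhausted: force the nums1 branch.
            index1 = (k == 1 ? begin1 : begin1 + half_k - 1);
            nk1 = nums1[index1];
            nk2 = nk1 + 1;
        }
        else if (len1 <= 0) {
            // nums1 range is exhausted: force the nums2 branch.
            index2 = (k == 1 ? begin2 : begin2 + half_k - 1);
            nk2 = nums2[index2];
            nk1 = nk2 + 1;
        }
        else {
            // Probe k/2 - 1 positions into each range, clamped to its last element.
            index1 = (k == 1 ? begin1 : begin1 + half_k - 1);
            nk1 = nums1[min(index1, begin1 + len1 - 1)];
            index2 = (k == 1 ? begin2 : begin2 + half_k - 1);
            nk2 = nums2[min(index2, begin2 + len2 - 1)];
        }
        if (nk1 <= nk2) {
            if (k == 1) {
                return nk1;
            }
            // Skip the probed prefix of nums1; k shrinks by the number of skipped elements.
            k = k - min(index1 + 1, begin1 + len1) + begin1;
            return findK(nums1, nums2, index1 + 1, end1, begin2, end2, k);
        }
        else {
            if (k == 1) {
                return nk2;
            }
            // Skip the probed prefix of nums2; k shrinks by the number of skipped elements.
            k = k - min(index2 + 1, begin2 + len2) + begin2;
            return findK(nums1, nums2, begin1, end1, index2 + 1, end2, k);
        }
    }
    double findMedianSortedArrays(vector<int>& nums1, vector<int>& nums2) {
        int len1 = nums1.size();
        int len2 = nums2.size();
        if ((len1 + len2) % 2 == 1) {
            // Odd total: the median is the single middle element.
            return findK(nums1, nums2, 0, len1 - 1, 0, len2 - 1, ceil((len1 + len2) / 2.0));
        }
        else {
            // Even total: average the two middle elements (as doubles, so the sum cannot overflow int).
            double a = findK(nums1, nums2, 0, len1 - 1, 0, len2 - 1, (len1 + len2) / 2);
            double b = findK(nums1, nums2, 0, len1 - 1, 0, len2 - 1, (len1 + len2) / 2 + 1);
            return (a + b) / 2.0;
        }
    }
};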
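Aside
After many days I finally had time to properly finish this problem, which I had been staring at for a long time. The idea itself is clear; what remains is grunt work such as the odd/even analysis and choosing the boundary conditions. That said, the code in the second part is still somewhat tangled. Intuitively it should not need this many if-else branches, and the performance can probably be pushed further. (To be continued)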
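As one possible direction for trimming the if-else branches, here is a minimal iterative sketch of the same k-th smallest idea. It is not the author's follow-up, only an illustration under stated assumptions: the names kthSmallest and medianOfTwo are hypothetical, k is assumed to satisfy 1 <= k <= a.size() + b.size(), and the two arrays are assumed not to both be empty. The exhausted-array cases are handled by direct indexing instead of sentinel values.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: k-th smallest (1-based) of two sorted arrays.
// i and j mark the first element of each array that is still in play.
int kthSmallest(const std::vector<int>& a, const std::vector<int>& b, int k) {
    const int m = static_cast<int>(a.size());
    const int n = static_cast<int>(b.size());
    int i = 0, j = 0;
    while (true) {
        if (i == m) return b[j + k - 1];          // a is exhausted
        if (j == n) return a[i + k - 1];          // b is exhausted
        if (k == 1) return std::min(a[i], b[j]);  // smaller front element
        const int step = k / 2;
        const int pi = std::min(i + step, m) - 1;  // probe index, clamped
        const int pj = std::min(j + step, n) - 1;
        if (a[pi] <= b[pj]) {
            k -= pi - i + 1;  // discard a[i..pi]
            i = pi + 1;
        } else {
            k -= pj - j + 1;  // discard b[j..pj]
            j = pj + 1;
        }
    }
}

// Hypothetical wrapper: median via one or two k-th smallest queries.
double medianOfTwo(const std::vector<int>& a, const std::vector<int>& b) {
    const int total = static_cast<int>(a.size() + b.size());
    if (total % 2 == 1) {
        return kthSmallest(a, b, total / 2 + 1);
    }
    // Convert to double before adding so the sum cannot overflow int.
    return (static_cast<double>(kthSmallest(a, b, total / 2)) +
            kthSmallest(a, b, total / 2 + 1)) / 2.0;
}
```

Because only the indices i and j move, no vector is copied, and each loop pass discards about k/2 elements or finishes.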