480. Sliding Window Median

The median is the middle value in an ordered integer list. If the size of the list is even, there is no single middle value, so the median is the mean of the two middle values.

Examples:
[2,3,4], the median is 3

[2,3], the median is (2 + 3) / 2 = 2.5

Given an array nums, there is a sliding window of size k which moves from the very left of the array to the very right. You can only see the k numbers in the window. Each time, the sliding window moves right by one position. Your job is to output the array of medians, one for each window position in the original array.

For example,
Given nums = [1,3,-1,-3,5,3,6,7], and k = 3.

Window position                Median
[1  3  -1] -3  5  3  6  7        1
 1 [3  -1  -3] 5  3  6  7       -1
 1  3 [-1  -3  5] 3  6  7       -1
 1  3  -1 [-3  5  3] 6  7        3
 1  3  -1  -3 [5  3  6] 7        5
 1  3  -1  -3  5 [3  6  7]       6

Therefore, return the median sliding window as [1,-1,-1,3,5,6].
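
As a baseline for checking any faster solution, the definition above can be applied directly: copy each window, sort it, and read off the median. This is only a minimal sketch and not part of the original solution; the function name medianSlidingWindowBruteForce is made up for illustration, and the code assumes 1 <= k <= nums.size() (it returns an empty result otherwise). It costs O(n * k log k), so it is useful only as a reference.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical reference helper (not from the original post):
// recompute the median of every window from scratch.
std::vector<double> medianSlidingWindowBruteForce(const std::vector<int>& nums, int k) {
    std::vector<double> medians;
    const int n = static_cast<int>(nums.size());
    if (k <= 0 || k > n) return medians;  // assumption: invalid k yields no windows

    for (int start = 0; start + k <= n; ++start) {
        // Copy the current window and sort the copy.
        std::vector<int> window(nums.begin() + start, nums.begin() + start + k);
        std::sort(window.begin(), window.end());

        if (k % 2) {
            medians.push_back(window[k / 2]);  // odd size: the middle element
        } else {
            // Even size: mean of the two middle elements, summed as double
            // so that large int values do not overflow.
            medians.push_back((static_cast<double>(window[k / 2 - 1]) + window[k / 2]) / 2);
        }
    }
    return medians;
}
```

Called on the example above with k = 3, this sketch should reproduce the listed medians.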

Approach

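This problem reminded me of a Microsoft interview question I had seen before: given a sequence of numbers, find the median of the first k numbers for every k = 1, 2, ..., n. That problem can be solved as follows: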

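Maintain two heaps: a max-heap and a min-heap. The max-heap holds the smaller numbers and the min-heap holds the larger ones. As long as the sizes of the two heaps differ by at most 1, the median is easy to obtain. The heaps are maintained as follows: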

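Step 1: For each new number, if it is smaller than the top of the max-heap, push it onto the max-heap; otherwise push it onto the min-heap.
Step 2: Rebalance the two heaps. Keep a counter for the number of elements in each heap, and move the top of the larger heap onto the smaller heap until the two counts differ by at most 1.
Step 3: Read off the current median: the top of the larger heap, or the mean of the two tops when the heaps are the same size.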
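
The three steps can be sketched roughly as below. This is an illustrative reconstruction of the prefix-median idea, not code from the interview question or from the solution discussed later; the names prefixMedians, lower, and upper are made up here, and an empty max-heap is treated as "not smaller than the top", so the number goes to the min-heap.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper: median of the first k numbers, for k = 1..n.
std::vector<double> prefixMedians(const std::vector<int>& nums) {
    std::priority_queue<int> lower;  // max-heap holding the smaller numbers
    std::priority_queue<int, std::vector<int>, std::greater<int>> upper;  // min-heap holding the larger numbers
    std::vector<double> medians;

    for (int x : nums) {
        // Step 1: choose a heap for the new number.
        if (!lower.empty() && x < lower.top()) lower.push(x);
        else upper.push(x);

        // Step 2: move tops across until the sizes differ by at most 1.
        while (lower.size() > upper.size() + 1) { upper.push(lower.top()); lower.pop(); }
        while (upper.size() > lower.size() + 1) { lower.push(upper.top()); upper.pop(); }

        // Step 3: the larger heap's top is the median; with equal sizes,
        // take the mean of both tops (summed as double to avoid overflow).
        if (lower.size() > upper.size()) medians.push_back(lower.top());
        else if (upper.size() > lower.size()) medians.push_back(upper.top());
        else medians.push_back((static_cast<double>(lower.top()) + upper.top()) / 2);
    }
    return medians;
}
```

Each number costs O(log n) to insert and rebalance, which is why the two-heap layout is attractive when elements are only ever added.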

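The two problems are very similar; the only difference is the added sliding window. However, a heap does not allow arbitrary elements to be deleted (removal happens only at the top), so it seems the method above can no longer be used. But is that really the case?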

Let's look at the code first:

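#include <functional>
#include <queue>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<double> medianSlidingWindow(vector<int>& nums, int k) {
        vector<double> medians;
        unordered_map<int, int> hash;                          // count numbers to be deleted
        priority_queue<int, vector<int>> bheap;                // max-heap on the bottom (lower half)
        priority_queue<int, vector<int>, greater<int>> theap;  // min-heap on the top (upper half)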
        
        int i = 0;
        
        // Initialize the heaps
        while (i < k)  { bheap.push(nums[i++]); }
        for (int count = k/2; count > 0; --count) {
            theap.push(bheap.top()); bheap.pop();
        }
        
        while (true) {
            // Get median
            if (k % 2) medians.push_back(bheap.top());
            else medians.push_back( ((double)bheap.top() + theap.top()) / 2 );
            
            if (i == nums.size()) break;
            int m = nums[i-k], n = nums[i++], balance = 0;
            
            // What happens to the number m that is moving out of the window
            if (m <= bheap.top())  { --balance;  if (m == bheap.top()) bheap.pop(); else ++hash[m]; }
            else                   { ++balance;  if (m == theap.top()) theap.pop(); else ++hash[m]; }
            
            // Insert the new number n that enters the window
            if (!bheap.empty() && n <= bheap.top())  { ++balance; bheap.push(n); }
            else                                     { --balance; theap.push(n); }
            
            // Rebalance the bottom and top heaps
            if      (balance < 0)  { bheap.push(theap.top()); theap.pop(); }
            else if (balance > 0)  { theap.push(bheap.top()); bheap.pop(); }
            
            // Remove numbers that should be discarded at the top of the two heaps
            while (!bheap.empty() && hash[bheap.top()])  { --hash[bheap.top()]; bheap.pop(); }
            while (!theap.empty() && hash[theap.top()])  { --hash[theap.top()]; theap.pop(); }
        }
        
        return medians;
    }
};

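As we can see, the author introduces the notion of a valid count: when an element leaves the window, it is marked as invalid rather than removed immediately. When comparing the sizes of the two heaps, only the valid counts are compared, and the heaps are adjusted accordingly. In the code, hash records how many copies of each value are still waiting to be discarded, and balance tracks how the valid counts changed in the current step. Moreover, because the top of each heap is always valid (the last step clears invalid elements from the tops), at most one element needs to be moved when rebalancing.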
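
To isolate the valid-count idea from the rest of the solution, here is a small sketch of a max-heap with lazy deletion. It is an illustration only, not the author's code: the class LazyMaxHeap and its members are hypothetical names, and erase assumes the value passed in is currently a valid element of the heap.

```cpp
#include <cstddef>
#include <queue>
#include <unordered_map>

// Hypothetical wrapper: a max-heap that supports "erase by value" lazily.
class LazyMaxHeap {
public:
    void push(int x) { heap_.push(x); ++valid_; prune(); }

    // Mark one copy of x as invalid. Assumption: x is a valid element.
    void erase(int x) { ++pending_[x]; --valid_; prune(); }

    // Precondition for top() and pop(): size() > 0.
    int top() const { return heap_.top(); }
    void pop() { heap_.pop(); --valid_; prune(); }

    // Number of valid elements, ignoring those waiting to be discarded.
    std::size_t size() const { return valid_; }

private:
    // Discard invalid elements while they sit at the top, so that the
    // top is always a valid element whenever size() > 0.
    void prune() {
        while (!heap_.empty()) {
            auto it = pending_.find(heap_.top());
            if (it == pending_.end()) break;
            if (--it->second == 0) pending_.erase(it);
            heap_.pop();
        }
    }

    std::priority_queue<int> heap_;
    std::unordered_map<int, int> pending_;  // value -> copies awaiting removal
    std::size_t valid_ = 0;
};
```

The solution above keeps a single hash shared by both heaps and a per-step balance instead of a stored size, but the principle is the same: sizes are compared on valid elements only, and invalid ones are physically removed once they reach the top.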

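Seen this way, the problem is a variant of the Microsoft interview question. For median problems, the two-heap approach is generally practical and efficient; it just needs to be adapted flexibly.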
