Remove Duplicates from Sorted Array II

每日算法——letcode系列

问题 Remove Duplicates from Sorted Array II

Difficulty:Medium

Follow up for "Remove Duplicates":

What if duplicates are allowed at most twice?

For example,

Given sorted array nums = [1,1,1,2,2,3],

Your function should return length = 5, with the first five elements of nums being 1, 1, 2, 2 and 3. It doesn't matter what you leave beyond the new length.

// C++
class Solution {
public:
    int removeDuplicates(vector& nums) {
        
    }
};

翻译移除有序数组中重复元素II

难度系数：中等

接着上一个总是“删除重复数”:

如果可以重复的数最多只允许两次呢?

例如,

给定一个有序的数组 nums = [1,1,1,2,2,3],

函数返回的长度应 = 5,数组nums的前五个元素应为 1, 1, 2, 2 和3

思路

由于是有序的，上题是相邻不能相等，现在相当于是隔一个不能相等，本质上一样, 可以和上一题写一个通用的代码

// C++
class Solution {
public:
    int removeDuplicates(vector& nums) {
       
        int n = static_cast(nums.size());
        
        if (n < 3){
            return n;
        }
        
        int index = 2;
        for (int i = 2; i < n; ++i){
            if (nums[index - 2] != nums[i]){
                nums[index++] = nums[i];
            }
        }
        for (int i = 0; i < n - index; ++i){
            nums.pop_back();
        }
        return index;
    }
};

// 通用
class Solution {
public:
    /**
     *  remove duplicates if the number of duplicates more than num
     * 
     *  @param nums <#nums description#>
     *  @param num  <#num description#>
     *
     *  @return size of new nums
     */
    int removeDuplicates(vector& nums, int num) {
        
        int n = static_cast(nums.size());
        
        if (n < num + 1){
            return n;
        }
        
        int index = num;
        for (int i = 2; i < n; ++i){
            if (nums[index - num] != nums[i]){
                nums[index++] = nums[i];
            }
        }
        for (int i = 0; i < n - index; ++i){
            nums.pop_back();
        }
        return index;
    }
};

前文延伸解决

注：前文中文版个人觉得翻译有问题，不应该是有序的，应该是一系列整数，有序的就太简单了

分析

由于4 300 000 000大于$2^{{32}$，且1到$2}{32}$中每个数都至少有一个，所以必须有重复的数，题目设置成这么大的数，主要是不要考虑全部读取到内存的方案，为方便测试可以用1到256的取258个数（258 >$2^8$)来测试。

方案一：

用一个char类型的数据，由于一个char占8个字节，每个字节两个比特位，一共有256个比特位，读取一个数，如果这个数是2，先判断char类型数据的第二个比特位上是否为0，如果为0就把设置为1，如果为1代表这个数出现, 时间复杂度$T_{n} = O{(n)} $, 其实本质上就是hashmap的方案，只不过更省空间。

方案二：

二分法（注：《编程珠玑》推荐的方案）
读到一个数，最高比特位如果是0，放在A堆，最高比特位如果是1放到B堆，这样就把258，人为的分成了两堆，现在如果哪堆的个数大于$256/2=128$, 证明重复的数在哪堆，也有可能两堆都有，再读最高位的下一位比特位再分堆，如此反复肯定能找到那个重复的数

延伸思考

如果是$2^{32}$ + 1个数呢？也就是只有一个重复的数，怎么来找此数呢？有没有更低的时间复杂度方案？