Leetcode - House Robber III

My code:

/**
 * Definition for a binary tree node.
 * public class TreeNode {
 *     int val;
 *     TreeNode left;
 *     TreeNode right;
 *     TreeNode(int x) { val = x; }
 * }
 */
public class Solution {
    HashMap cache = new HashMap();
    public int rob(TreeNode root) {
        if (root == null) {
            return 0;
        }
        else if (cache.containsKey(root)) {
            return cache.get(root);
        }
        
        int val = 0;
        if (root.left != null) {
            val += rob(root.left.left) + rob(root.left.right);
        }
        if (root.right != null) {
            val += rob(root.right.left) + rob(root.right.right);
        }
        
        int max = Math.max(val + root.val, rob(root.left) + rob(root.right));
        cache.put(root, max);
        return max;
    }
}

仔细研读：
https://discuss.leetcode.com/topic/39659/easy-understanding-solution-with-dfs/2

这道题目没做出来，然后看了答案，觉得这篇解析说的很棒。
一开始的思路应该是很自然的，但我却没有想到。

rob(TreeNode root)
返回以当前结点作为root，可以抢到的最多的钱。
这个结点，可能偷，也可能没偷。

那么这个rob(root) 的终止条件是什么？
肯定是 root == null
那么递推式呢？
或者说，和下面层与层之间的关系如何确定呢？

如果结点被抢了，那么不能偷left, right了。
所以

rob(root) = rob(root.left.left) + rob(root.left.right) + 
            rob(root.right.left) + rob(root.right.right) + root.val;

注意，这里，

rob(root.left) + rob(root.right)
可能 =
rob(root.left.left) + rob(root.left.right) + rob(root.right.left) + rob(root.right.right) + root.val;
如果 偷了root 左右，还没有直接投 root 孙子赚得多

所以这里我们可以看出，这个dp，状态与状态之间，互相联系，
rob(root) 需要依靠rob(left.left), rob(left.right)
rob(left) 也需要依靠 rob(left.left), rob(left.right)
重复就来了。下面再细说。

从这里可以看出。

rob(root)
= root.val + 
Math.max(
  rob(root.left) + rob(root.right),
  rob(root.left.left) + rob(root.left.right) + rob(root.right.left) + rob(root.right.right)
);

递推式有了，我们就可以一层层往下计算了。
其实也是 bottom up

然后发现速度特别慢，原因就是刚刚说的，有很多东西都重复计算了，然后状态之间互相焦灼，互相相关，有许多东西需要算好几遍。

然后一个工程上的手段是，
剪枝
cut edge

剪枝的本质有两点。

当从当前状态下去，以后无论如何变化，都不可能获得我们需要的那个状态的话，那就直接停止吧，不用走到无路可走的时候再回头
cache，有些已经计算过的最优解，全部存在cache中。这样下次要用的时候避免计算。

这里采用了方法二进行了剪枝，于是有了上面的代码。
然后速度一下快多了。

**
但是，剪枝虽然可以很好的降低时间，但是无法降低时间复杂度。
也就是说，他可以把时间复杂度的常数系数降得很多，但是无法从根本上改良这个算法。
导致的问题是：
1 . 当数据量越来越大的时候，剪枝的dp仍然有极限，仍然会很快就出现延迟超长的情况。
2 . recursion会导致stack overflow,即使剪枝仍然需要进入一层栈去取出cache，所以当输入数据变得越来越大的时候，剪枝也会很快爆栈
**

举例：

1 . 斐波那契数列。
第一种做法，设置一个数组cache

long[] cache;
private int counter = 0;
private int hit = 0;
public long f(int n) {
    counter = 0;
    hit = 0;
    if (n < 0) {
        return 0;
    }
    cache = new long[n + 1];
    cache[0] = 1;
    return f(n - 1, cache) + f(n - 2, cache);
}

private long f(int n, long[] cache) {
    counter++;
    if (n < 0) {
        return 0;
    }
    else if (cache[n] != 0) {
        hit++;
        return cache[n];
    }
    else {
        long ret = f(n - 1, cache) + f(n - 2, cache);
        cache[n] = ret;
        return ret;
    }
}

2 . iteration dp

public long f2(int n) {
    if (n < 0) {
        return 0;
    }
    else if (n < 2) {
        return 1;
    }
    long pre = 1;
    long next = 1;
    for (int i = 2; i <= n; i++) {
        long sum = pre + next;
        pre = next;
        next = sum;
    }
    return next;
}

然后测试相同的输入下，时间的对比，如下：

Screen Shot 2016-09-08 at 11.20.57 PM.png

第一个点可以当做噪音点，因为CPU可能没有引起足够重视，
所以 f(5) 跑的时间比 f(10) 还要大很多

但是看总体趋势，我们还是可以发现，即使加了cache，当输入变大的时候，增大的趋势，也是很陡峭的。
而iteration增大的趋势则是比较平缓的

为了做这个实验，破费了些时间。
但画出来后，发现竟然如此完美。
perfect~

第二个例子就是 Burst Baloon了啊

但是说起来太久远了，具体可以看下这道题目在里我写的文章。
总是 dp recursion >>> dp iteration
因为 dp recursion + memorization 不能从根本上解决问题。

接着这道题目往下说。
所以，我们需要把状态和状态之间完全独立起来。
想斐波那契数列的iteration解法。
我求 f(n) 的时候不用再次计算 f(n - 1), f(n - 2),他们已经完全计算好了，完全独立的。

但是recursion + memorization 版本不是
f(n) needs f(n - 2), f(n - 1) 也需要 f(n - 2)
虽然有hit，但是仍然要再recursive call 一下

你认为差不多？
那你错了，看我画的图。
左边是 f(10) recursion + memorization 版本
右边是 f(10) iteration 版本

从图中我们可以看出，
左侧一共需要计算 2 * n + 1 个结点。
右侧一共需要计算 n + 1 个结点

似乎数量级是差不多的。
再仔细看。
左侧有 2 * n + 1个 stack
右侧，只有 1 个stack

无论 n 多么大，
iteration 永远只有一个 stack
而 recursion + memorization 无论cache policy 设计的再怎么好， stack的个数都会随着n变大而剧烈变大
而大量的栈操作，会有入栈出栈，不仅对空间上要影响，对时间影响也很大。

**
这才是为什么， recursion + memorization 看上去计算次数和 iteration差不多，但却慢那么多的，根本原因所在。
**

那么，如何面对dp类题目呢？
首先，就是什么时候用dp
当题目出现求最大最小最长最短最优的时候，一定是用DP

然后，dp有多种思路，我记得的有，
1 . divide and conquer, 划成两块，分别继续
2 . string 类型的，通过前一个状态推导后一个。
3 . buy and sell stock 那种，状态机的

等等。
然后分析哪里存在重复计算，然后用memorization 剪枝

这是第二步，也是想出DP后
用memorization cut edge

第三步，就是思考，为什么需要重复计算？
因为状态之间相互依靠的效果太明显，不能完全切开。
或者说，当前状态不仅依靠之前的一个状态，可能依靠之前的所有状态。
那么我们需要一种机制，把状态之间彻底切开。
比如，结点只可能被偷或者没被偷。
如果 rob(root) 被偷，
那么， left, right 只能没被偷，就不用在计算他们了
如果， root没被偷，那么 left， right可以被偷，也可以没被偷

写出了下面的解法：
My code:

/**
 * Definition for a binary tree node.
 * public class TreeNode {
 *     int val;
 *     TreeNode left;
 *     TreeNode right;
 *     TreeNode(int x) { val = x; }
 * }
 */
public class Solution {
    public int rob(TreeNode root) {
        if (root == null) {
            return 0;
        }
        
        int[] ret = helper(root);
        return Math.max(ret[0], ret[1]);
    }
    
    private int[] helper(TreeNode root) {
        if (root == null) {
            return new int[]{0, 0};
        }
        int[] left = helper(root.left);
        int[] right = helper(root.right);
        
        int[] ret = new int[2];
        ret[0] = Math.max(left[0], left[1]) + Math.max(right[0], right[1]);
        ret[1] = root.val + left[0] + right[0];
        return ret;
    }
}

这个算法下，
root只会有 root.left and root.right 推导，
和下面的孙子辈没关系了，不需要recursive call
而上面的算法，
root,
root.left, root.right

都需要 recursive call 一下这些孙子辈，虽然 cache hit也没有用，总有栈的消耗。当输入变大，尤其层数变多时，消耗是很惊人的。

当然，这里还是没写成 iteration的做法，否则更优。

Anyway, Good luck, Richardo! -- 09/09/2016

Leetcode - House Robber III

你可能感兴趣的:(Leetcode - House Robber III)