数据结构与算法-DFS/回溯

回溯backtracking

回溯法思路的简单描述是：把问题的解空间转化成了图或者树的结构表示，然后使用深度优先搜索策略进行遍历，遍历的过程中记录和寻找所有可行解或者最优解。
基本思想类同于：
1.图的深度优先搜索
2.二叉树的后序遍历
分支限界法：广度优先搜索
思想类同于：图的广度优先遍历
二叉树的层序遍历

1. leetcode 39. Combination Sum

题目描述：给一个数组nums和一个目标数target，找出所有数组nums中的数组合和为target的组合，nums数组中的数可以
使用无限次，
Given a set of candidate numbers (C) (without duplicates) and a target number (T), find all unique combinations in C where the candidate numbers sums to T.

The same repeated number may be chosen from C unlimited number of times.

Note:
All numbers (including target) will be positive integers.
The solution set must not contain duplicate combinations.
For example, given candidate set [2, 3, 6, 7] and target 7, 
A solution set is: 
[
  [7],
  [2, 2, 3]
]

code:

class Solution(object):
    def combinationSum(self, candidates, target):
        """
        :type candidates: List[int]
        :type target: int
        :rtype: List[List[int]]
        """
        self.resList = []
        candidates = sorted(candidates)
        self.dfs(candidates,[],target,0)
        return self.resList
    def dfs(self, candidates, sublist, target, last):
        if target == 0:
            self.resList.append(sublist[:])
        if target< candidates[0]:
            return 
        for n in candidates:
            if n > target:
                return
            if n < last: # not go back 
                continue
            sublist.append(n)
            self.dfs(candidates,sublist,target - n, n)
            sublist.pop()

if __name__ == '__main__':
    nums = [2,3,6,7]
    Solution = Solution()
    target = 7
    print(Solution.combinationSum(nums,target))

上面的方法，DFS传参中需要传入sublist,候选集，每次需要append进当前的元素，因此最后需要pop()对数据出栈。

自己的code2: 更易懂但效率低，每次要对path求和，自然和为target则结果正确，添加到结果里，> target则return，否则继续DFS

class Solution(object):
    def combinationSum(self, candidates, target):
        candidates.sort()
        res = []
        index = 0 
        self.dfs(candidates,index,target,[],res)
        return res
    
    def dfs(self,candidates,index,target,path,res):
        if sum(path) == target:
            res.append(path)
        if sum(path) > target:
            return 
        for i in range(index,len(candidates)):
            self.dfs(candidates,i,target,path+[candidates[i]],res)

更简洁的写法：摘自leetcode discuss:

class Solution(object):
    def combinationSum(self, candidates, target):
        res = []
        candidates.sort()
        self.dfs(candidates, target, 0, [], res)
        return res
        
    def dfs(self, nums, target, index, path, res):
        if target < 0:
            return  # backtracking
        if target == 0:
            res.append(path)
            return 
        for i in xrange(index, len(nums)):
            self.dfs(nums, target-nums[i], i, path+[nums[i]], res)

2.leetcode 40. Combination Sum II

题目和上题leetcode 39 基本类似，唯一不同是数组中的所有元素不是无限次使用，而是有多少只能用多少次

Given a collection of candidate numbers (C) and a target number (T), find all unique combinations in C where the candidate numbers sums to T.

Each number in C may only be used once in the combination.

Note:
All numbers (including target) will be positive integers.
The solution set must not contain duplicate combinations.
For example, given candidate set [10, 1, 2, 7, 6, 1, 5] and target 8, 
A solution set is: 
[
  [1, 7],
  [1, 2, 5],
  [2, 6],
  [1, 1, 6]
]

code1: 完全暴力，不考虑任何优化，和leetcode 39的代码的区别只在于dfs递归里的i变成了i+1（跳过本身），同时leetcode 39中数组是没有重复元素的，leetcode 40里面可能有，所以需要判断重复，这里仅依靠一个字典d的O(1)判断
可以ac leetcode的tests，但是速度很低

class Solution(object):
    def combinationSum2(self, candidates, target):
        candidates.sort()
        res = []
        index = 0 
        self.d = {}
        self.dfs(candidates,index,target,[],res)
        return res
    
    def dfs(self,candidates,index,target,path,res):
        if target == 0 :
            if tuple(path) not in self.d:
                res.append(path)
                self.d[tuple(path)] = 1
            return 
        if target < 0:
            return 
        for i in range(index,len(candidates)):
            self.dfs(candidates,i + 1,target - candidates[i],path+[candidates[i]],res)

code2:
考虑优化：

当目前path为空，即即将添加第一个元素的时候，判断之前res里是否已有答案是该元素开头

比如target = 8, nums = [1, 1, 2, 5, 6, 7, 10]
则第一个1过完，到第二个1，path肯定已被第一个1的path所包含，就要跳过，否则
[1, 1, 6]
[1, 2, 5]
[1, 7]
[1, 2, 5]
[1, 7]
[2, 6]
可以看到出现了两次（1，2，5）和（1，7）

if i == index and nums[i] == res[-1][0]: # res[-1][0]是上一个答案的开头元素

当目前的下标不在开头，但是和之前的元素有重复，就直接跳过

比如 nums = [1,2,2,2,5] ,target = 5
答案会为
[1, 2, 2]
[1, 2, 2]
[1, 2, 2]
[5]
在遍历过第一个2的时候就不应该再去看后面的两个2了。
if i != index and candidates[i] == candidate[i-1]:
  continue

可以看出，优化2的代码其实也包含了优化1，同时由于不会重复，没有必要再用字典d去保存已经有的答案。

因此最终代码为

class Solution(object):
    def combinationSum2(self, candidates, target):
        candidates.sort()
        res = []
        index = 0 
        self.dfs(candidates,index,target,[],res)
        return res
    
    def dfs(self,candidates,index,target,path,res):
        if target == 0 :、
            res.append(path)
            return 
        if target < 0:
            return 
        for i in range(index,len(candidates)):
            if i != index and candidates[i] == candidates[i-1]:
                continue
            self.dfs(candidates,i + 1,target - candidates[i],path+[candidates[i]],res)

二维数组的深度遍历（参考leetcode17 Letter Combinations of a Phone Number）

给一个数组[['1','2','3'], ['4','5','6'], ['7','8','9']]
生成排列的组合：（顺序可以任意）
1,4,7,
1,4,8
1,4,9
1,5,7
1,5,8...
3,6,8,
3,6,9

def letter(nums):
    path = ''
    res = []
    index = 0
    dfs(nums,path,index,res)
    return res


def dfs(nums,path,index,res):
    if len(path) == len(nums):
        res.append(path)
        return

    for i in nums[index]:
        dfs(nums,path+i,index+1,res)


if __name__ == '__main__':
    nums = [['a','b','c'],['d','e','f'],['g','h','i','x']]
    # nums = ['a','']
    print(letter(nums))

leetcode22 Generate Parentheses

生成所有N个括号可能的组合
Given n pairs of parentheses, write a function to generate all combinations of well-formed parentheses.

For example, given n = 3, a solution set is:

[
"((()))",
"(()())",
"(())()",
"()(())",
"()()()"
]
分析：

DFS返回正确结果条件,当前添加的path长度等于2 * n，且左边括号数量要等于右边括号
DFS中途返回的条件：左边或者右边已经添加的次数大于n
或右边比左边添加的次数要多（如“())” 是不合法的）

方法1: 自己写的:

class Solution(object):
    def generateParenthesis(self, n):
        """
        :type n: int
        :rtype: List[str]
        """
        nums = ['(',')']
        path, index, l, r = '', n, 0, 0
        res = []
        self.dfs(nums,path,index,res,l,r)
        return res
    def dfs(self,nums,path,index,res,l,r):
        if l > index or r > index or r > l:
            return 
        if len(path) == 2 * index:
            res.append(path)
        for j in nums:
            if j == '(':
                self.dfs(nums,path+j,index,res,l+1,r)
            else:
                self.dfs(nums,path+j,index,res,l,r+1)

方法2 leetcode discuss
l和r分别是长度为n的两个栈，一个保存左括号，一个保存右括号，分别出栈，最后两个都为空的时候为结果
如果右边比左边数量还少（已经添加入path更多，illegal），则中途退出,faster

class Solution(object):
    def generateParenthesis(self, n):
        """
        :type n: int
        :rtype: List[str]
        """
        #if not n:
        #    return []
        path,left,right,res ='',n,n,[]
        self.dfs(path,left,right,res)
        return res
    
    def dfs(self,path,left,right,res):
        if right < left:
            return 
        if not left and not right:
            res.append(path[:])
            return 
        if left > 0:
            self.dfs(path + '(',left - 1,right,res)
        if right > 0:
            self.dfs(path + ')',left ,right - 1,res)

按照这个思路也可以给方法1中的dfs的for训练改成if

class Solution(object):
    def generateParenthesis(self, n):
        """
        :type n: int
        :rtype: List[str]
        """
        nums = ['(',')']
        path, index, l, r = '', n, 0, 0
        res = []
        self.dfs(nums,path,index,res,l,r)
        return res
    
    def dfs(self,nums,path,index,res,l,r):
        if r > l or l> index or r > index:
            return 
        if len(path) == 2 * index :
            res.append(path)
        if l <= index:
            self.dfs(nums,path+'(',index,res,l+1,r)
        if r <= index:
            self.dfs(nums,path+')',index,res,l,r+1)

全排列：

[1,2,3] 生成 123 132 213 231 312 321 六种全排列

网上常见的递归写法：交换与开头元素的位置

class Solution(object):
    def permute(self, nums):
        """
        :type nums: List[int]
        :rtype: List[List[int]]
        """
        end = len(nums)
        begin = 0
        global res 
        res = []
        self.perm(nums,begin,end)
        # res.append(perm(nums,begin,end))
        return res
    
    def perm(self,nums,begin,end):
    
        if begin >= end:
            tmp = nums[:]
            res.append(tmp) 
            
        else:
            for i in range(begin,end):
                nums[begin],nums[i] = nums[i],nums[begin]
                self.perm(nums,begin+1,end)
                nums[begin],nums[i] = nums[i],nums[begin]

DFS写法
DFS是树型结构
即第一个节点搜索1,2,3，再每个节点下继续搜索1，2，3，判断return的原则一是碰到了重复元素（相当于剪枝），二是得到正确结果

class Solution(object):
    def permute(self, nums):
        index,path = 0,[]
        res = []
        d = {}
        self.dfs(nums,index,path,res,d)

    def dfs(self,nums,index,path,res,d):
        if len(set(path)) != len(path):
            return
        if len(path) == len(nums): 
            print(path)
            return 
        for i in nums:
            self.dfs(nums,index,path+[i],res,d)

上面的方法中需要将python中的list转化为set，比较长度来判断list是否有重复来完成剪枝，list转换set还是比较耗时的，因此另一种方法，直接在nums中去掉已经添加的元素，代码更简洁也更高效
nums = nums[:i] + nums[i+1:]

class Solution(object):
    def permute(self, nums):
        path = []
        res = []
        length = len(nums)
        self.dfs(nums,path,length,res)
        return res
    
    def dfs(self,nums,path,length,res):
        if len(path) == length:
            res.append(path)
        for i in range(len(nums)):
            self.dfs(nums[:i] + nums[i+1:],path + [nums[i]],length,res)