Python Algorithm Problem Set: Minimum Window Substring

This article is one of the code examples in the Python algorithm problem set.

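Problem 76: Minimum Window Substring

Description: Given a string s and a string t, return the shortest substring of s that covers all the characters of t. If s contains no such substring, return the empty string "".

Notes:

  • For characters that repeat in t, the substring must contain at least as many of that character as t does.
  • If such a substring exists in s, it is guaranteed to be the unique answer.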

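Example 1:

Input: s = "ADOBECODEBANC", t = "ABC"
Output: "BANC"
Explanation: The minimum window substring "BANC" contains 'A', 'B', and 'C' from string t.

Example 2:

Input: s = "a", t = "a"
Output: "a"
Explanation: The entire string s is the minimum window substring.

Example 3:

Input: s = "a", t = "aa"
Output: ""
Explanation: Both 'a' characters of t must be contained in the substring of s,
so no substring satisfies the condition and the empty string is returned.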
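
The coverage rule from the notes and Example 3 can be written as a small check. This is only an illustrative sketch: the helper name `covers` is an assumption introduced here, and it relies on `collections.Counter` subtraction, which keeps positive counts only.

```python
from collections import Counter

def covers(window, t):
    # Counter subtraction drops zero and negative counts, so an empty
    # difference means window has at least as many of each character as t.
    return not (Counter(t) - Counter(window))

print(covers("BANC", "ABC"))  # True
print(covers("a", "aa"))      # False
```

The second call mirrors Example 3: one 'a' cannot cover the two 'a' characters required by t.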

Constraints:

  • m == s.length

  • n == t.length

  • 1 <= m, n <= 10^5

  • s and t consist of English letters


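- Problem analysis

  1. The problem asks for a contiguous substring of s; t acts as a multiset of characters that the substring must contain.
  2. There are three main computations: (1) enumerating the substrings of s, (2) comparing each substring's characters against t, and (3) comparing substring lengths.
  3. The basic enumeration is a double loop: for each starting position, examine the substrings that begin there. This gives a baseline time complexity of O(n^2). That figure is further multiplied by the cost of comparing against the characters of t, so the worst case is O(n^3).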
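
The double loop described above can be sketched as follows. This is a baseline for small inputs only, not one of the article's three listings; the function name `min_window_bruteforce` is an assumption made for illustration. Rebuilding a `Counter` for every candidate window is what pushes the cost toward O(n^3).

```python
from collections import Counter

def min_window_bruteforce(s, t):
    need = Counter(t)
    best = ""
    for start in range(len(s)):
        for end in range(start + 1, len(s) + 1):
            window = s[start:end]
            # Recount the whole window each time: O(n) per comparison
            if not (need - Counter(window)):
                if not best or len(window) < len(best):
                    best = window
                # The first covering window from this start is the shortest one
                break
    return best

print(min_window_bruteforce("ADOBECODEBANC", "ABC"))  # BANC
```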

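- Optimization ideas

  1. There are two directions: simplify the comparison between a substring and the characters of t, and reduce the number of iterations spent comparing substring lengths.
  2. Because t is fixed, the comparison can be decomposed into comparing the count of each character of t.
  3. The length comparison can be handled with a sliding window (two pointers).
  4. It follows that sliding the smallest window that satisfies the condition from left to right across s yields the minimum window substring.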
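
To make points 2 to 4 concrete, the sketch below traces the window on Example 1 and records each window at the moment it can no longer be shrunk from the left. It is an illustration under assumptions: the function name `locally_minimal_windows` and its variable names are invented here, and t is assumed non-empty, as the constraints guarantee.

```python
from collections import Counter

def locally_minimal_windows(s, t):
    need = Counter(t)   # required count per character of t
    have = Counter()    # counts of t's characters inside the window
    satisfied = 0       # distinct characters whose required count is met
    left = 0
    found = []
    for right, ch in enumerate(s):
        if ch in need:
            have[ch] += 1
            if have[ch] == need[ch]:
                satisfied += 1
        # Shrink from the left while the window still covers t
        while satisfied == len(need):
            out = s[left]
            if out in need:
                if have[out] == need[out]:
                    # Removing this character breaks coverage, so the
                    # current window is minimal for this right edge
                    found.append(s[left:right + 1])
                    satisfied -= 1
                have[out] -= 1
            left += 1
    return found

windows = locally_minimal_windows("ADOBECODEBANC", "ABC")
print(windows)                # ['ADOBEC', 'CODEBA', 'BANC']
print(min(windows, key=len))  # BANC
```

Each character enters and leaves the window at most once, so the scan is linear in the length of s, and the answer is the shortest recorded window.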

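  • CheckFuncPerf is a module I wrote for measuring a function's run time and memory usage. It is described in the article "CheckFuncPerf.py: a code unit for measuring function run time and memory usage, and how to use it".
  • The very long test strings come from the official site and have been uploaded to CSDN as "LeetCode: Minimum Window Substring test case, 100,000-character string and 10,000-character substring" (expected to pass review around January 31).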
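
CheckFuncPerf itself is not reproduced in this article. For readers who do not have it, a rough stand-in can be sketched with the standard library's `time.perf_counter` and `tracemalloc`. The function name `measure` and the message format are assumptions for illustration, not the module's actual code. Only the 'msg' and 'result' keys mirror how the return value is used in the listings, and the memory figure may be computed differently from the author's.

```python
import time
import tracemalloc

def measure(func, *args):
    # Hypothetical stand-in for cfp.getTimeMemoryStr, not the author's code
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args)
        elapsed_ms = (time.perf_counter() - start) * 1000
        _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    finally:
        tracemalloc.stop()
    msg = "{} ran in {:.2f} ms; peak traced memory {:.2f} KB".format(
        func.__name__, elapsed_ms, peak / 1024)
    return {"msg": msg, "result": result}

report = measure(sorted, "ADOBECODEBANC")
print(report["msg"], "result length = {}".format(len(report["result"])))
```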

  1. Standard solution with one level of optimization: it fell at the last hurdle and failed with a time limit exceeded.

    import CheckFuncPerf as cfp
    
    def minWindow(s, t):
        dict_t, dict_s = {}, {}
        list_buffer = []
        for achar in t:
            dict_t[achar] = dict_t.get(achar, 0) + 1
        ineedMeet = len(dict_t.keys())
        imeet, iright = 0, 0
        for iIdx in range(len(s)):
            if s[iIdx] in dict_t:
                list_buffer.append(iIdx)
                dict_s[s[iIdx]] = dict_s.get(s[iIdx], [])
                dict_s[s[iIdx]].append(iIdx)
        if len(dict_s.values()) < len(dict_t.values()):
            return ""
        dict_check = {}
        for key, value in dict_t.items():
            dict_check[key] = len(dict_s[key])
            if dict_check[key] >= value:
                imeet += 1
        if imeet < ineedMeet:
            return ""
        iminlen = len(s)
        bcanleft = True
        minleft, minright = 0, 0
        while bcanleft:
            lcharidx = list_buffer[0]
            ileft = lcharidx
            tmpdict = dict_check.copy()
            tmplist_buffer = list_buffer.copy()
            bcanright = True
            while bcanright:
                rcharidx = tmplist_buffer[-1]
                iright = rcharidx
                if tmpdict[s[rcharidx]] == dict_t[s[rcharidx]]:
                    bcanright = False
                else:
                    tmpdict[s[rcharidx]] -= 1
                    tmplist_buffer.pop(-1)
            if iminlen > iright - ileft:
                iminlen = iright - ileft + 1
                minleft = ileft
                minright = iright
            if dict_check[s[lcharidx]] == dict_t[s[lcharidx]]:
                bcanleft = False
            else:
                list_buffer.pop(0)
                dict_check[s[lcharidx]] -= 1
        return s[minleft:minright+1]
    
    s = open(r'testcase/hot12_big.txt', mode='r', encoding='utf-8').read()
    t = open(r'testcase/hot12_big_t.txt', mode='r', encoding='utf-8').read()
    result = cfp.getTimeMemoryStr(minWindow, s, t)
    print(result['msg'], 'result = {}'.format(len(result['result'])))
    
    # Output
    # Function minWindow ran in 1233597.94 ms; memory usage 176.00 KB; result = 10742
    
  2. Optimized version [filter s down to the characters of t + sliding window]: passable, beats 64% of submissions.

    def minWindow_ext1(s, t):
        dict_t, dict_s = {}, {}
        list_buffer = []
        for achar in t:
            dict_t[achar] = dict_t.get(achar, 0) + 1
        ineedMeet = len(dict_t.keys())
        imeet, iright = 0, 0
        for iIdx in range(len(s)):
            if s[iIdx] in dict_t:
                list_buffer.append(iIdx)
                dict_s[s[iIdx]] = dict_s.get(s[iIdx], [])
                dict_s[s[iIdx]].append(iIdx)
        if len(dict_s.values()) < len(dict_t.values()):
            return ""
        dict_check = {}
        for key, value in dict_t.items():
            dict_check[key] = len(dict_s[key])
            if dict_check[key] >= value:
                imeet += 1
        if imeet < ineedMeet:
            return ""
        iminlen, imaxright = len(s), list_buffer[-1]
        minleft, minright, imeet, ilistpos, ileftpos = 0, 0, 0, 0, 0
        ileft, iright = list_buffer[0], list_buffer[0]
        dict_check = {}
        while ilistpos < len(list_buffer):
            iright = list_buffer[ilistpos]
            acharidx = list_buffer[ilistpos]
            dict_check[s[acharidx]] = dict_check.get(s[acharidx], 0) + 1
            if dict_check[s[iright]] == dict_t[s[iright]]:
                imeet += 1
            while imeet == ineedMeet:
                if iminlen > iright - ileft:
                    iminlen = iright - ileft
                    minleft = ileft
                    minright = iright
                dict_check[s[ileft]] -= 1
                if dict_check[s[ileft]] < dict_t[s[ileft]]:
                    imeet -= 1
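                ileftpos += 1
                # Guard: when a single-character window at the end of list_buffer
                # is valid (e.g. s = "a", t = "a"), ileftpos would index past the list
                if ileftpos < len(list_buffer):
                    ileft = list_buffer[ileftpos]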
            ilistpos += 1
        return s[minleft:minright + 1]
    
    s = open(r'testcase/hot12_big.txt', mode='r', encoding='utf-8').read()
    t = open(r'testcase/hot12_big_t.txt', mode='r', encoding='utf-8').read()
    result = cfp.getTimeMemoryStr(minWindow_ext1, s, t)
    print(result['msg'], 'result = {}'.format(len(result['result'])))
    
    # Output
    # Function minWindow_ext1 ran in 84.02 ms; memory usage 1036.00 KB; result = 10742
    
  3. Enhanced version [sliding window + per-character dictionary counts in place of the set comparison]: somewhat better, beats 77% of submissions.

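    def minWindow_ext2(s, t):
        # Required count of each character in t
        dict_t = {}
        for tchar in t:
            dict_t[tchar] = dict_t.get(tchar, 0) + 1
        # Counts of t's characters inside the current window s[ileft:iright]
        dict_window = {}
        imeet = 0  # number of distinct characters whose required count is met
        ineedmeet = len(dict_t)
        ileft, iright, istartpos = 0, 0, 0
        iminlen = len(s)+1  # sentinel: no valid window found yet
        while iright < len(s):
            # Extend the window to the right by one character
            achar = s[iright]
            if achar in dict_t:
                dict_window[achar] = dict_window.get(achar, 0) + 1
                if dict_window[achar] == dict_t[achar]:
                    imeet += 1
            iright += 1
            # Shrink from the left while the window still covers t
            while imeet == ineedmeet:
                if iright - ileft < iminlen:
                    istartpos = ileft
                    iminlen = iright - ileft
                tmpChar = s[ileft]
                if tmpChar in dict_window:
                    # Dropping this character breaks coverage if it was exactly met
                    if dict_window[tmpChar] == dict_t[tmpChar]:
                        imeet -= 1
                    dict_window[tmpChar] -= 1
                ileft += 1
        if iminlen == len(s)+1:
            return ""
        return s[istartpos:istartpos + iminlen]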
    
    s = open(r'testcase/hot12_big.txt', mode='r', encoding='utf-8').read()
    t = open(r'testcase/hot12_big_t.txt', mode='r', encoding='utf-8').read()
    result = cfp.getTimeMemoryStr(minWindow_ext2, s, t)
    print(result['msg'], 'result = {}'.format(len(result['result'])))
    
    # Output
    # Function minWindow_ext2 ran in 77.02 ms; memory usage 8.00 KB; result = 10742
    

    Practice for a day and you gain a day's skill; skip a day and ten days' work is lost.

    may the odds be ever in your favor ~
