Python数据结构之哈希表与字符串

目录

  1. 哈希表的基础知识
  2. 最长回文串(LeetCode 409)
  3. 词语模式 (LeetCode 290)
  4. 同字符词语分组 (LeetCode 49)
  5. 无重复字符的最长子串 (LeetCode 3)
  6. 重复的DNA序列 (LeetCode 187)
  7. 最小窗口子串(LeetCode 76)

1. 哈希表的基础知识

哈希表是一种数据结构,其数据元素的地址或索引值由散列函数生成。这使得访问数据的速度更快,因为索引值是数据值的关键字。哈希表存储键值对,但键是通过哈希函数生成的。因此,数据元素的搜索和插入函数变得更快,因为键值本身成为存储数据的数组的索引。在Python中,字典数据类型表示哈希表的实现,字典中的键满足一下要求:

  • 字典的键是可散列的,即通过散列函数生成该散列函数,该散列函数为提供给散列函数的每个唯一值生成唯一结果
  • 字典中的数据元素顺序不固定。

2. 最长回文串(LeetCode 409 Longest Palindrome)

2.1题目

Given a string which consists of lowercase or uppercase letters, find the length of the longest palindromes that can be built with those letters.

This is case sensitive, for example “Aa” is not considered a palindrome here.

Note:
Assume the length of given string will not exceed 1,010.

Example:

Input:
"abccccdd"

Output:
7

Explanation:
One longest palindrome that can be built is "dccaccd", whose length is 7.

2.2思路

使用字符串中的字母组成的最长回文串的长度。使用集合存储字符串中的字符,如果字符未在集合中出现,则将该字符加入集合,如果该字符出现过,则将集合中的该字符移除。遍历字符串的所有元素,最后集合中剩下的元素为字符串中剩下的单个的字符。由于回文串的最中间位置可以单独放置一个成单的字符,所以,最长回文串的长度=len(s)-len(hash)+1

2.3代码

class Solution(object):
    def longestPalindrome(self, s):
        """
        :type s: str
        :rtype: int
        """
        hash_s = set()
        for i in s:
            if i not in hash_s:
                hash_s.add(i)
            else:
                hash_s.remove(i)
        return len(s) - len(hash_s) + 1 if len(hash_s) > 0 else len(s) 

3. 词语模式 (LeetCode 290 Word Pattern)

3.1题目

Given a pattern and a string str, find if str follows the same pattern.

Here follow means a full match, such that there is a bijection between a letter in pattern and a non-empty word in str.

Examples:
pattern = “abba”, str = “dog cat cat dog” should return true.
pattern = “abba”, str = “dog cat cat fish” should return false.
pattern = “aaaa”, str = “dog cat cat dog” should return false.
pattern = “abba”, str = “dog dog dog dog” should return false.
Notes:
You may assume pattern contains only lowercase letters, and str contains lowercase letters separated by a single space.

3.2思路

分别将pattern和str中的字符作为键,另一个字符串中对应的字符作为值,遍历2次,看是否存在键值不符合的情况,如果存在,则返回False。

3.3代码

class Solution(object):
    def wordPattern(self, pattern, str):
        """
        :type pattern: str
        :type str: str
        :rtype: bool
        """
        ans1 = {}
        ans2 = {}
        str1 = str.split(' ')
        l1 = len(pattern)
        l2 = len(str1)
        if l1 != l2:
            return False
        for i in range(l1):
            if str1[i] not in ans1:
                ans1[str1[i]] = pattern[i]
            else:
                if ans1[str1[i]] != pattern[i]:
                    return False
        for i in range(l1):
            if pattern[i] not in ans2:
                ans2[pattern[i]] = str1[i]
            else:
                if ans2[pattern[i]] != str1[i]:
                    return False
        return True

4. 同字符词语分组 (LeetCode 49 Group Anagrams)

4.1题目

Given an array of strings, group anagrams together.

For example, given: [“eat”, “tea”, “tan”, “ate”, “nat”, “bat”],
Return:

[
  ["ate", "eat","tea"],
  ["nat","tan"],
  ["bat"]
]

4.2思路

遍历每一个字符串,并将排序后的字符串结果作为键,对应的字符串作为值,返回字典所有值的结果即可

4.3代码

class Solution(object):
    def groupAnagrams(self, strs):
        """
        :type strs: List[str]
        :rtype: List[List[str]]
        """
        str_dict = {}
        for w in sorted(strs):
            key = tuple(sorted(w))
            str_dict[key] = str_dict.get(key,[]) + [w]
        return str_dict.values()

5. 无重复字符的最长子串 (LeetCode 3Longest Substring Without Repeating Characters)

5.1题目

Given a string, find the length of the longest substring without repeating characters.

Examples:

Given “abcabcbb”, the answer is “abc”, which the length is 3.

Given “bbbbb”, the answer is “b”, with the length of 1.

Given “pwwkew”, the answer is “wke”, with the length of 3. Note that the answer must be a substring, “pwke” is a subsequence and not a substring.

5.2思路

使用一个哈希表记录一个字符串的索引,以’abcdbea’为例,当检测到第二个‘b’时,应该从第一个b的下一个字符重新开始测算长度,但要把第一个a之前的字符在哈希表中对应的值清除,如果不清除的话,就会误以为还存在重复的。
遍历一次,使用i记录没有重复字母的字符串的起始位置,j记录没有重复字母的结束位置,不断更新子字符串的长度,当出现重复时,更新i,再求长度。

5.3代码

class Solution(object):
    def lengthOfLongestSubstring(self, s):
        """
        :type s: str
        :rtype: int
        """
        i,j,k = 0,0,0  
        map = {}  
        while i < len(s) and j < len(s):  
            if map.has_key(s[j]):  
                i = max(i, map[s[j]] + 1)  
            map[s[j]] = j  
            k = max(k, j - i + 1)  
            j += 1  
        return k  

6. 重复的DNA序列 (LeetCode 187 Repeated DNA Sequences)

6.1题目

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: “ACGAATTCCG”. When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].

6.2思路

从起始位置开始遍历,记录从起始位置开始的由10个字母组成的字符串,用字典记录每个字符串出现的次数,返回出现次数大于1的那些字符串结果的列表

6.3代码

class Solution(object):
    def findRepeatedDnaSequences(self, s):
        """
        :type s: str
        :rtype: List[str]
        """
        dict = {}
        for i in range(0,len(s)-9):
            key = s[i:i+10]
            if key not in dict:
                dict[key] = 1
            else:
                dict[key] += 1
        ans = []
        for s in dict:
            if dict[s] > 1:
                ans.append(s)
        return ans

7. 最小窗口子串(LeetCode 76 Minimum Window Substring)

7.1题目

Given a string S and a string T, find the minimum window in S which will contain all the characters in T in complexity O(n).

For example,
S = “ADOBECODEBANC”
T = “ABC”
Minimum window is “BANC”.

Note:
If there is no such window in S that covers all characters in T, return the empty string “”.

If there are multiple such windows, you are guaranteed that there will always be only one unique minimum window in S.

7.2思路

使用s[i:j]表示当前窗口,s[I:J]表示结果窗口,遍历一次,先将j右移,当满足条件时,i右移,直至i再右移就不满足情况为止,此时比较i,j和I,J的关系,更新I,J

7.3代码

class Solution(object):
    def minWindow(self, s, t):
        """
        :type s: str
        :type t: str
        :rtype: str
        """
        need, missing = collections.Counter(t), len(t)
        i = I = J = 0
        for j, c in enumerate(s,1):
            missing -= need[c] > 0
            need[c] -= 1
            if not missing:
                while i < j and need[s[i]] < 0:
                    need[s[i]] += 1
                    i += 1
                if not J or j - i <= J - I:
                    I, J = i, j
        return s[I:J]

你可能感兴趣的:(算法,数据分析,数据结构)