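Group Anagrams (LeetCode 49)
Given an array of strings, group the anagrams together. Two strings are anagrams of each other if they contain the same letters in a different order.
Input: ["eat", "tea", "tan", "ate", "nat", "bat"]
Output: [ ["ate","eat","tea"], ["nat","tan"], ["bat"] ]
All inputs consist of lowercase letters; the order of the output does not matter.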
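Use the sorted letters of each string as the key (the key can be viewed as a signature of the string), and append the original, unsorted string to the list stored under that key. Before appending, check whether the key already exists among the dictionary's keys. The techniques involved include the dict keys() and values() methods, the string join method, the sorted function, for loops, and list comprehensions. The code is as follows.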
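lst = ["eat", "tea", "tan", "ate", "nat", "bat"]
dct = {}
for i in lst:
    # sorted() returns a list, which is unhashable, so convert it to a string
    key = str(sorted(i))
    if key not in dct.keys():
        # First word with this signature: start a new group
        dct[key] = [i]
    else:
        dct[key].append(i)
res = [val for val in dct.values()]
print(res)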
Output:
[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
The dictionary dct itself holds:
{"['a', 'e', 't']": ['eat', 'tea', 'ate'], "['a', 'n', 't']": ['tan', 'nat'], "['a', 'b', 't']": ['bat']}
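The keys in the dictionary above are the printed form of a list, because str() was applied to the result of sorted(). As a small illustration (the variable names here are made up for this sketch), the snippet below compares that key style with the plain string produced by ''.join(). Both are hashable strings, so either works as a dictionary key.

```python
word = "tea"

# sorted() returns a list of characters; a list is unhashable and cannot be a dict key
chars = sorted(word)

# Two ways to turn that list into a hashable string key
key_repr = str(chars)        # the printed form of the list
key_join = "".join(chars)    # the characters concatenated into one string

# Anagrams map to the same key, whichever style is used
same_key = "".join(sorted("eat")) == key_join

print(key_repr)   # ['a', 'e', 't']
print(key_join)   # aet
print(same_key)   # True
```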
lst = ["eat", "tea", "tan", "ate", "nat", "bat"]
dct = {}
for word in lst:
    key = ''.join(sorted(word))
    # Note the use of get: the second argument is the default returned when key is missing
    dct[key] = dct.get(key, []) + [word]
res = [val for val in dct.values()]
print(res)
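Because the second argument of get specifies the value returned when the key is missing, the explicit existence check is no longer needed and the code becomes shorter.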
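To make the behavior of get concrete, here is a minimal sketch. It also shows dict.setdefault as one possible alternative, which is not used in the text above: get only returns the default without storing it, so the text's pattern builds a new list and assigns it back, whereas setdefault stores the default and lets you append in place.

```python
dct = {}

# get() returns the default when the key is missing, but does not store it
missing_value = dct.get("aet", [])
stored_after_get = "aet" in dct

# Pattern from the text: build a new list and assign it back
dct["aet"] = dct.get("aet", []) + ["eat"]
dct["aet"] = dct.get("aet", []) + ["tea"]

# Alternative: setdefault() stores the default if missing, then returns the stored list
dct.setdefault("ant", []).append("tan")
dct.setdefault("ant", []).append("nat")

print(missing_value)      # []
print(stored_after_get)   # False
print(dct)                # {'aet': ['eat', 'tea'], 'ant': ['tan', 'nat']}
```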
from collections import defaultdict

strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
dct = defaultdict(list)
for s in strs:
    # count[k] is the number of occurrences of the k-th letter of the alphabet
    count = [0] * 26
    for i in s:
        count[ord(i) - ord('a')] += 1
    # Note: a dict key must be immutable (hashable), so convert the list to a tuple
    key = tuple(count)
    dct[key].append(s)
res = [val for val in dct.values()]
print(res)
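Here a 26-dimensional vector count records how many times each letter occurs, and its tuple form serves as the key. The defaultdict class from collections initializes the value of any missing key to an empty list, so no existence check is needed, which makes the code more readable still.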
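As a small check of the counting idea, the sketch below wraps the count vector in a helper, count_key (a hypothetical name, not part of the original code), and compares the keys of a few words. It assumes the input contains only lowercase a-z, as the problem states.

```python
def count_key(word):
    """Return a 26-element tuple of letter counts (assumes lowercase a-z only)."""
    count = [0] * 26
    for ch in word:
        count[ord(ch) - ord('a')] += 1
    return tuple(count)

# Anagrams produce identical count vectors; non-anagrams do not
print(count_key("eat") == count_key("tea"))   # True
print(count_key("eat") == count_key("bat"))   # False

# The first three entries count 'a', 'b' and 'c'
print(count_key("abba")[:3])                  # (2, 2, 0)
```

Counting visits each character once, so building the key is linear in the word length, while the sorting-based key costs O(k log k) for a word of length k.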
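Assign a distinct prime to each letter. By unique prime factorization, the product of the primes for a given combination of letters is unique, so it can serve as the signature and become the dictionary key.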
import math as m

def isprime(num):
    # Trial division up to sqrt(num); num is assumed to be >= 2
    for i in range(2, int(m.sqrt(num)) + 1):
        if num % i == 0:
            return False
    return True

# Collect the first 26 primes, one per lowercase letter
count = 0
num = 2
prims = []
while count < 26:
    if isprime(num):
        count += 1
        prims.append(num)
    num += 1

def codewithprime(word):
    # Multiply together the primes assigned to the letters of the word
    code = 1
    for s in word:
        index = ord(s) - ord('a')
        code *= prims[index]
    return code

strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
dct = {}
for word in strs:
    key = codewithprime(word)
    dct[key] = dct.get(key, []) + [word]
res = [val for val in dct.values()]
print(res)
[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
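To see the prime signature in concrete numbers, here is a minimal sketch with the first 26 primes written out by hand instead of generated. PRIMES and prime_key are names chosen for this illustration only. With a = 2, e = 11 and t = 71, both "eat" and "tea" give 11 * 2 * 71 = 1562.

```python
# The first 26 primes, one per lowercase letter (a -> 2, b -> 3, ..., z -> 101)
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]

def prime_key(word):
    """Product of the primes assigned to each letter (assumes lowercase a-z only)."""
    code = 1
    for ch in word:
        code *= PRIMES[ord(ch) - ord('a')]
    return code

print(prime_key("eat"))   # 1562
print(prime_key("tea"))   # 1562
print(prime_key("tan"))   # 6106
print(prime_key("bat"))   # 426
```

Python integers have arbitrary precision, so the product cannot overflow, although it does grow with the length of the word.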