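Counter is a class in Python's standard-library collections module. It is especially convenient for counting how many times each object occurs.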
class Counter(dict):
'''Dict subclass for counting hashable items. Sometimes called a bag
or multiset. Elements are stored as dictionary keys and their counts
are stored as dictionary values.
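In short: Counter is a dict subclass that counts hashable objects. Each distinct element becomes a key and its count becomes the value; "bag" and "multiset" are other names for the same structure.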
>>> c = Counter('abcdeabcdabcaba') # count elements from a string
>>> c.most_common(3) # three most common elements
[('a', 5), ('b', 4), ('c', 3)]
>>> sorted(c) # list all unique elements
['a', 'b', 'c', 'd', 'e']
>>> ''.join(sorted(c.elements())) # list elements with repetitions
'aaaaabbbbcccdde'
>>> sum(c.values()) # total of all counts
15
>>> c['a'] # count of letter 'a'
5
>>> for elem in 'shazam': # update counts from an iterable
... c[elem] += 1 # by adding 1 to each element's count
>>> c['a'] # now there are seven 'a'
7
>>> del c['b'] # remove all 'b'
>>> c['b'] # now there are zero 'b'
0
>>> d = Counter('simsalabim') # make another counter
>>> c.update(d) # add in the second counter
>>> c['a'] # now there are nine 'a'
9
>>> c.clear() # empty the counter
>>> c
Counter()
Note: If a count is set to zero or reduced to zero, it will remain
in the counter until the entry is deleted or the counter is cleared:
>>> c = Counter('aaabbc')
>>> c['b'] -= 2 # reduce the count of 'b' by two
>>> c.most_common() # 'b' is still in, but its count is zero
[('a', 3), ('c', 1), ('b', 0)]
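The note above says zero counts stay in the counter. A minimal sketch of two ways to get rid of them, assuming Python 3.3 or later for the unary plus operator (the variable name positive_only is only illustrative):

```python
from collections import Counter

c = Counter('aaabbc')
c['b'] -= 2            # the count of 'b' drops to zero, but the key stays
print('b' in c)        # True

positive_only = +c     # unary plus builds a new Counter without counts <= 0
print(positive_only)   # Counter({'a': 3, 'c': 1})

del c['b']             # alternatively, delete the entry explicitly
print('b' in c)        # False
```

Unary plus leaves the original counter untouched, whereas del modifies it in place.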
>>> c = Counter('abcdeabcdabcaba')
>>> c.items()
dict_items([('a', 5), ('b', 4), ('c', 3), ('d', 2), ('e', 1)])
>>> c['a']
5
>>> c.get("a")
5
>>> c.keys()
dict_keys(['a', 'b', 'c', 'd', 'e'])
>>> c.values()
dict_values([5, 4, 3, 2, 1])
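>>> c.pop("a") # remove 'a' and return its count
5
>>> c.most_common(3) # the three most common elements, now that 'a' is gone
[('b', 4), ('c', 3), ('d', 2)]
>>> c = Counter('abcdeabcdabcaba') # rebuild the counter
>>> c
Counter({'a': 5, 'b': 4, 'c': 3, 'd': 2, 'e': 1})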
>>> c.update("abc") # add the counts from an iterable to c
>>> c
Counter({'a': 6, 'b': 5, 'c': 4, 'd': 2, 'e': 1})
>>> d = Counter("abc")
>>> c.update(d) # update() also accepts another Counter
>>> c
Counter({'a': 7, 'b': 6, 'c': 5, 'd': 2, 'e': 1})
>>> c.clear()
>>> c
Counter()
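The update() calls above add to existing counts rather than overwriting them, which is different from dict.update(). A small sketch of that contrast (the names counts and plain are made up for this example):

```python
from collections import Counter

counts = Counter(a=1, b=2)
counts.update({'a': 10})   # a mapping adds its values: 'a' becomes 11
counts.update('ab')        # an iterable counts each element: 'a' 12, 'b' 3
print(counts)              # Counter({'a': 12, 'b': 3})

plain = dict(a=1, b=2)
plain.update({'a': 10})    # dict.update() replaces the value: 'a' becomes 10
print(plain)               # {'a': 10, 'b': 2}
```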
# Multiset (set-like) operations between Counters
>>> c = Counter(a=3, b=1, c=5)
>>> d = Counter(a=1, b=2, d=4)
>>> c + d # addition: counts of matching keys are added
Counter({'c': 5, 'a': 4, 'd': 4, 'b': 3})
>>> c - d # subtraction: counts of matching keys are subtracted, only positive results are kept
Counter({'c': 5, 'a': 2})
>>> c & d # intersection: keys present in both, with the smaller count
Counter({'a': 1, 'b': 1})
>>> c | d # union: all keys, with the larger count when a key is in both
Counter({'c': 5, 'd': 4, 'a': 3, 'b': 2})
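The - operator drops zero and negative results. When negative counts are needed, the subtract() method keeps them. A short sketch reusing the same c and d (the names difference and in_place are only for illustration):

```python
from collections import Counter

c = Counter(a=3, b=1, c=5)
d = Counter(a=1, b=2, d=4)

difference = c - d       # zero and negative results are dropped
print(difference)        # Counter({'c': 5, 'a': 2})

in_place = Counter(c)    # copy, so that c itself is left untouched
in_place.subtract(d)     # zero and negative counts are kept
print(in_place['b'])     # -1
print(in_place['d'])     # -4
```

subtract() modifies the counter in place and returns None, so it is applied to a copy here.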
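Common patterns:
sum(c.values()) # total of all counts; values() is inherited from dict and returns a view
c.clear() # reset all counts; clear() is inherited from dict
list(c) # list of the unique elements (keys)
set(c) # set of the unique elements (keys)
dict(c) # convert to a regular dict
list(c.items()) # list of (element, count) pairs
Counter(dict(list_of_pairs)) # build a Counter from a list of (element, count) pairs
c.most_common()[:-n-1:-1] # the n least common (element, count) pairs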
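A few of the patterns listed above, run against the sample string used earlier. This is only a sketch; n = 2 and the variable names are arbitrary choices for the example:

```python
from collections import Counter

c = Counter('abcdeabcdabcaba')
n = 2  # how many of the least common elements to show (arbitrary)

total = sum(c.values())                    # total of all counts
pairs = list(c.items())                    # list of (element, count) pairs
rebuilt = Counter(dict(pairs))             # round trip back to a Counter
least_common = c.most_common()[:-n-1:-1]   # slice backwards from the end

print(total)         # 15
print(rebuilt == c)  # True
print(least_common)  # [('e', 1), ('d', 2)]
```

The slice works because most_common() with no argument returns all pairs sorted from most to least common, so reading the list backwards from the end gives the rarest elements first.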
Note: Counter implements the dict __missing__ method, so looking up a key that does not exist with c[key] returns 0 instead of raising KeyError.
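A brief sketch of what that means in practice, compared with a plain dict (the key 'z' and the name plain are just for illustration):

```python
from collections import Counter

c = Counter('aab')
print(c['z'])        # 0, no KeyError
print('z' in c)      # False: the lookup did not create an entry
print(c.get('z'))    # None: get() does not go through __missing__

plain = dict(c)
try:
    plain['z']
except KeyError:
    print('a plain dict raises KeyError')
```

Only the square-bracket lookup uses __missing__, so get() and the in operator behave exactly as they do for an ordinary dict.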