我们知道python内建模块的collections有很多好用的操作。
比如:
from collections import Counter
#统计字符串
# top n问题
user_counter = Counter("abbafafpskaag")
print(user_counter.most_common(3)) #[('a', 5), ('b', 2), ('f', 2)]
print(user_counter['a']) # 5
python里面实现most_common:
def most_common(self, n=None):
'''List the n most common elements and their counts from the most
common to the least. If n is None, then list all element counts.
>>> Counter('abcdeabcdabcaba').most_common(3)
[('a', 5), ('b', 4), ('c', 3)]
'''
# Emulate Bag.sortedByCount from Smalltalk
if n is None:
return sorted(self.items(), key=_itemgetter(1), reverse=True)
return _heapq.nlargest(n, self.items(), key=_itemgetter(1))
这里用到了个 _heapq
堆数据结构 也就是说它是堆来解决top n问题的,而不是遍历。
总结:most_common()函数用来实现Top n 功能.