How to count how often each element occurs in a sequence

First, generate a random sequence

Python
# randint is not a builtin; import it first: from random import randint
In [51]: data = [randint(0, 10) for i in range(30)]

In [52]: data
Out[52]:

"""
In [55]: dict.fromkeys?
Signature: dict.fromkeys(iterable, value=None, /)
Docstring: Returns a new dict with keys from iterable and values equal to value.
Type:      builtin_function_or_method
"""

# Build a dict from data: each distinct element becomes a key with a default value of 0
In [53]: new_data = dict.fromkeys(data, 0)

In [54]: new_data
Out[54]: {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 10: 0}
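
The setup step can also be run as a plain script. The sketch below is only an illustration: the seed value 42 and the name `counts` are assumptions, not part of the original session. It shows that `dict.fromkeys` collapses duplicates, so the dict ends up with one key per distinct value.

```python
import random

random.seed(42)  # assumed seed, used only so that repeated runs produce the same data
data = [random.randint(0, 10) for _ in range(30)]

# One key per distinct value in data, with every count starting at 0.
counts = dict.fromkeys(data, 0)

print(len(data))                               # number of generated elements
print(set(counts) == set(data))                # keys are exactly the distinct values
print(all(v == 0 for v in counts.values()))    # every count starts at zero
```

Because `dict.fromkeys` assigns the same value object to every key, it is a good fit for an immutable default such as 0, but not for a mutable default such as a list.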

Method 1: count frequencies by iterating over the sequence

Python
In [56]: for x in data:  # walk the list and add 1 to the matching key in new_data
    ...:     new_data[x] += 1
    ...:

In [57]: new_data
Out[57]: {0: 3, 1: 2, 2: 3, 3: 4, 4: 2, 5: 5, 6: 4, 7: 3, 8: 1, 10: 3}
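
The same loop can be wrapped in a small function. This is a minimal sketch, not the original author's code: the function name `count_frequencies` and the sample list are made up for illustration, and it uses `dict.get` so the dict does not have to be pre-filled with `dict.fromkeys`.

```python
def count_frequencies(items):
    """Return a dict mapping each distinct item to the number of times it occurs."""
    counts = {}
    for item in items:
        # dict.get falls back to 0 the first time an item is seen
        counts[item] = counts.get(item, 0) + 1
    return counts


sample = [3, 1, 3, 2, 1, 3]  # made-up data for illustration
result = count_frequencies(sample)
print(result)  # {3: 3, 1: 2, 2: 1}

# Sort the (item, count) pairs by count, highest first, and keep the top two.
top_two = sorted(result.items(), key=lambda pair: pair[1], reverse=True)[:2]
print(top_two)  # [(3, 3), (1, 2)]
```

Getting the most frequent items this way takes an extra sort, which is one reason a dedicated counting type is more convenient.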

# A more convenient way to count frequencies is the Counter class in the collections module

Python
In [60]: from collections import Counter

In [61]: c = Counter(data)

In [62]: c
Out[62]: Counter({0: 3, 1: 2, 2: 3, 3: 4, 4: 2, 5: 5, 6: 4, 7: 3, 8: 1, 10: 3})

In [63]: c.most_common(3)  # the three most frequent elements
Out[63]: [(5, 5), (6, 4), (3, 4)]
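
Counter behaves like a dict with a few extras that are handy for counting. The sketch below uses a made-up list (`sample` is an assumption, not data from the session above) to show three of them: a count of zero for missing keys, `update` for adding more observations, and `most_common`.

```python
from collections import Counter

sample = [5, 5, 5, 6, 6, 3]  # made-up data for illustration
c = Counter(sample)

print(c[5])    # 3
print(c[99])   # 0: a missing key reports zero instead of raising KeyError

print(c.most_common(2))  # [(5, 3), (6, 2)]

# update() adds new observations to the existing counts
c.update([3, 3, 3])
print(c.most_common(1))  # [(3, 4)]
```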

# Counting word frequencies in an English text

Python
txt = """
We make parts from your 3D models by CNC machining and 3D printing. We struggle with what to call ourselves since we are not a traditional machine shop, molder or a 3D printing service bureau. Regardless, once you work with ETCN, you will call us a trusted partner.
 
Currnetly, we have CNC machining center, automatic lathe, surface grinding machine, meter lathe, CNC lathe, milling machine, plastic injection machine and stamping machine etc. Our capabilities include lathe work, wire cutting, deburring, sht and sand blasting, heat treatment, surface plating, stamping, casting, copper parts, machining and power coating.
 
We have the expertise to 100% inspect our components before distribution ensuring only the highest quality to our customers. Our high quality and low cost is what we are known for and is the key to our success for many years. Most of products have been widely applied in electric motor components, automotive, electronic instrument parts, auto machine accessory, communication equipments, medical treatment, sporting equipment, optics instrument, fire protection devices etc.
 
Insisting on the tenet of “Best Quality, Lowest Price, Best Service”, we are looking forward to build long term business relationships with you for mutual benefit.
"""
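# The re module must be imported first: import re
In [69]: newlist = re.split(r'\W', txt)
# \w matches letters, digits, and underscores
# \W matches any character that is not a letter, digit, or underscore
# Splitting on every single non-word character leaves an empty string between
# two adjacent separators (for example ". "), hence the '' entry in the output
In [70]: new = Counter(newlist)

In [71]: new


In [72]: new.most_common(10)
Out[72]:
[('', 47),
 ('and', 6),
 ('machine', 6),
 ('to', 5),
 ('we', 4),
 ('lathe', 4),
 ('the', 4),
 ('We', 3),
 ('parts', 3),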
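
In the result above, the empty string tops the list and 'we' and 'We' are counted separately. One possible way to avoid both, sketched here as a variation rather than the original approach, is to lower-case the text and extract words with `re.findall` instead of splitting. The `excerpt` string is just the opening of the sample text, and the variable names are assumptions.

```python
import re
from collections import Counter

# A short excerpt of the sample text, used here only to keep the example small.
excerpt = (
    "We make parts from your 3D models by CNC machining and 3D printing. "
    "We struggle with what to call ourselves since we are not a traditional machine shop."
)

# Splitting on single non-word characters leaves empty strings behind.
print("" in re.split(r"\W", excerpt))  # True

# findall(r"\w+") returns only runs of word characters, so no empty strings appear;
# lower() merges "We" and "we" into one key.
words = re.findall(r"\w+", excerpt.lower())
word_counts = Counter(words)

print(word_counts.most_common(2))  # [('we', 3), ('3d', 2)]
```

Whether to fold case depends on the goal: for a plain word-frequency list it usually helps, but it also merges acronyms such as "CNC" with any lower-case spelling.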


