Installing pip3:
sudo apt-get update
sudo apt-get install python3-pip
Installing requests with pip3:
sudo pip3 install requests
Fetching a simple web page:
#!/usr/bin/env python3
import requests


def download(url):
    """Download the file at url into the current directory."""
    req = requests.get(url)
    if req.status_code == 404:
        print('No such file found at %s' % url)
        return
    # Use the last path segment of the URL as the local file name.
    filename = url.split('/')[-1]
    with open(filename, 'wb') as fobj:
        fobj.write(req.content)
    print("Download over.")


if __name__ == '__main__':
    url = input('Enter a URL: ')
    download(url)
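
One weak spot in the script above is the file name: url.split('/')[-1] is an empty string when the URL ends with a slash, and it keeps any query string. The sketch below shows one possible way to derive a safer name using only the standard library. The helper name filename_from_url and the fallback 'index.html' are assumptions for illustration, not part of the original script.

```python
import posixpath
from urllib.parse import urlparse, unquote


def filename_from_url(url, default='index.html'):
    """Return a local file name for url, or default if the path has none.

    Hypothetical helper: the name and the default value are assumptions.
    """
    path = urlparse(url).path                  # drops query string and fragment
    name = posixpath.basename(unquote(path))   # last path segment, %XX decoded
    return name or default


print(filename_from_url('https://example.com/files/report.pdf?download=1'))
print(filename_from_url('https://example.com/docs/'))
```

In download(), the line that sets filename could then call this helper instead of splitting the URL by hand.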
Counter example
>>> from collections import Counter
>>> import re
>>> path = '/usr/lib/python3.4/LICENSE.txt'
>>> words = re.findall(r'\w+', open(path).read().lower())
>>> Counter(words).most_common(10)
[('the', 80), ('or', 78), ('1', 66), ('of', 61), ('to', 50), ('and', 48), ('python', 46), ('in', 38), ('license', 37), ('any', 37)]
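
The session above depends on a file that exists only on some systems and never closes it. As a rough sketch, the same counting can be wrapped in small functions that work on any text. The function names top_words and top_words_in_file and the sample sentence are made up for illustration.

```python
import re
from collections import Counter


def top_words(text, n=3):
    """Return the n most common lowercase words in text with their counts."""
    words = re.findall(r'\w+', text.lower())
    return Counter(words).most_common(n)


def top_words_in_file(path, n=10):
    """Same as top_words, but read the text from a file and close it afterwards."""
    with open(path, encoding='utf-8') as fobj:
        return top_words(fobj.read(), n)


sample = 'the cat and the dog and the bird'
print(top_words(sample, 2))  # [('the', 3), ('and', 2)]
```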
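Counter objects have a method called elements(). It returns an iterator that repeats each element as many times as its count and skips elements whose count is less than one. The order of the elements is arbitrary in older Python versions; from Python 3.7 on, elements come out in the order they were first encountered.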
>>> c = Counter(a=4, b=2, c=0, d=-2)
>>> list(c.elements())
['b', 'b', 'a', 'a', 'a', 'a']
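
Note that 'c' and 'd' are missing from the result because their counts are zero and negative. The short sketch below sorts the output so that it does not depend on the Python version, and checks that the keys are still stored in the Counter even though elements() skips them.

```python
from collections import Counter

c = Counter(a=4, b=2, c=0, d=-2)

# Sort so the result is the same regardless of iteration order.
expanded = sorted(c.elements())
print(expanded)  # ['a', 'a', 'a', 'a', 'b', 'b']

# The length equals the sum of the positive counts only.
print(len(expanded) == sum(n for n in c.values() if n > 0))  # True

# Keys with zero or negative counts remain in the Counter itself.
print('c' in c, 'd' in c)  # True True
```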
The most_common() method returns the most common elements and their counts, ordered from most common to least.
>>> Counter('abracadabra').most_common(3)
[('a', 5), ('r', 2), ('b', 2)]
Using namedtuple:
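
The notes give no example here, so the following is only a minimal sketch of typical namedtuple usage. The Point type and its fields x and y are hypothetical, chosen for illustration.

```python
from collections import namedtuple

# Point is a hypothetical record type with two fields, x and y.
Point = namedtuple('Point', ['x', 'y'])

p = Point(10, y=20)   # positional and keyword arguments both work
print(p)              # Point(x=10, y=20)
print(p.x + p.y)      # 30

x, y = p              # unpacks like a regular tuple
print(p[0], x, y)     # 10 10 20

# Instances are immutable; _replace() returns a modified copy.
q = p._replace(x=1)
print(q)              # Point(x=1, y=20)
```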
defaultdict example
>>> from collections import defaultdict
>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = defaultdict(list)
>>> for k, v in s:
...     d[k].append(v)
...
>>> d.items()
dict_items([('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])])
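
To make clearer what defaultdict(list) does in the session above, the sketch below compares it with a plain dict and setdefault(), and shows that merely reading a missing key creates it. The key 'green' is an arbitrary example.

```python
from collections import defaultdict

s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]

# With defaultdict, a missing key is created by calling list() on first access.
grouped = defaultdict(list)
for k, v in s:
    grouped[k].append(v)

# The same grouping with a plain dict needs setdefault().
plain = {}
for k, v in s:
    plain.setdefault(k, []).append(v)

print(dict(grouped) == plain)    # True
print(sorted(grouped.items()))   # [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

# Side effect to keep in mind: looking up a missing key inserts it.
print('green' in grouped)        # False
grouped['green']
print('green' in grouped)        # True
```

When a lookup should not create the key, grouped.get('green') or a membership test with in is the safer choice.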