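I used to download web content concurrently with greenevent coroutines, which was convenient and fast. Recently I have been learning eventlet and wanted to combine it with requests for concurrent downloads. After a lot of trial and error, I ran into a few problems.
The first example in the official Eventlet documentation: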
import eventlet
from eventlet.green import urllib2

urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
        "https://www.python.org/static/img/python-logo.png",
        "http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"]

def fetch(url):
    return urllib2.urlopen(url).read()

pool = eventlet.GreenPool()
for body in pool.imap(fetch, urls):
    print("got body", len(body))
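It fails with:
ModuleNotFoundError: No module named 'urllib2'
The cause is the Python version: I am on 3.7.2, where eventlet.green has no urllib2 module (urllib2 only exists on Python 2). Using from eventlet.green.urllib import request instead makes the example run.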
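I still wanted to use requests, so I patched it, which failed with:
GreenSSLSocket does not have a public constructor
This is also a Python 3.7 problem. I did not find a good fix; switching to Python 3.6.5 made it work.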
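With patched requests working, I compared the time needed to download 13 web pages:
- Serial download: roughly 3-8 seconds, with large variance
- request from green.urllib: about 2 seconds
- Patched requests: almost always under 1 second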
import time
import eventlet
from eventlet.green.urllib import request  # green version of urllib's request
import requests  # plain, unpatched requests
# requests = eventlet.import_patched('requests')  # patched requests

urls = [
    "http://tmall.com",
    "http://sohu.com",
    "http://jd.com",
    "http://sina.com.cn",
    "http://www.baidu.com",
    "http://www.qq.com",
    "http://weibo.com",
    "http://alipay.com",
    "http://bilibili.com",
    "http://hao123.com",
    "http://xinhuanet.com",
    "http://163.com",
    "http://csdn.net",
]

def fetch(url):
    return request.urlopen(url)  # using urllib's request
    # return requests.get(url)  # using patched requests

# Concurrent download through a green pool
pool = eventlet.GreenPool()
start = time.time()
for body in pool.imap(fetch, urls):
    # print(body.url, len(body.text))  # using patched requests
    print(len(body.read()))  # using urllib's request
print(time.time() - start)

# Serial download for comparison
start = time.time()
for i in urls:
    print(i, len(requests.get(i).content))
print(time.time() - start)