When a requests.Session is created locally and many threads open URLs through it at high concurrency, the following warning is logged:
requests.packages.urllib3.connectionpool:Connection pool is full, discarding connection
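import requests
from requests.adapters import HTTPAdapter
from threading import Thread

def test_pool_max_size():
    # Test the effect of pool_maxsize on multi-threaded access
    def thread_get(url):
        s.get(url)

    s = requests.Session()
    # One cached pool, keeping at most one connection per host
    s.mount('https://', HTTPAdapter(pool_connections=1, pool_maxsize=1))
    ts = []
    for _ in range(2):
        t = Thread(target=thread_get, args=('https://www.ask.com',))
        ts.append(t)
        t.start()
    for t in ts:
        t.join()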
HTTPAdapter manages the connections for the URL prefix it is mounted on. Here it is constructed with pool_connections=1 and pool_maxsize=1.
HTTPAdapter.__init__ creates a PoolManager:
def __init__(self, pool_connections=DEFAULT_POOLSIZE, pool_maxsize=DEFAULT_POOLSIZE ...):
    ...
    self.init_poolmanager(pool_connections, pool_maxsize, block=pool_block)

def init_poolmanager(self, connections, maxsize ...):
    ...
    self.poolmanager = PoolManager(num_pools=connections, maxsize=maxsize,
                                   block=block, strict=True, **pool_kwargs)
In PoolManager, num_pools (that is, pool_connections) sets the size of a RecentlyUsedContainer, which caches the most recently used HTTPConnectionPool/HTTPSConnectionPool objects. Each HTTPConnectionPool/HTTPSConnectionPool manages all connections to one (host, port) pair.
def __init__(self, num_pools=10 ...):
    ...
    self.pools = RecentlyUsedContainer(num_pools,
                                       dispose_func=lambda p: p.close())
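To make the eviction behaviour concrete, here is a minimal stand-in for that container built on collections.OrderedDict. It is only a sketch under simplifying assumptions: TinyRecentlyUsed and FakePool are hypothetical names, not urllib3 classes, and thread safety is ignored. It shows that with a capacity of 1 (as with pool_connections=1), storing a pool for a second host evicts and closes the pool for the first.

```python
from collections import OrderedDict


class FakePool:
    """Hypothetical stand-in for HTTPConnectionPool; only records close()."""

    def __init__(self, host):
        self.host = host
        self.closed = False

    def close(self):
        self.closed = True


class TinyRecentlyUsed:
    """Simplified LRU mapping: evicts the least recently used entry when full."""

    def __init__(self, maxsize, dispose_func):
        self.maxsize = maxsize
        self.dispose_func = dispose_func
        self._items = OrderedDict()

    def __getitem__(self, key):
        # Reading an entry marks it as most recently used.
        self._items.move_to_end(key)
        return self._items[key]

    def __setitem__(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.maxsize:
            # Drop the oldest entry and hand it to the cleanup callback.
            _, evicted = self._items.popitem(last=False)
            self.dispose_func(evicted)

    def __len__(self):
        return len(self._items)


pools = TinyRecentlyUsed(1, dispose_func=lambda p: p.close())
first = FakePool("a.example")
second = FakePool("b.example")
pools["a.example"] = first
pools["b.example"] = second  # exceeds maxsize=1, so `first` is evicted and closed
```

After the second assignment only the pool for b.example remains cached, so a later request to a.example would have to build a new pool and new connections.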
Each HTTPConnectionPool/HTTPSConnectionPool has a pool of its own: a LifoQueue whose capacity is set by pool_maxsize and which holds HTTPConnection objects. Each HTTPConnection wraps a socket connection to the server.
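The behaviour of such a bounded LIFO queue can be tried with the standard library alone. This is a sketch rather than urllib3 code: the strings stand in for HTTPConnection objects, and try_put is a hypothetical helper.

```python
import queue


def try_put(pool, conn):
    """Return True if conn fits in the pool, False if the pool is already full."""
    try:
        pool.put(conn, block=False)  # non-blocking put raises queue.Full when full
        return True
    except queue.Full:
        return False


pool = queue.LifoQueue(maxsize=2)  # maxsize plays the role of pool_maxsize
accepted = [try_put(pool, name) for name in ("conn-1", "conn-2", "conn-3")]
reused = pool.get(block=False)  # LIFO: the most recently stored item comes out first
```

The third connection does not fit, and the next request would reuse the connection that was returned last, which is the one most likely to still be alive.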
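pool_connections, applied in the PoolManager, limits how many HTTPConnectionPool/HTTPSConnectionPool objects for different hosts are kept in the cache.
pool_maxsize, applied in each HTTPConnectionPool/HTTPSConnectionPool, limits how many connections to the same host are kept in the cache. A single-threaded program only ever has one connection in use at a time, so the limit is never exceeded. With multiple threads, issuing more simultaneous requests to one host than pool_maxsize does not fail at request time. When each request completes, its connection is put back into the pool's LifoQueue. If more connections come back than the queue can hold, the surplus connections are dropped and the message "Connection pool is full, discarding connection" is logged.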
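The sequence described above can be replayed deterministically without threads or network access. The following is a simplified model, not the library's implementation: PoolModel and simulate are hypothetical names, non-blocking behaviour (pool_block=False) is assumed, and "concurrent" requests are modelled by checking out several connections before returning any.

```python
import queue


class PoolModel:
    """Simplified model of a per-host pool, assuming non-blocking behaviour."""

    def __init__(self, maxsize):
        self.pool = queue.LifoQueue(maxsize)
        for _ in range(maxsize):
            self.pool.put(None)  # empty slots; connections are created lazily
        self.created = 0
        self.discarded = 0

    def get_conn(self):
        try:
            conn = self.pool.get(block=False)
        except queue.Empty:
            conn = None  # pool exhausted: no error, an extra connection is opened
        if conn is None:
            self.created += 1
            conn = f"conn-{self.created}"
        return conn

    def put_conn(self, conn):
        try:
            self.pool.put(conn, block=False)
        except queue.Full:
            self.discarded += 1  # the point where the warning would be logged


def simulate(maxsize, concurrent):
    """Check out `concurrent` connections at once, then return them all."""
    model = PoolModel(maxsize)
    in_flight = [model.get_conn() for _ in range(concurrent)]
    for conn in in_flight:
        model.put_conn(conn)
    return model.created, model.discarded


tight = simulate(maxsize=1, concurrent=2)  # mirrors the test above
roomy = simulate(maxsize=2, concurrent=2)
```

In this model the number of discarded connections is the amount by which the in-flight requests exceed the pool size. That suggests two ways to avoid the warning: mount an HTTPAdapter whose pool_maxsize is at least the number of worker threads, or pass pool_block=True so that extra threads wait for a free connection instead of opening a new one.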