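urllib3 is an HTTP client library for Python.
The Python standard library provides urllib. Python 2 additionally shipped urllib2; Python 3 merged urllib and urllib2 into a single standard-library package named urllib. urllib3 is a separate third-party package and is not part of the standard library.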
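1. Features of urllib3
Thread safety
Connection pooling
Client-side SSL/TLS verification
File uploads
Request retries
HTTP redirects
Support for gzip and deflate encoding
Proxy support for HTTP and SOCKS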
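Several of the features listed above (connection pooling, retries) are configured on the PoolManager object. The following is a minimal sketch, assuming urllib3 is installed; the pool sizes, retry counts, status codes, and timeout values are illustrative assumptions, not recommendations from this article.

```python
import urllib3
from urllib3.util import Retry, Timeout

# Retry up to 3 times with a growing delay between attempts, and also
# retry on these (illustrative) server-side status codes.
retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[502, 503, 504])

# Separate limits for establishing the connection and for reading data.
timeout = Timeout(connect=2.0, read=5.0)

# A single PoolManager is meant to be created once and shared.
http = urllib3.PoolManager(
    num_pools=10,  # how many per-host connection pools to keep
    maxsize=4,     # how many connections to keep in each pool
    retries=retry,
    timeout=timeout,
)

print(retry.total, timeout.connect_timeout, timeout.read_timeout)
```

Nothing is sent over the network here; the settings only take effect once a request is made through `http`.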
2. Installation
urllib3 is not part of the Python 3 standard library and must be installed separately, for example with pip:
pip install urllib3
3. Usage
1) HTTP GET requests
>>> import urllib3
>>> http = urllib3.PoolManager()
>>> r = http.request('GET', 'http://httpbin.org/robots.txt')
>>> r.status
200
>>> r.data
...
>>> r.headers
...
Note: only requests issued through a PoolManager object get connection pooling and thread safety.
Every request returns an HTTPResponse object, whose attributes include status, data, and headers.
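The three attributes can be inspected without touching the public internet. The sketch below is only an illustration: it starts a throwaway local server (the `RobotsHandler` class, its response body, and the port are assumptions made for this example) and requests it through a PoolManager.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import urllib3

BODY = b"User-agent: *\nDisallow: /deny\n"


class RobotsHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a tiny text body."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the demo output quiet


# Port 0 lets the operating system pick a free port.
server = HTTPServer(("127.0.0.1", 0), RobotsHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/robots.txt" % server.server_address[1]

http = urllib3.PoolManager()
try:
    r = http.request("GET", url)
finally:
    server.shutdown()
    server.server_close()

print(r.status)                   # integer status code
print(r.headers["Content-Type"])  # header lookup is case-insensitive
print(r.data)                     # body as bytes
```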
2) HTTP POST requests
>>> import urllib3
>>> http = urllib3.PoolManager()
>>> r = http.request('POST', 'http://httpbin.org/post', fields={'hello': 'Xiangbin'})
>>> r.status
200
>>> r.data
...
>>> r.headers
...
3) Handling JSON responses
>>> import urllib3
>>> import json
>>> http = urllib3.PoolManager()
>>> r = http.request('GET', 'http://httpbin.org/ip')
>>> r.data
b'{\n "origin": "10.23.1.37"\n}\n'
>>> json.loads(r.data.decode('utf-8'))
{'origin': '10.23.1.37'}
Note: r.data is a bytes object, so decode it and parse it with json.loads().
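The decode-then-parse step can be wrapped in a small helper. This is a minimal sketch using only the standard library; `parse_json_body` is a hypothetical name, and the sample bytes stand in for the `r.data` value of a real response.

```python
import json


def parse_json_body(data: bytes) -> dict:
    """Decode a UTF-8 response body and parse it as a JSON object."""
    return json.loads(data.decode("utf-8"))


# Stand-in for r.data from a real response (sample bytes, not live output).
raw = b'{\n  "origin": "10.23.1.37"\n}\n'
payload = parse_json_body(raw)
print(payload["origin"])  # prints 10.23.1.37
```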
4) Handling streamed responses
>>> import urllib3
>>> http = urllib3.PoolManager()
>>> r = http.request('GET', 'http://httpbin.org/bytes/1024', preload_content=False)
>>> for chunk in r.stream(32):
...     print(chunk)
...
>>> r.release_conn()
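Note: preload_content=False tells urllib3 not to read the whole body up front, so the response can be consumed as a stream. Call release_conn() afterwards to return the connection to the pool.
Besides stream(), the response data can also be read with read(), for example:
>>> import urllib3
>>> http = urllib3.PoolManager()
>>> r = http.request('GET', 'http://httpbin.org/bytes/1024', preload_content=False)
>>> r.read(4)
b'\x88\x1f\x8b\xe5'
>>> r.release_conn()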
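A common reason to stream is to process a body piece by piece instead of holding it all at once. The sketch below, assuming urllib3 is installed, collects 32-byte chunks from a throwaway local server; `BytesHandler`, `PAYLOAD`, and the URL path are assumptions made so the example runs offline.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import urllib3

PAYLOAD = bytes(range(256)) * 4  # 1024 bytes of sample data


class BytesHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that returns PAYLOAD for every GET."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(PAYLOAD)))
        self.end_headers()
        self.wfile.write(PAYLOAD)

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), BytesHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/bytes/1024" % server.server_address[1]

http = urllib3.PoolManager()
received = bytearray()
r = http.request("GET", url, preload_content=False)
try:
    # Each iteration yields at most 32 bytes of the body.
    for chunk in r.stream(32):
        received.extend(chunk)
finally:
    r.release_conn()  # hand the connection back to the pool
    server.shutdown()
    server.server_close()

print(len(received))
```

Putting release_conn() in a finally clause returns the connection to the pool even if processing a chunk raises an exception.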
5) Requests with parameters
>>> r = http.request('GET', 'http://httpbin.org/headers', fields={'hello': 'Xiangbin'}, headers={'X-Something': 'value'})
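For POST and PUT requests, parameters that belong in the URL query string must be URL-encoded by hand before they are appended to the URL, for example:
>>> from urllib.parse import urlencode
>>> encoded_args = urlencode({'arg': 'value'})
>>> url = 'http://httpbin.org/post?' + encoded_args
>>> r = http.request('POST', url)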
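For form data it is usually more convenient to pass the fields argument: urllib3 encodes the values automatically and, for POST and PUT, sends them in the request body, for example: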
>>> r = http.request('POST', 'http://httpbin.org/post', fields={'hello': 'Xiangbin'})
With the json module, request data can also be sent as a JSON body, for example:
>>> import json
>>> data = {'Hello': 'Xiangbin'}
>>> encoded_data = json.dumps(data).encode('utf-8')
>>> r = http.request('POST', 'http://httpbin.org/post', body=encoded_data, headers={'Content-Type': 'application/json'})
>>> json.loads(r.data.decode('utf-8'))['json']
{'Hello': 'Xiangbin'}
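To make the manual encoding step for POST and PUT query strings concrete, here is a minimal standard-library sketch. The second argument (`q`) is an assumption added to show how spaces and reserved characters are escaped.

```python
from urllib.parse import urlencode

# 'q' is a made-up parameter containing a space and an ampersand.
args = {"arg": "value", "q": "a b&c"}
encoded_args = urlencode(args)

url = "http://httpbin.org/post?" + encoded_args
print(url)  # prints http://httpbin.org/post?arg=value&q=a+b%26c
```

Without the encoding step, the ampersand inside the value would be read as a separator between two parameters.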
6) Uploading files
Text files
>>> with open('example.txt') as fp:
...     file_data = fp.read()
>>> r = http.request(
...     'POST',
...     'http://httpbin.org/post',
...     fields={
...         'filefield': ('example.txt', file_data, 'text/plain'),
...     })
>>> json.loads(r.data.decode('utf-8'))['files']
{'filefield': '...'}
Note: the file is sent in the request body, so the upload needs a method that carries a body; POST is used here.
Binary files
>>> with open('example.jpg', 'rb') as fp:
...     binary_data = fp.read()
>>> r = http.request(
...     'POST',
...     'http://httpbin.org/post',
...     body=binary_data,
...     headers={'Content-Type': 'image/jpeg'})
>>> json.loads(r.data.decode('utf-8'))['data']
'...'
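When a fields value is a (filename, data, MIME type) tuple, the request body is encoded as multipart/form-data. The sketch below calls `urllib3.filepost.encode_multipart_formdata` directly to show roughly what such a body looks like. The file contents and the fixed boundary string are assumptions for illustration; in a real request urllib3 picks a random boundary.

```python
from urllib3.filepost import encode_multipart_formdata

# Same shape as the 'fields' argument in the text-file example:
# field name -> (filename, file contents, MIME type).
fields = {"filefield": ("example.txt", "hello world\n", "text/plain")}

# A fixed boundary keeps the output reproducible for this demo only.
body, content_type = encode_multipart_formdata(fields, boundary="demo-boundary")

print(content_type)
print(body.decode("utf-8"))
```

The returned `content_type` string carries the boundary, which is why the Content-Type header should not be set by hand when uploading through fields.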
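Supplementary material: the Python requests package in detail
requests is a third-party HTTP library for Python that makes HTTP access convenient.
1. Features of requests
Can send HTTP/1.1 requests
No need to build URL query strings by hand for GET, or to form-encode data by hand for POST
Keep-alive and connection pooling for HTTP sessions, implemented on top of urllib3
Supports Python 2.6, 2.7, and 3.3-3.7
2. Installing requests
requests is not part of the Python standard library and must be installed with pip: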
pip install requests
The installation output looks like this:
C:\Sam\works>pip install requests
Collecting requests
  Downloading https://files.pythonhosted.org/packages/51/bd/23c926cd341ea6b7dd0b2a00aba99ae0f828be89d72b2190f27c11d4b7fb/requests-2.22.0-py2.py3-none-any.whl (57kB)
    100% |████████████████████████████████| 61kB 17kB/s
Collecting certifi>=2017.4.17 (from requests)
  Downloading https://files.pythonhosted.org/packages/18/b0/8146a4f8dd402f60744fa380bc73ca47303cccf8b9190fd16a827281eac2/certifi-2019.9.11-py2.py3-none-any.whl (154kB)
    100% |████████████████████████████████| 163kB 18kB/s
Collecting idna<2.9,>=2.5 (from requests)
  Downloading https://files.pythonhosted.org/packages/14/2c/cd551d81dbe15200be1cf41cd03869a46fe7226e7450af7a6545bfc474c9/idna-2.8-py2.py3-none-any.whl (58kB)
    100% |████████████████████████████████| 61kB 10kB/s
Collecting urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 (from requests)
  Downloading https://files.pythonhosted.org/packages/e0/da/55f51ea951e1b7c63a579c09dd7db825bb730ec1fe9c0180fc77bfb31448/urllib3-1.25.6-py2.py3-none-any.whl (125kB)
    100% |████████████████████████████████| 133kB 32kB/s
Collecting chardet<3.1.0,>=3.0.2 (from requests)
  Downloading https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl (133kB)
    100% |████████████████████████████████| 143kB 48kB/s
Installing collected packages: certifi, idna, urllib3, chardet, requests
Successfully installed certifi-2019.9.11 chardet-3.0.4 idna-2.8 requests-2.22.0 urllib3-1.25.6
You are using pip version 19.0.3, however version 19.3.1 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
3. The requests API
1) Main interfaces
import json
import requests

# One convenience function per HTTP method; 'url' is a placeholder.
requests.request('GET', 'url')
requests.head('url')
requests.get('url', params={'key1':'value1', 'key2':'value2'}, headers={'user-agent': '...'}, cookies={'name1':'value2'})
requests.post('url', data={'key':'value'})    # form-encoded body
requests.post('url', json={'key':'value'})    # JSON body
# File uploads: a file object, a (name, file, type, headers) tuple, or a (name, content) tuple
requests.post('url', files={'uploaded_file': open('report.xls', 'rb')})
requests.post('url', files={'uploaded_file': ('report.xls', open('report.xls', 'rb'), 'application/excel', {'Expires': '0'})})
requests.post('url', files={'uploaded_file': ('temp.txt', 'one line\ntwo lines\n')})
requests.put('url', data={'key':'value'})
requests.patch('url', data={'key':'value'})
requests.delete('url')

def getGithub():
    # List repositories using HTTP basic authentication
    github_url = 'https://api.github.com/user/repos'
    myresponse = requests.get(github_url, auth=('champagne', 'myPassword'))
    print(myresponse.json())

def postGithub():
    # Create a repository by posting a JSON-encoded body
    github_url = 'https://api.github.com/user/repos'
    data = json.dumps({'name':'python test', 'description':'a python test repo'})
    myresponse = requests.post(github_url, data, auth=('champagne', 'myPassword'))
    print(myresponse.text)
2) The requests.Session class
import requests
requests.Session()
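A Session keeps settings such as headers and default query parameters across requests. The sketch below, assuming requests is installed, only prepares a request so that the merged result can be inspected without sending anything; the header name, the token value, and the page parameter are hypothetical.

```python
import requests

with requests.Session() as session:
    # Settings stored on the session apply to every request made through it.
    session.headers.update({"X-Demo": "yes"})  # hypothetical header
    session.params = {"token": "abc"}          # hypothetical default parameter

    request = requests.Request(
        "GET", "http://httpbin.org/get", params={"page": "1"}
    )
    # prepare_request() merges session-level and request-level settings.
    prepared = session.prepare_request(request)

print(prepared.url)
print(prepared.headers["X-Demo"])
```

To actually send the prepared request, it can be passed to `session.send()` while the session is still open.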
3) The requests.Request class
import requests
requests.Request('GET', 'http://httpbin.org/get')
4) The requests.PreparedRequest class
import requests

req = requests.Request('GET', 'http://httpbin.org/get')
preq = req.prepare()
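Preparing a Request is a convenient way to see how requests encodes params, data, and json without sending anything. This is a small sketch under the assumption that requests is installed; the keys and values are placeholders.

```python
import json

import requests

# Query parameters are appended to the URL.
get_req = requests.Request(
    "GET", "http://httpbin.org/get", params={"key1": "value1"}
).prepare()

# 'data' given as a dict becomes a form-encoded body.
form_req = requests.Request(
    "POST", "http://httpbin.org/post", data={"key": "value"}
).prepare()

# 'json' becomes a JSON body with a matching Content-Type header.
json_req = requests.Request(
    "POST", "http://httpbin.org/post", json={"key": "value"}
).prepare()

print(get_req.url)  # prints http://httpbin.org/get?key1=value1
print(form_req.headers["Content-Type"], form_req.body)
print(json_req.headers["Content-Type"], json.loads(json_req.body))
```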
5) The requests.Response class
import requests

r = requests.get('https://api.github.com/events')
r.headers['content-type']   # 'application/json;charset=utf8'
r.url
r.status_code               # 200 == requests.codes.ok
r.encoding                  # 'utf-8' by default
r.raw                       # raw content
r.text                      # text content
r.content                   # binary content
r.json()                    # JSON content, recommended
r.cookies['a_key']
Note: json() raises a ValueError if the response body is not valid JSON.
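One way to guard against that exception is a small wrapper. The sketch below assumes requests is installed. `json_or_none` is a hypothetical helper, and the two Response objects are built by hand (by setting the private `_content` attribute, a shortcut suitable only for demos and tests) so that the example runs offline.

```python
import requests


def json_or_none(response):
    """Return the parsed JSON body, or None if the body is not valid JSON."""
    try:
        return response.json()
    except ValueError:
        return None


# Hand-built responses for an offline demo; real code would get these
# from requests.get() or similar. _content is a private attribute.
good = requests.Response()
good.status_code = 200
good._content = b'{"ok": true}'

bad = requests.Response()
bad.status_code = 200
bad._content = b"<html>not json</html>"

print(json_or_none(good))
print(json_or_none(bad))
```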
6) The requests.adapters.BaseAdapter class
7) The requests.adapters.HTTPAdapter class
The HTTP adapter that requests provides, built on urllib3.
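Because HTTPAdapter wraps urllib3, it is the place where pool sizes and retry behaviour are tuned for a Session. The following is a sketch under assumptions: requests and urllib3 are installed, and the numeric values and status codes are illustrative only.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry policy handed down to urllib3 (values chosen for illustration).
retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[502, 503, 504])

adapter = HTTPAdapter(
    pool_connections=10,  # number of per-host pools to cache
    pool_maxsize=10,      # connections kept in each pool
    max_retries=retry,
)

session = requests.Session()
# Use this adapter for every URL starting with these prefixes.
session.mount("http://", adapter)
session.mount("https://", adapter)

print(session.get_adapter("https://api.github.com/events") is adapter)
session.close()
```

Requests made through this session to matching URLs then go through the configured adapter.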