前面了解了urllib的基本用法,但是其中确实有不方便的地方。比如处理网页验证、处理cookies等等,需要写 Opener、Handler 来进行处理。为了更加方便地实现这些操作,在这里就有了更为强大的库requests,有了它,cookies、登录验证、代理设置等等的操作都不是事儿。
import requests
response = requests.get('https://httpbin.org/get')
print(response.text)
运行结果如下:
{
"args": {},
"headers": {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate",
"Connection": "close",
"Host": "httpbin.org",
"User-Agent": "python-requests/2.21.0"
},
"origin": "116.227.107.42",
"url": "https://httpbin.org/get"
}
除此之外requstes还有其他类型的请求。
response = requests.post('http://httpbin.org/post')
response = requests.put('http://httpbin.org/put')
response = requests.delete('http://httpbin.org/delete')
response = requests.head('http://httpbin.org/get')
response = requests.options('http://httpbin.org/get')
请求一个带参数的地址。例如:http://httpbin.org/get?name=chris&age=22。
方法一:直接请求
import requests
response = requests.get('http://httpbin.org/get?name=chris&age=22')
print(response.text)
方法二:利用param参数
import requests
params = {
'name': 'chris',
'age': 22
}
response = requests.get("http://httpbin.org/get", params=params)
print(response.text)
运行结果如下:
{
"args": {
"age": "22",
"name": "chris"
},
"headers": {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate",
"Connection": "close",
"Host": "httpbin.org",
"User-Agent": "python-requests/2.21.0"
},
"origin": "116.227.107.42",
"url": "http://httpbin.org/get?name=chris&age=22"
}
import requests
headers= {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36',
}
response = requests.get("http://httpbin.org/get", headers=headers)
print(response.text)
运行结果如下:
{
"args": {},
"headers": {
"Accept": "*/*",
"Accept-Encoding": "gzip, deflate",
"Connection": "close",
"Host": "httpbin.org",
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"
},
"origin": "116.227.107.42",
"url": "http://httpbin.org/get"
}
import requests
headers = {
'Cookie': 'q_c1=31653b264a074fc9a57816d1ea93ed8b|1474273938000|1474273938000; d_c0="AGDAs254kAqPTr6NW1U3XTLFzKhMPQ6H_nc=|1474273938"; __utmv=51854390.100-1|2=registration_date=20130902=1^3=entry_date=20130902=1;a_t="2.0AACAfbwdAAAXAAAAso0QWAAAgH28HQAAAGDAs254kAoXAAAAYQJVTQ4FCVgA360us8BAklzLYNEHUd6kmHtRQX5a6hiZxKCynnycerLQ3gIkoJLOCQ==";z_c0=Mi4wQUFDQWZid2RBQUFBWU1DemJuaVFDaGNBQUFCaEFsVk5EZ1VKV0FEZnJTNnp3RUNTWE10ZzBRZFIzcVNZZTFGQmZn|1474887858|64b4d4234a21de774c42c837fe0b672fdb5763b0',
'Host': 'www.zhihu.com',
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36',
}
r = requests.get('https://www.zhihu.com', headers=headers)
print(r.text)
import requests
proxies = {
'http': 'http://127.0.0.1:1080',
'https': 'http://127.0.0.1:1080',
}
requests.get('http://httpbin.org/get', proxies=proxies)