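pixiv is a well-known illustration website. Suppose a crawler has already given us the URL of an image on pixiv: how do we download that image to the local disk from its URL?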
pip install requests
Open the following page:
https://www.pixiv.net/artworks/77926406
Copy the image address:
https://i.pximg.net/img-original/img/2019/11/22/00/00/13/77926406_p0.jpg
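import requests
import urllib3

# Silence the InsecureRequestWarning triggered by verify=False below
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# pixiv's hotlink protection expects a Referer pointing at the main site
headers = {'Referer': 'https://www.pixiv.net/'}
url = 'https://i.pximg.net/img-original/img/2019/11/22/00/00/13/77926406_p0.jpg'

# verify=False skips SSL certificate verification
res = requests.get(url, headers=headers, verify=False)

# Write the raw response bytes to a local file
with open('test.jpg', 'wb') as f:
    f.write(res.content)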
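The script above always writes to test.jpg and saves whatever the server returns, even an error page. As a minimal sketch (not part of the original script), the variant below derives the file name from the URL, checks the HTTP status, and streams the body to disk. The names filename_from_url and download_image, the chunk size, and the timeout are illustrative assumptions.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse

import requests

# Same Referer header as in the script above
HEADERS = {"Referer": "https://www.pixiv.net/"}


def filename_from_url(url):
    """Return the last path component of the URL, e.g. '77926406_p0.jpg'."""
    return PurePosixPath(urlparse(url).path).name


def download_image(url, dest_dir=".", verify=False, timeout=30):
    """Stream the image into dest_dir and return the path that was written.

    verify=False mirrors the article; the timeout and chunk size are
    illustrative values, not taken from the original.
    """
    dest = Path(dest_dir) / filename_from_url(url)
    with requests.get(url, headers=HEADERS, verify=verify,
                      stream=True, timeout=timeout) as res:
        # Raise on 4xx/5xx instead of saving an error page as a .jpg
        res.raise_for_status()
        with open(dest, "wb") as f:
            for chunk in res.iter_content(chunk_size=8192):
                f.write(chunk)
    return dest
```

Calling download_image(url) with the image address copied earlier would then save the file as 77926406_p0.jpg in the current directory.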
Add a Referer to the request headers:
headers = {'Referer': 'https://www.pixiv.net/'}
Disable SSL certificate verification:
verify=False
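When several images are downloaded in a row, both settings can be applied once on a requests.Session instead of being repeated on every call. This is only a sketch of one possible arrangement; the helper name make_pixiv_session is hypothetical and does not appear in the original.

```python
import requests


def make_pixiv_session(verify=False):
    """Return a Session that carries the Referer and verify settings."""
    session = requests.Session()
    # Sent with every request made through this session
    session.headers.update({"Referer": "https://www.pixiv.net/"})
    # Mirrors verify=False from the article; pass True to re-enable checks
    session.verify = verify
    return session
```

A call such as make_pixiv_session().get(url) would then send the same Referer and skip certificate verification in the same way as the single requests.get call.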
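pixiv uses hotlink protection for its images, so the request must carry a Referer header.
The Referer tells the image server that the request comes from the main pixiv site, so it can safely return the data.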
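To make the difference concrete, the sketch below builds the same image request twice without sending it, once with the Referer and once without, and prints the header each one would carry. The helper name prepare_image_request is hypothetical. How the server answers the request without a Referer is not shown in the original; a rejection such as HTTP 403 is an assumption here.

```python
import requests

IMAGE_URL = "https://i.pximg.net/img-original/img/2019/11/22/00/00/13/77926406_p0.jpg"


def prepare_image_request(url, referer=None):
    """Build (but do not send) a GET request, optionally with a Referer."""
    headers = {"Referer": referer} if referer else {}
    return requests.Request("GET", url, headers=headers).prepare()


with_referer = prepare_image_request(IMAGE_URL, "https://www.pixiv.net/")
without_referer = prepare_image_request(IMAGE_URL)

# The first request carries the header that hotlink protection looks for;
# the second does not and is assumed to be refused by the image server.
print(with_referer.headers.get("Referer"))
print(without_referer.headers.get("Referer"))
```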
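pixiv is said to use a private certificate. With verify=True, the download fails with the error below. (A certificate failure would normally surface as requests.exceptions.SSLError; the trace here is a refused connection, so the network environment may also be a factor.)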
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='i.pximg.net', port=443): Max retries exceeded with url: /img-original/img/2019/11/22/00/00/13/77926406_p0.jpg
(Caused by NewConnectionError(': Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))
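To tell a certificate problem apart from a refused connection like the one in the trace above, the exception type can be inspected. The following is a rough sketch, assuming the helper names describe_failure and try_download (both hypothetical); it relies on the fact that requests.exceptions.SSLError is a subclass of requests.exceptions.ConnectionError, so it has to be checked first.

```python
import requests


def describe_failure(exc):
    """Map a requests exception to a short diagnosis."""
    # SSLError subclasses ConnectionError, so it must be tested first
    if isinstance(exc, requests.exceptions.SSLError):
        return "certificate verification failed"
    if isinstance(exc, requests.exceptions.ConnectionError):
        return "could not connect (refused, blocked, or DNS failure)"
    return "other request error"


def try_download(url, headers, verify=True, timeout=10):
    """Return (content, None) on success or (None, diagnosis) on failure."""
    try:
        res = requests.get(url, headers=headers, verify=verify, timeout=timeout)
        res.raise_for_status()
        return res.content, None
    except requests.exceptions.RequestException as exc:
        return None, describe_failure(exc)
```

If the diagnosis is a connection failure rather than a certificate failure, switching to verify=False alone is unlikely to change the outcome.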
Because the image request is made with verify=False, a warning is printed:
InsecureRequestWarning:
Unverified HTTPS request is being made to host 'i.pximg.net'.
Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
To keep the warning from appearing while the program runs, add the following two lines:
import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
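urllib3.disable_warnings turns the warning off for the whole process. If it should only be hidden around the pixiv download, one alternative (a sketch, not what the original script does) is to scope the filter with the standard warnings module. The context manager name insecure_warnings_suppressed is hypothetical.

```python
import contextlib
import warnings

from urllib3.exceptions import InsecureRequestWarning


@contextlib.contextmanager
def insecure_warnings_suppressed():
    """Ignore InsecureRequestWarning only inside the with-block."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", InsecureRequestWarning)
        yield
```

Wrapping the requests.get(..., verify=False) call in "with insecure_warnings_suppressed():" hides the warning for that call, while unverified requests made elsewhere in the program still warn.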