Web Scraping Study Notes

Anti-blocking technique 1: pass headers to requests.get
Method 1: define the headers yourself
import requests
# chapter_url is the URL of the page to fetch (defined elsewhere)
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36'}
response = requests.get(url=chapter_url, headers=headers, verify=False)

Method 2: generate a random User-Agent with the user_agent package
import user_agent
headers = {'User-Agent': user_agent.generate_user_agent()}
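
If installing user_agent is not an option, the same idea can be sketched with the standard library alone: keep a small pool of User-Agent strings and pick one at random for each request. The pool contents and the names USER_AGENTS and build_headers are assumptions made for illustration, not part of requests or user_agent.

```python
import random

# Hypothetical pool of User-Agent strings. The first entry is the one from
# Method 1; the others are illustrative and should be replaced with values
# copied from real, current browsers.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:70.0) Gecko/20100101 Firefox/70.0",
]


def build_headers(pool=USER_AGENTS):
    """Return a headers dict whose User-Agent is picked at random from pool."""
    return {"User-Agent": random.choice(pool)}


if __name__ == "__main__":
    # Each call may return a different User-Agent from the pool.
    print(build_headers())
```

The result can be passed straight to requests, for example requests.get(url=chapter_url, headers=build_headers()).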

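Fixing errors raised by requests.get() (typically SSL certificate verification failures):
Method 1: pass the verify=False argument. This skips certificate verification, so use it only for sites you trust. Each request then emits an InsecureRequestWarning (a warning, not an error), which can be silenced with two extra lines after the import: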
import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
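
The disable_warnings call above silences the warning for the whole process. A narrower alternative, sketched here with the standard warnings module, is to ignore the warning only around the unverified call. The helper name suppress_warning is hypothetical, and the requests usage appears only as a comment, so the block runs without any third-party package.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def suppress_warning(category):
    """Ignore warnings of the given category only inside the with-block."""
    with warnings.catch_warnings():
        # The filter is added on entry and removed again on exit.
        warnings.simplefilter("ignore", category)
        yield


# Assumed usage with requests (not executed here):
#
#   import requests
#   from urllib3.exceptions import InsecureRequestWarning
#
#   with suppress_warning(InsecureRequestWarning):
#       response = requests.get(chapter_url, headers=headers, verify=False)
```

Outside the with-block, the warning filters return to their previous state, so unverified requests made elsewhere still warn.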

Method 2: install the following SSL-related packages used by requests
pip install cryptography
pip install pyOpenSSL
pip install certifi
