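Tools and modules that make web scraping feel effortless
The fake-useragent library generates random User-Agent (UA) request headers
Install the UA library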
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple fake-useragent
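Using the UA library
# Import modules
import random
from fake_useragent import UserAgent
# Create an instance
ua = UserAgent()
# Pre-build a list of UAs so that a later lookup failure does not leave the scraper without one
ua_list = []
for i in range(10):
    # ua.random returns one random User-Agent string
    ua_list.append(ua.random)
print(ua_list)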
['Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36 Edg/116.0.1938.81', 'Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/118.0', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.5 Safari/605.1.15', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/116.0', 'Mozilla/5.0 (Windows NT 10.0; rv:109.0) Gecko/20100101 Firefox/118.0', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36', 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0', 'Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/116.0', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Firefox/102.0', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36']
# Pick one User-Agent from the list
print(random.choice(ua_list))
# Example output: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36
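The list above exists to guard against a failed UA lookup, and the same idea can be sketched as a small helper that falls back to a static pool. This is only a minimal sketch: the names FALLBACK_UAS, pick_user_agent and build_headers are hypothetical, and the fallback strings are simply copied from the sample output above.

```python
import random

# Hypothetical fallback pool, copied from the sample output above.
FALLBACK_UAS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/118.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.5 Safari/605.1.15",
]


def pick_user_agent(pool=None):
    """Return a random UA string, using the static pool if fake-useragent fails."""
    try:
        # Imported lazily so the helper still works when the package is missing.
        from fake_useragent import UserAgent
        return UserAgent().random
    except Exception:
        # Any failure (missing package, broken data) falls back to the pool.
        return random.choice(pool or FALLBACK_UAS)


def build_headers(pool=None):
    """Build a headers dict that can be passed to an HTTP client."""
    return {"User-Agent": pick_user_agent(pool)}


headers = build_headers()
print(headers)
```

The resulting dict can be passed as the headers argument of whatever HTTP client the scraper uses.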
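First, find the captured request in the browser's developer tools, right-click it, and choose Copy as cURL (cmd)
Then open the 爬虫工具库 (Scraper Toolbox) site
Paste the copied cURL command into its curl-to-requests converter
It automatically generates the scraper code, including the request headers and cookies
The generated code can be run as-is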
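To make the conversion less of a black box, here is a rough sketch of what a curl-to-requests converter does internally. It is not the site's actual implementation: parse_curl and _parse_cookies are hypothetical names, and the sketch assumes a bash-style cURL command with the URL first, so it does not handle the caret escaping used by the cmd variant.

```python
import shlex


def _parse_cookies(raw):
    """Turn 'a=1; b=2' into {'a': '1', 'b': '2'}."""
    pairs = (part.split("=", 1) for part in raw.split(";") if "=" in part)
    return {key.strip(): value.strip() for key, value in pairs}


def parse_curl(command):
    """Split a bash-style curl command into url, headers and cookies."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "curl":
        raise ValueError("expected a command starting with 'curl'")
    url, headers, cookies = None, {}, {}
    i = 1
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("-H", "--header") and i + 1 < len(tokens):
            name, _, value = tokens[i + 1].partition(":")
            if name.strip().lower() == "cookie":
                cookies.update(_parse_cookies(value))
            else:
                headers[name.strip()] = value.strip()
            i += 2
        elif tok in ("-b", "--cookie") and i + 1 < len(tokens):
            cookies.update(_parse_cookies(tokens[i + 1]))
            i += 2
        elif tok.startswith("-"):
            i += 1  # flags this sketch does not understand are skipped
        else:
            if url is None:
                url = tok  # the first bare token is taken as the URL
            i += 1
    return {"url": url, "headers": headers, "cookies": cookies}


cmd = (
    "curl 'https://example.com/api?page=1' "
    "-H 'User-Agent: Mozilla/5.0' "
    "-H 'Cookie: sid=abc; theme=dark' --compressed"
)
result = parse_curl(cmd)
print(result)
```

With the requests library installed, the parsed pieces map directly onto requests.get(result["url"], headers=result["headers"], cookies=result["cookies"]), which is roughly the shape of the code such a converter generates.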
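Download XPath Helper from 极简插件
After the download finishes, unzip it, turn on developer mode on the browser's extensions page, and drag the unzipped extension onto that page until it loads successfully
It validates your XPath expressions as you write them, and the syntax is the same as what you use in the scraper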
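Once an expression has been checked in XPath Helper, it can be carried over into the scraper. The sketch below is only an illustration under assumptions: the PAGE markup and the extract_links helper are made up, and it uses the standard library's xml.etree.ElementTree, which supports only a limited XPath subset and needs well-formed markup. Real scrapers commonly use lxml for full XPath on messy HTML.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for this demo.
PAGE = """
<html><body>
<ul class="books">
<li class="item"><a href="/book/1">Python Basics</a></li>
<li class="item"><a href="/book/2">Web Scraping</a></li>
</ul>
</body></html>
"""


def extract_links(markup, xpath=".//li[@class='item']/a"):
    """Return (text, href) pairs for every element matched by the XPath."""
    root = ET.fromstring(markup.strip())
    return [(a.text, a.get("href")) for a in root.findall(xpath)]


links = extract_links(PAGE)
print(links)
```

In the browser the equivalent expression would be //li[@class='item']/a; ElementTree needs the leading dot because its paths are evaluated relative to the root element.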