Python Web Scraping, Lesson 2: Faking the User-Agent Request Header

UA is short for 'User-Agent'. One of the most basic anti-scraping measures a site can take is to inspect the User-Agent of incoming requests: if it looks abnormal (for example, the default identifier sent by an HTTP library rather than a browser), the request is likely from a bot and may be blocked.

To disguise our requests, we build a headers dictionary. If you don't feel like copying a User-Agent string out of your browser, you can use the UserAgent class from the fake_useragent library:

# headers.py
from fake_useragent import UserAgent

class Headers:
    """Builds ready-to-use header dicts with faked User-Agent strings."""
    def __init__(self):
        self.ua = UserAgent()
        self.ie = {'User-Agent': self.ua.ie}
        self.chrome = {'User-Agent': self.ua.chrome}
        self.random = {'User-Agent': self.ua.random}

# main.py
import requests
from headers import Headers

ua = Headers()
url = 'http://httpbin.org/get'

data_1 = requests.get(url)                     # default requests User-Agent
data_2 = requests.get(url, headers=ua.chrome)  # faked Chrome User-Agent

# httpbin echoes the request back as JSON; parse it with .json()
# rather than the unsafe eval(str(...))
data_1 = data_1.json()
data_2 = data_2.json()

print(data_1['headers']['User-Agent'])
print(data_2['headers']['User-Agent'])


# data_1 (default requests User-Agent):
python-requests/2.18.4

# data_2 (faked Chrome User-Agent):
Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko)
Chrome/27.0.1453.93 Safari/537.36
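If fake_useragent is unavailable (it fetches its User-Agent data online on first use, which can fail), you can mimic the same idea with a small pool of strings copied from real browsers. A minimal sketch, where the UA strings and the `random_headers` helper are illustrative assumptions, not part of any library:

```python
import random

# Hypothetical User-Agent strings; in practice, copy current ones from
# your own browser's developer tools (Network tab -> request headers)
UA_POOL = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 '
    '(KHTML, like Gecko) Version/17.0 Safari/605.1.15',
]

def random_headers():
    """Return a headers dict with a randomly chosen User-Agent."""
    return {'User-Agent': random.choice(UA_POOL)}

print(random_headers()['User-Agent'])
```

The resulting dict can be passed to requests.get() via the headers parameter, exactly like ua.chrome in the script above.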

