Install from the command line (cmd): pip install requests
Install from PyCharm: File > Settings > Project > Project Interpreter
import requests
# Add headers and query parameters
headers = {
'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36'
}
params = {
'wd':'灌篮高手'}
# params accepts a dict or a string of query parameters; a dict is URL-encoded automatically, so urlencode() is not needed
response = requests.get("https://www.baidu.com/s", headers=headers, params=params)
print(response)
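To see what the params dictionary turns into without sending anything over the network, the request can be prepared locally. This is only a minimal sketch using requests.Request and its prepare() method; the variable names prepared and query are illustrative and not part of the original example.

```python
import requests
from urllib.parse import urlsplit, parse_qs

params = {'wd': '灌篮高手'}

# Build the request without sending it, so the final URL can be inspected.
prepared = requests.Request('GET', 'https://www.baidu.com/s', params=params).prepare()

# The non-ASCII value is percent-encoded in the query string.
print(prepared.url)

# Decoding the query string gives back the original value.
query = parse_qs(urlsplit(prepared.url).query)
print(query['wd'][0])
```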
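Attributes of response:
response.text # the body decoded to str (Unicode text)
response.content # the body as raw bytes
response.url # the full URL of the request
response.encoding # the encoding requests uses for response.text, guessed from the response headers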
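1. response.content: the data exactly as it was fetched from the network, with no decoding applied, so its type is bytes. Strings stored on disk or sent over the network are always bytes.
2. response.text: its type is str. It is the string requests produces by decoding response.content. Decoding needs an encoding, and requests guesses which one to use, so the guess is sometimes wrong and the text comes out garbled. In that case decode manually, for example with response.content.decode('utf-8').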
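The difference between the two attributes can be reproduced with plain bytes, without any network access. In this sketch the variable content only stands in for response.content, and ISO-8859-1 is an assumed example of a wrong guess. With a real response, setting response.encoding = 'utf-8' before reading response.text should have the same effect as the manual decode.

```python
# Stand-in for response.content: UTF-8 bytes as they would arrive from the network.
content = '灌篮高手'.encode('utf-8')

# A wrong guess (here ISO-8859-1) decodes without an error but yields garbled text.
garbled = content.decode('iso-8859-1')

# Decoding with the correct encoding recovers the original string.
text = content.decode('utf-8')

print(type(content).__name__, type(text).__name__)
print(garbled == text)
print(text)
```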
response = requests.post(url,data=data)
For example:
import requests
url = "…………"
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36'
}
data = {
'username':'…………',
'password':'…………'
}
resp = requests.post(url,headers=headers,data=data)
print(resp.text)
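What the data dictionary becomes on the wire can be checked by preparing the POST request locally instead of sending it. This is a sketch only: the URL https://example.com/login and the credentials are hypothetical placeholders, not the values elided in the example above.

```python
import requests

# Hypothetical endpoint and credentials, for illustration only.
url = 'https://example.com/login'
data = {'username': 'alice', 'password': 'secret'}

# Prepare the request without sending it.
prepared = requests.Request('POST', url, data=data).prepare()

# A dict passed as data is form-encoded, and the Content-Type header is set to match.
print(prepared.headers['Content-Type'])
print(prepared.body)
```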
To use a proxy, simply pass the proxies parameter to the request method (for example get or post).
For example:
import requests
url = 'http://httpbin.org/ip'
proxy = {
'http':'http://111.222.141.127:8118'
}
resp = requests.get(url,proxies=proxy)
print(resp.text)
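requests picks the proxy by the scheme of the target URL, so a dict with only an 'http' key does not apply to https:// URLs. The sketch below builds a dict that covers both schemes and adds a timeout, since public proxies are often unreachable. The helper names build_proxies and fetch_via_proxy are hypothetical, and the proxy address is the one from the example above, which may no longer work.

```python
import requests

def build_proxies(host, port):
    """Return a proxies dict that routes both http and https URLs through one HTTP proxy."""
    proxy_url = f'http://{host}:{port}'
    return {'http': proxy_url, 'https': proxy_url}

def fetch_via_proxy(url, proxies, timeout=5):
    """Fetch url through the proxy; return the body text, or None if the request fails."""
    try:
        resp = requests.get(url, proxies=proxies, timeout=timeout)
        resp.raise_for_status()
        return resp.text
    except requests.exceptions.RequestException as exc:
        print(f'request through proxy failed: {exc}')
        return None

proxies = build_proxies('111.222.141.127', 8118)
print(proxies)
```

Calling fetch_via_proxy('http://httpbin.org/ip', proxies) would then attempt the same request as the example above.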
Practice exercise: print the cookies as a dictionary
import requests
url = 'https://www.baidu.com'
resp = requests.get(url)
print(resp.cookies)
print(resp.cookies.get_dict())
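resp.cookies is a RequestsCookieJar, and the conversion to a dictionary can be tried offline by filling a jar by hand. In this sketch the cookie names, values, and the example.com domain are made up for illustration.

```python
import requests

# Build a cookie jar by hand; the names and values are made up.
jar = requests.cookies.RequestsCookieJar()
jar.set('session_id', 'abc123', domain='example.com', path='/')
jar.set('theme', 'dark', domain='example.com', path='/')

# get_dict() flattens the jar into a plain name -> value mapping.
as_dict = jar.get_dict()
print(as_dict)

# requests.utils.dict_from_cookiejar builds the same mapping from a jar.
print(requests.utils.dict_from_cookiejar(jar) == as_dict)
```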
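Simulated login, method 1:
Write the cookie value directly into the request headers. (This is not a good approach: a cookie has an expiry time, and the value stops working once it expires.)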
import requests
url = 'https://www.zhihu.com/hot'
headers = {
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36',
'cookie': '_zap=edce6b48-6598-4893-9e03-470792a7838e; _xsrf=QR0HluAy0lofCiF2ptG2yiYb3rkADfWR; _ga=GA1.2.1819588737.1583063491; _gid=GA1.2.532962712.1583063491; d_c0="ALCV4aKP5RCPTitRs3IbbRpte6whRWln9sM=|1583063490"; q_c1=799337cb0c7d4f90a113054b9fe08b44|1583063515000|1583063515000; r_cap_id="NTA2ODAyM2ZmMTY4NDUxYmExNWM5MGIwYTQ0ZWE2N2Q=|1583063515|90c6cbe90ca0db357458af88093804ac914436c2"; cap_id="YzVhODViZWIyNWY3NGVhN2JmYjAwNTlkNzBjMDYzYjY=|1583063515|ce318186ed42bb714032a011cf6f6f4e6d77fb93"; l_cap_id="OTI1ZmRjYWYxNmVmNDgxMzhkYzJmZjk1NzlhNWNiYmE=|1583063515|e0628f9370f4ca14e437a593a99bfed43a862e96"; __utma=51854390.1819588737.1583063491.1583063517.1583063517.1; __utmz=51854390.1583063517.1.1.utmcsr=zhihu.com|utmccn=(referral)|utmcmd=referral|utmcct=/signin; __utmv=51854390.100-1|2=registration_date=20150228=1^3=entry_date=20150228=1; tst=h; tshl=; Hm_lvt_98beee57fd2ef70ccdd5ca52b9740c49=1583063755,1583066426,1583068822,1583131638; _gat_gtag_UA_149949619_1=1; capsion_ticket="2|1:0|10:1583131654|14:capsion_ticket|44:NzM1Mjk3NGQwMzJjNDkzODhhNWI2NDM2YTg5M2E5MzI=|54842c56ebefecdd9fb21e4d3b0d4ad0a600e66cfc6b58fa350e41e06e9b3897"; z_c0="2|1:0|10:1583131665|4:z_c0|92:Mi4xbHVLS0dRQUFBQUFBc0pYaG9vX2xFQ1lBQUFCZ0FsVk5FZnBKWHdDT19lVDV0bU5CdnI4eHJ1WkxzeFg1VkZoN0lR|5e1683d2604620fd7e5e1c8412cb55b35e1a6203d802d35c66fa4db4a5f5309d"; Hm_lpvt_98beee57fd2ef70ccdd5ca52b9740c49=1583131669; KLBRSID=fb3eda1aa35a9ed9f88f346a7a3ebe83|1583131669|1583131635'
}
resp = requests.get(url,headers= headers)
print(resp.text)
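As an alternative to pasting the whole string into the headers, a cookie string copied from the browser can be split into a dictionary and passed through the cookies parameter. This is a rough sketch: parse_cookie_string is a hypothetical helper, and the short cookie string is made up in the same format as the header above. The request is only prepared locally, not sent.

```python
import requests

def parse_cookie_string(raw):
    """Split a 'name=value; name2=value2' string into a dict."""
    cookies = {}
    for pair in raw.split(';'):
        # Split on the first '=' only, because values may themselves contain '='.
        name, sep, value = pair.strip().partition('=')
        if sep:  # skip fragments that have no '='
            cookies[name] = value
    return cookies

# Shortened, made-up cookie string.
raw = 'tst=h; token=abc=123'
cookies = parse_cookie_string(raw)
print(cookies)

# Prepared locally (nothing is sent) to show the Cookie header requests would build.
prepared = requests.Request('GET', 'https://www.zhihu.com/hot', cookies=cookies).prepare()
print(prepared.headers.get('Cookie'))
```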
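Simulated login, method 2:
Use a session to post the headers and login data, then reuse the same session to request the pages that require login.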
import requests
headers = {
"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36"}
post_url = "https://i.meishi.cc/login.php"
post_data = {
"username":"…………",
"password":"…………"
}
url = "https://i.meishi.cc/jifen/mingxi.php?session_id=09f5c57ef7872b9cffeb87d516c53cb2"
session = requests.Session()
session.post(post_url,headers=headers,data=post_data)
resp = session.get(url)
print(resp.text)
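The reason the second request is treated as logged in is that the session keeps the cookies it receives and attaches them to later requests. The sketch below shows this offline by setting a cookie by hand where a real login response would set it. The cookie name and value, the shortened User-Agent, and the example.com URL are all made up. It also sets the headers on the session itself, so they apply to every request and not only to the login post.

```python
import requests

session = requests.Session()

# Headers set on the session are sent with every request made through it.
session.headers.update({'User-Agent': 'Mozilla/5.0 (example)'})

# Stand-in for the cookie a successful login would store; the name and value are made up.
session.cookies.set('session_id', 'abc123')

# Prepare (but do not send) a follow-up request to see what the session attaches.
request = requests.Request('GET', 'https://example.com/profile')
prepared = session.prepare_request(request)

print(prepared.headers['User-Agent'])
print(prepared.headers['Cookie'])
```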
import requests
url = "…………"
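# verify=False skips SSL certificate verification (for sites whose certificate is untrusted or self-signed)
resp = requests.get(url,verify=False)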
print(resp.text)