cookie保存在浏览器中,很多浏览器限制一个站点最多保存20个cookie
session存在服务器中。
1.带上cookie和session的好处
能够请求到登陆后的页面
2,弊端
一套cookie和session往往对应一个用户,请求太快,请求次数太多,容易被识别为爬虫
不需要cookie的时候尽量不去使用cookie
但是有时为了获取登陆的页面,必须发送带有cookie的请求
requests提供了一个sessiion类,来实现客户端和服务器端的会话保持
使用的方法:
1.实例化一个session对象
2.让session来发送get或post请求
session=requests.session()
response=session.get(url,headers)
例子:
第一种方法:
import requests
def run():
headers = {
'User-Agent': 'ozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36'
}
data = {
'username': 'xxxx',
'password': 'xxxx'
}
session = requests.session()
# 使用session发送请求,cookie保存其中
session.post('https://passport.baidu.com/center', headers=headers, data=data)
# 使用session请求登陆后地址,得到信息返回
r = session.get("https://passport.baidu.com/center", headers=headers)
with open('csdn-1.html', 'w', encoding='utf-8') as f:
f.write(r.content.decode())
if __name__ == '__main__':
run()
第二种方法,直接获取cookie放在headers中
import requests
def run():
headers = {
'User-Agent': 'ozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36',
'Cookie':'BAIDUID=341732C7B4F15BBC29D34AEACEB3504A:FG=1; PSTM=1542017140;'
' BIDUPSID=2AC22745056694211D23B3E44908D13C; HOSUPPORT=1; '
'UBI=fi_PncwhpxZ%7ETaKARr9ykC%7EUxVaXAd4LMQqiLsD0A7cjJoq7PwEMEJbzj-kwTBzbZ0hiDKutNki369rZtk3; '
'USERNAMETYPE=3; SAVEUSERID=a83c8629c010e3fdeb37bc15bbd859; '
'HISTORY=439af17d04a8573ef0a503db42f7f8eb74194a; cflag=15%3A3; '
'pplogid=1978AIi2iDItRkilaVSoqU%2F1X%2FxJALykIUX9p6Uk4D0coM4%3D; '
'STOKEN=59bc2a92ec36899d06dc0ded956639e6d7ab5fa617516a8ea1656eb926d7081a;'
'BDUSS=ENLNTVSVVRqVFI3SDJwQTJJRm5yWXMwQ35QYW9XM29UTUIxWk83ZGNmb2xlQzFjQVFBQUFBJCQAAAAAAAAAAAEAAACvILVXsaHO7W3OosG5AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACXrBVwl6wVcR;'
' PTOKEN=d0c63029731a23a92850fccfb5803c10; Hm_lvt_90056b3f84f90da57dc0f40150f005d5=1543891782;'
' Hm_lpvt_90056b3f84f90da57dc0f40150f005d5=1543892594'
}
r=requests.get('https://passport.baidu.com/center',headers=headers)
print(r.content.decode())
with open('csdn-2.html','w',encoding='utf-8') as f:
f.write(r.content.decode())
if __name__ == '__main__':
run()
第三种方式:将cookies字典化
import requests
def run():
url='https://passport.baidu.com/center'
headers={
'User-Agent': 'ozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.67 Safari/537.36',
}
#将cookie转为字典
cookies='BAIDUID=341732C7B4F15BBC29D34AEACEB3504A:FG=1;PSTM=1542017140;BIDUPSID=2AC22745056694211D23B3E44908D13C;HOSUPPORT=1;UBI=fi_PncwhpxZ%7ETaKARr9ykC%7EUxVaXAd4LMQqiLsD0A7cjJoq7PwEMEJbzj-kwTBzbZ0hiDKutNki369rZtk3;USERNAMETYPE=3; SAVEUSERID=a83c8629c010e3fdeb37bc15bbd859;HISTORY=439af17d04a8573ef0a503db42f7f8eb74194a; cflag=15%3A3;pplogid=1978AIi2iDItRkilaVSoqU%2F1X%2FxJALykIUX9p6Uk4D0coM4%3D;STOKEN=59bc2a92ec36899d06dc0ded956639e6d7ab5fa617516a8ea1656eb926d7081a;BDUSS=ENLNTVSVVRqVFI3SDJwQTJJRm5yWXMwQ35QYW9XM29UTUIxWk83ZGNmb2xlQzFjQVFBQUFBJCQAAAAAAAAAAAEAAACvILVXsaHO7W3OosG5AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACXrBVwl6wVcR;PTOKEN=d0c63029731a23a92850fccfb5803c10; Hm_lvt_90056b3f84f90da57dc0f40150f005d5=1543891782;Hm_lpvt_90056b3f84f90da57dc0f40150f005d5=1543892594'
cookies={cookie.split('=')[0]:cookie.split('=')[1] for cookie in cookies.split(";")}
r=requests.get(url=url,headers=headers,cookies=cookies)
with open ('csdn-3.html','w',encoding='utf-8') as f:
f.write(r.content.decode())
if __name__ == '__main__':
run()
1.cookie过期时间很长的网站(学校网站)
2,在cookie过期之前能拿到所有数据
3,配合其他程序, 其他程序获取cookie
1.实例化session,使用session发送post请求,然后在使用session.get获取登陆后的信息
2,将cookie信息加入到headers中
3,将cookies字典化,然后调用