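This write-up is mainly based on the article at http://cuiqingcai.com/1076.html.
1. Capture the POST content sent when logging in to Taobao
I used to work with networking a lot, but had only ever used Wireshark for packet capture. A friend recommended Charles, which is more intuitive and more convenient than Wireshark.
The POST request captured for login.taobao.com is as follows:
The Accept-Encoding field in the header can be deleted; otherwise the response comes back looking like garbage, most likely because the server then sends a compressed body that urllib does not decompress on its own.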
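If you would rather keep the Accept-Encoding header, the body can be decompressed by hand. Below is a minimal sketch, assuming the server answers with gzip or deflate; the helper name decode_body is made up for illustration, and the sample response is simulated so that it runs without network access.

```python
import gzip
import zlib


def decode_body(raw, content_encoding):
    """Return the decompressed body according to the Content-Encoding value."""
    encoding = (content_encoding or '').lower()
    if encoding == 'gzip':
        return gzip.decompress(raw)
    if encoding == 'deflate':
        # Some servers send raw deflate data without the zlib header
        try:
            return zlib.decompress(raw)
        except zlib.error:
            return zlib.decompress(raw, -zlib.MAX_WBITS)
    # No (or unknown) encoding: return the bytes unchanged
    return raw


# Simulated compressed response, so the sketch runs without network access
sample = gzip.compress('页面正在跳转中'.encode('gbk'))
print(decode_body(sample, 'gzip').decode('gbk'))
```

With urllib, the second argument would come from response.headers.get('Content-Encoding').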
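The ua and password2 (TPL_password_2) values in the POST data only need to be captured once; after that they can be reused as they are.
2. Send requests to Taobao and get the order content
Send a request to the login page using the contents of the captured packet. Taobao's pages are encoded in GBK. The code is below, with my personal information left out.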
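# -*- coding: utf-8 -*-
import http.cookiejar
import urllib.parse
import urllib.request
# Placeholders: fill in the values captured from your own login request
username=''
password2=''
ua=''
# Taobao login URL
loginURL="https://login.taobao.com/member/login.jhtml"
# Request headers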
loginHeaders={
'Host':'login.taobao.com',
'Connection':'keep-alive',
# 'Content-Length' is left out: urllib computes it from the actual request body
'Cache-Control':'max-age=0',
'Origin':'https://login.taobao.com',
'Upgrade-Insecure-Requests':'1',
'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36',
'Content-Type':'application/x-www-form-urlencoded',
'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'Referer':'https://login.taobao.com/member/login.jhtml?style=mini&newMini2=true&from=alimama&redirectURL=http%3A%2F%2Flogin.taobao.com%2Fmember%2Ftaobaoke%2Flogin.htm%3Fis_login%3d1&full_redirect=true&disableQuickLogin=true',
'Accept-Language':'zh-CN,zh;q=0.8,zh-TW;q=0.6',
# 'Cookie' is left out so that HTTPCookieProcessor can attach cookies from the jar itself
}
post={
'TPL_username':username,
'TPL_password':'',
'ncoSig':'',
'ncoSessionid':'',
'ncoToken':'a59d093880ce764862f5b1a22f',
'slideCodeShow':'false',
'useMobile':'false',
'lang':'zh_CN',
'loginsite':'0',
'newlogin':'0',
'TPL_redirect_url':'http%3A%2F%2Flogin.taobao.com%2Fmember%2Ftaobaoke%2Flogin.htm%3Fis_login%3D1',
'from':'alimama',
'fc':'default',
'style':'mini',
'css_style':'',
'keyLogin':'false',
'qrLogin':'true',
'newMini':'false',
'newMini2':'true',
'tid':'',
'loginType':'3',
'minititle':'',
'minipara':'',
'pstrong':'',
'sign':'',
'need_sign':'',
'isIgnore':'',
'full_redirect':'true',
'sub_jump':'',
'popid':'',
'callback':'',
'guf':'',
'not_duplite_str':'',
'need_user_id':'',
'poy':'',
'gvfdcname':'10',
'gvfdcre':'687474703A2F2F7075622E616C696D616D612E636F6D2F',
'from_encoding':'',
'sub':'',
'TPL_password_2':password2,
'loginASR':'1',
'loginASRSuc':'1',
'allp':'',
'oslanguage':'zh-CN',
'sr':'1600*900',
'osVer':'',
'naviVer':'chrome%7C58.0302911',
'osACN':'Mozilla',
'osAV':'5.0+%28Windows+NT+10.0%3B+Win64%3B+x64%29+AppleWebKit%2F537.36+%28KHTML%2C+like+Gecko%29+Chrome%2F58.0.3029.110+Safari%2F537.36',
'osPF':'Win32',
'miserHardInfo':'',
'appkey':'',
'nickLoginLink':'',
'mobileLoginLink':'https%3A%2F%2Flogin.taobao.com%2Fmember%2Flogin.jhtml%3Fstyle%3Dmini%26newMini2%3Dtrue%26from%3Dalimama%26redirectURL%3Dhttp%3A%2F%2Flogin.taobao.com%2Fmember%2Ftaobaoke%2Flogin.htm%3Fis_login%3D1%26full_redirect%3Dtrue%26disableQuickLogin%3Dtrue%26useMobile%3Dtrue',
'showAssistantLink':'',
'um_token':'HV01PAAZ926a9e70037c23f',
'ua':ua,
}
# URL-encode the POST data and convert it to bytes
postdata=urllib.parse.urlencode(post).encode(encoding='gbk')
# Set up the cookie jar
cj=http.cookiejar.CookieJar()
cookie_support = urllib.request.HTTPCookieProcessor(cj)
# Build the opener
opener = urllib.request.build_opener(cookie_support, urllib.request.HTTPHandler)
# Send the login request
request = urllib.request.Request(loginURL, postdata, loginHeaders)
response = opener.open(request)
# Use the logged-in cookies to request the order page
list_items = opener.open('https://buyertrade.taobao.com/trade/itemlist/list_bought_items.htm')
content = list_items.read()
list_items.close()
# Write the result to a txt file
with open('list.txt','wb') as f:
    f.write(content)
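One thing worth checking in the code above: several values in post (for example TPL_redirect_url and osAV) are already percent-encoded as captured, and urllib.parse.urlencode will encode them a second time, turning %3A into %253A. The sketch below shows the effect and one possible fix, which is to decode the captured values before encoding. Whether Taobao's server tolerates the double-encoded form is not verified here.

```python
import urllib.parse

# A value copied from the capture, already percent-encoded
captured = {'TPL_redirect_url': 'http%3A%2F%2Flogin.taobao.com%2Fmember%2Ftaobaoke%2Flogin.htm%3Fis_login%3D1'}

# Encoding it as-is escapes every '%' again ('%3A' becomes '%253A')
double_encoded = urllib.parse.urlencode(captured)

# Decoding the captured values first gives back the original wire format
decoded = {key: urllib.parse.unquote_plus(value) for key, value in captured.items()}
single_encoded = urllib.parse.urlencode(decoded)

print(double_encoded)
print(single_encoded)
```

unquote_plus is used rather than unquote because some captured values, such as osAV, use '+' for spaces.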
The login request returns a page saying 页面正在跳转中 ("page is redirecting"); seeing this probably means the login succeeded.
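That check can be automated. Here is a minimal sketch, assuming the login response is GBK-encoded and that the marker text really does indicate success, which as said is only a guess. The function name looks_logged_in and both sample pages are made up for illustration.

```python
SUCCESS_MARKER = '页面正在跳转中'  # "page is redirecting"; assumed to signal a successful login


def looks_logged_in(body):
    """Decode a GBK response body and look for the redirect marker."""
    # errors='replace' keeps the check from failing on stray bytes
    text = body.decode('gbk', errors='replace')
    return SUCCESS_MARKER in text


# Simulated response bodies, so the sketch runs without network access
ok_page = '<html><body>页面正在跳转中...</body></html>'.encode('gbk')
login_page = '<html><body>请输入密码</body></html>'.encode('gbk')
print(looks_logged_in(ok_page))
print(looks_logged_in(login_page))
```

In the script above it would be called as looks_logged_in(response.read()).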
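To get the order content in an easy-to-read form, you could also run it through a regular expression, but getting the orders wasn't really why I was logging in to Taobao, so I didn't bother.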
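For reference, a regular-expression pass could look roughly like the sketch below. The HTML structure of the order page is not shown in this post, so the sample markup, the class name order-title, and the pattern are all made up for illustration; the real page would need its own pattern.

```python
import re

# Made-up sample markup; the real order page's structure is not known here
sample_html = '''
<a class="order-title" href="#">Mechanical keyboard</a>
<a class="order-title" href="#">USB cable</a>
'''

# Capture the link text of every element with the assumed class name
TITLE_PATTERN = re.compile(r'<a class="order-title"[^>]*>(.*?)</a>', re.S)


def extract_titles(html):
    """Return the stripped text of each matched order title."""
    return [title.strip() for title in TITLE_PATTERN.findall(html)]


print(extract_titles(sample_html))
```

With the script above, the input would be content.decode('gbk') rather than sample_html.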