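tiebanggg is back with an update: the project [12306-tiebanggg-master]. Note: this project is for learning and research only. If it infringes on your company's rights, contact me and I will delete it immediately. Never use it for anything illegal; otherwise you bear the consequences yourself!
Analysis of 12306 shows that the train list can be fetched without logging in, but if we want code to buy a ticket on our behalf instead of doing it by hand, we have to log in to the site. To avoid putting pressure on their servers, this project does not use multithreading. The project buys tickets fully automatically, covering login, captcha recognition, polling for tickets, checking availability, pre-booking, placing the order and email notification. The approach: log in with selenium to obtain the cookies (used later for purchasing), check periodically whether the cookies are still valid, and use requests both to fetch the train list and to run the purchase flow.
The project structure is shown in the figure:
The approach and process are fairly involved, so the write-up is split into [Fetching train data], [selenium login and verification], [Purchasing and placing the order] and [Project wrap-up]. This article covers only the third part, [Purchasing and placing the order].
Third-party packages used: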
① lxml
② requests
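This time's "lucky" target address (Base64-encoded): aHR0cHM6Ly93d3cuMTIzMDYuY24vaW5kZXgvaW5kZXguaHRtbA==
1. Packet capture
① First open the ticket search page and search for tickets, then pick a train and click [Book] while capturing packets.
After clicking [Book], the network panel looks like this:
With the XHR filter selected, three requests are captured, executed from top to bottom: checkUser > submitOrderRequest > getPassengerDTOs. Judging by its name, checkUser most likely verifies our login state. Once we are logged in, this request does not need to be sent again, and the login article (part two) already covered it. submitOrderRequest and getPassengerDTOs, however, both have to be submitted (the author confirmed this by experiment). Without beating around the bush: a POST-based purchase flow is usually validated layer by layer, each step unlocking the next. Let's look at the form data that submitOrderRequest needs:
secretStr looks like something encrypted. Attentive readers will notice that it resembles a field returned with the train information we fetched in part one. The other parameters need little explanation; they are basic information such as the departure date and departure station:
Copy it out and compare. Here we use the record for the 20:50 train:
03Z6dtSr5ZfgGUZR9R3nOVDdAwVdRhTaTfZ1%2FQ5qbxA%2BTG8Wm9jS7UhSR2AFHT8Wrmbx3DLwtgJ2%0AtQHWoFcQ2rhg93EbGiVF2DPqDFSOzf96DX7U7uxitG7FZHce7iLWpqJerbacxAWRQ7BZPDF8mOcf%0Af9QOdt%2FtipR8%2FZB8S89haCQR%2FK6ahIwtagRAC8AquzW8kXmi6Bp4CG%2B8CCY5Z3NsYOLfESCD0Paz%0AGJonq72dLRG%2F%2BMBWj626UAMLxSY6J8xaLeqI%2BCt3G8mPv9tWhSMYJAywb195JqLFLMcGTkDBXeKq%0AyTr3GpSmgnc%3D|预订|78000G534800|G5348|UMW|KQW|UMW|KQW|20:50|22:14|01:24|Y|feCBT%2FrWrwy91MXCgDsiZWf0HVdHNrDU2uYqthfRk4bOY6RP|20210228|3|W2|01|05|1|0|||||||||||有|有|10||O0M090|OM9|1|0||O007750021M0128000219023950010|0|||||1|0#0#0|"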
secretStr: 03Z6dtSr5ZfgGUZR9R3nOVDdAwVdRhTaTfZ1/Q5qbxA+TG8Wm9jS7UhSR2AFHT8Wrmbx3DLwtgJ2
tQHWoFcQ2rhg93EbGiVF2DPqDFSOzf96DX7U7uxitG7FZHce7iLWpqJerbacxAWRQ7BZPDF8mOcf
f9QOdt/tipR8/ZB8S89haCQR/K6ahIwtagRAC8AquzW8kXmi6Bp4CG+8CCY5Z3NsYOLfESCD0Paz
GJonq72dLRG/+MBWj626UAMLxSY6J8xaLeqI+Ct3G8mPv9tWhSMYJAywb195JqLFLMcGTkDBXeKq
yTr3GpSmgnc=
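They are almost identical but not equal: the string returned with the train information contains sequences such as %2F, %2B, %0A and %3D, while secretStr does not. Experience suggests these are URL-encoded characters. Comparing the last few characters of the two strings (Tr3GpSmgnc%3D : Tr3GpSmgnc=) shows that the final '=' has become %3D. Let's put %3D into a decoding tool to check:
Sure enough, %3D is '='. Converting the others gives %2F == /, %2B == + and %0A == a line feed. We take the string and use Python replacements to turn it into the secretStr parameter that submitOrderRequest expects: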
sec = '03Z6dtSr5ZfgGUZR9R3nOVDdAwVdRhTaTfZ1%2FQ5qbxA%2BTG8Wm9jS7UhSR2AFHT8Wrmbx3DLwtgJ2%0AtQHWoFcQ2rhg93EbGiVF2DPqDFSOzf96DX7U7uxitG7FZHce7iLWpqJerbacxAWRQ7BZPDF8mOcf%0Af9QOdt%2FtipR8%2FZB8S89haCQR%2FK6ahIwtagRAC8AquzW8kXmi6Bp4CG%2B8CCY5Z3NsYOLfESCD0Paz%0AGJonq72dLRG%2F%2BMBWj626UAMLxSY6J8xaLeqI%2BCt3G8mPv9tWhSMYJAywb195JqLFLMcGTkDBXeKq%0AyTr3GpSmgnc%3D'
# Undo the URL encoding and drop the encoded line feeds.
secretStr = sec.replace('%2F', '/').replace('%0A', '').replace('%2B', '+').replace('%3D', '=')
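The chained replace calls only cover the four escapes seen in this sample. As a hedged alternative, the standard library's urllib.parse.unquote decodes any percent-escape. The helper name decode_secret_str is hypothetical, and stripping the decoded line feeds is an assumption that simply mirrors the replace('%0A', '') step above.

```python
from urllib.parse import unquote


def decode_secret_str(raw: str) -> str:
    """Percent-decode a secretStr field and drop the embedded line feeds."""
    # unquote turns %2F, %2B and %3D back into '/', '+' and '=',
    # and turns %0A into a real '\n' character.
    decoded = unquote(raw)
    # The manual version removes %0A entirely, so remove the decoded '\n' too.
    return decoded.replace('\n', '')


# Shortened, made-up sample in the same style as the real field.
sample = 'Wrmbx3DLwtgJ2%0AtQHW%2FoFc%2BQ2rhg%3D'
print(decode_secret_str(sample))
```

If another escape ever shows up in the field, this version handles it without adding a new replace call.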
With the parameter in hand, write the submit function:
import json
import random


def submit(self, res, cookies, read_id, item):
    # Submit the booking request for the chosen train; res is the decoded secretStr.
    url = "https://kyfw.12306.cn/otn/leftTicket/submitOrderRequest"
    data = {
        'secretStr': res,
        'train_date': variable.date,
        'back_train_date': variable.date,
        'tour_flag': 'dc',
        'purpose_codes': 'ADULT',
        'query_from_station_name': read_id[item.split('|')[6]],
        'query_to_station_name': read_id[item.split('|')[7]],
    }
    setting.headers['User-Agent'] = random.choice(USER_AGENTS)
    print('[submit]-data:', data)
    response = self.session.post(url=url, data=data, headers=setting.headers, cookies=cookies)
    try:
        messages = json.loads(response.text)['messages']
        print('[submit]-messages:', messages)
        print('[submit]-response.status_code:', response.status_code)
        print('[submit]-response.text:', response.text)
        if len(messages) != 0:
            if '当前时间不可以订票' in messages[0]:  # booking is closed at this time
                return '当前时间不可以订票'
            elif '您还有未处理的订单' in messages[0]:  # an unprocessed order already exists
                return '您还有未处理的订单'
            elif '提交失败,请重试...' in messages[0]:  # submission failed, please retry
                return '提交失败,请重试...'
            else:
                return 'True'
        else:
            return 'True'
    except (ValueError, KeyError) as err:
        # The except branch is missing from the original listing; reporting the
        # error and returning 'False' is an assumption.
        print('[submit]-error:', err)
        return 'False'
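The message handling inside submit can be tried out without touching the network. The following is only a sketch: classify_messages and KNOWN_FAILURES are hypothetical names, and the three message texts are the ones matched in the function above.

```python
from typing import List

# Failure texts copied from the submit function above.
KNOWN_FAILURES = (
    '当前时间不可以订票',   # booking is closed at this time
    '您还有未处理的订单',   # an unprocessed order already exists
    '提交失败,请重试...',  # submission failed, please retry
)


def classify_messages(messages: List[str]) -> str:
    """Return the known failure text found in the first message, else 'True'."""
    if not messages:
        return 'True'
    first = messages[0]
    for failure in KNOWN_FAILURES:
        if failure in first:
            return failure
    return 'True'


print(classify_messages([]))
print(classify_messages(['您还有未处理的订单,请先处理']))
```

Keeping the texts in one tuple means a newly observed failure message only needs one extra entry.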
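Next, look at the response and data parameters of the getPassengerDTOs request:
data parameters: there is only one, REPEAT_SUBMIT_TOKEN. It is not obvious what this token is, but it looks like an MD5 signature. Two ways to track it down: 1. search all responses to see whether the server returns REPEAT_SUBMIT_TOKEN; 2. search the JS to locate where it is generated and reconstruct the signing algorithm.
Here the first approach alone was enough to find REPEAT_SUBMIT_TOKEN. Look at the Headers:
There are no encrypted form parameters; submitting the cookies is enough to get the data back. The initDc function is as follows: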
import random
import re


def initDc(self, cookies):
    """
    Request https://kyfw.12306.cn/otn/confirmPassenger/initDc to obtain
    REPEAT_SUBMIT_TOKEN, leftTicket and ticketInfoForPassengerForm.
    :param cookies: login cookies
    :return: re.Match object for globalRepeatSubmitToken (None if not found)
    """
    url = 'https://kyfw.12306.cn/otn/confirmPassenger/initDc'
    data = {
        '_json_att': ''
    }
    setting.headers['User-Agent'] = random.choice(USER_AGENTS)
    response = self.session.post(url=url, data=data, headers=setting.headers, cookies=cookies)
    # print(response.text)
    # The token is embedded in the returned page as a JS variable.
    globalRepeatSubmitToken = re.search(r"globalRepeatSubmitToken = '(.*?)'", response.text)
    print('[initDc]-globalRepeatSubmitToken:', globalRepeatSubmitToken)
    return globalRepeatSubmitToken
Next, write the confirmPassengergetPassengerDTOs function, which returns the response data as userinfo:
import json
import random


def confirmPassengergetPassengerDTOs(self, REPEAT_SUBMIT_TOKEN, cookies):
    url = 'https://kyfw.12306.cn/otn/confirmPassenger/getPassengerDTOs'
    data = {
        '_json_att': '',
        'REPEAT_SUBMIT_TOKEN': REPEAT_SUBMIT_TOKEN.group(1),  # match object from initDc
    }
    setting.headers['Referer'] = 'https://kyfw.12306.cn/otn/confirmPassenger/initDc'
    setting.headers['User-Agent'] = random.choice(USER_AGENTS)
    response = self.session.post(url=url, data=data, headers=setting.headers, cookies=cookies).text
    userinfo = json.loads(response)  # passenger list of the logged-in account
    print('[confirmPassengergetPassengerDTOs]-userinfo', userinfo)
    return userinfo
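The project's later functions always read the first entry of data.normal_passengers in this response. If an account holds several passengers, choosing by name may be safer. The sketch below assumes only the field names that the article's own code reads (data, normal_passengers, passenger_name); pick_passenger and the sample data are hypothetical.

```python
from typing import Optional


def pick_passenger(userinfo: dict, name: str) -> Optional[dict]:
    """Return the normal_passengers entry whose passenger_name matches, else None."""
    passengers = (userinfo.get('data') or {}).get('normal_passengers') or []
    for passenger in passengers:
        if passenger.get('passenger_name') == name:
            return passenger
    return None


# Made-up response fragment containing only the fields used in this article.
sample = {'data': {'normal_passengers': [
    {'passenger_name': 'Zhang San', 'passenger_id_no': 'ID-1',
     'mobile_no': '100', 'allEncStr': 'enc1'},
    {'passenger_name': 'Li Si', 'passenger_id_no': 'ID-2',
     'mobile_no': '200', 'allEncStr': 'enc2'},
]}}
print(pick_passenger(sample, 'Li Si'))
```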
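OK, the next step is to select a passenger and click submit order while capturing packets:
The dialog shown in the figure pops up, and two requests are captured:
Next, look at their form data: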
checkOrderInfo
getQueueCount
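Both requests need the REPEAT_SUBMIT_TOKEN parameter, which we extracted earlier. Apart from passengerTicketStr, oldPassengerStr and REPEAT_SUBMIT_TOKEN, the other parameters are fixed values. Next we obtain passengerTicketStr; the parameters are explained here:
passengerTicketStr contains the passenger's basic information (from the logged-in account) and the seat type id. Attentive readers will notice that these values resemble the response returned by the getPassengerDTOs request above: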
Needless to say, the encrypted parameter here is allEncStr. oldPassengerStr, in turn, is built from the passenger's name and ID card number.
The seat type ids need little introduction; a few packet captures are enough to collect them. Then define a dictionary in the variable class of setting_class.py:
# Seat type -> seat id
zuoweiID = {
    '商务座': '9',  # business class
    '一等座': 'M',  # first class
    '二等座': 'O',  # second class
    '硬卧': '3',    # hard sleeper
    '硬座': '1',    # hard seat
    '软卧': '4'     # soft sleeper
}
All form parameters are now available, so write the checkOrderInfo function:
import random


def checkOrderInfo(self, userinfo, REPEAT_SUBMIT_TOKEN, cookies):
    """
    :param userinfo: returned by confirmPassengergetPassengerDTOs
    :param REPEAT_SUBMIT_TOKEN: returned by initDc
    :param cookies: login cookies
    :return: [passenger_name, mobile_no, passenger_id_no]
    """
    url = "https://kyfw.12306.cn/otn/confirmPassenger/checkOrderInfo"
    passenger = userinfo['data']['normal_passengers'][0]  # first passenger on the account
    passenger_id_no = passenger['passenger_id_no']  # ID card number
    mobile_no = passenger['mobile_no']  # phone number
    allEncStr = passenger['allEncStr']
    passenger_name = passenger['passenger_name']  # name
    data = {
        'cancel_flag': '2',
        'bed_level_order_num': '000000000000000000000000000000',
        'passengerTicketStr': '{},0,1,{},1,{},{},N,{}'.format(variable.play_zuowei_id, passenger_name,
                                                              passenger_id_no, mobile_no, allEncStr),
        'oldPassengerStr': '{},1,{},1_'.format(passenger_name, passenger_id_no),
        'tour_flag': 'dc',
        'randCode': '',
        'whatsSelect': '1',
        'sessionId': '',
        'sig': '',
        'scene': 'nc_login',
        '_json_att': '',
        'REPEAT_SUBMIT_TOKEN': REPEAT_SUBMIT_TOKEN.group(1),
    }
    setting.headers['User-Agent'] = random.choice(USER_AGENTS)
    print('[checkOrderInfo]-data:', data)
    setting.headers['Referer'] = 'https://kyfw.12306.cn/otn/confirmPassenger/initDc'
    response = self.session.post(url=url, data=data, headers=setting.headers, cookies=cookies).text
    print('[checkOrderInfo]-response.text:', response)
    return [passenger_name, mobile_no, passenger_id_no]
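The two passenger strings are plain comma-joined fields, so they can be built and checked in isolation. This is a minimal sketch using the same format strings as checkOrderInfo; build_passenger_strings and the sample passenger are hypothetical.

```python
from typing import Tuple


def build_passenger_strings(passenger: dict, seat_id: str) -> Tuple[str, str]:
    """Build (passengerTicketStr, oldPassengerStr) for a single passenger."""
    name = passenger['passenger_name']
    id_no = passenger['passenger_id_no']
    # Same layout as in checkOrderInfo: seat id, fixed flags, name, id, phone, allEncStr.
    ticket_str = '{},0,1,{},1,{},{},N,{}'.format(
        seat_id, name, id_no, passenger['mobile_no'], passenger['allEncStr'])
    old_str = '{},1,{},1_'.format(name, id_no)
    return ticket_str, old_str


# Made-up passenger with the fields read from getPassengerDTOs.
demo = {'passenger_name': 'Zhang San', 'passenger_id_no': 'ID-1',
        'mobile_no': '100', 'allEncStr': 'enc1'}
print(build_passenger_strings(demo, 'O'))
```

Because the same two strings are needed again for confirmSingleForQueue, a helper like this would avoid repeating the format strings.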
Now look at the form data of the getQueueCount request:
data = {
    'train_date': 'abbreviated weekday + abbreviated month + day of month + year',
    'train_no': 'taken from the train record string (the one containing secretStr) by splitting it',
    'stationTrainCode': 'taken from the train record string by splitting it',
    'seatType': '3',
    'fromStationTelecode': 'taken from the train record string by splitting it',
    'toStationTelecode': 'taken from the train record string by splitting it',
    'leftTicket': 'returned by initDc',
    'purpose_codes': '00',
    'train_location': 'taken from the train record string by splitting it',
    '_json_att': '',
    'REPEAT_SUBMIT_TOKEN': 'returned by initDc',
}
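Several of these values come from splitting the train record on '|'. The sketch below shows one way to do that. Indices 6 and 7 are the ones the article's own code uses for the station telecodes, indices 2 and 3 are read off the 20:50 sample record shown earlier, and index 15 for train_location is an assumption based on that sample (where it holds W2). parse_train_record is a hypothetical name.

```python
from typing import Dict


def parse_train_record(item: str) -> Dict[str, str]:
    """Split one '|'-separated train record into the fields the order flow needs."""
    fields = item.split('|')
    if len(fields) < 16:
        raise ValueError('train record has too few fields')
    return {
        'secretStr': fields[0],            # still URL-encoded at this point
        'train_no': fields[2],
        'stationTrainCode': fields[3],
        'fromStationTelecode': fields[6],
        'toStationTelecode': fields[7],
        'train_location': fields[15],      # assumption: position taken from the sample
    }


# Shortened, made-up record with the same layout as the sample above.
record = ('abc%3D|预订|78000G534800|G5348|UMW|KQW|UMW|KQW|20:50|22:14|01:24|Y|'
          'tok|20210228|3|W2|01|05')
print(parse_train_record(record))
```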
The abbreviated weekday and month names can be found with a quick Baidu search. Once you have them, build dictionaries in the variable class of setting_class.py:
# English abbreviations for months 1-12 (three letters, as in a browser Date string)
yuefen = {
    '01': 'Jan',
    '02': 'Feb',
    '03': 'Mar',
    '04': 'Apr',
    '05': 'May',
    '06': 'Jun',
    '07': 'Jul',
    '08': 'Aug',
    '09': 'Sep',
    '10': 'Oct',
    '11': 'Nov',
    '12': 'Dec',
}
# English abbreviations for Monday to Sunday
xingqi = {
    '1': 'Mon',
    '2': 'Tue',
    '3': 'Wed',
    '4': 'Thu',
    '5': 'Fri',
    '6': 'Sat',
    '7': 'Sun',
}
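The date string can also be derived directly from the travel date with the standard library, without a separate weekday helper. This sketch assumes the train_date field follows the layout used by the getQueueCount function in this article, with three-letter English abbreviations as produced by a browser's Date string. format_train_date is a hypothetical name, and the lookup tuples avoid any dependence on the system locale.

```python
from datetime import date

WEEKDAYS = ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun')
MONTHS = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')


def format_train_date(day: str) -> str:
    """Turn 'YYYY-MM-DD' into the train_date form value."""
    d = date.fromisoformat(day)
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return '{} {} {:02d} {} 00:00:00 GMT+0800 (中国标准时间)'.format(
        WEEKDAYS[d.weekday()], MONTHS[d.month - 1], d.day, d.year)


print(format_train_date('2021-02-28'))
```

For the sample train above (20210228) this yields a string starting with Sun Feb 28 2021.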
The earlier initDc function only extracted REPEAT_SUBMIT_TOKEN; now we only need to extract leftTicket as well:
import random
import re


def initDc(self, cookies):
    url = 'https://kyfw.12306.cn/otn/confirmPassenger/initDc'
    data = {
        '_json_att': ''
    }
    setting.headers['User-Agent'] = random.choice(USER_AGENTS)
    response = self.session.post(url=url, data=data, headers=setting.headers, cookies=cookies)
    globalRepeatSubmitToken = re.search(r"globalRepeatSubmitToken = '(.*?)'", response.text)
    print('[initDc]-globalRepeatSubmitToken:', globalRepeatSubmitToken)
    leftTicket = re.search(r"'leftTicketStr':'(.*?)'", response.text)
    print('[initDc]-leftTicket:', leftTicket)
    # ... (the line that handles ticketInfoForPassengerForm is truncated in the source)
    return globalRepeatSubmitToken, leftTicket
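initDc hands back re.Match objects, so every caller has to call .group(1), which raises AttributeError when the pattern was not found (for example when the cookies have expired and a different page comes back). A hedged sketch of a safer extraction, using the same two patterns, is shown here. extract_first and the sample page text are hypothetical.

```python
import re
from typing import Optional

TOKEN_RE = re.compile(r"globalRepeatSubmitToken = '(.*?)'")
LEFT_TICKET_RE = re.compile(r"'leftTicketStr':'(.*?)'")


def extract_first(pattern: re.Pattern, html: str) -> Optional[str]:
    """Return the first captured group of pattern in html, or None if absent."""
    match = pattern.search(html)
    return match.group(1) if match else None


# Made-up fragment in the style of the page returned by initDc.
page = ("var globalRepeatSubmitToken = 'abc123';\n"
        "var ticketInfoForPassengerForm={'leftTicketStr':'xyz%2B'};")
print(extract_first(TOKEN_RE, page))
print(extract_first(LEFT_TICKET_RE, page))
print(extract_first(TOKEN_RE, '<html>login required</html>'))
```

Returning plain strings (or None) lets the caller decide early whether to log in again instead of failing deep inside a later request.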
We now know how to obtain every parameter, so build the getQueueCount function:
import random


def getQueueCount(self, REPEAT_SUBMIT_TOKEN, leftTicket, item, cookies, train_no, stationTrainCode, train_location):
    url = "https://kyfw.12306.cn/otn/confirmPassenger/getQueueCount"
    year, month, day = variable.date.split('-')  # variable.date looks like 'YYYY-MM-DD'
    data = {
        'train_date': '{} {} {} {} 00:00:00 GMT+0800 (中国标准时间)'.format(
            variable.xingqi[self.weekdays(variable.date)], variable.yuefen[month], day, year),
        'train_no': train_no,
        'stationTrainCode': stationTrainCode,
        'seatType': '3',
        'fromStationTelecode': item.split('|')[6],
        'toStationTelecode': item.split('|')[7],
        'leftTicket': leftTicket.group(1),
        'purpose_codes': '00',
        'train_location': train_location,
        '_json_att': '',
        'REPEAT_SUBMIT_TOKEN': REPEAT_SUBMIT_TOKEN.group(1),
    }
    setting.headers['User-Agent'] = random.choice(USER_AGENTS)
    print('[getQueueCount]-data:', data)
    response = self.session.post(url=url, data=data, headers=setting.headers, cookies=cookies).text
    print('[getQueueCount]-response.text:', response)
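Next, click confirm and capture packets for the following step:
Four requests are captured, executed from top to bottom. At this point the ticket-grabbing logic is essentially complete and the ticket has been secured; you only need to pay within 30 minutes, either in the 12306 app or on the website.
Look at the form data of the confirmSingleForQueue request:
Only key_check_isChange and encryptedData are discussed here; the other parameters were obtained earlier.
key_check_isChange: extracted by splitting the ticketInfoForPassengerForm string obtained from initDc.
encryptedData: an encrypted string that can be hard-coded. Readers who want to know the exact algorithm can reverse-engineer the JS; to save time it is not covered here.
Now write the confirmSingleForQueue function: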
import random


def confirmSingleForQueue(self, userinfo, key_check_isChange, leftTicketStr, REPEAT_SUBMIT_TOKEN, train_location, cookies):
    """
    :param userinfo: returned by confirmPassengergetPassengerDTOs
    :param key_check_isChange: extracted from the ticketInfoForPassengerForm string obtained by initDc
    :param leftTicketStr: match object returned by initDc
    :param REPEAT_SUBMIT_TOKEN: returned by initDc
    :param train_location: taken from the train record string by splitting it
    :param cookies: login cookies
    :return: '余票不足!' when tickets have run out, otherwise 'True'
    """
    url = "https://kyfw.12306.cn/otn/confirmPassenger/confirmSingleForQueue"
    passenger = userinfo['data']['normal_passengers'][0]
    passenger_id_no = passenger['passenger_id_no']  # ID card number
    mobile_no = passenger['mobile_no']  # phone number
    allEncStr = passenger['allEncStr']
    passenger_name = passenger['passenger_name']  # name
    data = {
        'passengerTicketStr': '{},0,1,{},1,{},{},N,{}'.format(variable.play_zuowei_id, passenger_name,
                                                              passenger_id_no, mobile_no, allEncStr),
        'oldPassengerStr': '{},1,{},1_'.format(passenger_name, passenger_id_no),
        'randCode': '',
        'purpose_codes': '00',
        'key_check_isChange': key_check_isChange.group(1),
        'leftTicketStr': leftTicketStr.group(1),
        'train_location': train_location,
        'choose_seats': '',
        'seatDetailType': '000',
        'is_jy': 'N',
        'encryptedData': '',
        'whatsSelect': '1',
        'roomType': '00',
        'dwAll': 'N',
        '_json_att': '',
        'REPEAT_SUBMIT_TOKEN': REPEAT_SUBMIT_TOKEN.group(1),
    }
    setting.headers['User-Agent'] = random.choice(USER_AGENTS)
    setting.headers['Referer'] = 'https://kyfw.12306.cn/otn/confirmPassenger/initDc'
    response = self.session.post(url=url, data=data, headers=setting.headers, cookies=cookies).text
    print("[confirmSingleForQueue]-response.text:", response)
    if '余票不足' in response:  # not enough tickets left
        return '余票不足!'
    else:
        return 'True'
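The queryOrderWaitTime request can be ignored and is not covered here; interested readers can look into it themselves.
Next, look at the form data of the resultOrderForDcQueue request:
Where does orderSequence_no come from? Start with a global search:
Sure enough, it is returned by the server. Look at the Headers:
All form parameters are available here; we only need to send the request and extract sequence_no (sequence_no == orderSequence_no). Now write the payOrder function: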
import random
import re
import time


def payOrder(self, REPEAT_SUBMIT_TOKEN, cookies):
    """
    Payment must go through this step first; the server records a fingerprint.
    :param REPEAT_SUBMIT_TOKEN: returned by initDc
    :param cookies: login cookies
    :return: sequence_no, payOrder_url
    """
    url = 'https://kyfw.12306.cn/otn//payOrder/init?'
    setting.headers['User-Agent'] = random.choice(USER_AGENTS)
    setting.headers['Referer'] = 'https://kyfw.12306.cn/otn/confirmPassenger/initDc'
    data = {
        'json_att': '',
        'REPEAT_SUBMIT_TOKEN': REPEAT_SUBMIT_TOKEN.group(1),
    }
    params = {
        'random': str(int(time.time() * 1000))  # millisecond timestamp
    }
    response = self.session.post(url=url, data=data, headers=setting.headers, cookies=cookies, params=params).text
    if len(response) < 3000:
        # Only short responses are printed in full.
        print("[payOrder]-response.text", response)
    sequence_no = re.search(r"sequence_no = '(.*?)'", response)
    print("[payOrder]-sequence_no:", sequence_no)
    payOrder_url = url
    return sequence_no, payOrder_url + 'random=' + params['random']
The payOrder function returns two values: sequence_no and payOrder_url + 'random=' + params['random']. The latter is the URL needed for the payment step, obtained here in advance.
Now write the resultOrderForDcQueue function:
import random


def resultOrderForDcQueue(self, sequence_no, REPEAT_SUBMIT_TOKEN, cookies):
    """
    :param sequence_no: returned by payOrder
    :param REPEAT_SUBMIT_TOKEN: returned by initDc
    :param cookies: login cookies
    :return: None
    """
    url = "https://kyfw.12306.cn/otn/confirmPassenger/resultOrderForDcQueue"
    setting.headers['User-Agent'] = random.choice(USER_AGENTS)
    setting.headers['Referer'] = 'https://kyfw.12306.cn/otn/confirmPassenger/initDc'
    data = {
        'orderSequence_no': sequence_no.group(1),
        '_json_att': '',
        'REPEAT_SUBMIT_TOKEN': REPEAT_SUBMIT_TOKEN.group(1),
    }
    response = self.session.post(url=url, data=data, cookies=cookies, headers=setting.headers).text
    print("[resultOrderForDcQueue]-response.text:", response)
We can write two verification functions to check whether the ticket grab succeeded:
import json
import random


def queryMyOrderNoComplete(self, cookies):
    """
    Fetch the details of the order that is waiting for payment.
    :param cookies: login cookies
    :return: dict describing the unpaid order
    """
    url = "https://kyfw.12306.cn/otn/queryOrder/queryMyOrderNoComplete"
    setting.headers['User-Agent'] = random.choice(USER_AGENTS)
    setting.headers['Referer'] = 'https://kyfw.12306.cn/otn/view/train_order.html'
    data = {
        '_json_att': '',
    }
    info_dict = {}
    response = self.session.post(url=url, headers=setting.headers, cookies=cookies, data=data).text
    jsdata = json.loads(response)['data']['orderDBList'][0]
    info_dict['姓名:'] = jsdata['array_passser_name_page'][0]  # passenger name
    info_dict['出发地:'] = jsdata['from_station_name_page'][0]  # departure station
    info_dict['目的地:'] = jsdata['to_station_name_page'][0]  # destination
    info_dict['出发日期:'] = jsdata['start_train_date_page'].split(' ')[0]  # departure date
    info_dict['订单号:'] = jsdata['sequence_no']  # order number
    info_dict['价格:'] = jsdata['ticket_total_price_page']  # price
    info_dict['出发时间:'] = jsdata['start_time_page']  # departure time
    info_dict['到达时间:'] = jsdata['arrive_time_page']  # arrival time
    info_dict['车次:'] = jsdata['train_code_page']  # train number
    tickets = jsdata['tickets'][0]
    a = tickets['coach_name']  # coach
    b = tickets['seat_name']  # seat number
    info_dict['列车信息:'] = a + '车' + b
    info_dict['座位性质:'] = tickets['seat_type_name']  # seat type
    info_dict['支付时间:'] = tickets['reserve_time']  # reservation time
    info_dict['到期时间:'] = tickets['pay_limit_time']  # payment deadline
    return info_dict
import time


def check_play(self, cookies):
    """
    Check whether the ticket grab succeeded by polling the unpaid-order list.
    :param cookies: login cookies
    :return: True on success, False after repeated failures
    """
    i = 0
    while True:
        time.sleep(0.3)
        i += 1
        try:
            d_play = self.queryMyOrderNoComplete(cookies=cookies)
            cc = d_play['车次:']  # raises if no unpaid order exists yet
            return True  # ticket grab succeeded
        except Exception:
            if i > 5:
                return False  # ticket grab failed
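The retry loop in check_play can be separated from the 12306 request, which makes it testable without a network. The sketch below is one possible shape, not the project's own code. poll is a hypothetical helper; the default of six attempts mirrors the i > 5 check above, and the exception types listed are assumptions about what a missing or malformed order list raises.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar('T')


def poll(fetch: Callable[[], T], attempts: int = 6, delay: float = 0.3,
         sleep: Callable[[float], None] = time.sleep) -> Optional[T]:
    """Call fetch up to `attempts` times; return the first result, else None."""
    for _ in range(attempts):
        sleep(delay)
        try:
            return fetch()
        except (KeyError, IndexError, TypeError, ValueError):
            # No unpaid order yet (or an unparsable response): try again.
            continue
    return None


# Stand-in for queryMyOrderNoComplete: fails twice, then reports an order.
calls = []


def flaky_query() -> dict:
    calls.append(1)
    if len(calls) < 3:
        raise KeyError('orderDBList')
    return {'车次:': 'G5348'}


print(poll(flaky_query, sleep=lambda seconds: None))
```

Passing the sleep function as a parameter is a design choice that keeps tests fast; in real use the default time.sleep applies.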
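At this point the ticket-grabbing project is 99% complete. What remains is the email notification, covered in the [Project introduction] article, which I will publish as soon as possible.
Next chapter: [Scraping 12306 train information with Python, grabbing tickets automatically and recognizing captchas automatically (Part 4): final part (email notification)]. Stay tuned.
There is one more chapter I kept forgetting: a scheduled cookie monitoring service. It checks whether the cookies have expired and, if so, automatically restarts the login service to save fresh cookies, so that you do not lose the chance to buy a ticket by having to log in again mid-run. It will be published after the email notification part. Thanks for following.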