Keeping a session with requests: finding Set-Cookie to get the data

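Today I ran into a rather annoying problem. While scraping a website, the cookie returned by login was different from the cookie sent with the data request I captured. The captured one had an extra parameter, and I spent a long time without finding where it came from.

Site: https://i.keking.cn/user_index.html

The cookie returned by login looks like this: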

acw_tc=2f624a7115548746919093682e53ca410b002b05e6d61724dbcfaaa50d7b58; UM_distinctid=16a05ca88f1231-051276fcfa61fb-7a1437-100200-16a05ca88f214a; companyName=%E6%B7%B1%E5%9C%B3%E9%AA%90%E7%BF%94%E7%89%A9%E6%B5%81%E5%86%87%E9%99%90%E8%B4%A3%E4%BB%BB%E5%85%AC%E5%8F%B8; token=eyJlbmNyeXB0ZWREYXRhIjoiWHY2YU1JZjZPTEhqT3pOeFJmZzBERlNCbVVWUCtuUTFjc3BSd0E5bGtWNXUyWnhrV2ZJdFUzeGxZU3F2Y0Z1Z2NWR1BIQlNVVTY2WkRqc1lRUnhENUI1dnZDUUU0MmdQN1hjM2pwNUNXSk9rWTNQQ1JsTjRGclVqZ3g4K1VYTDN0MW13KzMwLzkySERkNFBqalVDc1lwejJpcGg4MlZHMElGcHQyM05OQ1JJPSIsIndyYXBwZWRLZXkiOiJtRTIvcTlRb2RmUUxNRi85UEFIc3NsNVJoNEJ3aE95Y3RXUkVhYVhTU3VldW1ZZTlWTk5TZk80cDBSS1FPLzNaQi9PbVBQRnNONHNGWFNlZms1SmFkMkxZSmkyNVphdWRXOWVJYlhyNElTbWdScWtDZVdDcHZmdzJiTzJCMHc3MldFZFk3TkF2YWFMOE0xOXJxTFI3VlRwVVpUVVVyc0FuR0JCam9ZZ3Y1Q0k9In0=; 

The cookie on the captured data request looks like this:

Cookie: acw_tc=2f624a7115548746919093682e53ca410b002b05e6d61724dbcfaaa50d7b58; UM_distinctid=16a05ca88f1231-051276fcfa61fb-7a1437-100200-16a05ca88f214a; companyName=%E6%B7%B1%E5%9C%B3%E9%AA%90%E7%BF%94%E7%89%A9%E6%B5%81%E5%86%87%E9%99%90%E8%B4%A3%E4%BB%BB%E5%85%AC%E5%8F%B8; token=eyJlbmNyeXB0ZWREYXRhIjoiWHY2YU1JZjZPTEhqT3pOeFJmZzBERlNCbVVWUCtuUTFjc3BSd0E5bGtWNXUyWnhrV2ZJdFUzeGxZU3F2Y0Z1Z2NWR1BIQlNVVTY2WkRqc1lRUnhENUI1dnZDUUU0MmdQN1hjM2pwNUNXSk9rWTNQQ1JsTjRGclVqZ3g4K1VYTDN0MW13KzMwLzkySERkNFBqalVDc1lwejJpcGg4MlZHMElGcHQyM05OQ1JJPSIsIndyYXBwZWRLZXkiOiJtRTIvcTlRb2RmUUxNRi85UEFIc3NsNVJoNEJ3aE95Y3RXUkVhYVhTU3VldW1ZZTlWTk5TZk80cDBSS1FPLzNaQi9PbVBQRnNONHNGWFNlZms1SmFkMkxZSmkyNVphdWRXOWVJYlhyNElTbWdScWtDZVdDcHZmdzJiTzJCMHc3MldFZFk3TkF2YWFMOE0xOXJxTFI3VlRwVVpUVVVyc0FuR0JCam9ZZ3Y1Q0k9In0=; tmsToken="eyJlbmNyeXB0ZWREYXRhIjoid3lYNWRhc0dEeGMxTEpYdEFRMlNWYlAxVFVFNldySlI2L1pLdkJ5Zjc4Yms3RCs2cEMyWTF3UHdqK1E1L2YrMXpyVDQxdGRNbXBVU0R3dmR5TjV1VkZzaXRXbkhkM0hERVhBT1RFa3BIaU1QOW16bDRJeGIwUk94N1h2WnB6WDZJMDFHTDhUN0IwbFNHNU9lM2h4YVNxaUd1UVlxZjVySEI1Z3p4RTBQOHVpcGZidk1uMzJYY0E5dUsxenZwdHUvRFhNeSsrWlcvMU5ZUEtxblJwRHpmc251TkNmK0pCVzVwdlppd1FRVGdBL3hnY1dJei9GUkNJUVZuL0R5bCtZSE5zTlpBb2VuVzhFZjBMVVhYODBnUzdTWWRRU3FIN2xubjBDR2I4dlJ2YVRTdElyNGsvajVRaGpPR2dLSVlqVGdkNWU3bnNyYms3S3dINmlJdVN6UnlKZ3YxV3Bqemp6bzJTVXV3S3NVaUhVM0l5M3BHU0FoS0g0VWk5eWJoTk1XaVlqWUJqRWxadmJCYkJxM0IwQ3hGTzUwNCtxRjFOTjNSWFdMUUk4Ni9NVmt3c1UrcnpJQlI0ZFFwZ1BxZXNkMG5mRmpFamFDekxJSk9reU5YZEZBaC91aS9tYXRFUk50NFFGQnQzNnNmTTlSSjFFeE1nSXdia3V4dUNvdngrMnNRd21KbW9LM0F2RE41bHA3a0FraGZSVGg0bnNSMklBMmR6TzU1T0JTbHM3T2k3c01ZRit5SS9aMXBCVTg0bm1LT21oL083U0tBc1RJSUVVSDlieHN2R296Q0xHaEJsUjlWcWRzUHJLQ2JsUkRmK2ViS0xzV3NwTVlON3hyOFlhWVVCWmtiSUlpaFM5aEpYYTh3cHJwVWxFUWpHQXlKWndCUTZ0bTFwNGI5d1plVkxrYitGUzlPVVhXUjA0emVYb0k0OXNxU1oxWVE1L0NVditYU00wV3VCZHlzTHBqV09IazhrekRYd05ac2pTZEk3OUNsaHZFNUpJTDh5WkhTMEtOdnEyWUI2T3BvQkFHbENiQ3JJandIeUpJdHhFWXdLV1Z3MzNPUTkyMGY3TUU3Z3JEdWsxeTBKZHkrTk5FWG1PcFZyUTRQbnVaZ1BUd3IxRVVNWmZiYWtGSnNnais4c1E1R0NpSTRabHNtbDE0MVBPcTl
1cmJCWG9RNlVLaFJXenZUZ3dnR2VZQk1NZGNpcXU4UFlFNTRDVGlDUzlONlRpZEwwcmJkRzFXdUIwWUFPZzIwc0E3TzRqeWVtMmFkUkN2dUx4M0tUVzFVRWdZOTFYNWZMTyt6UWxEdFRUZmc0Ri81YlNVRzNIWXRZanhkNTM5Vk9jZTBIalZROWRxQjR5d1dUeWVRTFB5NWdMcUZ5RnF3aWg1VW5QU1JHQWorSEgwaHlDeHFUQW5SN20rRnhRYkI1dHNIZDRoSzJvZDZxVXZLeE5LVmdJNS9HV3E2NGZpSFkya0w4VXY5TE5Yd1p2bS9VZmdQaVJhUU9qV3VEZUdxdm81WEFPV1l6NHZtQ25GeWpaQWd2VFJHYnF2bWlhZ2F5bGVodGF5SURIazQ5SWpSWmphL2ViUExqc2lwTHpGR0lmR2E3NTAwYWhsdlN0MGlyV1UvcnBTOUsydzlTR1M1RkdCd2V3S25mdTM4THBoSGY3bGpmcU5ZUVVvUVdvdzk5b1NzdVdjcWt5NkNjTG5TcXFoTTRMSnJlNDJGV0NDeUV4MDNSd2JHUUVKSCt4ZENqZlc0WHpHUlByVlRuWWkyenNDUmpKYVlGRk92T0R3Nm1naWpoQktUYUNoTUxSME9EV1NNU2d5NFlTd3NCY3o1dDRjS1ljWXE4RCs3dGIyOUVSZW1GRzhvbVF2M285TEFqRjMvbEw3ZHlPeVdlOFQ0RGg2WUFTZWxsTUtyNE1GUmYzYmR6N0Jicm0vSm5WS1ZYcFVGaG5FTlpkUXN3ZFFVZ0xwUm50LzFEaVV3ZWxDWVhMRFNNQWR1cUZkaWdzUFRuNkM1Zy9CWUlBZStMVjgxT05BYUE2SjlVZTVLSTBqdVVydXVYajVqRXBFQlhzNERpN1E5QT09Iiwid3JhcHBlZEtleSI6ImNYcTB5STZkaU1pUWQrL1NJdUxSNXVucUFRT0tLUTJBTVR1QThDaGRaYTV2NjRjWXVJbkZPUGVnZ29OMHgzZ3I0bysyV0RKR2YrY1VIYmlMN0xORDFnPT0ifQ=="
There is an extra tmsToken parameter. Where does it come from?
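
With cookie strings this long, comparing by eye is error-prone. A minimal sketch of the comparison, assuming both cookies were copied as plain `name=value; ...` strings (the function names and the shortened values below are placeholders of mine, not the real tokens):

```python
def parse_cookie_header(header: str) -> dict:
    """Split a Cookie header value into a {name: value} dict."""
    # Accept strings copied together with the "Cookie:" prefix.
    if header.lower().startswith("cookie:"):
        header = header.split(":", 1)[1]
    cookies = {}
    for part in header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep:  # skip empty fragments such as a trailing ';'
            cookies[name] = value.strip('"')
    return cookies


def extra_cookie_names(login_cookie: str, captured_cookie: str) -> set:
    """Names present in the captured request but not returned by login."""
    return set(parse_cookie_header(captured_cookie)) - set(parse_cookie_header(login_cookie))


# Shortened placeholder values, not the real tokens.
login_cookie = "acw_tc=aaa; UM_distinctid=bbb; companyName=ccc; token=ddd=; "
captured_cookie = 'Cookie: acw_tc=aaa; UM_distinctid=bbb; companyName=ccc; token=ddd=; tmsToken="eee=="'

print(extra_cookie_names(login_cookie, captured_cookie))  # {'tmsToken'}
```

Splitting on the first `=` only matters here, because the base64 values themselves end in `=` padding.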

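So the task is to find the endpoint that returns the Set-Cookie header and see how this cookie is produced. At first I assumed it was generated by JavaScript encryption, but after a long search I found no sign of that.

Then I went back to the packet capture and kept analyzing the requests. Happily, the same cookie turned up in one of the captured requests: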

[Image 1: packet-capture screenshot of the request with the same cookie]
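
The search through the capture can also be automated. This is only a sketch, assuming the capture was exported from the browser's developer tools as a HAR file (a JSON document with `log.entries[].request.url` and `log.entries[].response.headers`). The function name and the sample data are hypothetical.

```python
def responses_setting_cookie(har: dict, cookie_name: str) -> list:
    """Return the URLs of HAR entries whose response sets `cookie_name`."""
    prefix = cookie_name + "="
    urls = []
    for entry in har["log"]["entries"]:
        for header in entry["response"]["headers"]:
            if header["name"].lower() != "set-cookie":
                continue
            # Some exporters join several Set-Cookie values with newlines.
            lines = header["value"].splitlines()
            if any(line.strip().startswith(prefix) for line in lines):
                urls.append(entry["request"]["url"])
                break
    return urls


# A tiny hand-written HAR fragment with placeholder URLs and values.
sample_har = {
    "log": {
        "entries": [
            {
                "request": {"url": "https://example.com/login"},
                "response": {"headers": [
                    {"name": "Set-Cookie", "value": "token=aaa; Path=/"},
                ]},
            },
            {
                "request": {"url": "https://example.com/verify"},
                "response": {"headers": [
                    {"name": "Content-Type", "value": "application/json"},
                    {"name": "set-cookie", "value": "tmsToken=bbb; Path=/"},
                ]},
            },
        ]
    }
}

print(responses_setting_cookie(sample_har, "tmsToken"))  # ['https://example.com/verify']
```

For a real capture, the dict would come from `json.load()` on the exported file.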

The URL of this request is:

 /app/approval/verifyOpenState?ignoreLoadingBar=true&userCenterToken=eyJlbmNyeXB0ZWREYXRhIjoiamJXZ0hPZUkrYWtCNW9QWEpCTHpOOHVtZW1oakhKVnRKM09aMjhEQllySlA3RW93OGJuaUtGNloyMnVialBhVE5iV2RsRSthMEx4MytSbDFsWHA1MUUyMDl2dXIwend5RVZFMVdtUEFmYnBQSjU4blFvVGd4VVoyM1hjY0xINXdYb0wyelp3Y3NEblhyRVZDUHV0YlgzUHdqZ2NXT0FoaFlCM3lESnU4eXRJd3JtVnVtaHhtcFZ6M2p4UUVHbVJjUnJ3TDBMaWdKQjA4MG95TE5rYkplNDBOR1dRVlZHS1UvYnpoT3liRlNyV0xTWk9McjlHcUh1c2lscFFCb2YzN3VDbEd2azJUdFAzd0RWZWF1KzQyb1RWdCtYOFlDTk4xMXVEOHZnZm5EVzZiYjkvR0xHTlJEL05NUThKSXlUUXJLVks5STRuenlMV2dBeVd2Q1JGUnFkdWkwMXdPeHd3MjVGMlJJdU5aMkZHQXIyYitXZjVyODZvTEFBQ01jenE1NkhzaWJ2elZ3Z3lrbjk5dEV1SzVkZ09YaWo1bXRkOGhFZ0kwdjE3T0toc295MVJ1dXh2SHVFai9KRlVDZUZqNit3Sk40Q2JZMlhNQzgxclYyMHhMWllPRDZEQ1hnSGh6Zityei9hcHRCYWM9Iiwid3JhcHBlZEtleSI6IlhpcEUrQlhKbGJOdldtRGZDVkRRVEhUTjFBUVMxMHMzT2c0RjlXM05sUEQ3UTh2SXBhRVVkUk1WQ3hqZ1hsWlR5L1RMeUJldUdaL01aSE5YYnlxR1pkUEhFS2RqcENGNE94MW1SNFJQWENlMmFQN3VRT2ppbUFmSE9HaHVPcHF5d2F6UFF1L0N5TWJyL09TcHgyL3JxUGZteUFFeHJlQjJndDBXVEMxbnBMUT0ifQ 

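It contains a userCenterToken parameter. If we can build this URL and request it through a requests Session, which keeps the cookies from each response, that should be all we need.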
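
To illustrate what "keeping the session" means, here is a small offline sketch. It assumes the `requests` package is installed, puts a placeholder cookie into the Session by hand (standing in for a real Set-Cookie response), and only prepares a later request without sending it. The cookie value and domain are my assumptions.

```python
import requests

session = requests.Session()

# Stand-in for what the Session does when a response carries
# "Set-Cookie: tmsToken=...": the cookie ends up in session.cookies.
session.cookies.set("tmsToken", "abc123", domain="tms.api.keking.cn", path="/")

# Prepare (but do not send) a later request to the same host.
request = requests.Request("GET", "https://tms.api.keking.cn/api/tms/pay/listDeparturePay")
prepared = session.prepare_request(request)

# The Session attached the stored cookie on its own.
print(prepared.headers["Cookie"])  # tmsToken=abc123
```

This is why the script never needs to copy tmsToken by hand: as long as every request goes through the same Session object, cookies set by one response are sent with the following requests to a matching host.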

The next goal is to find where the userCenterToken value comes from.

Continuing to trace backwards, I came across this:

[Image 2: packet-capture screenshot of the response containing appToken]

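Comparing the two shows that this appToken has the same value as userCenterToken. Only the parameter name differs, as if the developers renamed it on purpose to throw us off. It is returned by this endpoint: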

https://customer.api.keking.cn/product/getProductOpened/ 
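
The comparison can be sketched as follows. The productAccessUrl format is the one used in the code at the end of this post. Note that the userCenterToken in the captured URL above has no trailing `=` padding, so I compare with the padding stripped. The helper names and the short token are placeholders of mine.

```python
from urllib.parse import urlsplit


def app_token_from_url(product_access_url: str) -> str:
    """Pull appToken out of a URL like http://cloud.keking.cn/#/transfer?appToken=..."""
    # Everything after '#' is the fragment, so the query string lives inside it.
    fragment = urlsplit(product_access_url).fragment
    _, _, query = fragment.partition("?")
    for pair in query.split("&"):
        name, _, value = pair.partition("=")
        if name == "appToken":
            return value
    raise ValueError("appToken not found")


def same_token(a: str, b: str) -> bool:
    """Compare two base64 tokens while ignoring trailing '=' padding."""
    return a.rstrip("=") == b.rstrip("=")


url = "http://cloud.keking.cn/#/transfer?appToken=QUJDRA=="  # placeholder token
app_token = app_token_from_url(url)
print(app_token)                        # QUJDRA==
print(same_token(app_token, "QUJDRA"))  # True
```

I split the pairs by hand instead of using `urllib.parse.parse_qs`, because `parse_qs` turns `+` into a space, and `+` is a valid base64 character.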

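So calling this endpoint should give us userCenterToken. Unfortunately, its request headers contain yet another parameter, token. Now what?

Keep digging. Tracing further back shows that this token is identical to the token in the cookie returned by login.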
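
A hedged sketch of that step: assuming the login request went through a requests Session and the server set a cookie named `token` (the name seen in the login cookie above), the value can be read from the cookie jar and sent as the `token` header. The cookie domain and the value here are placeholders, and the request is only prepared, not sent.

```python
import requests

session = requests.Session()

# Placeholder for the cookie that a real login response would set.
session.cookies.set("token", "abc123", domain="i.keking.cn", path="/")

# Read the login token back out of the Session's cookie jar.
token = session.cookies.get("token")

# Send it as a request header, which is what getProductOpened expects.
request = requests.Request(
    "GET",
    "https://customer.api.keking.cn/product/getProductOpened",
    headers={"token": token},
)
prepared = session.prepare_request(request)
print(prepared.headers["token"])  # abc123
```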

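That completes the chain.

The whole flow is: log in to get the token cookie, send it as the token header to getProductOpened to obtain appToken, pass that value as userCenterToken to the tms.api.keking.cn endpoints so the server sets tmsToken, and then request the data with the same session.

The code is as follows: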

import re
import time

import requests


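class KaiJing:

    def __init__(self, username, password):
        self.username = username
        self.password = password
        # One Session for every request, so cookies set by the server persist.
        self.s = requests.Session()
        self.token = None

    def login(self):
        # The login step is not shown in this post. It has to log in with
        # self.username / self.password and store the value of the `token`
        # cookie in self.token.
        raise NotImplementedError('login() is not shown in this post')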
    def get_product(self):
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36',
            'Referer': 'https://i.keking.cn/user_index.html',
            "token": self.token
        }
        url = 'https://customer.api.keking.cn/product/getProductOpened'
        r = self.s.get(url, headers=headers)
        print(r.text)
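        # Index [1] takes the second match in the response.
        self.productId = re.findall('"productId":"(.*?)"', r.text)[1]
        self.corpId = re.findall('"corpId":"(.*?)"', r.text)[1]
        # print(self.productId)
        # print(self.corpId)
        # appToken inside productAccessUrl is the value later sent as userCenterToken.
        # Raw string, so that \? reaches the regex engine unchanged.
        self.userCenterToken = re.findall(r'"productAccessUrl":"http://cloud.keking.cn/#/transfer\?appToken=(.*?)"', r.text)[0]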

    def get_tms_token(self):

        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36',
            'Referer': 'http://cloud.keking.cn/',
            # 'Cookie': 'token={0}'.format(self.token)
        }
        # The captured URLs carry the token without its trailing base64 '=' padding.
        user_center_token = self.userCenterToken.replace('=', '')
        url1 = 'https://tms.api.keking.cn/app/approval/verifyOpenState?ignoreLoadingBar=true&userCenterToken={0}'.format(user_center_token)
        url2 = 'https://tms.api.keking.cn/app/approval/tokenLogin?ignoreLoadingBar=true&userCenterToken={0}'.format(user_center_token)
        # Both requests go through the shared Session, so any cookie set by
        # these responses is kept for later requests.
        r = self.s.get(url1, headers=headers)
        print(r.text)
        print(r.headers)
        r = self.s.get(url2, headers=headers)
        print(r.text)
        print(r.headers)

    def start_to_pay(self):
        url = 'https://tms.api.keking.cn/api/tms/pay/listDeparturePay?actualPayee=&applyDateFirst=2019-04-01&applyDateLast=2019-04-30&arriveCity=&arriveDistrict=&arriveProvince=&carNo=&carType=&currentPage=1&driverName=&globalCondition=&isCanLoan=&projects=&receiver=&rows=10&searchCondition=&searchContent=&searchMode=global&sendCity=&sendDistrict=&sendProvince=&supplierName='

        headers = {
            'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36',
            'Referer': 'http://cloud.keking.cn/?v={0}'.format(int(time.time() * 1000)),
        }
        res = self.s.get(url, headers=headers)
        print(res.text)


if __name__ == '__main__':
    kj = KaiJing('******', '*****')
    kj.login()
    kj.get_product()
    kj.get_tms_token()
    kj.start_to_pay()
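
The long query string in start_to_pay is easy to get wrong when edited by hand. One possible alternative, sketched here with the standard library, builds it from a dict. The function name is mine, and only the non-empty parameters are kept, on the assumption (untested) that the server treats an omitted parameter the same as an empty one.

```python
from urllib.parse import urlencode

BASE_URL = "https://tms.api.keking.cn/api/tms/pay/listDeparturePay"


def build_list_url(date_first: str, date_last: str, page: int = 1, rows: int = 10) -> str:
    """Build the listDeparturePay URL for one page of results."""
    params = {
        "applyDateFirst": date_first,
        "applyDateLast": date_last,
        "currentPage": page,
        "rows": rows,
        "searchMode": "global",
    }
    return BASE_URL + "?" + urlencode(params)


print(build_list_url("2019-04-01", "2019-04-30"))
```

With requests, the same dict can instead be passed as the `params=` argument of `session.get()`, which encodes it in the same way.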

 

 

 
