Python crawler: cracking the Tuniu international flight search API

Goal: call Tuniu's data interface to query international flight information.

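Open Tuniu's international flight search page at http://www.tuniu.com/flight_intel/ and take Shanghai to Seoul as an example. After clicking search, the browser jumps to http://www.tuniu.com/flight/intel/sha-sel. A quick comparison shows that in the trailing sha-sel, sha stands for Shanghai and sel stands for Seoul, i.e. origin-destination. With the URL pattern settled, a look at the HTML source shows that it contains no flight information, so the only option is to capture the network traffic and analyze it.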
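
The URL pattern described above can be sketched in a few lines of Python. The helper name build_search_url and the BASE_URL constant are made up for illustration; only the sha-sel pattern itself comes from the page.

```python
# Hypothetical helper: builds the results-page URL from two city codes.
BASE_URL = "http://www.tuniu.com/flight/intel/"


def build_search_url(departure_code, arrival_code):
    """Return the results-page URL for an origin/destination code pair."""
    # The page URL uses lowercase codes joined by a hyphen, e.g. sha-sel.
    return f"{BASE_URL}{departure_code.lower()}-{arrival_code.lower()}"


print(build_search_url("SHA", "SEL"))
```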


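Going through the requests one by one, the POST request to http://www.tuniu.com/tn?r=intlflight/remote/searchFlights returns JS-formatted data (presumably JSON) that contains all of the flight information. This is the one we want.
Looking at this POST request, we can see the form data it submits.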

direct:0
airline:
adultQuantity:1 (number of adults)
childQuantity:0 (number of children)
babyQuantity:0 (number of infants)
cabinClass:2 (cabin class)
segmentList[0][dCityIataCode]:SHA (origin: Shanghai)
segmentList[0][aCityIataCode]:SEL (destination: Seoul)
segmentList[0][departDate]:2017-09-28 (departure date)
segmentList[1][dCityIataCode]:SEL (return origin: Seoul)
segmentList[1][aCityIataCode]:SHA (return destination: Shanghai)
segmentList[1][departDate]:2017-10-02 (return date)
channelCount:0
stepFlag:0
cabinCode:
upFlightNo:
tokenS:B64B609D-8E9A-4A02-8A46-9FACC7C06EA0
token:40081
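
As a rough sketch, the form fields listed above could be assembled into a Python dict like this. The function name build_search_form and its parameter names are hypothetical; the keys and default values are copied from the captured request, and tokenS and token are left empty here because they have to be filled in per request.

```python
def build_search_form(departure, arrival, depart_date, return_date,
                      adults=1, children=0, babies=0, cabin_class=2):
    """Build the round-trip search form using the field names seen in the capture."""
    return {
        "direct": 0,
        "airline": "",
        "adultQuantity": adults,
        "childQuantity": children,
        "babyQuantity": babies,
        "cabinClass": cabin_class,
        # Outbound leg
        "segmentList[0][dCityIataCode]": departure,
        "segmentList[0][aCityIataCode]": arrival,
        "segmentList[0][departDate]": depart_date,
        # Return leg: same cities, swapped
        "segmentList[1][dCityIataCode]": arrival,
        "segmentList[1][aCityIataCode]": departure,
        "segmentList[1][departDate]": return_date,
        "channelCount": 0,
        "stepFlag": 0,
        "cabinCode": "",
        "upFlightNo": "",
        # Placeholders: these two change per page load and are filled in later.
        "tokenS": "",
        "token": "",
    }


form = build_search_form("SHA", "SEL", "2017-09-28", "2017-10-02")
print(len(form), form["segmentList[1][dCityIataCode]"])
```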

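Apart from tokenS and token, all of the unannotated parameters above are fixed, and the annotated ones are just the itinerary and passenger details we entered earlier. The important ones are tokenS and token, so let's analyze those two.
First, looking through the HTML source shows that the value of tokenS is taken directly from data in the page source.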
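
Here is a minimal sketch of pulling that value out of the page source with the standard library. It assumes the value sits in the value attribute of an element whose id is token (the same element the site's own script reads with $('#token').val()); the class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class TokenFinder(HTMLParser):
    """Remember the value attribute of the first element with id="token"."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if self.token is None and attributes.get("id") == "token":
            self.token = attributes.get("value")


def extract_token_s(html):
    """Return the tokenS string from the page HTML, or None if it is absent."""
    finder = TokenFinder()
    finder.feed(html)
    return finder.token


# Made-up page fragment, using the tokenS value from the captured form data.
sample_html = '<input type="hidden" id="token" value="B64B609D-8E9A-4A02-8A46-9FACC7C06EA0">'
print(extract_token_s(sample_html))
```

Using a parser rather than a regular expression means the attribute order inside the tag does not matter.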



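The value of token, however, cannot be found by searching... So it is probably produced by one of the JS files. Opening the loaded JS files and searching for the keyword token shows that search-all.js contains it. Here is that part of the code, formatted: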

searchList: function(data) {
    var me = this;
    if (this.sendCount) {
        this.sendCount--;
        this.paramter = Eye.apply(this.paramter, data || {});
        this.paramter.tokenS = $('#token').val();
        this.paramter.token = typeof tokenSecret !== 'undefined' ? tokenSecret: '';
        if (typeof tokenSecret !== 'undefined') {
            var httpProxy = Eye.apply({
                data: this.paramter,
            },
            _TN_.remote.searchURI);
            console.log('Sending search request: ' + this.paramter.token);
            return this.send(httpProxy);
        } else {
            setTimeout(function() {
                me.paramter.token = typeof tokenSecret !== 'undefined' ? tokenSecret: '';
                var httpProxy = Eye.apply({
                    data: me.paramter,
                },
                _TN_.remote.searchURI);
                console.log('Resending search request: ' + me.paramter.token);
                return me.send(httpProxy);
            },
            3000);
        }
    }
}

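Line 6 shows that tokenS equals the val() of the element with id=token, which confirms our earlier guess about tokenS. Line 7 shows that token = tokenSecret. But searching this JS file for tokenSecret turns up no definition, and searching all of the captured requests doesn't find tokenSecret anywhere else either... well... this is going to be tricky.
Typing tokenSecret into the browser console returns the same five-digit number as in the POST request above. So tokenSecret is definitely a global variable.
Fine... I don't know JavaScript, so let's find it the dumb way: use Chrome's developer tools, set breakpoints, and step through. First I set a breakpoint on the first line of sha-sel and clicked step, and it finished after just a few clicks... Again: after a few steps, execution jumped into in-min.js. Checking the Global scope, tokenSecret was not there yet, which is good because it means it gets set later. So I set a breakpoint on the second line of in-min.js and kept stepping. Many more JS files were opened along the way. After several attempts I found that tokenSecret appears after line 29 of require.min.js has executed, so I set a breakpoint there and kept stepping. Execution jumped to line 1036 of sha-sel, then ran line 1037, and tokenSecret appeared. OK, line 1037 is one big chunk of JavaScript code, so let's pull it out and format it.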

(function() {
    var izh = '',
    Tlr = 11;
    function KwJ(r) {
        var h = 1650957;
        var k = r.length;
        var e = [];
        for (var w = 0; w < k; w++) {
            e[w] = r.charAt(w)
        };
        for (var w = 0; w < k; w++) {
            var x = h * (w + 492) + (h % 22958);
            var l = h * (w + 735) + (h % 21496);
            var t = x % k;
            var u = l % k;
            var z = e[t];
            e[t] = e[u];
            e[u] = z;
            h = (x + l) % 2906955;
        };
        return e.join('')
    };
    var smd = KwJ('kodsfrtaozhrxnctrgucjeumnvqsoictwbpyl').substr(0, Tlr);
    var sxn = '={i.(u-90p+fcab=u[ lag]ifuxb)8f{(wij+ors(p6cor)d s9z.;106r1n)qs,dmh8v;11= ]a(f38rnl2<7n,(ha8=f7sc8cw}j0(o,n0av},).o7wrCh"lr,b +vata(,s9na[lf;0=c"e+lgj2t,;5}i).ra[t8Slforp+ she=li kh=>bhpt==5ec")90=0 r=(oagzuri.r[)anme"=v6;;n,wd;-g(Cw9=; r+e}vtm;nae.1l.+9=i+r+y;)kw8rm =;r=hn+r;n"f1-t;+(ve"kv,y{ut0n)=!+]l]sjr elun,m=ca; i=,nuuyvrrsqe0.vr,;eafdl= r4w;uahr5pfoa(n)3v0s9v,z,;.+g)avxu,r-v.,sr;i[dzAna{)vvo{tl(;xrb2on(u<{r)(labC*l,r.([82"=5e7;t(+17;k7qh;]rfp;p!"sa=.o )a6q;+awp*(l=te);te[i=x-cfelC.di.;+c6;)(+,.(hos}o+]At2wr7.ma;r=); +d;;]fa=e;ohg(ipu0;tivfs=a,flxff)hu;i1);>.6r+a.]vbxhs[ k- ](ghtoD("7s"xQ"c8"]"q8(i.QQ{C,"jm$QQa3QQ]0".,Q&50,QeQ"oQE17n, Ds"l nQ"m2.r+]"*EC0QshQ).2"i8#2FQ,Q6S[";"1CBiCQE3,C"7C"22q6%"!CQ,1y,9y-iQ"Q]",QQ0r""QdFC"a07- NQ+@_"t6nS@"C,66b8QtH79aQr.86v5 9j{"xQ(.#6.,=6n,Q(3,.s3s"Qs..[D8htj|>5201x7"0,=z.3g,8]],"s",C.c3(Q"9"TQ6DQDN\'-FQeQ.Q6y.,6Q,a| "36"6y*#j"2CQQ[Q:xz"8QjoD(t|CM2k\'",9sdn0Qu=9[Q\/8_[$d3",DQ.rsz,zQ2j57f6a"\/.5l43]rd()0nd.)(68.48n]j03uG8Q..CQ,]CmoQQ6viQk]Q&QQFn.(A9tQ6,(g"rQCj4,Q., @q<.DQ}.Qad6"jjEQ.5Q3]Qe"9,G9c49QyQ77"C]3[])Q)9Q8fQ,2 j"i.Q0mQC4.Q3C3)QQC7Q5FQQ\'[|"QQC.61[u,M]Q.,Qf8"a7,].tQvQ,dd-.. 
.)=Q..,"G"x4jQQ5!"4.Ds\'.7Q"|3])9bQ[e7QQECFq3QQ]r"D4F.#.{.(Q:3r]DdQQj6}e.j72tn7QQQ80Q]j},85"Q4..yQ]Q3N%;n!r(7"Q1!E4,!iCC#5Q)[.rQ\'e8S7Q3]0Q%Q3DQ0Q2,j(s,QQQ7Q(9 ".e3QQEQ.ja4x}5]Q,,zm!0)"Q0Cu8QyrQbzQD,8]7xEQ7H0"D]!yS7w4Qcd8zQ-nrQQ7NQ"Q07108[Q",g2.wQC7QCQ(cQ0.Q?CQ"&(_QQQ51).Q,x.vl)."Q3]7C8*[%QE7Q3n#.QQ%QQQ,E14=QQpEQ1(+iQ2Q,qQ{883"+l7Qv 7flF9C>QQD;2C.:;]hQQ0"0tyA9<2=5Q5;7".Inhe.x])5Q.d,6,jq Q.E6\/.Q]5Q]6.c5?3tQum76FQ.$I=D)Q{jy8FC3,oDc5]z3IQe0(*d6QjQ3.w!,Q.(dQ8Q1QBQ>.7.3Q]zC,QQC_Q].dkpT2]C.Q26Q]]"xa"Q4.84["8obG[]),9Qc5j"Q4"2=7Q"84dQnqFz?,r"Q}d+cC)5-2".AFQ0Q;1Q76Q"(tC)=%]0Qs41QQi,C\/QQ1"4t3nv"TAI7[8.Q}@x]Q]9")4,C,Qe].[(&2QQQ314,"5!]DE0MQ\/(y29]]QAQ0];#u]NQzQq,2C][";r.\'Q1.9E]"F,""9Q9MQE5t)p,>)[1Q3m$6|6QDsF.0]MC]0m0).3Q3utQh)h],"QQ!73,Q"._1(Q]Q2Qt|z];QcnQ]]QQ |QN]85;E."Q"3(Q"oEQ!,0.!.hQ!,.(yQg;4][5NC8)6Q3]@.Q2513=.Q.,sQ7r7.QQB6@.}1QED*].Q:dQQj]."Q]])QB2f]]0Q,]"1tQ"n&Q3Q="CQ(46].QA4d]k(]QDy,>aD5yQ2!+.*CQ!. Qv7Qi0:)539]"m."Q5A64M0y7)Q0r08."2Q{83D1.E82"7E]xHQ.QQ,7C0,0_.\'Qw!Q"Qd3y8D}HE.]C Q>71.k*_93}-"5B,Q)Ts1(Q]5((@8)""=8,"z&;Q"6{,xD"C_c]x7Qu,2.[Q!..9]QyxEQ;!Q9"_]hQE5,53ti.{QuzQ-nQC)!E,v4[rQ,334=u7i",,]n7)1r6,|t\'zQQt858s.}.]n79jQyQ7fE\'x"7.w2{!"* @))7F:=Q\'B|QA]]72Q  Q[mQj5i@ Q67 rQc[.dk1]]lj]DS.Ql8QQ7x[;cxx[[]9Cb}n6\',6QQQz93].,8xQj"")#)3QQ QQQ. G5y,Qr29Q1Q,.QC" j!j930",,;eQ]DQ _68.'));
    var ouN = wwd(izh, Hwt);
    ouN();
})()

Whoa, what on earth is this?? The KwJ function I can follow... but what is the rest??
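
Since KwJ is the readable part, here is a line-by-line Python port of it as a sketch, to make clear that it is just a deterministic character shuffle: it swaps pairs of positions chosen by a small arithmetic generator seeded with a constant. The name kwj is mine; the constants are copied from the JS above. All intermediate values are non-negative, so Python's % behaves the same as JavaScript's here.

```python
def kwj(r):
    """Python port of the KwJ function: a fixed, seed-driven shuffle of r."""
    h = 1650957
    k = len(r)
    e = list(r)
    for w in range(k):
        x = h * (w + 492) + (h % 22958)
        l = h * (w + 735) + (h % 21496)
        t = x % k
        u = l % k
        # Swap the two selected positions, exactly as the JS does.
        e[t], e[u] = e[u], e[t]
        h = (x + l) % 2906955
    return "".join(e)


scrambled = "kodsfrtaozhrxnctrgucjeumnvqsoictwbpyl"
# The JS takes the first Tlr = 11 characters of the unscrambled string.
print(kwj(scrambled)[:11])
```

Because the function only swaps characters, its output always contains exactly the same characters as its input, just in a different order.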
Let's put it into the browser and debug it.











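Adding a debugger statement makes it possible to step through the code.
smd = "constructor"
bFt = KwJ[smd]... I was stuck on this for a long time and asked a friend who does front-end work, who said it was some kind of constructor, which I didn't really get... In short: since KwJ is a function, KwJ["constructor"] is the built-in Function constructor, which builds a function from source-code strings.
XCL = bFt('', KwJ(sxn)), which means the string returned by KwJ(sxn) is turned into a function object (the string becomes the function body).
ouN = wwd(izh, Hwt) works the same way. Finally ouN() is executed. So what exactly is the ouN function? Debugging eventually yields the following code, shown here after formatting.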

(function(
/**/
) {
    var _0x37B6 = ["|", "split", "9|10|2|1|7|6|12|4|3|8|5|11|0", "0", "v", "a", "cKt", "l", "u", "beY", "e", "1", "FfZ", "Jrs", "2", "Nsd", "PrO", "MQc", "", "3", "c", "h", "zmE", "r", "C", "o", "d", "Vxx", "A", "BUV", "t", "4", "gsP", "5", "qBd", "k", "nWn", "n", "6", "g", "pIS", "E", "ZtD", "OaS", "m", "Kki", "WNS", "UDw", "ijw", "B", "Vdj", "y", "lIE", "I", "7", "8", "ndN", "EkQ", "NlO", "eYa", "csf", "sHO", "FeI", "qKv", "FnJ", "NvU", "LvJ", "DHJ", "TXR", "rsW", "hBq", "sFO", "uPp", "oAm", "9", "x", "TkV", "Uhd", "wXU", "M", "Gcb", "ynM", "10", "cbX", "DoO", "YMv", "yOx", "Fzs", "Wdt", "11", "cmR", "12"];
    function _0x37CF(_0x3BD0) {
        var _0x387E = {
            "gsP": function _0x3E41(_0x37B6) {
                return _0x37B6()
            },
            "ndN": function _0x39C3(_0x37B6, _0x37CF) {
                return _0x37B6 + _0x37CF
            },
            "EkQ": function _0x39AA(_0x37B6, _0x37CF) {
                return _0x37B6 + _0x37CF
            },
            "NlO": function _0x3991(_0x37B6, _0x37CF) {
                return _0x37B6 + _0x37CF
            },
            "eYa": function _0x3946(_0x37B6, _0x37CF) {
                return _0x37B6 + _0x37CF
            },
            "csf": function _0x3E28(_0x37CF, _0x37B6) {
                return _0x37CF + _0x37B6
            },
            "sHO": function _0x3D60(_0x37CF, _0x37B6) {
                return _0x37CF + _0x37B6
            },
            "FeI": function _0x3A59(_0x37CF, _0x37B6) {
                return _0x37CF + _0x37B6
            },
            "qKv": function _0x3D15(_0x37B6, _0x37CF) {
                return _0x37B6 + _0x37CF
            },
            "FnJ": function _0x3801(_0x37CF, _0x37B6) {
                return _0x37CF + _0x37B6
            },
            "NvU": function _0x3E0F(_0x37B6, _0x37CF) {
                return _0x37B6 + _0x37CF
            },
            "LvJ": function _0x3D47(_0x37CF, _0x37B6) {
                return _0x37CF + _0x37B6
            },
            "DHJ": function _0x3BB7(_0x37CF, _0x37B6) {
                return _0x37CF + _0x37B6
            },
            "TXR": function _0x37CF(_0x37B6, _0x37CF) {
                return _0x37B6 + _0x37CF
            },
            "rsW": function _0x3DC4(_0x37B6, _0x37CF) {
                return _0x37B6 + _0x37CF
            },
            "hBq": function _0x3C34(_0x37B6, _0x37CF) {
                return _0x37B6 + _0x37CF
            },
            "sFO": function _0x392D(_0x37B6) {
                return _0x37B6()
            },
            "uPp": function _0x3C1B(_0x37B6) {
                return _0x37B6()
            },
            "oAm": function _0x38E2(_0x37B6) {
                return _0x37B6()
            },
            "cbX": function _0x38FB(_0x37B6) {
                return _0x37B6()
            },
            "YMv": function _0x3B6C(_0x37B6, _0x37CF) {
                return _0x37B6 < _0x37CF
            },
            "DoO": function _0x3E5A(_0x37B6) {
                return _0x37B6()
            },
            "yOx": function _0x3CFC(_0x37CF, _0x37B6) {
                return _0x37CF * _0x37B6
            },
            "Fzs": function _0x3DF6(_0x37B6, _0x37CF) {
                return _0x37B6 > _0x37CF
            },
            "Wdt": function _0x3833(_0x37CF, _0x37B6) {
                return _0x37CF * _0x37B6
            }
        };
        var _0x3C02 = _0x37B6[2][_0x37B6[1]](_0x37B6[0]),
        _0x3978 = 0x0;
        while ( !! []) {
            switch (_0x3C02[_0x3978++]) {
            case _0x37B6[3]:
                return _0x3AD6;
                continue;
            case _0x37B6[11]:
                var _0x3865 = function() {
                    return _0x3A27[_0x37B6[9]](_0x3A27[_0x37B6[9]](_0x3A27[_0x37B6[6]](_0x3A27[_0x37B6[6]](_0x37B6[4], _0x37B6[5]), _0x37B6[7]), _0x37B6[8]), _0x37B6[10])
                };
                continue;
            case _0x37B6[14]:
                var _0x3C7F = function() {
                    return _0x3A27[_0x37B6[13]](_0x3A27[_0x37B6[12]](_0x37B6[4], _0x37B6[5]), _0x37B6[7])
                };
                continue;
            case _0x37B6[19]:
                var _0x3D92 = function() {
                    var _0x37CF = document[_0x3A27[_0x37B6[16]](_0x39DC)](_0x3A27[_0x37B6[15]](_0x3A0E));
                    return _0x37CF ? _0x37CF[_0x3A27[_0x37B6[17]](_0x3865)] : _0x37B6[18]
                };
                continue;
            case _0x37B6[31]:
                var _0x3A8B = function() {
                    return _0x3A27[_0x37B6[29]](_0x3A27[_0x37B6[29]](_0x3A27[_0x37B6[27]](_0x3A27[_0x37B6[22]](_0x3A27[_0x37B6[22]](_0x3A27[_0x37B6[22]](_0x3A27[_0x37B6[22]](_0x3A27[_0x37B6[22]](_0x3A27[_0x37B6[22]](_0x37B6[20], _0x37B6[21]), _0x37B6[5]), _0x37B6[23]), _0x37B6[24]), _0x37B6[25]), _0x37B6[26]), _0x37B6[10]), _0x37B6[28]), _0x37B6[30])
                };
                continue;
            case _0x37B6[33]:
                var _0x3AA4 = _0x3BD0 || _0x387E[_0x37B6[32]](_0x3D92);
                continue;
            case _0x37B6[38]:
                var _0x3A0E = function() {
                    return _0x3A27[_0x37B6[36]](_0x3A27[_0x37B6[36]](_0x3A27[_0x37B6[34]](_0x3A27[_0x37B6[34]](_0x37B6[30], _0x37B6[25]), _0x37B6[35]), _0x37B6[10]), _0x37B6[37])
                };
                continue;
            case _0x37B6[54]:
                var _0x39DC = function() {
                    return _0x3A27[_0x37B6[6]](_0x3A27[_0x37B6[6]](_0x3A27[_0x37B6[52]](_0x3A27[_0x37B6[50]](_0x3A27[_0x37B6[48]](_0x3A27[_0x37B6[47]](_0x3A27[_0x37B6[46]](_0x3A27[_0x37B6[45]](_0x3A27[_0x37B6[43]](_0x3A27[_0x37B6[42]](_0x3A27[_0x37B6[40]](_0x3A27[_0x37B6[40]](_0x3A27[_0x37B6[40]](_0x37B6[39], _0x37B6[10]), _0x37B6[30]), _0x37B6[41]), _0x37B6[7]), _0x37B6[10]), _0x37B6[44]), _0x37B6[10]), _0x37B6[37]), _0x37B6[30]), _0x37B6[49]), _0x37B6[51]), _0x37B6[53]), _0x37B6[26])
                };
                continue;
            case _0x37B6[55]:
                var _0x3AD6 = 0x0;
                continue;
            case _0x37B6[74]:
                var _0x3A27 = {
                    "ynM": function _0x3B08(_0x37E8, _0x37CF) {
                        return _0x387E[_0x37B6[56]](_0x37E8, _0x37CF)
                    },
                    "Gcb": function _0x3C66(_0x37CF, _0x37E8) {
                        return _0x387E[_0x37B6[56]](_0x37CF, _0x37E8)
                    },
                    "wXU": function _0x39F5(_0x37CF, _0x37E8) {
                        return _0x387E[_0x37B6[57]](_0x37CF, _0x37E8)
                    },
                    "Uhd": function _0x38C9(_0x37E8, _0x37CF) {
                        return _0x387E[_0x37B6[57]](_0x37E8, _0x37CF)
                    },
                    "TkV": function _0x3B3A(_0x37CF, _0x37E8) {
                        return _0x387E[_0x37B6[58]](_0x37CF, _0x37E8)
                    },
                    "Jrs": function _0x3914(_0x37E8, _0x37CF) {
                        return _0x387E[_0x37B6[58]](_0x37E8, _0x37CF)
                    },
                    "FfZ": function _0x3DAB(_0x37CF, _0x37E8) {
                        return _0x387E[_0x37B6[59]](_0x37CF, _0x37E8)
                    },
                    "beY": function _0x3B9E(_0x37CF, _0x37E8) {
                        return _0x387E[_0x37B6[59]](_0x37CF, _0x37E8)
                    },
                    "cKt": function _0x3A72(_0x37CF, _0x37E8) {
                        return _0x387E[_0x37B6[59]](_0x37CF, _0x37E8)
                    },
                    "lIE": function _0x3ABD(_0x37CF, _0x37E8) {
                        return _0x387E[_0x37B6[60]](_0x37CF, _0x37E8)
                    },
                    "Vdj": function _0x37E8(_0x37CF, _0x37E8) {
                        return _0x387E[_0x37B6[61]](_0x37CF, _0x37E8)
                    },
                    "ijw": function _0x3B85(_0x37CF, _0x37E8) {
                        return _0x387E[_0x37B6[61]](_0x37CF, _0x37E8)
                    },
                    "UDw": function _0x3CCA(_0x37CF, _0x37E8) {
                        return _0x387E[_0x37B6[61]](_0x37CF, _0x37E8)
                    },
                    "WNS": function _0x3CE3(_0x37CF, _0x37E8) {
                        return _0x387E[_0x37B6[62]](_0x37CF, _0x37E8)
                    },
                    "Kki": function _0x384C(_0x37E8, _0x37CF) {
                        return _0x387E[_0x37B6[63]](_0x37E8, _0x37CF)
                    },
                    "OaS": function _0x3D2E(_0x37CF, _0x37E8) {
                        return _0x387E[_0x37B6[64]](_0x37CF, _0x37E8)
                    },
                    "ZtD": function _0x381A(_0x37E8, _0x37CF) {
                        return _0x387E[_0x37B6[65]](_0x37E8, _0x37CF)
                    },
                    "pIS": function _0x3B21(_0x37CF, _0x37E8) {
                        return _0x387E[_0x37B6[65]](_0x37CF, _0x37E8)
                    },
                    "nWn": function _0x3C4D(_0x37E8, _0x37CF) {
                        return _0x387E[_0x37B6[65]](_0x37E8, _0x37CF)
                    },
                    "qBd": function _0x3897(_0x37CF, _0x37E8) {
                        return _0x387E[_0x37B6[66]](_0x37CF, _0x37E8)
                    },
                    "cmR": function _0x38B0(_0x37E8, _0x37CF) {
                        return _0x387E[_0x37B6[67]](_0x37E8, _0x37CF)
                    },
                    "BUV": function _0x3DDD(_0x37E8, _0x37CF) {
                        return _0x387E[_0x37B6[68]](_0x37E8, _0x37CF)
                    },
                    "Vxx": function _0x3AEF(_0x37E8, _0x37CF) {
                        return _0x387E[_0x37B6[69]](_0x37E8, _0x37CF)
                    },
                    "zmE": function _0x3BE9(_0x37E8, _0x37CF) {
                        return _0x387E[_0x37B6[70]](_0x37E8, _0x37CF)
                    },
                    "PrO": function _0x3CB1(_0x37CF) {
                        return _0x387E[_0x37B6[71]](_0x37CF)
                    },
                    "Nsd": function _0x3B53(_0x37CF) {
                        return _0x387E[_0x37B6[72]](_0x37CF)
                    },
                    "MQc": function _0x3D79(_0x37CF) {
                        return _0x387E[_0x37B6[73]](_0x37CF)
                    }
                };
                continue;
            case _0x37B6[82]:
                var _0x3C98 = function() {
                    return _0x3A27[_0x37B6[81]](_0x3A27[_0x37B6[80]](_0x3A27[_0x37B6[80]](_0x3A27[_0x37B6[78]](_0x3A27[_0x37B6[78]](_0x3A27[_0x37B6[77]](_0x3A27[_0x37B6[76]](_0x37B6[5], _0x37B6[75]), _0x37B6[4]), _0x37B6[10]), _0x37B6[8]), _0x37B6[79]), _0x37B6[49]), _0x37B6[75])
                };
                continue;
            case _0x37B6[89]:
                if (_0x3AA4 && _0x3AA4[_0x387E[_0x37B6[83]](_0x395F)]) {
                    for (var _0x3A40 = 0x0; _0x387E[_0x37B6[85]](_0x3A40, _0x3AA4[_0x387E[_0x37B6[84]](_0x395F)]); _0x3A40++) {
                        _0x3AD6 = _0x387E[_0x37B6[70]](_0x387E[_0x37B6[70]](_0x3AD6, _0x387E[_0x37B6[86]](_0x3AA4[_0x387E[_0x37B6[84]](_0x3A8B)](_0x3A40), _0x387E[_0x37B6[70]](_0x3A40, 0x1))), 0x7b);
                        if (_0x387E[_0x37B6[87]](_0x3AD6, _0x387E[_0x37B6[86]](0xa, 0x8))) {
                            _0x3AD6 -= _0x387E[_0x37B6[88]](0xa, 0x8)
                        }
                    }
                };
                continue;
            case _0x37B6[91]:
                var _0x395F = function() {
                    return _0x3A27[_0x37B6[34]](_0x3A27[_0x37B6[34]](_0x3A27[_0x37B6[34]](_0x3A27[_0x37B6[34]](_0x3A27[_0x37B6[90]](_0x37B6[7], _0x37B6[10]), _0x37B6[37]), _0x37B6[39]), _0x37B6[30]), _0x37B6[21])
                };
                continue
            };
            break
        }
    }
    tokenSecret = _0x37CF()
})

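Oh yeah, look at the last line!!! We found tokenSecret. Stepping through this seemingly complicated code shows that all it really does is read the token value from the page source, feed it in, and run a calculation to produce the final tokenSecret value.
The rest is easy.

Pass tokenS to the function below and it returns the value of token.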

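    def make_token(tokenS):
        # Port of the deobfuscated JS loop: each character code is weighted
        # by its 1-based position, then 123 is added.
        y = 0
        for x, ch in enumerate(tokenS):
            y += (x + 1) * ord(ch) + 123
            # The JS subtracts 80 at most once per step; it is not a modulo.
            if y > 80:
                y -= 80
        return y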
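
As a sanity check on this calculation, here is a small self-contained sketch (the function names are mine). Every step adds at least 123, so the y > 80 test is always true and 80 is subtracted on every step. That means the loop collapses to a closed form: the position-weighted sum of character codes plus 43 per character. Feeding in the tokenS from the captured form data gives 40081, which matches the token value in that same capture.

```python
def make_token_loop(token_s):
    """Step-by-step version, mirroring the deobfuscated JS."""
    y = 0
    for i, ch in enumerate(token_s):
        y += (i + 1) * ord(ch) + 123
        if y > 80:
            y -= 80
    return y


def make_token_closed_form(token_s):
    """Equivalent closed form: weighted sum plus (123 - 80) per character."""
    weighted = sum((i + 1) * ord(ch) for i, ch in enumerate(token_s))
    return weighted + (123 - 80) * len(token_s)


sample = "B64B609D-8E9A-4A02-8A46-9FACC7C06EA0"  # tokenS from the captured request
print(make_token_loop(sample), make_token_closed_form(sample))  # 40081 40081
```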

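At this point, tokenS and token are both solved.
Now let's recap the approach.
To get the flight information we need to send a POST request to http://www.tuniu.com/tn?r=intlflight/remote/searchFlights. The submitted form includes tokenS, and this value has to be extracted from the page source of http://www.tuniu.com/flight/intel/sha-sel. But where do the sha and sel in that URL come from?
While analyzing the loaded JS files earlier, we can find that http://img1.tuniucdn.com/j/2018112003/intl_flight/v2/target/search/search-all.js contains a long list of cities, and each city's cityIataCode is where the codes in the URL come from. We can lift this data out directly, match the city name at query time to find its cityIataCode, and then build the URL.
(Another way: while capturing traffic, type 上海 (Shanghai) into the departure city box. A new request appears, http://www.tuniu.com/tn?r=intlflight/remote/getVagueCities&matchValue=%E4%B8%8A%E6%B5%B7, where %E4%B8%8A%E6%B5%B7 is the URL-encoded form of 上海. The response contains that city's information, and its 'code' field is the code used at the end of the URL above.)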
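
For the second approach, the lookup URL can be built with the standard library's percent-encoding, as in this sketch. The helper name is hypothetical, and the structure of the response is not shown in this post, so parsing it is left out.

```python
from urllib.parse import quote

# Endpoint observed in the capture; the city name is appended percent-encoded.
CITY_LOOKUP_URL = "http://www.tuniu.com/tn?r=intlflight/remote/getVagueCities&matchValue="


def build_city_lookup_url(city_name):
    """Return the city-lookup URL for a city name (UTF-8, percent-encoded)."""
    return CITY_LOOKUP_URL + quote(city_name)


print(build_city_lookup_url("上海"))
```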
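OK, everything is sorted, so let's just write it.
Step one: get the city codes for the origin and destination. Build the URL, send a GET request, and extract the value of token from the page source. Then feed it into the token-generation code to get the token value.
Build the data needed for the POST form and send the POST request.
The result... nothing. Completely empty.
More analysis: adding request headers still didn't work, but adding the cookies did. Going through the cookies and removing items one by one shows that OLBSESSID is the one that matters: with it, the data comes back normally. Looking into it, this cookie is set via Set-Cookie on the first visit to http://www.tuniu.com/flight/intel/sha-sel. It is clearly generated server-side, so we won't bother trying to crack it... It is easy to handle anyway: just use a requests Session.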
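
The flow above might look roughly like the sketch below. It is not the author's actual script: the function name, its parameters, and the User-Agent header are assumptions. The session argument is expected to be a requests.Session (or anything with compatible get and post methods), and the token helpers are passed in as callables so the block stands on its own.

```python
SEARCH_URL = "http://www.tuniu.com/tn?r=intlflight/remote/searchFlights"
# Assumption: a browser-like User-Agent. The post does not list the exact headers used.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def search_flights(session, page_url, form, extract_token_s, make_token):
    """GET the results page, derive both tokens, then POST the search form."""
    # 1. Visit the page first so the server's Set-Cookie (OLBSESSID) is
    #    stored on the session.
    page = session.get(page_url, headers=HEADERS)
    # 2. Read tokenS from the page source and compute token from it.
    token_s = extract_token_s(page.text)
    payload = dict(form)
    payload["tokenS"] = token_s
    payload["token"] = make_token(token_s)
    # 3. POST on the same session, which sends the stored cookie back.
    return session.post(SEARCH_URL, data=payload, headers=HEADERS)


# Usage sketch (requires the third-party requests package and network access):
#   import requests
#   with requests.Session() as session:
#       response = search_flights(session, page_url, form, extract_token_s, make_token)
#       print(response.text)
```

Passing the session in, rather than creating it inside the function, keeps the cookie handling visible and makes the function easy to exercise with a stand-in object.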

There is one remaining problem, though: the first run fails, and running it again works normally. This needs some more thought.
