Python Crawler Examples (3): Decoding the Toutiao as/cp Algorithm

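Toutiao's (今日头条) as and cp parameters are nothing more than an obfuscated encoding of the current time. Their JS code is minified, so running it through an ordinary formatter is enough to make it readable.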

import json
import requests

session = requests.Session()
i = 1  # assumed page counter; the original snippet takes i from an enclosing loop

url = "http://www.toutiao.com/api/pc/feed/"
data = {
    "category": "news_game",
    "utm_source": "toutiao",
    "widen": str(i),
    "max_behot_time": "0",
    "max_behot_time_tmp": "0",
    "tadrequire": "true",
    "as": "479BB4B7254C150",
    "cp": "7E0AC8874BB0985",
}
headers = {
    "Host": "www.toutiao.com",
    "Connection": "keep-alive",
    "Accept": "text/javascript, text/html, application/xml, text/xml, */*",
    "X-Requested-With": "XMLHttpRequest",
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36",
    "Content-Type": "application/x-www-form-urlencoded",
    "Referer": "http://www.toutiao.com/ch/news_hot/",
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "zh-CN,zh;q=0.8",
}
# Fetch the feed and parse the JSON response body
result1 = session.get(url=url, params=data, headers=headers).text
result2 = json.loads(result1)

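Analysis of the Toutiao as/cp algorithm: when submitting the request, sending "as":"479BB4B7254C150", "cp":"7E0AC8874BB0985" is enough, since these are the fallback values hard-coded in the JS. If you are curious, take a look at their JS code, which simply obfuscates the current timestamp: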

var e = {};
    e.getHoney = function() {
        var t = Math.floor((new Date).getTime() / 1e3),
            e = t.toString(16).toUpperCase(),
            n = md5(t).toString().toUpperCase();
        if (8 != e.length) return {
            as: "479BB4B7254C150",
            cp: "7E0AC8874BB0985"
        };
        for (var o = n.slice(0, 5), i = n.slice(-5), a = "", r = 0; 5 > r; r++) a += o[r] + e[r];
        for (var l = "", s = 0; 5 > s; s++) l += e[s + 3] + i[s];
        return {
            as: "A1" + a + e.slice(-3),
            cp: e.slice(0, 3) + l + "E1"
        }
    }, t.ascp = e
}(window, document), function() {
    var t = ascp.getHoney(),
        e = {
            path: "/",
            domain: "i.snssdk.com"
        };
    $.cookie("cp", t.cp, e), $.cookie("as", t.as, e), window._honey = t
}(), Flow.prototype = {
    init: function() {
        var t = this;
        this.url && (t.showState(t.auto_load ? NETWORKTIPS.LOADING : NETWORKTIPS.HASMORE), this.container.on("scrollBottom", function() {
            t.auto_load && (t.lock || t.has_more && t.loadmore())
        }), this.list_bottom.on("click", "a", function() {
            return t.lock = !1, t.loadmore(), !1
        }))
    },
    loadmore: function(t) {
        this.getData(this.url, this.type, this.param, t)
    },
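
The `8 != e.length` check in getHoney decides whether the hard-coded fallback pair is returned. As a rough sketch (the names `LOW`, `HIGH` and `uses_fallback` are made up for this illustration and are not part of Toutiao's code), the following shows which Unix timestamps have an 8-digit hex form, and therefore when the computed values are used instead of the fallback:

```python
from datetime import datetime, timedelta, timezone

# Smallest and largest integers whose hex representation has exactly 8 digits
LOW, HIGH = 0x10000000, 0xFFFFFFFF

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def uses_fallback(ts):
    """Mirror the JS guard: fall back unless the hex timestamp has 8 digits."""
    return len(format(ts, "X")) != 8


# Convert both bounds to UTC dates to see the window in which the guard passes
first = EPOCH + timedelta(seconds=LOW)
last = EPOCH + timedelta(seconds=HIGH)
print(first.isoformat(), "to", last.isoformat())
```

For any present-day clock the guard passes, so the browser normally sends computed values rather than the fallback constants.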

Python implementation of the as/cp algorithm

import hashlib
import time


def get_as_cp():
    now = int(time.time())  # current Unix time in seconds (the JS uses Math.floor)
    e = format(now, "X")  # timestamp as an uppercase hexadecimal string
    # MD5 of the decimal timestamp string, as an uppercase hex digest
    md5_hex = hashlib.md5(str(now).encode("ascii")).hexdigest().upper()
    if len(e) != 8:
        # Same fallback constants as in the JS code
        return {"as": "479BB4B7254C150", "cp": "7E0AC8874BB0985"}
    n = md5_hex[:5]   # first five MD5 characters
    a = md5_hex[-5:]  # last five MD5 characters
    s = ""
    r = ""
    for k in range(5):
        s += n[k] + e[k]      # interleave MD5 head with hex digits 0-4
    for j in range(5):
        r += e[j + 3] + a[j]  # interleave hex digits 3-7 with MD5 tail
    return {
        "as": "A1" + s + e[-3:],
        "cp": e[:3] + r + "E1",
    }


if __name__ == "__main__":
    print(get_as_cp())
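
To illustrate the claim that only the time is encoded, here is a minimal sketch that builds the pair for a given timestamp and then reads the timestamp back out of the as value. The function names `as_cp_for` and `timestamp_from_as` are hypothetical helpers introduced for this example. It assumes, as the Python code above does, that the MD5 is taken over the decimal timestamp string.

```python
import hashlib

FALLBACK = {"as": "479BB4B7254C150", "cp": "7E0AC8874BB0985"}


def as_cp_for(ts):
    """Build the as/cp pair for an integer Unix timestamp in seconds."""
    e = format(ts, "X")  # uppercase hex timestamp
    md5_hex = hashlib.md5(str(ts).encode("ascii")).hexdigest().upper()
    if len(e) != 8:
        return dict(FALLBACK)
    head, tail = md5_hex[:5], md5_hex[-5:]
    a = "".join(head[k] + e[k] for k in range(5))      # MD5 char, then time digit
    l = "".join(e[k + 3] + tail[k] for k in range(5))  # time digit, then MD5 char
    return {"as": "A1" + a + e[-3:], "cp": e[:3] + l + "E1"}


def timestamp_from_as(as_value):
    """Recover the timestamp from a computed (non-fallback) as value."""
    interleaved = as_value[2:12]                # ten characters after the "A1" prefix
    hex_ts = interleaved[1::2] + as_value[12:]  # time digits 0-4, then digits 5-7
    return int(hex_ts, 16)


pair = as_cp_for(1500000000)
print(pair, timestamp_from_as(pair["as"]))
```

Because every hex digit of the timestamp appears verbatim in as, the value can be decoded without knowing the MD5 at all. The decoder is not meaningful for the fallback constants, which do not follow the "A1" layout.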

 
