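I publish these articles purely as a record of my own self-study of reverse engineering, so that I can review and consolidate it later.
I am a beginner at reverse engineering, so everything posted here is simple analysis. Experts, please go easy on me!
The yuanrenxue (猿人学) challenges are more than cracking a single encrypted parameter; they also combine anti-debugging, obfuscation and similar techniques, which makes them good practice.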
https://match.yuanrenxue.com/match/1
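Challenge 1: scrape the ticket prices on all 5 pages, compute the average of all prices, and submit it as the answer.
1. After pressing F12 to open the developer tools and refreshing the page, a timer fires every 500 ms and hits a debugger; statement.
The simplest workaround is to set a breakpoint on the debugger line, right-click it, choose to edit the breakpoint, and enter false as the condition.
Clicking resume then gets past it, and execution lands in a different JS file.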
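2. This second JS file also has a timer, and it checks whether the strings returned by setInterval.toString() and eval.toString() match; this is the real anti-debugging check.
So the first trick only got us part of the way, and a different approach is needed.
The simplest one is to remove both detection scripts.
Deleting the JS files
In Overrides, save both uzt.js and uyt.js as local copies, then delete all the JS code inside the saved copies.
After saving and refreshing the page, the debugger no longer triggers.
The page now loads the ticket data normally, and the ticket JSON is clearly visible.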
Analyzing the encrypted parameter
https://match.yuanrenxue.com/api/match/1?m=a0c1d8933ce19752722a965cb2424ded%E4%B8%A81633117185
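The headers show clearly that this is a GET request
with a parameter m: a0c1d8933ce19752722a965cb2424ded丨1633117185
From experience with many other sites, a rough guess is that
the part to the left of the separator 丨 is a 32-character MD5 hex digest and the part to the right is a 10-digit Unix timestamp in seconds. Note that the separator is the CJK character 丨 (URL-encoded as %E4%B8%A8), not the ASCII pipe |.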
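The guess above can be checked quickly in Python. This is only a minimal sketch: the helper name split_m is made up for illustration, and the "32 hex characters plus 10 digits" shape is the working assumption from the captured request, not something the site documents.

```python
import re
from urllib.parse import parse_qs, urlsplit

SEPARATOR = "\u4e28"  # the CJK character 丨 (U+4E28), not the ASCII pipe "|"


def split_m(url):
    """Split the m query parameter into (hex digest, timestamp in seconds)."""
    # parse_qs percent-decodes the value as UTF-8, turning %E4%B8%A8 into 丨
    m_value = parse_qs(urlsplit(url).query)["m"][0]
    digest, _, ts = m_value.partition(SEPARATOR)
    if not re.fullmatch(r"[0-9a-f]{32}", digest):
        raise ValueError("left part is not a 32-character hex digest")
    if not re.fullmatch(r"\d{10}", ts):
        raise ValueError("right part is not a 10-digit timestamp")
    return digest, int(ts)


if __name__ == "__main__":
    captured = ("https://match.yuanrenxue.com/api/match/1"
                "?m=a0c1d8933ce19752722a965cb2424ded%E4%B8%A81633117185")
    print(split_m(captured))
```

If either check raises, the guess about the format was wrong and the parameter needs a closer look.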
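It is also an AJAX request,
so it can be traced the usual way, starting from the XHR entry.
Click the request entry to follow it into the code.
Inside, surprise: the strings are hex-escaped.
So simply set a breakpoint at the top and see what happens; after refreshing the page, execution stops at the breakpoint.
The first thing visible is the URL of the ticket data API,
and $['\x61\x6a\x61\x78'] turns out to be $.ajax, the place where the request is assembled and sent.
Set a breakpoint there and step through the flow.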
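The hex-escaped property names are easy to read once decoded. As a small illustration (the helper unescape_js is a hypothetical name, and it only handles the \xNN form seen in this script), the escapes can be decoded outside the browser like this:

```python
import re


def unescape_js(literal):
    """Replace every \\xNN escape in a JS string literal with its character."""
    return re.sub(
        r"\\x([0-9a-fA-F]{2})",
        lambda match: chr(int(match.group(1), 16)),
        literal,
    )


if __name__ == "__main__":
    # Raw strings keep the backslashes exactly as they appear in the JS source
    print(unescape_js(r"\x61\x6a\x61\x78"))                               # ajax
    print(unescape_js(r"\x74\x6f\x53\x74\x72") + unescape_js(r"\x69\x6e\x67"))  # toString
    print(unescape_js(r"\x66"))                                           # f
```

So $['\x61\x6a\x61\x78'] is $['ajax'], and window['\x66'] is window['f'].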
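Here we can see:
_0x2268f9 is the current time converted to a millisecond timestamp plus (16798545 + -72936737 + 156138192) = 100,000,000 ms, i.e. 100,000 seconds
_0x57feae is the m value,
built as oo0O0(_0x2268f9['\x74\x6f\x53\x74\x72' + '\x69\x6e\x67']()) + window['\x66']
Stepping through with the debugger shows that
m = oo0O0(_0x2268f9.toString()) + window['\x66']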
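The obfuscated sum is just a constant offset. The sketch below works it out and applies it to a timestamp; the function name shifted_timestamp_ms is invented here, and treating the base value as "seconds times 1000" is an assumption that matches the Python script later in this article.

```python
import time

# The three obfuscated literals from the page's JS collapse to one constant
OFFSET_MS = 16798545 + -72936737 + 156138192


def shifted_timestamp_ms(now_seconds=None):
    """Millisecond timestamp plus the fixed offset, like _0x2268f9."""
    if now_seconds is None:
        now_seconds = int(time.time())
    return now_seconds * 1000 + OFFSET_MS


if __name__ == "__main__":
    print(OFFSET_MS)                 # 100000000 ms = 100,000 s
    print(shifted_timestamp_ms())    # value that gets hashed
    print(shifted_timestamp_ms() // 1000)  # 10-digit part sent after the separator
```

In other words the request carries a timestamp roughly 27.8 hours in the future.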
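One thing stands out:
the first part of _0x57feae,
oo0O0(_0x2268f9['\x74\x6f\x53\x74\x72' + '\x69\x6e\x67']())
evaluates to an empty string,
while window['\x66'] is simply read as-is.
It is defined on window, which tells us that by the time this request is built,
the first half of m has already been computed and stored in window['\x66'];
this code only reads it.
Going up the call stack, there is inline JS in the page itself, including the function oo0O0(mw).
Whatever is passed in, this function returns an empty string ''.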
function oo0O0(mw) {
window.b = '';
for (var i = 0,
len = window.a.length; i < len; i++) {
console.log(window.a[i]);
window.b += String[document.e + document.g](window.a[i][document.f + document.h]() - i - window.c)
}
var U = ['W5r5W6VdIHZcT8kU', 'WQ8CWRaxWQirAW=='];
var J = function(o, E) {
o = o - 0x0;
var N = U[o];
if (J['bSSGte'] === undefined) {
var Y = function(w) {
var m = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+/=',
T = String(w)['replace'](/=+$/, '');
var A = '';
for (var C = 0x0,
b, W, l = 0x0; W = T['charAt'](l++);~W && (b = C % 0x4 ? b * 0x40 + W: W, C++%0x4) ? A += String['fromCharCode'](0xff & b >> ( - 0x2 * C & 0x6)) : 0x0) {
W = m['indexOf'](W)
}
return A
};
var t = function(w, m) {
var T = [],
A = 0x0,
C,
b = '',
W = '';
w = Y(w);
for (var R = 0x0,
v = w['length']; R < v; R++) {
W += '%' + ('00' + w['charCodeAt'](R)['toString'](0x10))['slice']( - 0x2)
}
w = decodeURIComponent(W);
var l;
for (l = 0x0; l < 0x100; l++) {
T[l] = l
}
for (l = 0x0; l < 0x100; l++) {
A = (A + T[l] + m['charCodeAt'](l % m['length'])) % 0x100,
C = T[l],
T[l] = T[A],
T[A] = C
}
l = 0x0,
A = 0x0;
for (var L = 0x0; L < w['length']; L++) {
l = (l + 0x1) % 0x100,
A = (A + T[l]) % 0x100,
C = T[l],
T[l] = T[A],
T[A] = C,
b += String['fromCharCode'](w['charCodeAt'](L) ^ T[(T[l] + T[A]) % 0x100])
}
return b
};
J['luAabU'] = t,
J['qlVPZg'] = {},
J['bSSGte'] = !![]
}
var H = J['qlVPZg'][o];
return H === undefined ? (J['TUDBIJ'] === undefined && (J['TUDBIJ'] = !![]), N = J['luAabU'](N, E), J['qlVPZg'][o] = N) : N = H,
N
};
eval(atob(window['b'])[J('0x0', ']dQW')](J('0x1', 'GTu!'), '\x27' + mw + '\x27'));
return ''
}
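But a function like this would not be placed here for nothing; its body shows
that it can change the value of window['\x66']. I forgot to mention:
window['\x66'] is simply window.f.
Let's keep studying oo0O0(mw).
Right before the final return there is this line of code:
eval(atob(window['b'])[J('0x0', ']dQW')](J('0x1', 'GTu!'), '\x27' + mw + '\x27'));
atob() decodes a Base64-encoded string.
Print atob(window['b']) in the console to see what it outputs.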
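To make the data flow in oo0O0 easier to follow, here is a rough Python sketch of what the loop and the eval line appear to do. Several things are assumptions, not facts from the page: that document.e + document.g spells fromCharCode and document.f + document.h spells charCodeAt, that J('0x0', ...) resolves to replace, and that J('0x1', ...) resolves to the placeholder mwqqppz (inferred from the last line of the dumped code). The function names are made up.

```python
import base64


def decode_window_b(a, c):
    """Rebuild window.b: shift each character of window.a back by (index + c)."""
    return "".join(chr(ord(ch) - i - c) for i, ch in enumerate(a))


def build_eval_source(a, c, mw, placeholder="mwqqppz"):
    """Base64-decode window.b and splice the quoted argument into the JS source."""
    js_source = base64.b64decode(decode_window_b(a, c)).decode("utf-8")
    # JS String.replace with a string pattern only replaces the first match
    return js_source.replace(placeholder, "'" + mw + "'", 1)


if __name__ == "__main__":
    # Round trip with made-up inputs: encode a one-line script the same way,
    # then decode it again. These are NOT the real window.a / window.c values.
    c = 3
    encoded = base64.b64encode(b"window.f = hex_md5(mwqqppz)").decode("ascii")
    a = "".join(chr(ord(ch) + i + c) for i, ch in enumerate(encoded))
    print(build_eval_source(a, c, "1633117185000"))
```

Under these assumptions, calling oo0O0 with the timestamp string evaluates the decoded script with that string substituted in, which is how window.f gets set as a side effect while the function itself returns ''.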
Below is the output after formatting and beautifying.
At the very end you can clearly see window.f = hex_md5(mwqqppz),
in other words an MD5 hash.
var hexcase = 0;
var b64pad = "";
var chrsz = 16;
function hex_md5(a) {
return binl2hex(core_md5(str2binl(a), a.length * chrsz))
}
function b64_md5(a) {
return binl2b64(core_md5(str2binl(a), a.length * chrsz))
}
function str_md5(a) {
return binl2str(core_md5(str2binl(a), a.length * chrsz))
}
function hex_hmac_md5(a, b) {
return binl2hex(core_hmac_md5(a, b))
}
function b64_hmac_md5(a, b) {
return binl2b64(core_hmac_md5(a, b))
}
function str_hmac_md5(a, b) {
return binl2str(core_hmac_md5(a, b))
}
function md5_vm_test() {
return hex_md5("abc") == "900150983cd24fb0d6963f7d28e17f72"
}
function core_md5(p, k) {
p[k >> 5] |= 128 << ((k) % 32);
p[(((k + 64) >>> 9) << 4) + 14] = k;
var o = 1732584193;
var n = -271733879;
var m = -1732584194;
var l = 271733878;
for (var g = 0; g < p.length; g += 16) {
var j = o;
var h = n;
var f = m;
var e = l;
o = md5_ff(o, n, m, l, p[g + 0], 7, -680976936);
l = md5_ff(l, o, n, m, p[g + 1], 12, -389564586);
m = md5_ff(m, l, o, n, p[g + 2], 17, 606105819);
n = md5_ff(n, m, l, o, p[g + 3], 22, -1044525330);
o = md5_ff(o, n, m, l, p[g + 4], 7, -176418897);
l = md5_ff(l, o, n, m, p[g + 5], 12, 1200080426);
m = md5_ff(m, l, o, n, p[g + 6], 17, -1473231341);
n = md5_ff(n, m, l, o, p[g + 7], 22, -45705983);
o = md5_ff(o, n, m, l, p[g + 8], 7, 1770035416);
l = md5_ff(l, o, n, m, p[g + 9], 12, -1958414417);
m = md5_ff(m, l, o, n, p[g + 10], 17, -42063);
n = md5_ff(n, m, l, o, p[g + 11], 22, -1990404162);
o = md5_ff(o, n, m, l, p[g + 12], 7, 1804660682);
l = md5_ff(l, o, n, m, p[g + 13], 12, -40341101);
m = md5_ff(m, l, o, n, p[g + 14], 17, -1502002290);
n = md5_ff(n, m, l, o, p[g + 15], 22, 1236535329);
o = md5_gg(o, n, m, l, p[g + 1], 5, -165796510);
l = md5_gg(l, o, n, m, p[g + 6], 9, -1069501632);
m = md5_gg(m, l, o, n, p[g + 11], 14, 643717713);
n = md5_gg(n, m, l, o, p[g + 0], 20, -373897302);
o = md5_gg(o, n, m, l, p[g + 5], 5, -701558691);
l = md5_gg(l, o, n, m, p[g + 10], 9, 38016083);
m = md5_gg(m, l, o, n, p[g + 15], 14, -660478335);
n = md5_gg(n, m, l, o, p[g + 4], 20, -405537848);
o = md5_gg(o, n, m, l, p[g + 9], 5, 568446438);
l = md5_gg(l, o, n, m, p[g + 14], 9, -1019803690);
m = md5_gg(m, l, o, n, p[g + 3], 14, -187363961);
n = md5_gg(n, m, l, o, p[g + 8], 20, 1163531501);
o = md5_gg(o, n, m, l, p[g + 13], 5, -1444681467);
l = md5_gg(l, o, n, m, p[g + 2], 9, -51403784);
m = md5_gg(m, l, o, n, p[g + 7], 14, 1735328473);
n = md5_gg(n, m, l, o, p[g + 12], 20, -1921207734);
o = md5_hh(o, n, m, l, p[g + 5], 4, -378558);
l = md5_hh(l, o, n, m, p[g + 8], 11, -2022574463);
m = md5_hh(m, l, o, n, p[g + 11], 16, 1839030562);
n = md5_hh(n, m, l, o, p[g + 14], 23, -35309556);
o = md5_hh(o, n, m, l, p[g + 1], 4, -1530992060);
l = md5_hh(l, o, n, m, p[g + 4], 11, 1272893353);
m = md5_hh(m, l, o, n, p[g + 7], 16, -155497632);
n = md5_hh(n, m, l, o, p[g + 10], 23, -1094730640);
o = md5_hh(o, n, m, l, p[g + 13], 4, 681279174);
l = md5_hh(l, o, n, m, p[g + 0], 11, -358537222);
m = md5_hh(m, l, o, n, p[g + 3], 16, -722881979);
n = md5_hh(n, m, l, o, p[g + 6], 23, 76029189);
o = md5_hh(o, n, m, l, p[g + 9], 4, -640364487);
l = md5_hh(l, o, n, m, p[g + 12], 11, -421815835);
m = md5_hh(m, l, o, n, p[g + 15], 16, 530742520);
n = md5_hh(n, m, l, o, p[g + 2], 23, -995338651);
o = md5_ii(o, n, m, l, p[g + 0], 6, -198630844);
l = md5_ii(l, o, n, m, p[g + 7], 10, 11261161415);
m = md5_ii(m, l, o, n, p[g + 14], 15, -1416354905);
n = md5_ii(n, m, l, o, p[g + 5], 21, -57434055);
o = md5_ii(o, n, m, l, p[g + 12], 6, 1700485571);
l = md5_ii(l, o, n, m, p[g + 3], 10, -1894446606);
m = md5_ii(m, l, o, n, p[g + 10], 15, -1051523);
n = md5_ii(n, m, l, o, p[g + 1], 21, -2054922799);
o = md5_ii(o, n, m, l, p[g + 8], 6, 1873313359);
l = md5_ii(l, o, n, m, p[g + 15], 10, -30611744);
m = md5_ii(m, l, o, n, p[g + 6], 15, -1560198380);
n = md5_ii(n, m, l, o, p[g + 13], 21, 1309151649);
o = md5_ii(o, n, m, l, p[g + 4], 6, -145523070);
l = md5_ii(l, o, n, m, p[g + 11], 10, -1120210379);
m = md5_ii(m, l, o, n, p[g + 2], 15, 718787259);
n = md5_ii(n, m, l, o, p[g + 9], 21, -343485551);
o = safe_add(o, j);
n = safe_add(n, h);
m = safe_add(m, f);
l = safe_add(l, e)
}
return Array(o, n, m, l)
}
function md5_cmn(h, e, d, c, g, f) {
return safe_add(bit_rol(safe_add(safe_add(e, h), safe_add(c, f)), g), d)
}
function md5_ff(g, f, k, j, e, i, h) {
return md5_cmn((f & k) | ((~f) & j), g, f, e, i, h)
}
function md5_gg(g, f, k, j, e, i, h) {
return md5_cmn((f & j) | (k & (~j)), g, f, e, i, h)
}
function md5_hh(g, f, k, j, e, i, h) {
return md5_cmn(f ^ k ^ j, g, f, e, i, h)
}
function md5_ii(g, f, k, j, e, i, h) {
return md5_cmn(k ^ (f | (~j)), g, f, e, i, h)
}
function core_hmac_md5(c, f) {
var e = str2binl(c);
if (e.length > 16) {
e = core_md5(e, c.length * chrsz)
}
var a = Array(16),
d = Array(16);
for (var b = 0; b < 16; b++) {
a[b] = e[b] ^ 909522486;
d[b] = e[b] ^ 1549556828
}
var g = core_md5(a.concat(str2binl(f)), 512 + f.length * chrsz);
return core_md5(d.concat(g), 512 + 128)
}
function safe_add(a, d) {
var c = (a & 65535) + (d & 65535);
var b = (a >> 16) + (d >> 16) + (c >> 16);
return (b << 16) | (c & 65535)
}
function bit_rol(a, b) {
return (a << b) | (a >>> (32 - b))
}
function str2binl(d) {
var c = Array();
var a = (1 << chrsz) - 1;
for (var b = 0; b < d.length * chrsz; b += chrsz) {
c[b >> 5] |= (d.charCodeAt(b / chrsz) & a) << (b % 32)
}
return c
}
function binl2str(c) {
var d = "";
var a = (1 << chrsz) - 1;
for (var b = 0; b < c.length * 32; b += chrsz) {
d += String.fromCharCode((c[b >> 5] >>> (b % 32)) & a)
}
return d
}
function binl2hex(c) {
var b = hexcase ? "0123456789ABCDEF" : "0123456789abcdef";
var d = "";
for (var a = 0; a < c.length * 4; a++) {
d += b.charAt((c[a >> 2] >> ((a % 4) * 8 + 4)) & 15) + b.charAt((c[a >> 2] >> ((a % 4) * 8)) & 15)
}
return d
}
function binl2b64(d) {
var c = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
var f = "";
for (var b = 0; b < d.length * 4; b += 3) {
var e = (((d[b >> 2] >> 8 * (b % 4)) & 255) << 16) | (((d[b + 1 >> 2] >> 8 * ((b + 1) % 4)) & 255) << 8) | ((d[b + 2 >> 2] >> 8 * ((b + 2) % 4)) & 255);
for (var a = 0; a < 4; a++) {
if (b * 8 + a * 6 > d.length * 32) {
f += b64pad
} else {
f += c.charAt((e >> 6 * (3 - a)) & 63)
}
}
}
return f
};
window.f = hex_md5(mwqqppz)
Paste the JS printed above into the 鬼鬼 JS debugging tool (鬼鬼调试工具) and use it as-is;
that yields the correct first half of m.
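A side note on why the site's JS is reused instead of calling a stock MD5: as printed above, the listing does not match RFC 1321 exactly. chrsz is 16, and at least two round constants differ from the standard table (the first md5_ff step and the second md5_ii step). Assuming the listing was copied accurately, a standard MD5 library would not reproduce window.f. The sketch below derives the standard constants from the RFC's sine formula and compares them; the names are illustrative only.

```python
import math


def rfc1321_constant(step):
    """T[step] from RFC 1321: floor(2**32 * abs(sin(step))), step in 1..64."""
    return int(abs(math.sin(step)) * 2**32)


def to_signed32(value):
    """Show a 32-bit value the way the JS listing prints it (signed)."""
    return value - 2**32 if value >= 2**31 else value


# Constants as they appear in the dumped listing (step number -> printed value)
LISTING = {1: -680976936, 50: 11261161415}

mismatches = {
    step: (to_signed32(rfc1321_constant(step)), seen)
    for step, seen in LISTING.items()
    if to_signed32(rfc1321_constant(step)) != seen
}

if __name__ == "__main__":
    for step, (standard, seen) in sorted(mismatches.items()):
        print("step", step, "standard", standard, "listing", seen)
```

Running the dumped JS unchanged sidesteps the question of which constants were altered.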
Scraping test
Open VS Code and start the Python scraping journey.
The JS is run from Python through the execjs library.
# coding:utf-8
import time

import execjs
import requests


def get_time():
    # Current time in ms plus the 100,000,000 ms offset used by the page's JS
    now = int(time.time()) * 1000
    now = now + (16798545 + -72936737 + 156138192)
    print(now)
    return now


def js_md5(timestamp):
    # yuanrenxue1.js holds the JS dumped from atob(window['b'])
    with open('yuanrenxue1.js', 'r', encoding='utf-8') as f:
        js_txt = f.read()
    js_compiled = execjs.compile(js_txt)
    hex_md5 = js_compiled.call('hex_md5', str(timestamp))
    print(hex_md5)
    return hex_md5


def yuanrenxue_spider(md5, timestamp):
    # m = <digest> + 丨 + <timestamp in seconds>
    url = 'https://match.yuanrenxue.com/api/match/1?m={}'.format(md5 + '丨' + str(timestamp // 1000))
    print(url)
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.10 Safari/537.36',
        'cookie': 'Hm_lvt_0362c7a08a9a04ccf3a8463c590e1e2f=1632991231; vaptchaNetway=cn; Hm_lvt_c99546cf032aaa5a679230de9a95c7db=1632991226,1633063557; Hm_lvt_9bcbda9cbf86757998a2339a0437208e=1632991248,1633063559; Hm_lpvt_9bcbda9cbf86757998a2339a0437208e=1633063559; Hm_lpvt_c99546cf032aaa5a679230de9a95c7db=1633063623'
    }
    response = requests.get(url, headers=headers)
    print(response.text)


if __name__ == '__main__':
    timestamp = get_time()
    md5 = js_md5(timestamp)
    yuanrenxue_spider(md5, timestamp)
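The script finally prints the ticket prices.
Clicking page 2 shows that the URL gains a page=2 parameter.
So the earlier Python script can be adjusted to pass this parameter and fetch the remaining pages.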
# coding:utf-8
import time

import execjs
import requests


def get_time():
    # Current time in ms plus the 100,000,000 ms offset used by the page's JS
    now = int(time.time()) * 1000
    now = now + (16798545 + -72936737 + 156138192)
    print(now)
    return now


def js_md5(timestamp):
    # yuanrenxue1.js holds the JS dumped from atob(window['b'])
    with open('yuanrenxue1.js', 'r', encoding='utf-8') as f:
        js_txt = f.read()
    js_compiled = execjs.compile(js_txt)
    hex_md5 = js_compiled.call('hex_md5', str(timestamp))
    print(hex_md5)
    return hex_md5


def yuanrenxue_spider(md5, timestamp, page):
    # Same m value as before, now with the page number added to the query
    url = 'https://match.yuanrenxue.com/api/match/1?page={page}&m={m}'.format(page=page, m=md5 + '丨' + str(timestamp // 1000))
    print(url)
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.10 Safari/537.36',
        'cookie': 'Hm_lvt_0362c7a08a9a04ccf3a8463c590e1e2f=1632991231; vaptchaNetway=cn; Hm_lvt_c99546cf032aaa5a679230de9a95c7db=1632991226,1633063557; Hm_lvt_9bcbda9cbf86757998a2339a0437208e=1632991248,1633063559; Hm_lpvt_9bcbda9cbf86757998a2339a0437208e=1633063559; Hm_lpvt_c99546cf032aaa5a679230de9a95c7db=1633063623'
    }
    response = requests.get(url, headers=headers)
    print(response.text)


if __name__ == '__main__':
    timestamp = get_time()
    md5 = js_md5(timestamp)
    # range(1, 6) covers pages 1 to 5; range(1, 5) would stop at page 4
    for i in range(1, 6):
        yuanrenxue_spider(md5, timestamp, i)
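But running the script revealed another problem. The server replies:
"Pages 4 and 5 are locked. Their data can only be viewed through programmatic requests. When requesting these two pages from a program, set the User-Agent to: yuanrenxue.project"
So the User-Agent has to be changed before requesting pages 4 and 5, which takes only a simple check:
before sending the request, test the page value, and if it is 4 or 5, replace the User-Agent value.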
# coding:utf-8
import time

import execjs
import requests


def get_time():
    # Current time in ms plus the 100,000,000 ms offset used by the page's JS
    now = int(time.time()) * 1000
    now = now + (16798545 + -72936737 + 156138192)
    print(now)
    return now


def js_md5(timestamp):
    # yuanrenxue1.js holds the JS dumped from atob(window['b'])
    with open('yuanrenxue1.js', 'r', encoding='utf-8') as f:
        js_txt = f.read()
    js_compiled = execjs.compile(js_txt)
    hex_md5 = js_compiled.call('hex_md5', str(timestamp))
    print(hex_md5)
    return hex_md5


def yuanrenxue_spider(md5, timestamp, page):
    url = 'https://match.yuanrenxue.com/api/match/1?page={page}&m={m}'.format(page=page, m=md5 + '丨' + str(timestamp // 1000))
    print(url)
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.10 Safari/537.36',
        'cookie': 'Hm_lvt_0362c7a08a9a04ccf3a8463c590e1e2f=1632991231; vaptchaNetway=cn; Hm_lvt_c99546cf032aaa5a679230de9a95c7db=1632991226,1633063557; Hm_lvt_9bcbda9cbf86757998a2339a0437208e=1632991248,1633063559; Hm_lpvt_9bcbda9cbf86757998a2339a0437208e=1633063559; Hm_lpvt_c99546cf032aaa5a679230de9a95c7db=1633063623'
    }
    # Pages 4 and 5 are only served to the User-Agent the site asks for
    if page in (4, 5):
        headers['user-agent'] = 'yuanrenxue.project'
    response = requests.get(url, headers=headers)
    print(response.text)


if __name__ == '__main__':
    timestamp = get_time()
    md5 = js_md5(timestamp)
    for i in range(1, 6):
        yuanrenxue_spider(md5, timestamp, i)
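Output:
We now have all the ticket prices for pages 1 to 5, so success looks close!
With the prices from all 5 pages available, create a list, append every price to it, get the total with sum() and the count with len(), then divide.
That needs one more small change to the script.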
# coding:utf-8
import time

import execjs
import requests


def get_time():
    # Current time in ms plus the 100,000,000 ms offset used by the page's JS
    now = int(time.time()) * 1000
    now = now + (16798545 + -72936737 + 156138192)
    print(now)
    return now


def js_md5(timestamp):
    # yuanrenxue1.js holds the JS dumped from atob(window['b'])
    with open('yuanrenxue1.js', 'r', encoding='utf-8') as f:
        js_txt = f.read()
    js_compiled = execjs.compile(js_txt)
    hex_md5 = js_compiled.call('hex_md5', str(timestamp))
    print(hex_md5)
    return hex_md5


def yuanrenxue_spider(md5, timestamp, page):
    url = 'https://match.yuanrenxue.com/api/match/1?page={page}&m={m}'.format(page=page, m=md5 + '丨' + str(timestamp // 1000))
    print(url)
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.10 Safari/537.36',
        'cookie': 'Hm_lvt_0362c7a08a9a04ccf3a8463c590e1e2f=1632991231; vaptchaNetway=cn; Hm_lvt_c99546cf032aaa5a679230de9a95c7db=1632991226,1633063557; Hm_lvt_9bcbda9cbf86757998a2339a0437208e=1632991248,1633063559; Hm_lpvt_9bcbda9cbf86757998a2339a0437208e=1633063559; Hm_lpvt_c99546cf032aaa5a679230de9a95c7db=1633063623'
    }
    # Pages 4 and 5 are only served to the User-Agent the site asks for
    if page in (4, 5):
        headers['user-agent'] = 'yuanrenxue.project'
    response = requests.get(url, headers=headers)
    res = response.json()
    # Collect every price into the module-level ticket_lists created under __main__
    for item in res['data']:
        ticket_lists.append(item['value'])


if __name__ == '__main__':
    ticket_lists = []
    timestamp = get_time()
    md5 = js_md5(timestamp)
    for i in range(1, 6):
        yuanrenxue_spider(md5, timestamp, i)
    # Average ticket price = total of all prices / number of prices
    average = sum(ticket_lists) / len(ticket_lists)
    print('Average ticket price:', average)
And with that we have the average ticket price!