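Dealing with JVM Instructions in Java Decompiler Output
A major difficulty in decompiling Java is running into raw virtual machine instructions in the output, but they are not as frightening as they look. Take the following example: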
1: private static String b(String s)
2: {
3: char ac[];
4: int i;
5: int j;
6: ac = s.toCharArray();
7: i = ac.length;
8: j = 0;
9: goto _L1
10: _L9:
11: ac;
12: j;
13: JVM INSTR dup2 ;
14: JVM INSTR caload ;
15: j % 5;
16: JVM INSTR tableswitch 0 3: default 76
17: // 0 52
18: // 1 58
19: // 2 64
20: // 3 70;
21: goto _L2 _L3 _L4 _L5 _L6
22: _L3:
23: 0x62;
24: goto _L7
25: _L4:
26: 23;
27: goto _L7
28: _L5:
29: 46;
30: goto _L7
31: _L6:
32: 14;
33: goto _L7
34: _L2:
35: 95;
36: _L7:
37: JVM INSTR ixor ;
38: (char);
39: JVM INSTR castore ;
40: j++;
41: _L1:
42: if(j < i) goto _L9; else goto _L8
43: _L8:
44: return new String(ac);
45: }
Taken as a whole, and judging from lines 40 and 42 in particular, this is a loop:
char[] ac = s.toCharArray();
int i = ac.length;
int j = 0;
while (j < i) {
    // loop body
}
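The loop body is the key. First, a quick introduction to the JVM INSTR entries that appear here.
dup2 (duplicate 2) copies the top two values on the operand stack and pushes the copies on top;
caload (char array load) pops an array reference and an index and pushes that array element;
tableswitch implements a switch/case structure; the numbers 52, 58, 64, 70 and 76 are the bytecode offsets of the branch targets, not source line numbers;
ixor is int xor;
castore (char array store) pops an array reference, an index and a value, and stores the value into that array element.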
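To make the tableswitch entry concrete, here is a minimal sketch of how the jump table in the listing selects a branch target. The class and method names (TableSwitchSketch, target) are made up for illustration, and the sketch treats the numbers printed by the decompiler as absolute branch targets; in the class file itself, tableswitch stores offsets relative to the address of the instruction.

```java
// Hypothetical model of the tableswitch from the listing: keys 0..3, default 76.
class TableSwitchSketch {
    static final int LOW = 0;               // lowest key covered by the table
    static final int HIGH = 3;              // highest key covered by the table
    static final int DEFAULT_TARGET = 76;   // taken when the key is out of range
    static final int[] TARGETS = {52, 58, 64, 70};

    // Returns the branch target for a key, the way a jump table lookup would.
    static int target(int key) {
        if (key < LOW || key > HIGH) {
            return DEFAULT_TARGET;
        }
        return TARGETS[key - LOW];
    }

    public static void main(String[] args) {
        // j % 5 yields 0..4 for non-negative j, so 4 is the only value that reaches the default.
        for (int key = 0; key < 5; key++) {
            System.out.println(key + " -> " + target(key));
        }
    }
}
```

Since the table covers a dense range of keys, the lookup is a bounds check plus an array index, which is why compilers prefer tableswitch over lookupswitch for cases like 0 through 3.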
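Lines 11 through 40 form the loop body. Because the JVM is a stack machine, every step in the analysis below involves the operand stack. Read in terms of the stack, the loop body is: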
11: push ac;
12: push j;
13: dup2 ac j;
// the stack now holds ac, j, ac, j; the extra copy serves the later caload and castore
14: caload ac[j];
// the stack now holds ac, j, ac[j]
15: push j % 5;
// the stack now holds ac, j, ac[j], j % 5
16: switch (j % 5)   // tableswitch pops j % 5 off the stack
22: case 0:
23: push 0x62;
24: break;
25: case 1:
26: push 23;
27: break;
28: case 2:
29: push 46;
30: break;
31: case 3:
32: push 14;
33: break;
34: default:
35: push 95;
// the stack now holds ac, j, ac[j] and one of 0x62, 23, 46, 14 or 95; call it k
37: ixor ac[j] k; XOR the top two values on the stack; call the result r
// the stack now holds ac, j, r
38: (char) r; convert the value on top of the stack to char
39: castore ac[j], r; store r into ac[j]
// the stack is now empty
40: j++;
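The trace above can be replayed on an explicit stack. The following is a rough sketch, not how the JVM is actually implemented: OperandStackReplay and step are hypothetical names, and a Deque of boxed objects stands in for the operand stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper that replays one loop iteration of the listing.
class OperandStackReplay {
    static void step(char[] ac, int j) {
        Deque<Object> stack = new ArrayDeque<>();
        stack.push(ac);                        // 11: push ac
        stack.push(j);                         // 12: push j

        // 13: dup2 copies the top two entries -> ac, j, ac, j
        Object top = stack.pop();
        Object below = stack.pop();
        stack.push(below);
        stack.push(top);
        stack.push(below);
        stack.push(top);

        // 14: caload pops index and array, pushes ac[j] as an int
        int index = (Integer) stack.pop();
        char[] array = (char[]) stack.pop();
        stack.push((int) array[index]);

        // 15-16: push j % 5, then tableswitch pops it and picks a constant
        stack.push(j % 5);
        int selector = (Integer) stack.pop();
        int k;
        switch (selector) {
            case 0:  k = 0x62; break;
            case 1:  k = 23;   break;
            case 2:  k = 46;   break;
            case 3:  k = 14;   break;
            default: k = 95;
        }
        stack.push(k);                         // 23/26/29/32/35: push k

        // 37: ixor pops two ints and pushes their XOR
        int right = (Integer) stack.pop();
        int left = (Integer) stack.pop();
        stack.push(left ^ right);

        // 38: the (char) cast truncates the int to 16 bits
        int r = (Integer) stack.pop();
        stack.push((int) (char) r);

        // 39: castore pops value, index and array, then stores the value
        int value = (Integer) stack.pop();
        int storeIndex = (Integer) stack.pop();
        char[] target = (char[]) stack.pop();
        target[storeIndex] = (char) value;

        if (!stack.isEmpty()) {
            throw new IllegalStateException("operand stack should be empty here");
        }
    }

    public static void main(String[] args) {
        char[] ac = "hello".toCharArray();
        for (int j = 0; j < ac.length; j++) {  // 40-42: j++ and the loop test
            step(ac, j);
        }
        for (char c : ac) {
            System.out.print((int) c + " ");
        }
        System.out.println();
    }
}
```

The final emptiness check mirrors the last comment of the trace: every value pushed during an iteration has been consumed by the time castore finishes.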
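By now the sharp-eyed reader has probably worked out what this code does. Knowing the Java virtual machine and its instruction set helps a great deal when decompiling. Written out as Java code: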
private static String b(String s) {
    char ac[] = s.toCharArray();
    int i = ac.length;
    int j = 0;
    while (j < i) {
        int k;
        switch (j % 5) {
        case 0:
            k = 0x62;
            break;
        case 1:
            k = 23;
            break;
        case 2:
            k = 46;
            break;
        case 3:
            k = 14;
            break;
        default:
            k = 95;
        }
        ac[j] = (char) (ac[j] ^ k);
        j++;
    }
    return new String(ac);
}
This turns out to be a simple XOR cipher applied to a string. Simplified:
private static String b(String s) {
    int[] key = {0x62, 23, 46, 14, 95};
    char ac[] = s.toCharArray();
    for (int j = 0; j < ac.length; j++) {
        ac[j] ^= key[j % 5];
    }
    return new String(ac);
}
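Because XOR is its own inverse, running the same method over its own output should give back the original string, which is presumably how the obfuscated program recovers its strings at run time. A small sketch to check this; the class name XorRoundTrip and the sample text are illustrative assumptions, not part of the decompiled program:

```java
// Hypothetical wrapper around the simplified method, used to check the round trip.
class XorRoundTrip {
    // Same five-entry key table as in the simplified method.
    private static final int[] KEY = {0x62, 23, 46, 14, 95};

    static String b(String s) {
        char[] ac = s.toCharArray();
        for (int j = 0; j < ac.length; j++) {
            // Compound assignment narrows the int result back to char implicitly.
            ac[j] ^= KEY[j % KEY.length];
        }
        return new String(ac);
    }

    public static void main(String[] args) {
        String plain = "Hello, JVM!";   // sample input, chosen arbitrarily
        String encoded = b(plain);      // first pass scrambles the text
        String decoded = b(encoded);    // second pass undoes the first
        System.out.println("round trip ok: " + decoded.equals(plain));
    }
}
```

In practice this means a decompiled string-decoding routine like this one can be copied into a scratch class and called directly on the scrambled literals found in the class file.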