PSP 《大众高尔夫2P》 (Everybody's Golf Portable 2): Analysis of the XB Resource Pack Algorithms (2)

Analysis environment

  • A complete psptoolchain build environment (which includes the psplinkusb source code)
  • IDA Pro

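Preparation

  • Modify the api_hook code in psplinkusb so that it prints the arguments of each hooked API call
  • Add support for data-access breakpoints to the psplinkusb code
    Reference:
    http://www.pspchina.net/viewthread.php?tid=290484&highlight=psplink (this link is dead; use the address below)
    http://bbs.pspchina.net/forum.php?mod=viewthread&tid=290484&highlight=psplink
  • Add a stack-frame display feature to the psplinkusb code
    Reference:
    http://blog.csdn.net/jerryutscn/archive/2010/07/30/5775226.aspx
  • Disassemble boot.bin with IDA Pro (if boot.bin contains no valid data, decrypt eboot.bin first)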

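Analysis method

Locating the function:

Without the entry point of the target function, the sheer size of the disassembly listing would swamp us.

We find the address of the decompression function by relying on the following assumption:

Any instruction that accesses (reads) the address of the compressed data is related to the decompression function. (This assumption does not always hold, but you can create the conditions under which it does.)

First, when the game starts, use psplink to hook the following six API functions: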

sceIoOpen

sceIoOpenAsync

sceIoRead

sceIoReadAsync

sceIoLseek

sceIoLseekAsync

When the game starts, the following output appears:

host0:/>
TH:0x0477D673(RA:0x8002013A) sceIoOpen("disc0:/PSP_GAME/USRDIR/mgp.ufl", 0x00000001, 00, ) = 3
TH:0x0477D673(RA:0x8002013A) sceIoLseek(3, 0x00000000_00000000, 0x00000002, ) = 0x00056876
TH:0x0477D673(RA:0x8002013A) sceIoLseek(3, 0x00000000_00000000, 0x00000000, ) = 0x00000000
TH:0x0477D673(RA:0x8002013A) sceIoRead(3, 0x08DA7980, 0x00056876, ) = 0x00056876
TH:0x0477D673(RA:0x8002013A) sceIoOpen("umd1:", 0x00000001, 00, ) = 3
TH:0x0477D673(RA:0x8002013A) sceIoLseek(3, 0x00000000_00055FC0, 0x00000000, ) = 0x00055FC0
TH:0x0477D673(RA:0x8002013A) sceIoRead(3, 0x08BE47C0, 0x00000005, ) = 0x00000005

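Here the value 00055FC0 passed to sceIoLseek is an offset into the ISO file counted in sectors, and the 0x00000005 passed to sceIoRead means that five sectors of data (2048*5 bytes) were read into memory at address 0x08BE47C0.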

0x55FC0 = 352192

The matching entry can be found in the ISO structure file exported from UMDGen:

0352192 , /PSP_GAME/USRDIR/xbdata/kuwa/loading/loading.xb

This is the first xb file loaded when the game starts.
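The lookup above is plain arithmetic, and a minimal sketch of it follows. The 2048-byte sector size and the `LBA , path` line layout are taken from the log and the listing line quoted above; the helper names are made up for illustration, and a real UMDGen export may contain lines this parser does not handle.

```python
SECTOR_SIZE = 2048  # bytes per sector, as used in the text (2048*5 bytes)


def parse_listing_line(line):
    """Split one 'LBA , path' line (layout assumed from the sample above)."""
    lba_text, path = line.split(",", 1)
    return int(lba_text.strip()), path.strip()


def find_file_by_lba(listing_lines, lba):
    """Return the path whose start sector equals lba, or None if absent."""
    for line in listing_lines:
        if not line.strip():
            continue
        start, path = parse_listing_line(line)
        if start == lba:
            return path
    return None


listing = ["0352192 , /PSP_GAME/USRDIR/xbdata/kuwa/loading/loading.xb"]
seek_sector = 0x00055FC0   # offset argument of sceIoLseek in the log
read_sectors = 0x00000005  # size argument of sceIoRead in the log

print(seek_sector)                             # 352192
print(find_file_by_lba(listing, seek_sector))  # the loading.xb path
print(seek_sector * SECTOR_SIZE)               # byte offset inside the ISO
print(read_sectors * SECTOR_SIZE)              # 10240 bytes read
```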

Repeated runs show that the address 0x08BE47C0 is fixed, so we can set a hardware read-access breakpoint to find the address of the algorithm.

 

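Analysis of the LZSS algorithm for the file name list

Locating the function

We first analyze the LZSS-style compression used for the file name list. Open loading.xb in WinHex and confirm that the data at offset 0x44 belongs to the compressed file name list. (The list already begins at 0x40, so setting the breakpoint there would work just as well.)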

Offset 0 1 2 3 4 5 6 7 8 9 A B C D E F
00000000 78 65 00 01 06 00 00 00 40 20 00 00 26 00 00 20 xe......@ ..&..
00000010 40 20 00 00 81 01 00 20 40 20 00 00 E1 02 00 20 @ ..?.. @ ..á..
00000020 40 20 00 00 47 04 00 20 40 20 00 00 AE 05 00 20 @ ..G.. @ ..?..
00000030 40 20 00 00 15 07 00 20 0E 01 00 00 5F 00 00 00 @ ..... ...._...
00000040 78 2A EA 2E 2E 5C 64 61 74 61 5C 6B 75 77 61 62 x*ê../data/kuwab
00000050 61 72 61 5C 6C 6F 61 64 69 6E 67 5C 6E 6F 77 5F ara/loading/now_

0x08BE47C0 + 0x44 = 0x8BE4804
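The same address can be computed with a small helper that also rounds down to a 4-byte boundary (a no-op here, since 0x44 is already a multiple of 4). This is only a sketch: `watch_address` is a hypothetical name, and the `bpset ... r` syntax it prints is the one shown in this article for the author's modified psplinkusb, not necessarily a stock psplink command.

```python
def watch_address(buffer_base, offset, align=4):
    """Address to watch, rounded down to the given power-of-two alignment."""
    return (buffer_base + offset) & ~(align - 1)


addr = watch_address(0x08BE47C0, 0x44)
print(hex(addr))              # 0x8be4804
print(f"bpset 0x{addr:X} r")  # bpset 0x8BE4804 r
```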

So we set the following breakpoint (note that the address must be 4-byte aligned):

host0:/> bpset 0x8BE4804 r
… …
TH:0x0477A25F(RA:0x8002013A) sceIoRead(3, 0x08BE47C0, 0x00000005, ) = 0x00000005
Exception - DEBUG (D)
Thread ID - 0x0477A25F
Th Name - main
Module ID - 0x0478A077
Mod Name - main
EPC - 0x0882AE08
DRCNTL - 0x0043D004
Status - 0x60088613
zr:0x00000000 at:0x00000001 v0:0x08BE47F0 v1:0x08B398A6
a0:0x08B3979B a1:0x08BE4804 a2:0x0000001C a3:0x0000002E
t0:0x00000001 t1:0x0000001B t2:0x09FFE840 t3:0xDEADBEEF
t4:0xDEADBEEF t5:0xDEADBEEF t6:0xDEADBEEF t7:0xDEADBEEF
s0:0x0000005F s1:0x0000010E s2:0x0882ADD4 s3:0x08BE47F8
s4:0x08B39798 s5:0x00000000 s6:0x08AEF53C s7:0xDEADBEEF
t8:0xDEADBEEF t9:0xDEADBEEF k0:0x09FFEB00 k1:0x00000000
gp:0x08B3FB20 sp:0x09FFE6E0 fp:0x09FFEAC0 ra:0x0882AF18
0x0882AE08: 0x90A70000 '....' - lbu $a3, 0($a1)

At the same point we can also dump the current call stack, which helps with the analysis:

host0:/> bt
882add4(8ff1700,8ff37c9,10,0) pc [882ae08] ra [882af18] sz [0]
882aec0(8ff1700,8ff37c9,10,0) pc [882af18] ra [882a94c] sz [32]
882a860(8ff1700,8ff37c9,10,0) pc [882a94c] ra [882f504] sz [16]
882f4d0(8ff1700,8ff37c9,10,0) pc [882f504] ra [882f160] sz [16]
882f078(8ff1700,8ff37c9,10,0) pc [882f160] ra [8a8e8e8] sz [32]
8a8e890(8ff1700,8ff37c9,10,0) pc [8a8e8e8] ra [8a9075c] sz [48]
8a906c0(8ff1700,8ff37c9,10,0) pc [8a9075c] ra [8918ac4] sz [48]
8918a2c(8ff1700,8ff37c9,10,0) pc [8918ac4] ra [890f7dc] sz [16]
890ed24(8ff1700,8ff37c9,10,0) pc [890f7dc] ra [890ecc4] sz [16]
890ec7c(8ff1700,8ff37c9,10,0) pc [890ecc4] ra [890c1fc] sz [16]
890be5c(8ff1700,8ff37c9,10,0) pc [890c1fc] ra [88ffdf8] sz [64]
88ffcf0(8ff1700,8ff37c9,10,0) pc [88ffdf8] ra [88ffeb8] sz [48]
88ffcf0(8ff1700,8ff37c9,10,0) pc [88ffeb8] ra [88ffeb8] sz [48]
88ffcf0(8ff1700,8ff37c9,10,0) pc [88ffeb8] ra [88ffeb8] sz [48]
88ffcf0(8ff1700,8ff37c9,10,0) pc [88ffeb8] ra [88ffcc8] sz [48]
88ffc98(8ff1700,8ff37c9,10,0) pc [88ffcc8] ra [8804c28] sz [16]
8804bbc(8ff1700,8ff37c9,10,0) pc [8804c28] ra [88047f4] sz [16]
8804358(8ff1700,8ff37c9,10,0) pc [88047f4] ra [9ffeac0] sz [32]
9e3f088(8ff1700,8ff37c9,10,0) pc [9ffeac0] ra [0] sz [0]

done!

From this we can read off the following call chain:

882add4 <- 882aec0 <- 882a860 <- 882f4d0 <- 882f078
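Reading the chain off by hand is error-prone for deep stacks, so here is a minimal sketch that parses the `bt` output. The line layout is assumed from the transcript above (the `bt` command itself comes from the author's stack-frame patch to psplinkusb), and the function names are illustrative. It also checks that each frame's `ra` equals the `pc` of the frame below it, which is what makes the chain trustworthy.

```python
import re

# One frame per line: func(args) pc [..] ra [..] sz [..]
FRAME_RE = re.compile(
    r"^([0-9a-f]+)\(.*?\) pc \[([0-9a-f]+)\] ra \[([0-9a-f]+)\] sz \[(\d+)\]$"
)


def parse_frames(text):
    """Turn bt output into a list of dicts, innermost frame first."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            func, pc, ra = (int(g, 16) for g in m.group(1, 2, 3))
            frames.append({"func": func, "pc": pc, "ra": ra, "size": int(m.group(4))})
    return frames


def call_chain(frames, depth):
    """Format the innermost `depth` frames as 'callee <- caller <- ...'."""
    return " <- ".join(f"{f['func']:x}" for f in frames[:depth])


def frames_are_linked(frames):
    """True if every frame returns to the pc recorded in the next frame."""
    return all(a["ra"] == b["pc"] for a, b in zip(frames, frames[1:]))


BT_SAMPLE = """\
882add4(8ff1700,8ff37c9,10,0) pc [882ae08] ra [882af18] sz [0]
882aec0(8ff1700,8ff37c9,10,0) pc [882af18] ra [882a94c] sz [32]
882a860(8ff1700,8ff37c9,10,0) pc [882a94c] ra [882f504] sz [16]
882f4d0(8ff1700,8ff37c9,10,0) pc [882f504] ra [882f160] sz [16]
882f078(8ff1700,8ff37c9,10,0) pc [882f160] ra [8a8e8e8] sz [32]
"""

frames = parse_frames(BT_SAMPLE)
print(call_chain(frames, 5))
print(frames_are_linked(frames))
```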

The same call path can be seen in IDA:
[Figure 1: the call path as shown in IDA]

From the analysis above, we suspect that 0x882add4 is the target function.

Looking at how the caller invokes it:

.text:0882AF08 addiu $a1, $s3, 8
.text:0882AF0C move $a2, $s1
.text:0882AF10 jalr $s2
.text:0882AF14 move $a0, $s4
.text:0882AF18 b loc_882AF30

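Registers a0, a1 and a2 are assigned before the call, and the return-value registers (v0, v1) are not examined once the call returns, so we can guess that the function prototype is: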

void lzss_decoder(u32 a0, u32 a1, u32 a2);

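Function calling convention on MIPS

  • Arguments are passed in registers a0, a1, a2, a3, t0, t1 (the first argument in a0, the next in a1, and so on; under the EABI used by the PSP toolchain, t2 and t3 follow); any further arguments go on the stack.
  • There are exceptions. With a 64-bit argument, as in lseek(fd, offset, where), a0 holds fd, a1 is left unused, a2 and a3 hold offset, and t0 holds where. In other words, a 64-bit argument must be 64-bit aligned.
  • The return value is placed in v0 and v1; v1 is used only for 64-bit return values.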
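The rules in the list above can be sketched as a small register allocator. This is a simplified model for integer arguments only: the register list includes t2 and t3 on the assumption that the PSP toolchain follows the MIPS EABI, and `assign_registers` is a hypothetical helper, not part of any SDK.

```python
# Integer argument registers in order; t2/t3 are assumed per the MIPS EABI.
ARG_REGS = ["a0", "a1", "a2", "a3", "t0", "t1", "t2", "t3"]


def assign_registers(arg_bits):
    """Map a list of argument widths (32 or 64) to register tuples."""
    placements = []
    idx = 0
    for bits in arg_bits:
        words = bits // 32
        if words == 2 and idx % 2:
            idx += 1  # skip one register so the 64-bit value is pair-aligned
        if idx + words <= len(ARG_REGS):
            placements.append(tuple(ARG_REGS[idx:idx + words]))
            idx += words
        else:
            placements.append(("stack",))
            idx = len(ARG_REGS)  # everything after this also goes on the stack
    return placements


# lseek(fd, offset, where) with a 64-bit offset
print(assign_registers([32, 64, 32]))  # [('a0',), ('a2', 'a3'), ('t0',)]
# lzss_decoder(a0, a1, a2)
print(assign_registers([32, 32, 32]))  # [('a0',), ('a1',), ('a2',)]
```

The first result matches the hooked sceIoLseek calls in the log, where the 64-bit offset is printed as a pair of words.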

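Confirming what the function does

To confirm what the function does, we set breakpoints at its entry and exit addresses and compare the arguments before and after it runs.

First, the entry point: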

host0:/> bpset 0x882add4
host0:/> exprint
Exception - Breakpoint
Thread ID - 0x047DA35D
Th Name - main
Module ID - 0x03ADE673
Mod Name - main
EPC - 0x0882ADD4
Cause - 0x10000024
BadVAddr - 0xA6A11114
Status - 0x60088613
zr:0x00000000 at:0xDEADBEEF v0:0x08BE47F0 v1:0x00000030
a0:0x08B39798 a1:0x08BE4800 a2:0x0000010E a3:0x642F2E2E
t0:0x80808080 t1:0xFEFEFEFF t2:0x09FFE840 t3:0xDEADBEEF
t4:0xDEADBEEF t5:0xDEADBEEF t6:0xDEADBEEF t7:0xDEADBEEF
s0:0x0000005F s1:0x0000010E s2:0x0882ADD4 s3:0x08BE47F8
s4:0x08B39798 s5:0x00000000 s6:0x08AEF53C s7:0xDEADBEEF
t8:0xDEADBEEF t9:0xDEADBEEF k0:0x09FFEB00 k1:0x00000000
gp:0x08B3FB20 sp:0x09FFE6E0 fp:0x09FFEAC0 ra:0x0882AF18
host0:/> memdump 0x08B39798
- 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f - 0123456789abcdef
----------------------------------------------------------------------------
08b39798 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
08b397a8 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
08b397b8 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
08b397c8 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
08b397d8 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
08b397e8 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
08b397f8 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
08b39808 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
08b39818 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
08b39828 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
08b39838 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
08b39848 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
08b39858 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
08b39868 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
08b39878 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
08b39888 - 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................
host0:/> memdump 0x08BE4800
- 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f - 0123456789abcdef
-----------------------------------------------------------------------------
08be4800 - 78 2A EA 2E 2E 5C 64 61 74 61 5C 6B 75 77 61 62 - x*.../data/kuwab
08be4810 - 61 72 61 5C 6C 6F 61 64 69 6E 67 5C 6E 6F 77 5F - ara/loading/now_
08be4820 - C9 00 20 30 31 2E 74 6D 32 00 2A DA 8A D0 02 00 - .. 01.tm2.*.....
08be4830 - 32 D7 02 00 CA 8A D0 02 00 33 D7 02 00 BA 8A D0 - 2........3......
08be4840 - 02 00 34 D7 02 00 AA 8A D0 02 00 35 D7 02 00 9A - ..4........5....
08be4850 - 8A D0 02 00 36 D5 02 00 40 20 00 00 6A 05 00 00 - ....6...@ ..j...
08be4860 - 1C 54 49 4D 32 04 00 01 00 1B 00 04 30 20 99 00 - .TIM2.......0 ..
08be4870 - 81 00 00 30 83 00 14 01 00 01 40 00 40 13 01 00 - ...0......@.@...
08be4880 - 98 63 02 04 60 02 BD 02 F6 1F 00 F6 1F 00 EA 10 - .c..`...........
08be4890 - 00 3C 8C B1 94 D2 F7 DE 18 E3 39 E7 F7 DE B5 D6 - .<........9.....
08be48a0 - CE B9 A6 C1 07 18 31 CA 9B F3 DE FF FE 21 00 13 - ......1......!..
08be48b0 - 00 24 FE FF DE FF DD FB 9C F7 72 CE 86 C1 07 04 - .$........r.....
08be48c0 - 94 D6 A3 07 14 FD FF DC FF 9A FB 2F 00 00 BB 21 - .........../...!
08be48d0 - 01 14 DE FF DE FB D6 DE 6E E1 07 08 BC F7 FF 81 - ........n.......
08be48e0 - 06 28 99 FB 77 F7 56 F7 57 F7 57 F3 58 25 00 A1 - .(..w.V.W.W.X%..
08be48f0 - 00 14 F7 77 F7 98 FB DD E1 10 04 BD F7 5E E1 07 - ...w.........^..
host0:/>

Now set a breakpoint at the address where the function returns:

host0:/> bl

Breakpoint List:

1 : Addr:0x0882AEB8 Inst:0x03E00008 Flags:---

When it is hit, we can see that the data has been fully decompressed.

host0:/> memdump 0x08B39798
- 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f - 0123456789abcdef
-----------------------------------------------------------------------------
08b39798 - 2A EA 2E 2E 5C 64 61 74 61 5C 6B 75 77 61 62 61 - *.../data/kuwaba
08b397a8 - 72 61 5C 6C 6F 61 64 69 6E 67 5C 6E 6F 77 5F 6C - ra/loading/now_l
08b397b8 - 6F 61 64 69 6E 67 30 31 2E 74 6D 32 00 2A DA 2E - oading01.tm2.*..
08b397c8 - 2E 5C 64 61 74 61 5C 6B 75 77 61 62 61 72 61 5C - ./data/kuwabara/
08b397d8 - 6C 6F 61 64 69 6E 67 5C 6E 6F 77 5F 6C 6F 61 64 - loading/now_load
08b397e8 - 69 6E 67 30 32 2E 74 6D 32 00 2A CA 2E 2E 5C 64 - ing02.tm2.*.../d
08b397f8 - 61 74 61 5C 6B 75 77 61 62 61 72 61 5C 6C 6F 61 - ata/kuwabara/loa
08b39808 - 64 69 6E 67 5C 6E 6F 77 5F 6C 6F 61 64 69 6E 67 - ding/now_loading
08b39818 - 30 33 2E 74 6D 32 00 2A BA 2E 2E 5C 64 61 74 61 - 03.tm2.*.../data/
08b39828 - 5C 6B 75 77 61 62 61 72 61 5C 6C 6F 61 64 69 6E - /kuwabara/loadin
08b39838 - 67 5C 6E 6F 77 5F 6C 6F 61 64 69 6E 67 30 34 2E - g/now_loading04.
08b39848 - 74 6D 32 00 2A AA 2E 2E 5C 64 61 74 61 5C 6B 75 - tm2.*.../data/ku
08b39858 - 77 61 62 61 72 61 5C 6C 6F 61 64 69 6E 67 5C 6E - wabara/loading/n
08b39868 - 6F 77 5F 6C 6F 61 64 69 6E 67 30 35 2E 74 6D 32 - ow_loading05.tm2
08b39878 - 00 2A 9A 2E 2E 5C 64 61 74 61 5C 6B 75 77 61 62 - .*.../data/kuwab
08b39888 - 61 72 61 5C 6C 6F 61 64 69 6E 67 5C 6E 6F 77 5F - ara/loading/now_
host0:/> memdump 0x08BE4800
- 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f - 0123456789abcdef
-----------------------------------------------------------------------------
08be4800 - 78 2A EA 2E 2E 5C 64 61 74 61 5C 6B 75 77 61 62 - x*.../data/kuwab
08be4810 - 61 72 61 5C 6C 6F 61 64 69 6E 67 5C 6E 6F 77 5F - ara/loading/now_
08be4820 - C9 00 20 30 31 2E 74 6D 32 00 2A DA 8A D0 02 00 - .. 01.tm2.*.....
08be4830 - 32 D7 02 00 CA 8A D0 02 00 33 D7 02 00 BA 8A D0 - 2........3......
08be4840 - 02 00 34 D7 02 00 AA 8A D0 02 00 35 D7 02 00 9A - ..4........5....
08be4850 - 8A D0 02 00 36 D5 02 00 40 20 00 00 6A 05 00 00 - ....6...@ ..j...
08be4860 - 1C 54 49 4D 32 04 00 01 00 1B 00 04 30 20 99 00 - .TIM2.......0 ..
08be4870 - 81 00 00 30 83 00 14 01 00 01 40 00 40 13 01 00 - ...0......@.@...
08be4880 - 98 63 02 04 60 02 BD 02 F6 1F 00 F6 1F 00 EA 10 - .c..`...........
08be4890 - 00 3C 8C B1 94 D2 F7 DE 18 E3 39 E7 F7 DE B5 D6 - .<........9.....
08be48a0 - CE B9 A6 C1 07 18 31 CA 9B F3 DE FF FE 21 00 13 - ......1......!..
08be48b0 - 00 24 FE FF DE FF DD FB 9C F7 72 CE 86 C1 07 04 - .$........r.....
08be48c0 - 94 D6 A3 07 14 FD FF DC FF 9A FB 2F 00 00 BB 21 - .........../...!
08be48d0 - 01 14 DE FF DE FB D6 DE 6E E1 07 08 BC F7 FF 81 - ........n.......
08be48e0 - 06 28 99 FB 77 F7 56 F7 57 F7 57 F3 58 25 00 A1 - .(..w.V.W.W.X%..
08be48f0 - 00 14 F7 77 F7 98 FB DD E1 10 04 BD F7 5E E1 07 - ...w.........^..

This confirms not only that 0x0882ADD4 is the function we want to analyze, but also that its prototype is:

void lzss_decoder(u8* dst, u8* src, int len);

Moreover, len is the length of the data after decompression, so we call it dst_len.
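One cross-check, offered as an observation rather than something the trace proves: in the entry dump s3 is 0x08BE47F8, which is the load buffer plus 0x38, and the caller passes s3 + 8 as the source pointer. The two little-endian words at file offset 0x38 in the WinHex dump equal s1 (the value passed as dst_len) and s0. The sketch below reads them back; what the second word means is not established here, so it is left unnamed.

```python
import struct

# Bytes 0x30..0x3F of loading.xb, copied from the WinHex dump above.
ROW_0X30 = bytes.fromhex("40200000" "15070020" "0E010000" "5F000000")

BUFFER_BASE = 0x08BE47C0  # sceIoRead destination from the log
S3 = 0x08BE47F8           # s3 in the entry-point register dump

header_offset = S3 - BUFFER_BASE
word0, word1 = struct.unpack_from("<II", ROW_0X30, header_offset - 0x30)

print(hex(header_offset))  # 0x38
print(hex(word0))          # 0x10e, same as a2 / s1 (dst_len)
print(hex(word1))          # 0x5f, same as s0 (meaning not confirmed)
print(hex(S3 + 8))         # 0x8be4800, same as a1 (source pointer)
```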

At this point, save a copy of the current PSP memory for later analysis.

host0:/> savemem 0x08800000 25165824 host0:/leaveLzssDecoder.bin

Disassembly listing

Below is the disassembly of this function, taken from IDA:

=============== S U B R O U T I N E =======================================
.text:0882ADD4
.text:0882ADD4
.text:0882ADD4 sub_882ADD4: # DATA XREF: sub_882A5A8+84o
.text:0882ADD4 # sub_882A860+E8o ...
.text:0882ADD4 addu $v1, $a0, $a2
.text:0882ADD8 sltu $1, $a0, $v1
.text:0882ADDC beqz $1, locret_882AEB8
.text:0882ADE0 nop
.text:0882ADE4 li $t0, 1
.text:0882ADE8 lbu $t1, 0($a1)
.text:0882ADEC
.text:0882ADEC loc_882ADEC: # CODE XREF: sub_882ADD4:loc_882AEB0j
.text:0882ADEC andi $a2, $t1, 3
.text:0882ADF0 bnez $a2, loc_882AE2C
.text:0882ADF4 addiu $a1, 1
.text:0882ADF8 srl $a2, $t1, 2
.text:0882ADFC addiu $a2, 1
.text:0882AE00 blez $a2, loc_882AEAC
.text:0882AE04 addiu $t1, $a2, -1
.text:0882AE08
.text:0882AE08 loc_882AE08: # CODE XREF: sub_882ADD4+48j
.text:0882AE08 lbu $a3, 0($a1)
.text:0882AE0C move $a2, $t1
.text:0882AE10 addiu $t1, -1
.text:0882AE14 sb $a3, 0($a0)
.text:0882AE18 addiu $a1, 1
.text:0882AE1C bgtz $a2, loc_882AE08
.text:0882AE20 addiu $a0, 1
.text:0882AE24 b loc_882AEB0
.text:0882AE28 sltu $a2, $a0, $v1
.text:0882AE2C
# ---------------------------------------------------------------------------
.text:0882AE2C
.text:0882AE2C loc_882AE2C: # CODE XREF: sub_882ADD4+1Cj
.text:0882AE2C andi $a2, $t1, 1
.text:0882AE30 bnel $a2, $t0, loc_882AE5C
.text:0882AE34 lbu $a3, 0($a1)
.text:0882AE38 lbu $a2, 0($a1)
.text:0882AE3C sll $a2, 8
.text:0882AE40 addu $a3, $t1, $a2
.text:0882AE44 ext $a2, $a3, 1, 3
.text:0882AE48 addiu $t1, $a2, 3
.text:0882AE4C srl $a2, $a3, 4
.text:0882AE50 subu $t2, $a0, $a2
.text:0882AE54 b loc_882AE84
.text:0882AE58 addiu $a1, 1
.text:0882AE5C
# ---------------------------------------------------------------------------
.text:0882AE5C
.text:0882AE5C loc_882AE5C: # CODE XREF: sub_882ADD4+5Cj
.text:0882AE5C lbu $a2, 1($a1)
.text:0882AE60 sll $a3, 8
.text:0882AE64 sll $a2, 16
.text:0882AE68 addu $a2, $a3, $a2
.text:0882AE6C addu $a3, $t1, $a2
.text:0882AE70 ext $a2, $a3, 2, 0xA
.text:0882AE74 addiu $t1, $a2, 3
.text:0882AE78 srl $a2, $a3, 12
.text:0882AE7C subu $t2, $a0, $a2
.text:0882AE80 addiu $a1, 2
.text:0882AE84
.text:0882AE84 loc_882AE84: # CODE XREF: sub_882ADD4+80j
.text:0882AE84 move $a2, $t1
.text:0882AE88 blez $a2, loc_882AEAC
.text:0882AE8C addiu $t1, -1
.text:0882AE90
.text:0882AE90 loc_882AE90: # CODE XREF: sub_882ADD4+D0j
.text:0882AE90 lbu $a3, 0($t2)
.text:0882AE94 move $a2, $t1
.text:0882AE98 addiu $t1, -1
.text:0882AE9C sb $a3, 0($a0)
.text:0882AEA0 addiu $t2, 1
.text:0882AEA4 bgtz $a2, loc_882AE90
.text:0882AEA8 addiu $a0, 1
.text:0882AEAC
.text:0882AEAC loc_882AEAC: # CODE XREF: sub_882ADD4+2Cj
.text:0882AEAC # sub_882ADD4+B4j
.text:0882AEAC sltu $a2, $a0, $v1
.text:0882AEB0
.text:0882AEB0 loc_882AEB0: # CODE XREF: sub_882ADD4+50j
.text:0882AEB0 bnezl $a2, loc_882ADEC
.text:0882AEB4 lbu $t1, 0($a1)
.text:0882AEB8
.text:0882AEB8 locret_882AEB8: # CODE XREF: sub_882ADD4+8j
.text:0882AEB8 jr $ra
.text:0882AEBC nop
.text:0882AEBC
# End of function sub_882ADD4
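As a reading aid, here is a line-by-line transcription of the listing into Python. It is a sketch under stated assumptions: the names (`xb_lzss_decode`, `token`, `distance`) are ours, the bounds check on the match distance does not exist in the original code, and the function returns a new buffer instead of writing through a dst pointer. The sample bytes are the first 0x57 bytes shown by `memdump 0x08BE4800` above.

```python
def xb_lzss_decode(src, dst_len):
    """Transcription of sub_882ADD4(u8* dst, u8* src, int dst_len)."""
    out = bytearray()
    pos = 0
    while len(out) < dst_len:      # sltu $a2, $a0, $v1 / bnezl
        ctrl = src[pos]            # lbu $t1, 0($a1)
        pos += 1
        if ctrl & 3 == 0:
            # Literal run: the upper six bits hold (count - 1).
            count = (ctrl >> 2) + 1
            out += src[pos:pos + count]
            pos += count
            continue
        if ctrl & 1:
            # 16-bit token: bits 1-3 = length - 3, bits 4-15 = distance.
            token = ctrl | (src[pos] << 8)
            pos += 1
            length = ((token >> 1) & 0x7) + 3
            distance = token >> 4
        else:
            # 24-bit token: bits 2-11 = length - 3, bits 12-23 = distance.
            token = ctrl | (src[pos] << 8) | (src[pos + 1] << 16)
            pos += 2
            length = ((token >> 2) & 0x3FF) + 3
            distance = token >> 12
        if distance == 0 or distance > len(out):
            # Guard added for this sketch; the original does not check.
            raise ValueError("match points outside the decoded data")
        start = len(out) - distance
        for i in range(length):
            # Byte by byte, so overlapping matches behave like the
            # copy loop at 0x0882AE90.
            out.append(out[start + i])
    return bytes(out)


# Compressed file name list, copied from the memdump of 0x08BE4800.
SAMPLE = bytes.fromhex(
    "782AEA2E2E5C646174615C6B75776162"
    "6172615C6C6F6164696E675C6E6F775F"
    "C9002030312E746D32002ADA8AD00200"
    "32D70200CA8AD0020033D70200BA8AD0"
    "020034D70200AA8AD0020035D702009A"
    "8AD0020036D502"
)

names = xb_lzss_decode(SAMPLE, 0x10E)
print(len(names))   # 270
print(names[2:44])  # b'..\\data\\kuwabara\\loading\\now_loading01.tm2'
```

Like the original loop, the transcription only tests the output length between tokens, so a final token may write past dst_len; in this sample the last match ends exactly at 0x10E bytes.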
