http://pan.baidu.com/s/1boiw3iJ
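I'm sharing the source code: it contains the remote injection (inject) code, the source for method 2 below, and the libsubstrate approach. For GOT patching, just search online; there is plenty of source code around, and all you need is a clear idea of what modifying the GOT actually does.
There are many articles and open-source projects about hooking online, but I honestly couldn't find one that just works. (Not that the authors did anything wrong: the code is old and has limitations in where it applies that the authors never spell out, and copy-pasters simply repost it, so by now most of it is basically unusable.) The approaches fall into two broad categories:
1. Modify the GOT (http://blog.csdn.net/jinzhuojun/article/details/9900105 ; that post shows how to modify the GOT, and my analysis below refers to its code. If you need a GOT hook example, email me; I don't know how to add attachments on CSDN.)
2. Patch the function's machine code directly, for example by placing a jump instruction at its start that branches to your own function.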
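Modifying the GOT
Modifying the GOT has serious limitations: a single hook cannot take effect for the whole process. From how dynamic linking works we know that if a process has three .so files that all call libc's gettimeofday, the mapped segments of each of the three contain their own GOT entry pointing at the virtual address of gettimeofday in libc's mapping in the current process. So we obviously have to replace the entry in all three GOTs. People who don't understand how the GOT and PLT work expect to find a gettimeofday entry in libc's own GOT, but libc defines that function itself rather than importing it, so no such entry exists.
Just pull system/lib/libc.so off an Android phone and look at it: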
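To make the "one GOT slot per .so" point concrete, here is a small simulation in plain C. It does not touch any real GOT: `struct module`, `got_hook_one` and `got_hook_all` are hypothetical names invented for this sketch, and each module's single function pointer merely stands in for that module's gettimeofday slot.

```c
#include <assert.h>
#include <stddef.h>

/* Simulation only: each "module" owns a private function-pointer slot,
 * standing in for the per-.so GOT entry described above. */
typedef long (*time_fn)(void);

static long real_time(void)   { return 1000; }  /* stands in for libc's gettimeofday */
static long hooked_time(void) { return 2000; }  /* stands in for our replacement */

struct module {
    const char *name;
    time_fn got_gettimeofday;   /* this module's own slot */
};

/* A call made by a module always goes through that module's slot. */
static long module_call_time(const struct module *m)
{
    assert(m != NULL && m->got_gettimeofday != NULL);
    return m->got_gettimeofday();
}

/* Patch exactly one module's slot; returns the previous value. */
static time_fn got_hook_one(struct module *m, time_fn replacement)
{
    time_fn old = m->got_gettimeofday;
    m->got_gettimeofday = replacement;
    return old;
}

/* Patch every module; returns how many slots were rewritten. */
static size_t got_hook_all(struct module *mods, size_t n, time_fn replacement)
{
    size_t i, patched = 0;
    for (i = 0; i < n; i++) {
        if (mods[i].got_gettimeofday != replacement) {
            got_hook_one(&mods[i], replacement);
            patched++;
        }
    }
    return patched;
}
```

Patching one module leaves the other two calling `real_time`; only a pass over all of them, as in `got_hook_all`, redirects every caller.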
arm-linux-androideabi-objdump.exe -d libc.so > libc.code
The command above disassembles all the executable code in libc:
Disassembly of section .plt:
0000c968 < dlopen@plt-0x14>:
c968: e52de004 push {lr} ; (str lr, [sp, #-4]!)
c96c: e59fe004 ldr lr, [pc, #4] ; c978 < dlopen@plt-0x4>
c970: e08fe00e add lr, pc, lr
c974: e5bef008 ldr pc, [lr, #8]!
c978: 0005b664 andeq fp, r5, r4, ror #12
0000c97c < dlopen@plt>:
c97c: e28fc600 add ip, pc, #0, 12
c980: e28cca5b add ip, ip, #372736 ; 0x5b000
c984: e5bcf664 ldr pc, [ip, #1636]! ; 0x664
0000c988 < dlsym@plt>:
c988: e28fc600 add ip, pc, #0, 12
c98c: e28cca5b add ip, ip, #372736 ; 0x5b000
c990: e5bcf65c ldr pc, [ip, #1628]! ; 0x65c
0000c994 < dlclose@plt>:
c994: e28fc600 add ip, pc, #0, 12
c998: e28cca5b add ip, ip, #372736 ; 0x5b000
c99c: e5bcf654 ldr pc, [ip, #1620]! ; 0x654
0000c9a0 < dl_unwind_find_exidx@plt>:
c9a0: e28fc600 add ip, pc, #0, 12
c9a4: e28cca5b add ip, ip, #372736 ; 0x5b000
c9a8: e5bcf64c ldr pc, [ip, #1612]! ; 0x64c
0000c9ac < __cxa_begin_cleanup@plt>:
c9ac: e28fc600 add ip, pc, #0, 12
c9b0: e28cca5b add ip, ip, #372736 ; 0x5b000
c9b4: e5bcf644 ldr pc, [ip, #1604]! ; 0x644
0000c9b8 < __cxa_type_match@plt>:
c9b8: e28fc600 add ip, pc, #0, 12
c9bc: e28cca5b add ip, ip, #372736 ; 0x5b000
c9c0: e5bcf63c ldr pc, [ip, #1596]! ; 0x63c
//// The above are the PLT entries in libc; gettimeofday is nowhere among them.
00021368 <gettimeofday>:
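Each three-instruction stub in the dump computes the address of the GOT slot it jumps through. The sketch below redoes that arithmetic; `plt_got_slot` is a hypothetical helper name, and the results are file-relative addresses (the load base still has to be added in a running process).

```c
#include <assert.h>
#include <stdint.h>

/* In ARM state, reading pc yields the address of the current instruction + 8. */
#define ARM_PC_BIAS 8u

/* GOT slot used by a PLT stub of the form
 *   add ip, pc, #0 ; add ip, ip, #hi ; ldr pc, [ip, #lo]!
 * stub_addr is the address of the first instruction of the stub. */
static uint32_t plt_got_slot(uint32_t stub_addr, uint32_t hi, uint32_t lo)
{
    assert((stub_addr & 3u) == 0);   /* ARM instructions are word-aligned */
    return stub_addr + ARM_PC_BIAS + hi + lo;
}
```

For the dlopen stub at 0xc97c this gives 0xc984 + 0x5b000 + 0x664 = 0x67fe8, and the dlsym stub lands on the next word, 0x67fec. These are the slots a GOT hook would overwrite for a module that imports those functions.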
21368: e92d0090 push {r4, r7}
2136c: e3a0704e mov r7, #78 ; 0x4e
21370: ef000000 svc 0x00000000
21374: e8bd0090 pop {r4, r7}
21378: e1b00000 movs r0, r0
2137c: 512fff1e bxpl lr
21380: ea0075ef b 3eb44 <__set_errno+0x1c>
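//// This is libc's implementation of gettimeofday. It is really just a system call: r7 carries the syscall number, and bxpl lr returns if the result is not negative. Otherwise execution continues with b 3eb44 <__set_errno+0x1c>, which jumps to __set_errno to set the error code, i.e. the global (strictly speaking, thread-local) errno we use in C.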
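The tail of that stub (movs / bxpl / b __set_errno) follows a simple convention that can be written out in C. This is only an illustration of the convention, not bionic's source; `syscall_result_to_libc` is a hypothetical name.

```c
#include <errno.h>

/* Mirrors the end of the stub: a non-negative raw kernel result is
 * returned unchanged (the bxpl lr path); a negative one is converted
 * into errno plus a -1 return (the __set_errno path). */
static long syscall_result_to_libc(long raw)
{
    if (raw >= 0)
        return raw;
    errno = (int)-raw;
    return -1;
}
```

A raw result of -EINVAL therefore shows up to the C caller as a return value of -1 with errno set to EINVAL.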
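So only by modifying the GOT entry of the .so that calls gettimeofday can you hook that .so's calls to gettimeofday. The limitation is obvious: you have to work out which .so files in the target process call gettimeofday and then patch each one's GOT in the process, which is a real pain. In addition, where the linker binds PLT entries lazily, the GOT entry is only resolved on the first call to gettimeofday. Before you locate the gettimeofday entry by comparing addresses in the GOT, as that blog post does, you should first call gettimeofday once from the target .so yourself so that the GOT holds the real address; only then can you find the symbol correctly. PS: only a .so compiled as PIC uses the GOT/PLT mechanism, e.g. one built with gcc's -fPIC option; a .so built with ndk-build is compiled as PIC by default.
If you need to hook inside a .so but aren't comfortable with ELF, dynamic linking, assembly and so on, I suggest just using libsubstrate.so (search for it yourself). It is maintained by a well-known overseas jailbreak team, is dead simple to use, and also takes care of the multithreading issues during hooking. PS: libsubstrate certainly does not modify the GOT. It directly patches the code of gettimeofday as mapped in the current process and uses a jump instruction to branch to the local function. Of course the real implementation is not that simple and has to consider many more things.
Here is my own code: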
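The "call it once before searching" advice can be illustrated with a toy model of lazy binding. Everything here is simulated: `plt_resolver_stub`, `find_got_slot` and `simulate_first_call` are hypothetical names, and the array stands in for a module's GOT.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*fn_t)(void);

static int real_gettimeofday(void) { return 1; }  /* stand-in for the resolved libc function */
static int plt_resolver_stub(void) { return 0; }  /* stand-in for the lazy-binding resolver */

/* Scan a table of slots for one that currently holds `target`.
 * Returns a pointer to the slot, or NULL when no slot matches. */
static fn_t *find_got_slot(fn_t *got, size_t count, fn_t target)
{
    size_t i;
    assert(got != NULL);
    for (i = 0; i < count; i++)
        if (got[i] == target)
            return &got[i];
    return NULL;
}

/* What the first call through a lazily bound slot does: the resolver
 * overwrites the slot with the real address. */
static void simulate_first_call(fn_t *slot)
{
    if (*slot == plt_resolver_stub)
        *slot = real_gettimeofday;
}
```

Searching by address fails while the slot still points at the resolver and succeeds once the first call has rewritten it, which is why the search must come after a call.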
#define _GNU_SOURCE
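//the header names were lost from the original listing; these are the headers the code below needs
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <strings.h>
#include <errno.h>
#include <unistd.h>
#include <dlfcn.h>
#include <pthread.h>
#include <time.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <android/log.h>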
#include "command.h"
#undef log
#define DEBUG_INFO 1
#if DEBUG_INFO
#define log(...) __android_log_print(ANDROID_LOG_DEBUG, "HOOK_YY", __VA_ARGS__);
#else
#define log(...)
#endif
#define INVALID_FUNC_ADDR NULL
#define HOOK_FUNC_COUNT 2
#define COMMAND_PORT 32455 //default port; incremented automatically until a usable one is found
#define MAX_COMMAND_SIZE 1024
//orig function copy
int (*gettimeofday_orig)(struct timeval *tv, struct timezone *tz) = INVALID_FUNC_ADDR;
int (*clock_gettime_orig)(clockid_t clk_id, struct timespec *tp) = INVALID_FUNC_ADDR;
//local variables
double time_velocity = 1.0;
int listenfd, connectfd, suc_port = 0;
struct sockaddr_in server;
struct sockaddr_in client;
socklen_t addrlen;
pthread_t thread;
uint64_t g_gettimeofday_saved_usecs = 0;
uint64_t g_last_returned_max_real_usecs = 0;
uint64_t g_last_returned_max_usecs = 0;
//local function
int gettimeofday_local(struct timeval *tv, struct timezone *tz) {
int64_t diff;
int64_t cur_usecs;
int64_t ret_usecs;
int ret = gettimeofday_orig(tv, tz);
if (0 != ret) {
return ret;
}
if (g_gettimeofday_saved_usecs == 0) {
g_gettimeofday_saved_usecs = (tv->tv_sec * 1000000LL) + tv->tv_usec;
// g_last_returned_max_real_usecs = g_gettimeofday_saved_usecs;
// g_last_returned_max_usecs = g_gettimeofday_saved_usecs;
} else {
cur_usecs = (tv->tv_sec * 1000000LL) + tv->tv_usec;
diff = cur_usecs - g_gettimeofday_saved_usecs;
diff = diff * time_velocity;
ret_usecs = g_gettimeofday_saved_usecs + diff;
// if (ret_usecs < g_last_returned_max_usecs)
// {
// log("!!!time error1!!!\n")
// ret_usecs = g_last_returned_max_usecs + 1000;
// g_last_returned_max_usecs = ret_usecs;
// }
// else {
// g_last_returned_max_real_usecs = cur_usecs;
// g_last_returned_max_usecs = ret_usecs;
// }
tv->tv_sec = (time_t)(ret_usecs / 1000000LL);
tv->tv_usec = (suseconds_t)(ret_usecs % 1000000LL);
}
return ret;
}
int clock_gettime_local(clockid_t clk_id, struct timespec *tp) {
int ret = clock_gettime_orig(clk_id, tp);
if (0 == ret) {
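tp->tv_sec *= time_velocity;
int64_t nsec = (int64_t)(tp->tv_nsec * time_velocity);
//carry any overflow into tv_sec so that tv_nsec stays below one second
tp->tv_sec += nsec / 1000000000LL;
tp->tv_nsec = nsec % 1000000000LL;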
}
//log("clock_gettime:%d", ret);
return ret;
}
void *hook_params[HOOK_FUNC_COUNT][4] = {\
{"/system/lib/libc.so", "gettimeofday", gettimeofday_local, &gettimeofday_orig},\
{"/system/lib/libc.so", "clock_gettime", clock_gettime_local, &clock_gettime_orig}\
};
void hookSubstrate(const char *dlLocation) {
void (*MSHookFunction)(void *symbol, void *replace, void **result) = INVALID_FUNC_ADDR;
log("hookSubstrate:%s\n", dlLocation);
void *substrate_sub = dlopen(dlLocation, RTLD_NOW);
if (!substrate_sub) {
log("open libsubstrate.so fail:%s\n", dlLocation);
return;
}
MSHookFunction = dlsym(substrate_sub, "MSHookFunction");
if (INVALID_FUNC_ADDR == MSHookFunction) {
log("can't find MSHookFunction\n");
return;
}
//hook now
int i;
//note: the bound HOOK_FUNC_COUNT - 1 skips the last entry, so only gettimeofday is hooked here
for (i = 0; i < HOOK_FUNC_COUNT - 1; i++) {
char *target_lib = hook_params[i][0];
char *target_func = hook_params[i][1];
void *local_func = hook_params[i][2];
void *orig_func = hook_params[i][3];
log ("start hook func(%s) in lib(%s), local(%p), orig(%p)", target_func, target_lib, local_func, orig_func);
void *target_lib_sub = dlopen(target_lib, RTLD_NOW);
if (!target_lib_sub) {
log("can't find lib(%s) to hook\n", target_lib);
continue;
}
void *target_func_symbol = dlsym(target_lib_sub, target_func);
if (!target_func_symbol) {
log("can't find func(%s) in (%s)", target_func, target_lib);
dlclose(target_lib_sub);
continue;
}
MSHookFunction(target_func_symbol, local_func, (void **)orig_func);
dlclose(target_lib_sub);
}
dlclose(substrate_sub);
}
void handleCommand(char *command, size_t command_size) {
char param[250];
int param_size;
int command_type = parseCommand(command, command_size, param, 250, param_size);
log("handleCommand:%d param:%s\n", command_type, param);
switch (command_type) {
case Command_Set_Time_Velocity :
if (param) {
double new_time_velocity = strtod(param, NULL);
if (new_time_velocity < 1) {
//g_gettimeofday_saved_usecs = g_last_returned_max_real_usecs;
}
time_velocity = new_time_velocity;
log("set time velocity2:%f\n", time_velocity);
}
break;
default :
log("invalid command:%d", command_type);
break;
}
}
//listener
int acceptClient(void) {
//keep one client alive
if (0 != connectfd) {
//alive
return 1;
}
log("server accept client\n");
addrlen = sizeof(client);
if((connectfd = accept(listenfd,(struct sockaddr*)&client,&addrlen)) == -1) {
log("accept()error\n");
return 0;
}
log("Got a connection from client ip %s, port %d\n",inet_ntoa(client.sin_addr), ntohs(client.sin_port));
return 1;
}
void listenerRun(void) {
//just keep one client in the queue
if(listen(listenfd,1) == -1) {
log("listen error:%d\n", errno);
thread = 0;
return;
}
log("start listen at ip:%s, port:%d\n", inet_ntoa(server.sin_addr), ntohs(server.sin_port))
char command_buf[MAX_COMMAND_SIZE];
int command_size = 0;
//read command
while (1) {
if (1 == acceptClient()) {
if ((command_size = recv(connectfd, command_buf, MAX_COMMAND_SIZE - 1, 0)) == -1) {
log("recv error:%d\n", errno);
close(connectfd);
connectfd = 0;
}
else if (command_size > 0) {
command_buf[command_size] = '\0';
log("server received command:%s\n", command_buf)
handleCommand(command_buf, command_size);
}
else {
log("client closed\n");
close(connectfd);
connectfd = 0;
}
}
else {
sleep(1);
}
}
}
int startListener() {
//bind port firstly to report valid port for caller
if((listenfd = socket(AF_INET, SOCK_STREAM, 0)) == -1) {
log("Creating socket failed\n");
thread = 0;
return 0;
}
//reusable port
int opt = 1;
setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
int index;
for (index = 0;index < 100; index++) {
bzero(&server,sizeof(server));
server.sin_family = AF_INET;
server.sin_port = htons(COMMAND_PORT + index);
server.sin_addr.s_addr = htonl (INADDR_ANY);
if(bind(listenfd, (struct sockaddr *)&server, sizeof(server)) == -1) {
log("Binderror:(port:%d, error:%d)\n", COMMAND_PORT + index, errno);
}
else {
suc_port = COMMAND_PORT + index;
break;
}
}
if (suc_port == 0) {
log("can't find usable port to listen\n");
thread = 0;
return 0;
}
//create sub thread
int ret = pthread_create(&thread, NULL, (void*)listenerRun, NULL);
if (ret !=0)
{
log("create listener thread fail\n");
}
return suc_port;
}
/*
hook entry, param a must point where the so located
*/
int hook_entry(const char * a){
if (thread != 0) {
log("hook_entry already called before, serverPort:%d\n", suc_port);
return suc_port;
}
const char *process_name = a;
log("Hook start------(pid:%d)\n", getpid());
int port = startListener();
hookSubstrate(a);
log("Hook end------port:%d\n", port)
return port;
}
You only need to look at the function void hookSubstrate(const char *dlLocation); dlLocation is the path to libsubstrate.so.
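PS: I don't recommend reading too much of the hook code found online. Until you understand the ELF file format, how ELF files are loaded, how dynamic linking works, and other basics, reading that code is pointless. I'll explain how substrate is implemented in the second approach below; if you're interested, write the code yourself, or ask me for a copy. But I'm not very familiar with assembly or with the ARM/x86 ABIs, so the assembly I wrote may have problems, and I take no responsibility for it.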
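Patching instructions directly:
I'm not that familiar with assembly. There is an open-source project on GitHub called adbi that hooks by patching instructions, but its code is old and has some problems: multithreading issues, awkward usage, no i386 support, and so on. Here I'll explain the key functions and, along the way, how to build a complete fix on top of its code.
The four key functions:
1. hook: performs the initial code overwrite and saves the data needed later
2. hook_precall: writes the original instructions back
3. hook_postcall: writes the hook instructions back
4. hook_cacheflush: a system call that flushes the CPU instruction cache
Details below: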
int hook(struct hook_t *h, int pid, char *libname, char *funcname, void *hook_arm, void *hook_thumb)
{
unsigned long int addr;
int i;
//find the address of the function to hook
if (find_name(pid, funcname, libname, &addr) < 0) {
log("can't find: %s\n", funcname)
return 0;
}
log("hooking: %s = 0x%lx ", funcname, addr)
strncpy(h->name, funcname, sizeof(h->name)-1);
if (addr % 4 == 0) {
log("ARM using 0x%lx\n", (unsigned long)hook_arm)
h->thumb = 0;
h->patch = (unsigned int)hook_arm;
h->orig = addr;
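//LDR pc, [pc, #0]: in ARM state pc reads as the current instruction + 8, so this loads h->jump[2] into pc and the branch happens (h->jump[1] is only padding);
//the local function has the same signature as the hooked one, so it receives the arguments in registers and on the stack unchanged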
h->jump[0] = 0xe59ff000;
h->jump[1] = h->patch;
h->jump[2] = h->patch;
//save the original instructions; they are written back later to call the original implementation
for (i = 0; i < 3; i++)
h->store[i] = ((int*)h->orig)[i];
//overwrite the instructions
for (i = 0; i < 3; i++)
((int*)h->orig)[i] = h->jump[i];
}
else {
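//Thumb works on the same principle, except that 8 Thumb instructions are patched; I'm less familiar with the ABI in Thumb mode
//and haven't looked into it in depth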
if ((unsigned long int)hook_thumb % 4 == 0)
log("warning hook is not thumb 0x%lx\n", (unsigned long)hook_thumb)
h->thumb = 1;
log("THUMB using 0x%lx\n", (unsigned long)hook_thumb)
h->patch = (unsigned int)hook_thumb;
h->orig = addr;
h->jumpt[1] = 0xb4;
h->jumpt[0] = 0x60; // push {r5,r6}
h->jumpt[3] = 0xa5;
h->jumpt[2] = 0x03; // add r5, pc, #12
h->jumpt[5] = 0x68;
h->jumpt[4] = 0x2d; // ldr r5, [r5]
h->jumpt[7] = 0xb0;
h->jumpt[6] = 0x02; // add sp,sp,#8
h->jumpt[9] = 0xb4;
h->jumpt[8] = 0x20; // push {r5}
h->jumpt[11] = 0xb0;
h->jumpt[10] = 0x81; // sub sp,sp,#4
h->jumpt[13] = 0xbd;
h->jumpt[12] = 0x20; // pop {r5, pc}
h->jumpt[15] = 0x46;
h->jumpt[14] = 0xaf; // mov pc, r5 ; just to pad to 4 byte boundary
memcpy(&h->jumpt[16], (unsigned char*)&h->patch, sizeof(unsigned int));
unsigned int orig = addr - 1; // sub 1 to get real address
for (i = 0; i < 20; i++) {
h->storet[i] = ((unsigned char*)orig)[i];
//log("%0.2x ", h->storet[i])
}
//log("\n")
for (i = 0; i < 20; i++) {
((unsigned char*)orig)[i] = h->jumpt[i];
//log("%0.2x ", ((unsigned char*)orig)[i])
}
}
//flush the CPU instruction cache
hook_cacheflush((unsigned int)h->orig, (unsigned int)h->orig+sizeof(h->jumpt));
return 1;
}
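The ARM branch of hook() relies on how `ldr pc, [pc, #0]` addresses its literal. The sketch below spells that out; `ldr_pc_literal_word` and `build_arm_jump` are hypothetical helper names, and only the positive-offset `ldr pc, [pc, #imm12]` form is handled.

```c
#include <assert.h>
#include <stdint.h>

#define ARM_LDR_PC_PC_0 0xe59ff000u  /* ldr pc, [pc, #0] */

/* Index, in 32-bit words counted from the ldr itself, of the literal that
 * an ARM-state "ldr pc, [pc, #imm12]" (positive offset) loads. */
static unsigned ldr_pc_literal_word(uint32_t insn)
{
    assert((insn & 0xfffff000u) == ARM_LDR_PC_PC_0);  /* only this form */
    return (8u + (insn & 0xfffu)) / 4u;               /* pc reads as insn + 8 */
}

/* Build the three-word jump stub written over the start of the target. */
static void build_arm_jump(uint32_t stub[3], uint32_t target)
{
    stub[0] = ARM_LDR_PC_PC_0;
    stub[1] = target;   /* skipped by the load; filled in anyway */
    stub[2] = target;   /* the word actually loaded into pc */
}
```

For 0xe59ff000 the literal is word 2, which is why the stub stores the hook address there and word 1 is never read. A two-word stub is also possible with `ldr pc, [pc, #-4]` (0xe51ff004), whose literal is the word immediately after the instruction.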
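//The __ARM_NR_cacheflush system call: the CPU caches instructions, so after modifying the instructions in an address range we must
//flush the cache for that range with this system call, otherwise the change may not take effect
//see Android\bionic\libc\arch-mips\bionic\cacheflush.c in the Android source for the meaning of the parameters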
void inline hook_cacheflush(unsigned int begin, unsigned int end)
{
//r0, r1 and r2 carry the first three arguments; r7 holds the syscall number
const int syscall = 0xf0002;
__asm __volatile (
"mov r0, %0\n"
"mov r1, %1\n"
"mov r7, %2\n"
"mov r2, #0x0\n"
"svc 0x00000000\n"
:
: "r" (begin), "r" (end), "r" (syscall)
: "r0", "r1", "r2", "r7", "memory"
);
}
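As an alternative to hand-written inline assembly, GCC and Clang provide the builtin `__builtin___clear_cache`, which does whatever the target needs to make the instruction cache coherent for a range. This is not what adbi does; it is a sketch of a more portable route, and `write_code_and_flush` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `len` bytes of new code over `dst`, then ask the compiler runtime to
 * make the instruction cache coherent for exactly that range.
 * The caller must already have made `dst` writable. */
static void write_code_and_flush(void *dst, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);
    __builtin___clear_cache((char *)dst, (char *)dst + len);
}
```

On targets with coherent instruction caches the builtin does nothing, so the same source builds everywhere without per-architecture syscall numbers.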
//write the original instructions back
void hook_precall(struct hook_t *h)
{
int i;
if (h->thumb) {
unsigned int orig = h->orig - 1;
for (i = 0; i < 20; i++) {
((unsigned char*)orig)[i] = h->storet[i];
}
}
else {
for (i = 0; i < 3; i++)
((int*)h->orig)[i] = h->store[i];
}
hook_cacheflush((unsigned int)h->orig, (unsigned int)h->orig+sizeof(h->jumpt));
}
//write the patched jump instructions back
void hook_postcall(struct hook_t *h)
{
int i;
if (h->thumb) {
unsigned int orig = h->orig - 1;
for (i = 0; i < 20; i++)
((unsigned char*)orig)[i] = h->jumpt[i];
}
else {
for (i = 0; i < 3; i++)
((int*)h->orig)[i] = h->jump[i];
}
hook_cacheflush((unsigned int)h->orig, (unsigned int)h->orig+sizeof(h->jumpt));
}
Calling it looks roughly like this:
int gettimeofday_new(struct timeval *tv, struct timezone *tz) {
//old gettimeofday
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_lock(&mutex);
int (*gettimeofday_old)(struct timeval *tv, struct timezone *tz);
gettimeofday_old = (void*)eph2.orig;
log("gettimeofday_new begin %d\n", gettid());
hook_precall(&eph2);
int ret = gettimeofday_old(tv, tz);
if (ret == 0) {
tv->tv_sec *= velocity;
tv->tv_usec *= velocity;
}
hook_postcall(&eph2);
log("gettimeofday_end end %d\n", gettid());
pthread_mutex_unlock(&mutex);
return ret;
}
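As you can see, to call the original implementation we write the original instructions back with hook_precall, make the call, and then write the hook instructions back with hook_postcall. I take a lock inside this function so that hook_precall and hook_postcall don't each need their own lock. Note that while the original instructions are restored, a call from another thread enters the unpatched function directly and never reaches the hook or its lock.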
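The restore / call / re-patch cycle can be tried out safely on an ordinary byte buffer instead of live code. In this sketch `struct patch` and the `patch_*` functions are hypothetical stand-ins for hook_t, hook, hook_precall and hook_postcall; no cache flush is needed because nothing here is executed.

```c
#include <assert.h>
#include <string.h>

#define PATCH_LEN 12   /* three ARM words, as in the ARM branch of hook() */

struct patch {
    unsigned char *site;              /* where the bytes were overwritten */
    unsigned char saved[PATCH_LEN];   /* original bytes */
    unsigned char jump[PATCH_LEN];    /* bytes of the jump stub */
};

/* First-time install: remember the original bytes, then overwrite them. */
static void patch_install(struct patch *p, unsigned char *site,
                          const unsigned char *jump)
{
    assert(p != NULL && site != NULL && jump != NULL);
    p->site = site;
    memcpy(p->saved, site, PATCH_LEN);
    memcpy(p->jump, jump, PATCH_LEN);
    memcpy(site, jump, PATCH_LEN);
}

/* Before calling the original: put the original bytes back. */
static void patch_precall(struct patch *p)
{
    memcpy(p->site, p->saved, PATCH_LEN);
}

/* After calling the original: put the jump stub back. */
static void patch_postcall(struct patch *p)
{
    memcpy(p->site, p->jump, PATCH_LEN);
}
```

Between patch_precall and patch_postcall the site holds the original bytes, which is exactly the window in which the hook is not in effect for other callers.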
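Analyzing substrate's implementation:
This way of calling is clearly cumbersome. The hook interface of Cydia's libsubstrate presumably also works by overwriting instructions, yet it needs none of this. Look at its interface:
MSHookFunction(target_func_symbol, local_func, (void **)orig_func); //arguments: address of the target function, address of the local function, and the address of the original function as returned by substrate
Afterwards, calling the original implementation only takes a call through orig_func. Clearly orig_func and target_func_symbol are not the same address, otherwise we would have to do what our example above does. Logging confirms that they differ: substrate provides a separate entry point for running the original instructions. My guess at the implementation:
1. Allocate a block of memory and write the replaced instructions into it, followed by an instruction that sets pc to the instruction right after the overwritten region of the original function. This runs the original function's logic without having to rewrite that region on every call. (It only works as-is when the copied instructions do not depend on their own address; PC-relative instructions would have to be fixed up.)
Here is the code: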
int hook_lwp(int pid, char *libname, char *funcname, void *hook_arm, void **origFunc) {
unsigned long int addr;
struct hook_t hookT;
struct hook_t *h = &hookT;
int i;
if (find_name(pid, funcname, libname, &addr) < 0) {
log("can't find: %s\n", funcname)
return 0;
}
log("LWP2\n")
log("hooking: %s = 0x%lx ", funcname, addr)
strncpy(h->name, funcname, sizeof(h->name)-1);
if (addr % 4 == 0) {
log("ARM using 0x%lx\n", (unsigned long)hook_arm)
h->thumb = 0;
h->patch = (unsigned int)hook_arm;
h->orig = addr;
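//LDR pc, [pc, #0]: in ARM state pc reads as the current instruction + 8, so this loads h->jump[2] into pc and the branch happens (h->jump[1] is only padding);
//the local function has the same signature as the hooked one, so it receives the arguments in registers and on the stack unchanged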
h->jump[0] = 0xe59ff000;
h->jump[1] = h->patch;
h->jump[2] = h->patch;
for (i = 0; i < 3; i++)
h->store[i] = ((int*)h->orig)[i];
for (i = 0; i < 3; i++)
((int*)h->orig)[i] = h->jump[i];
//mmap memory to replace origFunc entry
void *mmap_base = mmap(0, 0x20, PROT_READ | PROT_WRITE | PROT_EXEC, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
if (mmap_base == MAP_FAILED) {
log("mmap failed:%d\n", errno)
return 0;
}
log("mmap_base:0x%lx\n", (unsigned long)mmap_base);
//write the instructions
int i;
for (i = 0; i < 3; ++i)
{
((int*)mmap_base)[i] = h->store[i];
}
//write the jump into the allocated region so that it continues at the fourth instruction of the hooked function
((int*)mmap_base)[3] = 0xe59ff000;
((int*)mmap_base)[4] = h->orig + 12;
((int*)mmap_base)[5] = h->orig + 12;
*origFunc = mmap_base;
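log("origFunc:0x%lx, jump:0x%lx\n", (unsigned long)*origFunc, (unsigned long)(h->orig + 12));
//flush the instruction cache for both the patched function and the new trampoline
hook_cacheflush((unsigned int)h->orig, (unsigned int)h->orig + 12);
hook_cacheflush((unsigned int)mmap_base, (unsigned int)mmap_base + 24);
}
return 1;
}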
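//I have tested the code above myself. Since memory no longer has to be rewritten on every call, it also works under multithreading. One thing to keep in mind is the alignment of the address the instructions are written to: mmap returns page-aligned addresses, so mmap_base is always divisible by 4, but if the buffer comes from some other allocator you should check the address and, if it is not divisible by 4, offset it by a few bytes to align it before writing the instructions.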
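The alignment question and the trampoline layout can both be checked on any Linux or Android host, without ARM. In this sketch `alloc_trampoline_page` and `fill_trampoline` are hypothetical helper names; the page is mapped read/write only, so a real hook would additionally need it executable (for example via mprotect after writing) and would flush the instruction cache.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define ARM_LDR_PC_PC_0 0xe59ff000u  /* ldr pc, [pc, #0] */

/* Map one anonymous page to hold a trampoline. Returns NULL on failure. */
static uint32_t *alloc_trampoline_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    /* mmap hands back page-aligned memory, hence also 4-byte aligned. */
    assert(((uintptr_t)p % (uintptr_t)page) == 0);
    return (uint32_t *)p;
}

/* Layout: the three saved instructions, then a jump to orig + 12,
 * i.e. to the fourth instruction of the hooked function. */
static void fill_trampoline(uint32_t *t, const uint32_t saved[3], uint32_t orig)
{
    int i;
    for (i = 0; i < 3; i++)
        t[i] = saved[i];
    t[3] = ARM_LDR_PC_PC_0;
    t[4] = orig + 12;   /* skipped by the load */
    t[5] = orig + 12;   /* loaded into pc */
}
```

The six words are the same ones hook_lwp writes into mmap_base; separating allocation from layout makes it easy to test the layout on a desktop machine before running it on a device.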