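1. Preface
Recently a batch of new crashes showed up in production, of type SEGV_ACCERR. At first glance this looked like a dangling pointer to an object, which is almost always caused by unsynchronized reads and writes across threads.
A closer look at the crash stack, however, showed that the top frame was not the usual objc_xxxx but cache_getImp. That is a little odd, and differs from the usual crashes in objc_msgSend or objc_retain.
The business code where the crash occurs is: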
[delegate performSelector:didReceiveDataSelector withObject:self withObject:data];
The actual crash stack is:
Thread 0 Crashed:
0 libobjc.A.dylib 0x00000001837bcd04 _cache_getImp + 4
1 libobjc.A.dylib 0x00000001837b1900 _lookUpImpOrNil + 12
2 libobjc.A.dylib 0x00000001837a7578 class_respondsToSelector + 32
3 CoreFoundation 0x00000001845f41d8 ____forwarding___ + 372
4 CoreFoundation 0x00000001844da41c _CF_forwarding_prep_0 + 80
5 mttlite 0x0000000102cd94f4 -[QBASIHTTPRequest passOnReceivedData:] (QBASIHTTPRequest.m:2111)
6 Foundation 0x000000018503a0ec ___NSThreadPerformPerform + 340
7 CoreFoundation 0x0000000184597404 ___CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24
8 CoreFoundation 0x0000000184596c2c ___CFRunLoopDoSources0 + 276
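2. Analysis
This is yet another hard-to-reproduce problem, so the only option was to work backwards from the crash stack.
I pulled up the objc source to read alongside it.
The crashing code is in _cache_getImp, which is actually written in assembly (Apple implements this function directly in assembly for performance).
The corresponding objc source, which calls into it, is: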
IMP lookUpImpOrNil(Class cls, SEL sel, id inst,
                   bool initialize, bool cache, bool resolver)
{
    IMP imp = lookUpImpOrForward(cls, sel, inst, initialize, cache, resolver);
    if (imp == _objc_msgForward_impcache) return nil;
    else return imp;
}
IMP lookUpImpOrForward(Class cls, SEL sel, id inst,
                       bool initialize, bool cache, bool resolver)
{
    Class curClass;
    IMP imp = nil;
    Method meth;
    bool triedResolver = NO;
    runtimeLock.assertUnlocked();
    // Optimistic cache lookup
    if (cache) {
        imp = cache_getImp(cls, sel); // implemented in assembly; the crash happens here
        if (imp) return imp;
    }
    // ... rest of the function omitted
}
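cache_getImp takes two arguments, cls and sel. Stepping into it instruction by instruction eventually leads to the assembly from the crash stack, shown below.
Here x0 is the cls argument of the C code above and x1 is sel.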
libobjc.A.dylib`cache_getImp:
-> 0x182a70d00 <+0>: and x16, x0, #0xffffffff8
0x182a70d04 <+4>: ldp x10, x11, [x16, #0x10]
0x182a70d08 <+8>: and w12, w1, w11
0x182a70d0c <+12>: add x12, x10, x12, lsl #4
0x182a70d10 <+16>: ldp x9, x17, [x12]
0x182a70d14 <+20>: cmp x9, x1
0x182a70d18 <+24>: b.ne 0x182a70d24 ; <+36>
0x182a70d1c <+28>: mov x0, x17
0x182a70d20 <+32>: ret
0x182a70d24 <+36>: cbz x9, 0x182a70d68 ; <+104>
0x182a70d28 <+40>: cmp x12, x10
0x182a70d2c <+44>: b.eq 0x182a70d38 ; <+56>
0x182a70d30 <+48>: ldp x9, x17, [x12, #-0x10]!
0x182a70d34 <+52>: b 0x182a70d14 ; <+20>
0x182a70d38 <+56>: add x12, x12, w11, uxtw #4
0x182a70d3c <+60>: ldp x9, x17, [x12]
0x182a70d40 <+64>: cmp x9, x1
0x182a70d44 <+68>: b.ne 0x182a70d50 ; <+80>
0x182a70d48 <+72>: mov x0, x17
0x182a70d4c <+76>: ret
0x182a70d50 <+80>: cbz x9, 0x182a70d68 ; <+104>
0x182a70d54 <+84>: cmp x12, x10
0x182a70d58 <+88>: b.eq 0x182a70d64 ; <+100>
0x182a70d5c <+92>: ldp x9, x17, [x12, #-0x10]!
0x182a70d60 <+96>: b 0x182a70d40 ; <+64>
0x182a70d64 <+100>: b 0x182a70d68 ; <+104>
0x182a70d68 <+104>: mov x0, #0x0
0x182a70d6c <+108>: ret
0x182a70d70 <+112>: nop
0x182a70d74 <+116>: nop
0x182a70d78 <+120>: nop
0x182a70d7c <+124>: nop
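To make the listing above easier to follow, here is a rough C reading of its control flow from <+8> onward. It is only a sketch under assumptions: the type and function names (lookup_bucket_t, lookup_cache_t, sketch_cache_get_imp) are made up for illustration, the key/imp field order is taken from the ldp/cmp pairs in the listing, and the real routine is hand-written assembly, not this code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte bucket: key first, imp second, as the listing reads them. */
typedef struct {
    uintptr_t key; /* selector address, compared against x1 */
    uintptr_t imp; /* returned in x0 on a hit */
} lookup_bucket_t;

/* Hypothetical view of the two values loaded by the ldp at <+4>. */
typedef struct {
    lookup_bucket_t *buckets; /* x10 */
    uint32_t mask;            /* w11, bucket count minus one */
} lookup_cache_t;

/* Returns the cached imp for sel, or 0 on a miss. */
uintptr_t sketch_cache_get_imp(const lookup_cache_t *cache, uintptr_t sel)
{
    assert(cache != NULL && cache->buckets != NULL);
    lookup_bucket_t *first = cache->buckets;
    /* <+8>, <+12>: start at index (sel & mask). */
    lookup_bucket_t *b = first + ((uint32_t)sel & cache->mask);
    int wrapped = 0;
    for (;;) {
        if (b->key == sel) return b->imp; /* hit: <+28> or <+72> */
        if (b->key == 0) return 0;        /* empty slot means miss: <+104> */
        if (b == first) {
            if (wrapped) return 0;        /* second time at the start: miss */
            wrapped = 1;
            b = first + cache->mask;      /* <+56>: wrap to the last bucket */
        } else {
            b--;                          /* <+48>, <+92>: scan backwards */
        }
    }
}
```

With a four-bucket table (mask 3), a selector whose low bits give index 3 but whose slot is taken by another key would be found one slot lower, at index 2, which is the backward scan the two loops in the listing perform.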
The register state is:
(lldb) re read -a
General Purpose Registers:
x0 = 0x0000000106c13998 (void *)0x000001a106c139c1
x1 = 0x000000018e24619b "forwardingTargetForSelector:"
x2 = 0x0000000000000000
x3 = 0x0000000000000000
x4 = 0x0000000000000001
x5 = 0x0000000000000001
x6 = 0x0000000000000000
x7 = 0x0000000000000000
x8 = 0x0000000000000000
x9 = 0x0000000000000002
x10 = 0x000000012fed70f0
x11 = 0x000000130000001f
x12 = 0x000000012fed7150
x13 = 0x000001a106c1399d
x14 = 0x0000000000000000
x15 = 0x0005210000052100
x16 = 0x0000000106c13998 (void *)0x000001a106c139c1
x17 = 0x000000018378e3c0 CoreFoundation`_CF_forwarding_prep_0
x18 = 0x0000000000000000
x19 = 0x0000000000000000
x20 = 0x000000018e24619b "forwardingTargetForSelector:"
x21 = 0x0000000106c13998 (void *)0x000001a106c139c1
x22 = 0x0000000000000001
x23 = 0x0000000105f3cae0 "QBWeakProxy"
x24 = 0x0000000000000000
x25 = 0x000000018e24619b "forwardingTargetForSelector:"
x26 = 0x0000000000000000
x27 = 0x00000001b440f000 Foundation`NSUnarchiver.map
x28 = 0x000000010a878090
fp = 0x000000016dd1a580
lr = 0x0000000182a65974 libobjc.A.dylib`lookUpImpOrForward + 64
sp = 0x000000016dd1a530
pc = 0x0000000182a70d00 libobjc.A.dylib`cache_getImp
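The crash is at offset <+4>, the second instruction. What is it doing?
The first instruction, and x16, x0, #0xffffffff8, masks the pointer: it clears the low three bits (8-byte alignment), and the constant matches ISA_MASK for arm64 in the objc source.
The next instruction, ldp x10, x11, [x16, #0x10], is the key one. It reads indirectly from the address in x16 plus 16 and loads the 16 bytes starting there into x10 and x11, in that order. In effect it fetches the cached method table (the bucket pointer and the mask) from the class's cache_t.
Why +16? The following structure makes it clear: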
struct objc_class : objc_object {
    // Class ISA;
    Class superclass;
    cache_t cache;             // formerly cache pointer and vtable
    class_data_bits_t bits;    // class_rw_t * plus custom rr/alloc flags
    // ... omitted
}
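Since x0 points to a Class object, the address 16 bytes into it is the third member of objc_class, cache_t cache.
The crash happens exactly here, while reading that memory, and the address is invalid because it lies outside the process's address space. The only suspect is the cls that was passed in.
If cls is bad, then the delegate's isa is bad. An isa does not normally go bad on its own, otherwise the crash would not show up at this point in the stack. That leaves one possibility: the delegate itself is bad, so the isa read from the delegate's memory is garbage, the class address derived from it is out of range, and that causes this crash.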
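The offset arithmetic can be checked with a small C sketch. The struct names below are hypothetical mirrors of the layout quoted above, with uint64_t standing in for 64-bit pointers; splitting cache_t into a bucket pointer followed by a 32-bit mask and a 32-bit occupied count is an assumption based on the objc source of that era, and it is consistent with the listing using w11 as the mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of cache_t on a 64-bit target. */
typedef struct {
    uint64_t buckets;  /* bucket array pointer, ends up in x10 */
    uint32_t mask;     /* low half of x11 */
    uint32_t occupied; /* high half of x11 */
} layout_cache_t;

/* Hypothetical mirror of objc_class on a 64-bit target. */
typedef struct {
    uint64_t isa;         /* offset 0, inherited from objc_object */
    uint64_t superclass;  /* offset 8 */
    layout_cache_t cache; /* offset 16, what ldp reads at [x16, #0x10] */
    uint64_t bits;        /* offset 32 */
} layout_class_t;

/* Fails to compile if the cache member is not 16 bytes in. */
static_assert(offsetof(layout_class_t, cache) == 16, "cache must sit at +16");

size_t sketch_cache_offset(void)
{
    return offsetof(layout_class_t, cache);
}

/* Split the second ldp register into its two 32-bit fields (little-endian). */
uint32_t sketch_mask_from_x11(uint64_t x11)     { return (uint32_t)x11; }
uint32_t sketch_occupied_from_x11(uint64_t x11) { return (uint32_t)(x11 >> 32); }
```

Applied to the register dump above, x11 = 0x000000130000001f splits into a mask of 0x1f (a 32-bucket table) and an occupied count of 0x13.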
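A small C sketch of how a class address is derived from an isa word may help here. It assumes the arm64 ISA_MASK value 0x0000000ffffffff8 from the objc source, which is the same constant that appears in the first instruction of the listing; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed arm64 ISA_MASK: keeps bits 3..35 of the isa word. */
#define SKETCH_ISA_MASK 0x0000000ffffffff8ULL

/* Whatever 64-bit word sits where the isa should be becomes a "class" address. */
uint64_t sketch_class_from_isa(uint64_t isa_word)
{
    uint64_t cls = isa_word & SKETCH_ISA_MASK;
    assert((cls & 7) == 0); /* the mask always yields an 8-byte-aligned value */
    return cls;
}
```

For the healthy value shown by lldb in the dump above, 0x000001a106c139c1 masks down to 0x0000000106c139c0. If the object's memory has been freed and reused, the word at that position is arbitrary, so the masked result is an arbitrary address, and the later read at that address plus 16 can land on unmapped memory.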
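3. Conclusion
From the analysis above, my preliminary conclusion is that this crash is still, at root, a dangling pointer to the object that was passed in: fetching its isa yields a bad value, and reading through that bad isa then faults. So the fix is still to guard multi-threaded reads and writes of that object with a mutex.
As for why this does not look like the usual objc_msgSend crash:
What is special here is that, to work around ASIHTTPRequest's assign delegate, we earlier introduced an NSProxy to manage a weak delegate. So the delegate is ultimately an NSProxy instance that forwards messages to the weak delegate, not an NSObject instance.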
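As a plain C analogy for that fix (not the actual Objective-C change), the sketch below guards a shared delegate pointer with a pthread mutex, and the reader takes its own reference while the writer is locked out. All names are hypothetical, and the reference count is a stand-in for retain/release; a real implementation would also free the object when the count reaches zero.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical reference-counted object standing in for the delegate. */
typedef struct { int refcount; } sketch_obj_t;

/* Hypothetical holder: every access to delegate goes through lock. */
typedef struct {
    pthread_mutex_t lock;
    sketch_obj_t *delegate;
} sketch_holder_t;

void sketch_holder_init(sketch_holder_t *h)
{
    assert(h != NULL);
    pthread_mutex_init(&h->lock, NULL);
    h->delegate = NULL;
}

/* Writer: swap the delegate while readers are locked out. */
void sketch_holder_set(sketch_holder_t *h, sketch_obj_t *d)
{
    pthread_mutex_lock(&h->lock);
    if (d) d->refcount++;
    if (h->delegate) h->delegate->refcount--; /* real code would free at zero */
    h->delegate = d;
    pthread_mutex_unlock(&h->lock);
}

/* Reader: returns a reference that stays valid after the lock is dropped. */
sketch_obj_t *sketch_holder_retain(sketch_holder_t *h)
{
    pthread_mutex_lock(&h->lock);
    sketch_obj_t *d = h->delegate;
    if (d) d->refcount++;
    pthread_mutex_unlock(&h->lock);
    return d;
}

/* Reader is done with the reference it took. */
void sketch_holder_release(sketch_holder_t *h, sketch_obj_t *d)
{
    pthread_mutex_lock(&h->lock);
    if (d) d->refcount--;
    pthread_mutex_unlock(&h->lock);
}
```

The point of the design is that the reader never uses the shared pointer directly: it either gets NULL or a reference it owns, so a concurrent writer clearing the delegate cannot leave it holding freed memory.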