runtime1

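Objective-C is a dynamic language that defers a great deal of work to run time: message sending, message forwarding, dynamic method exchange, associated objects (commonly used to give a class extra per-instance state), intercepting calls to system methods (method swizzling), KVC, and KVO.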

Let's start with the mechanism we use most often, message sending, and look at how it is implemented.

1. Message sending basics

(1) Objects, classes, and metaclasses

[receiver message];

We can use clang to rewrite this kind of code as C++ and see what it turns into.

// test.m
NSString *str = [[NSString alloc] initWithFormat:@"hello world!"];
NSUInteger a = [str length];

// In a terminal, run: clang -rewrite-objc test.m

NSUInteger a = ((NSUInteger (*)(id, SEL))(void *)objc_msgSend)((id)str, sel_registerName("length"));

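The rewritten line calls a function that returns NSUInteger and takes (id, SEL); the call goes through objc_msgSend cast to that function pointer type. The arguments are the str object pointer and a selector obtained by passing the method name string "length" to sel_registerName. In other words, the message send becomes a call of this form: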

objc_msgSend(receiver, selector)    // no arguments
objc_msgSend(receiver, selector, arg1, arg2, ...)  // with arguments

The assembly output shows the same call to _objc_msgSend:

movq    L_OBJC_SELECTOR_REFERENCES_.4(%rip), %rsi
movq    %rax, %rdi
callq   _objc_msgSend

The method is not called directly; it is looked up at run time. This is where Objective-C's dynamism comes from.

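Let's first look at what the id and SEL parameters mean. SEL is the selector type, an opaque handle that identifies a method by name. id is a built-in Objective-C type that plays a role similar to void *: it can point to any object. Here it receives our object pointer str. The definition of id can be found in the runtime source: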

typedef struct objc_class *Class;
typedef struct objc_object *id;
struct objc_object {
private:
    isa_t isa;
public:
    // a number of method implementations
};

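This struct contains an isa plus a number of methods; isa has the union type isa_t.

Every Objective-C object contains an isa. In an instance it points to the object's class; in a class it points to the metaclass. Classes and metaclasses are objects too, except that exactly one of each exists per class. The class stores the instance methods and the metaclass stores the class methods. As the figure shows, superclass and isa together make up the whole Objective-C object inheritance relationship. When a message is sent, the instance finds its class object through isa.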

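Figure: objects, classes, and metaclasses

The root class in Objective-C is NSObject, and the figure makes the inheritance relationships clear. One detail worth noting is that the superclass of NSObject's metaclass is NSObject itself.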
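The isa and superclass relationships described above can be modeled with plain C structs. The sketch below is a toy model, not the runtime's real layout: toy_class, toy_object, toy_build_hierarchy, and toy_depth are hypothetical names introduced only for illustration. It assumes a single root class with one subclass, points every metaclass's isa at the root metaclass, and makes the root metaclass's superclass the root class.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a class object; NOT the runtime's real layout. */
typedef struct toy_class {
    struct toy_class *isa;        /* class -> metaclass, metaclass -> root metaclass */
    struct toy_class *superclass; /* NULL only for the root class */
    const char *name;
} toy_class;

/* Toy stand-in for an instance: its isa points at its class. */
typedef struct toy_object {
    toy_class *isa;
} toy_object;

/* Hypothetical hierarchy: a root class, one subclass, and their metaclasses. */
toy_class root_cls, root_meta, sub_cls, sub_meta;

void toy_build_hierarchy(void) {
    root_cls  = (toy_class){ &root_meta, NULL,       "Root" };
    /* The root metaclass's isa points to itself; its superclass is the root class. */
    root_meta = (toy_class){ &root_meta, &root_cls,  "Root(meta)" };
    sub_cls   = (toy_class){ &sub_meta,  &root_cls,  "Sub" };
    /* A metaclass's superclass is the metaclass of the class's superclass. */
    sub_meta  = (toy_class){ &root_meta, &root_meta, "Sub(meta)" };
}

/* Count the superclass hops from cls to the end of its chain. */
int toy_depth(const toy_class *cls) {
    assert(cls != NULL);
    int depth = 0;
    while (cls->superclass != NULL) {
        cls = cls->superclass;
        depth++;
    }
    return depth;
}
```

In this model, following isa twice from an instance of Sub reaches Sub's metaclass, and walking superclass from that metaclass passes through the root metaclass and ends at the root class, so toy_depth(&sub_meta) is 2.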

Next we need to understand how a class object is defined. The definition can be found in runtime.h:

struct objc_class {
    Class _Nonnull isa  OBJC_ISA_AVAILABILITY;

#if !__OBJC2__
    Class _Nullable super_class                              OBJC2_UNAVAILABLE;
    const char * _Nonnull name                               OBJC2_UNAVAILABLE;
    long version                                             OBJC2_UNAVAILABLE;
    long info                                                OBJC2_UNAVAILABLE;
    long instance_size                                       OBJC2_UNAVAILABLE;
    struct objc_ivar_list * _Nullable ivars                  OBJC2_UNAVAILABLE;
    struct objc_method_list * _Nullable * _Nullable methodLists                    OBJC2_UNAVAILABLE;
    struct objc_cache * _Nonnull cache                       OBJC2_UNAVAILABLE;
    struct objc_protocol_list * _Nullable protocols          OBJC2_UNAVAILABLE;
#endif

} OBJC2_UNAVAILABLE;

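The class object contains the following fields (all of them are marked OBJC2_UNAVAILABLE, so this is the legacy layout, though it is still a useful mental model):

super_class: the superclass

isa: the metaclass

methodLists: the method lists, which hold the method implementations

cache: caches recently used methods to reduce the cost of message sends

protocols: the list of protocols the class conforms to

ivars: the class's instance variables

The figure below shows how these structs relate to each other and what the bits of the isa struct mean.

Figure: class object and isa

Figure: isa

ivars: the list of instance variables for objects of this class. This struct and the two that follow use the variable-length struct technique: a one-element array such as ivar_list[1] is declared as a placeholder, extra heap space is allocated along with the struct, and the members are reached by offsetting from the placeholder's base address.

For example: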

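// ivar_list[1] already holds one element, so allocate SIZE - 1 more (SIZE >= 1)
struct objc_ivar_list *test = malloc(sizeof(struct objc_ivar_list) + sizeof(struct objc_ivar) * (SIZE - 1));
// test->ivar_list[i] can then be accessed directly for 0 <= i < SIZE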

struct objc_ivar_list {
    int ivar_count                                           OBJC2_UNAVAILABLE; // number of instance variables
    #ifdef __LP64__
    int space                                                OBJC2_UNAVAILABLE;
    #endif
    /* variable length structure */
    struct objc_ivar ivar_list[1]                            OBJC2_UNAVAILABLE;
};

struct objc_ivar {
    char *ivar_name                 OBJC2_UNAVAILABLE;  // variable name
    char *ivar_type                 OBJC2_UNAVAILABLE;  // variable type
    int ivar_offset                 OBJC2_UNAVAILABLE;  // byte offset from the base address
    #ifdef __LP64__
    int space                       OBJC2_UNAVAILABLE;
    #endif
};
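To make the variable-length struct technique concrete, here is a minimal, self-contained sketch. The names toy_ivar, toy_ivar_list, toy_ivar_list_create, and toy_ivar_at are hypothetical and only mirror the shape of the runtime structs. The sketch keeps the one-element placeholder array used by the header above; in new C code the C99 flexible array member (items[]) would be the standard way to express the same idea.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct toy_ivar {
    const char *name;
    int offset;
} toy_ivar;

typedef struct toy_ivar_list {
    int count;
    toy_ivar items[1];   /* placeholder; the real storage extends past the struct */
} toy_ivar_list;

/* Allocate a zeroed list with room for `count` entries (count >= 1). */
toy_ivar_list *toy_ivar_list_create(int count) {
    assert(count >= 1);
    /* One entry is already part of the struct, so add count - 1 more. */
    size_t size = sizeof(toy_ivar_list) + sizeof(toy_ivar) * (size_t)(count - 1);
    toy_ivar_list *list = calloc(1, size);
    if (list == NULL) return NULL;
    list->count = count;
    return list;
}

/* Reach entry i by offsetting from the placeholder's base address. */
toy_ivar *toy_ivar_at(toy_ivar_list *list, int i) {
    assert(list != NULL && i >= 0 && i < list->count);
    toy_ivar *base = list->items;
    return base + i;
}
```

A caller would create the list once with the final count, fill the entries through toy_ivar_at, and release everything with a single free, since the header and the entries live in one allocation.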

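methodLists: the lists of instance methods for this class. This is a pointer to a pointer: the variable stores the address of another pointer, and changing that inner pointer changes what is ultimately pointed to. *methodLists can therefore be modified dynamically to add methods. ivars, by contrast, is a single-level pointer, which reflects why a category can add methods but cannot add instance variables. objc_method contains an IMP, which is simply the address of the method's implementation. The definitions follow: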

struct objc_method_list {
    struct objc_method_list *obsolete                        OBJC2_UNAVAILABLE;

    int method_count                                         OBJC2_UNAVAILABLE;
    #ifdef __LP64__
    int space                                                OBJC2_UNAVAILABLE;
    #endif
    /* variable length structure */
    struct objc_method method_list[1]                        OBJC2_UNAVAILABLE;
};

struct objc_method {
    SEL method_name                                          OBJC2_UNAVAILABLE; // method name
    char *method_types                                       OBJC2_UNAVAILABLE; // return and argument types
    IMP method_imp                                           OBJC2_UNAVAILABLE; // implementation
};


typedef void (*IMP)(void /* id, SEL, ... */ );
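The idea that a class reaches its methods through one extra level of indirection, so that new lists can be attached at run time, can be sketched as follows. This is a simplified model under stated assumptions, not the runtime's actual code: mini_class, mini_method_list, mini_class_add_methods, and mini_class_find are hypothetical names, selectors are plain C strings, and a newly attached list is placed in front so that it is searched first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int (*mini_imp)(void);            /* stand-in for IMP */

typedef struct mini_method {
    const char *name;                     /* stand-in for SEL */
    mini_imp imp;
} mini_method;

typedef struct mini_method_list {
    int count;
    const mini_method *methods;
} mini_method_list;

typedef struct mini_class {
    const mini_method_list **method_lists; /* array of pointers to lists */
    int list_count;
} mini_class;

/* Attach a list in front of the existing ones. Returns 0 on success, -1 on failure. */
int mini_class_add_methods(mini_class *cls, const mini_method_list *list) {
    const mini_method_list **grown =
        malloc(sizeof(*grown) * (size_t)(cls->list_count + 1));
    if (grown == NULL) return -1;
    grown[0] = list;
    for (int i = 0; i < cls->list_count; i++) grown[i + 1] = cls->method_lists[i];
    free(cls->method_lists);              /* only the pointer array, not the lists */
    cls->method_lists = grown;
    cls->list_count++;
    return 0;
}

/* Search the lists in order; the first match wins. Returns NULL if not found. */
mini_imp mini_class_find(const mini_class *cls, const char *name) {
    for (int i = 0; i < cls->list_count; i++) {
        const mini_method_list *list = cls->method_lists[i];
        for (int j = 0; j < list->count; j++) {
            if (strcmp(list->methods[j].name, name) == 0) return list->methods[j].imp;
        }
    }
    return NULL;
}

/* Two sample implementations used to tell the lists apart. */
int mini_base_length(void) { return 1; }
int mini_category_length(void) { return 2; }
```

Only the array of list pointers is reallocated when a list is attached; the existing lists and the layout of instances are untouched, which is the property the paragraph above relies on.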

cache: caches recently used methods. A message send looks in the cache first and only searches the method lists on a miss.

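struct objc_cache {
    unsigned int mask /* total = mask + 1 */                 OBJC2_UNAVAILABLE;
    // an integer that gives the total number of allocated cache buckets (as mask + 1)
    unsigned int occupied                                    OBJC2_UNAVAILABLE;
    // the number of cache buckets actually in use
    Method buckets[1]                                        OBJC2_UNAVAILABLE;
    // buckets is the base address of the cache array, which holds at most mask + 1 elements
};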
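The role of mask and occupied is easier to see in a small model of a bucket cache. The sketch below is an illustration only: toy_cache, toy_bucket, toy_cache_fill, and toy_cache_get are hypothetical names, the capacity is fixed at 8, selectors are compared by pointer identity, and the shift-and-mask hash with linear probing is an assumption chosen for simplicity rather than a statement of what the runtime does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*toy_imp)(void);

typedef struct toy_bucket {
    const char *sel;   /* NULL marks an empty bucket */
    toy_imp imp;
} toy_bucket;

#define TOY_CACHE_CAPACITY 8   /* must be a power of two so that mask works */

typedef struct toy_cache {
    unsigned int mask;         /* capacity - 1 */
    unsigned int occupied;     /* buckets currently in use */
    toy_bucket buckets[TOY_CACHE_CAPACITY];
} toy_cache;

void toy_cache_init(toy_cache *c) {
    c->mask = TOY_CACHE_CAPACITY - 1;
    c->occupied = 0;
    for (unsigned int i = 0; i <= c->mask; i++) {
        c->buckets[i].sel = NULL;
        c->buckets[i].imp = NULL;
    }
}

/* Map a selector pointer to a starting bucket index in [0, mask]. */
unsigned int toy_hash(const char *sel, unsigned int mask) {
    return (unsigned int)(((uintptr_t)sel >> 2) & mask);
}

/* Insert or update an entry. Returns 0 on success, -1 if the cache is full. */
int toy_cache_fill(toy_cache *c, const char *sel, toy_imp imp) {
    assert(sel != NULL);
    unsigned int start = toy_hash(sel, c->mask);
    for (unsigned int n = 0; n <= c->mask; n++) {
        toy_bucket *b = &c->buckets[(start + n) & c->mask];
        if (b->sel == NULL || b->sel == sel) {
            if (b->sel == NULL) c->occupied++;
            b->sel = sel;
            b->imp = imp;
            return 0;
        }
    }
    return -1;
}

/* Return the cached imp for sel, or NULL on a miss. */
toy_imp toy_cache_get(const toy_cache *c, const char *sel) {
    assert(sel != NULL);
    unsigned int start = toy_hash(sel, c->mask);
    for (unsigned int n = 0; n <= c->mask; n++) {
        const toy_bucket *b = &c->buckets[(start + n) & c->mask];
        if (b->sel == sel) return b->imp;
        if (b->sel == NULL) return NULL;   /* an empty bucket ends the probe */
    }
    return NULL;
}

/* Two sample implementations for exercising the cache. */
void toy_noop_a(void) {}
void toy_noop_b(void) {}
```

Because the capacity is a power of two, index & mask wraps a probe around the array without a division, which is why the struct stores mask rather than the capacity itself.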

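(2) Message sending

With these basics in place, let's look at how a message is actually sent.

The figure shows the message sending and forwarding flow. It can be skipped; it is kept here only for reference.

Figure: flow chart of the message sending and forwarding path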

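<1>. objc_msgSend

See objc-msg-x86_64.s; for the assembler directives used there, consult a reference on assembler directives.

First, look at the implementation of objc_msgSend(id self, SEL _cmd, ...);

objc-msg-x86_64.s defines several msgSend functions. When it compiles a message send, the compiler decides which one to emit based on the return type, as shown below:

// The .quad directive defines 8-byte data, here the addresses of functions.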
.align 4
.private_extern _objc_entryPoints
_objc_entryPoints:
    .quad   _cache_getImp
    .quad   _objc_msgSend
    //Sends a message with a simple return value to an instance of a class.
    .quad   _objc_msgSend_fpret
    //a floating-point return value
    .quad   _objc_msgSend_fp2ret
    // `complex long double` return only.
    .quad   _objc_msgSend_stret
    //a data-structure return value
    .quad   _objc_msgSendSuper
    //Sends a message with a simple return value to the superclass of an instance of a class.
    .quad   _objc_msgSendSuper_stret
    //a data-structure return value
    .quad   _objc_msgSendSuper2
    .quad   _objc_msgSendSuper2_stret
    .private_extern _objc_exitPoints
_objc_exitPoints:
    .quad   LExit_cache_getImp
    .quad   LExit_objc_msgSend
    .quad   LExit_objc_msgSend_fpret
    .quad   LExit_objc_msgSend_fp2ret
    .quad   LExit_objc_msgSend_stret
    .quad   LExit_objc_msgSendSuper
    .quad   LExit_objc_msgSendSuper_stret
    .quad   LExit_objc_msgSendSuper2
    .quad   LExit_objc_msgSendSuper2_stret
    .quad   0

We only look at the first one here; the others are structured in a similar way.

/********************************************************************
 *
 * id objc_msgSend(id self, SEL _cmd,...);
 *
 ********************************************************************/
    
    .data
    .align 3
    .globl _objc_debug_taggedpointer_classes
_objc_debug_taggedpointer_classes:
    .fill 16, 8, 0

    ENTRY   _objc_msgSend
    MESSENGER_START

    NilTest NORMAL

    GetIsaFast NORMAL       // r11 = self->isa
    CacheLookup NORMAL      // calls IMP on success
    /*
    The CacheLookup implementation decides whether to call the IMP:
    4:  ret
    .elseif $0 == NORMAL  ||  $0 == FPRET  ||  $0 == FP2RET
    // eq already set for forwarding by `jne`
    MESSENGER_END_FAST
    jmp *8(%r10)        // call imp
    */

    NilTestSupport  NORMAL

    GetIsaSupport   NORMAL

// cache miss: go search the method lists
LCacheMiss:
    // isa still in r11
    MethodTableLookup %a1, %a2  // r11 = IMP
    cmp %r11, %r11      // set eq (nonstret) for forwarding
    jmp *%r11           // goto *imp

    END_ENTRY   _objc_msgSend
    
    ENTRY _objc_msgSend_fixup   // int3 breakpoint
    int3
    END_ENTRY _objc_msgSend_fixup

    
    STATIC_ENTRY _objc_msgSend_fixedup
    // Load _cmd from the message_ref
    movq    8(%a2), %a2
    jmp _objc_msgSend
    END_ENTRY _objc_msgSend_fixedup

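From the comments we can see that, after a nil check on the receiver, objc_msgSend uses GetIsaFast to load isa into register r11 and then uses CacheLookup to search the cache. On a hit it jumps straight to the method the IMP points to.

On a miss, MethodTableLookup searches the method tables; dynamic method resolution is also triggered within this step.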

/////////////////////////////////////////////////////////////////////
//
// MethodTableLookup classRegister, selectorRegister
//
// Takes:   $0 = class to search (a1 or a2 or r10 ONLY)
//      $1 = selector to search for (a2 or a3 ONLY)
//      r11 = class to search
//
// On exit: imp in %r11
//
/////////////////////////////////////////////////////////////////////
.macro MethodTableLookup

    MESSENGER_END_SLOW
    
    SaveRegisters

    // _class_lookupMethodAndLoadCache3(receiver, selector, class)

    movq    $0, %a1
    movq    $1, %a2
    movq    %r11, %a3
    call    __class_lookupMethodAndLoadCache3

    // IMP is now in %rax
    movq    %rax, %r11

    RestoreRegisters

.endmacro

/***********************************************************************
* _class_lookupMethodAndLoadCache.
* Method lookup for dispatchers ONLY. OTHER CODE SHOULD USE lookUpImp().
* This lookup avoids optimistic cache scan because the dispatcher 
* already tried that.
**********************************************************************/
IMP _class_lookupMethodAndLoadCache3(id obj, SEL sel, Class cls)
{
    return lookUpImpOrForward(cls, sel, obj, 
                              YES/*initialize*/, NO/*cache*/, YES/*resolver*/);
}

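The macro calls _class_lookupMethodAndLoadCache3(receiver, selector, class), whose arguments are the object, the selector, and the class object, and the IMP is returned (moved from rax to r11). Finally, objc_msgSend jumps to the IMP.

The lookup path itself is therefore inside lookUpImpOrForward.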

<2>. lookUpImpOrForward: looking up the IMP

/***********************************************************************
* lookUpImpOrForward.
* The standard IMP lookup. 
* initialize==NO tries to avoid +initialize (but sometimes fails)
* cache==NO skips optimistic unlocked lookup (but uses cache elsewhere)
* Most callers should use initialize==YES and cache==YES.
* inst is an instance of cls or a subclass thereof, or nil if none is known. 
*   If cls is an un-initialized metaclass then a non-nil inst is faster.
* May return _objc_msgForward_impcache. IMPs destined for external use 
*   must be converted to _objc_msgForward or _objc_msgForward_stret.
*   If you don't want forwarding at all, use lookUpImpOrNil() instead.
**********************************************************************/
IMP lookUpImpOrForward(Class cls, SEL sel, id inst, 
                       bool initialize, bool cache, bool resolver)
{
    IMP imp = nil;
    bool triedResolver = NO;

    runtimeLock.assertUnlocked();

    // Optimistic cache lookup
    if (cache) {
        imp = cache_getImp(cls, sel);
        if (imp) return imp;
    }

    // runtimeLock is held during isRealized and isInitialized checking
    // to prevent races against concurrent realization.

    // runtimeLock is held during method search to make
    // method-lookup + cache-fill atomic with respect to method addition.
    // Otherwise, a category could be added but ignored indefinitely because
    // the cache was re-filled with the old value after the cache flush on
    // behalf of the category.

    runtimeLock.read();

    if (!cls->isRealized()) {
        // Drop the read-lock and acquire the write-lock.
        // realizeClass() checks isRealized() again to prevent
        // a race while the lock is down.
        runtimeLock.unlockRead();
        runtimeLock.write();

        realizeClass(cls);

        runtimeLock.unlockWrite();
        runtimeLock.read();
    }

    if (initialize  &&  !cls->isInitialized()) {
        runtimeLock.unlockRead();
        _class_initialize (_class_getNonMetaClass(cls, inst));
        runtimeLock.read();
        // If sel == initialize, _class_initialize will send +initialize and 
        // then the messenger will send +initialize again after this 
        // procedure finishes. Of course, if this is not being called 
        // from the messenger then it won't happen. 2778172
    }

    
 retry:    
    runtimeLock.assertReading();

    // Try this class's cache.

    imp = cache_getImp(cls, sel);
    if (imp) goto done;

    // Try this class's method lists.
    {
        Method meth = getMethodNoSuper_nolock(cls, sel);
        if (meth) {
            log_and_fill_cache(cls, meth->imp, sel, inst, cls);
            imp = meth->imp;
            goto done;
        }
    }

    // Try superclass caches and method lists.
    {
        unsigned attempts = unreasonableClassCount();
        for (Class curClass = cls->superclass;
             curClass != nil;
             curClass = curClass->superclass)
        {
            // Halt if there is a cycle in the superclass chain.
            if (--attempts == 0) {
                _objc_fatal("Memory corruption in class list.");
            }
            
            // Superclass cache.
            imp = cache_getImp(curClass, sel);
            if (imp) {
                if (imp != (IMP)_objc_msgForward_impcache) {
                    // Found the method in a superclass. Cache it in this class.
                    log_and_fill_cache(cls, imp, sel, inst, curClass);
                    goto done;
                }
                else {
                    // Found a forward:: entry in a superclass.
                    // Stop searching, but don't cache yet; call method 
                    // resolver for this class first.
                    break;
                }
            }
            
            // Superclass method list.
            Method meth = getMethodNoSuper_nolock(curClass, sel);
            if (meth) {
                log_and_fill_cache(cls, meth->imp, sel, inst, curClass);
                imp = meth->imp;
                goto done;
            }
        }
    }

    // No implementation found. Try method resolver once.

    if (resolver  &&  !triedResolver) {
        runtimeLock.unlockRead();
        _class_resolveMethod(cls, sel, inst);
        runtimeLock.read();
        // Don't cache the result; we don't hold the lock so it may have 
        // changed already. Re-do the search from scratch instead.
        triedResolver = YES;
        goto retry;
    }

    // No implementation found, and method resolver didn't help. 
    // Use forwarding.

    imp = (IMP)_objc_msgForward_impcache;
    cache_fill(cls, sel, imp, inst);

 done:
    runtimeLock.unlockRead();

    return imp;
}

The call we came in through was:

return lookUpImpOrForward(cls, sel, obj, YES/*initialize*/, NO/*cache*/, YES/*resolver*/);

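The function first checks whether the class is being used for the first time, that is, whether it still needs to be realized or initialized, and does so if required. Because cache is NO here, the optimistic cache lookup at the top is skipped. The standard method lookup then starts at the retry label.

The process is as follows:

Step 1: Check whether the class's cache holds the method. If the IMP is non-nil, the cache was hit and control jumps to done; otherwise continue.

Step 2: Search the class's own method lists by calling getMethodNoSuper_nolock(cls, sel) to find the Method. If the list seen by search_method_list is sorted, a binary search is used; otherwise the search is sequential. If a Method is found, it is stored in the cache, imp is set to Method->imp, and control jumps to done; otherwise continue.

Step 3: Walk up the inheritance chain one superclass at a time. Check the superclass's cache first. On a hit go to step 4; on a miss go to step 5.

Step 4: If the cached imp is not _objc_msgForward_impcache (the message forwarding function), write it into the cache of the original class and jump to done. Otherwise a forwarding marker was found: stop walking the chain and go to step 6.

Step 5: Search the superclass's method lists by calling getMethodNoSuper_nolock(curClass, sel), with the same binary or sequential search as in step 2. If a Method is found, it is stored in the cache, imp is set to Method->imp, and control jumps to done; otherwise return to step 3 with the next superclass.

Step 6: Dynamic method resolution, the last chance before message forwarding. The read lock is released (runtimeLock.unlockRead()), and +resolveInstanceMethod: or +resolveClassMethod: is sent indirectly so that an imp can be added dynamically. Afterwards the search restarts from step 1. If resolution has already been tried once and the search still fails, _objc_msgForward_impcache is used as the IMP, written into the cache, and control goes to done.

done: Release the read lock and return the IMP found earlier. This imp may be the address of the method that was found, or it may be the address of the forwarding function _objc_msgForward_impcache.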
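The steps above can be condensed into a small single-threaded model. It is a sketch under stated assumptions, not the runtime's code: sim_class, sim_lookup, sim_forward, and the resolve hook are hypothetical names, selectors are C strings, the cache is a tiny fixed array, and all locking, realization, and initialization are left out. sim_forward plays the role of _objc_msgForward_impcache, and resolve stands in for the +resolveInstanceMethod: call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*sim_imp)(void);
typedef struct sim_method { const char *name; sim_imp imp; } sim_method;

#define SIM_CACHE_SLOTS 4

typedef struct sim_class {
    struct sim_class *superclass;
    const sim_method *methods;
    int method_count;
    sim_method cache[SIM_CACHE_SLOTS];
    int cache_count;
    void (*resolve)(struct sim_class *cls, const char *name); /* may be NULL */
} sim_class;

int sim_forward(void) { return -1; }   /* forwarding marker */

sim_imp sim_cache_get(const sim_class *cls, const char *name) {
    for (int i = 0; i < cls->cache_count; i++)
        if (strcmp(cls->cache[i].name, name) == 0) return cls->cache[i].imp;
    return NULL;
}

void sim_cache_fill(sim_class *cls, const char *name, sim_imp imp) {
    if (cls->cache_count < SIM_CACHE_SLOTS) {   /* silently skip when full */
        cls->cache[cls->cache_count].name = name;
        cls->cache[cls->cache_count].imp = imp;
        cls->cache_count++;
    }
}

sim_imp sim_find_own(const sim_class *cls, const char *name) {
    for (int i = 0; i < cls->method_count; i++)
        if (strcmp(cls->methods[i].name, name) == 0) return cls->methods[i].imp;
    return NULL;
}

sim_imp sim_lookup(sim_class *cls, const char *name) {
    int tried_resolver = 0;
    sim_imp imp;
retry:
    imp = sim_cache_get(cls, name);                       /* step 1 */
    if (imp) return imp;
    imp = sim_find_own(cls, name);                        /* step 2 */
    if (imp) { sim_cache_fill(cls, name, imp); return imp; }
    for (sim_class *cur = cls->superclass; cur != NULL; cur = cur->superclass) {
        imp = sim_cache_get(cur, name);                   /* step 3 */
        if (imp == sim_forward) break;                    /* step 4: marker found */
        if (imp == NULL) imp = sim_find_own(cur, name);   /* step 5 */
        if (imp) { sim_cache_fill(cls, name, imp); return imp; }
    }
    if (cls->resolve != NULL && !tried_resolver) {        /* step 6 */
        cls->resolve(cls, name);
        tried_resolver = 1;
        goto retry;
    }
    sim_cache_fill(cls, name, sim_forward);               /* give up: forward */
    return sim_forward;
}

/* Sample methods and a sample resolver that installs "lazy" on demand. */
int sim_describe(void) { return 1; }
int sim_resolved(void) { return 2; }
const sim_method sim_resolved_table[] = { { "lazy", sim_resolved } };
void sim_resolve_lazy(sim_class *cls, const char *name) {
    if (strcmp(name, "lazy") == 0) {
        cls->methods = sim_resolved_table;
        cls->method_count = 1;
    }
}
```

As in the real function, an imp found in a superclass is cached in the class that received the message, and the resolver is tried at most once per lookup before the forwarding marker is cached and returned.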

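At the end of this process the message send holds either the method's imp or the address of the forwarding function. Lookup finishes here, and the final jump

jmp *%r11           // goto *imp

either runs the method or enters message forwarding.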

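<3>. Using dynamic resolution

When the search through the method lists fails, dynamic method resolution runs; if that also fails, message forwarding begins. Altogether we get three chances to handle the message: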

Method resolution

Fast forwarding

Normal forwarding

For details, see "Objective-C Runtime 之动态方法解析实践" (dynamic method resolution in practice).

The specifics will be covered in a later article.