iOS Source Code Analysis - Runtime (2: objc_msgSend)

objc-runtime open-source repository

Because Objective-C is a superset of C and is backed by a runtime, every Objective-C message send we write is compiled into a call to:
id objc_msgSend(id self, SEL op, ...)

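A C-language version of its implementation has already been published by an expert. Because every Objective-C message send turns into this function call, its efficiency is especially important.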

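The previous article already touched on how objc_msgSend works: in the end, it uses the SEL to find the IMP and then runs the target function.
Let's walk through this lookup process: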

YY describes it in his blog roughly like this:

id objc_msgSend(id self, SEL op, ...) {
    if (!self) return nil; // messaging nil returns nil
    IMP imp = class_getMethodImplementation(self->isa, op);
    return imp(self, op, ...); // call the function with the original arguments (pseudocode)
}
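
To make the SEL-to-IMP step concrete, here is a minimal sketch in plain C. It is not the runtime's code: every toy_ name is hypothetical, selectors are plain strings, and every IMP has the same fixed signature so the example stays small.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the runtime types; all names here are hypothetical. */
typedef struct toy_object *toy_id;
typedef const char *toy_SEL;                      /* a selector is just a name here */
typedef int (*toy_IMP)(toy_id self, toy_SEL op);  /* one fixed signature, for simplicity */

struct toy_method { toy_SEL name; toy_IMP imp; };

struct toy_class {
    const struct toy_method *methods;
    size_t method_count;
};

struct toy_object {
    const struct toy_class *isa;  /* every object starts with its class pointer */
    int value;
};

/* Find the IMP registered for `op` in `cls`, or NULL if there is none. */
toy_IMP toy_get_implementation(const struct toy_class *cls, toy_SEL op) {
    for (size_t i = 0; i < cls->method_count; i++) {
        if (strcmp(cls->methods[i].name, op) == 0) return cls->methods[i].imp;
    }
    return NULL;
}

/* Messaging NULL returns 0; otherwise resolve the IMP and call it. */
int toy_msg_send(toy_id self, toy_SEL op) {
    if (!self) return 0;
    toy_IMP imp = toy_get_implementation(self->isa, op);
    if (!imp) return -1;  /* the real runtime would start forwarding here */
    return imp(self, op);
}

/* A sample method: returns the receiver's value. */
int toy_get_value(toy_id self, toy_SEL op) { (void)op; return self->value; }

const struct toy_method toy_methods[] = { { "value", toy_get_value } };
const struct toy_class toy_cls = { toy_methods, 1 };
```

With an object declared as struct toy_object obj = { &toy_cls, 42 };, the call toy_msg_send(&obj, "value") resolves toy_get_value through the class and returns the stored value.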

The implementation of class_getMethodImplementation can be found in the source.

It first calls this function:

IMP lookUpImpOrNil(Class cls, SEL sel, id inst, 
                   bool initialize, bool cache, bool resolver)
{
    IMP imp = lookUpImpOrForward(cls, sel, inst, initialize, cache, resolver);
    if (imp == _objc_msgForward_impcache) return nil;
    else return imp;
}

The real work is then done by this function:

IMP lookUpImpOrForward(Class cls, SEL sel, id inst, 
                       bool initialize, bool cache, bool resolver)

How this function runs is quite interesting:

// First check whether the cache already has it
 if (cache) {
        imp = cache_getImp(cls, sel); // implemented in the C version mentioned above
        if (imp) return imp;
    }

 // Cache lookup
  cls   = self->isa;
  cache = cls->cache;
  hash  = cache->mask;
  index = (unsigned int) _cmd & hash;
      
  do{
     method = cache->buckets[ index];
     if(!method) goto recache;
     index = (index + 1) & cache->mask;
  }while( method->method_name != _cmd);

  return( (*method->method_imp)( (id) self, _cmd));

To understand the code above, we have to look at how a class is laid out in the source. The diagram is borrowed from Vanney:

[Figure: bits.png, layout of the class structure]

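isa, superclass: the purpose of these two is easy to understand.
cache: the list of cached methods.
class_data_bits_t bits: this structure is very important! It leads to a great deal of information, including the class's ivar information fixed at compile time, its method list, its protocol list, its weak ivar layout, and more. We will come back to it later.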

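Now that we know the internal structure of the Class type, let's look at the message call again.

cls = self->isa; // get the object's class through its isa pointer
cache = cls->cache; // get the cache (the cached method list) from the class
hash = cache->mask; index = (unsigned int) _cmd & hash; // AND _cmd with the cache mask to get the method's index in the hash table (the table stores methods, i.e. SEL-to-IMP mappings), and from there the final IMP pointer.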

struct method_t {
  SEL name;
  const char *types;
  IMP imp;
};
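
The masked index and the probing loop can be tried out in isolation. The sketch below is an assumption-laden toy, not the runtime's cache: selectors are plain integers, 0 marks an empty slot, the table has a fixed capacity of 8, and toy_cache_fill assumes the table never becomes full.

```c
#include <stddef.h>
#include <stdint.h>

typedef int (*toy_IMP)(void);

struct toy_bucket { uintptr_t sel; toy_IMP imp; };  /* sel == 0 marks an empty slot */

struct toy_cache {
    struct toy_bucket buckets[8];  /* capacity must be a power of two */
    uintptr_t mask;                /* capacity - 1 */
};

/* Insert by probing forward from (sel & mask) until an empty slot is found.
   Assumes the table is not full. */
void toy_cache_fill(struct toy_cache *c, uintptr_t sel, toy_IMP imp) {
    uintptr_t index = sel & c->mask;
    while (c->buckets[index].sel != 0) index = (index + 1) & c->mask;
    c->buckets[index].sel = sel;
    c->buckets[index].imp = imp;
}

/* Look up by probing from the same start index; an empty slot means a miss. */
toy_IMP toy_cache_get(const struct toy_cache *c, uintptr_t sel) {
    uintptr_t index = sel & c->mask;
    for (uintptr_t probes = 0; probes <= c->mask; probes++) {
        const struct toy_bucket *b = &c->buckets[index];
        if (b->sel == 0) return NULL;       /* miss: fall back to the slow path */
        if (b->sel == sel) return b->imp;   /* hit */
        index = (index + 1) & c->mask;      /* collision: try the next slot */
    }
    return NULL;
}

int toy_imp_a(void) { return 1; }
int toy_imp_b(void) { return 2; }
```

Selectors 3 and 11 both map to index 3 under a mask of 7, so the second one lands in the next slot and is still found by the same probing sequence.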

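Through this series of operations we finally obtain the function's address. But this only covers fetching a method from the method cache. What if the cache holds no matching IMP?

Back in IMP lookUpImpOrForward(Class cls, SEL sel, id inst, bool initialize, bool cache, bool resolver), execution continues to the following check.
It realizes the class, that is, it sets up the class's real runtime data (rwlock_writer_t lock(runtimeLock); keeps this thread-safe):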

if (!cls->isRealized()) {
  rwlock_writer_t lock(runtimeLock);
  realizeClass(cls);
}

The implementation:

static Class realizeClass(Class cls){
...
 // Excerpt from the middle of the function
    ro = (const class_ro_t *)cls->data();
    if (ro->flags & RO_FUTURE) {
        // This was a future class. rw data is already allocated.
        rw = cls->data();
        ro = cls->data()->ro;
        cls->changeInfo(RW_REALIZED|RW_REALIZING, RW_FUTURE);
    } else {
        // Normal class. Allocate writeable class data.
        rw = (class_rw_t *)calloc(sizeof(class_rw_t), 1);
        rw->ro = ro;
        rw->flags = RW_REALIZED|RW_REALIZING;
        cls->setData(rw);
    }
...
}
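
The "normal class" branch above can be sketched in plain C as follows. This is a simplified illustration with hypothetical toy_ names: the realized state is kept in an explicit field rather than in flag bits, the flag value is made up, and the future-class branch and locking are left out.

```c
#include <stdint.h>
#include <stdlib.h>

#define TOY_RW_REALIZED (1u << 31)  /* hypothetical flag value */

/* Compile-time, read-only class data. */
struct toy_class_ro { uint32_t flags; uint32_t instance_size; const char *name; };

/* Runtime, writable class data that keeps a pointer back to the ro data. */
struct toy_class_rw { uint32_t flags; const struct toy_class_ro *ro; };

struct toy_class {
    const void *data;  /* toy_class_ro before realization, toy_class_rw after */
    int realized;      /* simplification: the real runtime checks flag bits */
};

/* Returns 0 on success, -1 if allocation fails. A realized class is left alone. */
int toy_realize_class(struct toy_class *cls) {
    if (cls->realized) return 0;
    const struct toy_class_ro *ro = cls->data;        /* what the compiler emitted */
    struct toy_class_rw *rw = calloc(1, sizeof *rw);  /* writable runtime data */
    if (!rw) return -1;
    rw->ro = ro;
    rw->flags = TOY_RW_REALIZED;
    cls->data = rw;                                   /* data now points at rw */
    cls->realized = 1;
    return 0;
}

/* Reach the ro data regardless of whether the class has been realized. */
const struct toy_class_ro *toy_class_ro_of(const struct toy_class *cls) {
    if (!cls->realized) return cls->data;
    return ((const struct toy_class_rw *)cls->data)->ro;
}
```

The point of the sketch is the pointer swap: before realization data is the ro structure itself, afterwards it is a freshly allocated rw structure whose ro member still leads to the same compile-time data.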

At this point we can go on to look at the class_data_bits_t structure:

// Only part of the source is shown
struct class_data_bits_t {

    // Values are the FAST_ flags above.
    uintptr_t bits;
  
    class_rw_t* data() {
        return (class_rw_t *)(bits & FAST_DATA_MASK);
    }
    void setData(class_rw_t *newData)
    {
        assert(!data()  ||  (newData->flags & (RW_REALIZING | RW_FUTURE)));
        // Set during realization or construction only. No locking needed.
        bits = (bits & ~FAST_DATA_MASK) | (uintptr_t)newData;
    }
    bool isSwift() {
        return getBit(FAST_IS_SWIFT);
    }
};

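This structure has a single member, uintptr_t bits;. uintptr_t is an unsigned integer type wide enough to hold a pointer (an unsigned long here). It is only 64 bits wide, and the pointer is packed into it as follows: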

[Figure: class_bits.png, bit layout of class_data_bits_t]

ANDing bits with the corresponding flag or mask yields the matching flag value or pointer address. For example:

bool isSwift() {
     return getBit(FAST_IS_SWIFT);
}

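#define FAST_IS_SWIFT (1UL<<0) matches the layout in the figure above: the lowest bit stores the flag that marks whether the class is a Swift class (set by the compiler).

The most important part, though, is this: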

 class_rw_t* data() {
        return (class_rw_t *)(bits & FAST_DATA_MASK);
}

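Its data function returns a pointer to a class_rw_t structure.

According to the layout figure above, 44 bits of bits store the class_rw_t pointer.

This follows from #define FAST_DATA_MASK 0x00007ffffffffff8UL, which selects bits 3 through 46.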
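
The 44-bit claim and the flag/pointer packing can be checked with a few lines of C. This is only a sketch: the two constants are copied from the definitions quoted in this article, the toy_ helpers are hypothetical, and uint64_t is used instead of uintptr_t so the arithmetic does not depend on the host's pointer width.

```c
#include <stdint.h>

#define TOY_FAST_IS_SWIFT  ((uint64_t)1 << 0)
#define TOY_FAST_DATA_MASK UINT64_C(0x00007ffffffffff8)

/* Count the set bits in a 64-bit value. */
int toy_popcount64(uint64_t x) {
    int n = 0;
    while (x) { x &= x - 1; n++; }  /* clears the lowest set bit each round */
    return n;
}

/* Pack an 8-byte-aligned address together with flag bits into one word. */
uint64_t toy_pack(uint64_t address, uint64_t flags) {
    return (address & TOY_FAST_DATA_MASK) | (flags & ~TOY_FAST_DATA_MASK);
}

/* Recover the address, mirroring data(). */
uint64_t toy_data(uint64_t bits) { return bits & TOY_FAST_DATA_MASK; }

/* Read the lowest flag bit, mirroring isSwift(). */
int toy_is_swift(uint64_t bits) { return (bits & TOY_FAST_IS_SWIFT) != 0; }
```

toy_popcount64(TOY_FAST_DATA_MASK) evaluates to 44, and because the low three bits of an 8-byte-aligned address are always zero, a flag stored there does not disturb the address that toy_data recovers.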

However, the source contains ro = (const class_ro_t *)cls->data();, where the class's data is:

    class_rw_t *data() { 
        return bits.data(); // calls the function shown above
    }

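Why is the result cast to const class_ro_t *?
In fact, after compilation but before the runtime has realized the class, bits.data() (the class_rw_t pointer stored in bits) actually points to a const class_ro_t structure.
Let's look at the structure of class_ro_t: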


struct class_ro_t {
    uint32_t flags;
    uint32_t instanceStart;
    uint32_t instanceSize;
#ifdef __LP64__
    uint32_t reserved;
#endif

    const uint8_t * ivarLayout;
    
    const char * name;
    method_list_t * baseMethodList;
    protocol_list_t * baseProtocols;
    const ivar_list_t * ivars;

    const uint8_t * weakIvarLayout;
    property_list_t *baseProperties;

    method_list_t *baseMethods() const {
        return baseMethodList;
    }
};

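This is where the properties, methods, protocols, and so on that are fixed at compile time are really stored.
Moreover, the following members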

    method_list_t * baseMethodList;
    const ivar_list_t * ivars;

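are both built on entsize_list_tt; stored in the read-only class_ro_t, they are not extended at runtime.
This is also a good place to explain how category methods are loaded and why a category cannot add instance variables.
The categories we create are actually a different type in the source: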

struct category_t {
    const char *name;
    classref_t cls;
    struct method_list_t *instanceMethods;
    struct method_list_t *classMethods;
    struct protocol_list_t *protocols;
    struct property_list_t *instanceProperties;

    method_list_t *methodsForMeta(bool isMeta) {
        if (isMeta) return classMethods;
        else return instanceMethods;
    }

    property_list_t *propertiesForMeta(bool isMeta) {
        if (isMeta) return nil; // classProperties;
        else return instanceProperties;
    }
};

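As you can see, it has no isa pointer! But that alone does not explain why variables cannot be added.
We need to start from how categories are loaded.
After the app launches, the system calls load_images to load the various libraries, which of course include the runtime library. The objc library's loading sequence is: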

_objc_init
└──map_2_images
    └──map_images_nolock
        └──_read_images

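Once execution reaches _read_images, we can follow its implementation in the source.
It first fetches the class list from the bundle.
We then find that realizeClass(cls); is called here! It prepares the class's runtime environment (re-pointing the bits data at a class_rw_t and storing the class_ro_t in that class_rw_t's ro member).
Only after that are the categories processed.
The function that actually attaches a category's contents to the class is: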

static void 
attachCategories(Class cls, category_list *cats, bool flush_caches);

    auto rw = cls->data();

    prepareMethodLists(cls, mlists, mcount, NO, fromBundle);
    rw->methods.attachLists(mlists, mcount);
    free(mlists);
    if (flush_caches  &&  mcount > 0) flushCaches(cls);

    rw->properties.attachLists(proplists, propcount);
    free(proplists);

    rw->protocols.attachLists(protolists, protocount);
    free(protolists);

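As you can see, a category's methods and so on are attached to cls->data(), which at this point points to a class_rw_t.
category_t has a property_list_t but no ivar_list_t, and the ivar_list_t inside the class_rw_t's ro is read-only, so properties declared in a category do not generate instance variables (although another mechanism, such as associated objects, can effectively "add a variable").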
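
The asymmetry between methods and instance variables can be sketched in C. Everything here is hypothetical and heavily simplified: method lists are arrays of names, the writable data holds a small fixed table of list pointers, and new lists are appended, which says nothing about the order the real attachLists uses.

```c
#include <stddef.h>

struct toy_method_list { const char *const *names; size_t count; };

/* Read-only data: the instance size is fixed when the class is compiled. */
struct toy_class_ro {
    size_t instance_size;
    const struct toy_method_list *base_methods;
};

#define TOY_MAX_LISTS 4

/* Writable data: a growable set of method lists plus a pointer to ro. */
struct toy_class_rw {
    const struct toy_class_ro *ro;
    const struct toy_method_list *method_lists[TOY_MAX_LISTS];
    size_t list_count;
};

/* A category carries methods but, like category_t, no ivar list. */
struct toy_category {
    const char *name;
    const struct toy_method_list *instance_methods;
};

/* Attach a category's methods to the writable class data.
   Returns 0 on success, -1 if the fixed-size table is full. */
int toy_attach_category(struct toy_class_rw *rw, const struct toy_category *cat) {
    if (rw->list_count >= TOY_MAX_LISTS) return -1;
    rw->method_lists[rw->list_count++] = cat->instance_methods;
    return 0;
}

/* Total number of methods reachable through the writable data. */
size_t toy_method_total(const struct toy_class_rw *rw) {
    size_t total = 0;
    for (size_t i = 0; i < rw->list_count; i++) total += rw->method_lists[i]->count;
    return total;
}
```

Attaching a category changes only the rw side. Nothing in toy_attach_category can touch ro->instance_size, which is the toy version of why a category can contribute methods but not storage.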

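Apple does this to protect the contiguous memory layout that a class is given at compile time, so that variables or methods added at runtime cannot cause memory to overlap.

Continuing with the objc_msgSend call path: using the method_list_t and related data reached through the isa pointer, we can obtain the matching IMP directly, call the function, and store it in the cache table.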

All of this assumes the function can be called successfully. What if no IMP is found? The runtime then triggers another mechanism: message forwarding.
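
The overall flow (cache hit, otherwise method-list scan and cache fill, otherwise a forwarding entry point) can be sketched as below. This is a toy under stated assumptions, not the runtime's code: all toy_ names are hypothetical, the cache holds a single entry, selectors are strings, and the superclass search and method resolution are omitted.

```c
#include <stddef.h>
#include <string.h>

typedef int (*toy_IMP)(void);

struct toy_method { const char *name; toy_IMP imp; };

struct toy_class {
    const struct toy_method *methods;  /* stands in for method_list_t */
    size_t method_count;
    struct toy_method cached;          /* one-entry cache, for brevity */
    int slow_lookups;                  /* counts method-list scans */
};

/* Sentinel IMP returned when no implementation exists. */
int toy_forward(void) { return -1; }

toy_IMP toy_lookup_or_forward(struct toy_class *cls, const char *sel) {
    /* 1. Fast path: cache hit. */
    if (cls->cached.name && strcmp(cls->cached.name, sel) == 0) return cls->cached.imp;
    /* 2. Slow path: scan the method list. */
    cls->slow_lookups++;
    for (size_t i = 0; i < cls->method_count; i++) {
        if (strcmp(cls->methods[i].name, sel) == 0) {
            cls->cached = cls->methods[i];  /* 3. Fill the cache for next time. */
            return cls->methods[i].imp;
        }
    }
    /* 4. Nothing found: hand back the forwarding entry point. */
    return toy_forward;
}

/* Same idea as lookUpImpOrNil: turn the forwarding sentinel into NULL. */
toy_IMP toy_lookup_or_nil(struct toy_class *cls, const char *sel) {
    toy_IMP imp = toy_lookup_or_forward(cls, sel);
    return imp == toy_forward ? NULL : imp;
}

/* A sample method implementation. */
int toy_hello(void) { return 7; }
```

Looking up the same selector twice scans the method list only once, since the second call is served from the cache. An unknown selector yields toy_forward from one function and NULL from the other, which is the same relationship lookUpImpOrForward and lookUpImpOrNil have in the excerpt shown earlier.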

There are many more details in the runtime's method-call source. My time and ability are limited, so I will fill them in gradually!
