iOS Internals: alloc, init, and new Explained

Question: in iOS development, the code we write most often is probably the following

Car *car = [[Car alloc] init];

Car *car = [Car new];

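These are the methods every object creation goes through. But do you know how they differ and what each one actually does? Why are alloc and init used together? What is an object, really?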

Let's answer these questions by reading the source code of these methods.

1. Preparation

Source download: objc4-781
Building the source: see the tutorial on compiling and debugging the objc4 source

2. Source analysis

2.1 alloc source

With the source in hand, the rest is easy. Create a Person class and call its alloc method. main.m looks like this:

#import <Foundation/Foundation.h>
#import "Person.h"

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        // insert code here...
        Person *objc = [Person alloc];
        NSLog(@"Hello, World! %@", objc);
    }
    return 0;
}

Now hold Command and click alloc to jump into its implementation.
[Step 1] The alloc implementation in NSObject.mm. alloc calls a function named _objc_rootAlloc and passes in the current class, Person.

+ (id)alloc {
    return _objc_rootAlloc(self);
}

[Step 2] _objc_rootAlloc in turn calls callAlloc(), again passing in the Person class.

id _objc_rootAlloc(Class cls)
{
    return callAlloc(cls, false/*checkNil*/, true/*allocWithZone*/);
}

[Step 3] Inside callAlloc

static ALWAYS_INLINE id
callAlloc(Class cls, bool checkNil, bool allocWithZone=false)
{
#if __OBJC2__
    if (slowpath(checkNil && !cls)) return nil;
    if (fastpath(!cls->ISA()->hasCustomAWZ())) {
        return _objc_rootAllocWithZone(cls, nil);
    }
#endif

    // No shortcuts available.
    if (allocWithZone) {
        return ((id(*)(id, SEL, struct _NSZone *))objc_msgSend)(cls, @selector(allocWithZone:), nil);
    }
    return ((id(*)(id, SEL))objc_msgSend)(cls, @selector(alloc));
}

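At this point the function branches, so the control flow is a little harder to follow. Since we compiled the source ourselves, a breakpoint quickly shows that execution goes into _objc_rootAllocWithZone().
Why does it take the _objc_rootAllocWithZone() path?
The fast path is taken when cls->ISA()->hasCustomAWZ() is false, that is, when neither the class nor its superclasses override alloc / allocWithZone:. Person does not, so the runtime skips the allocWithZone: message send that older versions of NSObject always performed and calls _objc_rootAllocWithZone() directly.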
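The branch above can be sketched in plain C++. This is only an illustration of the decision's shape, not the runtime's code: FakeClass, AllocPath and chooseAllocPath are hypothetical names, and the two macros are written in the style of the runtime's branch-prediction hints, assuming a GCC or Clang compiler.

```cpp
#include <cassert>

// Branch-prediction hints (GCC/Clang builtin). They do not change the result
// of the condition, they only tell the compiler which outcome is expected.
#define fastpath(x) (__builtin_expect(bool(x), 1))
#define slowpath(x) (__builtin_expect(bool(x), 0))

// Hypothetical stand-in for a class: it only records whether the class
// overrides +alloc / +allocWithZone:.
struct FakeClass {
    bool hasCustomAWZ;
};

enum class AllocPath { NilClass, Direct, MessageSend };

// Mirrors the order of the checks in callAlloc, without doing any real work.
inline AllocPath chooseAllocPath(const FakeClass *cls, bool checkNil) {
    if (slowpath(checkNil && !cls)) return AllocPath::NilClass;
    if (fastpath(!cls->hasCustomAWZ)) return AllocPath::Direct;
    return AllocPath::MessageSend;
}
```

A class with no custom allocation methods lands on AllocPath::Direct, which corresponds to the _objc_rootAllocWithZone() call we saw at the breakpoint.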
[Step 4] Next, _objc_rootAllocWithZone() takes us into the file objc-runtime-new.mm

id
_objc_rootAllocWithZone(Class cls, malloc_zone_t *zone __unused)
{
    // allocWithZone under __OBJC2__ ignores the zone parameter
    return _class_createInstanceFromZone(cls, 0, nil,
                                         OBJECT_CONSTRUCT_CALL_BADALLOC);
}

[Step 5] _objc_rootAllocWithZone() calls _class_createInstanceFromZone(), the core of the alloc source. This is the code that answers the questions raised at the start of the article.

static ALWAYS_INLINE id
_class_createInstanceFromZone(Class cls, size_t extraBytes, void *zone,
                              int construct_flags = OBJECT_CONSTRUCT_NONE,
                              bool cxxConstruct = true,
                              size_t *outAllocatedSize = nil)
{
    ASSERT(cls->isRealized());

    // Does Person, or any of its superclasses, have a C++ constructor or destructor?
    bool hasCxxCtor = cxxConstruct && cls->hasCxxCtor();
    bool hasCxxDtor = cls->hasCxxDtor();
    // Can instances of this class use a nonpointer isa?
    bool fast = cls->canAllocNonpointer();
    size_t size;
    // Compute the size the object needs; extraBytes == 0 here, passed in by the caller
    size = cls->instanceSize(extraBytes);
    // Report the computed size through the out-pointer; not needed to follow the main flow
    if (outAllocatedSize) *outAllocatedSize = size;

    id obj;
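    // Check whether a zone was supplied; earlier versions allocated from a zone
    if (zone) {
        obj = (id)malloc_zone_calloc((malloc_zone_t *)zone, 1, size);
    } else {
        // This is where alloc actually allocates the memory
        obj = (id)calloc(1, size);
    }
    // Allocation-failure path: as the name suggests, the bad-alloc handler is called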
    if (slowpath(!obj)) {
        if (construct_flags & OBJECT_CONSTRUCT_CALL_BADALLOC) {
            return _objc_callBadAllocHandler(cls);
        }
        return nil;
    }

    if (!zone && fast) {
        // Bind the class to the object by initializing its isa
        obj->initInstanceIsa(cls, hasCxxDtor);
    } else {
        // Use raw pointer isa on the assumption that they might be
        // doing something weird with the zone or RR.
        // Fallback (zone supplied or nonpointer isa unavailable): store a raw class pointer
        obj->initIsa(cls);
    }
    
    // No C++ constructor: return the object, whose isa now points to its class
    if (fastpath(!hasCxxCtor)) {
        return obj;
    }
    // Otherwise run the C++ construction path
    construct_flags |= OBJECT_CONSTRUCT_FREE_ONFAILURE;
    return object_cxxConstructFromClass(obj, cls, construct_flags);
}

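I have annotated the code above. Reading it, we find three important operations:

cls->instanceSize(): computes how much memory needs to be allocated.
calloc(): requests the memory and returns a pointer to it.
obj->initInstanceIsa(): binds the allocated memory to its class by initializing the isa.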
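To make the three steps concrete, here is a heavily simplified sketch in plain C++. FakeClass, FakeObject and fakeAlloc are hypothetical stand-ins invented for illustration; the real runtime types carry far more state, and the real isa is usually a packed bit field rather than a plain pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for objc_class: just a name and an instance size.
struct FakeClass {
    const char *name;
    std::size_t instanceSize;   // bytes needed by one instance, isa included
};

// Hypothetical stand-in for objc_object: the first word is the isa.
struct FakeObject {
    const FakeClass *isa;
};

inline FakeObject *fakeAlloc(const FakeClass *cls) {
    // Step 1: compute the size (at least 16 bytes, as instanceSize() enforces).
    std::size_t size = cls->instanceSize < 16 ? 16 : cls->instanceSize;
    // Step 2: allocate zero-filled memory.
    void *mem = std::calloc(1, size);
    if (!mem) return nullptr;
    // Step 3: bind the memory to its class by writing the isa.
    return new (mem) FakeObject{cls};
}
```

The caller owns the returned block and releases it with std::free(). Until step 3 runs, the block is just zeroed bytes with no type identity, which is why the isa write is what turns raw memory into an object.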

Let's look at each of these three functions in turn.

First, size = cls->instanceSize(extraBytes);. Here cls is the Person class that was passed in earlier, so we are now inside objc_class (see the class data structures in the runtime for details).

size_t instanceSize(size_t extraBytes) const {
        // Can the size be looked up quickly (is there a cached value)?
        if (fastpath(cache.hasFastInstanceSize(extraBytes))) {
            return cache.fastInstanceSize(extraBytes);
        }

        // No cached value: compute it with alignedInstanceSize()
        size_t size = alignedInstanceSize() + extraBytes;
        // CF requires all objects be at least 16 bytes.
        // The minimum size is 16 bytes; anything smaller is padded up
        if (size < 16) size = 16;
        return size;
    }

Stepping through with a breakpoint shows that cache.fastInstanceSize() is called here. cache is a cache_t struct declared in objc-runtime-new.h.

size_t fastInstanceSize(size_t extra) const
    {
        ASSERT(hasFastInstanceSize(extra));

        // __builtin_constant_p is a GCC builtin that tests whether a value is a compile-time constant: it returns 1 if the argument is constant, 0 otherwise
        if (__builtin_constant_p(extra) && extra == 0) {
            return _flags & FAST_CACHE_ALLOC_MASK16;
        } else {
            // The size the class needs is already stored in _flags; masking with a constant extracts it
            size_t size = _flags & FAST_CACHE_ALLOC_MASK;
            // remove the FAST_CACHE_ALLOC_DELTA16 that was added
            // by setFastInstanceSize
            // Subtract the 8 bytes of FAST_CACHE_ALLOC_DELTA16 that setFastInstanceSize added,
            // then align to 16 bytes, i.e. round up to a multiple of 16
            return align16(size + extra - FAST_CACHE_ALLOC_DELTA16);
        }
    }
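Why add a delta of 8 when storing the size, only to subtract it again? A small sketch makes the trick visible. The constant values below are assumptions written from memory of objc4-781, so check them against your copy of objc-runtime-new.h; packSize, fastSize and fastSizeWithExtra are hypothetical helper names, and the store side is simplified compared with setFastInstanceSize.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed values of the runtime constants (verify against your source tree).
constexpr std::uint16_t kAllocMask    = 0x1ff8; // FAST_CACHE_ALLOC_MASK
constexpr std::uint16_t kAllocMask16  = 0x1ff0; // FAST_CACHE_ALLOC_MASK16
constexpr std::uint16_t kAllocDelta16 = 0x0008; // FAST_CACHE_ALLOC_DELTA16

constexpr std::size_t align16(std::size_t x) {
    return (x + std::size_t(15)) & ~std::size_t(15);
}

// Store side (simplified): pack an 8-byte-aligned size plus the delta
// into the size bits of the flags.
constexpr std::uint16_t packSize(std::size_t wordAlignedSize) {
    return static_cast<std::uint16_t>((wordAlignedSize + kAllocDelta16) & kAllocMask);
}

// Load side, extra == 0: a single mask already yields the 16-byte-aligned size.
constexpr std::size_t fastSize(std::uint16_t flags) {
    return flags & kAllocMask16;
}

// Load side, general case: undo the delta, add extra, then round up to 16.
constexpr std::size_t fastSizeWithExtra(std::uint16_t flags, std::size_t extra) {
    return align16((flags & kAllocMask) + extra - kAllocDelta16);
}

static_assert(fastSize(packSize(8)) == 16, "8 + 8 = 16, masked to 16");
static_assert(fastSize(packSize(16)) == 16, "16 + 8 = 24, masked down to 16");
static_assert(fastSize(packSize(24)) == 32, "24 + 8 = 32, masked to 32");
```

For a size s that is already a multiple of 8, (s + 8) with the low four bits cleared equals s rounded up to 16, so in the common case of extra == 0 the aligned size comes out of one AND instruction.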

Here is the implementation of align16(), an optimized rounding operation. Its result is the amount of memory we are going to allocate.

static inline size_t align16(size_t x) {
    return (x + size_t(15)) & ~size_t(15);
}
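To see what the bit trick does, the sketch below puts the same expression next to a plain division-based version and checks that they agree. align16_slow and agreesUpTo are hypothetical helpers added only for comparison.

```cpp
#include <cassert>
#include <cstddef>

// Same expression as the runtime's align16: round x up to a multiple of 16.
// Adding 15 pushes any non-multiple past the next boundary, and clearing the
// low four bits drops it back onto that boundary.
constexpr std::size_t align16(std::size_t x) {
    return (x + std::size_t(15)) & ~std::size_t(15);
}

// Reference version using division, for comparison with the bit trick.
constexpr std::size_t align16_slow(std::size_t x) {
    return ((x + 15) / 16) * 16;
}

// Checks that both versions agree for every value in [0, limit].
inline bool agreesUpTo(std::size_t limit) {
    for (std::size_t x = 0; x <= limit; ++x) {
        if (align16(x) != align16_slow(x)) return false;
    }
    return true;
}

static_assert(align16(1) == 16, "1 rounds up to 16");
static_assert(align16(16) == 16, "multiples of 16 are unchanged");
static_assert(align16(17) == 32, "17 rounds up to 32");
```

The mask form works only because 16 is a power of two; it replaces a division and a multiplication with an add and an AND.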

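Now let's look at alignedInstanceSize(), the path taken when no cached size is available. It does two things: it computes the size occupied by an instance of Person, then applies byte alignment to that size.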

uint32_t alignedInstanceSize() const {
        // Step 1: compute how much memory the object occupies
        // Step 2: align that size; here the alignment is 8 bytes
        return word_align(unalignedInstanceSize());
    }

unalignedInstanceSize() is simple: it follows data()->ro() down to the instanceSize field. This also tells us that instanceSize has already been computed by the time the class is realized.

uint32_t unalignedInstanceSize() const {
        ASSERT(isRealized());
        return data()->ro()->instanceSize;
    }
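Putting the slow path together: take the unaligned size, align it to 8 bytes, add the extra bytes, and enforce the 16-byte minimum. The sketch below assumes a 64-bit target where the word mask is 7; wordAlign and slowInstanceSize are hypothetical names that follow the shape of word_align() and the slow branch of instanceSize().

```cpp
#include <cassert>
#include <cstddef>

// Assumes a 64-bit target, where one word is 8 bytes and the mask is 7.
constexpr std::size_t kWordMask = 7;

// Round x up to a multiple of the word size (8 bytes here).
constexpr std::size_t wordAlign(std::size_t x) {
    return (x + kWordMask) & ~kWordMask;
}

// Slow-path shape of instanceSize(): align to 8, add extra bytes, floor at 16.
inline std::size_t slowInstanceSize(std::size_t unalignedSize, std::size_t extraBytes) {
    std::size_t size = wordAlign(unalignedSize) + extraBytes;
    if (size < 16) size = 16;
    return size;
}
```

For example, an object that holds only an isa has an unaligned size of 8 and ends up at 16; an isa plus one 4-byte int is 12, which aligns to 16; an isa plus two pointers is 24 and stays 24 on this path.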

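Second comes calloc(), which requests the memory; it is implemented in libmalloc. It allocates a zero-filled block of size bytes and the result is assigned to obj, so obj is a pointer to that block of memory.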

obj = (id)calloc(1, size);
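One property of calloc() that matters here is that the block comes back zero-filled, so every instance variable starts out as 0 or nil before any initializer runs. A minimal check in plain C++, where callocIsZeroFilled is a hypothetical helper written for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Returns true when every byte of a fresh calloc block of `size` bytes is zero.
inline bool callocIsZeroFilled(std::size_t size) {
    unsigned char *block = static_cast<unsigned char *>(std::calloc(1, size));
    if (!block) return false;   // allocation failed, nothing to inspect
    bool allZero = true;
    for (std::size_t i = 0; i < size; ++i) {
        if (block[i] != 0) {
            allZero = false;
            break;
        }
    }
    std::free(block);
    return allZero;
}
```

malloc() gives no such guarantee, which is presumably one reason the runtime uses calloc() for object storage.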

Last is obj->initInstanceIsa(), which binds the allocated memory to its class. To see how, step into it: we land in objc-object.h, which also tells us that the obj from the previous step is of type objc_object.

inline void 
objc_object::initInstanceIsa(Class cls, bool hasCxxDtor)
{
    ASSERT(!cls->instancesRequireRawIsa());
    ASSERT(hasCxxDtor == cls->hasCxxDtor());

    initIsa(cls, true, hasCxxDtor);
}

initInstanceIsa() in turn calls initIsa(cls, true, hasCxxDtor);

inline void 
objc_object::initIsa(Class cls, bool nonpointer, bool hasCxxDtor) 
{ 
    ASSERT(!isTaggedPointer()); 
    // Check whether this is a nonpointer isa. The breakpoint shows our object uses one, so the else branch runs
    if (!nonpointer) {
        // Raw isa: initialize the isa_t union directly from the class pointer
        isa = isa_t((uintptr_t)cls);
    } else {
        ASSERT(!DisableNonpointerIsa);
        ASSERT(!cls->instancesRequireRawIsa());
        // Main path
        // Declare a new isa_t variable, newisa, then fill in its fields
        isa_t newisa(0);

#if SUPPORT_INDEXED_ISA
        ASSERT(cls->classArrayIndex() > 0);
        newisa.bits = ISA_INDEX_MAGIC_VALUE;
        // isa.magic is part of ISA_MAGIC_VALUE
        // isa.nonpointer is part of ISA_MAGIC_VALUE
        newisa.has_cxx_dtor = hasCxxDtor;
        newisa.indexcls = (uintptr_t)cls->classArrayIndex();
#else
        newisa.bits = ISA_MAGIC_VALUE;
        // isa.magic is part of ISA_MAGIC_VALUE
        // isa.nonpointer is part of ISA_MAGIC_VALUE
        newisa.has_cxx_dtor = hasCxxDtor;
        // Key point: shiftcls stores the class pointer, shifted right by 3. It occupies bits 3 to 35 of newisa on arm64 (bits 3 to 46 on x86_64)
        newisa.shiftcls = (uintptr_t)cls >> 3;
#endif

        // This write must be performed in a single store in some cases
        // (for example when realizing a class because other threads
        // may simultaneously try to use the class).
        // fixme use atomics here to guarantee single-store and to
        // guarantee memory order w.r.t. the class index table
        // ...but not too atomic because we don't want to hurt instantiation
        // Store the finished value into the object's isa
        isa = newisa;
    }
}
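The shiftcls assignment is easier to follow with explicit shifts and masks. The sketch below assumes the arm64 layout, where the class pointer occupies bits 3 to 35 and the mask is 0x0000000ffffffff8; packIsa and classFromIsa are hypothetical helpers, and the magic and reference-count fields are left out on purpose.

```cpp
#include <cassert>
#include <cstdint>

// Assumed arm64 layout: bit 0 = nonpointer, bit 2 = has_cxx_dtor,
// bits 3..35 = class pointer (shiftcls). Other fields are omitted here.
constexpr std::uint64_t kIsaMask       = 0x0000000ffffffff8ULL;
constexpr std::uint64_t kNonpointerBit = 1ULL << 0;
constexpr std::uint64_t kHasCxxDtorBit = 1ULL << 2;

// Pack: the same effect as `shiftcls = cls >> 3` placed at bit 3 of the word.
// A class address is at least 8-byte aligned, so its low 3 bits are zero
// and can be reused for flags without losing information.
constexpr std::uint64_t packIsa(std::uint64_t cls, bool hasCxxDtor) {
    return (((cls >> 3) << 3) & kIsaMask)
         | kNonpointerBit
         | (hasCxxDtor ? kHasCxxDtorBit : 0ULL);
}

// Unpack: masking the isa word recovers the original class address.
constexpr std::uint64_t classFromIsa(std::uint64_t isa) {
    return isa & kIsaMask;
}
```

This is why reading an object's class from a nonpointer isa takes a mask first: the raw isa word is no longer a valid pointer by itself.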

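This is the assignment to isa: the class object's address is stored in bits 3 to 35 of the isa on arm64 (bits 3 to 46 on x86_64); see an introduction to nonpointer isa for details. To confirm this with the debugger, set a breakpoint, let initInstanceIsa finish, then run po obj, which now prints an object pointer.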


Summary:
From the alloc source we learn that alloc creates an object in three core steps: compute the size -> allocate the memory -> bind the memory to its class

2.2 init source

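With alloc covered, let's look at the init source. Surprisingly, init exists not only as an instance method but also as a class method, + (id)init, so let's look at the class method first.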

+ (id)init {
    return (id)self;
}

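I tried calling it directly and got an error saying it cannot be called this way. Um ... sorry to bother you, let's pretend I never saw it.
Let's look at the init instance method instead.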

- (id)init {
    return _objc_rootInit(self);
}

id
_objc_rootInit(id obj)
{
    // In practice, it will be hard to rely on this function.
    // Many classes do not properly chain -init calls.
    return obj;
}

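It still just returns self, so NSObject's init does no work of its own. It exists purely as an initialization entry point: subclasses override it to run whatever setup an object needs when it is initialized.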

2.3 new source

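In everyday development, besides init we can also create objects with new, and the two approaches are essentially the same. Below is the implementation of new in objc. It calls callAlloc directly (the function analyzed in the alloc section) and then calls init, so we can conclude that new is equivalent to [[Person alloc] init].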

+ (id)new {
    return [callAlloc(self, false/*checkNil*/) init];
}

3. Summary

This article explored the internal implementation of alloc, init, and new from the source code.

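alloc: computes the size of the memory to allocate, allocates it, and binds it to the class.
init: does no work in NSObject; it is simply the initialization entry point the system provides for us.
new: equivalent to [[Class alloc] init]; it can be read as shorthand for alloc followed by init.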