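iOS Method Caching: cache

1. Structure of cache

  • We previously explored the structure of Class and its members, and learned what isa, superClass, and bits do. For the remaining member, cache, all we know so far is that it stores key/imp pairs; what it is actually for is still unclear.
  • First, cache is a cache_t struct, defined in objc-runtime-new.h in the objc source. Its full definition is below.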
struct cache_t {
    struct bucket_t *_buckets;
    mask_t _mask;
    mask_t _occupied;

public:
    struct bucket_t *buckets();
    mask_t mask();
    mask_t occupied();
    void incrementOccupied();
    void setBucketsAndMask(struct bucket_t *newBuckets, mask_t newMask);
    void initializeToEmpty();

    mask_t capacity();
    bool isConstantEmptyCache();
    bool canBeFreed();

    static size_t bytesForCapacity(uint32_t cap);
    static struct bucket_t * endMarker(struct bucket_t *b, uint32_t cap);

    void expand();
    void reallocate(mask_t oldCapacity, mask_t newCapacity);
    struct bucket_t * find(cache_key_t key, id receiver);

    static void bad_cache(id receiver, SEL sel, Class isa) __attribute__((noreturn));
};
  • cache_t has three members: _mask and _occupied, both of type mask_t, and a pointer to a bucket_t struct.
  • mask_t is an unsigned integer type; on 64-bit it is uint32_t.
  • bucket_t holds an imp and a key.
#if __LP64__
typedef uint32_t mask_t;  // x86_64 & arm64 asm are less efficient with 16-bits
#else
typedef uint16_t mask_t;
#endif

struct bucket_t {
private:
    // IMP-first is better for arm64e ptrauth and no worse for arm64.
    // SEL-first is better for armv7* and i386 and x86_64.
#if __arm64__
    MethodCacheIMP _imp;
    cache_key_t _key;
#else
    cache_key_t _key;
    MethodCacheIMP _imp;
#endif

public:
    inline cache_key_t key() const { return _key; }
    inline IMP imp() const { return (IMP)_imp; }
    inline void setKey(cache_key_t newKey) { _key = newKey; }
    inline void setImp(IMP newImp) { _imp = newImp; }

    void set(cache_key_t newKey, IMP newImp);
};

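2. What cache does

  • From the name, one can guess that cache is some kind of cache. And since an imp is the address a method call jumps to, a reasonable guess is that cache exists to cache methods and speed up later calls.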

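3. Verifying cache

  • Still in our source project, create a new class and call its sayHello method. Following the same approach as before, print the bucket contents in the lldb console: the bucket does indeed hold the imp of sayHello.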
2019-12-25 00:39:22.566292+0800 LGTest[3586:42169] LGPerson say : -[LGPerson sayHello]
(lldb) x/4gx pClass
0x1000012e0: 0x001d8001000012b9 0x0000000100b36140
0x1000012f0: 0x0000000101e23c20 0x0000000100000003
(lldb) p (cache_t *)0x1000012f0
(cache_t *) $1 = 0x00000001000012f0
(lldb) p *$1
(cache_t) $2 = {
  _buckets = 0x0000000101e23c20
  _mask = 3
  _occupied = 1
}
(lldb) p $2._buckets
(bucket_t *) $3 = 0x0000000101e23c20
(lldb) p *$3
(bucket_t) $4 = {
  _key = 4294971020
  _imp = 0x0000000100000c60 (LGTest`-[LGPerson sayHello] at LGPerson.m:13)
}
(lldb) 
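  • One thing to note: someone may ask why alloc and class were also called but are not cached here. As we saw earlier when exploring where methods are stored, an object's methods live in its class, and a class's class methods are stored as instance methods in the metaclass. Here we are looking at the class's cache, so we can only find the instance method sayHello. Below are the metaclass's cache and bucket, where the cached alloc does show up, which suggests our reasoning is right.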
(lldb) p/x 0x001d8001000012b9 & 0x00007ffffffffff8ULL
(unsigned long long) $5 = 0x00000001000012b8
// 0x00000001000012b8 is the metaclass address; if this is unclear, see my earlier analysis of the isa chain, which covers how to get from a class to its metaclass
(lldb) x/4gx 0x00000001000012b8
0x1000012b8: 0x001d800100b360f1 0x0000000100b360f0
0x1000012c8: 0x0000000101e236c0 0x0000000200000003
(lldb) p (cache_t *)0x1000012c8
(cache_t *) $6 = 0x00000001000012c8
(lldb) p *$6
(cache_t) $7 = {
  _buckets = 0x0000000101e236c0
  _mask = 3
  _occupied = 2
}
(lldb) p $7._buckets
(bucket_t *) $8 = 0x0000000101e236c0
(lldb) p *$8
(bucket_t) $9 = {
  _key = 4298994200
  _imp = 0x00000001003cc3b0 (libobjc.A.dylib`::+[NSObject alloc]() at NSObject.mm:2294)
}
(lldb) 
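
The x/4gx dumps above pack _mask and _occupied into a single 64-bit word right after _buckets. As a minimal sketch of how to read that word by hand, assuming a little-endian 64-bit target where mask_t is uint32_t (splitCounters, maskIsa, and CacheCounters are names made up for this example, not runtime APIs):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type for this sketch; not part of the objc runtime.
struct CacheCounters {
    uint32_t mask;      // low 32 bits of the dumped word  -> _mask
    uint32_t occupied;  // high 32 bits of the dumped word -> _occupied
};

// Split the 64-bit word that follows _buckets in an x/4gx dump of a class.
// Assumption: little-endian layout with a 32-bit mask_t.
inline CacheCounters splitCounters(uint64_t word) {
    return CacheCounters{ static_cast<uint32_t>(word & 0xffffffffu),
                          static_cast<uint32_t>(word >> 32) };
}

// Apply the same mask that was typed into lldb above to an isa value.
inline uint64_t maskIsa(uint64_t isa) {
    return isa & 0x00007ffffffffff8ULL;
}

// Check the helpers against values copied from the lldb session above.
inline void checkAgainstSession() {
    CacheCounters cls = splitCounters(0x0000000100000003ULL);
    assert(cls.mask == 3 && cls.occupied == 1);   // matches p *$1
    CacheCounters meta = splitCounters(0x0000000200000003ULL);
    assert(meta.mask == 3 && meta.occupied == 2); // matches p *$6
}
```

Calling checkAgainstSession() passes silently, and maskIsa(0x001d8001000012b9) gives 0x00000001000012b8, the metaclass address used in the session.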

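4. The cache policy

4.1 Confirming that the cache follows a policy
  • Now let's call a few more instance methods and look at cache and buckets again.
  • We called four instance methods in turn: init, sayHello, sayCode, and sayNB. By our guess, cache should now hold all four. Printing it shows that mask did grow as expected, from 3 to 7, but among the buckets only _buckets[2] holds a method, the most recently called sayNB. Every other slot we printed is empty.

  • So we can infer that the cache is not filled blindly: once some condition is met, some kind of optimization kicks in.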

2019-12-25 00:57:52.143504+0800 LGTest[3662:48762] LGPerson say : -[LGPerson sayHello]
2019-12-25 00:57:52.144031+0800 LGTest[3662:48762] LGPerson say : -[LGPerson sayCode]
2019-12-25 00:57:52.144133+0800 LGTest[3662:48762] LGPerson say : -[LGPerson sayNB]
(lldb) x/4gx pClass
0x1000012e8: 0x001d8001000012c1 0x0000000100b36140
0x1000012f8: 0x0000000101029950 0x0000000100000007
(lldb) p (cache_t *)0x1000012f8
(cache_t *) $1 = 0x00000001000012f8
(lldb) p *$1
(cache_t) $2 = {
  _buckets = 0x0000000101029950
  _mask = 7
  _occupied = 1
}
(lldb) p $2._buckets
(bucket_t *) $3 = 0x0000000101029950
(lldb) p *$3
(bucket_t) $4 = {
  _key = 0
  _imp = 0x0000000000000000
}
(lldb) p $2._buckets[0]
(bucket_t) $5 = {
  _key = 0
  _imp = 0x0000000000000000
}
(lldb) p $2._buckets[1]
(bucket_t) $6 = {
  _key = 0
  _imp = 0x0000000000000000
}
(lldb) p $2._buckets[2]
(bucket_t) $7 = {
  _key = 4294971026
  _imp = 0x0000000100000ce0 (LGTest`-[LGPerson sayNB] at LGPerson.m:25)
}
(lldb) p $2._buckets[3]
(bucket_t) $8 = {
  _key = 0
  _imp = 0x0000000000000000
}
(lldb) p $2._buckets[5]
(bucket_t) $9 = {
  _key = 0
  _imp = 0x0000000000000000
}
(lldb) p $2._buckets[6]
(bucket_t) $10 = {
  _key = 0
  _imp = 0x0000000000000000
}
(lldb) p $2._buckets[7]
(bucket_t) $11 = {
  _key = 0
  _imp = 0x0000000000000000
}
4.2 Finding the caching policy
  • So back to the source. Since the value of mask grew, start with the mask_t mask() method of cache_t, which turns out to simply return _mask.
mask_t cache_t::mask() 
{
    return _mask; 
}
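  • Searching further for mask(), we find it used in the capacity method, though the intent is not obvious at first glance: it returns mask() + 1 when mask is nonzero, and 0 otherwise.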
mask_t cache_t::capacity() 
{
    return mask() ? mask()+1 : 0; 
}
  • Next, search for callers of capacity. It is called inside the growth method expand: if oldCapacity is 0, the cache is initialized with INIT_CACHE_SIZE (1 << 2, i.e. 4); otherwise newCapacity is twice oldCapacity. That is the growth logic.
enum {
    INIT_CACHE_SIZE_LOG2 = 2,
    INIT_CACHE_SIZE      = (1 << INIT_CACHE_SIZE_LOG2) // i.e. 4
};

void cache_t::expand()
{
    cacheUpdateLock.assertLocked();
    
    uint32_t oldCapacity = capacity();
    uint32_t newCapacity = oldCapacity ? oldCapacity*2 : INIT_CACHE_SIZE;

    if ((uint32_t)(mask_t)newCapacity != newCapacity) {
        // mask overflow - can't grow further
        // fixme this wastes one bit of mask
        newCapacity = oldCapacity;
    }

    reallocate(oldCapacity, newCapacity);
}
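
To make the growth rule and its overflow guard concrete, here is a small sketch that mirrors the arithmetic in expand. The mask type is a template parameter only so that the guard can be triggered with a 16-bit type; nextCapacity, MaskT, and kInitCacheSize are names invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Same starting size as INIT_CACHE_SIZE in the listing above.
constexpr uint32_t kInitCacheSize = 1u << 2;

// Mirrors the capacity arithmetic of cache_t::expand().
// MaskT stands in for mask_t; nextCapacity is a hypothetical name.
template <typename MaskT>
uint32_t nextCapacity(uint32_t oldCapacity) {
    // Capacities in this scheme are 0 or a power of two.
    assert(oldCapacity == 0 || (oldCapacity & (oldCapacity - 1)) == 0);

    uint32_t newCapacity = oldCapacity ? oldCapacity * 2 : kInitCacheSize;
    if (static_cast<uint32_t>(static_cast<MaskT>(newCapacity)) != newCapacity) {
        // The doubled value no longer fits in the mask type: stop growing.
        newCapacity = oldCapacity;
    }
    return newCapacity;
}
```

With a 32-bit mask type, nextCapacity<uint32_t>(0) is 4 and nextCapacity<uint32_t>(4) is 8. With a 16-bit mask type, nextCapacity<uint16_t>(32768) stays at 32768 because 65536 does not fit in uint16_t.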
  • Next, find where and under what condition expand is called. Inside cache_fill_nolock, if newOccupied is greater than 3/4 of capacity, the cache is expanded. cache->capacity() returns the current capacity (0, or mask + 1).
static void cache_fill_nolock(Class cls, SEL sel, IMP imp, id receiver)
{
     // (code omitted)

    // Make sure the entry wasn't added to the cache by some other thread 
    // before we grabbed the cacheUpdateLock.
    if (cache_getImp(cls, sel)) return; // already cached: return without filling again

    cache_t *cache = getCache(cls);
    cache_key_t key = getKey(sel);

    // Use the cache as-is if it is less than 3/4 full
    mask_t newOccupied = cache->occupied() + 1;
    mask_t capacity = cache->capacity();
    if (cache->isConstantEmptyCache()) {
        // Cache is read-only. Replace it.
        cache->reallocate(capacity, capacity ?: INIT_CACHE_SIZE);
    }
    else if (newOccupied <= capacity / 4 * 3) {
        // Cache is less than 3/4 full. Use it as-is.
    }
    else {
        // Cache is too full. Expand it.
        cache->expand();
    }

    // Scan for the first unused slot and insert there.
    // There is guaranteed to be an empty slot because the 
    // minimum size is 4 and we resized at 3/4 full.
    bucket_t *bucket = cache->find(key, receiver);
    if (bucket->key() == 0) cache->incrementOccupied();
    bucket->set(key, imp);
}
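  • This still does not explain why the buckets hold only sayNB. Look at the last line of expand, the call to reallocate(oldCapacity, newCapacity). reallocate first allocates newBuckets with newCapacity, then installs the new buckets and mask, and finally frees oldBuckets. The old entries are not copied over. I take replacing the table outright, rather than appending to or modifying oldBuckets, to be mainly for safety and speed; the comment in the source itself says it is thought to save cache memory at the cost of extra cache fills.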
void cache_t::reallocate(mask_t oldCapacity, mask_t newCapacity)
{
    bool freeOld = canBeFreed();

    bucket_t *oldBuckets = buckets();
    bucket_t *newBuckets = allocateBuckets(newCapacity);

    // Cache's old contents are not propagated. 
    // This is thought to save cache memory at the cost of extra cache fills.
    // fixme re-measure this

    assert(newCapacity > 0);
    assert((uintptr_t)(mask_t)(newCapacity-1) == newCapacity-1);

    // mask = capacity - 1; capacity is a power of two, so the mask has all of its low bits set
    setBucketsAndMask(newBuckets, newCapacity - 1);
    
    if (freeOld) {
        cache_collect_free(oldBuckets, oldCapacity);
        cache_collect(false);
    }
}
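
Why the mask is capacity - 1 becomes clearer with a sketch of how a key could be mapped to a slot. The listing above does not show cache_t::find, so the hash below (key & mask) is an assumption; it does agree with the lldb output in this article, where the sayNB key 4294971026 ends up in _buckets[2] with a mask of 7. slotFor is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Assumed hash for this sketch: the low bits of the key select the slot.
// The real lookup code is not shown in this article.
inline uint32_t slotFor(uint64_t key, uint32_t mask) {
    // The AND can reach every slot only if mask is (power of two) - 1.
    assert(((static_cast<uint64_t>(mask) + 1) & mask) == 0);
    return static_cast<uint32_t>(key & mask);
}
```

For the keys printed earlier, slotFor(4294971026, 7) is 2 and slotFor(4294971020, 3) is 0, matching the slots where lldb showed sayNB and sayHello.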
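  • In cache_fill_nolock above, the newest imp and key are stored only after expand has run, which explains why only the latest method, sayNB, is left in the cache. This is similar in spirit to LRU, in that the recently called method is what remains, although strictly speaking the whole table is discarded on growth rather than a single least recently used entry being evicted.
    bucket_t *bucket = cache->find(key, receiver);
    if (bucket->key() == 0) cache->incrementOccupied();
    bucket->set(key, imp);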
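
Putting the pieces of this section together, the counters seen in lldb can be reproduced with a toy model. It tracks only capacity and occupied, not the buckets, and it assumes occupied is reset to zero whenever the table is reallocated, which is what the lldb output suggests but is not shown in the listings. ToyCache and fill are names made up for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the bookkeeping in cache_fill_nolock (counters only).
struct ToyCache {
    uint32_t capacity = 0;   // 0 models the shared empty cache
    uint32_t occupied = 0;

    uint32_t mask() const { return capacity ? capacity - 1 : 0; }

    // Record one newly cached selector, growing first if needed.
    void fill() {
        uint32_t newOccupied = occupied + 1;
        if (capacity == 0) {
            capacity = 4;                       // INIT_CACHE_SIZE
            occupied = 0;
        } else if (newOccupied <= capacity / 4 * 3) {
            // Under the 3/4 threshold: keep the table as it is.
        } else {
            capacity *= 2;                      // expand()
            occupied = 0;                       // assumed: old entries dropped
        }
        ++occupied;                             // the new entry goes in last
        assert(occupied <= capacity);
    }
};
```

Four calls to fill() on a fresh ToyCache, standing for init, sayHello, sayCode, and sayNB, leave mask() at 7 and occupied at 1, the same values lldb printed in 4.1.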

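Further notes

  • Finally, on when cache_fill_nolock is called: in the source it is called from cache_fill, and tracing further, cache_fill is called during method lookup. Method lookup in turn involves objc_msgSend, the message-sending mechanism underlying OC. So for now we can take it that message sending first looks for the imp in the cache; if it is found it is called directly, and if not, the method is looked up and then cached.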
void cache_fill(Class cls, SEL sel, IMP imp, id receiver)
{
#if !DEBUG_TASK_THREADS
    mutex_locker_t lock(cacheUpdateLock);
    cache_fill_nolock(cls, sel, imp, receiver);
#else
    _collecting_in_critical();
    return;
#endif
}
/* method lookup */
extern IMP lookUpImpOrNil(Class, SEL, id obj, bool initialize, bool cache, bool resolver);
extern IMP lookUpImpOrForward(Class, SEL, id obj, bool initialize, bool cache, bool resolver);
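
The "check the cache first, otherwise look up and then cache" flow described above can be sketched with ordinary containers. This is only an illustration of the idea, not how the runtime is written: ToyClass, ToyImp, lookUp, and sayHelloImp are all invented for the example, and std::unordered_map stands in for both the method list and cache_t.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

using ToyImp = int (*)();                 // stand-in for IMP

static int sayHelloImp() { return 1; }    // stand-in for -[LGPerson sayHello]

struct ToyClass {
    std::unordered_map<std::string, ToyImp> methods;  // stand-in for the method list
    std::unordered_map<std::string, ToyImp> cache;    // stand-in for cache_t
    int slowLookups = 0;                               // counts cache misses

    ToyImp lookUp(const std::string& sel) {
        auto hit = cache.find(sel);
        if (hit != cache.end()) return hit->second;    // fast path: cached imp

        ++slowLookups;                                 // slow path: search methods
        auto found = methods.find(sel);
        if (found == methods.end()) return nullptr;    // the real runtime does more here
        assert(found->second != nullptr);
        cache[sel] = found->second;                    // fill the cache for next time
        return found->second;
    }
};
```

Looking up the same selector twice takes the slow path only once; the second call is served from the cache.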

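Summary

  • The cache in Class exists to cache methods during message sending and speed up calls. It grows dynamically: when adding an entry would take occupancy past 3/4 of capacity, the capacity is doubled. On growth the old buckets are discarded entirely and replaced with new ones, and then the imp and key of the call that triggered the growth are cached. The effect resembles LRU, since the most recent call is what survives.
  • That's all for this analysis of cache. If anything is lacking, please leave a comment and I will fix it promptly~
  • Learn with humor, never dry~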
