Troubleshooting flaky writes in an mc (memcached) cluster

I recently wrote a few troubleshooting logs on our internal wiki and thought they were worth sharing, so here they are.

1. Symptoms

The business team reported that writes to the memcache cluster were unreliable: when writing a list of creative and ad objects to mc, sometimes the value could be written and read back, and sometimes the write succeeded but the subsequent read returned nothing.

2. Investigation

2.1 Reproducing the problem

  • a. Some keys are always fine: writes and reads both succeed.
  • b. Some keys always write OK but read back None.
  • c. Some keys write OK, while reads sometimes succeed and sometimes return None (a minimal write-then-read check for this is sketched below).
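
A minimal write-then-read loop is enough to reproduce this. The sketch below assumes the python-memcached client and a placeholder address and key names; it writes a ~9.5 KB value (roughly the size of the failing objects) and immediately reads it back, flagging any key whose write succeeds but whose read returns None.

```
# Reproduction sketch: write, then immediately read back, flagging "write ok, read None".
# Assumes the python-memcached client; the address and key names are placeholders.
import memcache

mc = memcache.Client(['10.0.0.1:11211'])
payload = 'x' * 9500                      # ~9.5 KB value, like the problematic objects

for i in range(100):
    key = 'repro_key_%d' % i
    wrote = mc.set(key, payload)
    read_back = mc.get(key)
    if wrote and read_back is None:
        print('write ok but read None: %s' % key)
```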

2.2 Is it a proxy problem?

Reproducing the problem again through the same proxy produced all of the behaviors above, so a proxy issue was ruled out.

We also noticed during the investigation that the affected keys were shorter than 20 B and the values were around 9 KB~10 KB.

2.3 Is it an mc cluster problem?

2.3.1 Node status across the cluster

Status of each node in the cluster, using 10.0.0.1:11211 as an example:

============10.0.0.1:11211
  #  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
  1      96B    604774s     158 1021044      no        0        0    0
  2     120B      3600s       2   10013      no        0        0    0
  3     152B       186s       1       1      no        0        0    0
  4     192B   6157491s       1      69      no        0        0    0
  5     240B   1721785s       1     835      no        0        0    0
  6     304B   1721786s       1    2673      no        0        0    0
  7     384B   1721783s       1      91      no        0        0    0
  8     480B   1721786s       1       6      no        0        0    0
  9     600B    140686s       2       1      no        0        0    0
 10     752B       125s       7      11      no        0        0    0
 11     944B       121s       4     940      no        0        0    0
 12     1.2K       120s       9    4666     yes        3      562    0
 13     1.4K       121s       5    1447     yes     2047      495    0
 14     1.8K       437s     754    1209      no        0        0    0
 15     2.3K     83618s      58     261     yes      575   138922    0
 16     2.8K    172787s     558   80573      no        0        0    0
 17     3.5K    172780s     576  131417     yes    96835   172745    0
 18     4.4K    172788s    2869  169486      no        0        0    0
 19     5.5K      3576s      90   16560     yes  3357047     3577    0
 20     6.9K    118334s       7     988     yes        1    72968    0
 21     8.7K        82s       6     708     yes 12016644       85   88
 22    10.8K         1s       2     188     yes 393841058        1 8640
 23    13.6K         1s       1      75     yes 118541885        1 1153
 24    16.9K        59s       1      60     yes  1262831       60   14
 25    21.2K       338s       1      16      no        0        0    0
 26    26.5K       144s       1       5      no        0        0    0
 27    33.1K        21s       1       1      no        0        0    0
 28    41.4K         5s       1       2      no        0        0    0
 30    64.7K        23s       1       2      no        0        0    0
 31    80.9K         0s       1       0      no        0        0    0

Looking at each node's status, the slabs on the nodes were Full to varying degrees.

Three nodes were in noticeably bad shape, with very large Evicted counts.
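
The table above is an aggregated view of each node's slab statistics. To pull the raw counters yourself, a small sketch over the plain text protocol is enough (the address is a placeholder; `stats slabs` and `stats items` are standard commands):

```
# Sketch: dump raw per-class slab statistics from one node via the text protocol.
import socket

def mc_stats(host, port, cmd):
    s = socket.create_connection((host, port), timeout=3)
    s.sendall((cmd + '\r\n').encode())
    buf = b''
    while not buf.endswith(b'END\r\n'):
        chunk = s.recv(4096)
        if not chunk:
            break
        buf += chunk
    s.close()
    return buf.decode().splitlines()

# Lines look like "STAT <cls>:total_pages <n>" or "STAT items:<cls>:evicted <n>".
for line in mc_stats('10.0.0.1', 11211, 'stats slabs') + mc_stats('10.0.0.1', 11211, 'stats items'):
    print(line)
```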

2.3.2 Reproducing again

Connecting a client directly to individual nodes, reads and writes were fine on some nodes while others reproduced the problem, so the issue was confirmed to be with specific cluster nodes.

The problem is almost certainly tied to the Full state and heavy evictions of the 5.5K~16.9K slab classes on 10.0.0.1:11211; in particular, the 10.8K and 13.6K classes show a Max_age of 1 s and evictions in the hundreds of millions, meaning items written there are evicted almost immediately, which matches "write succeeds but the read returns None".

2.3.3 Cluster memory usage

Each node has a 5 GB memory limit, and each instance reported only about 1.5 GB of stored data.

Memory is nowhere near full, so why can't new values be stored?

Going back to the per-node status, again taking 10.0.0.1:11211 as an example:

Add up all the allocated Pages: 158+2+1+1+1+1+1+1+2+7+4+9+5+754+58+558+576+2869+90+7+6+2+1+1+1+1+1+1+1+1 = 5121 pages, about 5 GB (each slab page is 1 MB).

5 GB is exactly the maxmemory assigned to each node.

So every memory page has already been assigned to some slab class. Even though some of those pages are effectively free again (their items deleted or expired), the free pages are not handed back to a global pool where other slab classes could use them.

Look at the row with a 1.8K chunk size: 754 pages are assigned but only 1209 items are stored, i.e. barely 2 MB of actual data (1209 × 1.8 KB) sitting in 754 MB of memory. A severe waste.
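
A quick sanity check of the arithmetic above, using only the numbers in the table (memcached slab pages are 1 MB each):

```
# Back-of-the-envelope check of the page accounting for 10.0.0.1:11211.
pages = [158, 2, 1, 1, 1, 1, 1, 1, 2, 7, 4, 9, 5, 754, 58, 558, 576,
         2869, 90, 7, 6, 2, 1, 1, 1, 1, 1, 1, 1, 1]
total = sum(pages)
print('%d pages x 1 MB = %d MB, roughly the 5 GB maxmemory' % (total, total))  # 5121

# The 1.8K class: 754 pages assigned, only 1209 items stored.
assigned_mb = 754                       # 754 pages of 1 MB each
used_mb = 1209 * 1.8 / 1024             # at most ~2.1 MB of item data
print('1.8K class: %d MB assigned, %.1f MB at most in use' % (assigned_mb, used_mb))
```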

Why can't mc reclaim free space that it has already assigned to a slab class?

Root cause: mc does not reclaim pages that are already assigned to a slab class, even after the items in them are gone (the "slab calcification" problem).

3. Fixing the problem

3.1 Rebuilding a test setup to reproduce the problem

1. Set up a standalone mc node with 64 MB of memory.

2. Fill it with 4 KB values:

  #  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
 18     4.4K         6s      64   14720     yes     2056        2    0

3. Delete all the data:

  #  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
 18     4.4K         0s      64       0     yes        0        0    0

4. Fill it again with 1 KB values:

  #  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
 12     1.2K         9s       2     885      no    66222        0    0
 18     4.4K         0s      63       0     yes        0        0    0

This time a large number of the 1 KB values were evicted, and even though the 4 KB data from the previous round had all been deleted, most of its pages were still assigned to the 4.4K class. (The whole fill/delete/refill cycle is sketched below.)
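
A rough sketch of that fill/delete/refill cycle, assuming the python-memcached client against the 64 MB test instance (address, key prefixes and counts are arbitrary):

```
# Sketch of the 3.1 test: fill with 4 KB values, delete them all, refill with 1 KB values.
import memcache

mc = memcache.Client(['127.0.0.1:11211'])

def fill(prefix, value_size, count):
    payload = 'x' * value_size
    stored = sum(1 for i in range(count) if mc.set('%s_%d' % (prefix, i), payload))
    print('%s: stored %d of %d' % (prefix, stored, count))

fill('big', 4 * 1024, 20000)            # step 2: fill the 64 MB node with 4 KB values
for i in range(20000):                  # step 3: delete everything
    mc.delete('big_%d' % i)
fill('small', 1024, 60000)              # step 4: refill with 1 KB values
# Now compare "stats slabs": on 1.4.13 the 4.4K class keeps its pages and the
# 1 KB writes get evicted heavily, exactly as in the tables above.
```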

STAT slab_reassign_running 0
STAT slabs_moved 2

The slab_reassign_running / slabs_moved stats show that some slab reassignment did happen (the instance was started with the slab_reassign and slab_automove options), but it falls far short of what we need.

The 1.4.11 release notes describe the feature:

Slab Reassign

Long running instances of memcached may run into an issue where all available memory has been assigned to a specific slab class (say items of roughly size 100 bytes). Later the application starts storing more of its data into a different slab class (items around 200 bytes). Memcached could not use the 100 byte chunks to satisfy the 200 byte requests, and thus you would be able to store very few 200 byte items.

1.4.11 introduces the ability to reassign slab pages. This is a beta feature and the commands may change for the next few releases, so please keep this in mind. When the commands are finalized they will be noted in the release notes.

Slab reassignment can only be enabled at start time:

```
$ memcached -o slab_reassign
```

Once all memory has been assigned and used by items, you may use a command to reassign memory.

```
$ echo "slabs reassign 1 4" | nc localhost 11211
```

That will return an error code indicating success, or a need to retry later. Success does not mean that the slab was moved, but that a background thread will attempt to move the memory as quickly as it can.

Slab Automove

While slab reassign is a manual feature, there is also the start of an automatic memory reassignment algorithm.

```
$ memcached -o slab_reassign,slab_automove
```

The above enables it on startup. slab_automove requires slab_reassign first be enabled. automove itself may also be enabled or disabled at runtime:

```
$ echo "slabs automove 0" | nc localhost 11211
```

The algorithm is slow and conservative. If a slab class is seen as having the highest eviction count 3 times 10 seconds apart, it will take a page from a slab class which has had zero evictions in the last 30 seconds and move the memory.

There are lots of cases where this will not be sufficient, and we invite the community to help improve upon the algorithm. Included in the source directory is scripts/mc_slab_mover. See perldoc for more information:

```
$ perldoc ./scripts/mc_slab_mover
```

It implements the same algorithm as built into memcached, and you may modify it to better suit your needs and improve on the script or port it to other languages. Please provide patches!

Slab Reassign Implementation

Slab page reassignment requires some tradeoffs:

  • All items larger than 500k (even if they're under 730k) take 1MB of space
  • When memory is reassigned, all items that were in the 1MB page are evicted
  • When slab reassign is enabled, an extra background thread is used

Let's try it:

$ echo "slabs reassign 1 4" | nc localhost 11211

Then fill with 1 KB values again:

#  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
12     1.2K         6s       7    6195     yes   193356        1    0
18     4.4K         0s      58       0     yes        0        0    0

4 pages were migrated. Try reassigning more:

$ echo "slabs reassign 1 10" | nc localhost 11211

Then write the 1 KB data once more:

  #  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
 12     1.2K        50s      14   12390     yes   254268       31    0
 18     4.4K         0s      51       0     yes        0        0    0

STAT slab_reassign_running 0
STAT slabs_moved 13
STAT bytes 13232520

Write a round of 2 KB values:

  #  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
 12     1.2K       223s      13   11505     yes   254268       31    0
 15     2.3K        15s       2     902     yes    32651        4    0
 18     4.4K         0s      51       0     yes        0        0    0

STAT slab_reassign_running 0
STAT slabs_moved 14
STAT bytes 14154480

Reassignment is clearly getting faster, but it is still far from what we need; we can't keep running slabs reassign by hand.
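
Short of upgrading, the manual step can be scripted in the spirit of scripts/mc_slab_mover from the release notes: periodically find the class with the fastest-growing eviction count and hand it a page from a class that has spare pages and no recent evictions. A very rough sketch (raw text protocol; the address, interval and thresholds are arbitrary, and the one-shot recv is only good enough for small stats responses):

```
# Rough automover sketch, loosely following the algorithm described in the release notes:
# feed a page from a quiet class to the class currently being evicted the hardest.
import socket
import time

HOST, PORT = '127.0.0.1', 11211

def mc_send(cmd):
    # one-shot text-protocol command; enough for "stats ..." and "slabs reassign ..."
    s = socket.create_connection((HOST, PORT), timeout=3)
    s.sendall((cmd + '\r\n').encode())
    time.sleep(0.2)
    reply = s.recv(65536).decode()
    s.close()
    return reply

def per_class(output, suffix):
    # parse lines like "STAT items:18:evicted 2056" or "STAT 18:total_pages 64"
    result = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1].endswith(suffix):
            fields = parts[1].split(':')
            clsid = fields[1] if fields[0] == 'items' else fields[0]
            result[int(clsid)] = int(parts[2])
    return result

prev = {}
while True:
    evicted = per_class(mc_send('stats items'), ':evicted')
    pages = per_class(mc_send('stats slabs'), ':total_pages')
    deltas = {c: n - prev.get(c, n) for c, n in evicted.items()}
    prev = evicted
    if deltas and max(deltas.values()) > 0:
        dst = max(deltas, key=deltas.get)
        # donor: a class with spare pages and no evictions in this interval
        donors = [c for c in pages if pages[c] > 1 and deltas.get(c, 0) == 0 and c != dst]
        if donors:
            src = max(donors, key=lambda c: pages[c])
            print(mc_send('slabs reassign %d %d' % (src, dst)).strip())
    time.sleep(10)
```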

We are running mc 1.4.13. Do newer versions of mc fix this?

So I downloaded the latest release, 1.4.33.

3.3 Testing the new version

Repeat the earlier tests with 1.4.33:

Fill with 4 KB values:

#  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
18     4.4K        11s      64   14720     yes     2056        1    0

Delete all the data:

#  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
18     4.4K         0s      64       0     yes        0        0    0

Fill with 1 KB values:

#  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
12     1.2K         8s       1     885     yes    66222        1    0
18     4.4K         0s      64       0     yes        0        0    0

No obvious change.

Time to pull the latest source and see what it does when allocating space for a new item.

item *do_item_alloc(char *key, const size_t nkey, const unsigned int flags,
                    const rel_time_t exptime, const int nbytes) {
    int i;
    uint8_t nsuffix;
    item *it = NULL;
    char suffix[40];
    /* compute the total space the item will occupy (header + key + suffix + value) */
    size_t ntotal = item_make_header(nkey + 1, flags, nbytes, suffix, &nsuffix);
    if (settings.use_cas) {
        ntotal += sizeof(uint64_t);
    }

    /* map the total size to a slab class id */
    unsigned int id = slabs_clsid(ntotal);
    if (id == 0)
        return 0;

    /* If no memory is available, attempt a direct LRU juggle/eviction */
    /* This is a race in order to simplify lru_pull_tail; in cases where
     * locked items are on the tail, you want them to fall out and cause
     * occasional OOM's, rather than internally work around them.
     * This also gives one fewer code path for slab alloc/free
     */
    /* TODO: if power_largest, try a lot more times? or a number of times
     * based on how many chunks the new object should take up?
     * or based on the size of an object lru_pull_tail() says it evicted?
     * This is a classical GC problem if "large items" are of too varying of
     * sizes. This is actually okay here since the larger the data, the more
     * bandwidth it takes, the more time we can loop in comparison to serving
     * and replacing small items.
     */
    for (i = 0; i < 10; i++) {
        uint64_t total_bytes;
        /* Try to reclaim memory first */
        if (!settings.lru_maintainer_thread) { /* no LRU maintainer thread: try to reclaim from the COLD LRU first */
            lru_pull_tail(id, COLD_LRU, 0, 0);
        }
        it = slabs_alloc(ntotal, id, &total_bytes, 0); /* try to allocate a chunk from this slab class */

        if (settings.expirezero_does_not_evict)
            total_bytes -= noexp_lru_size(id);

        if (it == NULL) {
            if (settings.lru_maintainer_thread) {
                lru_pull_tail(id, HOT_LRU, total_bytes, 0);
                lru_pull_tail(id, WARM_LRU, total_bytes, 0);
                if (lru_pull_tail(id, COLD_LRU, total_bytes, LRU_PULL_EVICT) <= 0)
                    break;
            } else {
                if (lru_pull_tail(id, COLD_LRU, 0, LRU_PULL_EVICT) <= 0)
                    break;
            }
        } else {
            break;
        }
    }

    if (i > 0) { /* memory was obtained only after extra direct-reclaim attempts */
        pthread_mutex_lock(&lru_locks[id]);
        itemstats[id].direct_reclaims += i;
        pthread_mutex_unlock(&lru_locks[id]);
    }

    if (it == NULL) {
        pthread_mutex_lock(&lru_locks[id]);
        itemstats[id].outofmemory++;
        pthread_mutex_unlock(&lru_locks[id]);
        return NULL;
    }

    assert(it->slabs_clsid == 0);
    //assert(it != heads[id]);

    /* Refcount is seeded to 1 by slabs_alloc() */
    it->next = it->prev = 0;

    /* Items are initially loaded into the HOT_LRU. This is '0' but I want at
     * least a note here. Compiler (hopefully?) optimizes this out.
     */
    if (settings.lru_maintainer_thread) {
        if (exptime == 0 && settings.expirezero_does_not_evict) {
            id |= NOEXP_LRU;
        } else {
            id |= HOT_LRU;
        }
    } else {
        /* There is only COLD in compat-mode */
        id |= COLD_LRU;
    }
    it->slabs_clsid = id;
    ...
    return it;
}

This is the code path taken when a new key/value arrives but the target slab class has no free chunk: it makes up to 10 attempts to reclaim or evict space, and only returns an out-of-memory error if all of them fail.

static void *lru_maintainer_thread(void *arg) {
    rel_time_t last_crawler_check = 0;
    struct crawler_expired_data cdata;
    memset(&cdata, 0, sizeof(struct crawler_expired_data));
    pthread_mutex_init(&cdata.lock, NULL);
    cdata.crawl_complete = true; // kick off the crawler.

    /* The thread loops forever, reclaiming space for one specific slab class or for all of them. */
    while (do_run_lru_maintainer_thread) {
        ...
        /* We were asked to immediately wake up and poke a particular slab
         * class due to a low watermark being hit
         * did_moves: number of items juggled; used to decide how long the maintainer thread sleeps each round
        */
        if (lru_maintainer_check_clsid != 0) {
            did_moves = lru_maintainer_juggle(lru_maintainer_check_clsid);
            lru_maintainer_check_clsid = 0;
        } else {
            for (i = POWER_SMALLEST; i < MAX_NUMBER_OF_SLAB_CLASSES; i++) {
                did_moves += lru_maintainer_juggle(i);
            }
        }
        ...
        /* Once per second at most */
        if (settings.lru_crawler && last_crawler_check != current_time) {
            lru_maintainer_crawler_check(&cdata);
            last_crawler_check = current_time;
        }
    }

    return NULL;
}

//================

/* Loop up to N times:
 * if HOT_LRU holds too many items, move them down to COLD_LRU
 * if WARM_LRU holds too many items, move them down to COLD_LRU
 * if COLD_LRU holds too many items, keep them at the tail of COLD_LRU
 * 1000 loops with 1ms min sleep gives us under 1m items shifted/sec. The
 * locks can't handle much more than that. Leaving a TODO for how to
 * autoadjust in the future.
 */
static int lru_maintainer_juggle(const int slabs_clsid) {
    int i;
    int did_moves = 0;
    bool mem_limit_reached = false;
    uint64_t total_bytes = 0;
    unsigned int chunks_perslab = 0;
    unsigned int chunks_free = 0;
    /* TODO: if free_chunks below high watermark, increase aggressiveness */
    chunks_free = slabs_available_chunks(slabs_clsid, &mem_limit_reached,
            &total_bytes, &chunks_perslab); /* these counters are kept in the slabclass and globals; used to decide whether to reclaim */
    if (settings.expirezero_does_not_evict)
        total_bytes -= noexp_lru_size(slabs_clsid);

    /* If slab automove is enabled at any level and this class has more than
     * 2.5 pages worth of free chunks, reassign a page from this class back
     * to the global pool.
     */
    if (settings.slab_automove > 0 && chunks_free > (chunks_perslab * 2.5)) {
        slabs_reassign(slabs_clsid, SLAB_GLOBAL_PAGE_POOL); /* move a free page from slabs_clsid back to SLAB_GLOBAL_PAGE_POOL, keeping at least two pages in the source class */
    }

    /* Juggle HOT/WARM up to N times */
    for (i = 0; i < 1000; i++) {
        int do_more = 0;
        if (lru_pull_tail(slabs_clsid, HOT_LRU, total_bytes, LRU_PULL_CRAWL_BLOCKS) ||
            lru_pull_tail(slabs_clsid, WARM_LRU, total_bytes, LRU_PULL_CRAWL_BLOCKS)) {
            do_more++;
        }
        do_more += lru_pull_tail(slabs_clsid, COLD_LRU, total_bytes, LRU_PULL_CRAWL_BLOCKS);
        if (do_more == 0)
            break;
        did_moves++;
    }
    return did_moves;
}

//============

/*** LRU MAINTENANCE THREAD ***/

/* Returns number of items remove, expired, or evicted.
 * Callable from worker threads or the LRU maintainer thread */
/* Called from two places: the LRU maintainer thread, and do_item_alloc when allocating memory for an item. */
static int lru_pull_tail(const int orig_id, const int cur_lru,
        const uint64_t total_bytes, uint8_t flags) {
    item *it = NULL;
    int id = orig_id;
    int removed = 0;
    if (id == 0)
        return 0;

    int tries = 5;
    item *search;
    item *next_it;
    void *hold_lock = NULL;
    unsigned int move_to_lru = 0;
    uint64_t limit = 0;

    id |= cur_lru; /* combine the slab class id with the HOT/WARM/COLD sub-LRU */
    pthread_mutex_lock(&lru_locks[id]);
    search = tails[id];
    /* We walk up *only* for locked items, and if bottom is expired. */
    for (; tries > 0 && search != NULL; tries--, search=next_it) {
        /* we might relink search mid-loop, so search->prev isn't reliable */
        next_it = search->prev;
        if (search->nbytes == 0 && search->nkey == 0 && search->it_flags == 1) {
            /* We are a crawler, ignore it. */
            if (flags & LRU_PULL_CRAWL_BLOCKS) {
                pthread_mutex_unlock(&lru_locks[id]);
                return 0;
            }
            tries++;
            continue;
        }
        uint32_t hv = hash(ITEM_key(search), search->nkey);
        /* Attempt to hash item lock the "search" item. If locked, no
         * other callers can incr the refcount. Also skip ourselves. */
        if ((hold_lock = item_trylock(hv)) == NULL)
            continue;
        /* Now see if the item is refcount locked */
        if (refcount_incr(&search->refcount) != 2) {
            /* Note pathological case with ref'ed items in tail.
             * Can still unlink the item, but it won't be reusable yet */
            itemstats[id].lrutail_reflocked++;
            /* In case of refcount leaks, enable for quick workaround. */
            /* WARNING: This can cause terrible corruption */
            if (settings.tail_repair_time &&
                    search->time + settings.tail_repair_time < current_time) {
                itemstats[id].tailrepairs++;
                search->refcount = 1;
                /* This will call item_remove -> item_free since refcnt is 1 */
                do_item_unlink_nolock(search, hv);
                item_trylock_unlock(hold_lock);
                continue;
            }
        }

        /* Expired or flushed */
        if ((search->exptime != 0 && search->exptime < current_time)
            || item_is_flushed(search)) {
            itemstats[id].reclaimed++;
            if ((search->it_flags & ITEM_FETCHED) == 0) {
                itemstats[id].expired_unfetched++;
            }
            /* refcnt 2 -> 1 */
            do_item_unlink_nolock(search, hv);
            /* refcnt 1 -> 0 -> item_free */
            do_item_remove(search);
            item_trylock_unlock(hold_lock);
            removed++;

            /* If all we're finding are expired, can keep going */
            continue;
        }

        /* If we're HOT_LRU or WARM_LRU and over size limit, send to COLD_LRU.
         * If we're COLD_LRU, send to WARM_LRU unless we need to evict
         */
        switch (cur_lru) {
            case HOT_LRU:
                limit = total_bytes * settings.hot_lru_pct / 100;
            case WARM_LRU:
                if (limit == 0)
                    limit = total_bytes * settings.warm_lru_pct / 100;
                if (sizes_bytes[id] > limit) {
                    itemstats[id].moves_to_cold++;
                    move_to_lru = COLD_LRU;
                    do_item_unlink_q(search);
                    it = search;
                    removed++;
                    break;
                } else if ((search->it_flags & ITEM_ACTIVE) != 0) {
                    /* Only allow ACTIVE relinking if we're not too large. */
                    itemstats[id].moves_within_lru++;
                    search->it_flags &= ~ITEM_ACTIVE;
                    do_item_update_nolock(search);
                    do_item_remove(search);
                    item_trylock_unlock(hold_lock);
                } else {
                    /* Don't want to move to COLD, not active, bail out */
                    it = search;
                }
                break;
            case COLD_LRU:
                it = search; /* No matter what, we're stopping */
                if (flags & LRU_PULL_EVICT) {
                    if (settings.evict_to_free == 0) {
                        /* Don't think we need a counter for this. It'll OOM.  */
                        break;
                    }
                    itemstats[id].evicted++;
                    itemstats[id].evicted_time = current_time - search->time;
                    if (search->exptime != 0)
                        itemstats[id].evicted_nonzero++;
                    if ((search->it_flags & ITEM_FETCHED) == 0) {
                        itemstats[id].evicted_unfetched++;
                    }
                    LOGGER_LOG(NULL, LOG_EVICTIONS, LOGGER_EVICTION, search);
                    do_item_unlink_nolock(search, hv);
                    removed++;
                    if (settings.slab_automove == 2) {
                        slabs_reassign(-1, orig_id);
                    }
                } else if ((search->it_flags & ITEM_ACTIVE) != 0
                        && settings.lru_maintainer_thread) {
                    itemstats[id].moves_to_warm++;
                    search->it_flags &= ~ITEM_ACTIVE;
                    move_to_lru = WARM_LRU;
                    do_item_unlink_q(search);
                    removed++;
                }
                break;
        }
        if (it != NULL)
            break;
    }

    pthread_mutex_unlock(&lru_locks[id]);

    if (it != NULL) {
        if (move_to_lru) {
            it->slabs_clsid = ITEM_clsid(it);
            it->slabs_clsid |= move_to_lru;
            item_link_q(it);
        }
        do_item_remove(it);
        item_trylock_unlock(hold_lock);
    }

    return removed;
}

This is essentially the core of how the lru_maintainer thread reclaims space.

In short:

The maintainer thread sits in a loop, iterating over all slab classes; whenever a class holds more than 2.5 pages worth of free chunks, it queues that class for reassignment so that a free page is handed back to the global page pool.

It also keeps juggling the LRU lists: if the HOT or WARM LRU exceeds its memory quota, items are pushed down to the COLD LRU; active COLD items are promoted to WARM, and expired or evictable COLD items are reclaimed.

Each slab class therefore has three LRU queues (HOT, WARM and COLD), and when memory runs short, data on the COLD LRU is evicted first.

In addition, recent mc versions support an LRU crawler thread that cooperates with the maintainer thread; the crawler walks all items in the cache and checks, among other things, whether they have expired.

3.4 Re-testing the new version

1. Startup options:

Add the lru_maintainer option (e.g. starting memcached with -o slab_reassign,slab_automove,lru_maintainer).

2. Write 4 KB values:

  #  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
 18     4.4K        27s      64   14720     yes     2056        4    0

3. Delete everything:

  #  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
 18     4.4K         0s       2       0     yes        0        0    0

4. Write 8 KB values:

  #  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
 18     4.4K         0s       2       0     yes        0        0    0
 21     8.7K        27s      62    7316     yes     1071        1    0

5. Delete the 8 KB data:

  #  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
 18     4.4K         0s       2       0     yes        0        0    0
 21     8.7K         0s       2       0     yes        0        0    0

6. Fill with 6 KB values:

  #  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
 18     4.4K         0s       2       0     yes        0        0    0
 20     6.9K         7s      60    8820     yes     2363        3    0
 21     8.7K         0s       2       0     yes        0        0    0

7. Write 4 KB values with no expiry, plus 8 KB values with a 600 s expiry:

#  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
18     4.4K       484s      11    2000     yes        0        0    0
21     8.7K         9s      26    2999     yes        0        0    0

8. Write 5 KB values with no expiry:

  #  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
 18     4.4K         0s       2       0     yes        0        0    0
 19     5.5K         5s      33    5999     yes        0        0    0
 21     8.7K         0s       2       0     yes        0        0    0

9. Write 5 KB values with no expiry, plus 8 KB values with a 600 s expiry:

#  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
18     4.4K         0s       2       0     yes        0        0    0
19     5.5K        57s      18    3000     yes        0        0    0
21     8.7K         8s      26    2999     yes        0        0    0

10. Write 4 KB values with an expiry:

#  Item_Size  Max_age   Pages   Count   Full?  Evicted Evict_Time OOM
18     4.4K         5s       9    1999     yes        0        0    0
19     5.5K       176s      18    3000     yes        0        0    0
21     8.7K       127s      10    1000     yes        0        0    0

The expired 8 KB data was removed and its pages were freed up and reused.

Adding the lru_maintainer thread improves things dramatically. One caveat: enabling the crawler thread takes extra locks and may affect mc performance (this needs a proper performance test).
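
If crawler overhead turns out to matter, it can also be toggled at runtime over the text protocol with the lru_crawler enable/disable commands, for example switching it off during peak hours. A small sketch (address is a placeholder):

```
# Sketch: toggle the LRU crawler at runtime via the "lru_crawler <enable|disable>" command.
import socket

def mc_command(cmd, host='127.0.0.1', port=11211):
    s = socket.create_connection((host, port), timeout=3)
    s.sendall((cmd + '\r\n').encode())
    reply = s.recv(4096).decode().strip()
    s.close()
    return reply

print(mc_command('lru_crawler disable'))   # expect "OK"
print(mc_command('lru_crawler enable'))    # expect "OK"
```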

4. Conclusion

The experiments above show that 1.4.33 handles page reclamation well once all pages have been assigned.

For a cluster that has already assigned all of its pages, upgrading to the new version helps in two ways. First, it can cache data that 1.4.13 could no longer store, improving space utilization and the mc hit rate under slab calcification. Second, because items are kept in separate HOT, WARM and COLD LRUs, replacement candidates are found quickly, which greatly cuts the time spent searching for expired space to reclaim and further improves performance.

5. Upgrading

All mc instances have been upgraded to 1.4.33.

6. References

https://github.com/memcached/memcached
https://github.com/memcached/memcached/wiki/ReleaseNotes1411
