While running the lmbench memory stress test, the kernel hit a slab failure several times, and every time it was at the same place:
    kernel BUG at mm/slab.c:3067!
    Unable to handle kernel NULL pointer dereference at virtual address 00000000
    [<c003c6ec>] (__dabt_svc+0x4c/0x60) from [<c00405a4>] (__bug+0x1c/0x28)
    [<c00405a4>] (__bug+0x1c/0x28) from [<c00c9d0c>] (cache_alloc_refill+0x3a0/0x654)
    [<c00c9d0c>] (cache_alloc_refill+0x3a0/0x654) from [<c00ca174>] (kmem_cache_alloc+0xb0/0xc4)
    [<c00ca174>] (kmem_cache_alloc+0xb0/0xc4) from [<c0057b04>] (copy_process+0x9c/0xdbc)
    [<c0057b04>] (copy_process+0x9c/0xdbc) from [<c0058890>] (do_fork+0x48/0x288)
    [<c0058890>] (do_fork+0x48/0x288) from [<c003cc40>] (ret_fast_syscall+0x0/0x30)

The corresponding statement is in the cache_alloc_refill function:
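    /*
     * The slab was either on partial or free list so
     * there must be at least one object available for
     * allocation.
     */
    BUG_ON(slabp->inuse >= cachep->num);

By the logic of the function, reaching this point means a usable slab has just been taken from the cache's partial or free list, so slabp->inuse should be smaller than cachep->num.
To confirm this, I printed slabp->inuse and cachep->num right before the BUG_ON. At the time of the failure:

    slabp->inuse:5, cachep->num:5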
In normal operation, the printed num limit of task_struct_cache is indeed 5.
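So the code reached this point with a slab that was already full and triggered the BUG_ON. My first suspicions were:
1) Does this kernel version (3.0.74) have a bug in the slab code?
2) Is the spinlock implementation for this ARM architecture broken?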
On reflection, though, the kernel cannot be so fragile that a light lmbench memory stress run breaks it. The problem had to be somewhere else.
To dig further, I first dumped the full contents of the slab to look for clues:
    partial.next:df913000,free.next:df802df0
    slabs_partial pointer:df802de0,slabs_free pointer:df802df0
    slab get from partial
    cache 'task_struct'(5), slabp df913000(inuse:5,free:-257),kmem_map vaddr:0xdf91301c
    Hexdump:
    000: 00 50 90 df e0 2d 80 df 40 00 00 00 40 30 91 df
    010: 05 00 00 00 ff fe ff ff 00 00 ad de 03 00 00 00
    020: ff ff ff ff 00 00 00 00 ff fe ff ff ff ff ff ff

Two things stand out here. First, the slab was taken from the partial list of task_struct_cache. Second, the free field of that slab holds an invalid value, -257 (0xfffffeff).
Mapping the dump onto struct slab, with the dumped value noted next to each field:

    struct slab {
        union {
            struct {
                struct list_head list;    /* 0xdf905000, 0xdf802de0 */
                unsigned long colouroff;  /* 0x40 */
                void *s_mem;              /* including colour offset: 0xdf913040 */
                unsigned int inuse;       /* num of objs active in slab: 0x5 */
                kmem_bufctl_t free;       /* 0xfffffeff (typedef unsigned int kmem_bufctl_t) */
                unsigned short nodeid;
            };
            struct slab_rcu __slab_cover_slab_rcu;
        };
    };

The question is: where did this 0xfffffeff come from? The values around it in the slab descriptor all look normal.
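The slab is full (inuse equals num), so its free field should hold BUFCTL_END, which is 0xffffffff in this kernel. The actual value, 0xfffffeff, differs from that in a single bit. Because the comparison against BUFCTL_END fails, the slab is not added to the full list but to the partial list instead. The next time an object is allocated,
this invalid slab is taken from the partial list, and the BUG_ON in cache_alloc_refill fires.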
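The list is chosen by this code:

    /* move slabp to correct slabp list: */
    list_del(&slabp->list);
    if (slabp->free == BUFCTL_END)
        list_add(&slabp->list, &l3->slabs_full);
    else
        list_add(&slabp->list, &l3->slabs_partial);

I later learned that the DDR on this board was overclocked to 533 MHz. After it was set back to its normal 400 MHz, the lmbench test ran without any further failures.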