False positives from slabinfo -v

Description of the false positive

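While chasing a stability problem on an embedded device under development, we enabled SLUB DEBUG and added a slabinfo -v call to the boot script, then ran power-cycle burn-in tests in the hope of catching the problem early. Instead, the burn-in runs frequently produced the SLUB error output below, with these characteristics: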

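  • Two redzone overwrites are always reported together: the first is always triggered by slabinfo -v, the second by the slab allocation path
  • Judging from the dump, the redzone content is uniform and differs only in being 0xBB versus 0xCC; the rest of the SLUB object layout looks normal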
[ 146.886465] =============================================================================
[ 146.984467] BUG kmalloc-128 (Not tainted): Redzone overwritten
[ 147.054298] -----------------------------------------------------------------------------
[ 147.054298]
[ 147.169959] Disabling lock debugging due to kernel taint
[ 147.233544] INFO: 0x878fac00-0x878fac7f. First byte 0xbb instead of 0xcc
[ 147.313844] INFO: Allocated in spi_mem_exec_op+0x7c/0x404 age=44 cpu=3 pid=60
[ 147.399325] ___slab_alloc.constprop.35+0x3fc/0x4ac
[ 147.457693] __slab_alloc.constprop.34+0x20/0x44
[ 147.512926] __kmalloc+0x140/0x2e0
[ 147.553580] spi_mem_exec_op+0x7c/0x404
[ 147.599478] spinand_load_page_op+0x68/0x7c
[ 147.649515] spinand_mtd_read+0x230/0x494
[ 147.697459] spinand_mtd_compat_read+0x3c/0x54
[ 147.750632] part_read+0x60/0xb8
[ 147.789200] mtd_read+0x78/0xc0
[ 147.826747] ubi_io_read+0x180/0x32c
[ 147.869480] ubi_eba_read_leb+0x2fc/0x454
[ 147.917423] ubi_leb_read+0x74/0xd4
[ 147.959128] gluebi_read+0x88/0xe4
[ 147.999778] mtd_read+0x78/0xc0
[ 148.037317] mtdblock_readsect+0xdc/0x130
[ 148.085261] mtd_blktrans_work+0x2d0/0x3f4
[ 148.134275] INFO: Freed in spi_mem_exec_op+0x3c0/0x404 age=48 cpu=3 pid=60
[ 148.216596] __slab_free+0xb8/0x550
[ 148.258292] kfree+0x1f8/0x20c
[ 148.294778] spi_mem_exec_op+0x3c0/0x404
[ 148.341690] spinand_load_page_op+0x68/0x7c
[ 148.391719] spinand_mtd_read+0x230/0x494
[ 148.439668] spinand_mtd_compat_read+0x3c/0x54
[ 148.492824] part_read+0x60/0xb8
[ 148.531389] mtd_read+0x78/0xc0
[ 148.568919] ubi_io_read+0x180/0x32c
[ 148.611656] ubi_eba_read_leb+0x2fc/0x454
[ 148.659599] ubi_leb_read+0x74/0xd4
[ 148.701294] gluebi_read+0x88/0xe4
[ 148.741947] mtd_read+0x78/0xc0
[ 148.779477] mtdblock_readsect+0xdc/0x130
[ 148.827425] mtd_blktrans_work+0x2d0/0x3f4
[ 148.876416] process_one_work+0x280/0x474
[ 148.924367] INFO: Slab 0x81107320 objects=16 used=16 fp=0x (null) flags=0x8101
[ 149.011893] INFO: Object 0x878fac80 @offset=3200 fp=0x (null)
[ 149.011893]
[ 149.099435] Redzone 878fac00: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[ 149.203633] Redzone 878fac10: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[ 149.307831] Redzone 878fac20: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[ 149.412029] Redzone 878fac30: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[ 149.516221] Redzone 878fac40: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[ 149.620420] Redzone 878fac50: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[ 149.724612] Redzone 878fac60: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[ 149.828804] Redzone 878fac70: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
[ 149.932998] Object 878fac80: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 150.036153] Object 878fac90: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 150.139303] Object 878faca0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 150.242453] Object 878facb0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 150.345604] Object 878facc0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 150.448754] Object 878facd0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 150.551911] Object 878face0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 150.655068] Object 878facf0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
[ 150.758219] Redzone 878fad00: bb bb bb bb ....
[ 150.849911] Padding 878fada8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[ 150.954110] Padding 878fadb8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[ 151.058307] Padding 878fadc8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[ 151.162500] Padding 878fadd8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[ 151.266699] Padding 878fade8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[ 151.370896] Padding 878fadf8: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
[ 151.466764] CPU: 0 PID: 3063 Comm: slabinfo Tainted: G B 4.14.90+ #1
[ 151.556374] Stack : 00000000 8fc03ea4 7f759e78 8017dc58 8069c81c 00000009 00000000 00000000
[ 151.656446] 80655304 83eddcd4 83b6ee24 806d2d07 8064e274 00000001 83eddc78 595fa677
[ 151.756517] 00000000 00000000 80760000 00010000 00000000 20202020 000002f0 00000008
[ 151.856581] 00000000 00000000 00000006 ffffffff 80000000 80700000 00000000 81107320
[ 151.956646] 878fac80 8065dc98 878fac7f 8fc03ea4 00000000 00000000 00000000 80750000
[ 152.056710] ...
[ 152.085904] Call Trace:
[ 152.115129] [<8010d1c8>] show_stack+0x58/0x100
[ 152.168321] [<80561fa4>] dump_stack+0xe4/0x120
[ 152.221483] [<8020fef4>] check_bytes_and_report+0xc4/0x10c
[ 152.287138] [<8020ffa0>] check_object+0x64/0x2f0
[ 152.342379] [<802106c8>] validate_slab_slab+0x268/0x29c
[ 152.404919] [<802157e4>] validate_store+0x110/0x1c0
[ 152.463296] [<8028ec60>] kernfs_fop_write+0x13c/0x200
[ 152.523762] [<8021ddf4>] __vfs_write+0x28/0x15c
[ 152.577966] [<8021e0c8>] vfs_write+0xc0/0x18c
[ 152.630078] [<8021e2dc>] SyS_write+0x58/0xc4
[ 152.681174] [<80117598>] syscall_common+0x34/0x58
[ 152.737462] FIX kmalloc-128: Restoring 0x878fac00-0x878fac7f=0xcc
[ 152.737462]
[ 154.896257] =============================================================================
[ 154.994278] BUG kmalloc-128 (Tainted: G B ): Redzone overwritten
[ 155.078747] -----------------------------------------------------------------------------
[ 155.078747]
[ 155.194490] INFO: 0x878fac00-0x878fac7f. First byte 0xcc instead of 0xbb
[ 155.274844] INFO: Allocated in spi_mem_exec_op+0x7c/0x404 age=840 cpu=3 pid=60
[ 155.361414] ___slab_alloc.constprop.35+0x3fc/0x4ac
[ 155.419866] __slab_alloc.constprop.34+0x20/0x44
[ 155.475192] __kmalloc+0x140/0x2e0
[ 155.515950] spi_mem_exec_op+0x7c/0x404
[ 155.561926] spinand_load_page_op+0x68/0x7c
[ 155.612072] spinand_mtd_read+0x230/0x494
[ 155.660114] spinand_mtd_compat_read+0x3c/0x54
[ 155.713361] part_read+0x60/0xb8
[ 155.752011] mtd_read+0x78/0xc0
[ 155.789638] ubi_io_read+0x180/0x32c
[ 155.832478] ubi_eba_read_leb+0x2fc/0x454
[ 155.880520] ubi_leb_read+0x74/0xd4
[ 155.922312] gluebi_read+0x88/0xe4
[ 155.963045] mtd_read+0x78/0xc0
[ 156.000682] mtdblock_readsect+0xdc/0x130
[ 156.048732] mtd_blktrans_work+0x2d0/0x3f4
[ 156.097840] INFO: Freed in spi_mem_exec_op+0x3c0/0x404 age=922 cpu=3 pid=60
[ 156.181287] __slab_free+0xb8/0x550
[ 156.223071] kfree+0x1f8/0x20c
[ 156.259660] spi_mem_exec_op+0x3c0/0x404
[ 156.306674] spinand_load_page_op+0x68/0x7c
[ 156.356803] spinand_mtd_read+0x230/0x494
[ 156.404835] spinand_mtd_compat_read+0x3c/0x54
[ 156.458077] part_read+0x60/0xb8
[ 156.496718] mtd_read+0x78/0xc0
[ 156.534343] ubi_io_read+0x180/0x32c
[ 156.577148] ubi_eba_read_leb+0x2fc/0x454
[ 156.625163] ubi_leb_read+0x74/0xd4
[ 156.666943] gluebi_read+0x88/0xe4
[ 156.707668] mtd_read+0x78/0xc0
[ 156.745281] mtdblock_readsect+0xdc/0x130
[ 156.793306] mtd_blktrans_work+0x2d0/0x3f4
[ 156.842373] process_one_work+0x280/0x474
[ 156.890383] INFO: Slab 0x81107320 objects=16 used=16 fp=0x (null) flags=0x8100
[ 156.977970] INFO: Object 0x878fac80 @offset=3200 fp=0x (null)
[ 156.977970]
[ 157.065574] Redzone 878fac00: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
[ 157.169841] Redzone 878fac10: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
[ 157.274105] Redzone 878fac20: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
[ 157.378369] Redzone 878fac30: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
[ 157.482632] Redzone 878fac40: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
[ 157.586896] Redzone 878fac50: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
[ 157.691160] Redzone 878fac60: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
[ 157.795423] Redzone 878fac70: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
[ 157.899691] Object 878fac80: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 158.002918] Object 878fac90: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 158.106144] Object 878faca0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 158.209366] Object 878facb0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 158.312588] Object 878facc0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 158.415810] Object 878facd0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 158.519030] Object 878face0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
[ 158.622254] Object 878facf0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.
[ 158.725478] Redzone 878fad00: bb bb bb bb ....
[ 158.817243] Padding 878fada8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[ 158.921508] Padding 878fadb8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[ 159.025773] Padding 878fadc8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[ 159.130038] Padding 878fadd8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[ 159.234302] Padding 878fade8: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ
[ 159.338564] Padding 878fadf8: 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZ
[ 159.434499] CPU: 1 PID: 58 Comm: kworker/1:1 Tainted: G B 4.14.90+ #1
[ 159.525233] Workqueue: mtdblock13 mtd_blktrans_work
[ 159.583652] Stack : 8064e1e0 8f5ab77c 806d0000 8f1cb280 8064e274 81107320 878fac80 8065dc98
[ 159.683787] 878fac7f 8017de28 8f1d8484 806d2d07 8064e274 00000001 8f5ab740 595fa677
[ 159.783914] 00000000 00000000 806d14c0 00010000 00000000 64746d20 00000344 00000008
[ 159.884044] 00000000 00000000 00000006 ffffffff 00000000 80700000 00000000 81107320
[ 159.984173] 878fac80 8065dc98 878fac7f 8fc04280 00000002 00000000 00000004 80750004
[ 160.084305] ...
[ 160.113544] Call Trace:
[ 160.142790] [<8010d1c8>] show_stack+0x58/0x100
[ 160.195999] [<80561fa4>] dump_stack+0xe4/0x120
[ 160.249177] [<8020fef4>] check_bytes_and_report+0xc4/0x10c
[ 160.314853] [<8020ffa0>] check_object+0x64/0x2f0
[ 160.370111] [<8021305c>] alloc_debug_processing+0xf4/0x1e0
[ 160.435782] [<80213544>] ___slab_alloc.constprop.35+0x3fc/0x4ac
[ 160.506659] [<80213614>] __slab_alloc.constprop.34+0x20/0x44
[ 160.574410] [<80213778>] __kmalloc+0x140/0x2e0
[ 160.627605] [<80443e34>] spi_mem_exec_op+0x7c/0x404
[ 160.686017] [<80426d24>] spinand_read_from_cache_op+0x130/0x22c
[ 160.756899] [<80427eb0>] spinand_mtd_read+0x250/0x494
[ 160.817387] [<80428130>] spinand_mtd_compat_read+0x3c/0x54
[ 160.883081] [<80401c2c>] part_read+0x60/0xb8
[ 160.934169] [<803fe0a0>] mtd_read+0x78/0xc0
[ 160.984243] [<804345f4>] ubi_io_read+0x180/0x32c
[ 161.039505] [<80431f10>] ubi_eba_read_leb+0x2fc/0x454
[ 161.099970] [<80430a90>] ubi_leb_read+0x74/0xd4
[ 161.154194] [<8043dbd8>] gluebi_read+0x88/0xe4
[ 161.207363] [<803fe0a0>] mtd_read+0x78/0xc0
[ 161.257426] [<804088a0>] mtdblock_readsect+0xdc/0x130
[ 161.317894] [<80407384>] mtd_blktrans_work+0x2d0/0x3f4
[ 161.379406] [<80149890>] process_one_work+0x280/0x474
[ 161.439870] [<80149dfc>] worker_thread+0x378/0x608
[ 161.497223] [<8014ffcc>] kthread+0x168/0x17c
[ 161.548315] [<801071d8>] ret_from_kernel_thread+0x14/0x1c
[ 161.612953] FIX kmalloc-128: Restoring 0x878fac00-0x878fac7f=0xbb
[ 161.612953]
[ 161.703620] FIX kmalloc-128: Marking all objects used
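
The byte patterns in the dump above are the SLUB poison constants from the kernel's include/linux/poison.h: 0xbb is SLUB_RED_INACTIVE, 0xcc is SLUB_RED_ACTIVE, 0x6b is POISON_FREE, 0xa5 is POISON_END and 0x5a is POISON_INUSE. The following is a minimal sketch for decoding one hex-dump line of such a report; the helper name decode_dump_line and the assumed line layout (optional timestamp, section name, address, hex bytes, ASCII column) are illustrative assumptions, not part of any kernel tool.

```python
# Poison byte values as defined in the kernel's include/linux/poison.h.
SLUB_POISON = {
    0xBB: "SLUB_RED_INACTIVE",  # redzone of a free object
    0xCC: "SLUB_RED_ACTIVE",    # redzone of an allocated object
    0x5A: "POISON_INUSE",       # uninitialized memory / padding
    0x6B: "POISON_FREE",        # payload of a free object
    0xA5: "POISON_END",         # last payload byte of a free object
}

HEX_DIGITS = set("0123456789abcdef")


def decode_dump_line(line):
    """Hypothetical helper: split one dump line into
    (section, address, list of byte values)."""
    line = line.strip()
    # Drop a leading "[ 149.099435]" timestamp if there is one.
    if line.startswith("["):
        line = line.split("]", 1)[1]
    section, rest = line.split(None, 1)
    addr, hex_part = rest.split(":", 1)
    values = []
    for tok in hex_part.split():
        # Hex bytes are two-character tokens; the ASCII column ends the list.
        if len(tok) == 2 and set(tok) <= HEX_DIGITS:
            values.append(int(tok, 16))
        else:
            break
    return section, addr.strip(), values


def describe(values):
    """Map each distinct byte value to its poison name (or 'unknown')."""
    return {hex(v): SLUB_POISON.get(v, "unknown") for v in sorted(set(values))}


if __name__ == "__main__":
    sample = ("[ 150.655068] Object 878facf0: 6b 6b 6b 6b 6b 6b 6b 6b "
              "6b 6b 6b 6b 6b 6b 6b a5 kkkkkkkkkkkkkkk.")
    section, addr, values = decode_dump_line(sample)
    print(section, addr, describe(values))
```

Read this way, the Object area (0x6b ending in 0xa5) is the pattern of a free object in both reports, and only the left redzone flips between the inactive and active values.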

Root cause analysis

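Looking at the repeated error output, every function involved is standard Linux kernel code, which is unlikely to be at fault, so overall we suspected a false positive. Working from how SLUB DEBUG detection operates, we read through the relevant kernel code and found that the debug processing on the allocation path of a slab object is not strictly protected against concurrent access from other cores. We therefore infer that the false positive arises as follows: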

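Two cores are involved: CPU0 runs the allocation, CPU1 runs slabinfo -v.

  1. CPU0: the workload requests a slab object. The object is first taken off the freelist, which completes the first half of the allocation.
  2. CPU1: slabinfo -v is triggered and walks all slab caches. It uses each slab's freelist to decide whether an object has been allocated (ACTIVE versus INACTIVE in the code). The object just taken has already been removed from the freelist, so check_object expects a redzone fill of 0xCC. Because the allocation has not finished, the actual value is still 0xBB. This produces the first false "Redzone overwritten" report, and the redzone is forcibly rewritten to 0xCC.
  3. CPU0: the allocation continues and first calls check_object, which at this point expects a redzone fill of 0xBB. Because the slabinfo -v pass in the previous step changed it to 0xCC, the second false "Redzone overwritten" is reported.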
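
The interleaving described above can be replayed with a small deterministic model. This is only a toy sketch under stated assumptions: the class Obj, the list standing in for the freelist, and the function simulate are hypothetical names, and the check_object here merely imitates the compare, report and restore behaviour visible in the log. It is not the kernel implementation.

```python
SLUB_RED_INACTIVE = 0xBB  # redzone value of a free object
SLUB_RED_ACTIVE = 0xCC    # redzone value of an allocated object


class Obj:
    """Toy slab object: only the redzone byte matters for this model."""

    def __init__(self):
        self.redzone = SLUB_RED_INACTIVE


def check_object(obj, expected, who, reports):
    """Compare the redzone with the expected value; on mismatch record a
    report and restore the expected value (like the "FIX ... Restoring" line)."""
    if obj.redzone != expected:
        reports.append((who, obj.redzone, expected))
        obj.redzone = expected
        return False
    return True


def simulate():
    """Replay the three steps in a fixed order and return the reports."""
    reports = []
    obj = Obj()
    freelist = [obj]

    # Step 1 (allocating CPU): the object leaves the freelist,
    # but its redzone has not been switched to ACTIVE yet.
    freelist.remove(obj)

    # Step 2 (validating CPU): objects missing from the freelist are
    # treated as allocated, so the ACTIVE redzone is expected.
    expected = SLUB_RED_INACTIVE if obj in freelist else SLUB_RED_ACTIVE
    check_object(obj, expected, "slabinfo -v", reports)

    # Step 3 (allocating CPU): the allocation resumes, verifies that the
    # object still looks free, then marks it ACTIVE.
    check_object(obj, SLUB_RED_INACTIVE, "allocation", reports)
    obj.redzone = SLUB_RED_ACTIVE
    return reports


if __name__ == "__main__":
    for who, found, expected in simulate():
        print("%s: first byte 0x%02x instead of 0x%02x" % (who, found, expected))
```

The model yields two reports with the found and expected values swapped, which is the same pairing as the two "First byte ... instead of ..." lines in the log, even though nothing ever wrote outside the object.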

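A search online shows that this is a known situation, with the same cause as analyzed above. Because the fix would hurt SLUB performance, it has never been changed.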

Christoph Lameter, Nov. 22, 2011, 4:20 p.m. UTC | #16
Argh. The Redzoning (and the general object pad initialization) is outside of the slab_lock now. So I get wrong positives on those now. That is already in 3.1 as far as I know.
To solve that we would have to cover a much wider area in the alloc and free with the slab lock.

Conclusion

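The SLUB DEBUG checks themselves run without a lock, so in principle they are not safe against concurrent operation; running slabinfo -v adds an extra chance of a false positive.
Using slabinfo -v to trigger validation actively is fine, as long as this kind of false positive is ruled out when reading the results.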
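
One way to rule out this pattern when reading burn-in logs is to look for its signature: a report that found 0xbb where 0xcc was expected, followed by a report on the same address range that found 0xcc where 0xbb was expected. The following is a heuristic sketch only. The function name find_suspect_pairs and the regular expression are assumptions based on the log format shown earlier, and a match is a hint that the pair may be a false positive, not proof. It also does not check that the first report came from the slabinfo process.

```python
import re

# Matches lines such as:
# "INFO: 0x878fac00-0x878fac7f. First byte 0xbb instead of 0xcc"
REDZONE_RE = re.compile(
    r"INFO: (0x[0-9a-f]+)-(0x[0-9a-f]+)\. "
    r"First byte (0x[0-9a-f]+) instead of (0x[0-9a-f]+)"
)


def find_suspect_pairs(log_lines):
    """Return address ranges that show the 0xbb/0xcc report followed by
    the mirrored 0xcc/0xbb report (hypothetical helper, heuristic only)."""
    pending = set()   # ranges for which the first half of the pair was seen
    suspects = []
    for line in log_lines:
        match = REDZONE_RE.search(line)
        if match is None:
            continue
        start, end, found, expected = match.groups()
        key = (start, end)
        if (found, expected) == ("0xbb", "0xcc"):
            pending.add(key)
        elif (found, expected) == ("0xcc", "0xbb") and key in pending:
            pending.discard(key)
            suspects.append(key)
    return suspects


if __name__ == "__main__":
    sample = [
        "[ 147.233544] INFO: 0x878fac00-0x878fac7f. First byte 0xbb instead of 0xcc",
        "[ 155.194490] INFO: 0x878fac00-0x878fac7f. First byte 0xcc instead of 0xbb",
    ]
    print(find_suspect_pairs(sample))
```

Any redzone report that does not fit this pair, for example one whose first byte is neither 0xbb nor 0xcc, still deserves a normal investigation.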
