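Lockdep Deadlock Detection in the Linux Kernel

Contents

1. Deadlocks

Detection technique: Lockdep

2. Lockdep kernel configuration

What lockdep reports

3. Deadlock detection examples

Experiment 1: a hidden lock acquisition

Experiment 2: AB-BA locking

4. Lock statistics

5. Lockdep programming advice

6. Possible problems when using lockdep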


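1. Deadlocks

A deadlock is a situation in which two or more processes or threads compete for resources and end up waiting on each other indefinitely.

Example: process A needs resource X and process B needs resource Y, but X is held by B and Y is held by A, and neither releases what it holds. Both wait forever.

Common kinds of deadlock:

        1. Recursive (self) deadlock: a context tries to take a lock it already holds

        2. AB-BA deadlock: two contexts take the same two locks in opposite orders

Detection technique: Lockdep

How it works: lockdep tracks the state of every lock and the dependencies between locks (the order in which they are acquired), and checks them against a set of rules to verify that the dependencies are valid.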
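The dependency-tracking idea can be illustrated with a small user-space sketch. This is not lockdep's actual implementation: the names `dep`, `reachable`, `record_dependency` and the limit `MAX_CLASSES` are hypothetical, and the sketch only covers the ordering rule (a new "held -> wanted" edge must not close a cycle), not the IRQ-state rules.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CLASSES 8   /* hypothetical small limit, enough for the sketch */

/* dep[a][b] is true when lock class b was acquired while a was held (a -> b). */
bool dep[MAX_CLASSES][MAX_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++)
		if (dep[from][next] && !seen[next] && reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record the dependency "held -> wanted".
 * Returns false, and records nothing, when the new edge would close a cycle,
 * that is, when 'held' is already reachable from 'wanted'.
 * held == wanted models recursive locking and is rejected as well.
 */
bool record_dependency(int held, int wanted)
{
	bool seen[MAX_CLASSES] = { false };

	assert(held >= 0 && held < MAX_CLASSES);
	assert(wanted >= 0 && wanted < MAX_CLASSES);

	if (reachable(wanted, held, seen))
		return false;
	dep[held][wanted] = true;
	return true;
}
```

With classes A, B and C numbered 0, 1 and 2, recording A -> B and then B -> C succeeds, while a later attempt to record C -> A is rejected because A already leads to C. The key point is that the check works on the recorded orderings alone, so a problem can be flagged even if the deadlock never actually occurs at run time.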

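2. Lockdep kernel configuration

These options cover spinlocks, mutexes, and other lock types.

Each option is described in detail in lib/Kconfig.debug in the kernel source tree.

CONFIG_PROVE_LOCKING=y                 proves locking correctness; the kernel reports a potential deadlock when it detects one

CONFIG_DEBUG_LOCKDEP                   extra self-checks of the lock dependency engine itself

CONFIG_LOCK_STAT                       lock statistics; tracks the points of lock contention (see section 4)

CONFIG_DEBUG_RT_MUTEXES                deadlocks related to real-time mutex semantics

CONFIG_DEBUG_LOCK_ALLOC                detects incorrect freeing of live locks (locks that are still held)

CONFIG_DEBUG_ATOMIC_SLEEP              detects sleeping inside atomic sections

CONFIG_DEBUG_LOCKING_API_SELFTESTS     boot-time self-tests of the locking API

CONFIG_LOCK_TORTURE_TEST               torture tests for locking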

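Menu path: Kernel hacking -> Lock Debugging

[Figure 1: the Lock Debugging menu]

Turning all of these on is fine for a debug kernel, but avoid enabling them in production: they consume a lot of memory and slow the kernel down.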

What lockdep reports

  • WARN*() messages
  • deadlock and lock inversion scenarios
  • circular lock dependencies
  • hard IRQ/soft IRQ safe/unsafe locking bugs

3. Deadlock detection examples

Experiment 1: a hidden lock acquisition

1) Simplified program

do_each_thread(g, t) { /* 'g': process ptr; 't': thread ptr */
 task_lock(t);              /* takes t->alloc_lock */
 [ ... ]
 get_task_comm(tasknm, t);  /* copies t->comm into tasknm */
 task_unlock(t);
} while_each_thread(g, t);

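The loop walks every thread to read its task structure: take the lock, fetch the task information, release the lock. At first glance nothing looks wrong.

2) Kernel output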

#insmod thrd_showall_buggy.ko 
[ 1404.479012] thrd_showall_buggy: loading out-of-tree module taints kernel.
[ 1404.484444] thrd_showall_buggy: module verification failed: signature and/or required key missing - tainting kernel
[ 1404.510962] thrd_showall_buggy: inserted
[ 1404.516417] ============================================
[ 1404.517142] WARNING: possible recursive locking detected
[ 1404.517826] 5.0.0+ #2 Tainted: G           OE    
[ 1404.518250] --------------------------------------------
[ 1404.518432] insmod/1348 is trying to acquire lock:
[ 1404.519375] 000000001759de9e (&(&p->alloc_lock)->rlock){+.+.}, at: __get_task_comm+0x38/0x88
[ 1404.521885] 
[ 1404.521885] but task is already holding lock:
[ 1404.522282] 000000001759de9e (&(&p->alloc_lock)->rlock){+.+.}, at: showthrds_buggy+0x9c/0x504 [thrd_showall_buggy]
[ 1404.523738] 
[ 1404.523738] other info that might help us debug this:
[ 1404.524108]  Possible unsafe locking scenario:
[ 1404.524108] 
[ 1404.524359]        CPU0
[ 1404.524451]        ----
[ 1404.524588]   lock(&(&p->alloc_lock)->rlock);
[ 1404.524774]   lock(&(&p->alloc_lock)->rlock);
[ 1404.525658] 
[ 1404.525658]  *** DEADLOCK ***
[ 1404.525658] 
[ 1404.526054]  May be due to missing lock nesting notation
[ 1404.526054] 
[ 1404.526665] 1 lock held by insmod/1348:
[ 1404.527124]  #0: 000000001759de9e (&(&p->alloc_lock)->rlock){+.+.}, at: showthrds_buggy+0x9c/0x504 [thrd_showall_buggy]
[ 1404.528286] 
[ 1404.528286] stack backtrace:
[ 1404.529195] CPU: 1 PID: 1348 Comm: insmod Kdump: loaded Tainted: G           OE     5.0.0+ #2
[ 1404.530369] Hardware name: linux,dummy-virt (DT)
[ 1404.531230] Call trace:
[ 1404.531459]  dump_backtrace+0x0/0x52c

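3) Analysis of the output

  • WARNING: possible recursive locking detected: lockdep detected recursive locking, i.e. an attempt to take a lock that is already held.
  • [ 1404.518432] insmod/1348 is trying to acquire lock: the lock the task is attempting to take

        000000001759de9e (&(&p->alloc_lock)->rlock){+.+.}, at: __get_task_comm+0x38/0x88
        The "at:" field gives the function name, the offset into it, and the function's size, which makes the call site easy to locate.

        [ 1404.521885] but task is already holding lock: the lock that is already held (the same lock, 000000001759de9e)
        at: showthrds_buggy+0x9c/0x504 [thrd_showall_buggy] is where it was taken, inside the module

  • Meaning of the {+.+.} notation

        '+' means the lock was acquired with IRQs enabled

        '.' means the lock was acquired with IRQs disabled, and not in IRQ context

        For the full description see: https://www.kernel.org/doc/Documentation/locking/lockdep-design.txt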
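As a reading aid, the usage characters can be put in a small lookup function. This is a sketch only: the function name `describe_usage_char` is hypothetical, and the '-' and '?' entries are taken from the lockdep design document rather than from the log above.

```c
#include <stddef.h>

/* Describe one lockdep usage character; returns NULL for an unknown one. */
const char *describe_usage_char(char c)
{
	switch (c) {
	case '.': return "acquired with IRQs disabled, not in IRQ context";
	case '-': return "acquired in IRQ context";
	case '+': return "acquired with IRQs enabled";
	case '?': return "acquired in IRQ context with IRQs enabled";
	default:  return NULL;
	}
}
```

Applied to each character of {+.+.} in turn, this yields "IRQs enabled", "IRQs disabled, not in IRQ context", and so on; which IRQ state each position refers to is explained in the design document.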

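The analysis shows that get_task_comm tries to take the same lock (alloc_lock) that the caller already holds via task_lock, which deadlocks. The kernel source of this function confirms that it does take the lock: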

#define get_task_comm(buf, tsk) ({			\
	BUILD_BUG_ON(sizeof(buf) != TASK_COMM_LEN);	\
	__get_task_comm(buf, sizeof(buf), tsk);		\
})


char *__get_task_comm(char *buf, size_t buf_size, struct task_struct *tsk)
{
	task_lock(tsk);
	strncpy(buf, tsk->comm, buf_size);
	task_unlock(tsk);
	return buf;
}
EXPORT_SYMBOL_GPL(__get_task_comm);

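4) Fix (simplified)

do_each_thread(g, t) {
       task_lock(t);
       ...
       task_unlock(t);            /* drop the lock before calling get_task_comm() */
       get_task_comm(tasknm, t);  /* takes and releases the task lock itself */
       task_lock(t);              /* retake it for the remaining work */
       ...
       task_unlock(t);
} while_each_thread(g, t);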

Experiment 2: AB-BA locking

1) Model

[Figure 2: AB-BA lock model]

2) Lock order in the program

Thread 1:
         spin_lock(&lockA);           
         spin_lock(&lockB);

         spin_unlock(&lockB);
         spin_unlock(&lockA);

Thread 2:
         spin_lock(&lockB);
         spin_lock(&lockA);

         spin_unlock(&lockA);
         spin_unlock(&lockB);

3) Kernel detection output

insmod deadlock_eg_AB-BA.ko lock_ooo=1
key missing - tainting kernel
[  190.895374] deadlock_eg_AB-BA: inserted (param: lock_ooo=1)
[  190.924925] thrd_work():115: *** thread PID 1616 on cpu 0 now ***
[  190.936420] thrd_work():115: *** thread PID 1617 on cpu 1 now ***
[  190.937541]  iteration #0 on cpu #1
[  190.938060]  Thread #0: locking: we do: lockA --> lockB
[  190.939223]  Thread #1: locking: we do: lockB --> lockA
[  190.941822]  iteration #0 on cpu #0
[  190.946014] B
[  190.946185] A
[  190.946231] B
[  190.949057] A
[  190.949818] A
[  190.950828] irq event stamp: 12493
[  190.950846] 
[  190.952232] hardirqs last  enabled at (12493): [] kmem_cache_free+0x6b0/0x1178
[  190.953328] hardirqs last disabled at (12492): [] kmem_cache_free+0x660/0x1178
[  190.953951] ======================================================
[  190.953983] WARNING: possible circular locking dependency detected
[  190.955155] softirqs last  enabled at (12436): [] fpsimd_restore_current_state+0x4fc/0x53c
[  190.957546] 5.0.0+ #2 Tainted: G           OE    
[  190.957646] ------------------------------------------------------
[  190.957880] softirqs last disabled at (12434): [] fpsimd_restore_current_state+0x328/0x53c
[  190.960741] thrd_0/0/1616 is trying to acquire lock:
[  190.964268] (____ptrval____) (lockB){+.+.}, at: thrd_work+0x1e8/0x6c0 [deadlock_eg_AB_BA]
[  190.973906] 
[  190.973906] but task is already holding lock:
[  190.975638] (____ptrval____) (lockA){+.+.}, at: thrd_work+0x130/0x6c0 [deadlock_eg_AB_BA]
[  190.979383] 
[  190.979383] which lock already depends on the new lock.
[  190.979383] 
[  190.984836] 
[  190.984836] the existing dependency chain (in reverse order) is:
[  190.989808] 
[  190.989808] -> #1 (lockA){+.+.}:
[  190.991925]        validate_chain+0x1250/0x14a0
[  190.992364]        __lock_acquire+0xae4/0xc08
[  190.993684]        lock_acquire+0x664/0x6b8
[  190.998583]        _raw_spin_lock+0x54/0xb0
[  190.999824]        thrd_work+0x3f0/0x6c0 [deadlock_eg_AB_BA]
[  191.001019]        kthread+0x3c0/0x3cc
[  191.002301] 
[  191.002301] -> #0 (lockB){+.+.}:
[  191.006355]        check_prevs_add+0x148/0x2cc
[  191.007502]        validate_chain+0x1250/0x14a0
[  191.010230]        __lock_acquire+0xae4/0xc08
[  191.011146]        lock_acquire+0x664/0x6b8
[  191.012896]        _raw_spin_lock+0x54/0xb0
[  191.016753]        thrd_work+0x1e8/0x6c0 [deadlock_eg_AB_BA]
[  191.020368]        kthread+0x3c0/0x3cc
[  191.022408] 
[  191.022408] other info that might help us debug this:
[  191.022408] 
[  191.025625]  Possible unsafe locking scenario:
[  191.025625] 
[  191.030342]        CPU0                    CPU1
[  191.034011]        ----                    ----
[  191.035514]   lock(lockA);
[  191.037973]                                lock(lockB);
[  191.042529]                                lock(lockA);
[  191.045536]   lock(lockB);
[  191.047178] 
[  191.047178]  *** DEADLOCK ***
[  191.047178] 
[  191.051286] 1 lock held by thrd_0/0/1616:
[  191.053763]  #0: (____ptrval____) (lockA){+.+.}, at: thrd_work+0x130/0x6c0 [deadlock_eg_AB_BA]
[  191.058936] 
[  191.058936] stack backtrace:
[  191.060266] CPU: 0 PID: 1616 Comm: thrd_0/0 Kdump: loaded Tainted: G           OE     5.0.0+ #2
[  191.061426] Hardware name: linux,dummy-virt (DT)
[  191.062226] Call trace:
[  191.062582]  dump_backtrace+0x0/0x52c
[  191.063168]  show_stack+0x24/0x30

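4) Analysis

  • WARNING: possible circular locking dependency detected: lockdep found a cycle in the lock dependency graph.
  • thrd_0 already holds lockA and is trying to take lockB, while the recorded dependency chain shows that lockA has previously been taken while lockB was held (lockB -> lockA). The two orders together form the cycle.
  • Possible unsafe locking scenario: the interleaving on two CPUs that would deadlock: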

[  191.030342]        CPU0                    CPU1
[  191.034011]        ----                    ----
[  191.035514]   lock(lockA);
[  191.037973]                                lock(lockB);
[  191.042529]                                lock(lockA);
[  191.045536]   lock(lockB);
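The usual remedy for an AB-BA report is to make every code path take the two locks in the same order. The sketch below shows one way to enforce that, using user-space pthread mutexes as a stand-in for the kernel spinlocks; the helpers `order_pair`, `lock_pair` and `unlock_pair` are hypothetical names, and ordering by address is an assumption of this sketch, not something the test module does.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Put the two pointers into a fixed global order: lowest address first. */
static void order_pair(pthread_mutex_t **x, pthread_mutex_t **y)
{
	if ((uintptr_t)*x > (uintptr_t)*y) {
		pthread_mutex_t *tmp = *x;
		*x = *y;
		*y = tmp;
	}
}

/* Take both mutexes; the argument order given by the caller does not matter. */
void lock_pair(pthread_mutex_t *x, pthread_mutex_t *y)
{
	assert(x != y);          /* taking the same mutex twice would self-deadlock */
	order_pair(&x, &y);
	pthread_mutex_lock(x);
	pthread_mutex_lock(y);
}

/* Release both mutexes in the reverse of the acquisition order. */
void unlock_pair(pthread_mutex_t *x, pthread_mutex_t *y)
{
	order_pair(&x, &y);
	pthread_mutex_unlock(y);
	pthread_mutex_unlock(x);
}
```

With these helpers, one thread can call lock_pair(&lockA, &lockB) and another lock_pair(&lockB, &lockA): both end up acquiring the locks in the same underlying order, so the cycle shown in the report cannot form. In kernel code the simpler convention is to document a single order (for example, lockA always before lockB) and follow it everywhere.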

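4. Lock statistics

The kernel can collect lock statistics so that heavily contended locks are easy to identify.

A lock is contended when a context tries to acquire it while it is already held and therefore has to wait for it to be released. Heavy contention can become a serious performance bottleneck.

Kernel configuration: CONFIG_LOCK_STAT

Command line

Clear the statistics: echo 0 > /proc/lock_stat

Enable collection: echo 1 > /proc/sys/kernel/lock_stat

Disable collection: echo 0 > /proc/sys/kernel/lock_stat

View the statistics: cat /proc/lock_stat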

5. Lockdep programming advice

Use the lockdep_assert_held() family of macros, defined in include/linux/lockdep.h:

#define lockdep_assert_held(l)	do {				\
		WARN_ON(debug_locks && !lockdep_is_held(l));	\
	} while (0)

#define lockdep_assert_held_write(l)	do {			\
		WARN_ON(debug_locks && !lockdep_is_held_type(l, 0));	\
	} while (0)

#define lockdep_assert_held_read(l)	do {				\
		WARN_ON(debug_locks && !lockdep_is_held_type(l, 1));	\
	} while (0)

#define lockdep_assert_held_once(l)	do {				\
		WARN_ON_ONCE(debug_locks && !lockdep_is_held(l));	\
	} while (0)

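If the assertion fails, a warning is emitted through WARN_ON (WARN_ON_ONCE for the _once variant). A typical place for these macros is at the top of a function that must only be called with a particular lock held.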
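The pattern behind these macros can be tried out in user space. The following is an analogue under stated assumptions, not kernel code: `struct tracked_lock`, `tracked_acquire`, `tracked_release`, `tracked_is_held` and `counter_inc_locked` are hypothetical names, the lock records its owner by hand instead of relying on lockdep, and the check returns false rather than calling WARN_ON so that the caller can observe the result.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

/* A mutex that remembers which thread holds it (only this thread reads it). */
struct tracked_lock {
	pthread_mutex_t mutex;
	pthread_t owner;
	bool held;
};

void tracked_acquire(struct tracked_lock *l)
{
	pthread_mutex_lock(&l->mutex);
	l->owner = pthread_self();
	l->held = true;
}

void tracked_release(struct tracked_lock *l)
{
	l->held = false;
	pthread_mutex_unlock(&l->mutex);
}

/* True when the calling thread currently holds the lock. */
bool tracked_is_held(const struct tracked_lock *l)
{
	return l->held && pthread_equal(l->owner, pthread_self());
}

struct counter {
	struct tracked_lock lock;
	long value;
};

/* Must be called with c->lock held; warns and refuses to run otherwise. */
bool counter_inc_locked(struct counter *c)
{
	if (!tracked_is_held(&c->lock)) {
		fprintf(stderr, "warning: counter_inc_locked() called without lock\n");
		return false;
	}
	c->value++;
	return true;
}
```

Calling counter_inc_locked() without first calling tracked_acquire() prints the warning and leaves the counter untouched; calling it between tracked_acquire() and tracked_release() increments the counter. In the kernel, the equivalent of the check at the top of counter_inc_locked() is a single lockdep_assert_held(&lock) line.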

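6. Possible problems when using lockdep

Known problems

  1. Repeatedly loading and unloading modules can exceed lockdep's internal limit on lock classes. In practice, either avoid repeated load/unload cycles or reboot the system.
  2. When a data structure contains a very large number of locks (for example, an array of structures), failing to initialize every lock properly can overflow lockdep's lock classes.
  3. The message "*WARNING* lock debugging disabled!! - possibly due to a lockdep warning." can appear because lockdep already issued a warning earlier; lockdep turns itself off after its first report.

Workarounds and related tools

  • Reboot the system and retry.
  • KCSAN can detect data races in the kernel.
  • deadlock: a script provided with the eBPF tools
  • helgrind and TSan check for data races in multithreaded user-space applications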

