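spin_lock is a synchronization mechanism in the Linux kernel. Kernel code claims ownership of a resource by acquiring a spin_lock and keeps that ownership until it releases the lock. If kernel code tries to acquire a spin_lock that is already held, it busy-waits until it obtains the lock.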
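The busy-wait behaviour can be illustrated outside the kernel. The following is a minimal userspace sketch built on C11 `atomic_flag`; it is not the kernel's implementation, and every `demo_` name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of a spinlock; not kernel code. */
typedef struct {
    atomic_flag locked;   /* set while some owner holds the lock */
} demo_spinlock_t;

#define DEMO_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once: returns true if the lock was free and is now ours. */
static bool demo_spin_trylock(demo_spinlock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

/* Busy-wait until the lock is acquired. */
static void demo_spin_lock(demo_spinlock_t *l)
{
    while (!demo_spin_trylock(l))
        ;   /* spin: the caller keeps the CPU and simply retries */
}

/* Release the lock so that a spinning caller can take it. */
static void demo_spin_unlock(demo_spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

A caller brackets its critical section with `demo_spin_lock()` and `demo_spin_unlock()`; a second caller arriving in between loops in `demo_spin_lock()` until the flag is cleared.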
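The kernel implements spin_lock differently for uniprocessor (UP) and multiprocessor (SMP) builds. On UP, as long as the spin_lock is not taken in interrupt context, the code holding the lock can lose the CPU only through kernel preemption. So on UP it is enough to disable preemption when the lock is acquired and to re-enable it when the lock is released. On SMP, two pieces of code can run at the same time on different cores, and only then is a real lock needed to declare ownership of the resource.
The file include/linux/spinlock.h lists the header files involved in the UP and SMP cases, and makes the difference between the two implementations quite clear.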
/*
 * include/linux/spinlock.h - generic spinlock/rwlock declarations
 *
 * here's the role of the various spinlock/rwlock related include files:
 *
 * on SMP builds:
 *
 *  asm/spinlock_types.h: contains the arch_spinlock_t/arch_rwlock_t and the
 *                        initializers
 *
 *  linux/spinlock_types.h:
 *                        defines the generic type and initializers
 *
 *  asm/spinlock.h:       contains the arch_spin_*()/etc. lowlevel
 *                        implementations, mostly inline assembly code
 *
 *   (also included on UP-debug builds:)
 *
 *  linux/spinlock_api_smp.h:
 *                        contains the prototypes for the _spin_*() APIs.
 *
 *  linux/spinlock.h:     builds the final spin_*() APIs.
 *
 * on UP builds:
 *
 *  linux/spinlock_type_up.h:
 *                        contains the generic, simplified UP spinlock type.
 *                        (which is an empty structure on non-debug builds)
 *
 *  linux/spinlock_types.h:
 *                        defines the generic type and initializers
 *
 *  linux/spinlock_up.h:
 *                        contains the arch_spin_*()/etc. version of UP
 *                        builds. (which are NOPs on non-debug, non-preempt
 *                        builds)
 *
 *   (included on UP-non-debug builds:)
 *
 *  linux/spinlock_api_up.h:
 *                        builds the _spin_*() APIs.
 *
 *  linux/spinlock.h:     builds the final spin_*() APIs.
 */
The following code shows that UP and SMP are distinguished by the CONFIG_SMP option, which determines which header file is compiled in.
/*
 * Pull the _spin_*()/_read_*()/_write_*() functions/declarations:
 */
#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
# include <linux/spinlock_api_smp.h>
#else
# include <linux/spinlock_api_up.h>
#endif

static inline void spin_lock(spinlock_t *lock)
{
    raw_spin_lock(&lock->rlock);
}

#define raw_spin_lock(lock) _raw_spin_lock(lock)
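The selection mechanism can be mimicked in a standalone file. The sketch below uses a hypothetical `DEMO_SMP` macro in place of CONFIG_SMP and made-up `demo_` names. It only illustrates how one public name can resolve to different bodies at compile time, and it does not reproduce the kernel's headers.

```c
#include <assert.h>
#include <string.h>

/* DEMO_SMP is a hypothetical stand-in for CONFIG_SMP; build with
 * -DDEMO_SMP to select the "SMP" variant. */
#if defined(DEMO_SMP)
static const char *demo_raw_spin_lock_impl(int *lock)
{
    *lock = 1;      /* placeholder for a real atomic acquire */
    return "smp";
}
#else
static const char *demo_raw_spin_lock_impl(int *lock)
{
    (void)lock;     /* UP: the lock word is never touched */
    return "up";
}
#endif

/* Public name layered on the selected variant, in the same spirit as
 * raw_spin_lock() being defined in terms of _raw_spin_lock(). */
#define demo_raw_spin_lock(lock) demo_raw_spin_lock_impl(lock)

static const char *demo_spin_lock(int *lock)
{
    return demo_raw_spin_lock(lock);
}
```

Callers only ever see `demo_spin_lock()`; which body runs is fixed when the file is compiled, so the UP build pays nothing for SMP-only logic.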
Implementation of spin_lock on UP
The implementation is in include/linux/spinlock_api_up.h:
/*
 * In the UP-nondebug case there's no real locking going on, so the
 * only thing we have to do is to keep the preempt counts and irq
 * flags straight, to suppress compiler warnings of unused lock
 * variables, and to add the proper checker annotations:
 */
#define __LOCK(lock) \
    do { \
        preempt_disable(); \
        __acquire(lock); \
        (void)(lock); \
    } while (0)

#define _raw_spin_lock(lock) __LOCK(lock)
The code shows that on UP spin_lock is reduced to just three statements:
preempt_disable();
__acquire(lock);
(void)(lock);
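preempt_disable() increments the current task's preempt_count by 1, which disables kernel preemption: the kernel will not reschedule the task when it returns from interrupt context.
__acquire(lock) is an annotation for the sparse static checker; outside a sparse run the macro expands to nothing.
Passing the corresponding argument to make runs the sparse checks at build time.
(void)(lock) is there only to keep the compiler from warning that lock is unused.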
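The three statements can be modelled in userspace to show why a counter is all a UP build needs. This is a simplified sketch: the global `demo_preempt_count`, the `DEMO_LOCK`/`DEMO_UNLOCK` macros and the other `demo_` names are hypothetical. The kernel keeps the count per task, and its enable path does more work than this model.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-task preempt_count. */
static int demo_preempt_count;

static void demo_preempt_disable(void)
{
    demo_preempt_count++;
}

static void demo_preempt_enable(void)
{
    assert(demo_preempt_count > 0);   /* an unbalanced enable is a bug */
    demo_preempt_count--;
}

/* Outside a checker run the annotations expand to nothing. */
#define demo_acquire(x) ((void)0)
#define demo_release(x) ((void)0)

/* ISO C does not allow empty structs, so the model keeps one member. */
typedef struct { int unused; } demo_up_lock_t;

#define DEMO_LOCK(lock) \
    do { \
        demo_preempt_disable(); \
        demo_acquire(lock); \
        (void)(lock); \
    } while (0)

#define DEMO_UNLOCK(lock) \
    do { \
        demo_preempt_enable(); \
        demo_release(lock); \
        (void)(lock); \
    } while (0)

/* A scheduler would preempt the task only while the count is zero. */
static int demo_preemptible(void)
{
    return demo_preempt_count == 0;
}
```

Because the count nests, taking two locks and releasing only one still leaves preemption disabled, which is the behaviour nested critical sections need.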
Implementation of spin_lock on SMP
The implementation is in include/linux/spinlock_api_smp.h:
static inline void __raw_spin_lock(raw_spinlock_t *lock)
{
    preempt_disable();
    spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
    LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);
}
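Likewise, the SMP implementation breaks down into three statements.
preempt_disable() works exactly as on UP.
spin_acquire() is a hook for the lock dependency validator (lockdep); it compiles to nothing when lockdep is disabled.
LOCK_CONTENDED() is a macro. Ignoring CONFIG_LOCK_STAT (which collects statistics on lock operations), it is defined as: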
#define LOCK_CONTENDED(_lock, try, lock) \
lock (_lock)
The third statement is therefore equivalent to:
do_raw_spin_lock(lock)
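For contrast, here is a sketch of what a statistics-gathering variant of such a macro might look like: attempt the fast path first, count a contention event if it fails, then fall back to the blocking call. It is modelled loosely on the idea behind CONFIG_LOCK_STAT and is not copied from the kernel; the counters and all `demo_`/`DEMO_` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_int locked;        /* 0 = free, 1 = held */
    atomic_ulong contended;   /* times the fast path failed */
    atomic_ulong acquired;    /* total successful acquisitions */
} demo_lock_t;

/* Fast path: succeeds only if the lock is currently free. */
static bool demo_trylock(demo_lock_t *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&l->locked, &expected, 1);
}

/* Slow path: busy-wait until the lock can be taken. */
static void demo_lock(demo_lock_t *l)
{
    while (!demo_trylock(l))
        ;
}

static void demo_unlock(demo_lock_t *l)
{
    atomic_store(&l->locked, 0);
}

/* Statistics variant: the slow path runs only after a failed try. */
#define DEMO_LOCK_CONTENDED(_lock, try, lock) \
    do { \
        if (!try(_lock)) { \
            atomic_fetch_add(&(_lock)->contended, 1); \
            lock(_lock); \
        } \
        atomic_fetch_add(&(_lock)->acquired, 1); \
    } while (0)

/* Plain variant: same shape as the two-line definition quoted above. */
#define DEMO_LOCK_PLAIN(_lock, try, lock) lock(_lock)
```

Both variants take the same three arguments, so a call site does not change when statistics are switched on or off; only the macro body differs.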
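In non-debug builds do_raw_spin_lock() is defined in spinlock.h, where it forwards to arch_spin_lock(). Its trylock counterpart follows the same pattern:
static inline int do_raw_spin_trylock(raw_spinlock_t *lock)
{
    return arch_spin_trylock(&(lock)->raw_lock);
}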
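The arch prefix tells us that this function is architecture-specific. The following sections look at its implementation on the ARM and x86 architectures.
Implementation of spin_lock on ARM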
static inline void arch_spin_lock(arch_spinlock_t *lock)
{
    unsigned long tmp;

    __asm__ __volatile__(
    /* Load lock->lock into tmp and mark the address for exclusive access. */
"1: ldrex   %0, [%1]\n"
    /* Test whether tmp is 0. */
"   teq     %0, #0\n"
    /* Non-zero means the lock is held: execute WFE and wait in a
     * low-power state until an event arrives, such as the SEV issued
     * when the lock is released. */
    WFE("ne")
    /* tmp was 0: try to store 1 into lock->lock. This ends the exclusive
     * access, and tmp receives the status of the store. */
"   strexeq %0, %2, [%1]\n"
    /* A status of 0 means the store succeeded and the lock is ours. */
"   teqeq   %0, #0\n"
    /* Otherwise tmp is non-zero: branch back to label 1 and try again. */
"   bne     1b"
    : "=&r" (tmp)
    : "r" (&lock->lock), "r" (1)
    : "cc");

    smp_mb();
}
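The function body is a block of inline assembly. tmp is the output operand; it is held in a register and written as %0 in the code. &lock->lock is the first input operand, held in a register and written as %1. The constant 1 is the second input operand, held in a register and written as %2.
The key instructions are ldrex/strex and WFE. lock->lock lives in memory, so changing it from 0 to 1 takes a memory read, a test, and a memory write. If that sequence were not atomic, another core could access lock->lock in the middle of it, and both cores could end up believing they hold the lock. ldrex/strex, introduced in ARMv6, provide exclusive access to a memory location. WFE lets the CPU wait in a low-power state instead of spinning at full speed, which saves power.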
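The same load, test, conditional-store loop can be written in portable C11, where a weak compare-and-exchange plays the role of the ldrex/strex pair. This is a sketch under assumptions, not the kernel's code: the `demo_` names are hypothetical, `demo_cpu_relax()` is an empty placeholder for WFE, and acquire ordering stands in for the stronger smp_mb().

```c
#include <assert.h>
#include <stdatomic.h>

typedef struct {
    atomic_uint lock;   /* 0 = free, 1 = held */
} demo_arch_spinlock_t;

static void demo_cpu_relax(void)
{
    /* Placeholder for WFE; a real port would use an arch-specific hint. */
}

static void demo_arch_spin_lock(demo_arch_spinlock_t *l)
{
    for (;;) {
        /* ldrex + teq: read the lock word and test it against 0. */
        unsigned int tmp = atomic_load_explicit(&l->lock,
                                                memory_order_relaxed);
        if (tmp != 0) {
            demo_cpu_relax();   /* WFE("ne"): the lock is held, so wait */
            continue;           /* bne 1b */
        }
        /* strexeq: store 1 only if the word is still 0. The weak form
         * may fail spuriously, much like strex, so the loop retries. */
        if (atomic_compare_exchange_weak_explicit(&l->lock, &tmp, 1u,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return;             /* teqeq saw status 0: lock acquired */
    }
}

static void demo_arch_spin_unlock(demo_arch_spinlock_t *l)
{
    atomic_store_explicit(&l->lock, 0u, memory_order_release);
}
```

On ARM targets, compilers typically lower the compare-and-exchange to an ldrex/strex loop or a newer atomic instruction, so the structure above stays close to the hand-written assembly.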
Implementation of spin_lock on x86
The x86 implementation is in arch/x86/include/asm/spinlock.h:
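static __always_inline void __ticket_spin_lock(arch_spinlock_t *lock)
{
    short inc = 0x0100;

    asm volatile (
        /* Atomically add 0x0100 to slock; inc receives the old value. */
        LOCK_PREFIX "xaddw %w0, %1\n"
        "1:\t"
        /* Compare our ticket (high byte) with the owner (low byte). */
        "cmpb %h0, %b0\n\t"
        "je 2f\n\t"
        "rep ; nop\n\t"
        /* Reload the current owner from the lock and check again. */
        "movb %1, %b0\n\t"
        /* don't need lfence here, because loads are in-order */
        "jmp 1b\n"
        "2:"
        : "+Q" (inc), "+m" (lock->slock)
        :
        : "memory", "cc");
}
On SMP kernels LOCK_PREFIX emits the lock instruction prefix. The prefix asserts the LOCK signal for the duration of the instruction that follows, so that the memory the instruction accesses is accessed exclusively. The hardware achieves this either by locking the bus or through cache-coherency operations; see section 8.1 of the Intel System Programming Guide.
This is the newest implementation, known as the ticket lock. Every piece of code that wants the lock is handed a ticket, and ticket numbers increase in order. The lock keeps two one-byte fields: owner, the ticket currently being served, and next, the ticket that will be handed out next. When the lock is free, owner = next. When the lock is held and nobody is waiting, next = owner + 1; every additional waiter pushes next one further ahead. Acquiring the lock increments next, and releasing it increments owner.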
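The ticket scheme can be modelled in portable C11 to make the owner/next bookkeeping concrete. The sketch below assumes the layout used by the assembly above (owner in the low byte, next in the high byte of a 16-bit word). The `demo_` names are hypothetical, and the release path uses a compare-and-exchange loop as a stand-in for a byte-wide increment of owner.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Low byte: owner (ticket being served). High byte: next ticket. */
typedef struct {
    _Atomic uint16_t slock;
} demo_ticket_lock_t;

/* Take a ticket: add 0x0100 and keep the old value, like "lock xaddw". */
static uint16_t demo_ticket_take(demo_ticket_lock_t *l)
{
    return atomic_fetch_add(&l->slock, 0x0100);
}

/* Like cmpb %h0, %b0: compare my ticket with the current owner. */
static bool demo_ticket_is_served(demo_ticket_lock_t *l, uint16_t snapshot)
{
    uint8_t my_ticket = (uint8_t)(snapshot >> 8);
    uint8_t owner = (uint8_t)(atomic_load(&l->slock) & 0xff);
    return my_ticket == owner;
}

static void demo_ticket_lock(demo_ticket_lock_t *l)
{
    uint16_t snapshot = demo_ticket_take(l);
    while (!demo_ticket_is_served(l, snapshot))
        ;   /* "rep ; nop" would sit here as a spin-wait hint */
}

/* Release: bump only the owner byte, never carrying into next. */
static void demo_ticket_unlock(demo_ticket_lock_t *l)
{
    uint16_t old = atomic_load(&l->slock);
    uint16_t new_val;
    do {
        new_val = (uint16_t)((old & 0xff00) | ((old + 1) & 0x00ff));
    } while (!atomic_compare_exchange_weak(&l->slock, &old, new_val));
}
```

Splitting `demo_ticket_take()` from the wait loop makes the fairness property visible: two tickets taken back to back are served strictly in the order they were issued, whichever waiter happens to poll first.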