Windows via C/C++: Thread Synchronization in User Mode: Atomic Access with the Interlocked Family of Functions

// Define a global variable
long g_x = 0;

DWORD WINAPI ThreadFunc1(PVOID pvParam) {
  g_x++;
  return 0;
}

DWORD WINAPI ThreadFunc2(PVOID pvParam) {
  g_x++;
  return 0;
}

MOV EAX, [g_x] ; move the value in g_x into a register

INC EAX ; increase the value in the register

MOV [g_x], EAX ; store the new value back into g_x

MOV EAX, [g_x] ; thread 1: move 0 into a register

INC EAX ; thread 1: increase the register to 1

MOV EAX, [g_x] ; thread 2: move 0 into a register

MOV [g_x], EAX ; thread 1: store 1 back into g_x

INC EAX ; thread 2: increase the register to 1

MOV [g_x], EAX ; thread 2: move 1 into g_x
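
The interleaving above can be replayed deterministically. The following is a minimal sketch in portable C++, not Win32 code. The names `SimThread`, `load`, `inc`, `store` and `replay_lost_update` are made up for illustration, and each simulated thread's `eax` field stands in for that thread's private copy of the EAX register.

```cpp
#include <cassert>

// Hypothetical model: every thread has its own saved register context.
struct SimThread {
    long eax = 0;  // stands in for this thread's copy of EAX
};

// The three machine steps that make up `g_x++`.
inline void load(SimThread& t, const long& mem)  { t.eax = mem; }  // MOV EAX, [g_x]
inline void inc(SimThread& t)                    { ++t.eax; }      // INC EAX
inline void store(const SimThread& t, long& mem) { mem = t.eax; }  // MOV [g_x], EAX

// Replays the schedule from the listing above and returns the final g_x.
long replay_lost_update() {
    long g_x = 0;
    SimThread t1, t2;
    load(t1, g_x);   // thread 1 reads 0
    inc(t1);         // thread 1: register becomes 1
    load(t2, g_x);   // thread 2 also reads 0, thread 1 has not stored yet
    store(t1, g_x);  // thread 1 writes 1
    inc(t2);         // thread 2: register becomes 1
    store(t2, g_x);  // thread 2 writes 1, overwriting thread 1's update
    return g_x;      // 1, although two increments ran
}
```

Because both threads read the old value before either one writes, one of the two increments is lost. This is the failure that an atomic read-modify-write operation rules out.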


LONG InterlockedExchangeAdd(
  PLONG volatile plAddend,
  LONG lIncrement);
LONGLONG InterlockedExchangeAdd64(
  PLONGLONG volatile pllAddend,
  LONGLONG lIncrement);


long g_x = 0;

DWORD WINAPI ThreadFunc1(PVOID pvParam) {
  InterlockedExchangeAdd(&g_x, 1);
  return 0;
}

DWORD WINAPI ThreadFunc2(PVOID pvParam) {
  InterlockedExchangeAdd(&g_x, 1);
  return 0;
}
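
To try the same idea outside Windows, the sketch below uses `std::atomic<long>::fetch_add` as a portable stand-in for `InterlockedExchangeAdd`; both return the value held before the addition. The names `g_counter`, `add_many` and `run_counter` are illustrative and do not come from the original text.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Portable stand-in for the shared g_x.
std::atomic<long> g_counter{0};

// Each worker adds 1 to the shared counter `iterations` times.
void add_many(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        g_counter.fetch_add(1);  // atomic read-modify-write, returns the old value
    }
}

// Starts `thread_count` workers, waits for them, and returns the final count.
long run_counter(int thread_count, int iterations) {
    g_counter.store(0);
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back(add_many, iterations);
    }
    for (auto& t : threads) {
        t.join();
    }
    return g_counter.load();
}
```

With four workers and 10000 iterations each, `run_counter(4, 10000)` should always return 40000, because no increment can be lost.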


// the long variable shared by many threads
LONG g_x;

// Incorrect way to increase the variable
g_x ++;

// Correct way to increase the variable
InterlockedExchangeAdd(&g_x, 1);

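The implementation of the Interlocked functions depends on the CPU platform. On x86 CPUs, an Interlocked function asserts a hardware signal on the bus that prevents other CPUs from accessing the same memory address while the operation is in progress. The implementation details are not important. What matters is that these functions guarantee that the operation on their argument is atomic, regardless of the compiler, the CPU architecture, or the number of CPUs.

The address of a variable passed to an Interlocked function must be properly aligned: on a 32-bit boundary for the 32-bit functions and on a 64-bit boundary for the 64-bit functions. The C run-time library provides _aligned_malloc and _aligned_realloc to allocate and reallocate memory blocks aligned on a specified boundary: void * _aligned_malloc(size_t size, size_t alignment), where size is the number of bytes to allocate and alignment is the byte boundary on which to align the block. alignment must be a power of 2.

The Interlocked functions are also fast. A call usually takes fewer than 50 CPU cycles and involves no transition between user mode and kernel mode, which typically costs more than 1000 cycles.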
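
The alignment requirement can be checked with a few lines of portable C++. This is only a sketch: `is_power_of_two`, `is_aligned` and `g_aligned64` are hypothetical names, and `alignas` is used here as a standard C++ way to force a boundary for static storage instead of calling `_aligned_malloc`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True when `alignment` is a nonzero power of two (1, 2, 4, 8, ...).
bool is_power_of_two(std::size_t alignment) {
    return alignment != 0 && (alignment & (alignment - 1)) == 0;
}

// True when `p` sits on an `alignment`-byte boundary.
bool is_aligned(const void* p, std::size_t alignment) {
    assert(is_power_of_two(alignment));
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// A 64-bit variable forced onto an 8-byte boundary, suitable for 64-bit atomic access.
alignas(8) long long g_aligned64 = 0;
```

A debug build could assert `is_aligned(&value, sizeof(value))` before handing the address of `value` to an atomic routine.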


LONG InterlockedExchange(
  PLONG volatile plTarget,
  LONG lValue);

LONGLONG InterlockedExchange64(
  PLONGLONG volatile plTarget,
  LONGLONG lValue);

PVOID InterlockedExchangePointer(
  PVOID* volatile ppvTarget,
  PVOID pvValue);


// Global variable indicating whether a shared resource is in use or not.
// Declared as LONG because InterlockedExchange operates on a LONG.
LONG g_fResourceInUse = FALSE;

void Func1() {
  // Wait to access the resource
  while (InterlockedExchange(&g_fResourceInUse, TRUE) == TRUE)
    Sleep(0);
  // Access the resource
  // Reset the flag
  InterlockedExchange(&g_fResourceInUse, FALSE);
}
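
The same spinlock pattern can be sketched in portable C++, with `std::atomic<bool>::exchange` taking the place of `InterlockedExchange`. The names `g_in_use`, `g_protected`, `locked_increment` and `run_spinlock` are assumptions made for this example, and `std::this_thread::yield()` is only a rough analogue of the Win32 `Sleep(0)` call.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<bool> g_in_use{false};  // plays the role of g_fResourceInUse
long g_protected = 0;               // plain variable guarded by the flag

void locked_increment(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        // exchange returns the previous value: true means another thread holds the resource.
        while (g_in_use.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();  // give up the rest of the time slice
        }
        ++g_protected;                                      // access the resource
        g_in_use.store(false, std::memory_order_release);   // reset the flag
    }
}

// Runs `thread_count` workers and returns the final value of the guarded variable.
long run_spinlock(int thread_count, int iterations) {
    g_protected = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < thread_count; ++i) {
        threads.emplace_back(locked_increment, iterations);
    }
    for (auto& t : threads) {
        t.join();
    }
    return g_protected;
}
```

The guarded variable is an ordinary `long`. Only the flag is atomic, which is enough because a thread touches `g_protected` only while it owns the flag.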


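When using a spinlock, a thread should hold the protected resource for as short a time as possible. A more efficient approach is to spin for a short while first and, if the resource is still unavailable, transition to kernel mode and wait there (the thread is suspended and consumes no CPU time) until another thread releases the resource. This is the principle behind critical sections (Critical Section).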
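
The spin-then-block idea can be illustrated with a small portable sketch. This is not how Windows implements `CRITICAL_SECTION`. It only mimics the two phases, using a `std::condition_variable` as a stand-in for the kernel-mode wait. `HybridLock`, `run_hybrid` and the spin count of 100 are all hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

class HybridLock {
public:
    explicit HybridLock(int spin_count) : spin_count_(spin_count) {}

    void lock() {
        // Phase 1: spin in user mode for a bounded number of attempts.
        for (int i = 0; i < spin_count_; ++i) {
            if (!held_.exchange(true, std::memory_order_acquire)) return;
        }
        // Phase 2: block until an unlock() wakes us, then retry the exchange.
        std::unique_lock<std::mutex> guard(mutex_);
        wakeup_.wait(guard, [this] {
            return !held_.exchange(true, std::memory_order_acquire);
        });
    }

    void unlock() {
        {
            // Clearing the flag under the mutex prevents a lost wakeup.
            std::lock_guard<std::mutex> guard(mutex_);
            held_.store(false, std::memory_order_release);
        }
        wakeup_.notify_one();
    }

private:
    std::atomic<bool> held_{false};
    std::mutex mutex_;
    std::condition_variable wakeup_;
    int spin_count_;
};

// Runs `thread_count` workers that each increment a shared total under the lock.
long run_hybrid(int thread_count, int iterations) {
    HybridLock lock(100);
    long total = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++total;
                lock.unlock();
            }
        });
    }
    for (auto& th : threads) {
        th.join();
    }
    return total;
}
```

On Windows itself, `InitializeCriticalSectionAndSpinCount` lets a critical section spin a chosen number of times before it falls back to waiting.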


LONG InterlockedCompareExchange(
  PLONG plDestination,
  LONG lExchange,
  LONG lCompared);

PVOID InterlockedCompareExchangePointer(
  PVOID* ppvDestination,
  PVOID pvExchange,
  PVOID pvCompared);
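
The compare-and-exchange semantics can be sketched with `std::atomic`: the destination is overwritten only if it currently equals the comparand, and the call reports the value that was there before. The wrapper name `compare_exchange` is made up for this example and mirrors the parameter order of `InterlockedCompareExchange`.

```cpp
#include <atomic>
#include <cassert>

// Returns the value `destination` held before the call.
// `destination` becomes `exchange` only if it was equal to `comparand`.
long compare_exchange(std::atomic<long>& destination, long exchange, long comparand) {
    long expected = comparand;
    // On failure, compare_exchange_strong writes the current value into `expected`.
    // On success, `expected` still equals `comparand`, which was the old value.
    destination.compare_exchange_strong(expected, exchange);
    return expected;
}
```

A caller can tell whether the swap happened by comparing the return value with the comparand it passed in.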



LONG InterlockedIncrement(PLONG plAddend);

LONG InterlockedDecrement(PLONG plAddend);
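
A common use for this pair is reference counting. Unlike `InterlockedExchangeAdd`, which returns the old value, `InterlockedIncrement` and `InterlockedDecrement` return the new value. The portable sketch below mirrors that with the prefix operators of `std::atomic<long>`. `RefCounted`, `add_ref` and `release` are illustrative names.

```cpp
#include <atomic>
#include <cassert>

struct RefCounted {
    std::atomic<long> refs{1};  // the creator holds the first reference
};

// Like InterlockedIncrement: returns the NEW count.
long add_ref(RefCounted& obj) { return ++obj.refs; }

// Like InterlockedDecrement: returns the NEW count. 0 means the last reference is gone.
long release(RefCounted& obj) { return --obj.refs; }
```

Because the decrement and the read of its result are one atomic step, exactly one caller sees 0 and can safely destroy the object.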


LONGLONG InterlockedAnd64(LONGLONG* Destination, LONGLONG value) {
  LONGLONG old;
  do {
    // Re-read the current value on every attempt
    old = *Destination;
  } while (InterlockedCompareExchange64(Destination, old & value, old) != old);
  return old;
}

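// Single-shot variant without the retry loop. It is not equivalent to the
// version above: *Destination is read twice, so another thread can change it
// between the two reads, and a failed comparison leaves the AND unapplied.
LONGLONG InterlockedAnd64(LONGLONG* Destination, LONGLONG value) {
  return InterlockedCompareExchange64(Destination, (*Destination) & value, *Destination);
}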
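
The read, compute, compare-and-exchange, retry loop is a general recipe for building atomic operations that the platform does not provide directly. As a hedged illustration, the sketch below applies it to an atomic maximum, using `std::atomic` in place of `InterlockedCompareExchange64`. The function name `fetch_max` is an assumption of this example.

```cpp
#include <atomic>
#include <cassert>

// Atomically sets target = max(target, candidate) and returns the previous value.
long long fetch_max(std::atomic<long long>& target, long long candidate) {
    long long old = target.load();
    // On failure, compare_exchange_weak reloads `old` with the current value,
    // so the condition is re-evaluated against fresh data on each attempt.
    while (old < candidate && !target.compare_exchange_weak(old, candidate)) {
    }
    return old;
}
```

The loop exits either because the stored value is already at least `candidate` or because the exchange succeeded, so a concurrent writer can delay the operation but never corrupt it.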

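Starting with Windows XP, in addition to atomic operations on integers and Boolean values, developers can use new functions to manipulate a stack known as an "Interlocked Singly Linked List". Every operation on this stack, such as push and pop, is atomic. The following table lists these functions: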

Function                   Description
InitializeSListHead        Creates an empty stack
InterlockedPushEntrySList  Adds an element on top of the stack
InterlockedPopEntrySList   Removes the top element of the stack and returns it
InterlockedFlushSList      Empties the stack
QueryDepthSList            Returns the number of elements stored in the stack
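
To give a feel for how such a stack can work without locks, here is a minimal portable sketch of two of the operations, push and flush, built on `std::atomic`. It is not the Windows implementation. `Node` and `PushFlushStack` are hypothetical names, and pop is deliberately left out because a correct lock-free pop also has to deal with the ABA problem and with safe node reclamation.

```cpp
#include <atomic>
#include <cassert>

struct Node {
    int value;
    Node* next;
};

class PushFlushStack {
public:
    // Analogue of InterlockedPushEntrySList: link the node in with a CAS retry loop.
    void push(Node* node) {
        node->next = head_.load(std::memory_order_relaxed);
        // On failure, compare_exchange_weak stores the current head in node->next,
        // so the node always points at the head it is about to replace.
        while (!head_.compare_exchange_weak(node->next, node,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
        }
    }

    // Analogue of InterlockedFlushSList: detach the whole list in one exchange.
    // The caller owns the returned chain, which is in last-in, first-out order.
    Node* flush() {
        return head_.exchange(nullptr, std::memory_order_acquire);
    }

private:
    std::atomic<Node*> head_{nullptr};
};
```

A typical use is a work queue: producers call `push` from any thread, and a single consumer periodically calls `flush` and walks the detached chain at its leisure.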

