This post is original work, released under the CC 3.0 license. Please credit the source when reposting: http://blog.csdn.net/lux_veritas/article/details/8977510
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
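The per_cpu macro is very common in the Linux kernel. This post walks through the code to briefly analyze what the macro does, based on the Linux 2.6.36 kernel:
With CONFIG_SMP enabled, per_cpu is implemented as follows. It looks up the per-CPU data offset for the given cpu (per_cpu_offset(cpu)), adds that offset to the address of var, and dereferences the result, so the whole expression is an lvalue that refers to that CPU's copy of var: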
#define per_cpu(var, cpu) \
(*SHIFT_PERCPU_PTR(&(var), per_cpu_offset(cpu)))
/* Weird cast keeps both GCC and sparse happy. */
#define SHIFT_PERCPU_PTR(__p, __offset) ({ \
/* 1 */ __verify_pcpu_ptr((__p)); \
/* 2 */ RELOC_HIDE((typeof(*(__p)) __kernel __force *)(__p), (__offset)); \
})
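To make the address arithmetic concrete, here is a minimal user-space sketch of the same idea. Everything prefixed with toy_ or TOY_ is hypothetical and not kernel code: a template copy of the variable, one private copy per "CPU", and a macro that adds a byte offset to the template's address and dereferences the result. The real kernel sets up its per-CPU areas at boot in a different way; this only mimics the shape of per_cpu(var, cpu).

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 4                   /* hypothetical CPU count */

struct toy_area { int counter; };

static struct toy_area toy_template;                 /* what &var refers to */
static struct toy_area toy_cpu_areas[TOY_NR_CPUS];   /* one private copy per CPU */

/* Byte distance from the template copy to the copy owned by `cpu`. */
static uintptr_t toy_offset_of_cpu(int cpu)
{
    return (uintptr_t)&toy_cpu_areas[cpu] - (uintptr_t)&toy_template;
}

/* Same shape as per_cpu(): shift the address of var, then dereference. */
#define toy_per_cpu(var, cpu) \
    (*(__typeof__(var) *)((uintptr_t)&(var) + toy_offset_of_cpu(cpu)))

static int toy_per_cpu_demo(void)
{
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        toy_per_cpu(toy_template.counter, cpu) = 10 * cpu;

    toy_per_cpu(toy_template.counter, 2) += 1;   /* touches CPU 2's copy only */

    assert(toy_template.counter == 0);           /* the template is never written */
    return toy_per_cpu(toy_template.counter, 2);
}
```

Because the macro expands to a dereference, it can be used on either side of an assignment, which is also how per_cpu(var, cpu) is used in kernel code.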
//1 The macro that verifies the pointer:
/*
* Macro which verifies @ptr is a percpu pointer without evaluating
* @ptr. This is to be used in percpu accessors to verify that the
* input parameter is a percpu pointer.
*/
#define __verify_pcpu_ptr(ptr) do { \
const void __percpu *__vpp_verify = (typeof(ptr))NULL; \
(void)__vpp_verify; \
} while (0)
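The phrase "without evaluating @ptr" in the comment above relies on typeof only inspecting the type of its operand. The following user-space sketch illustrates this under a few assumptions: it uses the GNU __typeof__ spelling, the names toy_verify_ptr, toy_next_slot and toy_calls are made up for the example, and the __percpu annotation is left out because it only has meaning to the sparse checker.

```c
#include <assert.h>
#include <stddef.h>

static int toy_calls;                   /* counts how often toy_next_slot() runs */
static int toy_slots[4];

static int *toy_next_slot(void)
{
    return &toy_slots[toy_calls++];
}

/* Same shape as __verify_pcpu_ptr, minus the sparse-only __percpu tag. */
#define toy_verify_ptr(ptr) do {                            \
    const void *vpp_verify_ = (__typeof__(ptr))NULL;        \
    (void)vpp_verify_;                                      \
} while (0)

static int toy_verify_demo(void)
{
    toy_verify_ptr(toy_next_slot());    /* type-checked only, never called */
    return toy_calls;                   /* still 0 */
}
```

If the argument is not a pointer, the initialization of the const void * variable should draw a compiler diagnostic; in the kernel, sparse additionally checks that the pointer carries the __percpu annotation.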
//2 The macro that offsets the pointer:
/*
* This macro obfuscates arithmetic on a variable address so that gcc
* shouldn't recognize the original var, and make assumptions about it.
*
* This is needed because the C standard makes it undefined to do
* pointer arithmetic on "objects" outside their boundaries and the
* gcc optimizers assume this is the case. In particular they
* assume such arithmetic does not wrap.
*
* A miscompilation has been observed because of this on PPC.
* To work around it we hide the relationship of the pointer and the object
* using this macro.
*
* Versions of the ppc64 compiler before 4.1 had a bug where use of
* RELOC_HIDE could trash r30. The bug can be worked around by changing
* the inline assembly constraint from =g to =r, in this particular
* case either is valid.
*/
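#define RELOC_HIDE(ptr, off) \
({ unsigned long __ptr; \
__asm__ ("" : "=r"(__ptr) : "0"(ptr)); \
(typeof(ptr)) (__ptr + (off)); })
//The empty asm statement has one output, __ptr, held in a register ("=r"), and one input, ptr, tied to that same register by the matching constraint "0". This copies ptr into __ptr, while the compiler can no longer tell which object __ptr points into.
//Overall, the macro adds off (a byte count, because __ptr is an unsigned long) to the value of ptr and returns the sum cast back to typeof(ptr).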
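The macro can be tried outside the kernel. The sketch below copies RELOC_HIDE as quoted above (with the GNU __typeof__ spelling) and applies it to an ordinary array. It assumes GCC or a compatible compiler, since it needs statement expressions and extended asm, and a platform where unsigned long is as wide as a pointer. toy_table and toy_reloc_demo are made-up names.

```c
#include <assert.h>

/* Copy of the macro quoted above, for experimentation in user space. */
#define RELOC_HIDE(ptr, off)                                \
({  unsigned long __ptr;                                    \
    __asm__ ("" : "=r"(__ptr) : "0"(ptr));                  \
    (__typeof__(ptr)) (__ptr + (off)); })

static int toy_table[4] = { 10, 20, 30, 40 };

static int toy_reloc_demo(void)
{
    /*
     * off is counted in bytes: it is added to the integer __ptr,
     * not to an int *, so no element-size scaling takes place.
     */
    int *p = RELOC_HIDE(&toy_table[0], 2 * sizeof(int));
    return *p;                          /* reads toy_table[2] */
}
```

Here the result stays inside toy_table, so plain pointer arithmetic would have been fine as well. The macro matters for per_cpu because the shifted address lies outside the object that var names.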
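per_cpu_offset is defined as follows. In the definitions quoted here (the trap_block form appears to be the sparc64 one; as the comment notes, most architectures use a __per_cpu_offset array instead), the value of per_cpu_offset(x) is the __per_cpu_base field of the struct trap_per_cpu stored at index x of the trap_block array: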
/*
* per_cpu_offset() is the offset that has to be added to a
* percpu variable to get to the instance for a certain processor.
*
* Most arches use the __per_cpu_offset array for those offsets but
* some arches have their own ways of determining the offset (x86_64, s390).
*/
#define per_cpu_offset(x) (__per_cpu_offset(x))
#define __per_cpu_offset(__cpu) \
(trap_block[(__cpu)].__per_cpu_base)
struct trap_per_cpu trap_block[NR_CPUS];
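As a rough illustration of this lookup, the sketch below models the table in user space. It is only an assumption-laden toy: the real struct trap_per_cpu has many more fields, the real offsets are filled in by boot code, and every toy_ or TOY_ name (including the 4096-byte spacing used in the demo) is invented for the example.

```c
#include <assert.h>

#define TOY_NR_CPUS 4                   /* hypothetical CPU count */

/* Toy stand-in that keeps only the field used by the offset lookup. */
struct toy_trap_per_cpu {
    unsigned long per_cpu_base;
};

static struct toy_trap_per_cpu toy_trap_block[TOY_NR_CPUS];

/* Same two-level shape as per_cpu_offset() -> __per_cpu_offset(). */
#define toy_cpu_base(cpu)         (toy_trap_block[(cpu)].per_cpu_base)
#define toy_cpu_base_offset(x)    (toy_cpu_base(x))

/* Pretend boot code: place each CPU's area `unit` bytes after the previous one. */
static void toy_setup_offsets(unsigned long unit)
{
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        toy_trap_block[cpu].per_cpu_base = cpu * unit;
}

static unsigned long toy_offset_demo(void)
{
    toy_setup_offsets(4096);
    assert(toy_cpu_base_offset(0) == 0);
    return toy_cpu_base_offset(3);      /* 3 * 4096 */
}
```

The offset is just a number looked up by CPU index; per_cpu then feeds it to SHIFT_PERCPU_PTR, which adds it to the address of the variable.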