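This post was written in a hurry and reflects only my own understanding; if you spot a mistake, please point it out.
The Linux kernel's RCU (read-copy-update) mechanism protects pointers that are read by many threads concurrently while a single thread writes them; multiple concurrent writers still have to serialize among themselves with a lock.
It is most commonly applied to linked-list operations.
rculist.h implements and wraps the RCU-protected list operations; it lives at include/linux/rculist.h in the kernel source tree.
The functions and macros in rculist.h are described below.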
1. list_next_rcu
/*
* return the ->next pointer of a list_head in an rcu safe
* way, we must not access it directly
*/
#define list_next_rcu(list) (*((struct list_head __rcu **)(&(list)->next)))
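This macro yields list->next itself, retyped as an __rcu-annotated pointer. It takes the address of the next member only so that it can cast it and dereference it again, so the result is an lvalue that can be both read and assigned.
# define __rcu __attribute__((noderef, address_space(4))) This definition is only in effect when the code is checked with sparse (the kernel's static analyzer); in a normal build __rcu expands to nothing. noderef forbids dereferencing the pointer directly, and address_space(4) puts it in a separate address space so that sparse warns when it is mixed with ordinary pointers.
The annotation therefore does not test at run time whether an address is valid; it lets sparse flag accesses that bypass rcu_dereference() and rcu_assign_pointer().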
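To make the "lvalue" point concrete, here is a minimal userspace sketch. It assumes a build without sparse, so __rcu is defined as empty, and it redefines struct list_head locally; next_rcu_is_lvalue() is a hypothetical helper that exists only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: outside of sparse, __rcu expands to nothing. */
#define __rcu

/* Local copy of the kernel's list node layout, for userspace only. */
struct list_head {
	struct list_head *next, *prev;
};

/* Same shape as the macro quoted above. */
#define list_next_rcu(list) (*((struct list_head __rcu **)(&(list)->next)))

/* Hypothetical helper: returns 1 if the macro names list->next itself. */
int next_rcu_is_lvalue(void)
{
	struct list_head a = { NULL, NULL };
	struct list_head b = { NULL, NULL };

	/* The macro result can be assigned to, so it is an lvalue. */
	list_next_rcu(&a) = &b;

	/* It is the same object as a.next, not a copy of it. */
	return a.next == &b && &list_next_rcu(&a) == &a.next;
}
```

Because the result is an lvalue, it can be handed to rcu_assign_pointer() as the destination of a store, which is how the list-update functions in this file use it.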
2. list_add_rcu
/**
* list_add_rcu - add a new entry to rcu-protected list
* @new: new entry to be added
* @head: list head to add it after
*
* Insert a new entry after the specified head.
* This is good for implementing stacks.
*
* The caller must take whatever precautions are necessary
* (such as holding appropriate locks) to avoid racing
* with another list-mutation primitive, such as list_add_rcu()
* or list_del_rcu(), running on this same list.
* However, it is perfectly legal to run concurrently with
* the _rcu list-traversal primitives, such as
* list_for_each_entry_rcu().
*/
static inline void list_add_rcu(struct list_head *new, struct list_head *head)
{
	__list_add_rcu(new, head, head->next);
}
This function inserts a new entry immediately after head. It is a thin wrapper around __list_add_rcu.
/*
* Insert a new entry between two known consecutive entries.
*
* This is only for internal list manipulation where we know
* the prev/next entries already!
*/
static inline void __list_add_rcu(struct list_head *new,
		struct list_head *prev, struct list_head *next)
{
	new->next = next;
	new->prev = prev;
	rcu_assign_pointer(list_next_rcu(prev), new);
	next->prev = new;
}
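This function inserts a new entry between two adjacent entries.
new->next = next;
new->prev = prev;
The new entry's next and prev must be set first, and only then may the neighbors be changed. Until the new entry is published no reader can reach it, so a reader that has already fetched prev keeps walking the old list undisturbed.
rcu_assign_pointer(list_next_rcu(prev), new); This assigns to an RCU-protected pointer. list_next_rcu(prev) names the next pointer of the previous entry, and rcu_assign_pointer is the publish side of RCU's publish-subscribe mechanism. A reader sees either the old value of prev->next or the new one, and if it sees the new one it also sees the fully initialized new entry. Without this, the compiler or the CPU could reorder the stores, and a reader could follow the pointer to an entry whose fields are not yet written. The ordering is enforced by a memory barrier placed before the pointer store.
next->prev = new; The prev pointer is updated last with a plain store, because RCU readers only walk forward through next.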
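The initialize-then-publish order can be modelled in userspace. This is only a sketch: the u_ names are hypothetical, and the GCC/Clang builtins __atomic_store_n (release) and __atomic_load_n (acquire) are assumed stand-ins for rcu_assign_pointer() and rcu_dereference(), not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Assumed stand-in for rcu_assign_pointer(): a release store. */
#define u_rcu_assign_pointer(p, v) __atomic_store_n(&(p), (v), __ATOMIC_RELEASE)
/* Assumed stand-in for rcu_dereference(): an acquire load. */
#define u_rcu_dereference(p) __atomic_load_n(&(p), __ATOMIC_ACQUIRE)

static void u_init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void u_list_add_between(struct list_head *new,
		struct list_head *prev, struct list_head *next)
{
	new->next = next;			/* 1. initialize the new node */
	new->prev = prev;
	u_rcu_assign_pointer(prev->next, new);	/* 2. then publish it */
	next->prev = new;			/* 3. prev is writer-only */
}

static void u_list_add_rcu(struct list_head *new, struct list_head *head)
{
	u_list_add_between(new, head, head->next);
}

/* Walk forward through next, as a reader would, and count the nodes. */
static int u_count(struct list_head *head)
{
	struct list_head *pos;
	int n = 0;

	for (pos = u_rcu_dereference(head->next); pos != head;
	     pos = u_rcu_dereference(pos->next))
		n++;
	return n;
}

/* Inserting after head gives stack order: head -> b -> a. */
int add_demo(void)
{
	struct list_head head, a, b;

	u_init_list_head(&head);
	u_list_add_rcu(&a, &head);
	u_list_add_rcu(&b, &head);
	assert(head.next == &b && b.next == &a && a.next == &head);
	assert(head.prev == &a && a.prev == &b && b.prev == &head);
	return u_count(&head);
}
```

The demo is single-threaded, so it only checks the resulting links. The release store matters when a reader on another CPU follows prev->next while the writer is still running.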
3. list_add_tail_rcu
/**
* list_add_tail_rcu - add a new entry to rcu-protected list
* @new: new entry to be added
* @head: list head to add it before
*
* Insert a new entry before the specified head.
* This is useful for implementing queues.
*
* The caller must take whatever precautions are necessary
* (such as holding appropriate locks) to avoid racing
* with another list-mutation primitive, such as list_add_tail_rcu()
* or list_del_rcu(), running on this same list.
* However, it is perfectly legal to run concurrently with
* the _rcu list-traversal primitives, such as
* list_for_each_entry_rcu().
*/
static inline void list_add_tail_rcu(struct list_head *new,
		struct list_head *head)
{
	__list_add_rcu(new, head->prev, head);
}
This function inserts a new entry immediately before head, that is, at the tail of the list. It is also a thin wrapper around __list_add_rcu.
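A small userspace sketch of the queue behaviour follows. struct job, u_list_add_tail_rcu and queue_demo are hypothetical names, and the release store is an assumed stand-in for rcu_assign_pointer().

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical payload with an embedded list node. */
struct job {
	int id;
	struct list_head node;
};

#define u_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void u_list_add_tail_rcu(struct list_head *new, struct list_head *head)
{
	struct list_head *prev = head->prev;

	new->next = head;
	new->prev = prev;
	/* Assumed stand-in for rcu_assign_pointer(). */
	__atomic_store_n(&prev->next, new, __ATOMIC_RELEASE);
	head->prev = new;
}

/* Enqueue ids 1, 2, 3 at the tail and read them back from the front. */
int queue_demo(void)
{
	struct list_head head = { &head, &head };
	struct job j[3] = { { .id = 1 }, { .id = 2 }, { .id = 3 } };
	struct list_head *pos;
	int digits = 0;

	for (int i = 0; i < 3; i++)
		u_list_add_tail_rcu(&j[i].node, &head);

	/* Each id becomes one decimal digit, in traversal order. */
	for (pos = head.next; pos != &head; pos = pos->next)
		digits = digits * 10 + u_container_of(pos, struct job, node)->id;
	return digits;
}
```

A forward traversal returns the entries in insertion order, which is the first-in, first-out behaviour the kernel comment refers to.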
4. list_del_rcu
/**
* list_del_rcu - deletes entry from list without re-initialization
* @entry: the element to delete from the list.
*
* Note: list_empty() on entry does not return true after this,
* the entry is in an undefined state. It is useful for RCU based
* lockfree traversal.
*
* In particular, it means that we can not poison the forward
* pointers that may still be used for walking the list.
*
* The caller must take whatever precautions are necessary
* (such as holding appropriate locks) to avoid racing
* with another list-mutation primitive, such as list_del_rcu()
* or list_add_rcu(), running on this same list.
* However, it is perfectly legal to run concurrently with
* the _rcu list-traversal primitives, such as
* list_for_each_entry_rcu().
*
* Note that the caller is not permitted to immediately free
* the newly deleted entry. Instead, either synchronize_rcu()
* or call_rcu() must be used to defer freeing until an RCU
* grace period has elapsed.
*/
static inline void list_del_rcu(struct list_head *entry)
{
	__list_del(entry->prev, entry->next);
	entry->prev = LIST_POISON2;
}
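This function removes an entry from the list. __list_del links the entry's two neighbors to each other.
entry->prev = LIST_POISON2; This poisons the entry's prev pointer so that any later use of it is caught.
entry->next is deliberately left untouched. A reader that has already reached this entry can still follow next back into the list, while a reader that has not reached it yet sees the updated next of the previous entry and skips it.
This matches the RCU idea of keeping every traversal intact. It is also why the entry must not be freed until a grace period has elapsed.
__list_del is defined in include/linux/list.h: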
/*
* Delete a list entry by making the prev/next entries
* point to each other.
*
* This is only for internal list manipulation where we know
* the prev/next entries already!
*/
static inline void __list_del(struct list_head * prev, struct list_head * next)
{
	next->prev = prev;
	prev->next = next;
}
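This userspace sketch shows why list_del_rcu leaves next alone. The u_ helpers are hypothetical, and U_LIST_POISON2 is an assumed placeholder value for this illustration, not the kernel's LIST_POISON2 definition.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Assumed placeholder; the kernel defines its own LIST_POISON2. */
#define U_LIST_POISON2 ((struct list_head *)0x200)

static void u_list_add_tail(struct list_head *new, struct list_head *head)
{
	new->next = head;
	new->prev = head->prev;
	head->prev->next = new;
	head->prev = new;
}

static void u_list_del_rcu(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;	/* new readers now skip entry */
	entry->prev = U_LIST_POISON2;		/* next is left alone on purpose */
}

/* Returns 1 if a reader parked on the deleted node can still move on. */
int del_demo(void)
{
	struct list_head head = { &head, &head };
	struct list_head a, b, c;
	struct list_head *reader;

	u_list_add_tail(&a, &head);
	u_list_add_tail(&b, &head);
	u_list_add_tail(&c, &head);

	reader = &b;			/* a reader is currently on b */
	u_list_del_rcu(&b);		/* the writer unlinks b underneath it */

	/* Readers starting now see head -> a -> c. */
	assert(head.next == &a && a.next == &c && c.next == &head);
	assert(b.prev == U_LIST_POISON2);

	reader = reader->next;		/* still valid: b.next is &c */
	return reader == &c && reader->next == &head;
}
```

In kernel code the memory behind b would only be released after synchronize_rcu() returns or from a call_rcu() callback, as the comment on list_del_rcu requires.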
5. list_replace_rcu
/**
* list_replace_rcu - replace old entry by new one
* @old : the element to be replaced
* @new : the new element to insert
*
* The @old entry will be replaced with the @new entry atomically.
* Note: @old should not be empty.
*/
static inline void list_replace_rcu(struct list_head *old,
		struct list_head *new)
{
	new->next = old->next;
	new->prev = old->prev;
	rcu_assign_pointer(list_next_rcu(new->prev), new);
	new->next->prev = new;
	old->prev = LIST_POISON2;
}
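This function replaces an old entry with a new one.
new->next = old->next;
new->prev = old->prev; First the new entry's pointers are filled in.
rcu_assign_pointer(list_next_rcu(new->prev), new); This publishes the new entry by assigning it to the next pointer of the previous entry through the RCU primitive. The barrier keeps the compiler and the CPU from reordering the stores, so a reader sees either the old entry or the fully initialized new one.
new->next->prev = new; The following entry's prev pointer is then pointed back at the new entry.
old->prev = LIST_POISON2; Only the old entry's prev is poisoned. Its next is left unchanged, so a reader that is already on the old entry can continue its traversal. This again follows the RCU principle of keeping traversals intact.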
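Replacement is where the name read-copy-update comes from: copy the old element, modify the private copy, then swap it in. Below is a userspace sketch of that sequence. struct cfg and the u_ names are hypothetical, U_LIST_POISON2 is an assumed placeholder value, and the release store is an assumed stand-in for rcu_assign_pointer().

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical payload with an embedded list node. */
struct cfg {
	int value;
	struct list_head node;
};

/* Assumed placeholder; the kernel defines its own LIST_POISON2. */
#define U_LIST_POISON2 ((struct list_head *)0x200)

#define u_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void u_list_replace_rcu(struct list_head *old, struct list_head *new)
{
	new->next = old->next;
	new->prev = old->prev;
	/* Assumed stand-in for rcu_assign_pointer(): publish the new node. */
	__atomic_store_n(&new->prev->next, new, __ATOMIC_RELEASE);
	new->next->prev = new;
	old->prev = U_LIST_POISON2;	/* old->next is left intact */
}

/* Returns the value a fresh reader finds at the front of the list. */
int replace_demo(void)
{
	struct list_head head = { &head, &head };
	struct cfg old = { .value = 1 };
	struct cfg new;

	/* One-element list: head <-> old. */
	old.node.next = &head;
	old.node.prev = &head;
	head.next = &old.node;
	head.prev = &old.node;

	new = old;		/* copy */
	new.value = 2;		/* update the private copy */
	u_list_replace_rcu(&old.node, &new.node);

	/* A reader still on old can finish its walk. */
	assert(old.node.next == &head);
	assert(head.prev == &new.node);
	return u_container_of(head.next, struct cfg, node)->value;
}
```

As with deletion, the old element may only be freed after a grace period, because a reader can still be looking at it.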
6. list_splice_init_rcu
/**
* list_splice_init_rcu - splice an RCU-protected list into an existing list.
* @list: the RCU-protected list to splice
* @head: the place in the list to splice the first list into
* @sync: function to sync: synchronize_rcu(), synchronize_sched(), ...
*
* @head can be RCU-read traversed concurrently with this function.
*
* Note that this function blocks.
*
* Important note: the caller must take whatever action is necessary to
* prevent any other updates to @head. In principle, it is possible
* to modify the list as soon as sync() begins execution.
* If this sort of thing becomes necessary, an alternative version
* based on call_rcu() could be created. But only if -really-
* needed -- there is no shortage of RCU API members.
*/
static inline void list_splice_init_rcu(struct list_head *list,
					struct list_head *head,
					void (*sync)(void))
{
	struct list_head *first = list->next;
	struct list_head *last = list->prev;
	struct list_head *at = head->next;

	if (list_empty(head))
		return;

	/* "first" and "last" tracking list, so initialize it. */
	INIT_LIST_HEAD(list);

	/*
	 * At this point, the list body still points to the source list.
	 * Wait for any readers to finish using the list before splicing
	 * the list body into the new list. Any new readers will see
	 * an empty list.
	 */
	sync();

	/*
	 * Readers are finished with the source list, so perform splice.
	 * The order is important if the new list is global and accessible
	 * to concurrent RCU readers. Note that RCU readers are not
	 * permitted to traverse the prev pointers without excluding
	 * this function.
	 */
	last->next = at;
	rcu_assign_pointer(list_next_rcu(head), first);
	first->prev = head;
	at->prev = last;
}
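This function splices one list into another at a given position.
struct list_head *first = list->next;
struct list_head *last = list->prev;
struct list_head *at = head->next; These fetch the first and last entries of the list being spliced in, and the entry that currently follows the splice point.
if (list_empty(head)) As quoted here, this returns early when the destination list is empty. As far as I know, later kernel versions test list_empty(list) instead, since it is the list being spliced in that must not be empty.
INIT_LIST_HEAD(list); This resets the source list head to an empty list; first and last still reference its former entries.
sync(); This is usually synchronize_rcu(). It waits for an RCU grace period, so that readers still walking the source list have finished.
last->next = at; Points the last spliced entry's next at the entry that used to follow head.
rcu_assign_pointer(list_next_rcu(head), first); Publishes the spliced entries by setting head's next to the first of them through the RCU primitive.
first->prev = head; Points the first spliced entry's prev back at head.
at->prev = last; Points the prev of the entry that used to follow head at the last spliced entry.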
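The pointer steps can be traced in a userspace sketch. Everything prefixed u_ is hypothetical. u_fake_sync only counts calls and stands in for synchronize_rcu(), and the emptiness test is applied to the source list, which is an assumption that differs from the code quoted above.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

static int sync_calls;

/* Stand-in for synchronize_rcu(): nothing to wait for here. */
static void u_fake_sync(void)
{
	sync_calls++;
}

static void u_add_tail(struct list_head *new, struct list_head *head)
{
	new->next = head;
	new->prev = head->prev;
	head->prev->next = new;
	head->prev = new;
}

static void u_list_splice_init_rcu(struct list_head *list,
		struct list_head *head, void (*sync)(void))
{
	struct list_head *first = list->next;
	struct list_head *last = list->prev;
	struct list_head *at = head->next;

	if (first == list)	/* assumption: test the source list */
		return;

	list->next = list;	/* INIT_LIST_HEAD(list) */
	list->prev = list;
	sync();			/* let readers of the source list finish */

	last->next = at;
	/* Assumed stand-in for rcu_assign_pointer(). */
	__atomic_store_n(&head->next, first, __ATOMIC_RELEASE);
	first->prev = head;
	at->prev = last;
}

/* Splice {s1, s2} into {d1}; returns the length of the result. */
int splice_demo(void)
{
	struct list_head src = { &src, &src }, dst = { &dst, &dst };
	struct list_head s1, s2, d1, *pos;
	int n = 0;

	sync_calls = 0;
	u_add_tail(&s1, &src);
	u_add_tail(&s2, &src);
	u_add_tail(&d1, &dst);

	u_list_splice_init_rcu(&src, &dst, u_fake_sync);

	assert(sync_calls == 1);
	assert(src.next == &src && src.prev == &src);
	assert(dst.next == &s1 && s1.next == &s2);
	assert(s2.next == &d1 && d1.next == &dst);
	for (pos = dst.next; pos != &dst; pos = pos->next)
		n++;
	return n;
}
```

The spliced entries land directly after the destination head, in front of the entries that were already there.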
7. list_entry_rcu
/**
* list_entry_rcu - get the struct for this entry
* @ptr: the &struct list_head pointer.
* @type: the type of the struct this is embedded in.
* @member: the name of the list_struct within the struct.
*
* This primitive may safely run concurrently with the _rcu list-mutation
* primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
*/
#define list_entry_rcu(ptr, type, member) \
	({typeof (*ptr) __rcu *__ptr = (typeof (*ptr) __rcu __force *)ptr; \
	 container_of((typeof(ptr))rcu_dereference_raw(__ptr), type, member); \
	})
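This macro returns a pointer to the structure in which the list node is embedded. Readers call it while traversing the list.
typeof (*ptr) __rcu *__ptr = (typeof (*ptr) __rcu __force *)ptr; This copies ptr into __ptr. The __force cast adds the __rcu annotation so that sparse accepts the rcu_dereference_raw() call that follows; it does not check the address at run time.
rcu_dereference_raw(__ptr) This fetches the RCU-protected list node pointer. It is the subscribe side of the publish-subscribe mechanism and guarantees that the reader sees the node's contents as they were when the pointer was published.
container_of((typeof(ptr))rcu_dereference_raw(__ptr), type, member); This converts the node pointer into a pointer to the enclosing structure of the given type.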
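The container_of step is plain pointer arithmetic, which a userspace sketch can show. struct task and the u_ macros are hypothetical, and the acquire load is an assumed stand-in for rcu_dereference_raw().

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical structure; the node is deliberately not the first member. */
struct task {
	int pid;
	struct list_head link;
	char name[8];
};

/* Step back from the member's address to the start of the structure. */
#define u_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Load the node pointer once, then convert it to the enclosing type. */
#define u_list_entry_rcu(ptr, type, member) \
	u_container_of(__atomic_load_n(&(ptr), __ATOMIC_ACQUIRE), type, member)

/* Returns the pid found through the list, or -1 on a mismatch. */
int entry_demo(void)
{
	struct list_head head = { &head, &head };
	struct task t = { 7, { &head, &head }, "init" };
	struct task *found;

	head.next = &t.link;
	head.prev = &t.link;

	found = u_list_entry_rcu(head.next, struct task, link);
	return found == &t ? found->pid : -1;
}
```

Because the node sits after pid inside struct task, the macro has to subtract offsetof(struct task, link) from the node address to recover the structure's start.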
8. list_first_entry_rcu
/**
* list_first_entry_rcu - get the first element from a list
* @ptr: the list head to take the element from.
* @type: the type of the struct this is embedded in.
* @member: the name of the list_struct within the struct.
*
* Note, that list is expected to be not empty.
*
* This primitive may safely run concurrently with the _rcu list-mutation
* primitives such as list_add_rcu() as long as it's guarded by rcu_read_lock().
*/
#define list_first_entry_rcu(ptr, type, member) \
	list_entry_rcu((ptr)->next, type, member)
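This macro returns the enclosing structure of the first element of the list, that is, of the node that follows the list head ptr. The list must not be empty.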
9. list_for_each_entry_rcu
/**
* list_for_each_entry_rcu -iterate over rcu list of given type
* @pos: the type * to use as a loop cursor.
* @head: the head for your list.
* @member: the name of the list_struct within the struct.
*
* This list-traversal primitive may safely run concurrently with
* the _rcu list-mutation primitives such as list_add_rcu()
* as long as the traversal is guarded by rcu_read_lock().
*/
#define list_for_each_entry_rcu(pos, head, member) \
	for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \
	     &pos->member != (head); \
	     pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
Iterates over the enclosing structure of every node in the list; head is the list head.
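A reader-side loop built on this macro can be sketched in userspace as follows. The u_ names and struct counter are hypothetical, u_rcu_read_lock() and u_rcu_read_unlock() are empty placeholders for the real read-side markers, and __typeof__ is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical payload with an embedded list node. */
struct counter {
	int value;
	struct list_head node;
};

/* Empty placeholders for rcu_read_lock() / rcu_read_unlock(). */
static void u_rcu_read_lock(void) { }
static void u_rcu_read_unlock(void) { }

#define u_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
/* Acquire load as an assumed stand-in for rcu_dereference_raw(). */
#define u_list_entry_rcu(ptr, type, member) \
	u_container_of(__atomic_load_n(&(ptr), __ATOMIC_ACQUIRE), type, member)
#define u_list_for_each_entry_rcu(pos, head, member) \
	for (pos = u_list_entry_rcu((head)->next, __typeof__(*pos), member); \
	     &pos->member != (head); \
	     pos = u_list_entry_rcu(pos->member.next, __typeof__(*pos), member))

static void u_add_tail(struct list_head *new, struct list_head *head)
{
	new->next = head;
	new->prev = head->prev;
	head->prev->next = new;
	head->prev = new;
}

/* Sums the values of every element while "inside" a read-side section. */
int sum_demo(void)
{
	struct list_head head = { &head, &head };
	struct counter c[3] = { { .value = 10 }, { .value = 20 }, { .value = 30 } };
	struct counter *pos;
	int sum = 0;

	for (int i = 0; i < 3; i++)
		u_add_tail(&c[i].node, &head);

	u_rcu_read_lock();
	u_list_for_each_entry_rcu(pos, &head, node)
		sum += pos->value;
	u_rcu_read_unlock();
	return sum;
}
```

In the kernel the loop must sit between rcu_read_lock() and rcu_read_unlock(), and pos must not be used after the unlock unless the element is pinned by some other means.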
10. list_for_each_continue_rcu
/**
* list_for_each_continue_rcu
* @pos: the &struct list_head to use as a loop cursor.
* @head: the head for your list.
*
* Iterate over an rcu-protected list, continuing after current point.
*
* This list-traversal primitive may safely run concurrently with
* the _rcu list-mutation primitives such as list_add_rcu()
* as long as the traversal is guarded by rcu_read_lock().
*/
#define list_for_each_continue_rcu(pos, head) \
	for ((pos) = rcu_dereference_raw(list_next_rcu(pos)); \
	     (pos) != (head); \
	     (pos) = rcu_dereference_raw(list_next_rcu(pos)))
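Iterates over the bare list nodes (struct list_head), starting with the node after pos and stopping at head.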
11. list_for_each_entry_continue_rcu
/**
* list_for_each_entry_continue_rcu - continue iteration over list of given type
* @pos: the type * to use as a loop cursor.
* @head: the head for your list.
* @member: the name of the list_struct within the struct.
*
* Continue to iterate over list of given type, continuing after
* the current position.
*/
#define list_for_each_entry_continue_rcu(pos, head, member) \
	for (pos = list_entry_rcu(pos->member.next, typeof(*pos), member); \
	     &pos->member != (head); \
	     pos = list_entry_rcu(pos->member.next, typeof(*pos), member))
Iterates over the enclosing structures, continuing with the element after the current pos and stopping at head.
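The difference from the full traversal is only the starting point, as this userspace sketch shows. The u_ names and struct counter are hypothetical, the acquire load is an assumed stand-in for rcu_dereference_raw(), and __typeof__ is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical payload with an embedded list node. */
struct counter {
	int value;
	struct list_head node;
};

#define u_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
/* Acquire load as an assumed stand-in for rcu_dereference_raw(). */
#define u_list_entry_rcu(ptr, type, member) \
	u_container_of(__atomic_load_n(&(ptr), __ATOMIC_ACQUIRE), type, member)
/* Starts at the element after pos, not at the front of the list. */
#define u_list_for_each_entry_continue_rcu(pos, head, member) \
	for (pos = u_list_entry_rcu(pos->member.next, __typeof__(*pos), member); \
	     &pos->member != (head); \
	     pos = u_list_entry_rcu(pos->member.next, __typeof__(*pos), member))

static void u_add_tail(struct list_head *new, struct list_head *head)
{
	new->next = head;
	new->prev = head->prev;
	head->prev->next = new;
	head->prev = new;
}

/* List holds 1, 2, 3, 4; resume after the element holding 2. */
int continue_demo(void)
{
	struct list_head head = { &head, &head };
	struct counter c[4] = {
		{ .value = 1 }, { .value = 2 }, { .value = 3 }, { .value = 4 }
	};
	struct counter *pos;
	int sum = 0;

	for (int i = 0; i < 4; i++)
		u_add_tail(&c[i].node, &head);

	pos = &c[1];	/* the cursor already sits on the element holding 2 */
	u_list_for_each_entry_continue_rcu(pos, &head, node)
		sum += pos->value;	/* visits 3 and 4 only */
	return sum;
}
```

The element pos points at when the loop starts is not visited again, so the caller has to process it before resuming.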
To be continued...