softirq and tasklets

A tasklet is something like a very small thread that has neither a stack nor a context of its own. Such "threads" run quickly and run to completion.

Tasklets are atomic, so they cannot sleep and cannot use sleeping synchronization primitives such as mutexes and semaphores.

They can, however, use spinlocks (spin_lock).

They are called in a "softer" context than an ISR: hardware interrupts are enabled in this context, and an interrupt displaces the tasklet for the duration of the ISR. This context is called softirq in the Linux kernel; besides running tasklets, it is also used by several other subsystems.

A tasklet runs on the same core that schedules it, or more precisely on the core that was the first to schedule it by raising the softirq, whose handlers are always bound to the core that raised it.

Different tasklets can run in parallel. A given tasklet, however, is never called concurrently with itself, because it runs on one core only: the core that scheduled it. In other words, the same tasklet cannot run on several CPUs at once while different tasklets can, so a tasklet function does not have to be reentrant.

Tasklets are executed by non-preemptive scheduling, one by one, in turn. They can be scheduled with one of two priorities: normal and high.

struct tasklet_struct
{
	struct tasklet_struct *next;/* The next tasklet in line for scheduling */
	unsigned long state; /* TASKLET_STATE_SCHED or TASKLET_STATE_RUN */
	atomic_t count;  /* Responsible for the tasklet being activated or not */
	void (*func)(unsigned long);/* The main function of the tasklet */
	unsigned long data; /* The parameter func is started with */
};
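The main data structure used to represent softirqs is the softirq_vec array, which includes 32 elements of type softirq_action in older kernels (the newer source quoted in these notes sizes per-softirq arrays with NR_SOFTIRQS). The priority of a softirq is the index of its softirq_action element in the array: a lower index means a higher priority.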

struct softirq_action
{
	void	(*action)(struct softirq_action *);
}; /* Linux 3.0.8 */

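Another critical field, used to keep track of both kernel preemption and the nesting of kernel control paths, is the 32-bit preempt_count field stored in the thread_info field of each process descriptor. It contains the following subfields:

The preemption counter records how many times kernel preemption has been explicitly disabled on the local CPU; 0 means that kernel preemption is allowed.

The softirq counter specifies how many levels deep deferrable functions are disabled; 0 means that deferrable functions are enabled.

The hardirq counter specifies the number of nested interrupt handlers on the local CPU; the value is increased by irq_enter() and decreased by irq_exit() (see the earlier chapter on interrupts).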

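The PREEMPT_ACTIVE flag is meant to indicate that a kernel preemption is in progress. Once it is set, preempt_count is no longer 0, which disables kernel preemption, so the code that carries out the preemption cannot itself be preempted. One important use of the flag is to keep a process that is not in the Running state from being wrongly removed from the run queue by the preemption path. Preemption means taking the CPU away from a process that is currently running; but if the process is no longer in the Running state, how can it still be on the CPU and be preempted?
The reason is that going from Running to non-Running takes several steps: the process puts itself on a wait queue, sets its state to non-Running, and finally calls schedule() to remove itself from the run queue and hand the CPU to another process. Suppose the process is preempted just before it calls schedule(): at that moment it is still running on the CPU. That is why a process in a non-Running state can be preempted. For such a process the preemption path must not remove it from the run queue on its own, because its switch is not complete and it must get the chance to come back and finish. The following code is a typical sleep sequence: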

for (;;) {
    prepare_to_wait(&wq, &__wait, TASK_UNINTERRUPTIBLE);
    if (condition)
        break;
    schedule();
}
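Suppose the process is preempted on the second line of the loop (the prepare_to_wait() call): its state has just been set to TASK_UNINTERRUPTIBLE and it is about to test the condition. Preemption necessarily involves a call to schedule(); if that call removed the process from the run queue, the process would lose the CPU and the condition test would never run. If condition happened to be true at that moment, the process should have left the for loop without calling schedule() and without sleeping. Instead it would be put to sleep by the preemption path, miss the test, and possibly never be woken up again. The correct behaviour is that a preempted process stays on the run queue and gets to run again later. When it resumes it tests condition, and if the condition is false it is removed from the run queue by the schedule() call it makes voluntarily. Preemption must not do this on its behalf.
In the kernel, the last step of going from running to sleeping is to call the scheduler, schedule(), which removes the process from the run queue and hands the CPU to another process. This was not a problem before kernel preemption existed, because the whole sequence could not be interrupted. With kernel preemption the transition can be interrupted by a preemption, which itself calls schedule(); the call to schedule() then happens too early and can create a race condition. To avoid this, the preemption path does not call schedule() directly. It calls preempt_schedule(), which sets the PREEMPT_ACTIVE flag, calls schedule(), and clears the flag afterwards. schedule() checks the flag: if PREEMPT_ACTIVE is set, schedule() was entered from a preemption, and a process that is not TASK_RUNNING (state != 0) is not removed from the run queue with deactivate_task().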
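
The effect of the flag can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: struct task, on_runqueue, model_schedule() and model_preempt_schedule() are hypothetical names, and the constant values are illustrative. The real logic lives in schedule() and preempt_schedule().

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values; the real constants are defined by the kernel. */
#define TASK_RUNNING          0
#define TASK_UNINTERRUPTIBLE  2
#define PREEMPT_ACTIVE        0x10000000u

/* Hypothetical stand-in for a process descriptor. */
struct task {
    long state;                  /* TASK_RUNNING or a sleeping state */
    unsigned int preempt_count;
    bool on_runqueue;
};

/* Model of the dequeue decision made inside schedule(). */
void model_schedule(struct task *t)
{
    /* A non-running task is dequeued only on a voluntary call. */
    if (t->state != TASK_RUNNING && !(t->preempt_count & PREEMPT_ACTIVE))
        t->on_runqueue = false;  /* stands in for deactivate_task() */
}

/* Model of preempt_schedule(): mark the call as a preemption. */
void model_preempt_schedule(struct task *t)
{
    t->preempt_count += PREEMPT_ACTIVE;
    model_schedule(t);
    t->preempt_count -= PREEMPT_ACTIVE;
}
```

A task that has already set TASK_UNINTERRUPTIBLE stays on the run queue when model_preempt_schedule() is applied to it, and leaves the queue only when it calls model_schedule() itself.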

Kernel preemption has to be disabled either when it has been explicitly disabled by kernel code (preemption counter not zero) or when the kernel is running in interrupt context. Thus, to determine whether the current process can be preempted, the kernel quickly checks for a zero value in the preempt_count field. In other words, kernel preemption must be disabled in interrupt context.
The in_interrupt() macro checks the hardirq and softirq counters in the current_thread_info()->preempt_count field. If either of these two counters is positive, the macro yields a nonzero value; otherwise it yields zero.
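
As a rough illustration, the three counters can be modelled as bit ranges of one 32-bit word. The widths below (8 preemption bits, 8 softirq bits, hardirq bits above them) are an assumption for this sketch, since the exact layout differs between kernel versions, and the model_ functions are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field widths; check the headers of the kernel in use. */
#define PREEMPT_BITS   8
#define SOFTIRQ_BITS   8
#define HARDIRQ_BITS   10

#define PREEMPT_SHIFT  0
#define SOFTIRQ_SHIFT  (PREEMPT_SHIFT + PREEMPT_BITS)
#define HARDIRQ_SHIFT  (SOFTIRQ_SHIFT + SOFTIRQ_BITS)

#define FIELD_MASK(bits, shift)  ((((uint32_t)1 << (bits)) - 1) << (shift))
#define PREEMPT_MASK   FIELD_MASK(PREEMPT_BITS, PREEMPT_SHIFT)
#define SOFTIRQ_MASK   FIELD_MASK(SOFTIRQ_BITS, SOFTIRQ_SHIFT)
#define HARDIRQ_MASK   FIELD_MASK(HARDIRQ_BITS, HARDIRQ_SHIFT)

/* Adding one of these increments the matching counter by 1. */
#define SOFTIRQ_OFFSET ((uint32_t)1 << SOFTIRQ_SHIFT)
#define HARDIRQ_OFFSET ((uint32_t)1 << HARDIRQ_SHIFT)

/* Nonzero if the hardirq or the softirq counter is positive. */
uint32_t model_in_interrupt(uint32_t preempt_count)
{
    return preempt_count & (SOFTIRQ_MASK | HARDIRQ_MASK);
}

/* Preemption is possible only when the whole word is zero. */
int model_preemptible(uint32_t preempt_count)
{
    return preempt_count == 0;
}
```

With this layout a single comparison against zero covers all three reasons for not preempting, which is why the kernel can make the check so cheaply.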

The softirq implementation relies on a per-CPU 32-bit mask of pending softirqs, stored in the __softirq_pending field of the irq_cpustat_t data structure.

typedef struct {
	unsigned int __softirq_pending;
	unsigned int __nmi_count;	/* arch dependent */
	unsigned int irq0_irqs;
#ifdef CONFIG_X86_LOCAL_APIC
	unsigned int apic_timer_irqs;	/* arch dependent */
	unsigned int irq_spurious_count;
#endif
	unsigned int x86_platform_ipis;	/* arch dependent */
	unsigned int apic_perf_irqs;
	unsigned int apic_irq_work_irqs;
#ifdef CONFIG_SMP
	unsigned int irq_resched_count;
	unsigned int irq_call_count;
	unsigned int irq_tlb_count;
#endif
#ifdef CONFIG_X86_THERMAL_VECTOR
	unsigned int irq_thermal_count;
#endif
#ifdef CONFIG_X86_MCE_THRESHOLD
	unsigned int irq_threshold_count;
#endif
} ____cacheline_aligned irq_cpustat_t;

#define local_softirq_pending()	percpu_read(irq_stat.__softirq_pending)
#define or_softirq_pending(x)	percpu_or(irq_stat.__softirq_pending, (x))

The local_softirq_pending() macro selects the softirq bit mask of the local CPU.

raise_softirq() ends up calling or_softirq_pending():

void raise_softirq(unsigned int nr)
{
	unsigned long flags;
	local_irq_save(flags);		/* save the IF flag of the eflags register and disable local interrupts */
	raise_softirq_irqoff(nr);	/* ends up calling or_softirq_pending() */
	local_irq_restore(flags);
}
/*
 * This function must run with irqs disabled!
 */
inline void raise_softirq_irqoff(unsigned int nr)
{
	__raise_softirq_irqoff(nr);

	/*
	 * If we're in an interrupt or softirq, we're done
	 * (this also catches softirq-disabled code). We will
	 * actually run the softirq once we return from
	 * the irq or softirq.
	 *
	 * Otherwise we wake up ksoftirqd to make sure we
	 * schedule the softirq soon.
	 */
	if (!in_interrupt())
		wakeup_softirqd();	/* wake up the local CPU's ksoftirqd thread */
}

start_kernel() calls softirq_init() during boot:

asmlinkage __visible void __init start_kernel(void)
{
    ......
    init_IRQ();
    ......
    softirq_init();
    ......
    time_init();
    ......
    rest_init();
}
 
static noinline void __init_refok rest_init(void)
{
    ......
    kernel_thread(kernel_init, NULL, CLONE_FS);
    ......
}
 
static int __ref kernel_init(void *unused)
{
    ......
    kernel_init_freeable();
    ......
}
 
static noinline void __init kernel_init_freeable(void)
{
    ......
    do_basic_setup();
    ......
}
 
static void __init do_basic_setup(void)
{
    cpuset_init_smp();
    usermodehelper_init();
    shmem_init();
    driver_init();
    init_irq_proc();
    do_ctors();
    usermodehelper_enable();
    do_initcalls();
    random_int_secret_init();
}
 
static void __init do_initcalls(void)
{
    int level;
 
    for (level = 0; level < ARRAY_SIZE(initcall_levels) - 1; level++)
        do_initcall_level(level);	/* runs the initcalls, some of which set up softirqs */
}
 
// subsys_initcall(net_dev_init) registers net_dev_init, so one of the levels executes net_dev_init.


The kernel thread started at boot to process softirqs runs ksoftirqd() in older kernels (Linux 2.6.11) and run_ksoftirqd() in Linux 3.0:

p = kthread_create_on_node(run_ksoftirqd, hcpu, cpu_to_node(hotcpu),   "ksoftirqd/%d", hotcpu);

static int run_ksoftirqd(void * __bind_cpu)
{
	set_current_state(TASK_INTERRUPTIBLE);

	while (!kthread_should_stop()) {
		preempt_disable();
		if (!local_softirq_pending()) {
			preempt_enable_no_resched();
			schedule();
			preempt_disable();
		}

		__set_current_state(TASK_RUNNING);

		while (local_softirq_pending()) {
			/* Preempt disable stops cpu going offline.
			   If already offline, we'll be on wrong CPU:
			   don't process */
			if (cpu_is_offline((long)__bind_cpu))
				goto wait_to_die;
			local_irq_disable();
			if (local_softirq_pending())	/* softirqs are pending: run their handlers */
				__do_softirq();
			local_irq_enable();
			preempt_enable_no_resched();
			cond_resched();
			preempt_disable();
			rcu_note_context_switch((long)__bind_cpu);
		}
		preempt_enable();
		set_current_state(TASK_INTERRUPTIBLE);	/* prepare to sleep again */
	}
	__set_current_state(TASK_RUNNING);
	return 0;

wait_to_die:
	preempt_enable();
	/* Wait for kthread_stop */
	set_current_state(TASK_INTERRUPTIBLE);
	while (!kthread_should_stop()) {
		schedule();
		set_current_state(TASK_INTERRUPTIBLE);
	}
	__set_current_state(TASK_RUNNING);
	return 0;
}

Each CPU has its own ksoftirqd kernel thread.
Before the ksoftirqd threads are started, softirq_init() initializes the action handlers in the softirq_vec[] array:
void __init softirq_init(void)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		int i;

		per_cpu(tasklet_vec, cpu).tail =
			&per_cpu(tasklet_vec, cpu).head;
		per_cpu(tasklet_hi_vec, cpu).tail =
			&per_cpu(tasklet_hi_vec, cpu).head;
		for (i = 0; i < NR_SOFTIRQS; i++)
			INIT_LIST_HEAD(&per_cpu(softirq_work_list[i], cpu));
	}

	register_hotcpu_notifier(&remote_softirq_cpu_notifier);

	open_softirq(TASKLET_SOFTIRQ, tasklet_action);	/* register tasklet_action as the handler for this softirq in softirq_vec[] */
	open_softirq(HI_SOFTIRQ, tasklet_hi_action);
}

void open_softirq(int nr, void (*action)(struct softirq_action *))
{
	softirq_vec[nr].action = action;
}


So when a softirq has been raised and ksoftirqd is woken up, the ksoftirqd kernel thread calls __do_softirq() to process the pending softirqs:

asmlinkage void __do_softirq(void)
{
	struct softirq_action *h;
	__u32 pending;
	int max_restart = MAX_SOFTIRQ_RESTART;
	int cpu;

	pending = local_softirq_pending();	/* copy the softirq bit mask of the local CPU */
	account_system_vtime(current);

	/* Increase the softirq counter, which disables deferrable functions. */
	__local_bh_disable((unsigned long)__builtin_return_address(0),
				SOFTIRQ_OFFSET);
	lockdep_softirq_enter();

	cpu = smp_processor_id();
restart:
	/* Reset the pending bitmask before enabling irqs */
	set_softirq_pending(0);	/* clear the softirq bitmap of the local CPU so that new softirqs can be activated */

	local_irq_enable();	/* enable local interrupts */

	h = softirq_vec;
	/*
	 * For each bit set in the pending local variable, execute the
	 * corresponding softirq function; the function address for the
	 * softirq with index n is stored in softirq_vec[n].action.
	 */
	do {
		if (pending & 1) {
			unsigned int vec_nr = h - softirq_vec;
			int prev_count = preempt_count();

			kstat_incr_softirqs_this_cpu(vec_nr);

			trace_softirq_entry(vec_nr);
			h->action(h);	/* call the softirq handler */
			trace_softirq_exit(vec_nr);
			if (unlikely(prev_count != preempt_count())) {
				printk(KERN_ERR "huh, entered softirq %u %s %p"
				       "with preempt_count %08x,"
				       " exited with %08x?\n", vec_nr,
				       softirq_to_name[vec_nr], h->action,
				       prev_count, preempt_count());
				preempt_count() = prev_count;
			}

			rcu_bh_qs(cpu);
		}
		h++;
		pending >>= 1;
	} while (pending);

	local_irq_disable();

	pending = local_softirq_pending();
	/*
	 * Softirqs were activated again since the start of the last
	 * iteration and the iteration counter is still positive.
	 */
	if (pending && --max_restart)
		goto restart;

	/* Iteration limit reached with softirqs still pending: wake up ksoftirqd. */
	if (pending)
		wakeup_softirqd();

	lockdep_softirq_exit();

	account_system_vtime(current);
	/* Subtract 1 from the softirq counter, thus re-enabling the deferrable functions. */
	__local_bh_enable(SOFTIRQ_OFFSET);
}

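The function first sets the iteration limit to 10: int max_restart = MAX_SOFTIRQ_RESTART;

The reason is that new softirqs may become pending while softirq functions are executing. To keep the latency of deferrable functions low, __do_softirq() could keep running until no softirq is pending, but it might then run for a long time and delay user-mode processes. So __do_softirq() performs only a fixed number of iterations and leaves whatever is still pending to ksoftirqd.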
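
The bounded loop can be tried out in user space. The following is a minimal sketch, not the kernel code: model_vec, model_pending, model_wakeups and the two sample handlers are hypothetical names, and a counter stands in for waking ksoftirqd.

```c
#include <assert.h>
#include <stdint.h>

#define NR_SOFTIRQS          10
#define MAX_SOFTIRQ_RESTART  10

typedef void (*softirq_fn)(unsigned int nr);

static softirq_fn model_vec[NR_SOFTIRQS];  /* stands in for softirq_vec[] */
static uint32_t model_pending;             /* stands in for __softirq_pending */
int model_wakeups;                         /* times "ksoftirqd" was woken */

void model_open_softirq(unsigned int nr, softirq_fn fn)
{
    assert(nr < NR_SOFTIRQS);
    model_vec[nr] = fn;
}

void model_raise_softirq(unsigned int nr)
{
    assert(nr < NR_SOFTIRQS);
    model_pending |= (uint32_t)1 << nr;
}

void model_do_softirq(void)
{
    int max_restart = MAX_SOFTIRQ_RESTART;
    uint32_t pending = model_pending;

restart:
    model_pending = 0;                 /* handlers may raise new softirqs */
    for (unsigned int nr = 0; pending; nr++, pending >>= 1)
        if ((pending & 1) && model_vec[nr])
            model_vec[nr](nr);         /* lowest index runs first */

    pending = model_pending;
    if (pending && --max_restart)
        goto restart;
    if (pending)
        model_wakeups++;               /* leave the rest to ksoftirqd */
}

/* Sample handlers: one logs its number, one keeps re-raising itself. */
unsigned int model_order[4];
int model_runs;

void log_handler(unsigned int nr)
{
    if (model_runs < 4)
        model_order[model_runs] = nr;
    model_runs++;
}

void greedy_handler(unsigned int nr)
{
    model_runs++;
    model_raise_softirq(nr);
}
```

A handler that always re-raises its own softirq, like greedy_handler() here, is cut off after MAX_SOFTIRQ_RESTART passes instead of monopolizing the CPU.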

As for why local_bh_disable() is called to disable softirqs: it is somewhat counterintuitive that deferrable functions should be disabled before starting to execute them, but it really makes a lot of sense. Because the deferrable functions mostly run with interrupts enabled, an interrupt can be raised in the middle of the __do_softirq() function. When do_IRQ() executes the irq_exit() macro, another instance of the __do_softirq() function could be started. This has to be avoided, because deferrable functions must execute serially on the CPU. Thus, the first instance of __do_softirq() disables deferrable functions, so that every new instance of the function exits at the first step of do_softirq(), the in_interrupt() check.
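
The guard can be sketched as a counter check in user-space C. Everything here is an assumption made for illustration: the SOFTIRQ_OFFSET and SOFTIRQ_MASK values, the guarded_do_softirq() and model_irq_exit() names, and the irq_arrives parameter that simulates an interrupt arriving while a handler runs.

```c
#include <assert.h>

#define SOFTIRQ_OFFSET 0x100u   /* assumed: one unit of the softirq counter */
#define SOFTIRQ_MASK   0xff00u  /* assumed: bits holding the softirq counter */

unsigned int model_count;   /* stands in for preempt_count */
int model_entered;          /* passes that actually processed softirqs */
int model_skipped;          /* nested attempts that returned at once */

void guarded_do_softirq(int irq_arrives);

/* An interrupt ends while a handler is running: irq_exit() tries again. */
void model_irq_exit(void)
{
    guarded_do_softirq(0);
}

void guarded_do_softirq(int irq_arrives)
{
    if (model_count & SOFTIRQ_MASK) {   /* the in_interrupt() test */
        model_skipped++;
        return;
    }

    model_count += SOFTIRQ_OFFSET;      /* __local_bh_disable() */
    model_entered++;
    if (irq_arrives)
        model_irq_exit();               /* nested attempt is rejected */
    model_count -= SOFTIRQ_OFFSET;      /* __local_bh_enable() */
    assert(model_count == 0);
}
```

Calling guarded_do_softirq(1) processes softirqs once; the nested call made from model_irq_exit() sees a nonzero softirq counter and returns immediately.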

In the run_ksoftirqd() thread, cond_resched() is called after __do_softirq() returns:

#define cond_resched() ({			\
	__might_sleep(__FILE__, __LINE__, 0);	\
	_cond_resched();			\
})
static inline int should_resched(void)
{
      return need_resched() && !(preempt_count() & PREEMPT_ACTIVE);
}


static void __cond_resched(void)
{
	/*
	 * Disable kernel preemption until sub_preempt_count(PREEMPT_ACTIVE)
	 * below; this process cannot be preempted again in between. If a
	 * preemption is needed, it can happen after the sub_preempt_count().
	 */
	add_preempt_count(PREEMPT_ACTIVE);
	__schedule();
	sub_preempt_count(PREEMPT_ACTIVE);
}

int __sched _cond_resched(void)
{
	if (should_resched()) {
		__cond_resched();
		return 1;
	}
	return 0;
}
static inline int need_resched(void)
{
       return unlikely(test_thread_flag(TIF_NEED_RESCHED));
}

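For the tasklet softirqs, __do_softirq() calls back the registered handler through h->action(h). The handlers are registered with open_softirq():

open_softirq(TASKLET_SOFTIRQ, tasklet_action);	/* register tasklet_action as the handler for this softirq in softirq_vec[] */
open_softirq(HI_SOFTIRQ, tasklet_hi_action);

so the corresponding callbacks are tasklet_action() and tasklet_hi_action().

Linux 2.6 defines only a handful of softirqs; besides HI_SOFTIRQ and TASKLET_SOFTIRQ for tasklets, they include the timer and network softirqs.

Take tasklet_action() as an example: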

static void tasklet_action(struct softirq_action *a)
{
	struct tasklet_struct *list;

	local_irq_disable();	/* disable local IRQs */
	list = __this_cpu_read(tasklet_vec.head);	/* fetch the head of the local CPU's tasklet_vec list */
	__this_cpu_write(tasklet_vec.head, NULL);	/* empty the list of scheduled tasklet descriptors */
	__this_cpu_write(tasklet_vec.tail, &__get_cpu_var(tasklet_vec).head);
	local_irq_enable();

	while (list) {
		struct tasklet_struct *t = list;

		list = list->next;

		if (tasklet_trylock(t)) {
			if (!atomic_read(&t->count)) {
				if (!test_and_clear_bit(TASKLET_STATE_SCHED, &t->state))
					BUG();	/* a tasklet taken off the list must still have TASKLET_STATE_SCHED set */
				t->func(t->data);
				tasklet_unlock(t);
				continue;
			}
			tasklet_unlock(t);
		}

		local_irq_disable();
		t->next = NULL;
		*__this_cpu_read(tasklet_vec.tail) = t;
		__this_cpu_write(tasklet_vec.tail, &(t->next));
		__raise_softirq_irqoff(TASKLET_SOFTIRQ);
		local_irq_enable();
	}
}
Raising a softirq from an interrupt handler is the most common pattern: the handler does the hardware-related work, raises the corresponding softirq, and returns. As soon as the interrupt handler has finished, the kernel calls do_softirq(), and the softirq then completes the remaining work on behalf of the interrupt handler.

Tasklets are implemented on top of softirqs, so they are softirqs too. They are carried by two softirqs, HI_SOFTIRQ and TASKLET_SOFTIRQ; the only difference is that the former runs before the latter. A tasklet is represented by a tasklet_struct structure, one structure per tasklet, defined in linux/interrupt.h:

struct tasklet_struct
{
	struct tasklet_struct *next;
	unsigned long state;
	atomic_t count;
	void (*func)(unsigned long);
	unsigned long data;
};

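As the tasklet_action() function shows, the func() of each queued tasklet_struct is executed in turn.

Other softirqs have their own handlers. For timers, for example, the handler is: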

static void run_timer_softirq(struct softirq_action *h)
{
	struct tvec_base *base = __this_cpu_read(tvec_bases);


	hrtimer_run_pending();


	if (time_after_eq(jiffies, base->timer_jiffies))
		__run_timers(base);
}

Before using a tasklet, initialize it in one of the following ways:

/* By default, the tasklet is activated */
void tasklet_init(struct tasklet_struct *t, void (*func)(unsigned long), unsigned long data);
DECLARE_TASKLET(name, func, data);          /* declare and define a tasklet */
DECLARE_TASKLET_DISABLED(name, func, data); /* declare and define a deactivated tasklet */

Scheduling a tasklet is simple: the tasklet is placed on one of two queues, depending on its priority. The queues are singly linked lists, and each CPU has its own queues. To schedule a tasklet, call tasklet_schedule() or one of its variants with a pointer to the tasklet_struct; the tasklet is then run at a suitable later time:

void tasklet_schedule(struct tasklet_struct *t);           /* with normal priority */
void tasklet_hi_schedule(struct tasklet_struct *t);        /* with high priority */
void tasklet_hi_schedule_first(struct tasklet_struct *t);  /* out of the queue */

A tasklet can also be killed:

void tasklet_kill(struct tasklet_struct *t);

The implementation behind tasklet_schedule() looks like this:

void __tasklet_schedule(struct tasklet_struct *t)
{
	unsigned long flags;

	local_irq_save(flags);
	t->next = NULL;
	*__this_cpu_read(tasklet_vec.tail) = t;
	__this_cpu_write(tasklet_vec.tail, &(t->next));
	/* the tasklet is now on tasklet_vec; raise TASKLET_SOFTIRQ, which wakes ksoftirqd when not in interrupt context */
	raise_softirq_irqoff(TASKLET_SOFTIRQ);
	local_irq_restore(flags);
}

void __tasklet_hi_schedule(struct tasklet_struct *t)
{
	unsigned long flags;

	local_irq_save(flags);
	t->next = NULL;
	*__this_cpu_read(tasklet_hi_vec.tail) = t;
	__this_cpu_write(tasklet_hi_vec.tail, &(t->next));
	raise_softirq_irqoff(HI_SOFTIRQ);
	local_irq_restore(flags);
}
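
The queue handling in __tasklet_schedule() and tasklet_action() relies on a tail pointer that points at the last next field, so appending never has to walk the list. Below is a user-space sketch of that idea under simplifying assumptions: a single queue instead of per-CPU ones, plain int fields instead of atomic bit operations, and hypothetical model_ names throughout.

```c
#include <assert.h>
#include <stddef.h>

struct model_tasklet {
    struct model_tasklet *next;
    int scheduled;               /* stands in for TASKLET_STATE_SCHED */
    int count;                   /* nonzero: tasklet is disabled */
    void (*func)(unsigned long);
    unsigned long data;
};

/* One queue; the kernel keeps such queues per CPU and per priority. */
static struct model_tasklet *head;
static struct model_tasklet **tail = &head;

/* Append unless already queued, as the SCHED bit guarantees. */
void model_tasklet_schedule(struct model_tasklet *t)
{
    if (t->scheduled)
        return;
    t->scheduled = 1;
    t->next = NULL;
    *tail = t;
    tail = &t->next;
}

/* Detach the whole list, run enabled tasklets, requeue disabled ones. */
void model_tasklet_action(void)
{
    struct model_tasklet *list = head;

    head = NULL;
    tail = &head;

    while (list) {
        struct model_tasklet *t = list;

        list = list->next;
        if (t->count == 0) {
            t->scheduled = 0;
            t->func(t->data);
            continue;
        }
        t->next = NULL;          /* disabled: keep it for a later pass */
        *tail = t;
        tail = &t->next;
    }
}

/* Sample tasklet body used to observe which tasklets ran. */
unsigned long model_sum;

void model_add(unsigned long data)
{
    model_sum += data;
}
```

Scheduling the same tasklet twice before it runs queues it only once, and a tasklet whose count is nonzero stays queued until it is enabled, which mirrors the requeue branch at the end of tasklet_action().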

