The work done by a network device's interrupt handler:
1. Copy the frame into an sk_buff data structure. If the device uses DMA, the driver only needs to initialize a pointer (no copy is needed).
2. Initialize some of the sk_buff's fields for later use by the upper network layers.
3. Update some other private parameters of the device.
4. Notify the kernel about the new frame by scheduling the NET_RX_SOFTIRQ softirq for execution. Because a device can raise an interrupt for different reasons (a new frame was received, a frame was transmitted successfully, and so on), the interrupt notification is paired with status information so that the driver's handler routine can process each interrupt event according to its type.
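As a concrete illustration of the last step, here is a minimal sketch of an RX interrupt handler in a NAPI-style driver; mydev_priv, MYDEV_IRQ_RX, and the register helpers are hypothetical names, while napi_schedule() and netdev_priv() are real kernel interfaces:

static irqreturn_t mydev_interrupt(int irq, void *dev_id)
{
        struct net_device *dev = dev_id;
        struct mydev_priv *priv = netdev_priv(dev);
        u32 status = mydev_read_irq_status(priv);   /* hypothetical register read */

        if (!status)
                return IRQ_NONE;                    /* not our interrupt */

        if (status & MYDEV_IRQ_RX) {
                /* Disable further RX interrupts; the poll routine will
                 * re-enable them once the input queue is empty. */
                mydev_disable_rx_irq(priv);         /* hypothetical helper */
                /* Sets NAPI_STATE_SCHED and raises NET_RX_SOFTIRQ, which
                 * later runs net_rx_action() on this CPU. */
                napi_schedule(&priv->napi);
        }

        return IRQ_HANDLED;
}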
The core idea of NAPI: use a mixture of interrupts and polling rather than a purely interrupt-driven model. If new frames are received while the kernel has not yet finished handling the previous ones, there is no need for the driver to generate further interrupts: it is simpler to let the kernel keep processing the data in the device input queue (with the device's interrupt functionality disabled) and to re-enable interrupts once that queue is empty. From the kernel's point of view, the NAPI approach has these advantages: the CPU load drops because there are fewer interrupts, and devices are treated more fairly, since every device with data in its ingress queue is visited in a reasonably fair round-robin fashion. This ensures that even when other devices are under heavy load, the latency experienced by low-traffic devices stays within an acceptable range.
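A hedged sketch of the matching ->poll() routine shows this disable-poll-re-enable cycle in code; mydev_rx_one() and the interrupt helpers are hypothetical, while napi_complete() is the real kernel call that removes the instance from the poll list:

static int mydev_poll(struct napi_struct *napi, int budget)
{
        struct mydev_priv *priv = container_of(napi, struct mydev_priv, napi);
        int work_done = 0;

        /* Dequeue up to 'budget' frames from the device's private input
         * queue; mydev_rx_one() is a hypothetical helper that builds an
         * sk_buff and hands it up with netif_receive_skb(). */
        while (work_done < budget && mydev_rx_one(priv))
                work_done++;

        if (work_done < budget) {
                /* The queue is drained: leave polled mode and re-enable
                 * the device's receive interrupts. */
                napi_complete(napi);
                mydev_enable_rx_irq(priv);          /* hypothetical helper */
        }

        /* Returning work_done == budget keeps the device on the poll list. */
        return work_done;
}

The napi_struct instance these routines operate on is defined as follows: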
struct napi_struct {
        /* The poll_list must only be managed by the entity which
         * changes the state of the NAPI_STATE_SCHED bit. This means
         * whoever atomically sets that bit can add this napi_struct
         * to the per-cpu poll_list, and whoever clears that bit
         * can remove from the list right before clearing the bit.
         */
        struct list_head        poll_list;

        unsigned long           state;
        int                     weight;
        int                     (*poll)(struct napi_struct *, int);
#ifdef CONFIG_NETPOLL
        spinlock_t              poll_lock;
        int                     poll_owner;
#endif

        unsigned int            gro_count;

        struct net_device       *dev;
        struct list_head        dev_list;
        struct sk_buff          *gro_list;
        struct sk_buff          *skb;
};
This structure manages a device on the poll list.

poll: a virtual function used to dequeue buffers from the device's input queue. This queue is private to devices that use NAPI; other (non-NAPI) devices share softnet_data->input_pkt_queue instead.

poll_list: the list of devices that have new frames in their ingress queues waiting to be processed; such devices are said to be in polling state. Every device on this list has its interrupt functionality disabled while the kernel is polling it.
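For completeness, a sketch of how a driver wires a poll routine into this structure; netif_napi_add() and napi_enable() are the real kernel helpers (in this kernel generation netif_napi_add() takes an explicit weight), and mydev_poll is the hypothetical routine sketched earlier:

/* In the driver's probe routine: register the NAPI instance. The weight
 * (64 is the conventional value for Ethernet drivers) bounds how many
 * frames a single ->poll() invocation may consume. */
netif_napi_add(dev, &priv->napi, mydev_poll, 64);

/* In the device-open path: enable NAPI before enabling RX interrupts so
 * a scheduled poll cannot race with a disabled instance. */
napi_enable(&priv->napi);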
The behavioral difference between a NAPI-aware driver and one that is not lies in the hand-off point: a non-NAPI driver does all of its frame handling in interrupt context and pushes each frame to the kernel immediately, whereas a NAPI driver only schedules itself and lets the kernel pull frames in through ->poll(). A sketch of the non-NAPI path follows.
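A minimal sketch of the non-NAPI hand-off, using the real netif_rx() interface; frame_data, pkt_len, and the mydev-specific names are hypothetical:

/* Inside a non-NAPI driver's interrupt handler: build an sk_buff for
 * one frame and hand it to the kernel immediately. */
struct sk_buff *skb = dev_alloc_skb(pkt_len + 2);
if (skb) {
        skb_reserve(skb, 2);                      /* align the IP header */
        memcpy(skb_put(skb, pkt_len), frame_data, pkt_len);
        skb->protocol = eth_type_trans(skb, dev);
        netif_rx(skb);   /* enqueues on softnet_data->input_pkt_queue */
}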
Congestion management
Congestion management is an important part of the input-frame processing task. An overloaded CPU becomes unstable and can introduce large latencies into the system, so congestion-management mechanisms are needed to keep the system stable and prevent it from being dragged down under heavy network load. Common ways of lowering CPU load under high traffic all amount to bounding the work done per pass; net_rx_action() below implements exactly this, with a packet budget and a time limit:
static void net_rx_action(struct softirq_action *h)
{
        struct softnet_data *sd = &__get_cpu_var(softnet_data);
        unsigned long time_limit = jiffies + 2;
        int budget = netdev_budget;
        void *have;

        local_irq_disable();

        while (!list_empty(&sd->poll_list)) {
                struct napi_struct *n;
                int work, weight;

                /* If softirq window is exhausted then punt.
                 * Allow this to run for 2 jiffies since which will allow
                 * an average latency of 1.5/HZ.
                 */
                if (unlikely(budget <= 0 || time_after(jiffies, time_limit)))
                        goto softnet_break;

                local_irq_enable();

                /* Even though interrupts have been re-enabled, this
                 * access is safe because interrupts can only add new
                 * entries to the tail of this list, and only ->poll()
                 * calls can remove this head entry from the list.
                 */
                n = list_first_entry(&sd->poll_list, struct napi_struct, poll_list);

                have = netpoll_poll_lock(n);

                weight = n->weight;

                /* This NAPI_STATE_SCHED test is for avoiding a race
                 * with netpoll's poll_napi(). Only the entity which
                 * obtains the lock and sees NAPI_STATE_SCHED set will
                 * actually make the ->poll() call. Therefore we avoid
                 * accidentally calling ->poll() when NAPI is not scheduled.
                 */
                work = 0;
                if (test_bit(NAPI_STATE_SCHED, &n->state)) {
                        work = n->poll(n, weight);
                        trace_napi_poll(n);
                }

                WARN_ON_ONCE(work > weight);

                budget -= work;

                local_irq_disable();

                /* Drivers must not modify the NAPI state if they
                 * consume the entire weight. In such cases this code
                 * still "owns" the NAPI instance and therefore can
                 * move the instance around on the list at-will.
                 */
                if (unlikely(work == weight)) {
                        if (unlikely(napi_disable_pending(n))) {
                                local_irq_enable();
                                napi_complete(n);
                                local_irq_disable();
                        } else
                                list_move_tail(&n->poll_list, &sd->poll_list);
                }

                netpoll_poll_unlock(have);
        }
out:
        net_rps_action_and_irq_enable(sd);

#ifdef CONFIG_NET_DMA
        /*
         * There may not be any more sk_buffs coming right now, so push
         * any pending DMA copies to hardware
         */
        dma_issue_pending_all();
#endif

        return;

softnet_break:
        sd->time_squeeze++;
        __raise_softirq_irqoff(NET_RX_SOFTIRQ);
        goto out;
}
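In short, net_rx_action() bounds each softirq run in two ways: a packet budget (initialized from netdev_budget, tunable through the net.core.netdev_budget sysctl) and a two-jiffy time window. When either limit is exceeded it jumps to softnet_break, increments the time_squeeze counter (visible in /proc/net/softnet_stat), and re-raises NET_RX_SOFTIRQ so that the devices left on poll_list are served in a later run instead of monopolizing the CPU. A device whose poll consumed its full weight is rotated to the tail of the list with list_move_tail(), which is precisely the round-robin fairness described earlier.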