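Offload techniques in network virtualization: LSO/LRO, GSO/GRO, TSO/UFO, VXLAN
More and more NICs now support offload features that improve network receive and transmit performance. Offload moves packet processing that the operating system would otherwise perform (such as segmentation and reassembly) into the NIC hardware, lowering CPU consumption while raising processing performance. These features include LSO/LRO, GSO/GRO and TSO/UFO.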
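GSO: GSO (Generic Segmentation Offload) is best understood as a generalization of TSO (TCP Segmentation Offload). TSO moves part of TCP processing, namely segmentation, onto the NIC to reduce the CPU load of the protocol stack. The Ethernet MTU is normally 1500 bytes; subtracting the IP header (20 bytes in the standard case) and the TCP header (20 bytes in the standard case) leaves a TCP MSS (Maximum Segment Size) of 1460 bytes. When the application hands down more data than the MSS, the protocol stack normally splits the payload into segments so that no resulting packet exceeds the MTU. With TSO/GSO this is unnecessary: a payload of up to 64 KB can be passed down the protocol stack in one piece, the IP layer does not fragment it, and it travels all the way to the NIC driver. A TSO-capable NIC then generates the TCP/IP headers and the frame header for each segment itself, so much of the stack's memory handling, the checksum computation and other work that the CPU used to do is offloaded to the NIC. GSO applies the same idea in software: segmentation is deferred until just before the packet is handed to the driver, so most of the benefit remains even when the NIC cannot segment in hardware.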
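To make the arithmetic above concrete, here is a minimal sketch of how a large payload maps onto MSS-sized segments. It is not kernel code; the function names (tcp_mss_for_mtu, segment_count, segment_len) are hypothetical, and it assumes IPv4 and TCP headers without options, as in the text.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed header sizes: IPv4 and TCP without options, as in the text. */
#define IPV4_HDR_LEN ((size_t)20)
#define TCP_HDR_LEN  ((size_t)20)

/* Largest TCP payload that fits into one IP packet of the given MTU. */
size_t tcp_mss_for_mtu(size_t mtu)
{
    assert(mtu > IPV4_HDR_LEN + TCP_HDR_LEN);
    return mtu - IPV4_HDR_LEN - TCP_HDR_LEN;
}

/* Number of on-wire segments needed for payload_len bytes (rounded up). */
size_t segment_count(size_t payload_len, size_t mss)
{
    assert(mss > 0);
    return (payload_len + mss - 1) / mss;
}

/* Payload length of the i-th segment (0-based); only the last may be short. */
size_t segment_len(size_t payload_len, size_t mss, size_t i)
{
    size_t off = i * mss;
    size_t rest;

    assert(off < payload_len);
    rest = payload_len - off;
    return rest < mss ? rest : mss;
}
```

Taking 64 KB as 65536 bytes and an MTU of 1500, tcp_mss_for_mtu(1500) gives 1460 and segment_count(65536, 1460) gives 45: 44 full segments plus a final one of 1296 bytes. Without offload the stack builds all 45 packets itself; with TSO the NIC does that work from a single large buffer.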
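GRO flow
ixgbe_rx_skb
After the NIC driver receives a packet from the rx ring, it calls ixgbe_rx_skb to pass the packet up the protocol stack. ixgbe_rx_skb checks whether an upper-layer socket is busy-polling the queue; if not, it enters napi_gro_receive, the entry point for GRO merging: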
static void ixgbe_rx_skb(struct ixgbe_q_vector *q_vector,
                         struct sk_buff *skb)
{
    skb_mark_napi_id(skb, &q_vector->napi);
    /* A socket is busy-polling this queue: deliver directly, bypassing GRO. */
    if (ixgbe_qv_busy_polling(q_vector))
        netif_receive_skb(skb);
    else
        napi_gro_receive(&q_vector->napi, skb);
}
https://android.googlesource.com/kernel/common/+/refs/heads/android12-5.10/net/core/dev.c
gro_result_t napi_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
{
    gro_result_t ret;

    skb_mark_napi_id(skb, napi);
    trace_napi_gro_receive_entry(skb);
    skb_gro_reset_offset(skb, 0);
    ret = napi_skb_finish(napi, skb, dev_gro_receive(napi, skb));
    trace_napi_gro_receive_exit(ret);
    return ret;
}
EXPORT_SYMBOL(napi_gro_receive);
https://android.googlesource.com/kernel/common/+/refs/heads/android12-5.10/net/core/dev.c#5962
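static enum gro_result dev_gro_receive(struct napi_struct *napi, struct sk_buff *skb)
{
    /* Abridged: ptype is the offload entry matching skb->protocol. */
    pp = INDIRECT_CALL_INET(ptype->callbacks.gro_receive,
                            ipv6_gro_receive, inet_gro_receive,
                            gro_head, skb);
    /* Abridged: handling of the result is omitted here. */
}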
https://android.googlesource.com/kernel/common/+/refs/heads/android12-5.10/net/ipv4/af_inet.c#1437
struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
{
    /* Abridged: ops is the offload entry for the IP header's protocol field. */
    pp = indirect_call_gro_receive(tcp4_gro_receive, udp4_gro_receive,
                                   ops->callbacks.gro_receive, head, skb);
}
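The two excerpts above show the same pattern twice: each layer looks up a gro_receive callback for the next protocol and calls it, and the INDIRECT_CALL-style helpers compare that pointer against the most likely targets so that the common case becomes a direct call. The following is a toy model of that pattern, not kernel code; every toy_ name and the TOY_INDIRECT_CALL_2 macro are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a packet; NOT the kernel's struct sk_buff. */
struct toy_pkt {
    uint16_t ethertype; /* 0x0800 = IPv4 */
    uint8_t  ip_proto;  /* 6 = TCP, 17 = UDP */
};

enum toy_gro_result { TOY_GRO_NONE, TOY_GRO_TCP4, TOY_GRO_UDP4 };

typedef enum toy_gro_result (*toy_gro_fn)(const struct toy_pkt *pkt);

enum toy_gro_result toy_tcp4_gro_receive(const struct toy_pkt *pkt)
{
    assert(pkt->ip_proto == 6);
    return TOY_GRO_TCP4;
}

enum toy_gro_result toy_udp4_gro_receive(const struct toy_pkt *pkt)
{
    assert(pkt->ip_proto == 17);
    return TOY_GRO_UDP4;
}

/* If f equals one of the two expected targets, call that target by name
 * (a direct call); otherwise fall back to the indirect call through f. */
#define TOY_INDIRECT_CALL_2(f, f2, f1, ...)          \
    ((f) == (f2) ? f2(__VA_ARGS__) :                 \
     (f) == (f1) ? f1(__VA_ARGS__) : (f)(__VA_ARGS__))

/* L4 lookup keyed by the IP protocol number; NULL means no GRO handler. */
toy_gro_fn toy_l4_lookup(uint8_t ip_proto)
{
    switch (ip_proto) {
    case 6:  return toy_tcp4_gro_receive;
    case 17: return toy_udp4_gro_receive;
    default: return NULL;
    }
}

/* L3 step: pick the L4 callback and dispatch to it. */
enum toy_gro_result toy_inet_gro_receive(const struct toy_pkt *pkt)
{
    toy_gro_fn fn = toy_l4_lookup(pkt->ip_proto);

    if (fn == NULL)
        return TOY_GRO_NONE;
    return TOY_INDIRECT_CALL_2(fn, toy_tcp4_gro_receive,
                               toy_udp4_gro_receive, pkt);
}

/* Device-level step: only IPv4 is modelled in this sketch. */
enum toy_gro_result toy_dev_gro_receive(const struct toy_pkt *pkt)
{
    if (pkt->ethertype != 0x0800)
        return TOY_GRO_NONE;
    return toy_inet_gro_receive(pkt);
}
```

An IPv4/TCP packet ends up in toy_tcp4_gro_receive, an IPv4/UDP packet in toy_udp4_gro_receive, and anything without a registered handler is reported as TOY_GRO_NONE, which in this model stands for "deliver without merging".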
https://android.googlesource.com/kernel/common/+/refs/heads/android12-5.10/net/ipv4/tcp_offload.c#309
struct sk_buff *tcp4_gro_receive(struct list_head *head, struct sk_buff *skb)
{
    /* Don't bother verifying checksum if we're going to flush anyway. */
    if (!NAPI_GRO_CB(skb)->flush &&
        skb_gro_checksum_validate(skb, IPPROTO_TCP,
                                  inet_gro_compute_pseudo)) {
        NAPI_GRO_CB(skb)->flush = 1;
        return NULL;
    }
    return tcp_gro_receive(head, skb);
}
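tcp4_gro_receive refuses to merge a segment whose TCP checksum does not validate. As a rough illustration of what such a validation computes when it has to be done in software, here is a sketch of the standard Internet checksum over the IPv4 pseudo-header plus the TCP segment. It is a standalone model, not the kernel's implementation (which can also rely on checksum results reported by the NIC); all function names are hypothetical, and addresses are passed as host-order 32-bit values for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add len bytes to a running sum, reading them as big-endian 16-bit words. */
uint32_t csum_add_bytes(uint32_t sum, const uint8_t *data, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (len & 1) /* an odd trailing byte is padded with a zero byte */
        sum += (uint32_t)(data[len - 1] << 8);
    return sum;
}

/* Fold the carries back into 16 bits and return the ones' complement. */
uint16_t csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Sum of the IPv4 pseudo-header: source, destination, protocol, TCP length. */
uint32_t tcp4_pseudo_sum(uint32_t saddr, uint32_t daddr, uint16_t tcp_len)
{
    uint32_t sum = 0;

    sum += saddr >> 16;
    sum += saddr & 0xffff;
    sum += daddr >> 16;
    sum += daddr & 0xffff;
    sum += 6; /* IP protocol number of TCP */
    sum += tcp_len;
    return sum;
}

/* Returns 1 if the segment (TCP header plus payload, checksum field
 * included) validates against the pseudo-header, 0 otherwise. */
int tcp4_csum_ok(uint32_t saddr, uint32_t daddr,
                 const uint8_t *seg, size_t len)
{
    uint32_t sum;

    assert(len <= 0xffff);
    sum = tcp4_pseudo_sum(saddr, daddr, (uint16_t)len);
    return csum_fold(csum_add_bytes(sum, seg, len)) == 0;
}
```

A sender would compute csum_fold over the same data with the checksum field (bytes 16 and 17 of the TCP header) set to zero and store the result there; a receiver then accepts the segment only if tcp4_csum_ok returns 1. In the excerpt above, a failed validation sets flush and returns NULL so that the segment is not merged.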
https://android.googlesource.com/kernel/common/+/refs/heads/android12-5.10/net/ipv4/tcp_offload.c#180
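struct sk_buff *tcp_gro_receive(struct list_head *head, struct sk_buff *skb)
{
    /* Abridged: p is the held packet of the same flow found in head. */
    /* Flush requested, or skb could not be appended to p: skip merging. */
    if (flush || skb_gro_receive(p, skb)) {
        mss = 1;
        goto out_check_final;
    }
}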
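The core idea of tcp_gro_receive, appending a new segment to a held packet of the same flow when it continues the byte stream, can be sketched with a toy model. This is not the kernel algorithm: struct toy_seg, the function names and the 65535-byte cap are assumptions made for illustration, and the model only tries to merge with the most recently held packet instead of searching a per-flow list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a TCP segment for this sketch. */
struct toy_seg {
    uint32_t flow_id; /* stands in for the address/port 4-tuple */
    uint32_t seq;     /* TCP sequence number of the first payload byte */
    uint32_t len;     /* payload length in bytes */
};

/* Assumed cap on the size of a merged packet in this model. */
#define TOY_GRO_MAX_LEN 65535u

/* Try to append seg to held. Returns 1 if merged, 0 if seg must be
 * delivered on its own (other flow, gap in sequence, or too large). */
int toy_gro_merge(struct toy_seg *held, const struct toy_seg *seg)
{
    if (held->flow_id != seg->flow_id)
        return 0;
    if (held->seq + held->len != seg->seq) /* uint32_t wraps like TCP seq */
        return 0;
    if (held->len + seg->len > TOY_GRO_MAX_LEN)
        return 0;
    held->len += seg->len;
    return 1;
}

/* Coalesce segs[0..n) in place, in arrival order.
 * Returns the number of packets left after merging. */
size_t toy_gro_coalesce(struct toy_seg *segs, size_t n)
{
    size_t out = 0;
    size_t i;

    assert(segs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (out > 0 && toy_gro_merge(&segs[out - 1], &segs[i]))
            continue;          /* absorbed into the held packet */
        segs[out++] = segs[i]; /* starts a new held packet */
    }
    return out;
}
```

Feeding it three in-order segments of one flow (1460, 1460 and 500 bytes) followed by a segment of another flow yields two packets, the first with a length of 3420 bytes. The upper layers then process one large packet instead of three small ones, which is where the receive-side saving of GRO comes from.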
https://android.googlesource.com/kernel/common/+/refs/heads/android12-5.10/net/core/skbuff.c#4134
int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb)
{