Quick Notes: Linux Kernel Backlog

0. Preface

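Some things are easy to forget: I remember them one day and lose them completely two days later. Rather than keep scattered fragments, I am writing them all down here so I can look them up directly later.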

Everything below is based on the Linux 2.6.18 kernel.

1. The backlog argument passed to listen(), and net.core.somaxconn

To see what this argument means, start with the Linux socket documentation for listen:

man listen

   #include <sys/socket.h>

   int listen(int sockfd, int backlog);

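backlog is an int. For listen(), it is the maximum length of the queue that holds sockets which have already completed the three-way handshake, that is, fully established connections waiting to be accepted.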

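Applications usually choose the backlog value themselves. If the value passed in is greater than net.core.somaxconn, the kernel silently caps it at net.core.somaxconn. To follow the system setting instead of hard-coding a number, read /proc/sys/net/core/somaxconn.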

net/socket.c:

/*
 *  Perform a listen. Basically, we allow the protocol to do anything
 *  necessary for a listen, and if that works, we mark the socket as
 *  ready for listening.
 */

int sysctl_somaxconn = SOMAXCONN;

asmlinkage long sys_listen(int fd, int backlog)
{
    struct socket *sock;
    int err, fput_needed;

    if ((sock = sockfd_lookup_light(fd, &err, &fput_needed)) != NULL) {
        if ((unsigned) backlog > sysctl_somaxconn)
            backlog = sysctl_somaxconn;

        err = security_socket_listen(sock, backlog);
        if (!err)
            err = sock->ops->listen(sock, backlog);

        fput_light(sock->file, fput_needed);
    }
    return err;
}
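
The cast to unsigned in the comparison above has a side effect that is easy to miss: a negative backlog is reinterpreted as a very large unsigned value, so it is capped to somaxconn as well. The user-space sketch below mirrors that clamp for illustration only. The name clamp_backlog is hypothetical and is not a kernel function.

```c
/*
 * User-space mirror of the clamp performed in sys_listen().
 * clamp_backlog is a hypothetical helper used for illustration;
 * it does not exist in the kernel.
 */
int clamp_backlog(int backlog, int somaxconn)
{
    /* Same comparison as the kernel: a negative backlog becomes a
     * huge unsigned value, so it is capped too. */
    if ((unsigned) backlog > (unsigned) somaxconn)
        backlog = somaxconn;
    return backlog;
}
```

With somaxconn at 128, a request for 50 stays at 50, while requests for 1000 or -1 both end up as 128.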

For example, the widely used Netty framework (4.0), when started on Linux, reads /proc/sys/net/core/somaxconn directly and uses that value as the backlog argument when it calls the system's listen:

int somaxconn = 3072;
BufferedReader in = null;
try {
    in = new BufferedReader(new FileReader("/proc/sys/net/core/somaxconn"));
    somaxconn = Integer.parseInt(in.readLine());
    logger.debug("/proc/sys/net/core/somaxconn: {}", somaxconn);
} catch (Exception e) {
    // Failed to get SOMAXCONN
} finally {
    if (in != null) {
        try {
            in.close();
        } catch (Exception e) {
            // Ignored.
        }
    }
}

SOMAXCONN = somaxconn;
......
private volatile int backlog = NetUtil.SOMAXCONN;
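
The same idea can be sketched in C for a server that calls listen() directly. This is a minimal sketch, not Netty's code: read_int_file is a hypothetical helper name, and the fallback value is whatever the caller chooses.

```c
#include <stdio.h>

/*
 * Read a single integer from a procfs-style file.
 * Returns `fallback` if the file cannot be opened or parsed.
 * read_int_file is a hypothetical helper name.
 */
int read_int_file(const char *path, int fallback)
{
    FILE *fp = fopen(path, "r");
    int value;

    if (fp == NULL)
        return fallback;
    if (fscanf(fp, "%d", &value) != 1)
        value = fallback;   /* empty or non-numeric content */
    fclose(fp);
    return value;
}
```

A caller might then write listen(fd, read_int_file("/proc/sys/net/core/somaxconn", SOMAXCONN)), using the SOMAXCONN constant from <sys/socket.h> as the fallback.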

Because any larger backlog is capped at net.core.somaxconn, it is usually worth raising that value on a busy server.

To set it:

sysctl -w net.core.somaxconn=65535

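On a Linux machine with plenty of memory, 65535 is generally sufficient.

sysctl -w takes effect immediately but does not survive a reboot. To make the setting persistent, add net.core.somaxconn = 65535 to /etc/sysctl.conf and run sysctl -p to load it. In either case, restart your server application afterwards so that it calls listen again under the new limit.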

2. The length of the queue into which network devices place incoming packets: netdev_max_backlog

The explanation given in the kernel source, in sysctl.c:

number of unprocessed input packets before kernel starts dropping them, default 300

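As I understand it: when a network interface receives packets faster than the kernel can process them, this is the maximum number of packets allowed to wait in the input queue. Once that limit is exceeded, further packets are dropped.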

Where it takes effect, net/core/dev.c:

int netif_rx(struct sk_buff *skb)
{
    struct softnet_data *queue;
    unsigned long flags;

    /* if netpoll wants it, pretend we never saw it */
    if (netpoll_rx(skb))
        return NET_RX_DROP;

    if (!skb->tstamp.off_sec)
        net_timestamp(skb);

    /*
     * The code is rearranged so that the path is the most
     * short when CPU is congested, but is still operating.
     */
    local_irq_save(flags);
    queue = &__get_cpu_var(softnet_data);

    __get_cpu_var(netdev_rx_stat).total++;
    if (queue->input_pkt_queue.qlen <= netdev_max_backlog) {
        if (queue->input_pkt_queue.qlen) {
enqueue:
            dev_hold(skb->dev);
            __skb_queue_tail(&queue->input_pkt_queue, skb);
            local_irq_restore(flags);
            return NET_RX_SUCCESS;
        }

        netif_rx_schedule(&queue->backlog_dev);
        goto enqueue;
    }

    __get_cpu_var(netdev_rx_stat).dropped++;
    local_irq_restore(flags);

    kfree_skb(skb);
    return NET_RX_DROP;
}

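Reading through the code above gives a rough idea of when netdev_max_backlog comes into play: the check is made against the current CPU's input_pkt_queue, and a packet that arrives when that queue is already longer than netdev_max_backlog is counted as dropped and freed.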
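
The enqueue-or-drop decision in netif_rx can be reduced to a small user-space model. This is a simplified sketch that keeps only the counters: struct rx_queue_sim and rx_sim_enqueue are hypothetical names, and nothing here drains the queue the way the kernel's softirq processing would.

```c
/* Simplified stand-in for one CPU's input queue: counters only. */
struct rx_queue_sim {
    int  qlen;      /* packets currently queued */
    long total;     /* packets seen so far */
    long dropped;   /* packets dropped because the queue was full */
};

/* Returns 1 if the packet is queued, 0 if it is dropped. */
int rx_sim_enqueue(struct rx_queue_sim *q, int max_backlog)
{
    q->total++;
    if (q->qlen <= max_backlog) {   /* same test as netif_rx() */
        q->qlen++;
        return 1;
    }
    q->dropped++;
    return 0;
}
```

Because the test is <= rather than <, a queue that is never drained accepts max_backlog + 1 packets before the first drop. With max_backlog set to 3, ten back-to-back packets yield 4 queued and 6 dropped.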
