Exploring the Linux TCP listen backlog with Go and C examples

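I have been studying TCP connections in Go recently. The topic touches a lot of scattered material, so I am jotting down notes piece by piece for now and will organize them later.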

Contents

Theory

Testing

C version

Go version

Summary

References


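Theory

I have also written an earlier article on the TCP three-way handshake and data transfer, which is worth reading alongside this one.

In Go, TCP and UDP are both wrapped in the net package, and the underlying calls are Linux system calls. Here we focus on the backlog parameter of the listen function for TCP.

(Figure from https://blog.csdn.net/ordeder/article/details/21551567#commentBox)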

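From the figure above:

  • The kernel maintains two queues for the TCP connections between client and server: the incomplete connection queue and the completed connection queue.
  • When the client sends a SYN to the server, the conn is added to the incomplete connection queue.
  • When the server replies to the client with SYN+ACK, the state of the conn in the incomplete connection queue becomes SYN_RECV.
  • When the client sends the ACK, the server moves the conn from the incomplete connection queue to the completed connection queue, provided the completed queue is not full.
  • The accept function takes no part in the three-way handshake.
  • A call to accept takes one connection from the completed connection queue and removes it from the queue.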
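
The flow in the list above can be sketched as a toy model in Go. This is only an illustration of the bookkeeping, not kernel code: the listener type, its method names, and the capacity field are all made up for this example, and capacity is an abstract limit that is not claimed to equal backlog.

```go
package main

import "fmt"

// listener is a hypothetical toy model of the two kernel queues.
type listener struct {
	incomplete map[string]bool // connections that sent SYN (SYN_RECV)
	completed  []string        // established connections waiting for accept
	capacity   int             // abstract limit of the completed queue
}

func newListener(capacity int) *listener {
	return &listener{incomplete: map[string]bool{}, capacity: capacity}
}

// onSYN models the client's SYN: the conn enters the incomplete queue.
func (l *listener) onSYN(conn string) {
	l.incomplete[conn] = true
}

// onACK models the final ACK of the handshake. The conn moves to the
// completed queue only if it is known and the completed queue has room.
func (l *listener) onACK(conn string) bool {
	if !l.incomplete[conn] || len(l.completed) >= l.capacity {
		return false
	}
	delete(l.incomplete, conn)
	l.completed = append(l.completed, conn)
	return true
}

// accept removes and returns the oldest completed connection, if any.
func (l *listener) accept() (string, bool) {
	if len(l.completed) == 0 {
		return "", false
	}
	conn := l.completed[0]
	l.completed = l.completed[1:]
	return conn, true
}

func main() {
	l := newListener(1)
	l.onSYN("c1")
	l.onSYN("c2")
	fmt.Println(l.onACK("c1")) // true
	fmt.Println(l.onACK("c2")) // false: the completed queue is full
	conn, _ := l.accept()
	fmt.Println(conn)          // c1
	fmt.Println(l.onACK("c2")) // true: accept freed a slot
}
```

Note that accept never touches the incomplete queue in this model, which matches the point that accept is not part of the handshake.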

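Note: the figure says "the sum of the two queues does not exceed backlog". That is questionable, and it is not what my tests show either. On kernel 3.10.0, man listen says the following: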

(Figure: screenshot of the notes on backlog from man listen)

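So backlog is the length of the queue of fully established sockets waiting to be accepted by the application, not the number of incomplete connections. The maximum length of the incomplete connection queue is set by tcp_max_syn_backlog.

If backlog is small, many connections arrive at the same time, and the server does not call accept promptly, the client's connect call may block. The tests below show this in both C and Go: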

Test environment:
[root@localhost tcptest]# uname -a
Linux localhost.localdomain 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost tcptest]# cat /etc/redhat-release 
CentOS Linux release 7.6.1810 (Core) 

Before testing, it helps to know a few default network settings on Linux:

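  • /proc/sys/net/core/somaxconn: the upper limit for backlog (default 128). A backlog passed to listen that is larger than this value is silently truncated to it. In Go, the backlog passed to listen is the value read from this file.
  • /proc/sys/net/ipv4/tcp_syn_retries: number of SYN retransmissions, default 6.
  • /proc/sys/net/ipv4/tcp_synack_retries: number of SYN+ACK retransmissions, default 5.
  • /proc/sys/net/ipv4/tcp_max_syn_backlog: the tcp_max_syn_backlog mentioned above (incomplete connection queue), default 128.
  • /proc/sys/net/ipv4/tcp_abort_on_overflow: default 0. If set to 1, when the server's completed queue is full and a new connection cannot be added, the server immediately sends an RST to the client to declare the connection failed, and the client gets errno=104 (Connection reset by peer). If set to 0, when the completed queue is full the new connection is not added; instead the server retransmits SYN+ACK, the client assumes its previous ACK was lost and retransmits it, and if the completed queue is still full the server keeps retransmitting SYN+ACK. This continues until the connection finally makes it into the completed queue or the number of SYN+ACK retransmissions reaches tcp_synack_retries and the server gives up on the connection, in which case the client gets errno=110 (Connection timed out).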

To change these values temporarily, run:

sysctl -w net.core.somaxconn=2048
sysctl -w net.ipv4.tcp_syn_retries=7
sysctl -w net.ipv4.tcp_synack_retries=6
sysctl -w net.ipv4.tcp_abort_on_overflow=0
sysctl -w net.ipv4.tcp_max_syn_backlog=65535

To make the changes permanent, add the corresponding settings to /etc/sysctl.conf and reload them with sysctl -p.

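Testing

The goal is to test the case where backlog is small, many clients connect, and accept is not called in time, so the remaining connections fail. To make the failures show up immediately, tcp_abort_on_overflow is set to 1 for the tests below.

C version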

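//client.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <errno.h>
#include <time.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/socket.h>
#include <arpa/inet.h>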

#define SER "127.0.0.1"
#define PORT 8888
 
typedef struct sockaddr SA;
typedef struct sockaddr_in SA_IN;
 
char* getDateTime()
{
        static char nowtime[20];
        time_t rawtime;
        struct tm* ltime;
        time(&rawtime);
        ltime = localtime(&rawtime);
        strftime(nowtime, 20, "%Y-%m-%d %H:%M:%S", ltime);
        return nowtime;
}

void* loopprint(void* ptr)
{
    while(1){
        if(errno == 111)
            printf("errno:%d %s ",errno,strerror(errno));
    }
}
 
void* dosomething(void *ptr)
{
    int sockfd;
    SA_IN server,addr;
    int flags=1;

    socklen_t addrlen=sizeof(SA_IN);
    
    //socket1
    sockfd=socket(AF_INET,SOCK_STREAM,0);
    
    bzero(&server,sizeof(SA_IN));
    server.sin_family=AF_INET;
    server.sin_addr.s_addr=inet_addr(SER);
    server.sin_port=htons(PORT);
    
    if(setsockopt(sockfd,SOL_SOCKET,SO_REUSEADDR,&flags,sizeof(int)) ==-1)
     printf("setsockopt sockfd\n");
    
    
    if(connect(sockfd,(SA *)&server,sizeof(SA_IN)) == -1){
        printf("errno:%d %s ",errno,strerror(errno));
        int* cnt = (int *)ptr;
        char* nowtime = getDateTime();
        printf("%s thread:%d\n",nowtime,*cnt);
        return NULL;
    }else{
        printf("connect ok ");
    }
   
    int* cnt = (int *)ptr;
    char* nowtime = getDateTime();
    printf("%s thread:%d\n",nowtime,*cnt);

    while(1);
    close(sockfd);
    return NULL;
}

#define MAX 30
int main()
{
        pthread_t pt[MAX],ptt;
        int ret[MAX];
        int param[MAX];
        int i;

        // An attempt to catch ECONNREFUSED, which was never triggered
        //pthread_create(&ptt, NULL, loopprint, NULL);

        for(i=0;i < MAX;i++){
            sleep(1); // the result is the same with or without this delay
            param[i] = i;
            ret[i] = pthread_create(&pt[i], NULL, dosomething, &param[i]);
        }

        for(i=0;i < MAX;i++){
                if(ret[i] != 0){
                        printf("thread[%d] create fail\n",i);
                }
        }

        void *retval;
        for(i=0;i < MAX;i++){
                pthread_join(pt[i], &retval);
        }
        return 0;
}

//server.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <unistd.h>
#include <sys/socket.h>
#include <arpa/inet.h>

#define SER "127.0.0.1"
#define PORT 8888
 

 
typedef struct sockaddr SA;
typedef struct sockaddr_in SA_IN;
 
int main(int argc,char **argv)
{
 int sockfd;
 SA_IN server,addr;
 int flags=1;

 //socket1
 sockfd=socket(AF_INET,SOCK_STREAM,0);
 
 bzero(&server,sizeof(SA_IN));
 server.sin_family=AF_INET;
 server.sin_addr.s_addr=inet_addr(SER);
 server.sin_port=htons(PORT);
 
 if(setsockopt(sockfd,SOL_SOCKET,SO_REUSEADDR,&flags,sizeof(int)) ==-1)
     printf("setsockopt sockfd fail\n");
 
 if(bind(sockfd,(SA *)&server,sizeof(SA_IN)) == -1)
     printf("bind sockfd fail\n");
 if(listen(sockfd,20) == -1)
     printf("listen fail\n");
 
 // accept is deliberately left out so the completed queue fills up
 //struct sockaddr_in clnt_addr;
 //socklen_t clnt_addr_size = sizeof(clnt_addr);
 //int clnt_sock = accept(sockfd, (struct sockaddr*)&clnt_addr, &clnt_addr_size);
 
 
 while(1);
 
 close(sockfd);
 
 return 0;
}

Output:

connect ok 2019-11-13 14:30:06 thread:0
connect ok 2019-11-13 14:30:07 thread:1
connect ok 2019-11-13 14:30:08 thread:2
connect ok 2019-11-13 14:30:09 thread:3
connect ok 2019-11-13 14:30:10 thread:4
connect ok 2019-11-13 14:30:11 thread:5
connect ok 2019-11-13 14:30:12 thread:6
connect ok 2019-11-13 14:30:13 thread:7
connect ok 2019-11-13 14:30:14 thread:8
connect ok 2019-11-13 14:30:15 thread:9
connect ok 2019-11-13 14:30:16 thread:10
connect ok 2019-11-13 14:30:17 thread:11
connect ok 2019-11-13 14:30:18 thread:12
connect ok 2019-11-13 14:30:19 thread:13
connect ok 2019-11-13 14:30:20 thread:14
connect ok 2019-11-13 14:30:21 thread:15
connect ok 2019-11-13 14:30:22 thread:16
connect ok 2019-11-13 14:30:23 thread:17
connect ok 2019-11-13 14:30:24 thread:18
connect ok 2019-11-13 14:30:25 thread:19
connect ok 2019-11-13 14:30:26 thread:20
errno:104 Connection reset by peer 2019-11-13 14:30:28 thread:21
errno:104 Connection reset by peer 2019-11-13 14:30:29 thread:22
errno:104 Connection reset by peer 2019-11-13 14:30:29 thread:23
errno:104 Connection reset by peer 2019-11-13 14:30:31 thread:24
errno:104 Connection reset by peer 2019-11-13 14:30:32 thread:25
errno:104 Connection reset by peer 2019-11-13 14:30:33 thread:26
errno:104 Connection reset by peer 2019-11-13 14:30:33 thread:27
errno:104 Connection reset by peer 2019-11-13 14:30:35 thread:28
errno:104 Connection reset by peer 2019-11-13 14:30:36 thread:29

listen was called with a backlog of 20, yet 21 connections succeeded; the rest failed immediately.

Go version

First, Go sets the backlog for us:

func maxListenerBacklog() int {
	fd, err := open("/proc/sys/net/core/somaxconn")
	if err != nil {
		return syscall.SOMAXCONN
	}
	defer fd.close()
	l, ok := fd.readLine()
	if !ok {
		return syscall.SOMAXCONN
	}
	f := getFields(l)
	n, _, ok := dtoi(f[0])
	if n == 0 || !ok {
		return syscall.SOMAXCONN
	}
	// Linux stores the backlog in a uint16.
	// Truncate number to avoid wrapping.
	// See issue 5030.
	if n > 1<<16-1 {
		n = 1<<16 - 1
	}
	return n
}
[root@localhost tcptest-go]# cat /proc/sys/net/core/somaxconn 
128

To change this value, modify somaxconn directly. Here we set it to 20.

//client.go
package main

import (
        "log"
        "net"
        "time"
)

func establishConn(i int) net.Conn {
        //conn, err := net.Dial("tcp", ":8888")
        conn,err := net.DialTimeout("tcp",":8888",10000*time.Second)
        if err != nil {
                log.Printf("%d: dial error: %s", i, err)
                return nil
        }
        log.Println(i, ":connect to server ok")
        for{
            time.Sleep(time.Second)  
        }
        return conn
}

func main() {
        for i := 1; i <= 30; i++ {
                go establishConn(i)
        }

        time.Sleep(time.Second * 10000)
}

//server.go
package main

import (
        "log"
        "net"
        "time"
)

func main() {
        l, err := net.Listen("tcp", ":8888")
        if err != nil {
                log.Println("error listen:", err)
                return
        }
        defer l.Close()
        log.Println("listen ok")

        //var i int
        for {
                time.Sleep(time.Second * 100000)
                //log.Printf("%d: accept a new connection\n",i)
                //if _, err := l.Accept(); err != nil {
                //      log.Println("accept error:", err)
                //      break
                //}
                //i++
                //log.Printf("%d: accept a new connection\n", i)
        }
}

Output:

2019/11/13 15:44:30 30: dial error: dial tcp :8888: connect: connection reset by peer
2019/11/13 15:44:30 29: dial error: dial tcp :8888: connect: connection reset by peer
2019/11/13 15:44:30 28: dial error: dial tcp :8888: connect: connection reset by peer
2019/11/13 15:44:30 27: dial error: dial tcp :8888: connect: connection reset by peer
2019/11/13 15:44:30 26: dial error: dial tcp :8888: connect: connection reset by peer
2019/11/13 15:44:30 25: dial error: dial tcp :8888: connect: connection reset by peer
2019/11/13 15:44:30 24: dial error: dial tcp :8888: connect: connection reset by peer
2019/11/13 15:44:30 23: dial error: dial tcp :8888: connect: connection reset by peer
2019/11/13 15:44:30 22: dial error: dial tcp :8888: connect: connection reset by peer
2019/11/13 15:44:30 21 :connect to server ok
2019/11/13 15:44:30 20 :connect to server ok
2019/11/13 15:44:30 19 :connect to server ok
2019/11/13 15:44:30 18 :connect to server ok
2019/11/13 15:44:30 17 :connect to server ok
2019/11/13 15:44:30 16 :connect to server ok
2019/11/13 15:44:30 15 :connect to server ok
2019/11/13 15:44:30 14 :connect to server ok
2019/11/13 15:44:30 13 :connect to server ok
2019/11/13 15:44:30 12 :connect to server ok
2019/11/13 15:44:30 11 :connect to server ok
2019/11/13 15:44:30 10 :connect to server ok
2019/11/13 15:44:30 9 :connect to server ok
2019/11/13 15:44:30 8 :connect to server ok
2019/11/13 15:44:30 7 :connect to server ok
2019/11/13 15:44:30 6 :connect to server ok
2019/11/13 15:44:30 5 :connect to server ok
2019/11/13 15:44:30 4 :connect to server ok
2019/11/13 15:44:30 3 :connect to server ok
2019/11/13 15:44:30 2 :connect to server ok
2019/11/13 15:44:30 1 :connect to server ok

Again the listen backlog is 20, yet 21 connections succeeded; the rest failed immediately.

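Summary

  1. Completed queue size = min(backlog, somaxconn).
  2. In practice, setting somaxconn to a reasonable value and calling accept promptly is enough for typical high-concurrency workloads.
  3. Open question: in the examples above backlog is 20 (I also tried 10, and 10+1 connections succeeded), so why do backlog+1 connections succeed?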
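
One plausible explanation for the backlog+1 result is the way the kernel decides that the completed queue is full. As far as I can tell, the check (sk_acceptq_is_full in include/net/sock.h) compares the current queue length with backlog using a strict "greater than", so a new connection is still admitted when the queue already holds exactly backlog entries. The Go sketch below only simulates that comparison under this assumption; acceptQueueFull and maxEstablished are illustrative names, not kernel or library functions.

```go
package main

import "fmt"

// acceptQueueFull mimics the assumed kernel check: the queue counts as
// full only when it holds MORE than backlog connections.
func acceptQueueFull(queued, backlog int) bool {
	return queued > backlog
}

// maxEstablished simulates attempts connections arriving while the
// server never calls accept, and returns how many get queued.
func maxEstablished(backlog, attempts int) int {
	queued := 0
	for i := 0; i < attempts; i++ {
		if acceptQueueFull(queued, backlog) {
			continue // connection rejected
		}
		queued++
	}
	return queued
}

func main() {
	fmt.Println(maxEstablished(20, 30)) // 21
	fmt.Println(maxEstablished(10, 30)) // 11
}
```

With a ">=" comparison the same simulation would stop at exactly backlog, which is not what the tests above observed.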

References

TCP Socket Programming in Go
Understanding the Linux TCP Backlog in Depth
Linux SYN Backlog and somaxconn
Linux errno Definitions
