FD_SET with a file descriptor of 1024 or higher triggers a coredump


During development I ran into a coredump, and eventually traced it to FD_SET being called with a file descriptor greater than 1023.

1. Write an example to trigger it

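#include <stdio.h>      /* printf */
#include <stdlib.h>     /* atoi */
#include <unistd.h>     /* sleep */
#include <pthread.h>    /* pthread_create */
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h> /* fd_set, FD_ZERO, FD_SET */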

int do_loop(){
        int buf[10];
        for (int i = 0;i<10;i++){
                buf[i] = 1;
        }
        return 0;
}

void* do_func(void* arg){

        while(1){
                do_loop();
        }
        return NULL;
}


int main(int argc, char *argv[])
{

        fd_set rdfds;
        FD_ZERO(&rdfds);                
        unsigned int n = 0;
        n = atoi(argv[1]); /* the command-line argument sets how many descriptors (0..n-1) to add */
        printf("n = %u\n",n);
        for(unsigned int i = 0; i < n; i++)
        {
                FD_SET(i, &rdfds);
        }

        printf("to select\n");
        pthread_t thread;
        pthread_create(&thread,NULL,do_func,NULL);
        while(1){
                sleep(5);
        }
        return 0;
}


2. Run it

[root@xxx-14121 test]# ./a.out 1
n = 1
to select
^C
[root@xxx-14121 test]# ./a.out 1024
n = 1024
to select
^C
[root@xxx-14121 test]#
[root@xxx-14121 test]#
[root@xxx-14121 test]# ./a.out 1027
n = 1027
to select
^C
[root@xxx-14121 test]# ./a.out 1100
n = 1100
to select
^C
[root@xxx-14121 test]# ./a.out 1200
n = 1200
Segmentation fault (core dumped)

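This kind of bug is particularly nasty: going past 1023 does not crash every time. The runs with 1027 and 1100 still wrote past the end of rdfds, but they did not happen to corrupt anything that caused an immediate fault; the crash only showed up at around 1200.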

3. Debugging with gdb

gdb ./a.out /tmp/core-a.out-1397671-1691736607
GNU gdb (GDB) openEuler 11.1-1.oe2203
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-openEuler-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./a.out...
[New LWP 1397671]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib64/libthread_db.so.1".
Core was generated by `▒▒▒▒▒▒▒▒▒▒▒▒▒'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000401231 in main (argc=2, argv=0x7ffc27a031e8) at fd_set.c:36
36                      FD_SET(i, &rdfds);
(gdb) bt
#0  0x0000000000401231 in main (argc=2, argv=0x7ffc27a031e8) at fd_set.c:36

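Here we got lucky: gdb at least reports line 36, the FD_SET line. In my real development environment the backtrace was nowhere near this helpful.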

4. Definition of FD_SET in the kernel

static __inline__ void __FD_SET(unsigned long fd, __kernel_fd_set *fdsetp)
{
    unsigned long _tmp = fd / __NFDBITS;
    unsigned long _rem = fd % __NFDBITS;
    fdsetp->fds_bits[_tmp] |= (1UL<<_rem);
}
 
#define __NFDBITS    (8 * sizeof(unsigned long))
 
typedef struct {
    unsigned long fds_bits [__FDSET_LONGS];
} __kernel_fd_set;
 
#define __FDSET_LONGS   (__FD_SETSIZE/__NFDBITS)
 
#define __FD_SETSIZE    1024

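On a 64-bit machine, __NFDBITS is the constant 64 (8 * sizeof(unsigned long)).
__FDSET_LONGS is therefore the constant 1024/64 = 16 on a 64-bit machine, so the fds_bits array has 16 elements (indices 0 to 15).
When fd is greater than 1023, fd / __NFDBITS is 16 or more, the write lands past the end of the array, and all kinds of strange problems follow.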

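Summary: in FD_SET(i, &rdfds), i must not be greater than 1023, otherwise the behavior is unpredictable. If a process has opened more than 1023 file descriptors and then uses select to monitor them, it will fall into this trap.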

http://www.biegral.com/index/View/03974639-1614-4a5a-bec7-dbb72d19ca24
