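Once the descriptor has been added to epoll, the worker process enters ngx_epoll_process_events, where epoll_wait blocks until a client connection request triggers an event. Whenever nginx handles read or write events, a variable named instance shows up. What is it actually for?
Let's first look at what the epoll man page has to say: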
*If there is a large amount of I/O space, it is possible that by trying to drain it the other files will not get processed causing starvation.
This is not specific to epoll. The solution is to maintain a ready list and mark the file descriptor as ready in its associated data structure, thereby allowing the application to remember which files need to be processed but still round robin amongst all the ready files. This also supports ignoring subsequent events you receive for fd’s that are already ready.
o If using an event cache…
If you use an event cache or store all the fd’s returned from epoll_wait(2), then make sure to provide a way to mark its closure dynamically (ie- caused by a previous event’s processing). Suppose you receive 100 events from epoll_wait(2), and in event #47 a condition causes event #13 to be closed. If you remove the structure and close() the fd for event #13, then your event cache might still say there are events waiting for that fd causing confusion.
One solution for this is to call, during the processing of event 47, epoll_ctl(EPOLL_CTL_DEL) to delete fd 13 and close(), then mark its associated data structure as removed and link it to a cleanup list. If you find another event for fd 13 in your batch processing, you will discover the fd had been previously removed and there will be no confusion.*
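Roughly, this means:
If one fd has a large amount of pending I/O, trying to drain it completely can keep the other ready fds from being processed, so they starve.
The fix is to keep a list of ready fds, mark each fd as ready in its associated data structure, and service the ready fds round-robin. This also makes it possible to ignore later events for fds that are already marked ready.
If you use an event cache, or store all the events returned by epoll_wait, you need a way to mark an fd as closed dynamically. Suppose epoll_wait returns 100 events, and handling event #47 causes the fd of event #13 to be closed. If you free its structure and close() the fd, the event cache may still hold events pending for that fd, which causes confusion.
One solution is to call epoll_ctl(EPOLL_CTL_DEL) while processing event #47 to remove fd 13 from epoll and close() it, then mark its data structure as removed and link it to a cleanup list. If another event for fd 13 turns up later in the same batch, you will see that the fd has already been removed, and there is no confusion.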
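The man page's suggestion can be sketched in a few lines of C. This is only an illustration under stated assumptions: conn_t, conn_mark_removed, process_batch and the demo_* names are hypothetical and appear in neither nginx nor the kernel API, and the real epoll_ctl(EPOLL_CTL_DEL) and close() calls are left as comments so the sketch stays self-contained. In the demo the event for fd 47 is processed first and closes fd 13, whose later event in the same batch is then skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-fd record; not an nginx or kernel type. */
typedef struct conn {
    int          fd;            /* -1 once closed */
    int          removed;       /* set when closed during the current batch */
    struct conn *next_cleanup;  /* link in the cleanup list */
} conn_t;

static conn_t *cleanup_list = NULL;

/* Called from inside an event handler when another connection must go.
 * A real program would also call epoll_ctl(EPOLL_CTL_DEL) and close() here. */
static void conn_mark_removed(conn_t *c)
{
    if (c->removed) {
        return;
    }
    c->fd = -1;
    c->removed = 1;
    c->next_cleanup = cleanup_list;
    cleanup_list = c;
}

/* Process one batch of events; returns how many were actually handled. */
static int process_batch(conn_t **batch, int n, void (*handler)(conn_t *))
{
    int handled = 0;

    for (int i = 0; i < n; i++) {
        if (batch[i]->removed) {
            continue;           /* stale event for an fd closed earlier */
        }
        handler(batch[i]);
        handled++;
    }

    /* End of batch: nothing can reference the records any more, so unlink
     * them (a real program would free them at this point). */
    while (cleanup_list != NULL) {
        conn_t *c = cleanup_list;
        cleanup_list = c->next_cleanup;
        c->next_cleanup = NULL;
    }
    return handled;
}

static conn_t *demo_victim;

/* Demo handler: processing the event for fd 47 closes the victim. */
static void demo_handler(conn_t *c)
{
    if (c->fd == 47) {
        conn_mark_removed(demo_victim);
    }
}

/* Two events in one batch: fd 47 first, then fd 13 (closed by fd 47). */
int demo_cleanup_batch(void)
{
    conn_t  c47 = { 47, 0, NULL };
    conn_t  c13 = { 13, 0, NULL };
    conn_t *batch[] = { &c47, &c13 };

    demo_victim = &c13;
    int handled = process_batch(batch, 2, demo_handler);
    assert(c13.fd == -1 && c13.removed);
    return handled;
}
```

Calling demo_cleanup_batch() handles only the event for fd 47. The cost of this approach is the extra removed flag and the cleanup list that must be walked after every batch.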
nginx solves this problem in a more ingenious way: the instance field of ngx_event_t.
typedef union epoll_data {
    void        *ptr;
    int          fd;
    uint32_t     u32;
    uint64_t     u64;
} epoll_data_t;

struct epoll_event {
    uint32_t     events;
    epoll_data_t data;
};
ngx_connection_t *
ngx_get_connection(ngx_socket_t s, ngx_log_t *log)
{
    ngx_uint_t         instance;
    ngx_event_t       *rev, *wev;
    ngx_connection_t  *c;

    /* ... excerpt: c is taken from the free connections list here ... */
    rev = c->read;
    wev = c->write;
    /* ... */

    instance = rev->instance;
    ngx_memzero(rev, sizeof(ngx_event_t));
    ngx_memzero(wev, sizeof(ngx_event_t));

    // when a free connection object is handed out, set instance to !instance
    rev->instance = !instance;
    wev->instance = !instance;

    /* ... */
    return c;
}
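ngx_event_t declares instance as a 1-bit field. Each time a connection is taken from the free connections list, the bit is set to the negation of its previous value. Because of pointer alignment, the lowest bit of a connection's address is normally 0, so nginx stores the read/write event's instance flag in that bit of the pointer it hands to epoll (data.ptr). When an event comes back, comparing the bit carried in the pointer with the event's current instance tells whether the event is still valid. nginx therefore does not need the repeated epoll_ctl(EPOLL_CTL_DEL) calls and cleanup-list bookkeeping described above; a single flag bit is enough, which is more efficient.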
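The pointer trick itself can be tried in isolation. The sketch below is not nginx code: toy_conn_t, tag_ptr, tag_instance, tag_conn and tag_roundtrip_ok are hypothetical names, and it assumes a mainstream platform where malloc returns memory aligned to at least 2 bytes and where a pointer survives a round trip through uintptr_t.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for ngx_connection_t. */
typedef struct {
    int fd;
} toy_conn_t;

/* Pack a 1-bit flag into the (always zero) lowest bit of an aligned pointer. */
static void *tag_ptr(toy_conn_t *c, unsigned instance)
{
    assert(((uintptr_t) c & 1) == 0);   /* alignment leaves bit 0 free */
    return (void *) ((uintptr_t) c | (uintptr_t) (instance & 1));
}

/* Recover the flag from a tagged pointer. */
static unsigned tag_instance(void *p)
{
    return (unsigned) ((uintptr_t) p & 1);
}

/* Recover the original pointer by clearing the lowest bit. */
static toy_conn_t *tag_conn(void *p)
{
    return (toy_conn_t *) ((uintptr_t) p & ~(uintptr_t) 1);
}

/* Returns 1 if both flag values survive the pack/unpack round trip. */
int tag_roundtrip_ok(void)
{
    toy_conn_t *c = malloc(sizeof *c);
    int         ok = 1;

    if (c == NULL) {
        return 0;
    }
    for (unsigned inst = 0; inst <= 1; inst++) {
        void *p = tag_ptr(c, inst);
        ok = ok && tag_conn(p) == c && tag_instance(p) == inst;
    }
    free(c);
    return ok;
}
```

The appeal of this design is that epoll_data has room for only one value per registration: by folding the flag into data.ptr, both the connection and its generation bit travel through the kernel and come back together from epoll_wait.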
static ngx_int_t
ngx_epoll_add_connection(ngx_connection_t *c)
{
    struct epoll_event  ee;

    ee.events = EPOLLIN|EPOLLOUT|EPOLLET|EPOLLRDHUP;
    ee.data.ptr = (void *) ((uintptr_t) c | c->read->instance);

    ngx_log_debug2(NGX_LOG_DEBUG_EVENT, c->log, 0,
                   "epoll add connection: fd:%d ev:%08XD", c->fd, ee.events);

    if (epoll_ctl(ep, EPOLL_CTL_ADD, c->fd, &ee) == -1) {
        ngx_log_error(NGX_LOG_ALERT, c->log, ngx_errno,
                      "epoll_ctl(EPOLL_CTL_ADD, %d) failed", c->fd);
        return NGX_ERROR;
    }

    c->read->active = 1;
    c->write->active = 1;

    return NGX_OK;
}
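ngx_epoll_process_events is the event-processing function of each worker process. It waits for events to arrive and checks whether each returned read/write event has already become invalid.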
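The staleness check can be modeled without epoll at all. The following is a toy simulation, not nginx code: sim_conn_t and the sim_* functions are hypothetical names that only imitate the flip in ngx_get_connection, the tagged pointer built in ngx_epoll_add_connection, and the comparison made when an event is processed. It covers the case an fd == -1 test alone would miss: a connection slot that is closed and then reused within the same batch of events.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical miniature of ngx_event_t / ngx_connection_t. */
typedef struct {
    unsigned instance:1;
} sim_event_t;

typedef struct {
    int         fd;
    sim_event_t read;
} sim_conn_t;

/* Imitates ngx_get_connection: reuse the slot and flip the instance bit. */
static void sim_get(sim_conn_t *c, int fd)
{
    unsigned instance = c->read.instance;

    c->fd = fd;
    c->read.instance = !instance;
}

static void sim_close(sim_conn_t *c)
{
    c->fd = -1;
}

/* Imitates the value stored in ee.data.ptr at registration time. */
static void *sim_register(sim_conn_t *c)
{
    return (void *) ((uintptr_t) c | c->read.instance);
}

/* Imitates the check made for every event returned by epoll_wait. */
static int sim_is_stale(void *data)
{
    unsigned    instance = (unsigned) ((uintptr_t) data & 1);
    sim_conn_t *c = (sim_conn_t *) ((uintptr_t) data & ~(uintptr_t) 1);

    return c->fd == -1 || c->read.instance != instance;
}

/* Returns 1 if an event queued before close-and-reuse is seen as stale. */
int sim_stale_after_reuse(void)
{
    sim_conn_t slot = { -1, { 0 } };
    void      *old;

    sim_get(&slot, 13);
    old = sim_register(&slot);     /* what epoll would hand back later */
    if (sim_is_stale(old)) {
        return 0;                  /* a live connection must not be stale */
    }

    sim_close(&slot);
    if (!sim_is_stale(old)) {
        return 0;                  /* closed: caught by fd == -1 */
    }

    sim_get(&slot, 13);            /* slot reused, same fd number again */
    /* fd is valid again, so only the instance bit exposes the old event */
    return slot.fd != -1 && sim_is_stale(old);
}
```

In this model sim_stale_after_reuse() returns 1: after the slot is reused, fd is no longer -1, and only the mismatch between the bit saved in the pointer and the slot's current instance identifies the old event as stale.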
ngx_epoll_process_events
events = epoll_wait(ep, event_list, (int) nevents, timer);
for (i = 0; i < events; i++) {
    c = event_list[i].data.ptr;
    // extract the instance bit from the data.ptr pointer
    instance = (uintptr_t) c & 1;
    c = (ngx_connection_t *) ((uintptr_t) c & (uintptr_t) ~1);
    rev = c->read;
    // check whether the event is stale; if so, skip it and do none of the work below
    if (c->fd == -1 || rev->instance != instance) {
        /*
         * the stale event from a file descriptor
         * that was just closed in this iteration
         */
        ngx_log_debug1(NGX_LOG_DEBUG_EVENT, cycle->log, 0,
                       "epoll: stale event %p", c);
        continue;
    }