最近想学习一下libevent,就先翻译一下libevent的官方文档吧.
英文原文链接:http://www.wangafu.net/~nickm/libevent-book/01_intro.html
大部分编程初学者都是从阻塞IO开始的。何谓阻塞IO?,即你进行一个IO调用时,除非这个操作完成,或者超时网络协议栈放弃了,否则这个调用是不返回的.比如你对TCP连接调用“connect()”时,你的操作系统将发送一个SYN包给TCP连接的对端,除非收到对端发送的SYN ACK包或者是超时了,否则connect()将不会返回。
这里是一个简单的使用阻塞网络调用的客户端例子.客户端连接到www.google.com,发起一个HTTP请求,把响应打印到标准输出.
/* For sockaddr_in */ #include <netinet/in.h> /* For socket functions */ #include <sys/socket.h> /* For gethostbyname */ #include <netdb.h> #include <unistd.h> #include <string.h> #include <stdio.h> int main(int c, char **v) { const char query[] = "GET / HTTP/1.0\r\n" "Host: www.google.com\r\n" "\r\n"; const char hostname[] = "www.google.com"; struct sockaddr_in sin; struct hostent *h; const char *cp; int fd; ssize_t n_written, remaining; char buf[1024]; /* Look up the IP address for the hostname. Watch out; this isn't threadsafe on most platforms. */ h = gethostbyname(hostname); if (!h) { fprintf(stderr, "Couldn't lookup %s: %s", hostname, hstrerror(h_errno)); return 1; } if (h->h_addrtype != AF_INET) { fprintf(stderr, "No ipv6 support, sorry."); return 1; } /* Allocate a new socket */ fd = socket(AF_INET, SOCK_STREAM, 0); if (fd < 0) { perror("socket"); return 1; } /* Connect to the remote host. */ sin.sin_family = AF_INET; sin.sin_port = htons(80); sin.sin_addr = *(struct in_addr*)h->h_addr; if (connect(fd, (struct sockaddr*) &sin, sizeof(sin))) { perror("connect"); close(fd); return 1; } /* Write the query. */ /* XXX Can send succeed partially? */ cp = query; remaining = strlen(query); while (remaining) { n_written = send(fd, cp, remaining, 0); if (n_written <= 0) { perror("send"); return 1; } remaining -= n_written; cp += n_written; } /* Get an answer back. */ while (1) { ssize_t result = recv(fd, buf, sizeof(buf), 0); if (result == 0) { break; } else if (result < 0) { perror("recv"); close(fd); return 1; } fwrite(buf, 1, result, stdout); } close(fd); return 0; }
上面代码中所有的网络调用都是阻塞的,gethostbyname()在解析www.google.com成功或失败前不会返回,connect()在链路建立链接之前不会返回,recv()在接收到数据或是链接关闭请求之前不会返回,send()在数据发送到内核的写缓冲区之前不会返回.
阻塞IO也不是完全有害.如果你的程序在上述函数阻塞期间没啥想干的,用阻塞IO也没啥问题.但是想象一下,你现在需要写一个同时处理多个链接的程序,比如你要从2条链接里读数据,但是你并不知道哪条链接会先来数据.你可以写这样一个程序:
Bad Example
1 /* This won't work. */ 2 char buf[1024]; 3 int i, n; 4 while (i_still_want_to_read()) { 5 for (i=0; i<n_sockets; ++i) { 6 n = recv(fd[i], buf, sizeof(buf), 0); 7 if (n==0) 8 handle_close(fd[i]); 9 else if (n<0) 10 handle_error(fd[i], errno); 11 else 12 handle_input(fd[i], buf, n); 13 } 14 }
为什么说这个程序很不好呢?因为如果fd[2]上数据先来了,程序在fd[0]和fd[1]上数据来了并处理完成之前,根本就不会去尝试从fd[2]读数据,因为这时候还阻塞在recv(fd[0], buf, sizeof(buf),0)这里呢。
有时候我们通过多线程或者多进程解决这个问题.最简单的一种处理方式就是每一个链接用一个进程(或线程)来处理.由于每条链接都有自己的进程,所以一条链接上的阻塞IO阻塞了并不会影响到别的链接处理进程.
下面是另一个例子。这是一个比较繁琐的服务器程序,在端口40713上等待tcp链接,从到来的数据中每次读一行,并将这一行的ROT13加密(其实就是简单的字符变换,比如把'a'变成'n',‘b’变成'o')数据输出.程序用了UNIX下的fork()来为每一条链接创建一个进程.
Example: Forking ROT13 serve
/* For sockaddr_in */ #include <netinet/in.h> /* For socket functions */ #include <sys/socket.h> #include <unistd.h> #include <string.h> #include <stdio.h> #include <stdlib.h> #define MAX_LINE 16384 char rot13_char(char c) { /* We don't want to use isalpha here; setting the locale would change * which characters are considered alphabetical. */ if ((c >= 'a' && c <= 'm') || (c >= 'A' && c <= 'M')) return c + 13; else if ((c >= 'n' && c <= 'z') || (c >= 'N' && c <= 'Z')) return c - 13; else return c; } void child(int fd) { char outbuf[MAX_LINE+1]; size_t outbuf_used = 0; ssize_t result; while (1) { char ch; result = recv(fd, &ch, 1, 0); if (result == 0) { break; } else if (result == -1) { perror("read"); break; } /* We do this test to keep the user from overflowing the buffer. */ if (outbuf_used < sizeof(outbuf)) { outbuf[outbuf_used++] = rot13_char(ch); } if (ch == '\n') { send(fd, outbuf, outbuf_used, 0); outbuf_used = 0; continue; } } } void run(void) { int listener; struct sockaddr_in sin; sin.sin_family = AF_INET; sin.sin_addr.s_addr = 0; sin.sin_port = htons(40713); listener = socket(AF_INET, SOCK_STREAM, 0); #ifndef WIN32 { int one = 1; setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)); } #endif if (bind(listener, (struct sockaddr*)&sin, sizeof(sin)) < 0) { perror("bind"); return; } if (listen(listener, 16)<0) { perror("listen"); return; } while (1) { struct sockaddr_storage ss; socklen_t slen = sizeof(ss); int fd = accept(listener, (struct sockaddr*)&ss, &slen); if (fd < 0) { perror("accept"); } else { if (fork() == 0) { child(fd); exit(0); } } } } int main(int c, char **v) { run(); return 0; }
那么我们现在对多条链接的处理有了完美的解决方案吗?我是不是可以不写这本书而去干点别的了呢?不是这样的。首先,创建进程(线程)在某些平台上会是一笔不小的开销。在实际应用中,你可能更想使用一个线程池来代替它。但是从根本上讲,多线程并不能够达到你所期望的那种扩展性。如果你程序需要同时处理成千上万个连接,对于每个CPU仅能处理很少的线程的情况,处理成千上万个线程效率并不高。
如果线程不是处理多条链接的答案,那什么才是呢?在Unix范例中,你可以设置socket为noblocking(非阻塞)。Unix中完成这个设置的调用如下:
fcntl(fd, F_SETFL, O_NONBLOCK);
fd是代表socket的文件描述符.一旦你设置fd(也就是socket)为非阻塞的,你在fd上进行的网络调用要么立刻完成,要么返回一个特定的错误码,告诉你“我现在没法处理,再试一遍吧”.基于此,我们的处理两条socket的程序可以这样写:
Bad Example: busy-polling all sockets
/* This will work, but the performance will be unforgivably bad. */ int i, n; char buf[1024]; for (i=0; i < n_sockets; ++i) fcntl(fd[i], F_SETFL, O_NONBLOCK); while (i_still_want_to_read()) { for (i=0; i < n_sockets; ++i) { n = recv(fd[i], buf, sizeof(buf), 0); if (n == 0) { handle_close(fd[i]); } else if (n < 0) { if (errno == EAGAIN) ; /* The kernel didn't have any data for us to read. */ else handle_error(fd[i], errno); } else { handle_input(fd[i], buf, n); } } }
现在我们使用了非阻塞socket,上述代码可以工作...但也仅限于此了。上面代码性能很差,有两点原因:一.如果两个连接上都没有数据,那么就相当于在执行一个死循环,占据了所有的cpu。二.如果你通过这种方法处理多条以上的链接,那么对每一条链接来说,都需要执行内核调用(译者注:也就是上述代码中的recv),不管链接上有没有数据.所以说我们需要的是这样一种机制来告诉内核:“一直等着,直到某条链接上有数据了就告诉我是哪条链接上来数据了”。
下面是一个使用select的例子:
Example: Using select
/* If you only have a couple dozen fds, this version won't be awful */ fd_set readset; int i, n; char buf[1024]; while (i_still_want_to_read()) { int maxfd = -1; FD_ZERO(&readset); /* Add all of the interesting fds to readset */ for (i=0; i < n_sockets; ++i) { if (fd[i]>maxfd) maxfd = fd[i]; FD_SET(fd[i], &readset); } /* Wait until one or more fds are ready to read */ select(maxfd+1, &readset, NULL, NULL, NULL); /* Process all of the fds that are still set in readset */ for (i=0; i < n_sockets; ++i) { if (FD_ISSET(fd[i], &readset)) { n = recv(fd[i], buf, sizeof(buf), 0); if (n == 0) { handle_close(fd[i]); } else if (n < 0) { if (errno == EAGAIN) ; /* The kernel didn't have any data for us to read. */ else handle_error(fd[i], errno); } else { handle_input(fd[i], buf, n); } } } }
下面是一个使用select()重新实现ROT13 server的例子
Example: select()-based ROT13 server
/* For sockaddr_in */ #include <netinet/in.h> /* For socket functions */ #include <sys/socket.h> /* For fcntl */ #include <fcntl.h> /* for select */ #include <sys/select.h> #include <assert.h> #include <unistd.h> #include <string.h> #include <stdlib.h> #include <stdio.h> #include <errno.h> #define MAX_LINE 16384 char rot13_char(char c) { /* We don't want to use isalpha here; setting the locale would change * which characters are considered alphabetical. */ if ((c >= 'a' && c <= 'm') || (c >= 'A' && c <= 'M')) return c + 13; else if ((c >= 'n' && c <= 'z') || (c >= 'N' && c <= 'Z')) return c - 13; else return c; } struct fd_state { char buffer[MAX_LINE]; size_t buffer_used; int writing; size_t n_written; size_t write_upto; }; struct fd_state * alloc_fd_state(void) { struct fd_state *state = malloc(sizeof(struct fd_state)); if (!state) return NULL; state->buffer_used = state->n_written = state->writing = state->write_upto = 0; return state; } void free_fd_state(struct fd_state *state) { free(state); } void make_nonblocking(int fd) { fcntl(fd, F_SETFL, O_NONBLOCK); } int do_read(int fd, struct fd_state *state) { char buf[1024]; int i; ssize_t result; while (1) { result = recv(fd, buf, sizeof(buf), 0); if (result <= 0) break; for (i=0; i < result; ++i) { if (state->buffer_used < sizeof(state->buffer)) state->buffer[state->buffer_used++] = rot13_char(buf[i]); if (buf[i] == '\n') { state->writing = 1; state->write_upto = state->buffer_used; } } } if (result == 0) { return 1; } else if (result < 0) { if (errno == EAGAIN) return 0; return -1; } return 0; } int do_write(int fd, struct fd_state *state) { while (state->n_written < state->write_upto) { ssize_t result = send(fd, state->buffer + state->n_written, state->write_upto - state->n_written, 0); if (result < 0) { if (errno == EAGAIN) return 0; return -1; } assert(result != 0); state->n_written += result; } if (state->n_written == state->buffer_used) state->n_written = state->write_upto = state->buffer_used = 0; state->writing = 0; return 0; } void run(void) { int listener; struct fd_state *state[FD_SETSIZE]; struct sockaddr_in sin; int i, maxfd; fd_set readset, writeset, exset; sin.sin_family = AF_INET; sin.sin_addr.s_addr = 0; sin.sin_port = htons(40713); for (i = 0; i < FD_SETSIZE; ++i) state[i] = NULL; listener = socket(AF_INET, SOCK_STREAM, 0); make_nonblocking(listener); #ifndef WIN32 { int one = 1; setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)); } #endif if (bind(listener, (struct sockaddr*)&sin, sizeof(sin)) < 0) { perror("bind"); return; } if (listen(listener, 16)<0) { perror("listen"); return; } FD_ZERO(&readset); FD_ZERO(&writeset); FD_ZERO(&exset); while (1) { maxfd = listener; FD_ZERO(&readset); FD_ZERO(&writeset); FD_ZERO(&exset); FD_SET(listener, &readset); for (i=0; i < FD_SETSIZE; ++i) { if (state[i]) { if (i > maxfd) maxfd = i; FD_SET(i, &readset); if (state[i]->writing) { FD_SET(i, &writeset); } } } if (select(maxfd+1, &readset, &writeset, &exset, NULL) < 0) { perror("select"); return; } if (FD_ISSET(listener, &readset)) { struct sockaddr_storage ss; socklen_t slen = sizeof(ss); int fd = accept(listener, (struct sockaddr*)&ss, &slen); if (fd < 0) { perror("accept"); } else if (fd > FD_SETSIZE) { close(fd); } else { make_nonblocking(fd); state[fd] = alloc_fd_state(); assert(state[fd]);/*XXX*/ } } for (i=0; i < maxfd+1; ++i) { int r = 0; if (i == listener) continue; if (FD_ISSET(i, &readset)) { r = do_read(i, state[i]); } if (r == 0 && FD_ISSET(i, &writeset)) { r = do_write(i, state[i]); } if (r) { free_fd_state(state[i]); state[i] = NULL; close(i); } } } } int main(int c, char **v) { setvbuf(stdout, NULL, _IONBF, 0); run(); return 0; }
用select()就解决了我们之前提到的问题吗?不,还没完呢.当socket的数量很大时,select()调用的性能会很差.因为生成与读取select()的位数组的时间正比于你向select()提供的最大的fd值。
不同的操作系统提供了不同的select()的替代函数。包括poll(),epoll(),kqueue(),evports以及/dev/poll.上述所有函数都比select()的性能要好,并且除了poll(),对增加一条socket,移除一条socket,通知一条socket已经做好IO准备而言,这些函数的时间复杂度都是O(1)。
然而不幸地是,这些更为有效的接口并没有一个统一的标准.Linux有epoll(),BSDs(包括Darwin)有kqueue(),Solaris有evports和/dev/poll...而且这些操作系统没一个自身之外的系统的上述接口. 所以你想写一个可移植(跨平台)的高性能异步应用程序的话,你就需要一个包含上述所有接口的抽象。
这正是Libevent API可以为你提供的最基本的功能.Libevent根据你的系统,选择最高效的select()的替代函数,并提供统一的接口.
下面是异步ROT13 server的另一个版本.这一次我们使用Libevent 2替代select().请注意现在fd_sets没啦,取而代之的是,我们使用结构event_base与events进行关联以及解除关联.event_base根据select(),poll(),epoll(),kqueue()等实现.
Example: A low-level ROT13 server with Libevent
/* For sockaddr_in */ #include <netinet/in.h> /* For socket functions */ #include <sys/socket.h> /* For fcntl */ #include <fcntl.h> #include <event2/event.h> #include <assert.h> #include <unistd.h> #include <string.h> #include <stdlib.h> #include <stdio.h> #include <errno.h> #define MAX_LINE 16384 void do_read(evutil_socket_t fd, short events, void *arg); void do_write(evutil_socket_t fd, short events, void *arg); char rot13_char(char c) { /* We don't want to use isalpha here; setting the locale would change * which characters are considered alphabetical. */ if ((c >= 'a' && c <= 'm') || (c >= 'A' && c <= 'M')) return c + 13; else if ((c >= 'n' && c <= 'z') || (c >= 'N' && c <= 'Z')) return c - 13; else return c; } struct fd_state { char buffer[MAX_LINE]; size_t buffer_used; size_t n_written; size_t write_upto; struct event *read_event; struct event *write_event; }; struct fd_state * alloc_fd_state(struct event_base *base, evutil_socket_t fd) { struct fd_state *state = malloc(sizeof(struct fd_state)); if (!state) return NULL; state->read_event = event_new(base, fd, EV_READ|EV_PERSIST, do_read, state); if (!state->read_event) { free(state); return NULL; } state->write_event = event_new(base, fd, EV_WRITE|EV_PERSIST, do_write, state); if (!state->write_event) { event_free(state->read_event); free(state); return NULL; } state->buffer_used = state->n_written = state->write_upto = 0; assert(state->write_event); return state; } void free_fd_state(struct fd_state *state) { event_free(state->read_event); event_free(state->write_event); free(state); } void do_read(evutil_socket_t fd, short events, void *arg) { struct fd_state *state = arg; char buf[1024]; int i; ssize_t result; while (1) { assert(state->write_event); result = recv(fd, buf, sizeof(buf), 0); if (result <= 0) break; for (i=0; i < result; ++i) { if (state->buffer_used < sizeof(state->buffer)) state->buffer[state->buffer_used++] = rot13_char(buf[i]); if (buf[i] == '\n') { assert(state->write_event); event_add(state->write_event, NULL); state->write_upto = state->buffer_used; } } } if (result == 0) { free_fd_state(state); } else if (result < 0) { if (errno == EAGAIN) // XXXX use evutil macro return; perror("recv"); free_fd_state(state); } } void do_write(evutil_socket_t fd, short events, void *arg) { struct fd_state *state = arg; while (state->n_written < state->write_upto) { ssize_t result = send(fd, state->buffer + state->n_written, state->write_upto - state->n_written, 0); if (result < 0) { if (errno == EAGAIN) // XXX use evutil macro return; free_fd_state(state); return; } assert(result != 0); state->n_written += result; } if (state->n_written == state->buffer_used) state->n_written = state->write_upto = state->buffer_used = 1; event_del(state->write_event); } void do_accept(evutil_socket_t listener, short event, void *arg) { struct event_base *base = arg; struct sockaddr_storage ss; socklen_t slen = sizeof(ss); int fd = accept(listener, (struct sockaddr*)&ss, &slen); if (fd < 0) { // XXXX eagain?? perror("accept"); } else if (fd > FD_SETSIZE) { close(fd); // XXX replace all closes with EVUTIL_CLOSESOCKET */ } else { struct fd_state *state; evutil_make_socket_nonblocking(fd); state = alloc_fd_state(base, fd); assert(state); /*XXX err*/ assert(state->write_event); event_add(state->read_event, NULL); } } void run(void) { evutil_socket_t listener; struct sockaddr_in sin; struct event_base *base; struct event *listener_event; base = event_base_new(); if (!base) return; /*XXXerr*/ sin.sin_family = AF_INET; sin.sin_addr.s_addr = 0; sin.sin_port = htons(40713); listener = socket(AF_INET, SOCK_STREAM, 0); evutil_make_socket_nonblocking(listener); #ifndef WIN32 { int one = 1; setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)); } #endif if (bind(listener, (struct sockaddr*)&sin, sizeof(sin)) < 0) { perror("bind"); return; } if (listen(listener, 16)<0) { perror("listen"); return; } listener_event = event_new(base, listener, EV_READ|EV_PERSIST, do_accept, (void*)base); /*XXX check it */ event_add(listener_event, NULL); event_base_dispatch(base); } int main(int c, char **v) { setvbuf(stdout, NULL, _IONBF, 0); run(); return 0; }
(代码中需要注意的:我们没把socket定义为"int"型,我们用了evutil_socket_t.我们没有使用fcntl(O_NONBLOCK)去设置socket为非阻塞的,我们使用了evutil_make_socket_nonblocking.上述这些改变使得我们的代码兼容Win32的网络API)
What about convenience? (and what about Windows?)
你可能注意到了,在我们的代码变得更高效的同时,也变得更复杂了.回顾一下我们使用fork的版本,我们不必为每一条链接都管理一个缓冲区:我们为每一个进程都有一个独立的栈上分配的缓冲区.
我们不必精确地知道哪一个socket在读或者写:这是隐含在代码中的.(that was implicit in our location in the code).我们也不需要一个结构来跟踪每一个操作都完成了多少:我们使用循环和栈变量就好了.
此外,如果你对windows网络编程非常有经验的话,你会意识到,我们在上述例子中对libevent的用法并不会有最好的性能.在windows上,最快的异步IO方式使用的不是select()-like的接口:它用的是IOCP(IO Completion Ports:IO完成端口) API.不像别的高效的网络API,IOCP并不在一个socket已经对你的程序需要做的某些操作做好准备的时候就通知你的程序.取而代之的是,你的程序告诉windows网络栈开始网络操作,当操作完成的时候IOCP再通知程序.
幸运的是,Libevent2的"bufferevents"接口解决了上述这些问题:这使得我们的程序更易于编写,并且可以高效地在Windows和Unix下运行.
下面是我们使用bufferevents API的最新的ROT13 server。
Example: A simpler ROT13 server with Libevent
/* For sockaddr_in */ #include <netinet/in.h> /* For socket functions */ #include <sys/socket.h> /* For fcntl */ #include <fcntl.h> #include <event2/event.h> #include <event2/buffer.h> #include <event2/bufferevent.h> #include <assert.h> #include <unistd.h> #include <string.h> #include <stdlib.h> #include <stdio.h> #include <errno.h> #define MAX_LINE 16384 void do_read(evutil_socket_t fd, short events, void *arg); void do_write(evutil_socket_t fd, short events, void *arg); char rot13_char(char c) { /* We don't want to use isalpha here; setting the locale would change * which characters are considered alphabetical. */ if ((c >= 'a' && c <= 'm') || (c >= 'A' && c <= 'M')) return c + 13; else if ((c >= 'n' && c <= 'z') || (c >= 'N' && c <= 'Z')) return c - 13; else return c; } void readcb(struct bufferevent *bev, void *ctx) { struct evbuffer *input, *output; char *line; size_t n; int i; input = bufferevent_get_input(bev); output = bufferevent_get_output(bev); while ((line = evbuffer_readln(input, &n, EVBUFFER_EOL_LF))) { for (i = 0; i < n; ++i) line[i] = rot13_char(line[i]); evbuffer_add(output, line, n); evbuffer_add(output, "\n", 1); free(line); } if (evbuffer_get_length(input) >= MAX_LINE) { /* Too long; just process what there is and go on so that the buffer * doesn't grow infinitely long. */ char buf[1024]; while (evbuffer_get_length(input)) { int n = evbuffer_remove(input, buf, sizeof(buf)); for (i = 0; i < n; ++i) buf[i] = rot13_char(buf[i]); evbuffer_add(output, buf, n); } evbuffer_add(output, "\n", 1); } } void errorcb(struct bufferevent *bev, short error, void *ctx) { if (error & BEV_EVENT_EOF) { /* connection has been closed, do any clean up here */ /* ... */ } else if (error & BEV_EVENT_ERROR) { /* check errno to see what error occurred */ /* ... */ } else if (error & BEV_EVENT_TIMEOUT) { /* must be a timeout event handle, handle it */ /* ... */ } bufferevent_free(bev); } void do_accept(evutil_socket_t listener, short event, void *arg) { struct event_base *base = arg; struct sockaddr_storage ss; socklen_t slen = sizeof(ss); int fd = accept(listener, (struct sockaddr*)&ss, &slen); if (fd < 0) { perror("accept"); } else if (fd > FD_SETSIZE) { close(fd); } else { struct bufferevent *bev; evutil_make_socket_nonblocking(fd); bev = bufferevent_socket_new(base, fd, BEV_OPT_CLOSE_ON_FREE); bufferevent_setcb(bev, readcb, NULL, errorcb, NULL); bufferevent_setwatermark(bev, EV_READ, 0, MAX_LINE); bufferevent_enable(bev, EV_READ|EV_WRITE); } } void run(void) { evutil_socket_t listener; struct sockaddr_in sin; struct event_base *base; struct event *listener_event; base = event_base_new(); if (!base) return; /*XXXerr*/ sin.sin_family = AF_INET; sin.sin_addr.s_addr = 0; sin.sin_port = htons(40713); listener = socket(AF_INET, SOCK_STREAM, 0); evutil_make_socket_nonblocking(listener); #ifndef WIN32 { int one = 1; setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)); } #endif if (bind(listener, (struct sockaddr*)&sin, sizeof(sin)) < 0) { perror("bind"); return; } if (listen(listener, 16)<0) { perror("listen"); return; } listener_event = event_new(base, listener, EV_READ|EV_PERSIST, do_accept, (void*)base); /*XXX check it */ event_add(listener_event, NULL); event_base_dispatch(base); } int main(int c, char **v) { setvbuf(stdout, NULL, _IONBF, 0); run(); return 0; }