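How Epoll Is Implemented in Java NIO

NIO and Epoll

I never had a systematic understanding of NIO and epoll. I recently read through parts of the OpenJDK source, so here is a short record of what I found.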

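  • Linux supports epoll from kernel 2.6 onward
  • Windows supports select but not epoll
  • NIO is implemented differently on each operating system, including SunOS, Linux and Windows
  • select has O(N) complexity: every call scans all N watched fds
  • select has a maximum fd limit, 1024 by default (FD_SETSIZE)
  • The select fd limit can be changed by modifying the definition in sys/select.h and recompiling
  • epoll is event-driven: it has no fixed fd limit of its own (only the system's resource limits apply), and waiting is usually described as O(1) because the kernel hands back only the ready fds, so there is no need to scan all of them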

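I am not very familiar with NIO myself, so I wrote a TimeServer based on the book 《Netty权威指南》 (Netty: The Definitive Guide) and use that code as the starting point for analyzing how NIO is implemented.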

 
    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;
    import java.nio.charset.StandardCharsets;
    import java.util.Date;
    import java.util.Iterator;
    import java.util.Set;

    public class NioTimeServer {

        public static void main(String[] args) {
            int port = 8080;
            MultiplexerTimeServer timeServer = new MultiplexerTimeServer(port);
            new Thread(timeServer).start();
        }

        static final class MultiplexerTimeServer implements Runnable {
            private Selector selector;
            private ServerSocketChannel servChannel;
            private volatile boolean stop;

            public MultiplexerTimeServer(int port) {
                try {
                    selector = Selector.open();
                    servChannel = ServerSocketChannel.open();
                    servChannel.configureBlocking(false);
                    // Bind the listening port with a backlog of 1024
                    servChannel.socket().bind(new InetSocketAddress(port), 1024);
                    // Ask the selector to report incoming connections
                    servChannel.register(selector, SelectionKey.OP_ACCEPT);
                } catch (IOException e) {
                    e.printStackTrace();
                    System.exit(1);
                }
            }

            public void stop() {
                this.stop = true;
            }

            @Override
            public void run() {
                while (!stop) {
                    try {
                        // Wait up to one second for ready channels
                        selector.select(1000);
                        Set<SelectionKey> selectedKeys = selector.selectedKeys();
                        Iterator<SelectionKey> it = selectedKeys.iterator();
                        SelectionKey key = null;
                        while (it.hasNext()) {
                            key = it.next();
                            // Selected keys must be removed by hand, or they show up again
                            it.remove();
                            try {
                                handleInput(key);
                            } catch (Exception e) {
                                if (key != null) {
                                    key.cancel();
                                    if (key.channel() != null)
                                        key.channel().close();
                                }
                            }
                        }
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }
            }

            private void handleInput(SelectionKey key) throws IOException {
                if (key.isValid()) {
                    if (key.isAcceptable()) {
                        // New connection: accept it and watch it for reads
                        ServerSocketChannel ssc = (ServerSocketChannel) key.channel();
                        SocketChannel sc = ssc.accept();
                        if (sc != null) { // accept() may return null in non-blocking mode
                            sc.configureBlocking(false);
                            sc.register(selector, SelectionKey.OP_READ);
                        }
                    }
                    if (key.isReadable()) {
                        SocketChannel sc = (SocketChannel) key.channel();
                        ByteBuffer readBuf = ByteBuffer.allocate(1024);
                        int readBytes = sc.read(readBuf);
                        if (readBytes > 0) {
                            readBuf.flip();
                            byte[] bytes = new byte[readBuf.remaining()];
                            readBuf.get(bytes);
                            String body = new String(bytes, StandardCharsets.UTF_8);
                            System.out.println("The time server receive order :" + body);
                            String currentTime = "QUERY TIME ORDER".equalsIgnoreCase(body)
                                    ? new Date(System.currentTimeMillis()).toString() : "BAD ORDER";
                            doWrite(sc, currentTime);
                        } else if (readBytes < 0) {
                            // The peer closed the connection
                            key.cancel();
                            sc.close();
                        }
                    }
                }
            }

            /**
             * @param sc       the channel to write to
             * @param response the text to send back
             * @throws IOException
             */
            private void doWrite(SocketChannel sc, String response) throws IOException {
                if (response != null && response.trim().length() > 0) {
                    byte[] bytes = response.getBytes(StandardCharsets.UTF_8);
                    ByteBuffer writeBuf = ByteBuffer.allocate(bytes.length);
                    writeBuf.put(bytes);
                    writeBuf.flip();
                    // A single write may be partial on a non-blocking channel; not handled here
                    sc.write(writeBuf);
                }
            }
        }
    }

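The overall flow is roughly:
1. Create a ServerSocketChannel, set it to non-blocking mode, bind the listening port, and register the channel with the selector (declaring the operations of interest).
2. Use one thread to poll the selector by calling its select method, which yields all ready keys. Each key is tied to a channel, and the state of the key decides what to do next.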
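To try the server out, something has to connect and send an order. The following is a minimal sketch of a blocking client, not code from the book or from the original post: the class name TimeClient and the method query are made up for illustration, and it assumes the server above is listening on 127.0.0.1:8080.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for exercising the time server; names are illustrative only.
class TimeClient {

    // Sends one order and returns the bytes delivered by the first read, decoded as UTF-8.
    static String query(String host, int port, String order) throws IOException {
        try (SocketChannel sc = SocketChannel.open(new InetSocketAddress(host, port))) {
            ByteBuffer out = ByteBuffer.wrap(order.getBytes(StandardCharsets.UTF_8));
            while (out.hasRemaining()) {
                sc.write(out);
            }
            ByteBuffer in = ByteBuffer.allocate(1024);
            int n = sc.read(in); // blocking read: waits for the reply or for end of stream
            if (n <= 0) {
                return "";
            }
            in.flip();
            return StandardCharsets.UTF_8.decode(in).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumes the server is already running on port 8080.
        System.out.println(query("127.0.0.1", 8080, "QUERY TIME ORDER"));
    }
}
```

The client deliberately uses a blocking channel: only the server side needs a selector, since it is the side that has to watch many connections at once.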

The only place we really need to look at is selector.select(1000). First, here is how the selector is obtained:

 
    public static Selector open() throws IOException {
        return SelectorProvider.provider().openSelector();
    }

So a Selector is created through a SelectorProvider. Here is how the default SelectorProvider instance is chosen:

 
    public static SelectorProvider provider() {
        synchronized (lock) {
            if (provider != null)
                return provider;
            return AccessController.doPrivileged(
                new PrivilegedAction<SelectorProvider>() {
                    public SelectorProvider run() {
                            // 1. provider named by a system property
                            if (loadProviderFromProperty())
                                return provider;
                            // 2. provider installed as a service
                            if (loadProviderAsService())
                                return provider;
                            // 3. platform default
                            provider = sun.nio.ch.DefaultSelectorProvider.create();
                            return provider;
                        }
                    });
        }
    }
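To see which provider this lookup ends up with on a given machine, it is enough to print the class of the object returned by SelectorProvider.provider(). The sketch below uses only public API; the class name ProviderCheck is made up for illustration. The system property that loadProviderFromProperty() consults is java.nio.channels.spi.SelectorProvider, so setting it with -D on the command line should change the result.

```java
import java.nio.channels.spi.SelectorProvider;

// Hypothetical helper: reports which SelectorProvider the running JVM selected.
class ProviderCheck {

    static String providerClassName() {
        return SelectorProvider.provider().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(System.getProperty("os.name") + " " + System.getProperty("os.version"));
        System.out.println(providerClassName());
    }
}
```

The printed name depends on the JDK version and platform. On a Linux JDK from the code base discussed here one would expect an epoll-based provider, but that is worth checking rather than assuming.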

The only line that matters here is this one:

    provider = sun.nio.ch.DefaultSelectorProvider.create();

It uses DefaultSelectorProvider, so let's look at its create() method:

 
    public static SelectorProvider create() {
        String osname = AccessController.doPrivileged(
            new GetPropertyAction("os.name"));
        if ("SunOS".equals(osname)) {
            return new sun.nio.ch.DevPollSelectorProvider();
        }

        // use EPollSelectorProvider for Linux kernels >= 2.6
        if ("Linux".equals(osname)) {
            String osversion = AccessController.doPrivileged(
                new GetPropertyAction("os.version"));
            String[] vers = osversion.split("\\.", 0);
            if (vers.length >= 2) {
                try {
                    int major = Integer.parseInt(vers[0]);
                    int minor = Integer.parseInt(vers[1]);
                    if (major > 2 || (major == 2 && minor >= 6)) {
                        return new sun.nio.ch.EPollSelectorProvider();
                    }
                } catch (NumberFormatException x) {
                    // format not recognized
                }
            }
        }

        return new sun.nio.ch.PollSelectorProvider();
    }
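The kernel version check in create() is easy to try in isolation. The sketch below copies the same parsing into a standalone method so different os.version strings can be fed through it. The class name KernelVersionCheck, the method name wouldPickEpoll and the sample version strings are all made up for illustration; only the split-and-compare logic comes from the listing above.

```java
// Hypothetical standalone copy of the version test used by create().
class KernelVersionCheck {

    // Returns true when the string parses as a kernel version of 2.6 or later.
    static boolean wouldPickEpoll(String osVersion) {
        String[] vers = osVersion.split("\\.", 0);
        if (vers.length < 2) {
            return false;
        }
        try {
            int major = Integer.parseInt(vers[0]);
            int minor = Integer.parseInt(vers[1]);
            return major > 2 || (major == 2 && minor >= 6);
        } catch (NumberFormatException e) {
            // Unrecognized format: create() falls through to the poll-based provider
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {"2.4.20", "2.6.32-431.el6", "3.10.0", "4.15.0-generic", "unknown"};
        for (String v : samples) {
            System.out.println(v + " -> " + wouldPickEpoll(v));
        }
    }
}
```

Only the first two dot-separated fields are parsed, so a suffix such as "-generic" in a later field does not disturb the check, while a non-numeric major or minor field is treated as unrecognized.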

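This is the key point: create() returns a different provider depending on the operating system. SunOS (that is, Solaris) gets DevPollSelectorProvider, Linux with a 2.6 or later kernel gets EPollSelectorProvider, and everything else that reaches this code, including older Linux kernels, falls back to PollSelectorProvider. Note that this DefaultSelectorProvider is the Solaris/Linux one. Windows does not support epoll (see the footnote) and, in the OpenJDK source tree, has its own DefaultSelectorProvider that returns a WindowsSelectorProvider instead.
Next, look at EPollSelectorProvider: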

 
    public class EPollSelectorProvider
        extends SelectorProviderImpl
    {
        public AbstractSelector openSelector() throws IOException {
            return new EPollSelectorImpl(this);
        }

        public Channel inheritedChannel() throws IOException {
            return InheritedChannel.getChannel();
        }
    }

The selector it opens is an EPollSelectorImpl, so this is where epoll comes into NIO.
select is implemented in EPollSelectorImpl as follows:

 
    protected int doSelect(long timeout)
        throws IOException
    {
        if (closed)
            throw new ClosedSelectorException();
        processDeregisterQueue();
        try {
            begin();
            pollWrapper.poll(timeout);
        } finally {
            end();
        }
        processDeregisterQueue();
        int numKeysUpdated = updateSelectedKeys();
        if (pollWrapper.interrupted()) {
            // Clear the wakeup pipe
            pollWrapper.putEventOps(pollWrapper.interruptedIndex(), 0);
            synchronized (interruptLock) {
                pollWrapper.clearInterrupted();
                IOUtil.drain(fd0);
                interruptTriggered = false;
            }
        }
        return numKeysUpdated;
    }
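The final branch of doSelect, guarded by pollWrapper.interrupted(), handles the case where the wait was ended through the wakeup pipe rather than by a ready channel. From the public API, that path is reached by calling Selector.wakeup() from another thread. A minimal sketch of this behavior follows; the class name WakeupDemo, the method name selectUntilWoken and the 200 ms delay are made up for illustration.

```java
import java.io.IOException;
import java.nio.channels.Selector;

// Hypothetical demo: a thread blocked in select() is released by wakeup().
class WakeupDemo {

    // Blocks in select() with no channels registered and returns its result after wakeup().
    static int selectUntilWoken() throws IOException, InterruptedException {
        try (Selector selector = Selector.open()) {
            Thread waker = new Thread(() -> {
                try {
                    Thread.sleep(200); // give the caller time to block in select()
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                selector.wakeup();
            });
            waker.start();
            // Nothing is registered and no timeout is given, so wakeup() is what ends this call.
            int ready = selector.select();
            waker.join();
            return ready;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("select() returned " + selectUntilWoken());
    }
}
```

Because no key became ready, select() returns 0 here. The wakeup only ends the wait and does not count as a selected key.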

Only one statement matters here:

    pollWrapper.poll(timeout);

where pollWrapper is declared as:

    // The poll object
    EPollArrayWrapper pollWrapper;

About EPollArrayWrapper:

 
    /**
     * Manipulates a native array of epoll_event structs on Linux:
     *
     * typedef union epoll_data {
     *     void *ptr;
     *     int fd;
     *     __uint32_t u32;
     *     __uint64_t u64;
     *  } epoll_data_t;
     *
     * struct epoll_event {
     *     __uint32_t events;
     *     epoll_data_t data;
     * };
     *
     * The system call to wait for I/O events is epoll_wait(2). It populates an
     * array of epoll_event structures that are passed to the call. The data
     * member of the epoll_event structure contains the same data as was set
     * when the file descriptor was registered to epoll via epoll_ctl(2). In
     * this implementation we set data.fd to be the file descriptor that we
     * register. That way, we have the file descriptor available when we
     * process the events.
     *
     * All file descriptors registered with epoll have the POLLHUP and POLLERR
     * events enabled even when registered with an event set of 0. To ensure
     * that epoll_wait doesn't poll an idle file descriptor when the underlying
     * connection is closed or reset then its registration is deleted from
     * epoll (it will be re-added again if the event set is changed)
     */

That is the class comment, which describes the epoll data structures among other things.
Since this class is the epoll implementation in OpenJDK, it naturally declares the epoll-related JNI methods:

    private native int epollCreate();
    private native void epollCtl(int epfd, int opcode, int fd, int events);
    private native int epollWait(long pollAddress, int numfds, long timeout,
                                 int epfd) throws IOException;
    private static native int sizeofEPollEvent();
    private static native int offsetofData();
    private static native int fdLimit();
    private static native void interrupt(int fd);
    private static native void init();

The important part is the poll method:

 
    int poll(long timeout) throws IOException {
        updateRegistrations();
        updated = epollWait(pollArrayAddress, NUM_EPOLLEVENTS, timeout, epfd);
        for (int i=0; i<updated; i++) {
            if (getDescriptor(i) == incomingInterruptFD) {
                interruptedIndex = i;
                interrupted = true;
                break;
            }
        }
        return updated;
    }

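First, updateRegistrations() applies the pending registration changes to the epoll instance, which is where the epollCtl native method (the epoll_ctl system call) comes in. Then epollWait (the epoll_wait system call) is called, and the thread blocks there until the timeout expires or some fd becomes ready. The return value is the number of ready fds, not the fds themselves: the ready events are written into the native array at pollArrayAddress. A return value of 0 means no events were ready; at the system call level -1 means report an error and abort, which is why epollWait is declared to throw IOException. Otherwise the returned events are traversed and processed, starting with the check for the interrupt fd shown above.
For more about epoll, see http://www.ulduzsoft.com/2014/01/select-poll-epoll-practical-difference-for-system-architects/ .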
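The same "block until timeout or readiness, then return a count" behavior is visible from Java without any network code. The sketch below registers the read end of a java.nio.channels.Pipe with a selector and calls select before and after writing one byte. The class name SelectReturnDemo, the method name readyCounts and the timeout values are made up for illustration; it says nothing about which native mechanism sits underneath on a given platform.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical demo of select() return values using an in-process pipe.
class SelectReturnDemo {

    // Returns {count before any data was written, count after one byte was written}.
    static int[] readyCounts() throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SourceChannel source = pipe.source();
             Pipe.SinkChannel sink = pipe.sink()) {
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);

            // Nothing has been written yet, so this call waits for the timeout.
            int before = selector.select(100);

            // One byte makes the read end ready; select returns without waiting the full second.
            sink.write(ByteBuffer.wrap(new byte[] {42}));
            int after = selector.select(1000);

            return new int[] {before, after};
        }
    }

    public static void main(String[] args) throws IOException {
        int[] counts = readyCounts();
        System.out.println("before write: " + counts[0] + ", after write: " + counts[1]);
    }
}
```

The first call reports no ready keys and the second reports one, which mirrors the 0 versus positive return values of the native wait described above.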

Footnote

The syscall select is available in Windows, but select processing is O(n) in the number of file descriptors, unlike modern constant-time multiplexers such as epoll, which makes select unacceptable for high-concurrency servers. This document will describe how high-concurrency programs are designed in Windows. Instead of epoll or kqueue, Windows has its own I/O multiplexer called I/O completion ports (IOCPs). IOCPs are the objects used to poll overlapped I/O for completion. IOCP polling is constant time (REF?).

In short: Windows supports the select system call (time complexity O(N)) but not epoll. Its own multiplexer is IOCP.
