The Java NIO empty-polling bug and Netty's workaround

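The bug concerns Selector.select() in Java NIO on Linux. If nothing is ready and nobody calls wakeup(), selector.select() is supposed to keep blocking. Instead, the call sometimes returns immediately with nothing selected, so the event loop spins without doing any work and CPU usage climbs to 100%.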
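The bug shows up only on Linux, where the JDK implements the selector on top of epoll. A flaw in the JDK's epoll-based implementation lets select() return even though nothing was selected. On Windows the selector is not built on epoll, so this bug does not occur there.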
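For comparison, here is a minimal sketch of the behaviour a healthy selector is expected to show, using only the JDK. The class name SelectBlockingDemo and the method measure are made up for this illustration. With no channels registered, select(timeout) should block for roughly the timeout, while a wakeup() issued beforehand makes the next select() return at once. Both calls return 0, which is why the return value alone cannot tell a wakeup or a timeout apart from the bug.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

final class SelectBlockingDemo {

    /**
     * Runs two select() calls on an empty selector and returns the elapsed
     * milliseconds of each: [0] without wakeup, [1] with a prior wakeup().
     */
    static long[] measure() {
        try (Selector selector = Selector.open()) {
            // No channels registered and no wakeup: should block for about 200 ms.
            long blocked = timedSelect(selector, 200);

            // wakeup() before select(): the next select() returns immediately.
            selector.wakeup();
            long woken = timedSelect(selector, 5_000);

            return new long[] {blocked, woken};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static long timedSelect(Selector selector, long timeoutMillis) throws IOException {
        long start = System.nanoTime();
        int ready = selector.select(timeoutMillis);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        System.out.println("ready=" + ready + ", elapsed=" + elapsedMillis + " ms");
        return elapsedMillis;
    }

    public static void main(String[] args) {
        measure();
    }
}
```

On a JVM hit by the bug, the first call would come back almost instantly instead of after the timeout.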

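How Netty works around the bug

1. Call selector.select(timeout) with a timeout. The call can then stop blocking in four situations:

  1. An event occurred
  2. wakeup was called
  3. The timeout expired
  4. The empty-polling bug

The first two cases are caught by the break condition: either select() returns a non-zero count or the wakenUp flag is set. A timeout is recognized by comparing timestamps. Every other return counts as an empty poll and increments a dedicated counter. Once the counter reaches the threshold (512 by default) with no genuine timeout in between, Netty assumes the empty-polling bug has been triggered.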
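The counting rule can be isolated from the event loop. The sketch below is not Netty code: EmptyPollCounter and recordEmptyReturn are hypothetical names, and the caller is assumed to invoke recordEmptyReturn only when select(timeoutMillis) returned 0 and no wakeup or pending task explains the return. It mirrors the timestamp comparison and the reset to 1 used in the Netty excerpt further down.

```java
import java.util.concurrent.TimeUnit;

final class EmptyPollCounter {
    private final int rebuildThreshold; // Netty's default threshold is 512
    private int selectCnt;

    EmptyPollCounter(int rebuildThreshold) {
        this.rebuildThreshold = rebuildThreshold;
    }

    /**
     * Records one select(timeoutMillis) call that returned with nothing selected.
     *
     * @param startNanos    System.nanoTime() taken before the select call
     * @param endNanos      System.nanoTime() taken after the select call
     * @param timeoutMillis timeout that was passed to select
     * @return true if the selector should be rebuilt
     */
    boolean recordEmptyReturn(long startNanos, long endNanos, long timeoutMillis) {
        selectCnt++;
        if (endNanos - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= startNanos) {
            // The full timeout elapsed: a legitimate return, so start counting afresh.
            selectCnt = 1;
            return false;
        }
        if (rebuildThreshold > 0 && selectCnt >= rebuildThreshold) {
            // Too many premature returns in a row: assume the bug was triggered.
            selectCnt = 1;
            return true;
        }
        return false;
    }

    int count() {
        return selectCnt;
    }

    public static void main(String[] args) {
        EmptyPollCounter counter = new EmptyPollCounter(512);
        for (int i = 1; ; i++) {
            // Simulate select(1000) returning after only 1 microsecond.
            if (counter.recordEmptyReturn(0L, 1_000L, 1_000L)) {
                System.out.println("rebuild requested after " + i + " empty returns");
                break;
            }
        }
    }
}
```

A threshold of 0 or less disables the check, matching the SELECTOR_AUTO_REBUILD_THRESHOLD > 0 guard in the excerpt.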

2. Once the bug is detected, Netty simply builds a new selector, re-registers the existing channels on it, and closes the old selector.
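The same idea can be tried with plain JDK NIO, outside of Netty. This is a rough sketch under simplifying assumptions: SelectorRebuildSketch, rebuild and demo are hypothetical names, everything runs on a single thread, and there is none of the error handling a real event loop needs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

final class SelectorRebuildSketch {

    /** Opens a new selector, moves every valid registration to it, and closes the old one. */
    static Selector rebuild(Selector oldSelector) throws IOException {
        Selector newSelector = Selector.open();
        for (SelectionKey key : oldSelector.keys()) {
            if (!key.isValid() || key.channel().keyFor(newSelector) != null) {
                continue; // already cancelled, or already registered on the new selector
            }
            int interestOps = key.interestOps();
            Object attachment = key.attachment();
            key.cancel(); // drop the registration on the old selector
            key.channel().register(newSelector, interestOps, attachment);
        }
        oldSelector.close();
        return newSelector;
    }

    /** Registers one channel, rebuilds, and reports whether the registration moved intact. */
    static boolean demo() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.configureBlocking(false);
            Selector oldSelector = Selector.open();
            server.register(oldSelector, SelectionKey.OP_ACCEPT, "listener");

            try (Selector newSelector = rebuild(oldSelector)) {
                SelectionKey moved = server.keyFor(newSelector);
                return !oldSelector.isOpen()
                        && moved != null
                        && moved.isValid()
                        && moved.interestOps() == SelectionKey.OP_ACCEPT
                        && "listener".equals(moved.attachment());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("registration moved: " + demo());
    }
}
```

The channel itself stays open the whole time. Only its registration moves, so connections survive the rebuild.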

private void select(boolean oldWakenUp) throws IOException { // excerpt
    // selectCnt, currentTimeNanos and timeoutMillis are defined in the parts of the method omitted here.
    for (;;) {
        int selectedKeys = selector.select(timeoutMillis);
        selectCnt++;

        if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
            // - Selected something,
            // - waken up by user, or
            // - the task queue has a pending task.
            // - a scheduled task is ready for processing
            break;
        }
        long time = System.nanoTime();
        if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
            // timeoutMillis elapsed without anything selected.
            // A genuine timeout: restart the count.
            selectCnt = 1;
        } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
                selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) { // default: 512
            // The code exists in an extra method to ensure the method is not too big to inline as this
            // branch is not very likely to get hit very frequently.
            // Each empty poll increments selectCnt. If it reaches 512 without a genuine timeout
            // in between, assume the empty-polling bug occurred and rebuild the selector.
            selector = selectRebuildSelector(selectCnt);
            selectCnt = 1;
            break;
        }
    }
}



/**
 * Replaces the current {@link Selector} of this event loop with newly created {@link Selector}s to work
 * around the infamous epoll 100% CPU bug.
 * Creates a new selector to work around the empty-polling bug.
 */
public void rebuildSelector() {
    if (!inEventLoop()) {
        execute(new Runnable() {
            @Override
            public void run() {
                rebuildSelector0();
            }
        });
        return;
    }
    rebuildSelector0();
}

// Abridged: the exception handling of the original method is omitted here.
private void rebuildSelector0() {
    final Selector oldSelector = selector;
    final SelectorTuple newSelectorTuple;

    // Create a new selector.
    newSelectorTuple = openSelector();

    // Take every channel registered on the old selector and register it on the new one.
    int nChannels = 0;
    for (SelectionKey key: oldSelector.keys()) {
        Object a = key.attachment();
        if (!key.isValid() || key.channel().keyFor(newSelectorTuple.unwrappedSelector) != null) {
            continue;
        }
        int interestOps = key.interestOps();
        key.cancel();
        SelectionKey newKey = key.channel().register(newSelectorTuple.unwrappedSelector, interestOps, a);
        if (a instanceof AbstractNioChannel) {
            // Update SelectionKey
            ((AbstractNioChannel) a).selectionKey = newKey;
        }
        nChannels++;
    }

    selector = newSelectorTuple.selector;
    unwrappedSelector = newSelectorTuple.unwrappedSelector;
    // time to close the old selector as everything else is registered to the new one
    oldSelector.close();
}
