Netty线程源码分析(一)

一、NioEventLoopGroup

继承关系图1-1:


Netty线程源码分析(一)_第1张图片
1

Netty允许处理IO和接收连接使用同一个EventLoopGroup


Netty线程源码分析(一)_第2张图片
1

1.1 NioEventLoopGroup和NioEventLoop是什么关系?

NioEventLoopGroup实际是NioEventLoop的线程组,它包含了一个或多个EventLoop,而EventLoop就是一个Channel执行实际工作的线程,当注册一个Channel后,Netty将这个Channel绑定到一个EventLoop上,在其生命周期内始终被绑定在这个EventLoop上不会被改变。

从图1-2可以看出很多Channle会共享一个EventLoop。这意味着在一个Channel在被EventLoop使用时会禁止其它Channel绑定到相同的EventLoop。我们可以理解为EventLoop是一个事件循环线程,而EventLoopGroup是一个事件循环集合。

图1-2:

Netty线程源码分析(一)_第3张图片
1

1.2 NioEventLoopGroup初始化

首先看NioEventLoopGroup构造方法:

public NioEventLoopGroup() {
    this(0);
}

public NioEventLoopGroup(int nThreads) {
    this(nThreads, null);
}

public NioEventLoopGroup(int nThreads, ThreadFactory threadFactory) {
    this(nThreads, threadFactory, SelectorProvider.provider());
}

public NioEventLoopGroup(
        int nThreads, ThreadFactory threadFactory, final SelectorProvider selectorProvider) {
    super(nThreads, threadFactory, selectorProvider);
}

以上代码可以发现NioEventLoopGroup虽然有4个构造方法,但最终调用的是MultithreadEventLoopGroup的构造方法,代码如下:

protected MultithreadEventLoopGroup(int nThreads, ThreadFactory threadFactory, Object... args) {  
    super(nThreads == 0? DEFAULT_EVENT_LOOP_THREADS : nThreads, threadFactory, args);  
}

private static final int DEFAULT_EVENT_LOOP_THREADS;

    static {
        DEFAULT_EVENT_LOOP_THREADS = Math.max(1, SystemPropertyUtil.getInt(
                "io.netty.eventLoopThreads", Runtime.getRuntime().availableProcessors() * 2));

        if (logger.isDebugEnabled()) {
            logger.debug("-Dio.netty.eventLoopThreads: {}", DEFAULT_EVENT_LOOP_THREADS);
        }
    }

在NioEventLoopGroup初始化之前,会先执行父类MultithreadEventLoopGroup的静态模块,NioEventLoop的默认大小是2倍的CPU核数,但这并不是一个恒定的最佳数量,为了避免线程上下文切换,只要能满足要求,这个值其实越少越好。

MultithreadEventExecutorGroup的构造方法:

//EventExecutor数组,保存eventLoop
private final EventExecutor[] children;
//从children中选取一个eventLoop的策略
private final EventExecutorChooser chooser;

protected MultithreadEventExecutorGroup(int nThreads, ThreadFactory threadFactory, Object... args) {
    if (nThreads <= 0) {
        throw new IllegalArgumentException(String.format("nThreads: %d (expected: > 0)", nThreads));
    }

    if (threadFactory == null) {
        //是一个通用的ThreadFactory实现,方便配置线程池
        threadFactory = newDefaultThreadFactory();
    }
    
    //根据线程数创建SingleThreadEventExecutor数组,从命名上可以看出SingleThreadEventExecutor是一个只有一个线程的线程池
    children = new SingleThreadEventExecutor[nThreads];
    //根据数组的大小,采用不同策略初始化chooser。
    if (isPowerOfTwo(children.length)) {
        chooser = new PowerOfTwoEventExecutorChooser();
    } else {
        chooser = new GenericEventExecutorChooser();
    }

    for (int i = 0; i < nThreads; i ++) {
        boolean success = false;
        try {
            //
            children[i] = newChild(threadFactory, args);
            success = true;
        } catch (Exception e) {
            // TODO: Think about if this is a good exception type
            throw new IllegalStateException("failed to create a child event loop", e);
        } finally {
            //如果没有创建成功,循环关闭所有SingleThreadEventExecutor
            if (!success) {
                for (int j = 0; j < i; j ++) {
                    children[j].shutdownGracefully();
                }

                //等待关闭成功
                for (int j = 0; j < i; j ++) {
                    EventExecutor e = children[j];
                    try {
                        while (!e.isTerminated()) {
                            e.awaitTermination(Integer.MAX_VALUE, TimeUnit.SECONDS);
                        }
                    } catch (InterruptedException interrupted) {
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
            }
        }
    }

    final FutureListener terminationListener = new FutureListener() {
        @Override
        public void operationComplete(Future future) throws Exception {
            if (terminatedChildren.incrementAndGet() == children.length) {
                terminationFuture.setSuccess(null);
            }
        }
    };

    for (EventExecutor e: children) {
        e.terminationFuture().addListener(terminationListener);
    }
}
 
 

回到NioEventLoopGroup的newChild方法重载

@Override
protected EventExecutor newChild(
        ThreadFactory threadFactory, Object... args) throws Exception {
    return new NioEventLoop(this, threadFactory, (SelectorProvider) args[0]);
}

MultithreadEventExecutorGroup构造方法中执行的是NioEventLoopGroup中的newChild方法,所以children元素的实际类型是NioEventLoop。

解释下EventExecutorChooser的选择

    //判断一个数是否是2的幂次方
    private static boolean isPowerOfTwo(int val) {
        return (val & -val) == val;
    }
    
    private final class PowerOfTwoEventExecutorChooser implements EventExecutorChooser {
        @Override
        public EventExecutor next() {
            return children[childIndex.getAndIncrement() & children.length - 1];
        }
    }

    private final class GenericEventExecutorChooser implements EventExecutorChooser {
        @Override
        public EventExecutor next() {
            return children[Math.abs(childIndex.getAndIncrement() % children.length)];
        }
    }

EventExecutorChooser是根据线程数组大小是否是2的幂次方来选择初始化chooser。如果大小为2的幂次方,则采用PowerOfTwoEventExecutorChooser,否则使用GenericEventExecutorChooser。也就是如果线程数是2的倍数时,Netty选择线程时会使用PowerOfTwoEventExecutorChooser,因为&比%更快(Netty为了性能也是拼了).

二、NioEventLoop

NioEventLoop中维护了一个线程,里面有一个任务队列和一个延迟任务队列,每个EventLoop有一个Selector,这里强制借用一张图而且不留地址!

Netty线程源码分析(一)_第4张图片
image

继承关系:


Netty线程源码分析(一)_第5张图片
1

NioEventLoop初始化

NioEventLoop(NioEventLoopGroup parent, ThreadFactory threadFactory, SelectorProvider selectorProvider) {
    super(parent, threadFactory, false);
    if (selectorProvider == null) {
        throw new NullPointerException("selectorProvider");
    }
    provider = selectorProvider;
    selector = openSelector();
}

1、调用父类方法构造一个taskQueue,它是一个LinkedBlockingQueue

2、openSelector(): Netty是基于Nio实现的,所以也离不开selector。

3、DISABLE_KEYSET_OPTIMIZATION: 判断是否需要对sun.nio.ch.SelectorImpl中的selectedKeys进行优化, 不做配置的话默认需要优化,通过反射将selectedKeySet与sun.nio.ch.SelectorImpl中的两个field绑定

4、主要优化在哪: SelectorImpl原来的selectedKeys和publicSelectedKeys数据结构是HashSet,大家知道HashSet的数据结构是数组+链表,新的数据结构是由2个数组A、B组成,初始大小是1024,避免了HashSet扩容带来的性能问题。除了扩容外,遍历效率也是一个原因,对于需要遍历selectedKeys的全部元素, 数组效率无疑是最高的。

private Selector openSelector() {
    final Selector selector;
    try {
        selector = provider.openSelector();
    } catch (IOException e) {
        throw new ChannelException("failed to open a new selector", e);
    }

    if (DISABLE_KEYSET_OPTIMIZATION) {
        return selector;
    }

    try {
        SelectedSelectionKeySet selectedKeySet = new SelectedSelectionKeySet();

        Class selectorImplClass =
                Class.forName("sun.nio.ch.SelectorImpl", false, PlatformDependent.getSystemClassLoader());

        // Ensure the current selector implementation is what we can instrument.
        if (!selectorImplClass.isAssignableFrom(selector.getClass())) {
            return selector;
        }

        Field selectedKeysField = selectorImplClass.getDeclaredField("selectedKeys");
        Field publicSelectedKeysField = selectorImplClass.getDeclaredField("publicSelectedKeys");

        selectedKeysField.setAccessible(true);
        publicSelectedKeysField.setAccessible(true);

        selectedKeysField.set(selector, selectedKeySet);
        publicSelectedKeysField.set(selector, selectedKeySet);

        selectedKeys = selectedKeySet;
        logger.trace("Instrumented an optimized java.util.Set into: {}", selector);
    } catch (Throwable t) {
        selectedKeys = null;
        logger.trace("Failed to instrument an optimized java.util.Set into: {}", selector, t);
    }

    return selector;
}

NioEventLoop的启动

在上一遍讲过NioEventLoop中维护了一个线程,线程启动时会调用NioEventLoop的run方法,loop会不断循环一个过程:select -> processSelectedKeys(IO任务) -> runAllTasks(非IO任务)

  • I/O任务: 即selectionKey中ready的事件,如accept、connect、read、write等

  • 非IO任务: 添加到taskQueue中的任务,如bind、channelActive等

@Override
protected void run() {
    for (;;) {
        boolean oldWakenUp = wakenUp.getAndSet(false);
        try {
            // 判断是否有非IO任务,如果有立刻返回
            if (hasTasks()) {
                selectNow();
            } else {
                select(oldWakenUp);

                if (wakenUp.get()) {
                    selector.wakeup();
                }
            }

            cancelledKeys = 0;
            needsToSelectAgain = false;
            final int ioRatio = this.ioRatio;
            if (ioRatio == 100) {
                // IO任务
                processSelectedKeys();
                // 非IO任务
                runAllTasks();
            } else {
                // 用以控制IO任务与非IO任务的运行时间比
                final long ioStartTime = System.nanoTime();
                // IO任务
                processSelectedKeys();

                final long ioTime = System.nanoTime() - ioStartTime;
                // 非IO任务
                runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
            }

            if (isShuttingDown()) {
                closeAll();
                if (confirmShutdown()) {
                    break;
                }
            }
        } catch (Throwable t) {
            // Prevent possible consecutive immediate failures that lead to
            // excessive CPU consumption.
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                // Ignore.
            }
        }
    }
}

1、wakenUp: 用来决定是否调用selector.wakeup(),只有当wakenUp未true时才会调用,目的是为了减少wake-up的负载,因为Selector.wakeup()是一个昂贵的操作。

2、hasTask(): 判断是否有非IO任务,如果有的话,选择调用非阻塞的selectNow()让select立即返回, 否则以阻塞的方式调用select.timeoutMillis是阻塞时间

3、ioRatio: 控制两种任务的执行时间,你可以通过它来限制非IO任务的执行时间, 默认值是50, 表示允许非IO任务获得和IO任务相同的执行时间,这个值根据自己的具体场景来设置.

4、processSelectedKeys(): 处理IO事件

5、runAllTasks(): 处理非IO任务

6、isShuttingDown(): 检查state是否被标记为ST_SHUTTING_DOWN

private void select(boolean oldWakenUp) throws IOException {
        Selector selector = this.selector;
        try {
            int selectCnt = 0;
            long currentTimeNanos = System.nanoTime();
            long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos);
            for (;;) {
                long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
                if (timeoutMillis <= 0) {
                    if (selectCnt == 0) {
                        selector.selectNow();
                        selectCnt = 1;
                    }
                    break;
                }

                int selectedKeys = selector.select(timeoutMillis);
                selectCnt ++;

                if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
                    // - Selected something,
                    // - waken up by user, or
                    // - the task queue has a pending task.
                    // - a scheduled task is ready for processing
                    break;
                }
                if (Thread.interrupted()) {
                    // Thread was interrupted so reset selected keys and break so we not run into a busy loop.
                    // As this is most likely a bug in the handler of the user or it's client library we will
                    // also log it.
                    //
                    // See https://github.com/netty/netty/issues/2426
                    if (logger.isDebugEnabled()) {
                        logger.debug("Selector.select() returned prematurely because " +
                                "Thread.currentThread().interrupt() was called. Use " +
                                "NioEventLoop.shutdownGracefully() to shutdown the NioEventLoop.");
                    }
                    selectCnt = 1;
                    break;
                }

                long time = System.nanoTime();
                if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
                    // timeoutMillis elapsed without anything selected.
                    selectCnt = 1;
                } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
                        selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
                    // The selector returned prematurely many times in a row.
                    // Rebuild the selector to work around the problem.
                    logger.warn(
                            "Selector.select() returned prematurely {} times in a row; rebuilding selector.",
                            selectCnt);

                    rebuildSelector();
                    selector = this.selector;

                    // Select again to populate selectedKeys.
                    selector.selectNow();
                    selectCnt = 1;
                    break;
                }

                currentTimeNanos = time;
            }

            if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS) {
                if (logger.isDebugEnabled()) {
                    logger.debug("Selector.select() returned prematurely {} times in a row.", selectCnt - 1);
                }
            }
        } catch (CancelledKeyException e) {
            if (logger.isDebugEnabled()) {
                logger.debug(CancelledKeyException.class.getSimpleName() + " raised by a Selector - JDK bug?", e);
            }
        }
    }
protected long delayNanos(long currentTimeNanos) {
    ScheduledFutureTask scheduledTask = peekScheduledTask();
    if (scheduledTask == null) {
        return SCHEDULE_PURGE_INTERVAL;
    }

    return scheduledTask.delayNanos(currentTimeNanos);
}

public long delayNanos(long currentTimeNanos) {
    return Math.max(0, deadlineNanos() - (currentTimeNanos - START_TIME));
}

public long deadlineNanos() {
    return deadlineNanos;
}

1、delayNanos(currentTimeNanos): 在父类SingleThreadEventExecutor中有一个延迟执行任务的队列,delayNanos就是去这个延迟队列里看是否有非IO任务未执行

  • 如果没有则返回1秒钟。
  • 如果延迟队列里有任务并且最终的计算出来的时间(selectDeadLineNanos - currentTimeNanos)小于500000L纳秒,就调用selectNow()直接返回,反之执行阻塞的select

2、select如果遇到以下几种情况会立即返回

if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
                    // - Selected something,
                    // - waken up by user, or
                    // - the task queue has a pending task.
                    // - a scheduled task is ready for processing
                    break;
                }
  1. Selected something 如果select到了就绪连接(selectedKeys > 0)
  2. waken up by user 被用户唤醒了
  3. the task queue has a pending task.任务队列来了一个新任务
  4. a scheduled task is ready for processing 延迟队列里面有个预约任务需要到期执行

3、selectCnt: 记录select空转的次数(selectCnt),该方法解决了Nio中臭名昭著selector的select方法导致cpu100%的BUG,当空转的次数超过了512(定义一个阀值,这个阀值默认是512,可以在应用层通过设置系统属性io.netty.selectorAutoRebuildThreshold传入),Netty会重新构建新的Selector,将老Selector上注册的Channel转移到新建的Selector上,关闭老Selector,用新的Selector代替老Selector。详细看下面rebuildSelector()方法

4、rebuildSelector(): 就是上面说过得。

public void rebuildSelector() {
    if (!inEventLoop()) {
        execute(new Runnable() {
            @Override
            public void run() {
                rebuildSelector();
            }
        });
        return;
    }

    final Selector oldSelector = selector;
    final Selector newSelector;

    if (oldSelector == null) {
        return;
    }

    try {
        newSelector = openSelector();
    } catch (Exception e) {
        logger.warn("Failed to create a new Selector.", e);
        return;
    }

    // Register all channels to the new Selector.
    int nChannels = 0;
    for (;;) {
        try {
            for (SelectionKey key: oldSelector.keys()) {
                Object a = key.attachment();
                try {
                    if (!key.isValid() || key.channel().keyFor(newSelector) != null) {
                        continue;
                    }

                    int interestOps = key.interestOps();
                    key.cancel();
                    SelectionKey newKey = key.channel().register(newSelector, interestOps, a);
                    if (a instanceof AbstractNioChannel) {
                        // Update SelectionKey
                        ((AbstractNioChannel) a).selectionKey = newKey;
                    }
                    nChannels ++;
                } catch (Exception e) {
                    logger.warn("Failed to re-register a Channel to the new Selector.", e);
                    if (a instanceof AbstractNioChannel) {
                        AbstractNioChannel ch = (AbstractNioChannel) a;
                        ch.unsafe().close(ch.unsafe().voidPromise());
                    } else {
                        @SuppressWarnings("unchecked")
                        NioTask task = (NioTask) a;
                        invokeChannelUnregistered(task, key, e);
                    }
                }
            }
        } catch (ConcurrentModificationException e) {
            // Probably due to concurrent modification of the key set.
            continue;
        }

        break;
    }

    selector = newSelector;

    try {
        // time to close the old selector as everything else is registered to the new one
        oldSelector.close();
    } catch (Throwable t) {
        if (logger.isWarnEnabled()) {
            logger.warn("Failed to close the old Selector.", t);
        }
    }

    logger.info("Migrated " + nChannels + " channel(s) to the new Selector.");
}

你可能感兴趣的:(Netty线程源码分析(一))