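Reading Java Source Code: Netty In Depth, Part 4: NioEventLoop
An analysis of how Netty's reactor thread works, covering event detection, event handling, ordinary task processing, and scheduled task processing.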
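4-1 NioEventLoop overview
4-2 Overview of NioEventLoop creation
4-3 ThreadPerTaskExecutor
4-4 Creating the NioEventLoop thread
4-5 Creating the thread chooser
4-6 Starting NioEventLoop
4-7 Overview of NioEventLoop execution
4-8 Detecting IO events
4-9 Handling IO events
4-10 Running tasks on the reactor thread
4-11 NioEventLoop summary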
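Reading source code is hard going at first: dense and dry. After a while it clicks, and it becomes hard not to admire how the code is put together.
The next component to read is one of Netty's most important: NioEventLoop.
The analysis is organized as follows.
I. NioEventLoop source code
1. Creating NioEventLoop
2. Starting NioEventLoop
3. NioEventLoop execution logic
II. Questions to answer:
1. By default, how many threads does a Netty server start, and when are they started?
2. How does Netty work around the JDK empty-polling bug?
3. How does Netty achieve asynchronous, serialized, lock-free execution?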
-------------------------------------------------------------------------------------------
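Netty source reading: NioEventLoop creation
Starting from new NioEventLoopGroup(), creating the NioEventLoop instances involves three steps:
1. Create the thread executor: new ThreadPerTaskExecutor()
2. Construct each NioEventLoop: for { newChild() }
3. Create the thread chooser: chooserFactory.newChooser()
Stepping down through the constructors from new NioEventLoopGroup() eventually leads to this code: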
protected MultithreadEventExecutorGroup(int nThreads, Executor executor,
EventExecutorChooserFactory chooserFactory, Object... args) {
...
if (executor == null) {
executor = new ThreadPerTaskExecutor(newDefaultThreadFactory());
}
children = new EventExecutor[nThreads];
for (int i = 0; i < nThreads; i ++) {
boolean success = false;
try {
children[i] = newChild(executor, args);
success = true;
} catch (Exception e) {
// TODO: Think about if this is a good exception type
throw new IllegalStateException("failed to create a child event loop", e);
} finally {
...
}
chooser = chooserFactory.newChooser(children);
...
}
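These are the three steps just listed. We go through them one at a time.
I. Creating the thread executor: new ThreadPerTaskExecutor()
1. Every call to execute() creates a new thread
2. Event loop threads are named nioEventLoopGroup-(pool number)-(thread number within that pool)
Stepping down from new NioEventLoopGroup() we see: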
public NioEventLoopGroup(int nThreads, Executor executor) {
this(nThreads, executor, SelectorProvider.provider());
}
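In other words, each NioEventLoopGroup obtains a SelectorProvider here; every NioEventLoop later opens its own Selector from it (see selector = openSelector() in the NioEventLoop constructor further down).
Going one level deeper: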
protected MultithreadEventLoopGroup(int nThreads, Executor executor, Object... args) {
super(nThreads == 0 ? DEFAULT_EVENT_LOOP_THREADS : nThreads, executor, args);
}
If no thread count is given, that is, nThreads is 0, DEFAULT_EVENT_LOOP_THREADS is used. It is defined as:
private static final int DEFAULT_EVENT_LOOP_THREADS;
static {
DEFAULT_EVENT_LOOP_THREADS = Math.max(1, SystemPropertyUtil.getInt(
"io.netty.eventLoopThreads", Runtime.getRuntime().availableProcessors() * 2));
if (logger.isDebugEnabled()) {
logger.debug("-Dio.netty.eventLoopThreads: {}", DEFAULT_EVENT_LOOP_THREADS);
}
}
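The default is twice the number of available processors (CPU cores), unless overridden with -Dio.netty.eventLoopThreads.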
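The default can be reproduced with the JDK alone. The following is a minimal sketch, not Netty's code: the class name DefaultThreadsSketch is made up for illustration, and Integer.getInteger stands in for Netty's SystemPropertyUtil.getInt.

```java
// Hypothetical sketch; mirrors the static initializer above using only the JDK.
final class DefaultThreadsSketch {
    /** Returns the configured value, or 2 x available processors, never less than 1. */
    static int defaultEventLoopThreads() {
        int fallback = Runtime.getRuntime().availableProcessors() * 2;
        // Integer.getInteger reads a system property and parses it as an int,
        // returning the fallback when the property is missing or malformed.
        return Math.max(1, Integer.getInteger("io.netty.eventLoopThreads", fallback));
    }

    public static void main(String[] args) {
        System.out.println("default event loop threads = " + defaultEventLoopThreads());
    }
}
```

Running it with -Dio.netty.eventLoopThreads=4 should print 4 regardless of the core count.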
Continuing down, we arrive at the code we started with:
if (executor == null) {
executor = new ThreadPerTaskExecutor(newDefaultThreadFactory());
}
Here is the definition of ThreadPerTaskExecutor:
public final class ThreadPerTaskExecutor implements Executor {
private final ThreadFactory threadFactory;
public ThreadPerTaskExecutor(ThreadFactory threadFactory) {
if (threadFactory == null) {
throw new NullPointerException("threadFactory");
}
this.threadFactory = threadFactory;
}
@Override
public void execute(Runnable command) {
threadFactory.newThread(command).start();
}
}
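It takes a ThreadFactory and uses it to create threads.
This answers part of an earlier question, namely when a NioEventLoop's thread is created: it happens when ThreadPerTaskExecutor.execute() is called with a Runnable. Each call to execute() creates and starts a new thread.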
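To see the thread-per-task behavior in isolation, here is a small sketch that uses a plain JDK Executor instead of Netty's class. ThreadPerTaskSketch and its run method are names invented for this example.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;

final class ThreadPerTaskSketch {
    /** Runs n tasks on a thread-per-task executor and returns the distinct thread names seen. */
    static Set<String> run(int n) {
        // Same idea as ThreadPerTaskExecutor.execute(): every task gets a brand-new thread.
        Executor executor = command -> new Thread(command).start();
        Set<String> names = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(n);
        for (int i = 0; i < n; i++) {
            executor.execute(() -> {
                names.add(Thread.currentThread().getName());
                done.countDown();
            });
        }
        try {
            done.await(); // wait until every task has recorded its thread name
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("distinct threads used: " + run(3).size());
    }
}
```

Because each NioEventLoop calls execute() on this executor only once, when it starts, the result in Netty is one long-lived thread per event loop rather than one thread per submitted task.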
Back in newDefaultThreadFactory(): its implementation shows that the threadFactory above is a DefaultThreadFactory, which has this constructor:
public DefaultThreadFactory(Class<?> poolType, boolean daemon, int priority) {
this(toPoolName(poolType), daemon, priority);
}
toPoolName(poolType) is implemented as follows:
public static String toPoolName(Class<?> poolType) {
if (poolType == null) {
throw new NullPointerException("poolType");
}
String poolName = StringUtil.simpleClassName(poolType);
switch (poolName.length()) {
case 0:
return "unknown";
case 1:
return poolName.toLowerCase(Locale.US);
default:
if (Character.isUpperCase(poolName.charAt(0)) && Character.isLowerCase(poolName.charAt(1))) {
return Character.toLowerCase(poolName.charAt(0)) + poolName.substring(1);
} else {
return poolName;
}
}
}
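The return value is therefore nioEventLoopGroup: newDefaultThreadFactory() passes getClass() as poolType, which for a NioEventLoopGroup is NioEventLoopGroup, and toPoolName() lowercases the first letter of the simple class name.
Clicking through further leads to another constructor: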
public DefaultThreadFactory(String poolName, boolean daemon, int priority, ThreadGroup threadGroup) {
if (poolName == null) {
throw new NullPointerException("poolName");
}
if (priority < Thread.MIN_PRIORITY || priority > Thread.MAX_PRIORITY) {
throw new IllegalArgumentException(
"priority: " + priority + " (expected: Thread.MIN_PRIORITY <= priority <= Thread.MAX_PRIORITY)");
}
prefix = poolName + '-' + poolId.incrementAndGet() + '-';
this.daemon = daemon;
this.priority = priority;
this.threadGroup = threadGroup;
}
The constructor increments the pool id and places it between two hyphens after the pool name. The thread name is then completed in this class's newThread():
@Override
public Thread newThread(Runnable r) {
Thread t = newThread(new DefaultRunnableDecorator(r), prefix + nextId.incrementAndGet());
...
return t;
}
The thread itself is Netty's own FastThreadLocalThread:
protected Thread newThread(Runnable r, String name) {
return new FastThreadLocalThread(threadGroup, r, name);
}
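The naming scheme above (pool name, pool id, thread id) can be sketched with a small JDK-only ThreadFactory. NamingFactorySketch is a hypothetical name, and the sketch leaves out the daemon flag, priority, thread group, and FastThreadLocalThread.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class NamingFactorySketch implements ThreadFactory {
    private static final AtomicInteger POOL_ID = new AtomicInteger(); // shared by all factories
    private final AtomicInteger nextId = new AtomicInteger();         // per-factory thread counter
    private final String prefix;

    NamingFactorySketch(String poolName) {
        // e.g. "nioEventLoopGroup-2-": the pool id is fixed when the factory is created
        prefix = poolName + '-' + POOL_ID.incrementAndGet() + '-';
    }

    @Override
    public Thread newThread(Runnable r) {
        // e.g. "nioEventLoopGroup-2-1", "nioEventLoopGroup-2-2", ...
        return new Thread(r, prefix + nextId.incrementAndGet());
    }

    public static void main(String[] args) {
        ThreadFactory factory = new NamingFactorySketch("nioEventLoopGroup");
        System.out.println(factory.newThread(() -> { }).getName());
        System.out.println(factory.newThread(() -> { }).getName());
    }
}
```

Since the pool counter is static, the pool number in a thread name depends on how many factories were created earlier in the same JVM.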
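II. Constructing each NioEventLoop: for { newChild() }
This step does three things:
1. Stores the ThreadPerTaskExecutor created above
2. Creates an MpscQueue
3. Creates a selector
The original post showed a class diagram at this point.
newChild() returns a NioEventLoop, which extends SingleThreadEventLoop and, through it, SingleThreadEventExecutor: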
protected SingleThreadEventExecutor(EventExecutorGroup parent, Executor executor,
boolean addTaskWakesUp, int maxPendingTasks,
RejectedExecutionHandler rejectedHandler) {
super(parent);
this.addTaskWakesUp = addTaskWakesUp;
this.maxPendingTasks = Math.max(16, maxPendingTasks);
this.executor = ObjectUtil.checkNotNull(executor, "executor");
taskQueue = newTaskQueue(this.maxPendingTasks);
rejectedExecutionHandler = ObjectUtil.checkNotNull(rejectedHandler, "rejectedHandler");
}
Here the executor that was passed in is stored in the event loop.
Next, newTaskQueue() creates an MPSC queue:
@Override
protected Queue<Runnable> newTaskQueue(int maxPendingTasks) {
// This event loop never calls takeTask()
return PlatformDependent.newMpscQueue(maxPendingTasks);
}
Its definition:
/**
* Create a new {@link Queue} which is safe to use for multiple producers (different threads) and a single
* consumer (one thread!).
*/
public static <T> Queue<T> newMpscQueue(int maxCapacity) {
return Mpsc.newMpscQueue(maxCapacity);
}
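That is, many producers and a single consumer: any thread may add tasks, but only the event loop thread takes them out.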
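The usage pattern can be illustrated without Netty. The JDK has no dedicated MPSC queue, so this sketch substitutes ConcurrentLinkedQueue (an assumption made for the example, not what Netty uses), and the class and method names are made up.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class MpscUsageSketch {
    /** Several producer threads enqueue tasks; one consumer thread runs them all. */
    static int produceAndDrain(int producers, int tasksPerProducer) {
        Queue<Runnable> taskQueue = new ConcurrentLinkedQueue<>(); // stand-in for the MPSC queue
        int[] executed = {0}; // only ever touched by the single consumer, so no lock is needed
        Thread[] threads = new Thread[producers];
        for (int p = 0; p < producers; p++) {
            threads[p] = new Thread(() -> {
                for (int i = 0; i < tasksPerProducer; i++) {
                    taskQueue.offer(() -> executed[0]++);
                }
            });
            threads[p].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Single consumer: the calling thread plays the role of the event loop thread.
        Runnable task;
        while ((task = taskQueue.poll()) != null) {
            task.run();
        }
        return executed[0];
    }

    public static void main(String[] args) {
        System.out.println("tasks executed: " + produceAndDrain(4, 1000));
    }
}
```

The counter needs no synchronization because every task body runs on the one consumer thread, which is the same reasoning behind Netty's lock-free, serialized execution.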
The selector is created here:
NioEventLoop(NioEventLoopGroup parent, Executor executor, SelectorProvider selectorProvider,
SelectStrategy strategy, RejectedExecutionHandler rejectedExecutionHandler) {
super(parent, executor, false, DEFAULT_MAX_PENDING_TASKS, rejectedExecutionHandler);
if (selectorProvider == null) {
throw new NullPointerException("selectorProvider");
}
if (strategy == null) {
throw new NullPointerException("selectStrategy");
}
provider = selectorProvider;
selector = openSelector();
selectStrategy = strategy;
}
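III. Creating the thread chooser: chooserFactory.newChooser()
The round-robin selection of event loops is optimized here.
isPowerOfTwo() checks whether the number of event loops is a power of two.
1. If it is
PowerOfTwoEventExecutorChooser is used (this is the optimization; the class is spelled PowerOfTowEventExecutorChooser in this version of the source), selecting with:
index++ & (length - 1)
2. If it is not
GenericEventExecutorChooser is used, selecting with:
abs(index++ % length)
The code speaks for itself: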
public final class DefaultEventExecutorChooserFactory implements EventExecutorChooserFactory {
public static final DefaultEventExecutorChooserFactory INSTANCE = new DefaultEventExecutorChooserFactory();
private DefaultEventExecutorChooserFactory() { }
@SuppressWarnings("unchecked")
@Override
public EventExecutorChooser newChooser(EventExecutor[] executors) {
if (isPowerOfTwo(executors.length)) {
return new PowerOfTowEventExecutorChooser(executors);
} else {
return new GenericEventExecutorChooser(executors);
}
}
private static boolean isPowerOfTwo(int val) {
return (val & -val) == val;
}
private static final class PowerOfTowEventExecutorChooser implements EventExecutorChooser {
private final AtomicInteger idx = new AtomicInteger();
private final EventExecutor[] executors;
PowerOfTowEventExecutorChooser(EventExecutor[] executors) {
this.executors = executors;
}
@Override
public EventExecutor next() {
return executors[idx.getAndIncrement() & executors.length - 1];
}
}
private static final class GenericEventExecutorChooser implements EventExecutorChooser {
private final AtomicInteger idx = new AtomicInteger();
private final EventExecutor[] executors;
GenericEventExecutorChooser(EventExecutor[] executors) {
this.executors = executors;
}
@Override
public EventExecutor next() {
return executors[Math.abs(idx.getAndIncrement() % executors.length)];
}
}
}
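Why it is faster: & is a single bitwise operation, whereas % has no equally simple bit-level implementation and generally requires an integer division, which is slower.
Why it is correct: the goal is to wrap around to the first element after reaching the last one. The generic version clearly does that, so let us look at PowerOfTwoEventExecutorChooser:
Suppose there are 16 threads. 16 is 10000 in binary and length - 1 is 1111. Thread indexes start at 0, and idx also starts at 0.
1. On the 15th call idx is 1110 and the result is 1110. Since indexes start at 0, that is the 15th thread. Correct.
2. On the 16th call idx is 1111 and the result is 1111, the 16th thread. Correct.
3. The key case is the 17th call: idx is 10000 and the result is 0000, the 1st thread. Correct.
4. On the 18th call idx is 10001 and the result is 0001, the 2nd thread. Correct.
Neat, isn't it?
For a power-of-two length, every bit of length - 1 below the top bit is 1. The & clears all higher bits of idx and keeps the low ones, so the index wraps back to 0.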
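The walk-through above is easy to check mechanically. This sketch extracts the two index formulas into static methods; the names ChooserMathSketch, maskIndex, and modIndex are made up for the example.

```java
final class ChooserMathSketch {
    /** Same test as DefaultEventExecutorChooserFactory.isPowerOfTwo. */
    static boolean isPowerOfTwo(int val) {
        return (val & -val) == val;
    }

    /** Power-of-two variant: keep only the low bits of idx. */
    static int maskIndex(int idx, int length) {
        return idx & (length - 1);
    }

    /** Generic variant: remainder, made non-negative. */
    static int modIndex(int idx, int length) {
        return Math.abs(idx % length);
    }

    public static void main(String[] args) {
        int length = 16;
        for (int idx = 14; idx <= 17; idx++) {
            System.out.println("idx=" + Integer.toBinaryString(idx)
                    + " mask=" + maskIndex(idx, length)
                    + " mod=" + modIndex(idx, length));
        }
    }
}
```

For idx values 14 through 17 and a length of 16, both methods yield 14, 15, 0, 1, matching cases 1 to 4 above.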
---------------------------------------------------------------------------------------------------------------------------------------------------------
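Netty source reading: NioEventLoop startup
Two things can trigger the start of a NioEventLoop:
1. The server binds a port during startup
2. A newly accepted connection is assigned a NioEventLoop by the chooser
Here we cover the first; the second is left to a later article.
The entry point for the first path is bind() in user code. After initAndRegister() there is a call to doBind0(); inside it we see execute() being called on the NioEventLoop the channel is registered with: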
channel.eventLoop().execute(new Runnable() {
@Override
public void run() {
if (regFuture.isSuccess()) {
channel.bind(localAddress, promise).addListener(ChannelFutureListener.CLOSE_ON_FAILURE);
} else {
promise.setFailure(regFuture.cause());
}
}
});
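execute() both starts the thread and enqueues the task. Starting the thread goes through the execute() method of the ThreadPerTaskExecutor passed in earlier, which creates a thread, binds it to the event loop, and runs the loop:
1. thread = Thread.currentThread()
2. NioEventLoop.run()
The second step is covered in detail in the next article. Now, step by step:
The execute() method that NioEventLoop inherits: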
@Override
public void execute(Runnable task) {
if (task == null) {
throw new NullPointerException("task");
}
boolean inEventLoop = inEventLoop();
if (inEventLoop) {
addTask(task);
} else {
startThread();
addTask(task);
if (isShutdown() && removeTask(task)) {
reject();
}
}
if (!addTaskWakesUp && wakesUpForTask(task)) {
wakeup(inEventLoop);
}
}
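inEventLoop() checks whether the thread submitting the task is the NioEventLoop's own thread. If it is not, startThread() is called, which starts a thread and binds it to the NioEventLoop if that has not happened yet, and then the task is put on the task queue. Otherwise the task is simply added to the task queue (the MpscQueue from the previous article). In this case the submitting thread is the main thread and the NioEventLoop's thread has not been started, so startThread() does start it.
Following startThread() leads to doStartThread():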
private void doStartThread() {
assert thread == null;
executor.execute(new Runnable() {
@Override
public void run() {
thread = Thread.currentThread();
...
try {
SingleThreadEventExecutor.this.run();
success = true;
} catch (Throwable t) {
logger.warn("Unexpected exception from an event executor: ", t);
} finally {
...
}
}
});
}
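This calls ThreadPerTaskExecutor.execute(), and the newly created thread is stored in the NioEventLoop's thread field. SingleThreadEventExecutor.this.run() then does the actual work of the NioEventLoop.
The addTask() call that follows the thread start is simple: it puts the task on the MPSC queue created earlier by newMpscQueue():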
protected void addTask(Runnable task) {
if (task == null) {
throw new NullPointerException("task");
}
if (!offerTask(task)) {
reject(task);
}
}
final boolean offerTask(Runnable task) {
if (isShutdown()) {
reject();
}
return taskQueue.offer(task);
}
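The start-on-first-execute pattern can be sketched in a few lines of plain Java. This is a simplified illustration under stated assumptions: LazyLoopSketch is a made-up class, it blocks on a LinkedBlockingQueue where Netty blocks on a selector, and it omits shutdown and rejection handling.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

final class LazyLoopSketch implements Executor {
    private final BlockingQueue<Runnable> taskQueue = new LinkedBlockingQueue<>();
    private final AtomicBoolean started = new AtomicBoolean();
    private volatile Thread thread; // bound by the loop thread itself when it starts

    boolean inEventLoop() {
        return Thread.currentThread() == thread;
    }

    @Override
    public void execute(Runnable task) {
        // Only an outside caller can need to start the thread, and only the first one wins the CAS.
        if (!inEventLoop() && started.compareAndSet(false, true)) {
            Thread t = new Thread(this::runLoop, "loop-sketch");
            t.setDaemon(true); // lets the JVM exit; this sketch has no shutdown logic
            t.start();
        }
        taskQueue.offer(task);
    }

    private void runLoop() {
        thread = Thread.currentThread(); // same binding step as in doStartThread()
        try {
            for (;;) {
                taskQueue.take().run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Submits a task and returns the name of the thread that ran it. */
    static String threadNameOf(LazyLoopSketch loop) {
        CompletableFuture<String> result = new CompletableFuture<>();
        loop.execute(() -> result.complete(Thread.currentThread().getName()));
        return result.join();
    }

    public static void main(String[] args) {
        LazyLoopSketch loop = new LazyLoopSketch();
        System.out.println(threadNameOf(loop));
        System.out.println(threadNameOf(loop));
    }
}
```

No thread exists until the first execute() call, and every later task runs on that same thread.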
---------------------------------------------------------------------------------------------
Netty source reading: NioEventLoop execution
The article "Netty source reading: NioEventLoop startup" ended with this call, which is what the thread ultimately runs:
SingleThreadEventExecutor.this.run();
Here is the NioEventLoop implementation:
@Override
protected void run() {
for (;;) {
try {
switch (selectStrategy.calculateStrategy(selectNowSupplier, hasTasks())) {
case SelectStrategy.CONTINUE:
continue;
case SelectStrategy.SELECT:
select(wakenUp.getAndSet(false));
// 'wakenUp.compareAndSet(false, true)' is always evaluated
// before calling 'selector.wakeup()' to reduce the wake-up
// overhead. (Selector.wakeup() is an expensive operation.)
//
// However, there is a race condition in this approach.
// The race condition is triggered when 'wakenUp' is set to
// true too early.
//
// 'wakenUp' is set to true too early if:
// 1) Selector is waken up between 'wakenUp.set(false)' and
// 'selector.select(...)'. (BAD)
// 2) Selector is waken up between 'selector.select(...)' and
// 'if (wakenUp.get()) { ... }'. (OK)
//
// In the first case, 'wakenUp' is set to true and the
// following 'selector.select(...)' will wake up immediately.
// Until 'wakenUp' is set to false again in the next round,
// 'wakenUp.compareAndSet(false, true)' will fail, and therefore
// any attempt to wake up the Selector will fail, too, causing
// the following 'selector.select(...)' call to block
// unnecessarily.
//
// To fix this problem, we wake up the selector again if wakenUp
// is true immediately after selector.select(...).
// It is inefficient in that it wakes up the selector for both
// the first case (BAD - wake-up required) and the second case
// (OK - no wake-up required).
if (wakenUp.get()) {
selector.wakeup();
}
default:
// fallthrough
}
cancelledKeys = 0;
needsToSelectAgain = false;
final int ioRatio = this.ioRatio;
if (ioRatio == 100) {
try {
processSelectedKeys();
} finally {
// Ensure we always run tasks.
runAllTasks();
}
} else {
final long ioStartTime = System.nanoTime();
try {
processSelectedKeys();
} finally {
// Ensure we always run tasks.
final long ioTime = System.nanoTime() - ioStartTime;
runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
}
}
} catch (Throwable t) {
handleLoopException(t);
}
// Always handle shutdown even if the loop processing threw an exception.
try {
if (isShuttingDown()) {
closeAll();
if (confirmShutdown()) {
return;
}
}
} catch (Throwable t) {
handleLoopException(t);
}
}
}
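It really does only three things:
1. select(): checks for IO events
2. processSelectedKeys(): handles IO events, that is, the ready events on the SelectionKeys, such as accept, connect, read, and write
3. runAllTasks(): runs the tasks in the asynchronous task queue, that is, those added to taskQueue, such as bind and channelActive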
----------------------------------------------------------------------------------------------
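Netty source reading: NioEventLoop execution, select() checking for IO events
We start from select(wakenUp.getAndSet(false)) in "Netty source reading: NioEventLoop execution". Before selecting, the wakenUp flag is set to false, meaning no user wake-up is pending, and its previous value is passed in as oldWakenUp for later use. The method body: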
private void select(boolean oldWakenUp) throws IOException {
Selector selector = this.selector;
try {
int selectCnt = 0;
long currentTimeNanos = System.nanoTime();
long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos);
for (;;) {
long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
if (timeoutMillis <= 0) {
if (selectCnt == 0) {
selector.selectNow();
selectCnt = 1;
}
break;
}
// If a task was submitted when wakenUp value was true, the task didn't get a chance to call
// Selector#wakeup. So we need to check task queue again before executing select operation.
// If we don't, the task might be pended until select operation was timed out.
// It might be pended until idle timeout if IdleStateHandler existed in pipeline.
if (hasTasks() && wakenUp.compareAndSet(false, true)) {
selector.selectNow();
selectCnt = 1;
break;
}
int selectedKeys = selector.select(timeoutMillis);
selectCnt ++;
if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
// - Selected something,
// - waken up by user, or
// - the task queue has a pending task.
// - a scheduled task is ready for processing
break;
}
if (Thread.interrupted()) {
// Thread was interrupted so reset selected keys and break so we not run into a busy loop.
// As this is most likely a bug in the handler of the user or it's client library we will
// also log it.
//
// See https://github.com/netty/netty/issues/2426
if (logger.isDebugEnabled()) {
logger.debug("Selector.select() returned prematurely because " +
"Thread.currentThread().interrupt() was called. Use " +
"NioEventLoop.shutdownGracefully() to shutdown the NioEventLoop.");
}
selectCnt = 1;
break;
}
long time = System.nanoTime();
if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
// timeoutMillis elapsed without anything selected.
selectCnt = 1;
} else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
// The selector returned prematurely many times in a row.
// Rebuild the selector to work around the problem.
logger.warn(
"Selector.select() returned prematurely {} times in a row; rebuilding Selector {}.",
selectCnt, selector);
rebuildSelector();
selector = this.selector;
// Select again to populate selectedKeys.
selector.selectNow();
selectCnt = 1;
break;
}
currentTimeNanos = time;
}
if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS) {
if (logger.isDebugEnabled()) {
logger.debug("Selector.select() returned prematurely {} times in a row for Selector {}.",
selectCnt - 1, selector);
}
}
} catch (CancelledKeyException e) {
if (logger.isDebugEnabled()) {
logger.debug(CancelledKeyException.class.getSimpleName() + " raised by a Selector {} - JDK bug?",
selector, e);
}
// Harmless exception - log anyway
}
}
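It is long, so we analyze it in three parts:
1. Check whether a scheduled task is due and whether there are tasks waiting; if either is true, do a non-blocking select (the JDK's selectNow()) and return
2. Otherwise do a blocking select with a timeout: int selectedKeys = selector.select(timeoutMillis);
3. Finally, check whether this was an empty poll and whether the empty-poll count has reached the threshold. If it has, rebuild the selector; the new selector will most likely not show the problem.
One at a time:
1.
long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos); computes the deadline of the earliest scheduled task (the scheduled task queue is separate from the MPSC queue above and is covered later). If the time remaining until that deadline, rounded to milliseconds, is 0 or less, a scheduled task is due: the code does a non-blocking select if none has been done yet in this call, sets selectCnt to 1 (explained below), and breaks out of the loop;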
// If a task was submitted when wakenUp value was true, the task didn't get a chance to call
// Selector#wakeup. So we need to check task queue again before executing select operation.
// If we don't, the task might be pended until select operation was timed out.
// It might be pended until idle timeout if IdleStateHandler existed in pipeline.
if (hasTasks() && wakenUp.compareAndSet(false, true)) {
selector.selectNow();
selectCnt = 1;
break;
}
This branch also means there is work to do, and it does the same thing as above.
2.
If neither of the two checks fired, block for a bounded time:
int selectedKeys = selector.select(timeoutMillis);
selectCnt ++;
selectCnt is incremented for later use.
After the timed blocking select, any of the following conditions ends select():
if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
// - Selected something,
// - waken up by user, or
// - the task queue has a pending task.
// - a scheduled task is ready for processing
break;
}
3.
The following code decides whether an empty poll occurred:
long time = System.nanoTime();
if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
// timeoutMillis elapsed without anything selected.
selectCnt = 1;
}
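Put differently, move currentTimeNanos to the left-hand side: if the current time minus the time recorded before blocking is at least the requested timeout, the select really did block and no empty poll occurred, so selectCnt is reset to 1. Otherwise select() should have blocked that long but returned early, which is an empty poll. selectCnt can therefore be read as the number of consecutive empty polls, since a normal timeout (or any branch that found work) resets it to 1.
When the number of empty polls reaches SELECTOR_AUTO_REBUILD_THRESHOLD (512 by default, according to the source), the selector is rebuilt and selectCnt is reset to 1: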
else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
// The selector returned prematurely many times in a row.
// Rebuild the selector to work around the problem.
logger.warn(
"Selector.select() returned prematurely {} times in a row; rebuilding Selector {}.",
selectCnt, selector);
rebuildSelector();
selector = this.selector;
// Select again to populate selectedKeys.
selector.selectNow();
selectCnt = 1;
break;
}
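The counting logic can be separated from the selector and exercised on its own. The sketch below is an illustration only: PrematureReturnCounterSketch and onSelectReturned are hypothetical names, and the timestamps are passed in by the caller instead of being read from System.nanoTime().

```java
import java.util.concurrent.TimeUnit;

final class PrematureReturnCounterSketch {
    static final int REBUILD_THRESHOLD = 512; // the default cited in the text

    private int selectCnt;

    /**
     * Records one return from select(timeoutMillis).
     * Returns true when the selector should be rebuilt.
     */
    boolean onSelectReturned(long startNanos, long endNanos, long timeoutMillis) {
        selectCnt++;
        if (endNanos - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= startNanos) {
            selectCnt = 1; // blocked for the full timeout: a normal return
            return false;
        }
        if (selectCnt >= REBUILD_THRESHOLD) {
            selectCnt = 1; // too many early returns in a row: rebuild and start over
            return true;
        }
        return false;
    }

    int selectCnt() {
        return selectCnt;
    }

    public static void main(String[] args) {
        PrematureReturnCounterSketch counter = new PrematureReturnCounterSketch();
        int rounds = 0;
        boolean rebuild = false;
        while (!rebuild) {
            rebuild = counter.onSelectReturned(0L, 0L, 1000L); // returned instantly despite a 1 s timeout
            rounds++;
        }
        System.out.println("rebuild requested after " + rounds + " premature returns");
    }
}
```

A single return that took the full timeout resets the counter, so only an unbroken run of early returns reaches the threshold.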
Here is rebuildSelector():
public void rebuildSelector() {
if (!inEventLoop()) {
execute(new Runnable() {
@Override
public void run() {
rebuildSelector();
}
});
return;
}
final Selector oldSelector = selector;
final Selector newSelector;
if (oldSelector == null) {
return;
}
try {
newSelector = openSelector();
} catch (Exception e) {
logger.warn("Failed to create a new Selector.", e);
return;
}
// Register all channels to the new Selector.
int nChannels = 0;
for (;;) {
try {
for (SelectionKey key: oldSelector.keys()) {
Object a = key.attachment();
try {
if (!key.isValid() || key.channel().keyFor(newSelector) != null) {
continue;
}
int interestOps = key.interestOps();
key.cancel();
SelectionKey newKey = key.channel().register(newSelector, interestOps, a);
if (a instanceof AbstractNioChannel) {
// Update SelectionKey
((AbstractNioChannel) a).selectionKey = newKey;
}
nChannels ++;
} catch (Exception e) {
logger.warn("Failed to re-register a Channel to the new Selector.", e);
if (a instanceof AbstractNioChannel) {
AbstractNioChannel ch = (AbstractNioChannel) a;
ch.unsafe().close(ch.unsafe().voidPromise());
} else {
@SuppressWarnings("unchecked")
NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a;
invokeChannelUnregistered(task, key, e);
}
}
}
} catch (ConcurrentModificationException e) {
// Probably due to concurrent modification of the key set.
continue;
}
break;
}
selector = newSelector;
try {
// time to close the old selector as everything else is registered to the new one
oldSelector.close();
} catch (Throwable t) {
if (logger.isWarnEnabled()) {
logger.warn("Failed to close the old Selector.", t);
}
}
logger.info("Migrated " + nChannels + " channel(s) to the new Selector.");
}
In short, everything registered with the old selector is re-registered with a new one.
This is how Netty works around the JDK empty-polling bug.
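The core of the migration can be tried with java.nio alone. The following is a reduced sketch, not Netty's method: SelectorRebuildSketch is a made-up class, a Pipe stands in for a network channel so the example needs no sockets, and the error handling and retry loop of the real code are left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

final class SelectorRebuildSketch {
    /** Moves every valid registration from oldSelector to a freshly opened selector. */
    static Selector rebuild(Selector oldSelector) throws IOException {
        Selector newSelector = Selector.open();
        // Iterate over a copy so the loop does not depend on the live key set.
        for (SelectionKey key : oldSelector.keys().toArray(new SelectionKey[0])) {
            if (!key.isValid()) {
                continue;
            }
            int interestOps = key.interestOps();
            Object attachment = key.attachment();
            key.cancel(); // drop the old registration
            key.channel().register(newSelector, interestOps, attachment);
        }
        oldSelector.close();
        return newSelector;
    }

    /** Registers one channel, rebuilds, and reports how many keys the new selector holds. */
    static int migratedKeyCount() {
        try {
            Pipe pipe = Pipe.open();
            pipe.source().configureBlocking(false);
            Selector old = Selector.open();
            pipe.source().register(old, SelectionKey.OP_READ, "attachment");
            Selector fresh = rebuild(old);
            int count = fresh.keys().size();
            fresh.close();
            pipe.source().close();
            pipe.sink().close();
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("keys on the new selector: " + migratedKeyCount());
    }
}
```

Interest ops and attachments are carried over unchanged, which is why channels keep working after the switch.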
---------------------------------------------------------------------------------------------------------------------------
Netty source reading: NioEventLoop execution, processSelectedKeys()
As seen in "Netty source reading: NioEventLoop execution", select() is followed by processSelectedKeys().
There are two things to learn here:
1. The selected key set optimization
2. The logic of processSelectedKeysOptimized()
1. The selected key set optimization
In "Netty source reading: NioEventLoop creation" a selector was created in step II. Let us follow that creation, starting from selector = openSelector():
private Selector openSelector() {
final Selector selector;
try {
selector = provider.openSelector();
} catch (IOException e) {
throw new ChannelException("failed to open a new selector", e);
}
if (DISABLE_KEYSET_OPTIMIZATION) {
return selector;
}
final SelectedSelectionKeySet selectedKeySet = new SelectedSelectionKeySet();
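Object maybeSelectorImplClass = AccessController.doPrivileged(new PrivilegedAction<Object>() {
...
});
...
}
In one sentence: using reflection, Netty replaces the selector's selected-key set, originally a HashSet, with an array-backed set, so that add() goes from a possible worst case of O(n) to O(1).
In detail:
selector = provider.openSelector(); needs little comment: it asks the JDK to create the selector.
final SelectedSelectionKeySet selectedKeySet = new SelectedSelectionKeySet(); creates the key set, which is backed by arrays. The source of that class:
final class SelectedSelectionKeySet extends AbstractSet<SelectionKey> {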
private SelectionKey[] keysA;
private int keysASize;
private SelectionKey[] keysB;
private int keysBSize;
private boolean isA = true;
SelectedSelectionKeySet() {
keysA = new SelectionKey[1024];
keysB = keysA.clone();
}
@Override
public boolean add(SelectionKey o) {
if (o == null) {
return false;
}
if (isA) {
int size = keysASize;
keysA[size ++] = o;
keysASize = size;
if (size == keysA.length) {
doubleCapacityA();
}
} else {
...
}
return true;
}
private void doubleCapacityA() {
...
}
...
SelectionKey[] flip() {
if (isA) {
isA = false;
keysA[keysASize] = null;
keysBSize = 0;
return keysA;
} else {
isA = true;
keysB[keysBSize] = null;
keysASize = 0;
return keysB;
}
}
@Override
public int size() {
if (isA) {
return keysASize;
} else {
return keysBSize;
}
}
@Override
public boolean remove(Object o) {
return false;
}
@Override
public boolean contains(Object o) {
return false;
}
@Override
public Iterator<SelectionKey> iterator() {
throw new UnsupportedOperationException();
}
}
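The selected-key set never needs remove(), contains(), or iterator() on this path, so an array can stand in for it. And since the original set (a HashSet) and this class both extend AbstractSet, one can be substituted for the other.
Back to the original code:
Object maybeSelectorImplClass = AccessController.doPrivileged(new PrivilegedAction<Object>() {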
@Override
public Object run() {
try {
return Class.forName(
"sun.nio.ch.SelectorImpl",
false,
PlatformDependent.getSystemClassLoader());
} catch (ClassNotFoundException e) {
return e;
} catch (SecurityException e) {
return e;
}
}
});
if (!(maybeSelectorImplClass instanceof Class) ||
// ensure the current selector implementation is what we can instrument.
!((Class<?>) maybeSelectorImplClass).isAssignableFrom(selector.getClass())) {
if (maybeSelectorImplClass instanceof Exception) {
Exception e = (Exception) maybeSelectorImplClass;
logger.trace("failed to instrument a special java.util.Set into: {}", selector, e);
}
return selector;
}
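This obtains the selector implementation class via reflection and checks the selector's type against it. The source of SelectorImpl (decompiled here) shows that its key sets are indeed HashSets:
protected Set<SelectionKey> selectedKeys = new HashSet<SelectionKey>();
protected HashSet<SelectionKey> keys = new HashSet<SelectionKey>();
private Set<SelectionKey> publicKeys;
private Set<SelectionKey> publicSelectedKeys;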
protected SelectorImpl(SelectorProvider var1) {
super(var1);
if (Util.atBugLevel("1.4")) {
this.publicKeys = this.keys;
this.publicSelectedKeys = this.selectedKeys;
} else {
this.publicKeys = Collections.unmodifiableSet(this.keys);
this.publicSelectedKeys = Util.ungrowableSet(this.selectedKeys);
}
}
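HashSet is backed by a HashMap. Because of hash collisions, adding an element to a HashMap is O(n) in the worst case, whereas the array version always appends at the end, which is O(1) apart from the occasional doubling of the array.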
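To make the contrast concrete, here is a stripped-down array-backed set. It is a sketch under simplifying assumptions: ArrayBackedSetSketch is a hypothetical class with a single array, whereas the SelectedSelectionKeySet shown above keeps two arrays and flips between them.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.Iterator;

final class ArrayBackedSetSketch<E> extends AbstractSet<E> {
    private Object[] elements = new Object[4];
    private int size;

    @Override
    public boolean add(E e) {
        if (e == null) {
            return false;
        }
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2); // grow by doubling
        }
        elements[size++] = e; // plain append: no hashing, no duplicate check
        return true;
    }

    @Override
    public int size() {
        return size;
    }

    @Override
    public Iterator<E> iterator() {
        // Like the Netty class, readers are expected to walk the array directly.
        throw new UnsupportedOperationException();
    }

    Object elementAt(int index) {
        return elements[index];
    }

    public static void main(String[] args) {
        ArrayBackedSetSketch<String> keys = new ArrayBackedSetSketch<>();
        for (int i = 0; i < 10; i++) {
            keys.add("key-" + i);
        }
        System.out.println("size = " + keys.size() + ", last = " + keys.elementAt(9));
    }
}
```

Skipping the duplicate check is safe only because the selector adds each ready key at most once per select round; this is an assumption of the design, not something the class enforces.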
The code that follows is a standard reflection sequence:
Field selectedKeysField = selectorImplClass.getDeclaredField("selectedKeys");
Field publicSelectedKeysField = selectorImplClass.getDeclaredField("publicSelectedKeys");
selectedKeysField.setAccessible(true);
publicSelectedKeysField.setAccessible(true);
selectedKeysField.set(selector, selectedKeySet);
publicSelectedKeysField.set(selector, selectedKeySet);
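It looks up the fields on the selector implementation class and replaces the selector's selected-key fields with the new, faster set.
If any step of this fails, the original, unoptimized selector is returned.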
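The same reflection sequence can be tried on an ordinary class. In this sketch Holder is a hypothetical stand-in for the selector implementation; it is used because newer JDKs restrict reflective access to sun.nio.ch classes unless the JVM is started with suitable --add-opens options.

```java
import java.lang.reflect.Field;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

final class FieldSwapSketch {
    /** Hypothetical stand-in for a class that keeps a private Set field. */
    static final class Holder {
        private Set<String> selectedKeys = new HashSet<>();

        Set<String> view() {
            return selectedKeys;
        }
    }

    /** Replaces the private field via reflection; returns false if that is not possible. */
    static boolean swap(Holder target, Set<String> replacement) {
        try {
            Field field = Holder.class.getDeclaredField("selectedKeys");
            field.setAccessible(true);      // lift the private access check
            field.set(target, replacement); // install the replacement set
            return true;
        } catch (ReflectiveOperationException | RuntimeException e) {
            return false; // caller keeps using the unmodified object, as openSelector() does
        }
    }

    public static void main(String[] args) {
        Holder holder = new Holder();
        Set<String> replacement = new LinkedHashSet<>();
        System.out.println("swapped: " + swap(holder, replacement));
        System.out.println("same instance: " + (holder.view() == replacement));
    }
}
```

Returning false instead of throwing mirrors the fallback described above: a failed swap costs only the optimization, not correctness.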
2. The logic of processSelectedKeysOptimized()
In NioEventLoop, step into processSelectedKeys() and find the optimized implementation:
private void processSelectedKeysOptimized(SelectionKey[] selectedKeys) {
for (int i = 0;; i ++) {
final SelectionKey k = selectedKeys[i];
if (k == null) {
break;
}
// null out entry in the array to allow to have it GC'ed once the Channel close
// See https://github.com/netty/netty/issues/2363
selectedKeys[i] = null;
final Object a = k.attachment();
if (a instanceof AbstractNioChannel) {
processSelectedKey(k, (AbstractNioChannel) a);
} else {
@SuppressWarnings("unchecked")
NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a;
processSelectedKey(k, task);
}
if (needsToSelectAgain) {
// null out entries in the array to allow to have it GC'ed once the Channel close
// See https://github.com/netty/netty/issues/2363
for (;;) {
i++;
if (selectedKeys[i] == null) {
break;
}
selectedKeys[i] = null;
}
selectAgain();
// Need to flip the optimized selectedKeys to get the right reference to the array
// and reset the index to -1 which will then set to 0 on the for loop
// to start over again.
//
// See https://github.com/netty/netty/issues/1523
selectedKeys = this.selectedKeys.flip();
i = -1;
}
}
}
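This loop walks the array of selected keys and hands each key, together with its attachment, to processSelectedKey().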
-----------------------------------------------------------------------------------------------------------------------
Netty source reading: NioEventLoop execution, runAllTasks()
As seen in "Netty source reading: NioEventLoop execution", select() is followed by two steps:
processSelectedKeys() and runAllTasks()
final int ioRatio = this.ioRatio;
if (ioRatio == 100) {
try {
processSelectedKeys();
} finally {
// Ensure we always run tasks.
runAllTasks();
}
} else {
final long ioStartTime = System.nanoTime();
try {
processSelectedKeys();
} finally {
// Ensure we always run tasks.
final long ioTime = System.nanoTime() - ioStartTime;
runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
}
}
} catch (Throwable t) {
handleLoopException(t);
}
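ioRatio is the percentage of loop time intended for IO; it determines how long non-IO tasks may run relative to the time just spent on IO. Users can set it, and the default is 50, as the source shows. With 100, the IO events are processed first and then all pending non-IO tasks are run with no time limit. We look at the more involved second case.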
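The time budget passed to runAllTasks(long) follows directly from the formula in the code above. This sketch only wraps that arithmetic; the class and method names are invented for illustration.

```java
final class IoRatioSketch {
    /** Time allowed for non-IO tasks, given the time just spent on IO and the ioRatio percentage. */
    static long taskBudgetNanos(long ioTimeNanos, int ioRatio) {
        if (ioRatio <= 0 || ioRatio >= 100) {
            // 100 is handled by the separate branch that runs all tasks with no limit
            throw new IllegalArgumentException("ioRatio must be between 1 and 99: " + ioRatio);
        }
        return ioTimeNanos * (100 - ioRatio) / ioRatio;
    }

    public static void main(String[] args) {
        long ioTime = 1_000_000L; // 1 ms spent in processSelectedKeys()
        for (int ratio : new int[] {20, 50, 80}) {
            System.out.println("ioRatio=" + ratio + " -> task budget "
                    + taskBudgetNanos(ioTime, ratio) + " ns");
        }
    }
}
```

With the default of 50 the task budget equals the IO time; at 80 it shrinks to a quarter of it, and at 20 it grows to four times the IO time.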
runAllTasks(long timeoutNanos) involves three things:
1. How tasks are classified and added
2. How tasks are merged
3. How tasks are actually executed
The implementation of runAllTasks(long timeoutNanos):
protected boolean runAllTasks(long timeoutNanos) {
fetchFromScheduledTaskQueue();
Runnable task = pollTask();
if (task == null) {
afterRunningAllTasks();
return false;
}
final long deadline = ScheduledFutureTask.nanoTime() + timeoutNanos;
long runTasks = 0;
long lastExecutionTime;
for (;;) {
safeExecute(task);
runTasks ++;
// Check timeout every 64 tasks because nanoTime() is relatively expensive.
// XXX: Hard-coded value - will make it configurable if it is really a problem.
if ((runTasks & 0x3F) == 0) {
lastExecutionTime = ScheduledFutureTask.nanoTime();
if (lastExecutionTime >= deadline) {
break;
}
}
task = pollTask();
if (task == null) {
lastExecutionTime = ScheduledFutureTask.nanoTime();
break;
}
}
afterRunningAllTasks();
this.lastExecutionTime = lastExecutionTime;
return true;
}
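First, due scheduled tasks are pulled out; second, a task is taken from the task queue; third, safeExecute() runs it. (runTasks & 0x3F) == 0 is true on every 64th task, and only then is the clock checked: if the time budget for non-IO tasks is used up, the loop exits; otherwise it keeps taking and running tasks. The time of the last check is recorded in lastExecutionTime. When the loop ends, afterRunningAllTasks() does the cleanup.
The source comments explain the 64: why not check after every task? Because reading the clock is itself relatively expensive. The value is hard-coded, and the comment says it will be made configurable if that turns out to be a real problem.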
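The every-64-tasks check is easy to isolate. Below is a minimal sketch with hypothetical names (BudgetedRunSketch, runTasks) that uses System.nanoTime() directly; it is not Netty's method and omits the scheduled-task merging.

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class BudgetedRunSketch {
    /**
     * Runs queued tasks until the queue is empty or, at a 64-task checkpoint,
     * the time budget turns out to be used up. Returns the number of tasks run.
     */
    static long runTasks(Queue<Runnable> queue, long timeoutNanos) {
        long deadline = System.nanoTime() + timeoutNanos;
        long ran = 0;
        for (Runnable task; (task = queue.poll()) != null; ) {
            task.run();
            ran++;
            // (ran & 0x3F) == 0 holds for 64, 128, 192, ...: the clock is read only then.
            if ((ran & 0x3F) == 0 && System.nanoTime() - deadline >= 0) {
                break;
            }
        }
        return ran;
    }

    public static void main(String[] args) {
        Queue<Runnable> queue = new ArrayDeque<>();
        for (int i = 0; i < 200; i++) {
            queue.add(() -> { });
        }
        long ran = runTasks(queue, 0L); // zero budget: stops at the first checkpoint
        System.out.println("ran " + ran + " tasks, " + queue.size() + " left");
    }
}
```

One consequence of checking only at multiples of 64 is that even a zero budget still lets up to 64 tasks run, so the budget is a soft limit.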
We start with fetchFromScheduledTaskQueue():
private boolean fetchFromScheduledTaskQueue() {
long nanoTime = AbstractScheduledEventExecutor.nanoTime();
Runnable scheduledTask = pollScheduledTask(nanoTime);
while (scheduledTask != null) {
if (!taskQueue.offer(scheduledTask)) {
// No space left in the task queue add it back to the scheduledTaskQueue so we pick it up again.
scheduledTaskQueue().add((ScheduledFutureTask<?>) scheduledTask);
return false;
}
scheduledTask = pollScheduledTask(nanoTime);
}
return true;
}
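Each call takes the due tasks out of scheduledTaskQueue and adds them to taskQueue; this is the merging of tasks. If an add fails, most likely because taskQueue is full, the task is put back on the scheduled task queue, since otherwise it could be lost.
A closer look at pollScheduledTask():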
protected final Runnable pollScheduledTask(long nanoTime) {
assert inEventLoop();
Queue<ScheduledFutureTask<?>> scheduledTaskQueue = this.scheduledTaskQueue;
ScheduledFutureTask<?> scheduledTask = scheduledTaskQueue == null ? null : scheduledTaskQueue.peek();
if (scheduledTask == null) {
return null;
}
if (scheduledTask.deadlineNanos() <= nanoTime) {
scheduledTaskQueue.remove();
return scheduledTask;
}
return null;
}
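A due scheduled task is returned and removed from the scheduled task queue. The task that expires soonest is always at the head. For the queue itself, look at how this.scheduledTaskQueue is created:
Queue<ScheduledFutureTask<?>> scheduledTaskQueue() {
if (scheduledTaskQueue == null) {
scheduledTaskQueue = new PriorityQueue<ScheduledFutureTask<?>>();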
}
return scheduledTaskQueue;
}
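It is a priority queue, so its elements must be comparable. The comparison lives in the scheduled task class, ScheduledFutureTask:
@SuppressWarnings("ComparableImplementedButEqualsNotOverridden")
final class ScheduledFutureTask<V> extends PromiseTask<V> implements ScheduledFuture<V> {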
...
@Override
public int compareTo(Delayed o) {
if (this == o) {
return 0;
}
ScheduledFutureTask<?> that = (ScheduledFutureTask<?>) o;
long d = deadlineNanos() - that.deadlineNanos();
if (d < 0) {
return -1;
} else if (d > 0) {
return 1;
} else if (id < that.id) {
return -1;
} else if (id == that.id) {
throw new Error();
} else {
return 1;
}
}
...
}
The task with the smallest deadline sorts first; when deadlines are equal the ids are compared, which gives a strict, deterministic order.
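The ordering rule can be observed with a plain PriorityQueue. In this sketch Task is a hypothetical class, and Long.compare replaces the subtraction used in the Netty source; the deadlines are arbitrary numbers chosen for the demonstration.

```java
import java.util.PriorityQueue;
import java.util.concurrent.atomic.AtomicLong;

final class DeadlineOrderSketch {
    /** Hypothetical scheduled task: ordered by deadline, then by creation id. */
    static final class Task implements Comparable<Task> {
        private static final AtomicLong NEXT_ID = new AtomicLong();
        final long id = NEXT_ID.getAndIncrement();
        final long deadlineNanos;
        final String name;

        Task(String name, long deadlineNanos) {
            this.name = name;
            this.deadlineNanos = deadlineNanos;
        }

        @Override
        public int compareTo(Task that) {
            int byDeadline = Long.compare(deadlineNanos, that.deadlineNanos);
            // Equal deadlines fall back to the id, so earlier-created tasks come first.
            return byDeadline != 0 ? byDeadline : Long.compare(id, that.id);
        }
    }

    /** Adds tasks out of order and returns the order in which poll() hands them back. */
    static String drainOrder() {
        PriorityQueue<Task> queue = new PriorityQueue<>();
        queue.add(new Task("c", 300));
        queue.add(new Task("a1", 100));
        queue.add(new Task("a2", 100));
        queue.add(new Task("b", 200));
        StringBuilder order = new StringBuilder();
        while (!queue.isEmpty()) {
            order.append(queue.poll().name).append(' ');
        }
        return order.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(drainOrder()); // prints "a1 a2 b c"
    }
}
```

The two tasks with deadline 100 come out in creation order, which is what the id tie-break guarantees.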
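One more detail of NioEventLoop's scheduled tasks is in the schedule method of its parent class AbstractScheduledEventExecutor:
<V> ScheduledFuture<V> schedule(final ScheduledFutureTask<V> task) {
if (inEventLoop()) {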
scheduledTaskQueue().add(task);
} else {
execute(new Runnable() {
@Override
public void run() {
scheduledTaskQueue().add(task);
}
});
}
return task;
}
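If the thread submitting the scheduled task is the NioEventLoop's own thread, the task is added to the scheduled task queue directly. The interesting case is the other one: if the caller is some other thread, execute() is called with a Runnable that performs the add. No extra thread is created for this; the Runnable goes onto the task queue and is run by the NioEventLoop's thread, which execute() starts if it is not yet running. The scheduled task queue is a plain PriorityQueue with no synchronization, so confining every access to that one thread is what keeps it safe. It is a neat technique and worth dwelling on.
Next comes Runnable task = pollTask();. Following it down, it takes the task at the head of taskQueue, which by now may include scheduled tasks that were merged in.
protected final Runnable pollTaskFrom(Queue<Runnable> taskQueue) {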
for (;;) {
Runnable task = taskQueue.poll();
if (task == WAKEUP_TASK) {
continue;
}
return task;
}
}
Finally safeExecute(task), which simply calls the task's run() method:
/**
* Try to execute the given {@link Runnable} and just log if it throws a {@link Throwable}.
*/
protected static void safeExecute(Runnable task) {
try {
task.run();
} catch (Throwable t) {
logger.warn("A task raised an exception. Task: {}", task, t);
}
}
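If a task throws, the exception is logged and the other tasks are unaffected. In some code bases, by contrast, one failing task brings the rest down with it.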
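The isolation that safeExecute() provides can be shown with a short loop. SafeExecuteSketch is a made-up class, and it counts failures where Netty logs a warning.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class SafeExecuteSketch {
    /** Runs every task, swallowing failures; returns how many tasks threw. */
    static int runAll(List<Runnable> tasks) {
        int failures = 0;
        for (Runnable task : tasks) {
            try {
                task.run();
            } catch (Throwable t) {
                failures++; // Netty logs a warning at this point; the loop carries on
            }
        }
        return failures;
    }

    /** Returns {tasks that completed, tasks that threw} for a list whose middle task fails. */
    static int[] demo() {
        AtomicInteger completed = new AtomicInteger();
        List<Runnable> tasks = Arrays.asList(
                completed::incrementAndGet,
                () -> { throw new IllegalStateException("boom"); },
                completed::incrementAndGet);
        int failures = runAll(tasks);
        return new int[] {completed.get(), failures};
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("completed=" + result[0] + ", failed=" + result[1]);
    }
}
```

The task after the failing one still runs, which is the property the paragraph above describes.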