Here is an example of creating a Netty server:
EventLoopGroup bossGroup = new NioEventLoopGroup(1);
EventLoopGroup workerGroup = new NioEventLoopGroup();
try {
    ServerBootstrap b = new ServerBootstrap();
    b.group(bossGroup, workerGroup)
            .channel(NioServerSocketChannel.class)
            .childOption(ChannelOption.TCP_NODELAY, true)
            .childAttr(AttributeKey.newInstance("childAttr"), "childAttrValue")
            .handler(new ServerHandler())
            .childHandler(new ChannelInitializer<SocketChannel>() {
                @Override
                public void initChannel(SocketChannel ch) {
                    ch.pipeline().addLast(new AuthHandler());
                    //.................
                }
            });
    ChannelFuture f = b.bind(8888).sync();
    f.channel().closeFuture().sync();
} finally {
    bossGroup.shutdownGracefully();
    workerGroup.shutdownGracefully();
}
The first thing to notice in the example is these two lines:
EventLoopGroup bossGroup = new NioEventLoopGroup(1);
EventLoopGroup workerGroup = new NioEventLoopGroup();
One call passes an argument and the other does not. Tracing the no-argument constructor:
// NioEventLoopGroup.java
public NioEventLoopGroup() {
    this(0);
}
Tracing further leads to:
// MultithreadEventLoopGroup.java
protected MultithreadEventLoopGroup(int nThreads, Executor executor, Object... args) {
    super(nThreads == 0 ? DEFAULT_EVENT_LOOP_THREADS : nThreads, executor, args);
}
Following DEFAULT_EVENT_LOOP_THREADS shows that the no-argument constructor defaults to (number of threads = CPU cores * 2), unless the io.netty.eventLoopThreads system property overrides it:
// MultithreadEventLoopGroup.java
static {
    DEFAULT_EVENT_LOOP_THREADS = Math.max(1, SystemPropertyUtil.getInt("io.netty.eventLoopThreads", Runtime.getRuntime().availableProcessors() * 2));
    ........................
}
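The default-size rule above can be reproduced with plain JDK calls. This is a minimal sketch, not Netty code: the class DefaultThreads and its two helpers are hypothetical names, and Integer.getInteger stands in for Netty's SystemPropertyUtil.getInt.

```java
// Sketch only: models how nThreads == 0 falls back to a default pool size.
final class DefaultThreads {
    // propertyValue is the parsed system property, or null when it is not set (assumption).
    static int defaultThreads(Integer propertyValue, int cpuCores) {
        return Math.max(1, propertyValue != null ? propertyValue : cpuCores * 2);
    }

    // nThreads == 0 means "caller did not choose a size".
    static int resolve(int nThreads, int defaultThreads) {
        return nThreads == 0 ? defaultThreads : nThreads;
    }

    public static void main(String[] args) {
        Integer configured = Integer.getInteger("io.netty.eventLoopThreads");
        int cores = Runtime.getRuntime().availableProcessors();
        int def = defaultThreads(configured, cores);
        System.out.println("boss threads   = " + resolve(1, def));
        System.out.println("worker threads = " + resolve(0, def));
    }
}
```

With the property unset on a 4-core machine, the worker group in this sketch would resolve to 8 threads while the boss group stays at 1.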
Back in the MultithreadEventLoopGroup constructor, the call chain continues into the following constructor:
// MultithreadEventExecutorGroup.java
protected MultithreadEventExecutorGroup(int nThreads, Executor executor, EventExecutorChooserFactory chooserFactory, Object... args) {
    if (nThreads <= 0) {
        throw new IllegalArgumentException(String.format("nThreads: %d (expected: > 0)", nThreads));
    }
    // 1. Initialize the thread executor
    if (executor == null) {
        executor = new ThreadPerTaskExecutor(newDefaultThreadFactory());
    }
    children = new EventExecutor[nThreads];
    for (int i = 0; i < nThreads; i++) {
        boolean success = false;
        try {
            // 2. Create a NioEventLoop (which also creates its MpscQueue)
            children[i] = newChild(executor, args);
            success = true;
        } catch (Exception e) {
            // TODO: Think about if this is a good exception type
            throw new IllegalStateException("failed to create a child event loop", e);
        } finally {
            // On failure, shut down the event loops created so far and wait for them to terminate
            if (!success) {
                for (int j = 0; j < i; j++) {
                    children[j].shutdownGracefully();
                }
                for (int j = 0; j < i; j++) {
                    EventExecutor e = children[j];
                    try {
                        while (!e.isTerminated()) {
                            e.awaitTermination(Integer.MAX_VALUE, TimeUnit.SECONDS);
                        }
                    } catch (InterruptedException interrupted) {
                        // Let the caller handle the interruption.
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
            }
        }
    }
    // 3. Create the thread chooser
    chooser = chooserFactory.newChooser(children);
    final FutureListener<Object> terminationListener = new FutureListener<Object>() {
        @Override
        public void operationComplete(Future<Object> future) throws Exception {
            if (terminatedChildren.incrementAndGet() == children.length) {
                terminationFuture.setSuccess(null);
            }
        }
    };
    for (EventExecutor e : children) {
        e.terminationFuture().addListener(terminationListener);
    }
    Set<EventExecutor> childrenSet = new LinkedHashSet<EventExecutor>(children.length);
    Collections.addAll(childrenSet, children);
    readonlyChildren = Collections.unmodifiableSet(childrenSet);
}
1 The constructor above does a lot, so we start with how the thread executor is created. The entry point is:
executor = new ThreadPerTaskExecutor(newDefaultThreadFactory());
First, look at what its argument newDefaultThreadFactory() produces:
// DefaultThreadFactory.java
public DefaultThreadFactory(Class<?> poolType, boolean daemon, int priority) {
    this(toPoolName(poolType), daemon, priority);
}
Tracing further gives the constructor below; the comments explain what it does:
// DefaultThreadFactory.java
public DefaultThreadFactory(String poolName, boolean daemon, int priority, ThreadGroup threadGroup) {
    if (poolName == null) {
        throw new NullPointerException("poolName");
    }
    if (priority < Thread.MIN_PRIORITY || priority > Thread.MAX_PRIORITY) {
        throw new IllegalArgumentException("priority: " + priority + " (expected: Thread.MIN_PRIORITY <= priority <= Thread.MAX_PRIORITY)");
    }
    // Record the attributes of the threads to be created:
    // thread-name prefix (used later to build each thread name), daemon flag, priority, thread group
    prefix = poolName + '-' + poolId.incrementAndGet() + '-';
    this.daemon = daemon;
    this.priority = priority;
    this.threadGroup = threadGroup;
}
Now that we know what newDefaultThreadFactory() returns, what does new ThreadPerTaskExecutor(newDefaultThreadFactory()) look like?
// ThreadPerTaskExecutor.java
public final class ThreadPerTaskExecutor implements Executor {
    private final ThreadFactory threadFactory;

    public ThreadPerTaskExecutor(ThreadFactory threadFactory) {
        if (threadFactory == null) {
            throw new NullPointerException("threadFactory");
        }
        this.threadFactory = threadFactory;
    }

    @Override
    public void execute(Runnable command) {
        threadFactory.newThread(command).start();
    }
}
In short, ThreadPerTaskExecutor only wraps and stores a ThreadFactory and exposes execute(), which starts a new thread for every task. execute(Runnable command) calls threadFactory.newThread(command), which leads to:
// DefaultThreadFactory.java
public Thread newThread(Runnable r) {
    // 1. prefix is the value built above: prefix = poolName + '-' + poolId.incrementAndGet() + '-'.
    //    poolName comes from toPoolName(poolType), so for a NioEventLoopGroup the final thread names look like nioEventLoopGroup-1-xx
    Thread t = newThread(new DefaultRunnableDecorator(r), prefix + nextId.incrementAndGet());
    try {
        if (t.isDaemon()) {
            if (!daemon) {
                t.setDaemon(false);
            }
        } else {
            if (daemon) {
                t.setDaemon(true);
            }
        }
        if (t.getPriority() != priority) {
            t.setPriority(priority);
        }
    } catch (Exception ignored) {
        // Doesn't matter even if failed to set.
    }
    return t;
}
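To make the naming scheme and the thread-per-task behaviour concrete, here is a minimal sketch using only the JDK. PrefixThreadFactory and perTask are hypothetical stand-ins for DefaultThreadFactory and ThreadPerTaskExecutor; they model the prefix and counter logic only, not daemon, priority or thread-group handling, and "demoPool" is an arbitrary pool name.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class NamedThreadDemo {
    // Hypothetical stand-in for DefaultThreadFactory: only the naming logic is modelled.
    static final class PrefixThreadFactory implements ThreadFactory {
        private static final AtomicInteger POOL_ID = new AtomicInteger();
        private final AtomicInteger nextId = new AtomicInteger();
        private final String prefix;

        PrefixThreadFactory(String poolName) {
            // One pool id per factory instance, shared prefix for all of its threads.
            this.prefix = poolName + '-' + POOL_ID.incrementAndGet() + '-';
        }

        @Override
        public Thread newThread(Runnable r) {
            return new Thread(r, prefix + nextId.incrementAndGet());
        }
    }

    // Same shape as ThreadPerTaskExecutor: every execute() starts a brand-new thread.
    static Executor perTask(ThreadFactory factory) {
        return command -> factory.newThread(command).start();
    }

    public static void main(String[] args) {
        Executor executor = perTask(new PrefixThreadFactory("demoPool"));
        // Prints a name of the form demoPool-<poolId>-<threadId>.
        executor.execute(() -> System.out.println(Thread.currentThread().getName()));
    }
}
```

Because each execute() call creates a thread, an executor like this is only sensible when it is called rarely, which fits an event loop that starts its single thread once.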
2 With the thread executor covered, next we look at how a NioEventLoop is created:
try {
    children[i] = newChild(executor, args);
    ...................
Tracing newChild() leads to the following constructor:
NioEventLoop(NioEventLoopGroup parent, Executor executor, SelectorProvider selectorProvider, SelectStrategy strategy, RejectedExecutionHandler rejectedExecutionHandler) {
    super(parent, executor, false, DEFAULT_MAX_PENDING_TASKS, rejectedExecutionHandler);
    if (selectorProvider == null) {
        throw new NullPointerException("selectorProvider");
    }
    if (strategy == null) {
        throw new NullPointerException("selectStrategy");
    }
    // Store the arguments; openSelector() is analyzed later
    provider = selectorProvider;
    selector = openSelector();
    selectStrategy = strategy;
}
Following super() gives:
protected SingleThreadEventExecutor(EventExecutorGroup parent, Executor executor,
                                    boolean addTaskWakesUp, int maxPendingTasks,
                                    RejectedExecutionHandler rejectedHandler) {
    // Store the basics: the thread executor and the task queue (which holds tasks submitted by other threads)
    super(parent);
    this.addTaskWakesUp = addTaskWakesUp;
    this.maxPendingTasks = Math.max(16, maxPendingTasks);
    this.executor = ObjectUtil.checkNotNull(executor, "executor");
    taskQueue = newTaskQueue(this.maxPendingTasks); // for NioEventLoop: PlatformDependent.newMpscQueue(maxPendingTasks)
    rejectedExecutionHandler = ObjectUtil.checkNotNull(rejectedHandler, "rejectedHandler");
}
3 Creating the thread chooser
The chooser is created by this line:
// MultithreadEventExecutorGroup.java
chooser = chooserFactory.newChooser(children);
Tracing the source:
// DefaultEventExecutorChooserFactory.java
public EventExecutorChooser newChooser(EventExecutor[] executors) {
    if (isPowerOfTwo(executors.length)) {
        return new PowerOfTowEventExecutorChooser(executors);
    } else {
        return new GenericEventExecutorChooser(executors);
    }
}
As shown above, isPowerOfTwo(executors.length) decides which chooser is created: when the number of threads is a power of two (2, 4, 8, ...), the first, bitmask-based chooser is used; otherwise the generic one is used.
// DefaultEventExecutorChooserFactory.java
private static boolean isPowerOfTwo(int val) {
    // Bitwise test: val & -val isolates the lowest set bit, which equals val only when exactly one bit is set
    return (val & -val) == val;
}
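The bit trick is easy to check by hand. The following small sketch (the class name PowerOfTwoCheck is made up for this example) applies the same expression to a few sample values:

```java
final class PowerOfTwoCheck {
    // Same expression as in DefaultEventExecutorChooserFactory.
    // In two's complement, -val flips every bit above the lowest set bit,
    // so val & -val keeps only that lowest bit.
    static boolean isPowerOfTwo(int val) {
        return (val & -val) == val;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 6, 8, 12, 16}) {
            System.out.println(n + " -> " + isPowerOfTwo(n));
        }
    }
}
```

Note that 6 and 12 are even but not powers of two, so they do not pass the test: 6 & -6 is 2 and 12 & -12 is 4.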
With that in mind, here are the two chooser implementations:
// DefaultEventExecutorChooserFactory.java
PowerOfTowEventExecutorChooser(EventExecutor[] executors) {
    this.executors = executors;
}

@Override
public EventExecutor next() {
    // Bitmask instead of modulo, which is cheaper (used only when the thread count is a power of two)
    return executors[idx.getAndIncrement() & executors.length - 1];
}
// DefaultEventExecutorChooserFactory.java
GenericEventExecutorChooser(EventExecutor[] executors) {
    this.executors = executors;
}

@Override
public EventExecutor next() {
    // Modulo-based; works for any length but is generally slower than the bitmask version above (PowerOfTowEventExecutorChooser)
    return executors[Math.abs(idx.getAndIncrement() % executors.length)];
}
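The two index calculations can be compared side by side. This is an illustrative sketch with hypothetical helper names (maskIndex, moduloIndex); it only models the index arithmetic, not the chooser classes themselves.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class RoundRobinDemo {
    // Mask-based index: only valid when length is a power of two.
    static int maskIndex(int counter, int length) {
        return counter & (length - 1);
    }

    // Modulo-based index: works for any positive length.
    static int moduloIndex(int counter, int length) {
        return Math.abs(counter % length);
    }

    public static void main(String[] args) {
        AtomicInteger idx = new AtomicInteger();
        for (int i = 0; i < 6; i++) {
            int c = idx.getAndIncrement();
            System.out.println(c + ": mask(4)=" + maskIndex(c, 4) + " modulo(3)=" + moduloIndex(c, 3));
        }
        // After the counter overflows it becomes negative; both forms still yield a valid index.
        System.out.println("mask(-1, 4)=" + maskIndex(-1, 4) + " modulo(-1, 3)=" + moduloIndex(-1, 3));
    }
}
```

For non-negative counters and a power-of-two length, the mask gives the same result as the remainder, which is why the specialised chooser can skip the division.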
4 NioEventLoop startup flow
First, recall the Netty server example from the beginning:
EventLoopGroup bossGroup = new NioEventLoopGroup(1);
EventLoopGroup workerGroup = new NioEventLoopGroup();
try {
    ServerBootstrap b = new ServerBootstrap();
    b.group(bossGroup, workerGroup)
            .channel(NioServerSocketChannel.class)
            .childOption(ChannelOption.TCP_NODELAY, true)
            .childAttr(AttributeKey.newInstance("childAttr"), "childAttrValue")
            .handler(new ServerHandler())
            .childHandler(new ChannelInitializer<SocketChannel>() {
                @Override
                public void initChannel(SocketChannel ch) {
                    ch.pipeline().addLast(new AuthHandler());
                    //.................
                }
            });
    ChannelFuture f = b.bind(8888).sync();
    f.channel().closeFuture().sync();
} finally {
    bossGroup.shutdownGracefully();
    workerGroup.shutdownGracefully();
}
The entry point for the NioEventLoop startup flow is inside b.bind(8888). Tracing it leads to:
// AbstractBootstrap.java
private static void doBind0(
        final ChannelFuture regFuture, final Channel channel,
        final SocketAddress localAddress, final ChannelPromise promise) {
    // This method is invoked before channelRegistered() is triggered. Give user handlers a chance to set up
    // the pipeline in its channelRegistered() implementation.
    channel.eventLoop().execute(new Runnable() {
        @Override
        public void run() {
            if (regFuture.isSuccess()) {
                channel.bind(localAddress, promise).addListener(ChannelFutureListener.CLOSE_ON_FAILURE);
            } else {
                promise.setFailure(regFuture.cause());
            }
        }
    });
}
Following execute() leads to:
// SingleThreadEventExecutor.java
public void execute(Runnable task) {
    if (task == null) {
        throw new NullPointerException("task");
    }
    boolean inEventLoop = inEventLoop();
    if (inEventLoop) {
        addTask(task);
    } else {
        startThread();
        addTask(task);
        if (isShutdown() && removeTask(task)) {
            reject();
        }
    }
    if (!addTaskWakesUp && wakesUpForTask(task)) {
        wakeup(inEventLoop);
    }
}
Following startThread():
// SingleThreadEventExecutor.java
private void startThread() {
    if (STATE_UPDATER.get(this) == ST_NOT_STARTED) {
        if (STATE_UPDATER.compareAndSet(this, ST_NOT_STARTED, ST_STARTED)) {
            doStartThread();
        }
    }
}
As shown above, startThread() uses an optimistic CAS on the state field so that the thread is started at most once. Tracing on into doStartThread():
// SingleThreadEventExecutor.java
private void doStartThread() {
    assert thread == null;
    executor.execute(new Runnable() {
        @Override
        public void run() {
            // Remember the thread that runs this event loop
            thread = Thread.currentThread();
            if (interrupted) {
                thread.interrupt();
            }
            boolean success = false;
            updateLastExecutionTime();
            try {
                // This is where the NioEventLoop actually starts running
                SingleThreadEventExecutor.this.run();
                success = true;
            } catch (Throwable t) {
                logger.warn("Unexpected exception from an event executor: ", t);
            } finally {
                for (;;) {
                    int oldState = STATE_UPDATER.get(SingleThreadEventExecutor.this);
                    if (oldState >= ST_SHUTTING_DOWN || STATE_UPDATER.compareAndSet(SingleThreadEventExecutor.this, oldState, ST_SHUTTING_DOWN)) {
                        break;
                    }
                }
                // Check if confirmShutdown() was called at the end of the loop.
                if (success && gracefulShutdownStartTime == 0) {
                    logger.error("Buggy " + EventExecutor.class.getSimpleName() + " implementation; " + SingleThreadEventExecutor.class.getSimpleName() + ".confirmShutdown() must be called " +
                            "before run() implementation terminates.");
                }
                try {
                    // Run all remaining tasks and shutdown hooks.
                    for (;;) {
                        if (confirmShutdown()) {
                            break;
                        }
                    }
                } finally {
                    try {
                        cleanup();
                    } finally {
                        STATE_UPDATER.set(SingleThreadEventExecutor.this, ST_TERMINATED);
                        threadLock.release();
                        if (!taskQueue.isEmpty()) {
                            logger.warn("An event executor terminated with " + "non-empty task queue (" + taskQueue.size() + ')');
                        }
                        terminationFuture.setSuccess(null);
                    }
                }
            }
        }
    });
}
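The start-once guard in startThread() can be sketched in isolation. This is a simplified illustration, not Netty code: it uses an AtomicInteger instead of Netty's field updater, the class StartOnce is hypothetical, and a counter stands in for doStartThread().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

final class StartOnce {
    private static final int ST_NOT_STARTED = 1;
    private static final int ST_STARTED = 2;

    private final AtomicInteger state = new AtomicInteger(ST_NOT_STARTED);
    private final AtomicInteger starts = new AtomicInteger();

    // Returns true only for the single caller that wins the CAS.
    boolean startThread() {
        if (state.get() == ST_NOT_STARTED && state.compareAndSet(ST_NOT_STARTED, ST_STARTED)) {
            starts.incrementAndGet(); // stands in for doStartThread()
            return true;
        }
        return false;
    }

    int starts() {
        return starts.get();
    }

    public static void main(String[] args) throws InterruptedException {
        StartOnce loop = new StartOnce();
        int callers = 8;
        CountDownLatch done = new CountDownLatch(callers);
        for (int i = 0; i < callers; i++) {
            new Thread(() -> {
                loop.startThread();
                done.countDown();
            }).start();
        }
        done.await();
        System.out.println("doStartThread() ran " + loop.starts() + " time(s)");
    }
}
```

The cheap get() before compareAndSet() mirrors the original: once the loop is running, later execute() calls skip the CAS entirely.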
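What does the NioEventLoop do once it has started? See below.
4 How NioEventLoop runs
The entry point is the call SingleThreadEventExecutor.this.run(); following it into the NioEventLoop implementation gives: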
protected void run() {
    for (;;) {
        try {
            switch (selectStrategy.calculateStrategy(selectNowSupplier, hasTasks())) {
                case SelectStrategy.CONTINUE:
                    continue;
                case SelectStrategy.SELECT:
                    select(wakenUp.getAndSet(false)); // check for I/O events
                    if (wakenUp.get()) {
                        selector.wakeup();
                    }
                default:
                    // fallthrough
            }
            cancelledKeys = 0;
            needsToSelectAgain = false;
            final int ioRatio = this.ioRatio;
            if (ioRatio == 100) {
                try {
                    processSelectedKeys(); // handle I/O events
                } finally {
                    // Ensure we always run tasks.
                    runAllTasks(); // run the queued asynchronous tasks
                }
            } else {
                final long ioStartTime = System.nanoTime();
                try {
                    processSelectedKeys(); // handle I/O events
                } finally {
                    // Ensure we always run tasks.
                    final long ioTime = System.nanoTime() - ioStartTime;
                    runAllTasks(ioTime * (100 - ioRatio) / ioRatio); // run the queued asynchronous tasks within a time budget
                }
            }
        } catch (Throwable t) {
            handleLoopException(t);
        }
        // Always handle shutdown even if the loop processing threw an exception.
        try {
            if (isShuttingDown()) {
                closeAll();
                if (confirmShutdown()) {
                    return;
                }
            }
        } catch (Throwable t) {
            handleLoopException(t);
        }
    }
}
In the source above, each iteration of the NioEventLoop loop has three parts:
4.1 Check for I/O: select(wakenUp.getAndSet(false));
4.2 Handle I/O: processSelectedKeys();
4.3 Run the asynchronous task queue: runAllTasks();
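The time budget passed to runAllTasks(...) in the loop follows directly from the expression ioTime * (100 - ioRatio) / ioRatio. Here is a small worked sketch; the class and method names are made up for illustration, and the 8 ms I/O time is an arbitrary sample value.

```java
final class IoRatioBudget {
    // Same arithmetic as in run(): how long tasks may run, given the time just spent on I/O.
    static long taskBudgetNanos(long ioTimeNanos, int ioRatio) {
        return ioTimeNanos * (100 - ioRatio) / ioRatio;
    }

    public static void main(String[] args) {
        long ioTime = 8_000_000L; // assume 8 ms were spent in processSelectedKeys()
        for (int ratio : new int[] {20, 50, 80}) {
            System.out.println("ioRatio=" + ratio + " -> task budget " + taskBudgetNanos(ioTime, ratio) + " ns");
        }
    }
}
```

An ioRatio of 50 gives tasks as much time as I/O took, 80 gives them a quarter of it, and 20 gives them four times as much. The value 100 is handled separately in run() and applies no budget at all.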
4.1 Check for I/O: select(wakenUp.getAndSet(false))
private void select(boolean oldWakenUp) throws IOException {
    Selector selector = this.selector;
    try {
        int selectCnt = 0;
        long currentTimeNanos = System.nanoTime();
        // First deadline = current time + delay until the next scheduled task is due
        long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos);
        for (;;) {
            long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
            // If the deadline has passed and no select has run yet (selectCnt == 0),
            // do one non-blocking selectNow() and leave the loop
            if (timeoutMillis <= 0) {
                if (selectCnt == 0) {
                    selector.selectNow();
                    selectCnt = 1;
                }
                break;
            }
            // If a task was submitted by another thread in the meantime, do a non-blocking selectNow() and leave the loop
            if (hasTasks() && wakenUp.compareAndSet(false, true)) {
                selector.selectNow();
                selectCnt = 1;
                break;
            }
            // Neither non-blocking path was taken, so block in select(timeoutMillis) and wait for I/O events
            int selectedKeys = selector.select(timeoutMillis);
            selectCnt++;
            if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
                // - Selected something,
                // - waken up by user, or
                // - the task queue has a pending task.
                // - a scheduled task is ready for processing
                break;
            }
            // Check whether the current thread was interrupted
            if (Thread.interrupted()) {
                if (logger.isDebugEnabled()) {
                    logger.debug("Selector.select() returned prematurely because " + "Thread.currentThread().interrupt() was called. Use " + "NioEventLoop.shutdownGracefully() to shutdown the NioEventLoop.");
                }
                selectCnt = 1;
                break;
            }
            long time = System.nanoTime();
            // If at least timeoutMillis has elapsed, select() blocked for the full timeout:
            // that is a normal timeout, so the counter is reset to 1
            if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
                // timeoutMillis elapsed without anything selected.
                selectCnt = 1;
            } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 && selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
                // Otherwise select() returned early with nothing selected, which counts as an empty poll.
                // Netty caps the number of consecutive empty polls (512 by default); reaching the cap triggers the handling below
                // The selector returned prematurely many times in a row.
                // Rebuild the selector to work around the problem.
                logger.warn("Selector.select() returned prematurely {} times in a row; rebuilding Selector {}.", selectCnt, selector);
                // Key point: the workaround for the empty-poll bug is rebuildSelector(),
                // which discards the faulty selector and builds a new one
                rebuildSelector();
                selector = this.selector;
                // Select again to populate selectedKeys.
                selector.selectNow();
                selectCnt = 1;
                break;
            }
            currentTimeNanos = time;
        }
        if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS) {
            if (logger.isDebugEnabled()) {
                logger.debug("Selector.select() returned prematurely {} times in a row for Selector {}.", selectCnt - 1, selector);
            }
        }
    } catch (CancelledKeyException e) {
        if (logger.isDebugEnabled()) {
            logger.debug(CancelledKeyException.class.getSimpleName() + " raised by a Selector {} - JDK bug?", selector, e);
        }
        // Harmless exception - log anyway
    }
}
Note: Netty initializes the maximum number of consecutive empty polls in a static {} block:
int selectorAutoRebuildThreshold = SystemPropertyUtil.getInt("io.netty.selectorAutoRebuildThreshold", 512);
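The empty-poll bookkeeping inside select() can be isolated into a tiny state machine. The sketch below is an illustration under simplifying assumptions, not Netty's implementation: PrematureReturnCounter and onEmptyReturn are hypothetical names, the elapsed time is passed in rather than measured, and the threshold is lowered to 3 in main so the effect is visible.

```java
final class PrematureReturnCounter {
    private final int rebuildThreshold;
    private int selectCnt;

    PrematureReturnCounter(int rebuildThreshold) {
        this.rebuildThreshold = rebuildThreshold;
    }

    // Call after each select(timeout) that returned with nothing to do.
    // Returns true when the caller should rebuild the selector.
    boolean onEmptyReturn(long elapsedNanos, long timeoutNanos) {
        selectCnt++;
        if (elapsedNanos >= timeoutNanos) {
            // Genuine timeout: the call blocked for the full period, so start counting afresh.
            selectCnt = 1;
            return false;
        }
        if (rebuildThreshold > 0 && selectCnt >= rebuildThreshold) {
            // Too many premature returns in a row.
            selectCnt = 1;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        PrematureReturnCounter counter = new PrematureReturnCounter(3);
        long timeout = 1_000_000L; // 1 ms
        for (int i = 1; i <= 3; i++) {
            // elapsed = 0 simulates select() returning immediately with no keys
            System.out.println("empty return " + i + " -> rebuild? " + counter.onEmptyReturn(0L, timeout));
        }
    }
}
```

The key design point is that only returns that come back faster than the requested timeout are counted, so an idle server that simply times out repeatedly never triggers a rebuild.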
4.2 Handle I/O: processSelectedKeys()
The analysis is as follows:
// NioEventLoop.java
private void processSelectedKeys() {
    if (selectedKeys != null) {
        processSelectedKeysOptimized(selectedKeys.flip());
    } else {
        processSelectedKeysPlain(selector.selectedKeys());
    }
}
Tracing further ends at processSelectedKey(), which handles OP_CONNECT, OP_WRITE, OP_ACCEPT and OP_READ. Along the way Netty also optimizes the selected-key set, replacing the underlying Set with an array; interested readers can look that up in the source:
private void processSelectedKey(SelectionKey k, AbstractNioChannel ch) {
    // Validity check; the handling of invalid keys is omitted here
    .................................
    try {
        int readyOps = k.readyOps(); // get the ready I/O events
        // We first need to call finishConnect() before try to trigger a read(...) or write(...) as otherwise
        // the NIO JDK channel implementation may throw a NotYetConnectedException.
        if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
            int ops = k.interestOps();
            ops &= ~SelectionKey.OP_CONNECT;
            k.interestOps(ops);
            unsafe.finishConnect();
        }
        // Process OP_WRITE first as we may be able to write some queued buffers and so free memory.
        if ((readyOps & SelectionKey.OP_WRITE) != 0) {
            // Call forceFlush which will also take care of clear the OP_WRITE once there is nothing left to write
            ch.unsafe().forceFlush();
        }
        // Also check for readOps of 0 to workaround possible JDK bug which may otherwise lead
        // to a spin loop
        if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
            unsafe.read();
            if (!ch.isOpen()) {
                // Connection already closed - no need to handle write.
                return;
            }
        }
    } catch (CancelledKeyException ignored) {
        unsafe.close(unsafe.voidPromise());
    }
}
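The bit tests on readyOps are ordinary flag arithmetic on the SelectionKey constants from the JDK. The following sketch replays the same checks against plain integers; the class ReadyOpsDemo and the step names it returns are illustrative labels, not Netty calls.

```java
import java.nio.channels.SelectionKey;
import java.util.ArrayList;
import java.util.List;

final class ReadyOpsDemo {
    // Lists the handling steps in the order processSelectedKey() checks them.
    static List<String> dispatch(int readyOps) {
        List<String> steps = new ArrayList<>();
        if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
            steps.add("finishConnect");
        }
        if ((readyOps & SelectionKey.OP_WRITE) != 0) {
            steps.add("forceFlush");
        }
        // readyOps == 0 is also routed to read, matching the spin-loop workaround above.
        if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
            steps.add("read");
        }
        return steps;
    }

    // Removes OP_CONNECT from an interest set, as done before finishConnect().
    static int clearConnect(int interestOps) {
        return interestOps & ~SelectionKey.OP_CONNECT;
    }

    public static void main(String[] args) {
        System.out.println(dispatch(SelectionKey.OP_READ | SelectionKey.OP_WRITE));
        System.out.println(dispatch(SelectionKey.OP_ACCEPT));
        System.out.println(dispatch(0));
    }
}
```

Because OP_READ and OP_ACCEPT share one branch, the same unsafe.read() call serves both a server channel accepting connections and a client channel reading bytes; the concrete unsafe implementation decides what "read" means.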
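4.3 Run the asynchronous task queue: runAllTasks()
runAllTasks() works with two task queues: the ordinary task queue (taskQueue), which holds tasks submitted by other threads, and the scheduled task queue.
The scheduled-queue logic is traced below. It moves every scheduled task that is already due into the ordinary task queue (taskQueue):
private boolean fetchFromScheduledTaskQueue() {
    long nanoTime = AbstractScheduledEventExecutor.nanoTime();
    // Take the first due task from the scheduled task queue
    Runnable scheduledTask = pollScheduledTask(nanoTime);
    while (scheduledTask != null) {
        if (!taskQueue.offer(scheduledTask)) {
            // No space left in the task queue: put the task back into the scheduled queue
            scheduledTaskQueue().add((ScheduledFutureTask<?>) scheduledTask);
            return false;
        }
        scheduledTask = pollScheduledTask(nanoTime);
    }
    return true;
}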
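The transfer of due tasks between the two queues can be modelled with JDK collections. This is a minimal sketch under stated assumptions: Timed is a hypothetical task type holding a deadline, a PriorityQueue stands in for the scheduled task queue, and a bounded ArrayBlockingQueue stands in for taskQueue so that offer() can fail.

```java
import java.util.PriorityQueue;
import java.util.Queue;
import java.util.concurrent.ArrayBlockingQueue;

final class ScheduledTransfer {
    // Hypothetical task type: a name plus the time at which it becomes due.
    static final class Timed implements Comparable<Timed> {
        final String name;
        final long deadlineNanos;

        Timed(String name, long deadlineNanos) {
            this.name = name;
            this.deadlineNanos = deadlineNanos;
        }

        @Override
        public int compareTo(Timed other) {
            return Long.compare(deadlineNanos, other.deadlineNanos);
        }
    }

    // Moves every due task into taskQueue; returns false if taskQueue filled up first.
    static boolean fetchDue(PriorityQueue<Timed> scheduled, Queue<Timed> taskQueue, long nowNanos) {
        Timed head;
        while ((head = scheduled.peek()) != null && head.deadlineNanos <= nowNanos) {
            scheduled.poll();
            if (!taskQueue.offer(head)) {
                scheduled.add(head); // no room: put it back for the next round
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        PriorityQueue<Timed> scheduled = new PriorityQueue<>();
        scheduled.add(new Timed("a", 10));
        scheduled.add(new Timed("b", 20));
        scheduled.add(new Timed("c", 30));
        Queue<Timed> taskQueue = new ArrayBlockingQueue<>(8);
        boolean all = fetchDue(scheduled, taskQueue, 25);
        System.out.println("moved=" + taskQueue.size() + " stillScheduled=" + scheduled.size() + " all=" + all);
    }
}
```

Tasks whose deadline lies in the future stay in the scheduled queue, so only work that is actually due competes with the ordinary tasks.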
Once the due scheduled tasks have been moved into the ordinary task queue, the tasks are executed.
Additional note: while running tasks, Netty checks after every 64 tasks whether the time budget has been exceeded, and stops running tasks if it has:
// Check timeout every 64 tasks because nanoTime() is relatively expensive.
// XXX: Hard-coded value - will make it configurable if it is really a problem.
if ((runTasks & 0x3F) == 0) { // 0x3F -> 0011 1111
    lastExecutionTime = ScheduledFutureTask.nanoTime();
    if (lastExecutionTime >= deadline) {
        break;
    }
}
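The effect of checking the clock only every 64 tasks can be demonstrated with a simplified loop. This sketch is not Netty's runAllTasks: BatchedDeadline is a hypothetical class, and the clock is injected as a LongSupplier so the behaviour is deterministic.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.LongSupplier;

final class BatchedDeadline {
    // Runs tasks until the queue is empty or a deadline check (made every 64 tasks) fails.
    // Returns how many tasks were run.
    static int runTasks(Queue<Runnable> tasks, long deadlineNanos, LongSupplier clock) {
        int runTasks = 0;
        Runnable task;
        while ((task = tasks.poll()) != null) {
            task.run();
            runTasks++;
            // (runTasks & 0x3F) == 0 is true exactly when runTasks is a multiple of 64.
            if ((runTasks & 0x3F) == 0 && clock.getAsLong() >= deadlineNanos) {
                break;
            }
        }
        return runTasks;
    }

    public static void main(String[] args) {
        Queue<Runnable> tasks = new ArrayDeque<>();
        for (int i = 0; i < 200; i++) {
            tasks.add(() -> { });
        }
        // The deadline has already passed (clock = 1, deadline = 0), yet a full batch still runs.
        int ran = runTasks(tasks, 0L, () -> 1L);
        System.out.println("ran " + ran + " tasks, " + tasks.size() + " left");
    }
}
```

The trade-off is precision for speed: the budget can be overshot by up to 63 tasks, in exchange for reading the clock once per batch instead of once per task.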