Netty Source Code Analysis: The Reactor Thread Model

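I. Background

I have been studying the Netty source code lately, and this post on Netty's threading framework, the Reactor thread model, is a write-up of what I have learned so far. If you are not yet familiar with the Reactor pattern, please look it up first; there is plenty of material about it online.

When I started studying Netty, I first read Netty: The Definitive Guide (《Netty权威指南》). It explains things well and covers everything from principles to source code, so why write this post? Because its treatment of the thread model is not detailed enough and does not connect the whole flow from end to end. In my view this part is the foundation: once it is clear, the ChannelPipeline processing that follows becomes easy to understand. This analysis is based on Netty 5.0.

Today's analysis uses the TimeServer example from that book. The core of its implementation is as follows: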

public void bind(int port) throws Exception {
    EventLoopGroup bossGroup = new NioEventLoopGroup();
    EventLoopGroup workerGroup = new NioEventLoopGroup();
    try {
        ServerBootstrap b = new ServerBootstrap();
        b.group(bossGroup, workerGroup)
                .channel(NioServerSocketChannel.class)
                .option(ChannelOption.SO_BACKLOG, 1024)
                .childHandler(new ChildChannelHandler());
        ChannelFuture f = b.bind(port).sync();
        f.channel().closeFuture().sync();
    } finally {
        bossGroup.shutdownGracefully();
        workerGroup.shutdownGracefully();
    }
}

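private class ChildChannelHandler extends ChannelInitializer<SocketChannel> {

    @Override
    protected void initChannel(SocketChannel ch) throws Exception {
        // Register the handler in the ChannelPipeline, which is kept as a linked list.
        // Several handlers can be registered; a name may be given, and one is generated if it is omitted.
        ch.pipeline().addLast("GetTime", new TimeServerHandler());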
    }
}

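As the code above shows, the two most important classes are NioEventLoopGroup and ServerBootstrap (on the client side it would be Bootstrap). Below are the UML class diagrams of these two classes: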

[Figure 1: UML class diagram]
[Figure 2: UML class diagram]

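II. The NioEventLoopGroup Thread Group

The main job of NioEventLoopGroup is to create a thread pool. The code above creates two EventLoopGroups: bossGroup, which is mainly used for listening, and workerGroup, which is mainly used for client/server communication. These two thread pools are the foundation of the Reactor thread model. The analysis below follows the UML class diagram from the bottom up.

The NioEventLoopGroup class contains little code, and its most important method is the one below. It is an abstract method defined by the parent class MultithreadEventLoopGroup, and it creates a NioEventLoop, which corresponds to one thread. We will see where it is called later.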

@Override
protected EventLoop newChild(Executor executor, Object... args) throws Exception {
    return new NioEventLoop(this, executor, (SelectorProvider) args[0]);
}

Calling the no-argument NioEventLoopGroup constructor eventually reaches this constructor:

public NioEventLoopGroup(
        int nThreads, ThreadFactory threadFactory, final SelectorProvider selectorProvider) {
    super(nThreads, threadFactory, selectorProvider);
}

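A note on the actual arguments here: nThreads is 0, threadFactory is null, and selectorProvider is the result of SelectorProvider.provider(). The third argument is what creates the Selector (on Linux, the JDK's default provider is backed by epoll rather than select). The call then goes to the constructor of the parent class MultithreadEventLoopGroup, which replaces an nThreads of 0 with a default thread count before delegating to the MultithreadEventExecutorGroup constructor below. That is why the nThreads <= 0 check there does not fail.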

protected MultithreadEventExecutorGroup(int nThreads, Executor executor, Object... args) {
    if (nThreads <= 0) {
        throw new IllegalArgumentException(String.format("nThreads: %d (expected: > 0)", nThreads));
    }

    if (executor == null) {
        executor = new ThreadPerTaskExecutor(newDefaultThreadFactory());
    }

    children = new EventExecutor[nThreads];
    for (int i = 0; i < nThreads; i ++) {
        boolean success = false;
        try {
            children[i] = newChild(executor, args);
            success = true;
        } catch (Exception e) {
            // TODO: Think about if this is a good exception type
            throw new IllegalStateException("failed to create a child event loop", e);
        } finally {
            if (!success) {
                for (int j = 0; j < i; j ++) {
                    children[j].shutdownGracefully();
                }

                for (int j = 0; j < i; j ++) {
                    EventExecutor e = children[j];
                    try {
                        while (!e.isTerminated()) {
                            e.awaitTermination(Integer.MAX_VALUE, TimeUnit.SECONDS);
                        }
                    } catch (InterruptedException interrupted) {
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
            }
        }
    }

    final FutureListener<Object> terminationListener = new FutureListener<Object>() {
        @Override
        public void operationComplete(Future<Object> future) throws Exception {
            if (terminatedChildren.incrementAndGet() == children.length) {
                terminationFuture.setSuccess(null);
            }
        }
    };

    for (EventExecutor e: children) {
        e.terminationFuture().addListener(terminationListener);
    }

    Set<EventExecutor> childrenSet = new LinkedHashSet<EventExecutor>(children.length);
    Collections.addAll(childrenSet, children);
    readonlyChildren = Collections.unmodifiableSet(childrenSet);
} 
  

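Two points about this method:

1) The executor passed in here is always null, so a default executor is created at this point. The ThreadPerTaskExecutor class has only one concrete method, its implementation of execute, which is called later on.

2) The first for loop creates the event loops, one per thread (the threads themselves are only started later, during bind). The newChild() call actually resolves to the newChild method of the NioEventLoopGroup class.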
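To make point 1) concrete, here is a minimal sketch of an executor that starts one fresh thread per submitted task, which is the behavior described for ThreadPerTaskExecutor. The class name PerTaskExecutorSketch and the helper runOnce are hypothetical names for illustration; this is not Netty's own code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in: an Executor that starts one new thread per task. */
final class PerTaskExecutorSketch implements Executor {
    private final ThreadFactory threadFactory;

    PerTaskExecutorSketch(ThreadFactory threadFactory) {
        if (threadFactory == null) {
            throw new NullPointerException("threadFactory");
        }
        this.threadFactory = threadFactory;
    }

    @Override
    public void execute(Runnable command) {
        // No queue and no pooling: every task gets its own freshly started thread.
        threadFactory.newThread(command).start();
    }

    /** Runs one task and returns the name of the thread that executed it. */
    static String runOnce(String threadName) {
        Executor executor = new PerTaskExecutorSketch(r -> new Thread(r, threadName));
        AtomicReference<String> seen = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);
        executor.execute(() -> {
            seen.set(Thread.currentThread().getName());
            done.countDown();
        });
        try {
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("task did not run in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }
}
```

Because each event loop submits only one long-running task to such an executor (its own run loop), one thread per task amounts to one thread per event loop.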

 

III. The NioEventLoop Thread

Below is the UML class diagram of NioEventLoop.

[Figure 3: UML class diagram of NioEventLoop]

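The NioEventLoop constructor mainly does two things:

1. It passes the executor to the parent class, which also creates the task queue.

2. It creates the Selector and initializes the selected-key set.

The most important method in NioEventLoop is run. It is an infinite loop (it exits only on shutdown or an exception) that polls for events, including accept, read and write events. It is not called while the NioEventLoopGroup is being initialized (it is called during bind). The run method is covered in detail later.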

 

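IV. ServerBootstrap Startup

As the code above shows, ServerBootstrap needs the thread pools, the Channel type and the pipeline to be configured. Once these are set, calling bind starts the listening flow, which eventually reaches the doBind method: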

private ChannelFuture doBind(final SocketAddress localAddress) {
    final ChannelFuture regFuture = initAndRegister();
    final Channel channel = regFuture.channel();
    if (regFuture.cause() != null) {
        return regFuture;
    }

    final ChannelPromise promise;
    if (regFuture.isDone()) {
        promise = channel.newPromise();
        doBind0(regFuture, channel, localAddress, promise);
    } else {
        // Registration future is almost always fulfilled already, but just in case it's not.
        promise = new DefaultChannelPromise(channel, GlobalEventExecutor.INSTANCE);
        regFuture.addListener(new ChannelFutureListener() {
            @Override
            public void operationComplete(ChannelFuture future) throws Exception {
                doBind0(regFuture, channel, localAddress, promise);
            }
        });
    }

    return promise;
}

initAndRegister initializes and registers the channel. It calls createChannel and init(channel):

final ChannelFuture initAndRegister() {
    Channel channel;
    try {
        channel = createChannel();
    } catch (Throwable t) {
        return VoidChannel.INSTANCE.newFailedFuture(t);
    }

    try {
        init(channel);
    } catch (Throwable t) {
        channel.unsafe().closeForcibly();
        return channel.newFailedFuture(t);
    }

    ChannelPromise regFuture = channel.newPromise();
    channel.unsafe().register(regFuture);
    if (regFuture.cause() != null) {
        if (channel.isRegistered()) {
            channel.close();
        } else {
            channel.unsafe().closeForcibly();
        }
    }

    return regFuture;
}

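createChannel is implemented in the ServerBootstrap class. Here group() returns bossGroup, and next() takes one thread from the bossGroup pool; that thread is mainly used for listening on the server socket. The second argument of newChannel, childGroup, is the workerGroup pool, which supplies the threads that serve client/server traffic once a client connection has been established. This is the Reactor thread model.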

@Override
Channel createChannel() {
    EventLoop eventLoop = group().next();
    return channelFactory().newChannel(eventLoop, childGroup);
}
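The text only says that next() takes one thread from the group. A common way to do this is round-robin selection, and the sketch below assumes that policy. RoundRobinChooserSketch is a hypothetical name, and the code is an illustration rather than Netty's implementation.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: hands out the members of a fixed group in round-robin order. */
final class RoundRobinChooserSketch<T> {
    private final T[] children;
    private final AtomicInteger index = new AtomicInteger();

    RoundRobinChooserSketch(T[] children) {
        if (children.length == 0) {
            throw new IllegalArgumentException("children must not be empty");
        }
        this.children = children.clone();
    }

    T next() {
        // floorMod keeps the slot non-negative even after the counter overflows.
        return children[Math.floorMod(index.getAndIncrement(), children.length)];
    }
}
```

With three members named loop-0, loop-1 and loop-2, four consecutive calls to next() return loop-0, loop-1, loop-2 and then loop-0 again, so connections are spread evenly over the group.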

The newChannel method creates the object dynamically through reflection; here it creates a NioServerSocketChannel.
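As a rough illustration of creating an object from its Class through reflection, here is a minimal factory sketch. ReflectiveFactorySketch is a hypothetical name, and for brevity it assumes a no-argument constructor, whereas the newChannel call shown above passes the event loop and the child group to the channel being created.

```java
import java.lang.reflect.Constructor;

/** Hypothetical sketch: creates instances of a configured class through reflection. */
final class ReflectiveFactorySketch<T> {
    private final Constructor<? extends T> constructor;

    ReflectiveFactorySketch(Class<? extends T> type) {
        try {
            // Assumption for this sketch: the class has a no-argument constructor.
            this.constructor = type.getDeclaredConstructor();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(type + " has no no-argument constructor", e);
        }
    }

    T newInstance() {
        try {
            return constructor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                    "unable to create an instance of " + constructor.getDeclaringClass(), e);
        }
    }
}
```

This is why the bootstrap only needs to be given a class such as NioServerSocketChannel.class: the instance is created later, when bind runs.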

The init(channel) method is fairly simple; it mainly sets the options and the pipeline.

Below is the register method:

public final void register(final ChannelPromise promise) {
    if (eventLoop.inEventLoop()) {
        register0(promise);
    } else {
        try {
            eventLoop.execute(new Runnable() {
                @Override
                public void run() {
                    register0(promise);
                }
            });
        } catch (Throwable t) {
            logger.warn(
                    "Force-closing a channel whose registration task was not accepted by an event loop: {}",
                    AbstractChannel.this, t);
            closeForcibly();
            closeFuture.setClosed();
            promise.setFailure(t);
        }
    }
}

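The first step of this method checks whether the thread calling register is the eventLoop thread (eventLoop comes from bossGroup and was set in createChannel). The first time they are certainly different, because the current thread is the main thread, so the else branch is taken. The eventLoop.execute method is implemented in the SingleThreadEventExecutor class: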

public void execute(Runnable task) {
    if (task == null) {
        throw new NullPointerException("task");
    }

    boolean inEventLoop = inEventLoop();
    if (inEventLoop) {
        addTask(task);
    } else {
        startThread();
        addTask(task);
        if (isShutdown() && removeTask(task)) {
            reject();
        }
    }

    if (!addTaskWakesUp) {
        wakeup(inEventLoop);
    }
}

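Following the analysis above, execution enters the else branch, which starts the thread and adds the task to the blocking queue. The started thread then takes the task from the queue and runs it.

startThread calls doStartThread, which invokes executor.execute. That interface method is implemented by the execute method of ThreadPerTaskExecutor, which calls start to activate the thread. Now look at the Runnable's run method: its most important line is SingleThreadEventExecutor.this.run(); This is the first call to run, and its implementation is the run method in NioEventLoop.java.

After the main thread has started sub-thread A, it adds the task to the queue and then goes on to execute doBind0. Once sub-thread A is up, it takes the task from the queue and runs it. doBind0 itself is executed by the main thread, which puts the actual bind operation into the queue so that sub-thread A performs the bind. At this point the main thread's work is done, and it finally returns to the bind method shown at the beginning and blocks in the sync() calls.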
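The hand-off between the main thread and sub-thread A can be sketched with a queue and a lazily started thread. This is a simplified model under stated assumptions, not Netty's code: LazyLoopSketch and its members are hypothetical names, and shutdown handling, wakeups and the selector are left out.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch: a single-threaded loop whose thread starts on the first external execute(). */
final class LazyLoopSketch {
    private final BlockingQueue<Runnable> taskQueue = new LinkedBlockingQueue<>();
    private final AtomicBoolean started = new AtomicBoolean();
    private volatile Thread loopThread;

    boolean inEventLoop() {
        return Thread.currentThread() == loopThread;
    }

    void execute(Runnable task) {
        if (task == null) {
            throw new NullPointerException("task");
        }
        // A caller outside the loop (for example main) starts the thread once, then queues the task.
        if (!inEventLoop() && started.compareAndSet(false, true)) {
            startThread();
        }
        taskQueue.add(task);
    }

    private void startThread() {
        Thread thread = new Thread(() -> {
            loopThread = Thread.currentThread();
            try {
                for (;;) {
                    taskQueue.take().run(); // block until a task arrives, then run it
                }
            } catch (InterruptedException e) {
                // Interrupted: treat it as a shutdown request and leave the loop.
            }
        }, "sketch-thread-A");
        thread.setDaemon(true);
        thread.start();
    }

    void shutdown() {
        Thread t = loopThread;
        if (t != null) {
            t.interrupt();
        }
    }

    /** Returns {name of the thread that ran the first task, inEventLoop() seen by a nested task}. */
    static String[] demo() {
        LazyLoopSketch loop = new LazyLoopSketch();
        String[] result = new String[2];
        CountDownLatch done = new CountDownLatch(1);
        loop.execute(() -> {
            result[0] = Thread.currentThread().getName();
            // Submitted from the loop thread itself: only queued, no new thread is started.
            loop.execute(() -> {
                result[1] = String.valueOf(loop.inEventLoop());
                done.countDown();
            });
        });
        try {
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not run in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            loop.shutdown();
        }
        return result;
    }
}
```

In demo(), the calling thread plays the role of main: it only enqueues work, while the thread named sketch-thread-A runs both the first task and the nested task.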

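V. Sub-thread A Executes the Task

The task executed by the sub-thread is defined in the doStartThread method. The key line of that code is SingleThreadEventExecutor.this.run(); run is an abstract method, so where is it implemented?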

private void doStartThread() {
    assert thread == null;
    executor.execute(new Runnable() {
        @Override
        public void run() {
            thread = Thread.currentThread();
            if (interrupted) {
                thread.interrupt();
            }

            boolean success = false;
            updateLastExecutionTime();
            try {
                SingleThreadEventExecutor.this.run();
                success = true;
            } catch (Throwable t) {
                logger.warn("Unexpected exception from an event executor: ", t);
            } finally {
                if (state < ST_SHUTTING_DOWN) {
                    state = ST_SHUTTING_DOWN;
                }

                // Check if confirmShutdown() was called at the end of the loop.
                if (success && gracefulShutdownStartTime == 0) {
                    logger.error("Buggy " + EventExecutor.class.getSimpleName() + " implementation; " +
                            SingleThreadEventExecutor.class.getSimpleName() + ".confirmShutdown() must be called " +
                            "before run() implementation terminates.");
                }

                try {
                    // Run all remaining tasks and shutdown hooks.
                    for (;;) {
                        if (confirmShutdown()) {
                            break;
                        }
                    }
                } finally {
                    try {
                        cleanup();
                    } finally {
                        synchronized (stateLock) {
                            state = ST_TERMINATED;
                        }
                        threadLock.release();
                        if (!taskQueue.isEmpty()) {
                            logger.warn(
                                    "An event executor terminated with " +
                                            "non-empty task queue (" + taskQueue.size() + ')');
                        }

                        terminationFuture.setSuccess(null);
                    }
                }
            }
        }
    });
}

The implementation of run is the run method in NioEventLoop.java, which ties this back to the earlier discussion.

protected void run() {
    for (;;) {
        oldWakenUp = wakenUp.getAndSet(false);
        try {
            if (hasTasks()) {
                selectNow();
            } else {
                select();
                if (wakenUp.get()) {
                    selector.wakeup();
                }
            }

            cancelledKeys = 0;

            final long ioStartTime = System.nanoTime();
            needsToSelectAgain = false;
            if (selectedKeys != null) {
                processSelectedKeysOptimized(selectedKeys.flip());
            } else {
                processSelectedKeysPlain(selector.selectedKeys());
            }
            final long ioTime = System.nanoTime() - ioStartTime;

            final int ioRatio = this.ioRatio;
            runAllTasks(ioTime * (100 - ioRatio) / ioRatio);

            if (isShuttingDown()) {
                closeAll();
                if (confirmShutdown()) {
                    break;
                }
            }
        } catch (Throwable t) {
            logger.warn("Unexpected exception in the selector loop.", t);

            // Prevent possible consecutive immediate failures that lead to
            // excessive CPU consumption.
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                // Ignore.
            }
        }
    }
}
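The branch at the top of the loop above chooses between a non-blocking and a blocking select. The sketch below shows that choice using only the JDK Selector API. SelectStrategySketch is a hypothetical name, and the fixed timeout is an assumption made for illustration; it does not reproduce how Netty computes its own timeout.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

/** Hypothetical sketch of the select strategy: never block while tasks are waiting. */
final class SelectStrategySketch {
    static int poll(Selector selector, boolean hasTasks, long timeoutMillis) {
        try {
            if (hasTasks) {
                // Pending tasks: check for ready keys without blocking.
                return selector.selectNow();
            }
            // Nothing to run: block until a key is ready, a wakeup arrives or the timeout expires.
            return selector.select(timeoutMillis);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Polls an empty selector both ways and returns the two ready-key counts. */
    static int[] demo() {
        try (Selector selector = Selector.open()) {
            int busy = poll(selector, true, 20);
            int idle = poll(selector, false, 20);
            return new int[] {busy, idle};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Both calls report zero ready keys for a selector with no registered channels; the difference is that the second call may wait for up to the timeout before returning.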

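As this method shows, it is an infinite loop that polls for events. If there are pending tasks it calls selectNow(), which returns immediately; otherwise it calls select(), which blocks for a while, much like the Linux select model. Next comes the handling of the selected keys. By default execution enters processSelectedKeysOptimized, which iterates over the keys and, by default, takes the if branch that calls processSelectedKey. The main content of that method is three if checks: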

if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
    unsafe.read();
    if (!ch.isOpen()) {
        // Connection already closed - no need to handle write.
        return;
    }
}

OP_READ and OP_ACCEPT events: these cover incoming client connections and messages sent by clients.

if ((readyOps & SelectionKey.OP_WRITE) != 0) {
    // Call forceFlush which will also take care of clear the OP_WRITE once there is nothing left to write
    ch.unsafe().forceFlush();
}

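OP_WRITE event: the socket is writable again, so pending outbound data can be sent to the peer. Netty registers interest in this event when a flush could not write everything out, and forceFlush() then continues the write.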

if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
    // remove OP_CONNECT as otherwise Selector.select(..) will always return without blocking
    // See https://github.com/netty/netty/issues/924
    int ops = k.interestOps();
    ops &= ~SelectionKey.OP_CONNECT;
    k.interestOps(ops);

    unsafe.finishConnect();
}

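OP_CONNECT: client programs enter this branch; it means the TCP connection has completed. The OP_CONNECT flag must be cleared here, otherwise select() would keep returning without blocking.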
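The three checks are plain bit tests on the ready-ops mask. The sketch below mirrors them with the JDK's SelectionKey constants and returns the name of the action each check would trigger. ReadyOpsSketch, describe and clearConnect are hypothetical names, and the early return on a closed channel is omitted.

```java
import java.nio.channels.SelectionKey;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: maps a ready-ops bit mask to the actions the three checks would take. */
final class ReadyOpsSketch {
    static List<String> describe(int readyOps) {
        List<String> actions = new ArrayList<>();
        // Read and accept share one branch; readyOps == 0 is also treated as a read.
        if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
            actions.add("read");
        }
        if ((readyOps & SelectionKey.OP_WRITE) != 0) {
            actions.add("forceFlush");
        }
        if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
            actions.add("finishConnect");
        }
        return actions;
    }

    /** Removes OP_CONNECT from an interest set and leaves the other bits untouched. */
    static int clearConnect(int interestOps) {
        return interestOps & ~SelectionKey.OP_CONNECT;
    }
}
```

For example, describe(SelectionKey.OP_READ | SelectionKey.OP_WRITE) yields read followed by forceFlush, and clearConnect(SelectionKey.OP_CONNECT | SelectionKey.OP_READ) leaves only OP_READ.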

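Let us focus on the read event. In the Reactor thread model, a newly accepted connection is handed to a thread that then serves it. So let us follow unsafe.read() and see where that thread is assigned. unsafe.read is an interface method with two implementations:

1) For the listening thread, i.e. NioServerSocketChannel, which mainly handles client accept requests

The implementation is read() in AbstractNioMessageChannel.java: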

@Override
public void read() {
    assert eventLoop().inEventLoop();
    if (!config().isAutoRead()) {
        removeReadOp();
    }

    final ChannelConfig config = config();
    final int maxMessagesPerRead = config.getMaxMessagesPerRead();
    final boolean autoRead = config.isAutoRead();
    final ChannelPipeline pipeline = pipeline();
    boolean closed = false;
    Throwable exception = null;
    try {
        for (;;) {
            int localRead = doReadMessages(readBuf);
            if (localRead == 0) {
                break;
            }
            if (localRead < 0) {
                closed = true;
                break;
            }

            if (readBuf.size() >= maxMessagesPerRead | !autoRead) {
                break;
            }
        }
    } catch (Throwable t) {
        exception = t;
    }

    int size = readBuf.size();
    for (int i = 0; i < size; i ++) {
        pipeline.fireChannelRead(readBuf.get(i));
    }
    readBuf.clear();
    pipeline.fireChannelReadComplete();

    if (exception != null) {
        if (exception instanceof IOException) {
            // ServerChannel should not be closed even on IOException because it can often continue
            // accepting incoming connections. (e.g. too many open files)
            closed = !(AbstractNioMessageChannel.this instanceof ServerChannel);
        }

        pipeline.fireExceptionCaught(exception);
    }

    if (closed) {
        if (isOpen()) {
            close(voidPromise());
        }
    }
}

The most important call in this method is doReadMessages():

protected int doReadMessages(List<Object> buf) throws Exception {
    SocketChannel ch = javaChannel().accept();

    try {
        if (ch != null) {
            buf.add(new NioSocketChannel(this, childEventLoopGroup().next(), ch));
            return 1;
        }
    } catch (Throwable t) {
        logger.warn("Failed to create a new channel from an accepted socket.", t);

        try {
            ch.close();
        } catch (Throwable t2) {
            logger.warn("Failed to close a socket.", t2);
        }
    }

    return 0;
} 
  

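Note the add call above: childEventLoopGroup().next() picks one thread from workerGroup, and that thread serves the traffic between this client and the server. This is the core of the Reactor thread model.

2) For a service thread, i.e. a thread that communicates with a client through NioSocketChannel, which mainly handles messages sent by the peer

Any other message (for example, a client sending data normally) goes through the following method: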

@Override
    public void read() {
        final ChannelConfig config = config();
        final ChannelPipeline pipeline = pipeline();
        final ByteBufAllocator allocator = config.getAllocator();
        final int maxMessagesPerRead = config.getMaxMessagesPerRead();
        RecvByteBufAllocator.Handle allocHandle = this.allocHandle;
        if (allocHandle == null) {
            this.allocHandle = allocHandle = config.getRecvByteBufAllocator().newHandle();
        }
        if (!config.isAutoRead()) {
            removeReadOp();
        }

        ByteBuf byteBuf = null;
        int messages = 0;
        boolean close = false;
        try {
            int byteBufCapacity = allocHandle.guess();
            int totalReadAmount = 0;
            do {
                byteBuf = allocator.ioBuffer(byteBufCapacity);
                int writable = byteBuf.writableBytes();
                int localReadAmount = doReadBytes(byteBuf);
                if (localReadAmount <= 0) {
                    // not was read release the buffer
                    byteBuf.release();
                    close = localReadAmount < 0;
                    break;
                }

                pipeline.fireChannelRead(byteBuf);
                byteBuf = null;

                if (totalReadAmount >= Integer.MAX_VALUE - localReadAmount) {
                    // Avoid overflow.
                    totalReadAmount = Integer.MAX_VALUE;
                    break;
                }

                totalReadAmount += localReadAmount;
                if (localReadAmount < writable) {
                    // Read less than what the buffer can hold,
                    // which might mean we drained the recv buffer completely.
                    break;
                }
            } while (++ messages < maxMessagesPerRead);

            pipeline.fireChannelReadComplete();
            allocHandle.record(totalReadAmount);

            if (close) {
                closeOnRead(pipeline);
                close = false;
            }
        } catch (Throwable t) {
            handleReadException(pipeline, byteBuf, t, close);
        }
    }

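The doReadBytes method reads the data from the socket, and fireChannelRead passes it on to the handlers for processing.

The two scenarios above can be summarized as follows: data is first read from the underlying socket, then a fireChannelRead event is triggered, and once all the data has been read a fireChannelReadComplete event is triggered.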
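The summary above can be sketched as a small read loop over any ReadableByteChannel: one channelRead callback per chunk, then a single channelReadComplete. ReadLoopSketch and its InboundHandler interface are hypothetical stand-ins for the pipeline, and the in-memory channel in demo() replaces a real socket.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the read path: read chunks, fire one event per chunk, then a completion event. */
final class ReadLoopSketch {
    /** Stand-in for the inbound side of a pipeline. */
    interface InboundHandler {
        void channelRead(ByteBuffer msg);
        void channelReadComplete();
    }

    static void read(ReadableByteChannel source, int bufferSize, int maxMessagesPerRead,
                     InboundHandler handler) {
        try {
            int messages = 0;
            do {
                ByteBuffer buf = ByteBuffer.allocate(bufferSize);
                int n = source.read(buf);
                if (n <= 0) {
                    break; // nothing was read (or end of stream): stop this pass
                }
                buf.flip();
                handler.channelRead(buf);
                if (n < bufferSize) {
                    break; // buffer not filled, so the source is probably drained
                }
            } while (++messages < maxMessagesPerRead);
            handler.channelReadComplete();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Feeds an ASCII payload through the loop and records the events in order. */
    static List<String> demo(String payload, int bufferSize) {
        List<String> events = new ArrayList<>();
        ReadableByteChannel source = Channels.newChannel(
                new ByteArrayInputStream(payload.getBytes(StandardCharsets.US_ASCII)));
        read(source, bufferSize, 16, new InboundHandler() {
            @Override
            public void channelRead(ByteBuffer msg) {
                events.add("read:" + StandardCharsets.US_ASCII.decode(msg));
            }

            @Override
            public void channelReadComplete() {
                events.add("readComplete");
            }
        });
        return events;
    }
}
```

A handler therefore sees the data in buffer-sized pieces and can use the completion callback as the point where one read pass has finished.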


This concludes the source analysis of Netty server startup and the Reactor thread model. Later posts will cover Channel and the pipeline.


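[Additional Notes]

The Selector described above can run into empty polling. What is empty polling? select() returns even though no event has occurred, and it does so over and over, so the loop spins, the Selector appears to hang and CPU usage reaches 100%. This is a bug in the JDK's epoll-based Selector. Netty works around it by rebuilding the Selector: it creates a new Selector, moves all registrations from the old Selector to the new one, and then polls the new Selector. Netty uses a threshold for this (a static variable in the code): after N consecutive empty polls the Selector is rebuilt. Netty handles the JDK epoll bug in this indirect way, though it would still be good for the JDK to fix it (the problem was still present in Java 7).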
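The workaround can be sketched with the plain JDK API: count consecutive empty returns, and once a threshold is reached, open a new Selector and move every registration over. SelectorRebuildSketch and its methods are hypothetical names, the threshold is whatever the caller passes in, and this is an illustration of the idea rather than Netty's implementation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.ArrayList;

/** Hypothetical sketch: detect repeated empty selects and migrate keys to a fresh Selector. */
final class SelectorRebuildSketch {
    private final int threshold;
    private int emptySelects;

    SelectorRebuildSketch(int threshold) {
        this.threshold = threshold;
    }

    /** Records one select() result and returns true when a rebuild is due. */
    boolean record(int selectedKeys, boolean returnedBeforeTimeout) {
        if (selectedKeys == 0 && returnedBeforeTimeout) {
            if (++emptySelects >= threshold) {
                emptySelects = 0;
                return true;
            }
        } else {
            emptySelects = 0; // a normal select resets the streak
        }
        return false;
    }

    /** Moves every valid registration to a new Selector and closes the old one. */
    static Selector rebuild(Selector oldSelector) {
        try {
            Selector newSelector = Selector.open();
            for (SelectionKey key : new ArrayList<>(oldSelector.keys())) {
                if (!key.isValid()) {
                    continue;
                }
                SelectableChannel channel = key.channel();
                int interestOps = key.interestOps();
                Object attachment = key.attachment();
                key.cancel();
                channel.register(newSelector, interestOps, attachment);
            }
            oldSelector.close();
            return newSelector;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Registers a pipe with a selector, rebuilds it and returns how many keys were carried over. */
    static int demo() {
        try {
            Pipe pipe = Pipe.open();
            pipe.source().configureBlocking(false);
            Selector oldSelector = Selector.open();
            pipe.source().register(oldSelector, SelectionKey.OP_READ, "attachment");
            Selector newSelector = rebuild(oldSelector);
            int moved = 0;
            for (SelectionKey key : newSelector.keys()) {
                if (key.interestOps() == SelectionKey.OP_READ && "attachment".equals(key.attachment())) {
                    moved++;
                }
            }
            newSelector.close();
            pipe.source().close();
            pipe.sink().close();
            return moved;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The interest set and the attachment are copied along with each channel, so the event loop can keep using the new Selector exactly as it used the old one.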
