Anyone who has used Netty knows that incoming data is first read into a receive buffer before a request is processed, and the right buffer size differs from scenario to scenario.
For example, the UDP-oriented DatagramChannel defaults to a receive buffer of only 2048 bytes, which is far from enough if you are building something like a Syslog-over-UDP service.
How this buffer is used and configured therefore matters a great deal, and it is easy to trip over it. This article explains how the receive buffer works in Netty 4 and walks through the relevant source code.
Netty's receive buffer is modeled by the RecvByteBufAllocator interface, which declares a nested Handle interface with three methods:
interface Handle {
    /**
     * Creates a receive buffer of a suitable size (large enough to read all inbound data,
     * small enough not to waste space).
     */
    ByteBuf allocate(ByteBufAllocator alloc);

    /**
     * Guesses the capacity of the next receive buffer; it is only a guess because one of the
     * implementations resizes the buffer dynamically.
     */
    int guess();

    /**
     * Records the actual number of bytes read by the previous read operation so that the
     * receive buffer capacity can be adjusted to a more suitable value.
     *
     * @param actualReadBytes the actual number of read bytes in the previous read operation
     */
    void record(int actualReadBytes);
}
With this base interface in mind, the designers' intent is clear: offer several flexible ways to create the receive buffer, for instance one with a fixed capacity and one whose capacity grows and shrinks dynamically.
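Conceptually, a transport drives a Handle in a loop like the following simplified sketch. This is my own illustration against the Netty 4.0 API above, not actual Netty source: readIntoBuffer is a hypothetical stand-in for the real channel read, and the adaptive allocator discussed later in the article is used only to have a concrete instance.
import io.netty.buffer.ByteBuf;
import io.netty.buffer.ByteBufAllocator;
import io.netty.channel.AdaptiveRecvByteBufAllocator;
import io.netty.channel.RecvByteBufAllocator;

public class HandleLoopSketch {
    // Hypothetical stand-in for the real channel read; returns the bytes read, or -1 on EOF.
    static int readIntoBuffer(ByteBuf buf) {
        return -1;
    }

    public static void main(String[] args) {
        RecvByteBufAllocator.Handle handle = AdaptiveRecvByteBufAllocator.DEFAULT.newHandle();
        ByteBufAllocator alloc = ByteBufAllocator.DEFAULT;
        for (;;) {
            ByteBuf byteBuf = handle.allocate(alloc);   // sized according to the handle's current guess
            int readBytes = readIntoBuffer(byteBuf);    // stand-in for the real channel read
            if (readBytes <= 0) {
                byteBuf.release();
                break;
            }
            handle.record(readBytes);                   // feedback: lets the next allocate() adapt its size
            // hand byteBuf down the pipeline here; a later handler is responsible for releasing it
        }
    }
}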
What is the benefit of this? Compare it with java.nio.ByteBuffer from the JDK's native NIO library: it is backed by a fixed-length byte array, so a native buffer cannot grow dynamically, as the following code shows:
public abstract class ByteBuffer extends Buffer implements Comparable<ByteBuffer>
{
    // These fields are declared here rather than in Heap-X-Buffer in order to
    // reduce the number of virtual method invocations needed to access these
    // values, which is especially costly when coding small buffers.
    //
    final byte[] hb;        // Non-null only for heap buffers
    final int offset;
    boolean isReadOnly;     // Valid only for heap buffers
Before the benefits, let's dwell on the downside of the native ByteBuffer's fixed capacity: developers usually cannot predict the length of each message up front, or how much space queued messages will need. You could of course simply allocate a fairly large ByteBuffer, and that usually works, but for scenarios such as large-scale push services or high concurrency it puts a heavy memory burden on the server and amounts to wasted resources.
A concrete example: in a large-scale push service, suppose a single message is capped at 16K and the average message is 6K. To support 16K messages the buffer has to be sized at 16K. With massive numbers of push connections, say 1,000,000 concurrent links, each with its own ByteBuffer receive buffer, the extra memory consumed is 1,000,000 * (16K - 6K) ≈ 9,766M, close to 10 GB. Startling, isn't it? Memory on that scale not only raises hardware costs but also leads to long full GC pauses, which is a serious blow to system maintenance and stability.
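A quick back-of-the-envelope check of that figure (plain arithmetic, nothing Netty-specific):
public class WastedMemoryCheck {
    public static void main(String[] args) {
        // Wasted memory = connections * (max message size - average message size)
        long connections = 1_000_000L;
        long wastedPerConnection = (16 - 6) * 1024L;                  // 10 KB wasted per connection
        long totalWastedBytes = connections * wastedPerConnection;
        System.out.println(totalWastedBytes / (1024 * 1024) + " MB"); // prints 9765 MB, roughly 9.5 GB
    }
}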
With the downside covered, the benefit is obvious: being able to adjust the buffer size flexibly is a wonderful thing. RecvByteBufAllocator actually ships with two implementations: AdaptiveRecvByteBufAllocator and FixedRecvByteBufAllocator.
Let's start with the simpler FixedRecvByteBufAllocator; its logic is as follows:
public class FixedRecvByteBufAllocator implements RecvByteBufAllocator {

    private static final class HandleImpl implements Handle {

        private final int bufferSize;

        HandleImpl(int bufferSize) {
            this.bufferSize = bufferSize;
        }

        @Override
        public ByteBuf allocate(ByteBufAllocator alloc) {
            return alloc.ioBuffer(bufferSize);
        }

        @Override
        public int guess() {
            return bufferSize;
        }

        @Override
        public void record(int actualReadBytes) { }
    }

    private final Handle handle;

    /**
     * Creates a new predictor that always returns the same prediction of
     * the specified buffer size.
     */
    public FixedRecvByteBufAllocator(int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException(
                    "bufferSize must greater than 0: " + bufferSize);
        }
        handle = new HandleImpl(bufferSize);
    }

    @Override
    public Handle newHandle() {
        return handle;
    }
}
The implementation is trivial: FixedRecvByteBufAllocator provides a receive buffer of fixed size, much like the JDK's native ByteBuffer, so we can skip ahead.
PS: to avoid any misunderstanding, a clarification: although the ByteBuf it allocates has a fixed initial capacity, a Netty ByteBuf can still grow dynamically when that capacity runs out, by virtue of how ByteBuf itself works.
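A tiny sketch of that ByteBuf behavior (my own illustration, using Unpooled for simplicity):
import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;

public class ByteBufGrowDemo {
    public static void main(String[] args) {
        // Initial capacity is 16, but maxCapacity defaults to Integer.MAX_VALUE,
        // so writing past 16 bytes simply triggers an internal expansion.
        ByteBuf buf = Unpooled.buffer(16);
        buf.writeBytes(new byte[100]);
        System.out.println(buf.capacity());   // larger than 16 after the write
        buf.release();
    }
}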
The interesting one is AdaptiveRecvByteBufAllocator. According to the official description it computes a size from the messages received so far and adjusts the buffer dynamically, trading CPU for memory. The concrete strategy: based on the size of the packets the Channel has received, if a read fills the receive buffer's writable space, the predicted capacity is expanded; if two consecutive reads receive less than a given threshold, the capacity is shrunk to save memory. In practice it is configured like this:
Bootstrap serverBootstrap = new Bootstrap();
serverBootstrap.group(nioEventLoopGroup);
serverBootstrap.channel(NioDatagramChannel.class);
serverBootstrap.option(ChannelOption.RCVBUF_ALLOCATOR,
        new AdaptiveRecvByteBufAllocator(DEFAULT_MIN_SIZE, DEFAULT_INITIAL_SIZE, DEFAULT_MAX_SIZE));
// Or use the default allocator instead:
// option(ChannelOption.RCVBUF_ALLOCATOR, AdaptiveRecvByteBufAllocator.DEFAULT);
Note that, whether for the receive buffer or the send buffer, the size is best set to the average message size rather than the maximum. The initial size of the receive buffer can be set through the following constructor:
/**
* Creates a new predictor with the specified parameters.
*
* @param minimum the inclusive lower bound of the expected buffer size
* @param initial the initial buffer size when no feed back was received
* @param maximum the inclusive upper bound of the expected buffer size
*/
public AdaptiveRecvByteBufAllocator(int minimum, int initial, int maximum) {
As the AdaptiveRecvByteBufAllocator constructor above shows, it takes three parameters. The allocator also provides a default instance:
static final int DEFAULT_MINIMUM = 64;
static final int DEFAULT_INITIAL = 1024;
static final int DEFAULT_MAXIMUM = 65536;
public static final AdaptiveRecvByteBufAllocator DEFAULT = new AdaptiveRecvByteBufAllocator();
/**
* Creates a new predictor with the default parameters. With the default
* parameters, the expected buffer size starts from {@code 1024}, does not
* go down below {@code 64}, and does not go up above {@code 65536}.
*/
private AdaptiveRecvByteBufAllocator() {
    this(DEFAULT_MINIMUM, DEFAULT_INITIAL, DEFAULT_MAXIMUM);
}
So how is this implemented internally? Let's walk through the source step by step.
The first step is the static initializer:
private static final int INDEX_INCREMENT = 4;
private static final int INDEX_DECREMENT = 1;

private static final int[] SIZE_TABLE;

static {
    List<Integer> sizeTable = new ArrayList<Integer>();
    for (int i = 16; i < 512; i += 16) {
        sizeTable.add(i);
    }

    for (int i = 512; i > 0; i <<= 1) {
        sizeTable.add(i);
    }

    SIZE_TABLE = new int[sizeTable.size()];
    for (int i = 0; i < SIZE_TABLE.length; i ++) {
        SIZE_TABLE[i] = sizeTable.get(i);
    }
}
This allocates an int array and fills it in. Looking at the loops, the table ends up with 53 entries: the first 32 are multiples of 16, running from 16 up to 512; from the 33rd entry onward each value is double the previous one, i.e. 1024, 2048, ... up to the maximum 1073741824.
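A standalone sketch (my own, purely to verify those numbers) that rebuilds the table with the same two loops:
import java.util.ArrayList;
import java.util.List;

public class SizeTableCheck {
    public static void main(String[] args) {
        List<Integer> sizeTable = new ArrayList<Integer>();
        for (int i = 16; i < 512; i += 16) {
            sizeTable.add(i);                                     // 16, 32, ..., 496
        }
        for (int i = 512; i > 0; i <<= 1) {
            sizeTable.add(i);                                     // 512, 1024, ..., 1073741824; then i overflows and the loop ends
        }
        System.out.println(sizeTable.size());                     // 53
        System.out.println(sizeTable.get(31));                    // 512 (the 32nd entry)
        System.out.println(sizeTable.get(sizeTable.size() - 1));  // 1073741824
    }
}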
Now let's look at the AdaptiveRecvByteBufAllocator constructor itself:
public AdaptiveRecvByteBufAllocator(int minimum, int initial, int maximum) {
    if (minimum <= 0) {
        throw new IllegalArgumentException("minimum: " + minimum);
    }
    if (initial < minimum) {
        throw new IllegalArgumentException("initial: " + initial);
    }
    if (maximum < initial) {
        throw new IllegalArgumentException("maximum: " + maximum);
    }

    int minIndex = getSizeTableIndex(minimum);
    if (SIZE_TABLE[minIndex] < minimum) {
        this.minIndex = minIndex + 1;
    } else {
        this.minIndex = minIndex;
    }

    int maxIndex = getSizeTableIndex(maximum);
    if (SIZE_TABLE[maxIndex] > maximum) {
        this.maxIndex = maxIndex - 1;
    } else {
        this.maxIndex = maxIndex;
    }

    this.initial = initial;
}
The constructor takes three parameters: the inclusive lower bound on the expected buffer size, the initial size used before any feedback has been received, and the inclusive upper bound on the expected buffer size.
As the checks show, minimum must not exceed the initial size, and the initial size must not exceed maximum.
The logic above also depends on a key helper, getSizeTableIndex; let's see what it actually does:
private static int getSizeTableIndex(final int size) {
    for (int low = 0, high = SIZE_TABLE.length - 1;;) {
        if (high < low) {
            return low;
        }
        if (high == low) {
            return high;
        }

        int mid = low + high >>> 1;
        int a = SIZE_TABLE[mid];
        int b = SIZE_TABLE[mid + 1];
        if (size > b) {
            low = mid + 1;
        } else if (size < a) {
            high = mid - 1;
        } else if (size == a) {
            return mid;
        } else {
            return mid + 1;
        }
    }
}
Given a size, it performs a binary search over SIZE_TABLE to locate the index for that size: an exact match returns its own index, and a size that falls between two entries is rounded up to the next one. With that, the constructor ends up initializing the following three fields:
private final int minIndex;
private final int maxIndex;
// note: the last field stores the raw initial size, not a table index
private final int initial;
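To make the lookup concrete, here is a small check of a few inputs (my own test code, not Netty's; for sizes within the table's range, a simple "first entry >= size" scan gives the same answer as the binary search above):
import java.util.ArrayList;
import java.util.List;

public class SizeTableIndexCheck {
    public static void main(String[] args) {
        // Rebuild the same table as the static initializer above.
        List<Integer> table = new ArrayList<Integer>();
        for (int i = 16; i < 512; i += 16) { table.add(i); }
        for (int i = 512; i > 0; i <<= 1) { table.add(i); }

        int[] sizes = { 512, 1000, 2048 };
        for (int size : sizes) {
            int index = 0;
            while (table.get(index) < size) { index++; }      // linear equivalent of getSizeTableIndex
            System.out.println(size + " -> index " + index + " (entry " + table.get(index) + ")");
        }
        // Expected: 512 -> 31 (exact match), 1000 -> 32 (rounds up to 1024), 2048 -> 33 (exact match)
    }
}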
By now we have a rough picture of AdaptiveRecvByteBufAllocator's internal data structures. The actual per-read processing lives in the following Handle implementation:
private static final class HandleImpl implements Handle {

    private final int minIndex;
    private final int maxIndex;
    private int index;
    private int nextReceiveBufferSize;
    private boolean decreaseNow;

    HandleImpl(int minIndex, int maxIndex, int initial) {
        this.minIndex = minIndex;
        this.maxIndex = maxIndex;

        index = getSizeTableIndex(initial);
        nextReceiveBufferSize = SIZE_TABLE[index];
    }

    @Override
    public ByteBuf allocate(ByteBufAllocator alloc) {
        return alloc.ioBuffer(nextReceiveBufferSize);
    }

    @Override
    public int guess() {
        return nextReceiveBufferSize;
    }

    ...
}
The key field in this implementation is nextReceiveBufferSize. As the constructor shows, initial is first mapped to a table index, and SIZE_TABLE is then consulted at that index for the buffer size to allocate.
Further down, the actual allocation passes nextReceiveBufferSize as the requested size, so this field is what every capacity adjustment revolves around.
So who updates it? Let's look at the last method of HandleImpl:
@Override
public void record(int actualReadBytes) {
    if (actualReadBytes <= SIZE_TABLE[Math.max(0, index - INDEX_DECREMENT - 1)]) {
        if (decreaseNow) {
            index = Math.max(index - INDEX_DECREMENT, minIndex);
            nextReceiveBufferSize = SIZE_TABLE[index];
            decreaseNow = false;
        } else {
            decreaseNow = true;
        }
    } else if (actualReadBytes >= nextReceiveBufferSize) {
        index = Math.min(index + INDEX_INCREMENT, maxIndex);
        nextReceiveBufferSize = SIZE_TABLE[index];
        decreaseNow = false;
    }
}
The method's argument is the number of bytes actually read by one read operation, and it is compared against nextReceiveBufferSize. If actualReadBytes is greater than or equal to that value, nextReceiveBufferSize is bumped up immediately; the new value is governed by INDEX_INCREMENT, a constant defaulting to 4, so expansion jumps several table slots at once to make sure the next read has enough room. The shrink strategy, by contrast, is deliberately conservative: its constant is INDEX_DECREMENT, with a value of 1, and it too works by comparison, but against a lower threshold (SIZE_TABLE[index - INDEX_DECREMENT - 1]). When the bytes actually read fall below that threshold, the size is not reduced right away; decreaseNow is set to true, and only if the next read is also below the threshold is nextReceiveBufferSize actually decreased.
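A hypothetical walkthrough against the default instance (assuming the Netty 4.0 implementation shown above; the numbers follow from the default SIZE_TABLE):
import io.netty.channel.AdaptiveRecvByteBufAllocator;
import io.netty.channel.RecvByteBufAllocator;

public class RecordWalkthrough {
    public static void main(String[] args) {
        // The default handle starts at 1024 bytes (index 32 in SIZE_TABLE).
        RecvByteBufAllocator.Handle handle = AdaptiveRecvByteBufAllocator.DEFAULT.newHandle();
        System.out.println(handle.guess());   // 1024

        // A read that fills the buffer expands aggressively: index 32 + 4 = 36 -> 16384.
        handle.record(1024);
        System.out.println(handle.guess());   // 16384

        // Shrinking needs two consecutive small reads (<= SIZE_TABLE[index - 2] = 4096 here).
        handle.record(16);                    // first small read only arms decreaseNow
        System.out.println(handle.guess());   // still 16384
        handle.record(16);                    // second small read shrinks by one step: index 36 - 1 = 35
        System.out.println(handle.guess());   // 8192
    }
}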
At this point we have a good grasp of how the adaptive receive buffer works: at its core it allocates sizes out of the pre-built SIZE_TABLE array.
Turning to actual usage, let's first look, per protocol, at how Netty 4 picks the default receive buffer.
Starting with UDP, here is the most basic bootstrap code:
EventLoopGroup group = new NioEventLoopGroup();
try {
    Bootstrap b = new Bootstrap();
    b.group(group)
     .channel(NioDatagramChannel.class)
     .handler(new XXXXXXServerHandler());

    b.bind(PORT).sync().channel().closeFuture().await();
} finally {
    group.shutdownGracefully();
}
The code above is the usual way to start a UDP server. Let's dig into NioDatagramChannel and see how it creates its receive buffer.
(If you want to see how execution reaches this constructor, follow the call chain bind -> doBind -> initAndRegister -> newChannel -> newInstance.)
private static final RecvByteBufAllocator DEFAULT_RCVBUF_ALLOCATOR = new FixedRecvByteBufAllocator(2048);

/**
 * Create a new instance which will use the Operation Systems default {@link InternetProtocolFamily}.
 */
public NioDatagramChannel() {
    this(newSocket(DEFAULT_SELECTOR_PROVIDER));
}

public NioDatagramChannel(DatagramChannel socket) {
    super(null, socket, SelectionKey.OP_READ);
    config = new NioDatagramChannelConfig(this, socket);
}
As shown above, a UDP server channel defaults to a FixedRecvByteBufAllocator of only 2048 bytes, so keep that in mind when you use it!
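If 2048 bytes is too small, as in the Syslog-over-UDP case mentioned at the beginning, the default can be overridden when bootstrapping. A minimal sketch (the 65535 figure is only an illustrative choice sized to a full UDP datagram; pick whatever fits your protocol):
Bootstrap b = new Bootstrap();
b.group(group)
 .channel(NioDatagramChannel.class)
 // replace the 2048-byte default with a larger fixed receive buffer
 .option(ChannelOption.RCVBUF_ALLOCATOR, new FixedRecvByteBufAllocator(65535))
 .handler(new XXXXXXServerHandler());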
Good. So what does the flow look like for TCP? Again, a typical TCP bootstrap:
EventLoopGroup bossGroup = new NioEventLoopGroup(1);
EventLoopGroup workerGroup = new NioEventLoopGroup();
try {
    ServerBootstrap b = new ServerBootstrap();
    b.group(bossGroup, workerGroup)
     .channel(NioServerSocketChannel.class)
     .handler(new LoggingHandler(LogLevel.INFO))
     .childHandler(new SocksServerInitializer());

    b.bind(PORT).sync().channel().closeFuture().sync();
} finally {
    bossGroup.shutdownGracefully();
    workerGroup.shutdownGracefully();
}
As you can see, two groups are created. Anyone familiar with the Reactor model knows they correspond to two kinds of channel:
bossGroup serves NioServerSocketChannel, while workerGroup serves NioSocketChannel. So how does NioSocketChannel wire up its receive buffer?
Starting from the worker group's event-loop run method (not expanded here), execution eventually reaches processSelectedKey and performs the read:
@Override
public final void read() {
    final ChannelConfig config = config();
    if (!config.isAutoRead() && !isReadPending()) {
        // ChannelConfig.setAutoRead(false) was called in the meantime
        removeReadOp();
        return;
    }

    final ChannelPipeline pipeline = pipeline();
    final ByteBufAllocator allocator = config.getAllocator();
    final int maxMessagesPerRead = config.getMaxMessagesPerRead();
    RecvByteBufAllocator.Handle allocHandle = this.allocHandle;
    if (allocHandle == null) {
        this.allocHandle = allocHandle = config.getRecvByteBufAllocator().newHandle();
    }

    ByteBuf byteBuf = null;
    int messages = 0;
    boolean close = false;
    try {
        int totalReadAmount = 0;
        boolean readPendingReset = false;
        do {
            byteBuf = allocHandle.allocate(allocator);
            ...
As the snippet shows, the allocator for the receive ByteBuf is obtained from the channel config. Let's step into the relevant config code:
private static final RecvByteBufAllocator DEFAULT_RCVBUF_ALLOCATOR = AdaptiveRecvByteBufAllocator.DEFAULT;

private volatile RecvByteBufAllocator rcvBufAllocator = DEFAULT_RCVBUF_ALLOCATOR;

@Override
public RecvByteBufAllocator getRecvByteBufAllocator() {
    return rcvBufAllocator;
}
So in practice the receive buffer allocator here is the default AdaptiveRecvByteBufAllocator instance with its default policy.
Now that we know the different defaults for TCP and UDP, if a scenario calls for our own configuration, it can be done like this:
.option(ChannelOption.RCVBUF_ALLOCATOR, AdaptiveRecvByteBufAllocator.DEFAULT);
.option(ChannelOption.RCVBUF_ALLOCATOR, new FixedRecvByteBufAllocator(DEFAULT_SIZE));
.option(ChannelOption.RCVBUF_ALLOCATOR,
        new AdaptiveRecvByteBufAllocator(DEFAULT_MIN_SIZE, DEFAULT_INITIAL_SIZE, DEFAULT_MAX_SIZE));
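One caveat: on a ServerBootstrap, option(...) configures the parent NioServerSocketChannel, so to apply the allocator to the accepted NioSocketChannel connections use childOption(...) instead. A minimal sketch (DEFAULT_MIN_SIZE and friends are the same placeholder constants used above):
ServerBootstrap b = new ServerBootstrap();
b.group(bossGroup, workerGroup)
 .channel(NioServerSocketChannel.class)
 // applies to every accepted child connection (NioSocketChannel)
 .childOption(ChannelOption.RCVBUF_ALLOCATOR,
         new AdaptiveRecvByteBufAllocator(DEFAULT_MIN_SIZE, DEFAULT_INITIAL_SIZE, DEFAULT_MAX_SIZE))
 .childHandler(new SocksServerInitializer());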