PulsarClient
PulsarClient client = PulsarClient.builder()
.serviceUrl("pulsar://localhost:6650")
.build();
Let's look at the main methods of this class (a usage sketch follows the list):
creating producers / consumers / readers
metadata-related methods
transaction-related methods
the close methods
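A minimal usage sketch of these entry points, reusing the client built above (the topic and subscription names are made up for illustration):
// create a producer / consumer / reader from the client
Producer<byte[]> producer = client.newProducer()
        .topic("persistent://public/default/my-topic")
        .create();
Consumer<byte[]> consumer = client.newConsumer()
        .topic("persistent://public/default/my-topic")
        .subscriptionName("my-sub")
        .subscribe();
Reader<byte[]> reader = client.newReader()
        .topic("persistent://public/default/my-topic")
        .startMessageId(MessageId.earliest)
        .create();
// metadata: list the partitions of a topic
List<String> partitions = client.getPartitionsForTopic("persistent://public/default/my-topic").get();
// close everything, including the client itself
producer.close();
consumer.close();
reader.close();
client.close();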
ClientBuilder
The builder() method is used to pass in the PulsarClient configuration.
Supported configuration options (a builder example follows the list):
-
Connection settings:
connection address: serviceUrl / serviceUrlProvider / listener / proxyServiceUrl
operation timeout: operationTimeout
-
TCP settings:
tcpNoDelay
keepAliveInterval
connection establishment timeout: connectionTimeout
how many connections to create per broker
request retry policy (how long the backoff is after a request fails)
-
Lookup request settings:
lookup request concurrency
maximum number of redirects
maximum number of rejected requests per connection
-
Thread counts:
ioThreads
listenerThreads
TLS and authentication
transactions
metrics
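Roughly, setting these options through ClientBuilder looks like the sketch below (the values are arbitrary and only illustrate the knobs listed above):
PulsarClient client = PulsarClient.builder()
        // connection settings
        .serviceUrl("pulsar://localhost:6650")
        .operationTimeout(30, TimeUnit.SECONDS)
        // tcp settings
        .enableTcpNoDelay(true)
        .keepAliveInterval(30, TimeUnit.SECONDS)
        .connectionTimeout(10, TimeUnit.SECONDS)
        .connectionsPerBroker(1)
        .startingBackoffInterval(100, TimeUnit.MILLISECONDS)
        .maxBackoffInterval(60, TimeUnit.SECONDS)
        // lookup settings
        .maxConcurrentLookupRequests(5000)
        .maxLookupRedirects(20)
        .maxNumberOfRejectedRequestPerConnection(50)
        // thread counts
        .ioThreads(1)
        .listenerThreads(1)
        .build();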
ClientBuilder.build() simply passes the configuration straight into the PulsarClientImpl constructor.
Let's see what happens in there.
PulsarClientImpl
package org.apache.pulsar.client.impl;
public class PulsarClientImpl implements PulsarClient {
// lookup service
private LookupService lookup;
// connection pool
private final ConnectionPool cnxPool;
// netty's HashedWheelTimer, used to schedule delayed operations
private final Timer timer;
private final ExecutorProvider externalExecutorProvider;
private final ExecutorProvider internalExecutorService;
// current state of this PulsarClient
private AtomicReference<State> state = new AtomicReference<>();
// all the business-logic units (client-side logic)
private final Set<ProducerBase<?>> producers;
private final Set<ConsumerBase<?>> consumers;
// id generators
private final AtomicLong producerIdGenerator = new AtomicLong();
private final AtomicLong consumerIdGenerator = new AtomicLong();
private final AtomicLong requestIdGenerator = new AtomicLong();
// the EventLoopGroup here seems to be used only as a thread pool:
// 0. passed to ConnectionPool as the connection io thread pool (the usual netty client usage)
// 1. used in Consumer to periodically flush the PersistentAcknowledgmentsGroupingTracker
// 2. used in Producer to periodically generate encryption keys
// 3. passed as a constructor argument to AsyncHttpClient
private final EventLoopGroup eventLoopGroup;
// cache of schema providers
private final LoadingCache<String, SchemaInfoProvider> schemaProviderLoadingCache;
// used by producers to generate the publish time
private final Clock clientClock;
@Getter
private TransactionCoordinatorClientImpl tcClient;
The constructor of this class mainly initializes these key fields; nothing special happens there.
LookupService
Depending on the configuration, either HttpLookupService
or BinaryProtoLookupService is chosen.
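Roughly, the selection in the PulsarClientImpl constructor looks like the sketch below (simplified; the exact constructor arguments differ a bit between versions):
// an http(s):// service URL means lookups go through the broker's HTTP endpoint,
// otherwise the binary protocol on the pulsar:// port is used
if (conf.getServiceUrl().startsWith("http")) {
    lookup = new HttpLookupService(conf, eventLoopGroup);
} else {
    lookup = new BinaryProtoLookupService(this, conf.getServiceUrl(), conf.isUseTls(), externalExecutorProvider.getExecutor());
}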
ConnectionPool
Let's look at ConnectionPool first.
package org.apache.pulsar.client.impl;
public class ConnectionPool implements Closeable {
// the connection pool, caching connections
// address -> the n-th connection -> connection
// if maxConnectionsPerHosts=0 is configured, pooling is disabled
protected final ConcurrentHashMap<InetSocketAddress, ConcurrentMap<Integer, CompletableFuture<ClientCnx>>> pool;
// netty related
// passed in by PulsarClient
private final EventLoopGroup eventLoopGroup;
private final Bootstrap bootstrap;
private final PulsarChannelInitializer channelInitializerHandler;
protected final DnsNameResolver dnsResolver;
// configuration
private final ClientConfigurationData clientConfig;
private final int maxConnectionsPerHosts;
// whether this is a Server Name Indication proxy; TLS related, ignored for now
private final boolean isSniProxy;
The constructor mainly initializes these fields in the usual netty network-client fashion:
bootstrap = new Bootstrap();
// bind the io thread pool
bootstrap.group(eventLoopGroup);
// configure the channel type; if Epoll is available this becomes an Epoll channel
bootstrap.channel(EventLoopUtil.getClientSocketChannelClass(eventLoopGroup));
// set the tcp connect timeout
bootstrap.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, conf.getConnectionTimeoutMs());
// set tcp no delay
bootstrap.option(ChannelOption.TCP_NODELAY, conf.isUseTcpNoDelay());
// configure the allocator
bootstrap.option(ChannelOption.ALLOCATOR, PulsarByteBufAllocator.DEFAULT);
// bind the channel initializer
channelInitializerHandler = new PulsarChannelInitializer(conf, clientCnxSupplier);
bootstrap.handler(channelInitializerHandler);
// this class is provided by netty for DNS resolution; more on it later
this.dnsResolver = new DnsNameResolverBuilder(eventLoopGroup.next()).traceEnabled(true)
.channelType(EventLoopUtil.getDatagramChannelClass(eventLoopGroup)).build();
}
The ByteBuf allocator passed in here (PulsarByteBufAllocator.DEFAULT) is a custom one.
The main responsibilities of this connection pool:
creating and caching connections
releasing connections back
limiting the number of connections per host according to the configured maxConnectionsPerHosts (sketched after the usage example below)
For concrete usage, refer to the org.apache.pulsar.client.impl.ConnectionPoolTest
class:
ConnectionPool pool;
InetSocketAddress brokerAddress = ....;
// get a connection; if none exists yet, one is created
CompletableFuture<ClientCnx> conn = pool.getConnection(brokerAddress);
ClientCnx cnx = conn.get();
// use the connection to do some work
...
// give it back to the pool
pool.releaseConnection(cnx);
pool.closeAllConnections();
pool.close();
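A sketch of how the maxConnectionsPerHosts limit is enforced inside getConnection (simplified from the actual implementation):
// maxConnectionsPerHosts == 0 disables pooling: a fresh connection is created every time;
// otherwise a random slot in [0, maxConnectionsPerHosts) decides which cached connection is reused
public CompletableFuture<ClientCnx> getConnection(InetSocketAddress address) {
    if (maxConnectionsPerHosts == 0) {
        // pooling disabled
        return createConnection(address, address, -1);
    }
    final int randomKey = signSafeMod(random.nextInt(), maxConnectionsPerHosts);
    return pool.computeIfAbsent(address, a -> new ConcurrentHashMap<>())
               .computeIfAbsent(randomKey, k -> createConnection(address, address, randomKey));
}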
Let's first look at the PulsarChannelInitializer class,
which initializes the connection to the pulsar broker.
public void initChannel(SocketChannel ch) throws Exception {
// TLS related
ch.pipeline().addLast("ByteBufPairEncoder", tlsEnabled ? ByteBufPair.COPYING_ENCODER : ByteBufPair.ENCODER);
// length-field based frame decoder
ch.pipeline().addLast("frameDecoder",
new LengthFieldBasedFrameDecoder(
Commands.DEFAULT_MAX_MESSAGE_SIZE + Commands.MESSAGE_SIZE_FRAME_PADDING, 0, 4, 0, 4));
// from here on we have the deserialized RPC objects and can run the client-side logic;
// all of that logic actually lives in the ClientCnx class
ch.pipeline().addLast("handler", clientCnxSupplier.get());
}
Connection-creation logic (connectToAddress)
netty's bootstrap.connect
(ignoring TLS)
ClientCnx
Let's look at this class's hierarchy:
public class ClientCnx extends PulsarHandler;
public abstract class PulsarHandler extends PulsarDecoder;
public abstract class PulsarDecoder extends ChannelInboundHandlerAdapter;
PulsarDecoder
When the connection was initialized earlier, a LengthFieldBasedFrameDecoder was added in front of this class,
so by the time channelRead is reached here the RPC can be deserialized directly,
and the corresponding RPC handler method (handleXXXXXX) is then invoked:
public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception {
...
// Get a buffer that contains the full frame
ByteBuf buffer = (ByteBuf) msg;
BaseCommand cmd = null;
BaseCommand.Builder cmdBuilder = null;
try {
// De-serialize the command
int cmdSize = (int) buffer.readUnsignedInt();
int writerIndex = buffer.writerIndex();
buffer.writerIndex(buffer.readerIndex() + cmdSize);
// take a ByteBufCodedInputStream from the object pool
ByteBufCodedInputStream cmdInputStream = ByteBufCodedInputStream.get(buffer);
cmdBuilder = BaseCommand.newBuilder();
// deserialize
cmd = cmdBuilder.mergeFrom(cmdInputStream, null).build();
buffer.writerIndex(writerIndex);
cmdInputStream.recycle();
...
// dispatch to a different handler method depending on the RPC type
switch (cmd.getType()) {
case PARTITIONED_METADATA:
checkArgument(cmd.hasPartitionMetadata());
try {
interceptCommand(cmd);
handlePartitionMetadataRequest(cmd.getPartitionMetadata());
} catch (InterceptException e) {
ctx.writeAndFlush(Commands.newPartitionMetadataResponse(getServerError(e.getErrorCode()),
e.getMessage(), cmd.getPartitionMetadata().getRequestId()));
} finally {
cmd.getPartitionMetadata().recycle();
}
break;
...
// other RPC types omitted; they all follow the same handleXXXXX pattern
} finally {
// cleanup
if (cmdBuilder != null) {
cmdBuilder.recycle();
}
if (cmd != null) {
cmd.recycle();
}
buffer.release();
}
}
PulsarHandler
This class mainly adds the implementation of the keep-alive logic.
The relevant methods are easy to read on their own; a rough sketch follows.
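A rough sketch of that keep-alive logic (simplified; the real class also checks whether the handshake has completed and uses the configured keepAliveInterval as the period):
// a task scheduled at the keep-alive interval: send a Ping, and if the previous Ping
// was never answered, consider the connection dead and close it
private void handleKeepAliveTimeout() {
    if (!ctx.channel().isOpen()) {
        return;
    }
    if (waitingForPingResponse) {
        // no Pong arrived since the last Ping
        ctx.close();
    } else {
        waitingForPingResponse = true;
        ctx.writeAndFlush(Commands.newPing());
    }
}
@Override
protected void handlePing(CommandPing ping) {
    // answer the peer's Ping with a Pong
    ctx.writeAndFlush(Commands.newPong());
}
@Override
protected void handlePong(CommandPong pong) {
    waitingForPingResponse = false;
}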
ClientCnx
This class is mainly responsible for the interaction logic with the server.
package org.apache.pulsar.client.impl;
public class ClientCnx extends PulsarHandler {
// connection state
enum State {
None, SentConnectFrame, Ready, Failed, Connecting
}
private State state;
//----------------------------------------------------------------------
// queue of in-flight requests waiting for a response
// requestId -> request
private final ConcurrentLongHashMap<CompletableFuture<? extends Object>> pendingRequests =
new ConcurrentLongHashMap<>(16, 1);
// queue of waiting lookup requests
private final Queue<Pair<Long, Pair<ByteBuf, CompletableFuture<LookupDataResult>>>> waitingLookupRequests;
//----------------------------------------------------------------------
// the business-logic units
private final ConcurrentLongHashMap<ProducerImpl<?>> producers = new ConcurrentLongHashMap<>(16, 1);
private final ConcurrentLongHashMap<ConsumerImpl<?>> consumers = new ConcurrentLongHashMap<>(16, 1);
private final ConcurrentLongHashMap<TransactionMetaStoreHandler> transactionMetaStoreHandlers = new ConcurrentLongHashMap<>(16, 1);
//----------------------------------------------------------------------
// handle for the asynchronous connection setup
private final CompletableFuture<Void> connectionFuture = new CompletableFuture<Void>();
//----------------------------------------------------------------------
// thread pool passed in when the PulsarClient was constructed
private final EventLoopGroup eventLoopGroup;
//----------------------------------------------------------------------
// rate limiting (lookup related)
private final Semaphore pendingLookupRequestSemaphore;
private final Semaphore maxLookupRequestSemaphore;
// request-rejection related fields (lookup related)
private final int maxNumberOfRejectedRequestPerConnection;
private final int rejectedRequestResetTimeSec = 60;
// number of rejected requests (lookup related)
private static final AtomicIntegerFieldUpdater<ClientCnx> NUMBER_OF_REJECTED_REQUESTS_UPDATER = AtomicIntegerFieldUpdater
.newUpdater(ClientCnx.class, "numberOfRejectRequests");
@SuppressWarnings("unused")
private volatile int numberOfRejectRequests = 0;
//----------------------------------------------------------------------
// data structure used to check whether a request has timed out
private static class RequestTime {
final long creationTimeMs;
final long requestId;
final RequestType requestType;
RequestTime(long creationTime, long requestId, RequestType requestType) {
this.creationTimeMs = creationTime;
this.requestId = requestId;
this.requestType = requestType;
}
}
// queue of requests pending the timeout check
private final ConcurrentLinkedQueue<RequestTime> requestTimeoutQueue = new ConcurrentLinkedQueue<>();
//----------------------------------------------------------------------
// maximum message size
@Getter
private static int maxMessageSize = Commands.DEFAULT_MAX_MESSAGE_SIZE;
// RPC protocol version
private final int protocolVersion;
// operation timeout
private final long operationTimeoutMs;
// handle of the task that checks for operation timeouts
private ScheduledFuture<?> timeoutTask;
//----------------------------------------------------------------------
// information about whether we are connecting through a proxy
protected String proxyToTargetBrokerAddress = null;
protected String remoteHostName = null;
// TLS related
private boolean isTlsHostnameVerificationEnable;
private static final TlsHostnameVerifier HOSTNAME_VERIFIER = new TlsHostnameVerifier();
protected final Authentication authentication;
protected AuthenticationDataProvider authenticationDataProvider;
//----------------------------------------------------------------------
// transaction related
private TransactionBufferHandler transactionBufferHandler;
private enum RequestType {
Command,
GetLastMessageId,
GetTopics,
GetSchema,
GetOrCreateSchema;
String getDescription() {
if (this == Command) {
return "request";
} else {
return name() + " request";
}
}
}
Let's briefly go back to the ConnectionPool logic: creating a connection earlier actually called Bootstrap.connect,
which returns a netty Channel object, yet what ConnectionPool hands out are ClientCnx
objects.
ConnectionPool
private CompletableFuture<ClientCnx> createConnection(InetSocketAddress logicalAddress,
InetSocketAddress physicalAddress, int connectionKey) {
final CompletableFuture<ClientCnx> cnxFuture = new CompletableFuture<ClientCnx>();
// Trigger async connect to broker
createConnection(physicalAddress).thenAccept(channel -> {
....
// the ClientCnx object is actually taken from the pipeline of the already-connected Channel
final ClientCnx cnx = (ClientCnx) channel.pipeline().get("handler");
....
if (!logicalAddress.equals(physicalAddress)) {
// We are connecting through a proxy. We need to set the target broker in the ClientCnx object so that
// it can be specified when sending the CommandConnect.
// That phase will happen in the ClientCnx.connectionActive() which will be invoked immediately after
// this method.
cnx.setTargetBroker(logicalAddress);
}
// remember the address of the remote end
cnx.setRemoteHostName(physicalAddress.getHostName());
cnx.connectionFuture().thenRun(() -> {
...
// the connection succeeded, complete the future with the ClientCnx
cnxFuture.complete(cnx);
}).exceptionally(exception -> {
cnxFuture.completeExceptionally(exception);
cleanupConnection(logicalAddress, connectionKey, cnxFuture);
cnx.ctx().close();
return null;
});
...
The main methods (responsibilities) of ClientCnx (a usage sketch follows this list):
-
Connection lifecycle management (the netty handler methods)
channelActive
channelInactive
exceptionCaught
......
-
Sending requests: methods that actively send an RPC and handle it according to the business logic
lookup requests
getLastMessageId
getSchema
.....
Handling responses: the handleXXXXX RPC handlers inherited from
PulsarDecoder
The method that actively sends an RPC and gets the raw response back is
CompletableFuture<T> sendRequestAndHandleTimeout(ByteBuf requestMessage, long requestId, RequestType requestType)
Checking whether requests have timed out
checkRequestTimeout
-
Registering / removing the business-logic objects (these objects will get their own article later)
consumer
producer
transactionMetaStoreHandler
transactionBufferHandler
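A rough sketch of how a higher-level component uses a ClientCnx obtained from the pool (the registration and lookup methods shown exist on ClientCnx; the surrounding variables such as pool, client, consumerId and consumerImpl are illustrative):
// send a lookup RPC over the connection and get the parsed response back
ClientCnx cnx = pool.getConnection(brokerAddress).get();
long requestId = client.newRequestId();
ByteBuf lookupRequest = Commands.newLookup("persistent://public/default/my-topic", false, requestId);
CompletableFuture<LookupDataResult> lookupResult = cnx.newLookup(lookupRequest, requestId);
// business-logic objects register themselves on the connection so that
// incoming RPCs carrying their consumerId/producerId are routed back to them
cnx.registerConsumer(consumerId, consumerImpl);
...
cnx.removeConsumer(consumerId);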
The sendRequestAndHandleTimeout method
private <T> CompletableFuture<T> sendRequestAndHandleTimeout(ByteBuf requestMessage, long requestId, RequestType requestType) {
// put it into the pending-request queue, where it waits for the response
CompletableFuture<T> future = new CompletableFuture<>();
pendingRequests.put(requestId, future);
// send the RPC body directly
ctx.writeAndFlush(requestMessage).addListener(writeFuture -> {
if (!writeFuture.isSuccess()) {
log.warn("{} Failed to send {} to broker: {}", ctx.channel(), requestType.getDescription(), writeFuture.cause().getMessage());
pendingRequests.remove(requestId);
future.completeExceptionally(writeFuture.cause());
}
});
// add an entry to the timeout queue so the periodic check can detect a timeout
requestTimeoutQueue.add(new RequestTime(System.currentTimeMillis(), requestId, requestType));
return future;
}
The channelActive method
The logic here is fairly simple (sketched below):
PulsarHandler.channelActive starts the scheduled task for the keep-alive logic
ClientCnx.channelActive starts the scheduled task for the request-timeout logic
and sends a ConnectCommand request to the server (the server-side handling will be covered later)
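Roughly, ClientCnx.channelActive looks like this sketch (simplified; the real method also handles the proxy case and checks the result of the write):
@Override
public void channelActive(ChannelHandlerContext ctx) throws Exception {
    // PulsarHandler.channelActive starts the keep-alive task
    super.channelActive(ctx);
    // schedule the periodic request-timeout check, using operationTimeoutMs as the period
    this.timeoutTask = this.eventLoopGroup.scheduleAtFixedRate(
            this::checkRequestTimeout, operationTimeoutMs, operationTimeoutMs, TimeUnit.MILLISECONDS);
    // send the Connect command; the handshake finishes when the Connected
    // response comes back and handleConnected() moves the state to Ready
    ctx.writeAndFlush(newConnectCommand());
    state = State.SentConnectFrame;
}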
Handling request timeouts
This logic is also straightforward.
A periodic task scheduled on the EventLoopGroup checks whether any request in requestTimeoutQueue has timed out;
if so, that request's response is completed with a TimeoutException.
The interval of this timeout check is determined by operationTimeoutMs (a sketch follows).
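A simplified sketch of that check (close to the actual checkRequestTimeout, with the error-message details omitted):
private void checkRequestTimeout() {
    while (!requestTimeoutQueue.isEmpty()) {
        RequestTime request = requestTimeoutQueue.peek();
        if (request == null || System.currentTimeMillis() - request.creationTimeMs < operationTimeoutMs) {
            // the head of the queue has not timed out yet, so nothing after it has either
            break;
        }
        request = requestTimeoutQueue.poll();
        CompletableFuture<?> requestFuture = pendingRequests.remove(request.requestId);
        if (requestFuture != null && !requestFuture.isDone()) {
            // the response never arrived in time: fail the pending future
            requestFuture.completeExceptionally(new PulsarClientException.TimeoutException(
                    request.requestType.getDescription() + " timed out after " + operationTimeoutMs + " ms"));
        }
    }
}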
PulsarClient functionality recap
Let's recap the overall functionality of PulsarClient. It contains:
a connection pool used to create ClientCnx instances for talking to the server
the custom business-logic units it keeps track of (consumers, producers, tcClient)
the LookupService
a few periodic check tasks
the LoadingCache for schemas
The business-logic units register themselves on a ClientCnx; through that connection they send RPCs and receive the responses, which are passed back into the business logic.
To its users, PulsarClient provides an RPC-level abstraction; the other classes implement their own logic on top of these RPCs.