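In an SOA distributed service architecture, a registry handles service registration and the resolution of service calls. In RocketMQ, NameServer plays a similar role. It is the coordination hub of the whole RocketMQ system: it manages Broker registration and maintains the routing information that message sending and consumption depend on.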
Before walking through how NameServer works, let us look at the physical deployment architecture of RocketMQ, shown below:
(Image from the internet)
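In the physical deployment of RocketMQ, each Broker message server registers with every NameServer when it starts. After that, a message sender (Producer) fetches the list of Broker addresses from NameServer before sending a message. NameServer keeps a long-lived connection to each Broker: the Broker sends a heartbeat every 30s, and NameServer runs a liveness check on its Brokers every 10s. If it finds that a Broker has gone down, it removes that Broker from the route registry. In this design the NameServer instances do not communicate with one another, but every Broker, Producer, and Consumer needs to communicate with NameServer.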
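Because the NameServer instances never talk to each other, a Broker has to register with each of them separately, and a Producer can then query any single instance. The following is a minimal in-memory sketch of that idea. `DeploymentSketch`, `MiniNameServer`, and `registerWithAll` are hypothetical names invented for illustration; they are not RocketMQ classes, and no networking is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical, simplified model of the deployment; not RocketMQ's real classes.
class DeploymentSketch {

    /** A NameServer that only knows what Brokers have told it directly. */
    static final class MiniNameServer {
        private final Map<String, List<String>> topicToBrokerAddrs = new ConcurrentHashMap<>();

        void register(String topic, String brokerAddr) {
            topicToBrokerAddrs
                .computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>())
                .add(brokerAddr);
        }

        /** Returns a copy of the Broker addresses known for the topic (empty if unknown). */
        List<String> lookup(String topic) {
            return new ArrayList<>(topicToBrokerAddrs.getOrDefault(topic, List.of()));
        }
    }

    /** A Broker registers with every NameServer because they do not sync with each other. */
    static void registerWithAll(List<MiniNameServer> nameServers, String topic, String brokerAddr) {
        for (MiniNameServer ns : nameServers) {
            ns.register(topic, brokerAddr);
        }
    }

    public static void main(String[] args) {
        List<MiniNameServer> cluster = List.of(new MiniNameServer(), new MiniNameServer());
        registerWithAll(cluster, "TopicTest", "192.168.0.1:10911");
        // A Producer may ask any single NameServer and still see the Broker.
        System.out.println(cluster.get(1).lookup("TopicTest"));
    }
}
```

The trade-off of this design is that each NameServer may briefly hold a slightly different view, in exchange for having no coordination protocol between the instances.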
NameServer is started in the NamesrvStartup class. The source code is as follows:
public static NamesrvController main0(String[] args) {
try {
// Create the NamesrvController
NamesrvController controller = createNamesrvController(args);
// Start the NamesrvController
start(controller);
String tip = "The Name Server boot success. serializeType=" + RemotingCommand.getSerializeTypeConfigInThisServer();
log.info(tip);
System.out.printf("%s%n", tip);
return controller;
} catch (Throwable e) {
e.printStackTrace();
System.exit(-1);
}
return null;
}
The NamesrvController instance is created as follows:
public static NamesrvController createNamesrvController(String[] args) throws IOException, JoranException {
System.setProperty(RemotingCommand.REMOTING_VERSION_KEY, Integer.toString(MQVersion.CURRENT_VERSION));
//PackageConflictDetect.detectFastjson();
// Parse the startup arguments
Options options = ServerUtil.buildCommandlineOptions(new Options());
commandLine = ServerUtil.parseCmdLine("mqnamesrv", args, buildCommandlineOptions(options), new PosixParser());
if (null == commandLine) {
System.exit(-1);
return null;
}
// NamesrvConfig holds the NameServer settings and NettyServerConfig the Netty settings; both are used to construct NamesrvController
final NamesrvConfig namesrvConfig = new NamesrvConfig();
final NettyServerConfig nettyServerConfig = new NettyServerConfig();
nettyServerConfig.setListenPort(9876);
if (commandLine.hasOption('c')) {
String file = commandLine.getOptionValue('c');
if (file != null) {
InputStream in = new BufferedInputStream(new FileInputStream(file));
properties = new Properties();
properties.load(in);
MixAll.properties2Object(properties, namesrvConfig);
MixAll.properties2Object(properties, nettyServerConfig);
namesrvConfig.setConfigStorePath(file);
System.out.printf("load config properties file OK, %s%n", file);
in.close();
}
}
if (commandLine.hasOption('p')) {
InternalLogger console = InternalLoggerFactory.getLogger(LoggerName.NAMESRV_CONSOLE_NAME);
MixAll.printObjectProperties(console, namesrvConfig);
MixAll.printObjectProperties(console, nettyServerConfig);
System.exit(0);
}
MixAll.properties2Object(ServerUtil.commandLine2Properties(commandLine), namesrvConfig);
if (null == namesrvConfig.getRocketmqHome()) {
System.out.printf("Please set the %s variable in your environment to match the location of the RocketMQ installation%n", MixAll.ROCKETMQ_HOME_ENV);
System.exit(-2);
}
LoggerContext lc = (LoggerContext) LoggerFactory.getILoggerFactory();
JoranConfigurator configurator = new JoranConfigurator();
configurator.setContext(lc);
lc.reset();
configurator.doConfigure(namesrvConfig.getRocketmqHome() + "/conf/logback_namesrv.xml");
log = InternalLoggerFactory.getLogger(LoggerName.NAMESRV_LOGGER_NAME);
MixAll.printObjectProperties(log, namesrvConfig);
MixAll.printObjectProperties(log, nettyServerConfig);
final NamesrvController controller = new NamesrvController(namesrvConfig, nettyServerConfig);
// remember all configs to prevent discard
controller.getConfiguration().registerConfig(properties);
return controller;
}
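The listing above layers its configuration: a built-in default (listen port 9876), then values from the properties file given with `-c`, then values from the command line. The sketch below illustrates only that override order with `java.util.Properties`. `ConfigLayeringSketch`, `resolve`, and `parse` are hypothetical helpers, and the sketch simplifies one detail: in the listing the command-line properties are applied to `namesrvConfig` only, not to `nettyServerConfig`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch of layered configuration; names are illustrative only.
class ConfigLayeringSketch {

    /** Later layers override earlier ones: defaults, then file, then command line. */
    static Properties resolve(Properties defaults, Properties fromFile, Properties fromCommandLine) {
        Properties merged = new Properties();
        merged.putAll(defaults);
        merged.putAll(fromFile);
        merged.putAll(fromCommandLine);
        return merged;
    }

    /** Parses key=value lines, standing in for loading a properties file. */
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return p;
    }

    public static void main(String[] args) {
        Properties defaults = parse("listenPort=9876");
        Properties file = parse("listenPort=9877\nrocketmqHome=/opt/rocketmq");
        Properties cmd = parse("rocketmqHome=/srv/rocketmq");
        Properties merged = resolve(defaults, file, cmd);
        System.out.println(merged.getProperty("listenPort") + " " + merged.getProperty("rocketmqHome"));
    }
}
```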
The NamesrvController is started as follows:
public static NamesrvController start(final NamesrvController controller) throws Exception {
if (null == controller) {
throw new IllegalArgumentException("NamesrvController is null");
}
// Initialize the NamesrvController instance
boolean initResult = controller.initialize();
if (!initResult) {
controller.shutdown();
System.exit(-3);
}
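// Register a JVM shutdown hook so the controller is shut down cleanly on exit; the server started below listens for network requests from Brokers and Producers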
Runtime.getRuntime().addShutdownHook(new ShutdownHookThread(log, new Callable<Void>() {
@Override
public Void call() throws Exception {
controller.shutdown();
return null;
}
}));
// Start the controller
controller.start();
return controller;
}
NamesrvController is initialized as follows:
public boolean initialize() {
this.kvConfigManager.load();
// Create the Netty remoting server
this.remotingServer = new NettyRemotingServer(this.nettyServerConfig, this.brokerHousekeepingService);
// Create the worker thread pool
this.remotingExecutor =
Executors.newFixedThreadPool(nettyServerConfig.getServerWorkerThreads(), new ThreadFactoryImpl("RemotingExecutorThread_"));
// Register the request processor
this.registerProcessor();
this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
@Override
public void run() {
NamesrvController.this.routeInfoManager.scanNotActiveBroker();
}
}, 5, 10, TimeUnit.SECONDS);
this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
@Override
public void run() {
NamesrvController.this.kvConfigManager.printAllPeriodically();
}
}, 1, 10, TimeUnit.MINUTES);
if (TlsSystemConfig.tlsMode != TlsMode.DISABLED) {
// Register a listener to reload SslContext
try {
fileWatchService = new FileWatchService(
new String[] {
TlsSystemConfig.tlsServerCertPath,
TlsSystemConfig.tlsServerKeyPath,
TlsSystemConfig.tlsServerTrustCertPath
},
new FileWatchService.Listener() {
boolean certChanged, keyChanged = false;
@Override
public void onChanged(String path) {
if (path.equals(TlsSystemConfig.tlsServerTrustCertPath)) {
log.info("The trust certificate changed, reload the ssl context");
reloadServerSslContext();
}
if (path.equals(TlsSystemConfig.tlsServerCertPath)) {
certChanged = true;
}
if (path.equals(TlsSystemConfig.tlsServerKeyPath)) {
keyChanged = true;
}
if (certChanged && keyChanged) {
log.info("The certificate and private key changed, reload the ssl context");
certChanged = keyChanged = false;
reloadServerSslContext();
}
}
private void reloadServerSslContext() {
((NettyRemotingServer) remotingServer).loadSslContext();
}
});
} catch (Exception e) {
log.warn("FileWatchService created error, can't load the certificate dynamically");
}
}
return true;
}
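In `initialize()`, the Broker scan is scheduled with `scheduleAtFixedRate` using an initial delay of 5 seconds and a period of 10 seconds. The sketch below shows the same scheduling call at millisecond scale so that it finishes quickly. `PeriodicScanSketch` and `runsRepeatedly` are hypothetical names, and the task is a simple latch countdown rather than a real scan.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a fixed-rate background task; not RocketMQ code.
class PeriodicScanSketch {

    /** Schedules a task at a fixed rate and reports whether it ran `times` times within the timeout. */
    static boolean runsRepeatedly(long initialDelayMs, long periodMs, int times, long timeoutMs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch latch = new CountDownLatch(times);
        try {
            // First run after initialDelayMs, then once every periodMs.
            scheduler.scheduleAtFixedRate(latch::countDown, initialDelayMs, periodMs, TimeUnit.MILLISECONDS);
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runsRepeatedly(5, 10, 3, 5_000));
    }
}
```

One point worth keeping in mind with `scheduleAtFixedRate`: if a run throws an exception, later runs are suppressed, so a periodic task of this kind usually catches its own exceptions.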
NameServer mainly provides Topic routing information to producers and consumers. Where is that routing information stored? The RouteInfoManager source contains the following fields:
private final HashMap<String/* topic */, List<QueueData>> topicQueueTable;
private final HashMap<String/* brokerName */, BrokerData> brokerAddrTable;
private final HashMap<String/* clusterName */, Set<String/* brokerName */>> clusterAddrTable;
private final HashMap<String/* brokerAddr */, BrokerLiveInfo> brokerLiveTable;
private final HashMap<String/* brokerAddr */, List<String>/* Filter Server */> filterServerTable;
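1. topicQueueTable
Routing information for Topic message queues. It stores the queue attributes of every Topic, and message sending load-balances across this routing table.
It is a HashMap whose key is the Topic name and whose value is a List<QueueData>. QueueData is defined as follows: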
public class QueueData implements Comparable<QueueData> {
// Broker name
private String brokerName;
// Number of read queues
private int readQueueNums;
// Number of write queues
private int writeQueueNums;
// Read/write permission
private int perm;
// Topic sync flag
private int topicSynFlag;
...
}
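2. brokerAddrTable
Stores Broker attributes: the Broker's basic information, including brokerName, the name of the cluster it belongs to, and the addresses of the master and slave Brokers.
The key is the brokerName and the value is a BrokerData. The fields of BrokerData are as follows: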
public class BrokerData implements Comparable<BrokerData> {
private String cluster;
private String brokerName;
private HashMap<Long/* brokerId */, String/* broker address */> brokerAddrs;
...
}
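Several machines may share the same Broker name. In other words, one Broker name may stand for a small group made up of one Master and multiple Slaves. BrokerData therefore stores the Broker name, the cluster it belongs to, and the address of each machine.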
3. clusterAddrTable
Broker cluster information. It stores the names of all Brokers in each cluster.
The key is the cluster name and the value is the set of Broker names.
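4. brokerLiveTable
Real-time state of each Broker machine. NameServer updates this state every time it receives a heartbeat.
The key is the Broker address, which corresponds to one machine. The value is a BrokerLiveInfo, which holds the state of that Broker server. lastUpdateTimestamp records when the state was last updated. NameServer checks this timestamp periodically, and if it has not been refreshed within a set period, the Broker's address is removed from the table.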
class BrokerLiveInfo {
// Time of the last update
private long lastUpdateTimestamp;
// Data version
private DataVersion dataVersion;
// Connection to the Broker
private Channel channel;
// HA server address
private String haServerAddr;
...
}
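5. filterServerTable
Stores filter server information: the list of FilterServers on each Broker, used for class-mode message filtering. The key is the Broker address and the value is the list of FilterServer addresses.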
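To make the relationship between the five tables concrete, here is a simplified sketch. It is an assumption-heavy model, not the RouteInfoManager implementation: `RouteTablesSketch`, `register`, and `addressesFor` are hypothetical, and the value types (QueueData, BrokerData, BrokerLiveInfo) are reduced to plain strings and timestamps. It shows how a Topic is resolved to concrete addresses by joining topicQueueTable with brokerAddrTable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-ins for the five tables; value types are reduced on purpose.
class RouteTablesSketch {
    final Map<String, List<String>> topicQueueTable = new HashMap<>();      // topic -> broker names (stand-in for QueueData)
    final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>(); // brokerName -> (brokerId -> address)
    final Map<String, Set<String>> clusterAddrTable = new HashMap<>();      // clusterName -> broker names
    final Map<String, Long> brokerLiveTable = new HashMap<>();              // brokerAddr -> lastUpdateTimestamp
    final Map<String, List<String>> filterServerTable = new HashMap<>();    // brokerAddr -> filter server addresses

    /** Records one Broker machine in the cluster, address, topic, and liveness tables. */
    void register(String cluster, String brokerName, long brokerId, String addr, String topic, long now) {
        clusterAddrTable.computeIfAbsent(cluster, k -> new HashSet<>()).add(brokerName);
        brokerAddrTable.computeIfAbsent(brokerName, k -> new HashMap<>()).put(brokerId, addr);
        List<String> brokers = topicQueueTable.computeIfAbsent(topic, k -> new ArrayList<>());
        if (!brokers.contains(brokerName)) {
            brokers.add(brokerName);
        }
        brokerLiveTable.put(addr, now);
    }

    /** Resolves a topic to concrete addresses by joining topicQueueTable with brokerAddrTable. */
    List<String> addressesFor(String topic) {
        List<String> result = new ArrayList<>();
        for (String brokerName : topicQueueTable.getOrDefault(topic, List.of())) {
            result.addAll(brokerAddrTable.getOrDefault(brokerName, Map.of()).values());
        }
        return result;
    }

    public static void main(String[] args) {
        RouteTablesSketch tables = new RouteTablesSketch();
        long now = System.currentTimeMillis();
        tables.register("DefaultCluster", "broker-a", 0L, "10.0.0.1:10911", "TopicTest", now);
        tables.register("DefaultCluster", "broker-a", 1L, "10.0.0.2:10911", "TopicTest", now);
        System.out.println(tables.addressesFor("TopicTest"));
    }
}
```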
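Section 2 covered what the routing information contains. How, then, is route registration implemented in RocketMQ? Registration is carried out through the heartbeat connection between Broker and NameServer. Once a Broker has started, it sends a heartbeat to every NameServer in the cluster. During normal operation it then sends a heartbeat to all NameServers every 30 seconds. On receiving a heartbeat, NameServer updates the lastUpdateTimestamp of the corresponding BrokerLiveInfo in the brokerLiveTable cache. In parallel, NameServer scans brokerLiveTable every 10s, and if no heartbeat has been received from a Broker for 120s, it removes that Broker's routing information. The flow is as follows: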
In BrokerController, calling the start() method begins sending heartbeats from the Broker side. The logic is as follows:
this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
@Override
public void run() {
try {
// Register this Broker with all NameServers
BrokerController.this.registerBrokerAll(true, false, brokerConfig.isForceRegister());
} catch (Throwable e) {
log.error("registerBrokerAll Exception", e);
}
}
}, 1000 * 10, Math.max(10000, Math.min(brokerConfig.getRegisterNameServerPeriod(), 60000)), TimeUnit.MILLISECONDS);
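The period argument in the listing above is not used as configured: the expression `Math.max(10000, Math.min(period, 60000))` clamps it into the range of 10 to 60 seconds. A minimal sketch of that arithmetic follows; `HeartbeatPeriodSketch` and `effectivePeriodMillis` are hypothetical names used only for illustration.

```java
// Hypothetical helper that isolates the clamping expression from the listing.
class HeartbeatPeriodSketch {

    /** Clamps the configured heartbeat period into [10s, 60s], in milliseconds. */
    static long effectivePeriodMillis(long configuredMillis) {
        return Math.max(10_000, Math.min(configuredMillis, 60_000));
    }

    public static void main(String[] args) {
        for (long configured : new long[] {1_000, 30_000, 120_000}) {
            System.out.println(configured + " -> " + effectivePeriodMillis(configured));
        }
    }
}
```

A configured value of 1 second is therefore raised to 10 seconds, 30 seconds is kept, and 2 minutes is lowered to 60 seconds, which stays well inside the 120s expiry window described in the text.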
The registerBrokerAll method calls doRegisterBrokerAll, shown below:
private void doRegisterBrokerAll(boolean checkOrderConfig, boolean oneway,
TopicConfigSerializeWrapper topicConfigWrapper) {
List<RegisterBrokerResult> registerBrokerResultList = this.brokerOuterAPI.registerBrokerAll(
this.brokerConfig.getBrokerClusterName(),
this.getBrokerAddr(),
this.brokerConfig.getBrokerName(),
this.brokerConfig.getBrokerId(),
this.getHAServerAddr(),
topicConfigWrapper,
this.filterServerManager.buildNewFilterServerList(),
oneway,
this.brokerConfig.getRegisterBrokerTimeoutMills(),
this.brokerConfig.isCompressedRegister());
if (registerBrokerResultList.size() > 0) {
RegisterBrokerResult registerBrokerResult = registerBrokerResultList.get(0);
if (registerBrokerResult != null) {
if (this.updateMasterHAServerAddrPeriodically && registerBrokerResult.getHaServerAddr() != null) {
this.messageStore.updateHaMasterAddress(registerBrokerResult.getHaServerAddr());
}
this.slaveSynchronize.setMasterAddr(registerBrokerResult.getMasterAddr());
if (checkOrderConfig) {
this.getTopicConfigManager().updateOrderTopicConfig(registerBrokerResult.getKvTable());
}
}
}
}
BrokerOuterAPI.registerBrokerAll then registers the Broker with every NameServer and collects the registration results:
public List<RegisterBrokerResult> registerBrokerAll(
// Cluster name
final String clusterName,
// Broker address
final String brokerAddr,
// Broker name
final String brokerName,
// Broker ID
final long brokerId,
// HA server address
final String haServerAddr,
final TopicConfigSerializeWrapper topicConfigWrapper,
final List<String> filterServerList,
final boolean oneway,
final int timeoutMills,
final boolean compressed) {
final List<RegisterBrokerResult> registerBrokerResultList = Lists.newArrayList();
List<String> nameServerAddressList = this.remotingClient.getNameServerAddressList();
if (nameServerAddressList != null && nameServerAddressList.size() > 0) {
final RegisterBrokerRequestHeader requestHeader = new RegisterBrokerRequestHeader();
requestHeader.setBrokerAddr(brokerAddr);
requestHeader.setBrokerId(brokerId);
requestHeader.setBrokerName(brokerName);
requestHeader.setClusterName(clusterName);
requestHeader.setHaServerAddr(haServerAddr);
requestHeader.setCompressed(compressed);
RegisterBrokerBody requestBody = new RegisterBrokerBody();
requestBody.setTopicConfigSerializeWrapper(topicConfigWrapper);
requestBody.setFilterServerList(filterServerList);
final byte[] body = requestBody.encode(compressed);
final int bodyCrc32 = UtilAll.crc32(body);
requestHeader.setBodyCrc32(bodyCrc32);
final CountDownLatch countDownLatch = new CountDownLatch(nameServerAddressList.size());
// Iterate over all NameServer addresses
for (final String namesrvAddr : nameServerAddressList) {
brokerOuterExecutor.execute(new Runnable() {
@Override
public void run() {
try {
// Register with each NameServer individually
RegisterBrokerResult result = registerBroker(namesrvAddr,oneway, timeoutMills,requestHeader,body);
if (result != null) {
registerBrokerResultList.add(result);
}
log.info("register broker to name server {} OK", namesrvAddr);
} catch (Exception e) {
log.warn("registerBroker Exception, {}", namesrvAddr, e);
} finally {
countDownLatch.countDown();
}
}
});
}
try {
countDownLatch.await(timeoutMills, TimeUnit.MILLISECONDS);
} catch (InterruptedException e) {
}
}
return registerBrokerResultList;
}
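The listing above uses a common fan-out pattern: one task per NameServer is submitted to an executor, and a CountDownLatch bounds how long the caller waits for all of them. The sketch below reproduces the pattern in isolation. `FanOutRegisterSketch`, `registerAll`, and the `registerOne` function are hypothetical stand-ins for the real network call. As a deliberate difference from the listing, this sketch collects results in a `CopyOnWriteArrayList`, because several worker threads add to the list concurrently, and it restores the interrupt flag instead of swallowing the exception.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical sketch of parallel registration with a bounded wait; not RocketMQ code.
class FanOutRegisterSketch {

    /**
     * Calls registerOne for every address in parallel and waits up to timeoutMillis.
     * registerOne is a hypothetical stand-in for the per-NameServer network call.
     */
    static List<String> registerAll(List<String> nameServerAddrs,
                                    Function<String, String> registerOne,
                                    long timeoutMillis) {
        List<String> results = new CopyOnWriteArrayList<>(); // safe for concurrent add
        if (nameServerAddrs.isEmpty()) {
            return results;
        }
        ExecutorService executor = Executors.newFixedThreadPool(Math.min(4, nameServerAddrs.size()));
        CountDownLatch latch = new CountDownLatch(nameServerAddrs.size());
        for (String addr : nameServerAddrs) {
            executor.execute(() -> {
                try {
                    String result = registerOne.apply(addr);
                    if (result != null) {
                        results.add(result);
                    }
                } catch (RuntimeException e) {
                    System.err.println("register failed for " + addr + ": " + e.getMessage());
                } finally {
                    latch.countDown(); // count down on success and on failure
                }
            });
        }
        try {
            latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt instead of swallowing it
        } finally {
            executor.shutdown();
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> acks = registerAll(List.of("ns1:9876", "ns2:9876", "ns3:9876"), addr -> {
            if (addr.startsWith("ns2")) {
                throw new IllegalStateException("unreachable");
            }
            return "ack from " + addr;
        }, 3_000);
        System.out.println(acks.size());
    }
}
```

A failure against one NameServer does not prevent registration with the others, which matches the intent of the try/catch/finally inside the loop in the listing.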
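NameServer scans the brokerLiveTable state table every 10s. If the lastUpdateTimestamp in a BrokerLiveInfo is more than 120s behind the current time, the Broker is considered inactive: it must be removed, and its connection must be closed as well. Let us look at the code.
The periodic check is scheduled when the NamesrvController instance is initialized.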
this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
@Override
public void run() {
NamesrvController.this.routeInfoManager.scanNotActiveBroker();
}
}, 5, 10, TimeUnit.SECONDS);
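The scanNotActiveBroker() method in RouteInfoManager iterates over brokerLiveTable. For each entry it reads the lastUpdateTimestamp field of BrokerLiveInfo, which is the time the last heartbeat was received, and checks whether the gap between that time and the current time exceeds 120s.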
public void scanNotActiveBroker() {
// Iterate over the Brokers currently in brokerLiveTable
Iterator<Entry<String, BrokerLiveInfo>> it = this.brokerLiveTable.entrySet().iterator();
while (it.hasNext()) {
Entry<String, BrokerLiveInfo> next = it.next();
// Time the last heartbeat was received from this Broker
long last = next.getValue().getLastUpdateTimestamp();
if ((last + BROKER_CHANNEL_EXPIRED_TIME) < System.currentTimeMillis()) {
RemotingUtil.closeChannel(next.getValue().getChannel());
it.remove();
log.warn("The broker channel expired, {} {}ms", next.getKey(), BROKER_CHANNEL_EXPIRED_TIME);
this.onChannelDestroy(next.getKey(), next.getValue().getChannel());
}
}
}
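The expiry check above can be exercised in isolation by passing the current time in as a parameter. The sketch below is a simplified assumption of the same logic: `ExpiryScanSketch` and `scan` are hypothetical, the table maps an address directly to its last heartbeat time, and no channel is closed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the liveness scan with an injectable clock; not RocketMQ code.
class ExpiryScanSketch {
    // 120s, the threshold described in the text.
    static final long BROKER_CHANNEL_EXPIRED_TIME = 1000 * 60 * 2;

    /** Removes entries whose last heartbeat is older than the threshold; returns the removed addresses. */
    static List<String> scan(Map<String, Long> lastHeartbeatByAddr, long nowMillis) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastHeartbeatByAddr.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() + BROKER_CHANNEL_EXPIRED_TIME < nowMillis) {
                expired.add(entry.getKey());
                it.remove(); // removing through the iterator avoids ConcurrentModificationException
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        Map<String, Long> live = new HashMap<>();
        live.put("10.0.0.1:10911", 0L);       // last heartbeat at t = 0
        live.put("10.0.0.2:10911", 100_000L); // last heartbeat at t = 100s
        System.out.println(scan(live, 130_000L)); // scan at t = 130s
    }
}
```

Since the scan runs every 10s and the threshold is 120s, a dead Broker can remain in the table for a little over two minutes in the worst case before this path removes it.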
The onChannelDestroy method then removes the related entries from the routing tables:
public void onChannelDestroy(String remoteAddr, Channel channel) {
String brokerAddrFound = null;
if (channel != null) {
try {
try {
this.lock.readLock().lockInterruptibly();
Iterator<Entry<String, BrokerLiveInfo>> itBrokerLiveTable =
this.brokerLiveTable.entrySet().iterator();
while (itBrokerLiveTable.hasNext()) {
Entry<String, BrokerLiveInfo> entry = itBrokerLiveTable.next();
if (entry.getValue().getChannel() == channel) {
brokerAddrFound = entry.getKey();
break;
}
}
} finally {
this.lock.readLock().unlock();
}
} catch (Exception e) {
log.error("onChannelDestroy Exception", e);
}
}
if (null == brokerAddrFound) {
brokerAddrFound = remoteAddr;
} else {
log.info("the broker's channel destroyed, {}, clean it's data structure at once", brokerAddrFound);
}
if (brokerAddrFound != null && brokerAddrFound.length() > 0) {
try {
try {
// Acquire the write lock
this.lock.writeLock().lockInterruptibly();
// Remove the liveness and filter server entries for this address
this.brokerLiveTable.remove(brokerAddrFound);
this.filterServerTable.remove(brokerAddrFound);
String brokerNameFound = null;
boolean removeBrokerName = false;
Iterator<Entry<String, BrokerData>> itBrokerAddrTable =
this.brokerAddrTable.entrySet().iterator();
while (itBrokerAddrTable.hasNext() && (null == brokerNameFound)) {
BrokerData brokerData = itBrokerAddrTable.next().getValue();
Iterator<Entry<Long, String>> it = brokerData.getBrokerAddrs().entrySet().iterator();
while (it.hasNext()) {
Entry<Long, String> entry = it.next();
Long brokerId = entry.getKey();
String brokerAddr = entry.getValue();
if (brokerAddr.equals(brokerAddrFound)) {
brokerNameFound = brokerData.getBrokerName();
it.remove();
log.info("remove brokerAddr[{}, {}] from brokerAddrTable, because channel destroyed",
brokerId, brokerAddr);
break;
}
}
if (brokerData.getBrokerAddrs().isEmpty()) {
removeBrokerName = true;
itBrokerAddrTable.remove();
log.info("remove brokerName[{}] from brokerAddrTable, because channel destroyed",
brokerData.getBrokerName());
}
}
// Maintain clusterAddrTable: remove the brokerName from the cluster it belongs to
if (brokerNameFound != null && removeBrokerName) {
Iterator<Entry<String, Set<String>>> it = this.clusterAddrTable.entrySet().iterator();
while (it.hasNext()) {
Entry<String, Set<String>> entry = it.next();
String clusterName = entry.getKey();
Set<String> brokerNames = entry.getValue();
// Remove the brokerName
boolean removed = brokerNames.remove(brokerNameFound);
if (removed) {
log.info("remove brokerName[{}], clusterName[{}] from clusterAddrTable, because channel destroyed",
brokerNameFound, clusterName);
if (brokerNames.isEmpty()) {
log.info("remove the clusterName[{}] from clusterAddrTable, because channel destroyed and no broker in this cluster",
clusterName);
it.remove();
}
break;
}
}
}
// Maintain topicQueueTable: remove the queue data that belongs to the removed Broker
if (removeBrokerName) {
Iterator<Entry<String, List<QueueData>>> itTopicQueueTable =
this.topicQueueTable.entrySet().iterator();
while (itTopicQueueTable.hasNext()) {
Entry<String, List<QueueData>> entry = itTopicQueueTable.next();
String topic = entry.getKey();
List<QueueData> queueDataList = entry.getValue();
Iterator<QueueData> itQueueData = queueDataList.iterator();
while (itQueueData.hasNext()) {
QueueData queueData = itQueueData.next();
if (queueData.getBrokerName().equals(brokerNameFound)) {
// Remove this Broker's queue data from the Topic
itQueueData.remove();
log.info("remove topic[{} {}], from topicQueueTable, because channel destroyed",
topic, queueData);
}
}
if (queueDataList.isEmpty()) {
itTopicQueueTable.remove();
log.info("remove topic[{}] all queue, from topicQueueTable, because channel destroyed",
topic);
}
}
}
} finally {
// Release the write lock
this.lock.writeLock().unlock();
}
} catch (Exception e) {
log.error("onChannelDestroy Exception", e);
}
}
}
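The key idea in onChannelDestroy is a cascade: removing one address only affects the other tables when it was the last address registered under its Broker name. The sketch below shows that cascade under a write lock in a reduced form. `CascadeRemovalSketch` and `removeBrokerAddr` are hypothetical, QueueData is reduced to a Broker name, and the liveness and filter server tables are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, reduced model of the cascade removal; not the RouteInfoManager implementation.
class CascadeRemovalSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>(); // brokerName -> (brokerId -> address)
    final Map<String, Set<String>> clusterAddrTable = new HashMap<>();      // clusterName -> broker names
    final Map<String, List<String>> topicQueueTable = new HashMap<>();      // topic -> broker names

    void removeBrokerAddr(String addr) {
        lock.writeLock().lock();
        try {
            String emptiedBrokerName = null;
            Iterator<Map.Entry<String, Map<Long, String>>> it = brokerAddrTable.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<String, Map<Long, String>> e = it.next();
                if (e.getValue().values().remove(addr)) {
                    if (e.getValue().isEmpty()) {
                        emptiedBrokerName = e.getKey();
                        it.remove();
                    }
                    break;
                }
            }
            if (emptiedBrokerName == null) {
                return; // other machines with this Broker name are still registered
            }
            final String name = emptiedBrokerName;
            // Step 2: drop the name from its cluster, and drop clusters that become empty.
            clusterAddrTable.values().forEach(names -> names.remove(name));
            clusterAddrTable.values().removeIf(Set::isEmpty);
            // Step 3: drop the name from every topic, and drop topics left without any Broker.
            topicQueueTable.values().forEach(brokers -> brokers.remove(name));
            topicQueueTable.values().removeIf(List::isEmpty);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        CascadeRemovalSketch tables = new CascadeRemovalSketch();
        tables.brokerAddrTable.put("broker-a", new HashMap<>(Map.of(0L, "10.0.0.1:10911")));
        tables.clusterAddrTable.put("DefaultCluster", new HashSet<>(Set.of("broker-a")));
        tables.topicQueueTable.put("TopicTest", new ArrayList<>(List.of("broker-a")));
        tables.removeBrokerAddr("10.0.0.1:10911");
        System.out.println(tables.brokerAddrTable.isEmpty() + " " + tables.topicQueueTable.isEmpty());
    }
}
```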
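This article has covered the route management performed by NameServer. Route management is an essential prerequisite for sending and consuming messages in RocketMQ, and understanding this part helps us gain a deeper understanding of RocketMQ.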