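I. Introduction to NameServer
NameServer is a lightweight naming service built specifically for RocketMQ. It is simple and stateless, it scales horizontally as a cluster, and its nodes do not communicate with one another. The figure below shows how a RocketMQ cluster works:
As the figure shows, the RocketMQ architecture has four main parts: Broker, Producer, Consumer, and NameServer. The other three all communicate with NameServer: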
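- NameServer: a simple Topic route registry. Its role is similar to that of ZooKeeper in Dubbo, and it supports dynamic Broker registration and discovery. It has two main functions:
  - Broker management: NameServer accepts registration requests from Brokers and uses the request data as the basis of its routing information. It also tracks Broker heartbeats to check whether each Broker is still alive (a Broker with no heartbeat for 120 s is treated as dead).
  - Topic route management: every NameServer holds the routing information for the whole Broker cluster. Producers and Consumers query it in order to send and consume messages.
- Producer: the message publishing role; it can be deployed as a cluster. It gets a Topic's routing information from the NameServer cluster, including which Queues the Topic has and which Brokers host them. (A Producer only sends messages to Master nodes, so it only needs connections to Masters.)
- Consumer: the message consuming role; it can be deployed as a cluster. It gets a Topic's routing information from the NameServer cluster and connects to the corresponding Brokers to pull and consume messages. (Messages can be pulled from both Master and Slave, so a Consumer connects to both.)
- Broker: mainly responsible for storing, delivering, and querying messages, and for keeping the service highly available.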
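II. Why use NameServer?
Many components can serve as a service discovery layer today, such as etcd, Consul, ZooKeeper, and Nacos.
So why did RocketMQ build its own NameServer instead of using one of these open-source components? The reasons are:
- RocketMQ's architecture only needs a lightweight metadata server that is eventually consistent. It does not need a strongly consistent solution such as ZooKeeper, and not depending on another piece of middleware lowers the overall maintenance cost.
- NameServer nodes are independent and do not talk to each other. Because each Broker registers its routing information with every NameServer, every NameServer holds a complete copy of the routing information. If one NameServer goes down, Brokers can still register with the others and the other NameServers are unaffected, so Producers and Consumers can still discover Broker routes dynamically.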
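The independence argument above can be illustrated with a minimal in-memory sketch. Everything here (`NameServerNode`, `registerWithAll`, `lookupFromAny`) is a hypothetical stand-in invented for illustration, not RocketMQ's API; it only shows that a Broker registering with every node lets a client get an answer from any surviving node.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IndependentNameServers {
    /** Hypothetical in-memory stand-in for one NameServer node. */
    static class NameServerNode {
        final Map<String, String> brokerAddrByName = new HashMap<>();
        boolean up = true;

        void register(String brokerName, String brokerAddr) {
            if (!up) throw new IllegalStateException("NameServer is down");
            brokerAddrByName.put(brokerName, brokerAddr);
        }

        String lookup(String brokerName) {
            if (!up) throw new IllegalStateException("NameServer is down");
            return brokerAddrByName.get(brokerName);
        }
    }

    /** Broker side: register with every node; a dead node does not block the others. */
    static int registerWithAll(List<NameServerNode> nodes, String brokerName, String brokerAddr) {
        int succeeded = 0;
        for (NameServerNode node : nodes) {
            try {
                node.register(brokerName, brokerAddr);
                succeeded++;
            } catch (IllegalStateException e) {
                // This node is unreachable; keep going with the remaining nodes.
            }
        }
        return succeeded;
    }

    /** Client side: ask the nodes in turn until one of them answers. */
    static String lookupFromAny(List<NameServerNode> nodes, String brokerName) {
        for (NameServerNode node : nodes) {
            try {
                String addr = node.lookup(brokerName);
                if (addr != null) return addr;
            } catch (IllegalStateException e) {
                // Try the next node.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<NameServerNode> nodes = new ArrayList<>();
        nodes.add(new NameServerNode());
        nodes.add(new NameServerNode());
        nodes.get(0).up = false; // one NameServer has crashed
        System.out.println(registerWithAll(nodes, "broker-a", "172.16.62.75:10911"));
        System.out.println(lookupFromAny(nodes, "broker-a"));
    }
}
```

Because no node depends on another, losing one costs only that node's copy of the routes.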
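III. Inside NameServer
NameServer's routing data comes from Broker registrations, which it processes internally; the users of that data are Producers and Consumers. The following sections walk through the source code for the core logic: the routing data structures, route registration and lookup, and Broker liveness detection.
3.1 Routing data structures
RouteInfoManager is NameServer's core class. It maintains the routing information and provides the core operations such as route registration and lookup. Because all routing information lives in the memory of the NameServer process, the class essentially maintains a set of HashMaps, guarded by a ReentrantReadWriteLock so that concurrent access is safe. In simplified form: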
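public class RouteInfoManager {
    private static final InternalLogger log = InternalLoggerFactory.getLogger(LoggerName.NAMESRV_LOGGER_NAME);
    // Idle limit for a Broker connection, 2 minutes by default: if NameServer receives no
    // heartbeat from a Broker within 2 minutes, it closes that connection.
    private final static long BROKER_CHANNEL_EXPIRED_TIME = 1000 * 60 * 2;
    // Read-write lock
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    // Topic -> queue information; producers load-balance over this route table when sending.
    private final HashMap<String/* topic */, List<QueueData>> topicQueueTable;
    // Brokers keyed by brokerName (basic Broker info: brokerName, cluster name, Master/Slave addresses).
    private final HashMap<String/* brokerName */, BrokerData> brokerAddrTable;
    // Cluster -> Brokers in that cluster (look up a set of brokerNames by cluster name).
    private final HashMap<String/* clusterName */, Set<String/* brokerName */>> clusterAddrTable;
    // Live Brokers by address (NameServer replaces the entry each time a heartbeat arrives).
    private final HashMap<String/* brokerAddr */, BrokerLiveInfo> brokerLiveTable;
    // Broker -> Filter Server list, used when consumers pull messages.
    private final HashMap<String/* brokerAddr */, List<String>/* Filter Server */> filterServerTable;
    // ... omitted ...
}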
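As a minimal sketch of the "HashMap plus read-write lock" pattern described above: writers take the write lock and readers take the read lock, so many lookups can run in parallel while a registration is exclusive. The class `LockedRouteTable` and its methods are hypothetical, and the map value is simplified to a list of broker names rather than QueueData.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LockedRouteTable {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    // Simplified route table: topic -> names of the brokers that host it.
    private final Map<String, List<String>> topicToBrokers = new HashMap<>();

    /** Write path: exclusive, as in a broker registration. */
    void addRoute(String topic, String brokerName) {
        lock.writeLock().lock();
        try {
            List<String> brokers = topicToBrokers.computeIfAbsent(topic, k -> new ArrayList<>());
            if (!brokers.contains(brokerName)) {
                brokers.add(brokerName);
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Read path: shared, as in a client route lookup. Returns a copy so callers never see later writes. */
    List<String> brokersOf(String topic) {
        lock.readLock().lock();
        try {
            List<String> brokers = topicToBrokers.get(topic);
            if (brokers == null) {
                return new ArrayList<>();
            }
            return new ArrayList<>(brokers);
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        LockedRouteTable table = new LockedRouteTable();
        table.addRoute("TopicTest", "broker-a");
        System.out.println(table.brokersOf("TopicTest"));
    }
}
```

Route lookups are far more frequent than registrations, which is the usual reason to prefer a read-write lock over a plain mutex here.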
The class diagram below shows the relationships more clearly:
QueueData fields:
/**
 * Queue information
 */
public class QueueData implements Comparable<QueueData> {
    // Name of the Broker that owns the queues
    private String brokerName;
    // Number of read queues, default: 16
    private int readQueueNums;
    // Number of write queues, default: 16
    private int writeQueueNums;
    // Read/write permission of the Topic (2 = write, 4 = read, 6 = read and write)
    private int perm;
    /** Topic system flag; corresponds to TopicConfig.topicSysFlag
     * {@link org.apache.rocketmq.common.sysflag.TopicSysFlag}
     */
    private int topicSynFlag;
    // ... omitted ...
}
topicQueueTable map, sample data (JSON):
{
"TopicTest":[
{
"brokerName":"broker-a",
"perm":6,
"readQueueNums":4,
"topicSynFlag":0,
"writeQueueNums":4
}
]
}
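The perm value in the sample above (6) is a bit mask. A minimal sketch of how such a mask can be checked, assuming only the bit values given in the QueueData comment (2 = write, 4 = read); the class `PermBits` and its method names are hypothetical:

```java
class PermBits {
    static final int PERM_READ = 1 << 2;  // 4
    static final int PERM_WRITE = 1 << 1; // 2

    /** True when the read bit is set in perm. */
    static boolean isReadable(int perm) {
        return (perm & PERM_READ) == PERM_READ;
    }

    /** True when the write bit is set in perm. */
    static boolean isWritable(int perm) {
        return (perm & PERM_WRITE) == PERM_WRITE;
    }

    public static void main(String[] args) {
        int perm = 6; // value from the topicQueueTable sample
        System.out.println("readable=" + isReadable(perm) + ", writable=" + isWritable(perm));
    }
}
```

With perm = 6 both bits are set, so the topic's queues on broker-a accept both sends and pulls; a value of 4 would make them read-only.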
BrokerData fields:
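/**
 * Broker data. A Master and its Slaves share the same brokerName and differ in brokerId:
 * brokerId 0 means Master, any non-zero value means Slave.
 */
public class BrokerData implements Comparable<BrokerData> {
    // Cluster the Broker belongs to
    private String cluster;
    // brokerName
    private String brokerName;
    // One brokerName can have one Master and several Slaves, so brokerAddrs is a map.
    // brokerId = 0 means Master; greater than 0 means Slave.
    private HashMap<Long/* brokerId */, String/* broker address */> brokerAddrs;
    // Used when picking a Broker address
    private final Random random = new Random();
    // ... omitted ...
}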
brokerAddrTable map, sample data (JSON):
{
"broker-a":{
"brokerAddrs":{
"0":"172.16.62.75:10911"
},
"brokerName":"broker-a",
"cluster":"DefaultCluster"
}
}
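To make the brokerId convention concrete, here is a small sketch of one way to pick an address from a brokerAddrs map: prefer the Master (id 0) and otherwise fall back to a random Slave. This is an illustration under that assumption, with a hypothetical class `BrokerAddrPicker`; it is not presented as BrokerData's actual method.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class BrokerAddrPicker {
    static final long MASTER_ID = 0L;
    private static final Random RANDOM = new Random();

    /** Returns the Master address if present, else a random Slave address, else null. */
    static String pick(Map<Long, String> brokerAddrs) {
        String master = brokerAddrs.get(MASTER_ID);
        if (master != null) {
            return master;
        }
        if (brokerAddrs.isEmpty()) {
            return null;
        }
        List<String> slaves = new ArrayList<>(brokerAddrs.values());
        return slaves.get(RANDOM.nextInt(slaves.size()));
    }

    public static void main(String[] args) {
        Map<Long, String> addrs = new HashMap<>();
        addrs.put(0L, "172.16.62.75:10911"); // Master, as in the sample above
        System.out.println(pick(addrs));
    }
}
```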
BrokerLiveInfo fields:
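/**
 * Information about live Brokers. It is not real-time: NameServer scans all Brokers every 10 s
 * and derives each Broker's state from the time of its last heartbeat.
 * This is also why a producer cannot notice immediately when a Broker process hangs; it may keep
 * sending to that Broker and fail (not highly available).
 */
class BrokerLiveInfo {
    // Time of the last update
    private long lastUpdateTimestamp;
    // Version information
    private DataVersion dataVersion;
    // Netty Channel
    private Channel channel;
    // HA address: the address a Slave connects to when pulling data from the Master,
    // made up of brokerIP2 plus the HA port
    private String haServerAddr;
    // ... omitted ...
}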
brokerLiveTable map, sample data (JSON):
{
"172.16.62.75:10911":{
"channel":{
"active":true,
"inputShutdown":false,
"open":true,
"outputShutdown":false,
"registered":true,
"writable":true
},
"dataVersion":{
"counter":2,
"timestamp":1630907813571
},
"haServerAddr":"172.16.62.75:10912",
"lastUpdateTimestamp":1630907814074
}
}
clusterAddrTable map, sample data (JSON):
{"DefaultCluster":["broker-a"]}
From the HashMaps maintained by RouteInfoManager and the fields of QueueData, BrokerData, and BrokerLiveInfo, it is clear that the information NameServer keeps is simple yet critical.
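3.2 Route registration
A Broker registers its routing information in the following cases:
- On startup, it registers with every NameServer in the cluster.
- Every 30 s, it sends a heartbeat registration to every NameServer in the cluster.
- When topic information on the Broker changes (a topic is created, modified, or deleted), it sends a heartbeat registration.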
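The first two cases above (register at startup, then repeat on a fixed period) map naturally onto a scheduled task. The following is a minimal sketch using the JDK's ScheduledExecutorService; the class `HeartbeatSketch`, its method names, and the `registerAll` callback are hypothetical, and the demo uses a 50 ms period in place of the 30 s one so that it finishes quickly.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class HeartbeatSketch {
    /** Runs registerAll immediately (startup registration) and then once per period. */
    static ScheduledExecutorService start(Runnable registerAll, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(registerAll, 0, periodMillis, TimeUnit.MILLISECONDS);
        return scheduler;
    }

    /** Demo helper: true if at least `beats` registrations fire within 5 seconds. */
    static boolean sendsAtLeast(int beats, long periodMillis) {
        CountDownLatch latch = new CountDownLatch(beats);
        ScheduledExecutorService scheduler = start(latch::countDown, periodMillis);
        try {
            return latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow(); // stop the heartbeat thread
        }
    }

    public static void main(String[] args) {
        // 50 ms stands in for the 30 s heartbeat period.
        System.out.println("three heartbeats sent: " + sendsAtLeast(3, 50));
    }
}
```

The third case (a topic change) would simply call the same `registerAll` callback directly, outside the schedule.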
On the NameServer side, all of these cases are handled by one core method, RouteInfoManager#registerBroker, analyzed below:
RouteInfoManager#registerBroker
public RegisterBrokerResult registerBroker(
    final String clusterName, final String brokerAddr,
    final String brokerName, final long brokerId,
    final String haServerAddr,
    // TopicConfigSerializeWrapper is a fairly complex structure; it mainly carries all topic information on the Broker
    final TopicConfigSerializeWrapper topicConfigWrapper,
    final List<String> filterServerList, final Channel channel) {
    RegisterBrokerResult result = new RegisterBrokerResult();
    try {
        try {
            this.lock.writeLock().lockInterruptibly(); // take the write lock
            // 1: maintain clusterAddrTable
            Set<String> brokerNames = this.clusterAddrTable.get(clusterName);
            if (null == brokerNames) {
                brokerNames = new HashSet<String>();
                this.clusterAddrTable.put(clusterName, brokerNames);
            }
            brokerNames.add(brokerName);
            // 2: maintain brokerAddrTable
            boolean registerFirst = false; // whether this Broker is registering for the first time
            BrokerData brokerData = this.brokerAddrTable.get(brokerName);
            if (null == brokerData) {
                registerFirst = true;
                brokerData = new BrokerData(clusterName, brokerName, new HashMap<Long, String>());
                this.brokerAddrTable.put(brokerName, brokerData);
            }
            String oldAddr = brokerData.getBrokerAddrs().put(brokerId, brokerAddr);
            registerFirst = registerFirst || (null == oldAddr);
            // 3: maintain topicQueueTable; the update itself happens in createAndUpdateQueueData
            if (null != topicConfigWrapper
                && MixAll.MASTER_ID == brokerId) { // only Master requests are processed, because a Slave's topic information is synced from its Master
                // the topic configuration has changed, or this is the Broker's first registration
                if (this.isBrokerTopicConfigChanged(brokerAddr, topicConfigWrapper.getDataVersion())
                    || registerFirst) {
                    ConcurrentMap<String, TopicConfig> tcTable = topicConfigWrapper.getTopicConfigTable();
                    if (tcTable != null) {
                        for (Map.Entry<String, TopicConfig> entry : tcTable.entrySet()) {
                            this.createAndUpdateQueueData(brokerName, entry.getValue());
                        }
                    }
                }
            }
            // 4: maintain brokerLiveTable. Key point: the first constructor argument of BrokerLiveInfo,
            // System.currentTimeMillis(), is what the liveness check later relies on
            BrokerLiveInfo prevBrokerLiveInfo = this.brokerLiveTable.put(brokerAddr,
                new BrokerLiveInfo(
                    System.currentTimeMillis(),
                    topicConfigWrapper.getDataVersion(),
                    channel,
                    haServerAddr));
            if (null == prevBrokerLiveInfo) {
                log.info("new broker registered, {} HAServer: {}", brokerAddr, haServerAddr);
            }
            // 5: maintain filterServerTable
            if (filterServerList != null) {
                if (filterServerList.isEmpty()) {
                    this.filterServerTable.remove(brokerAddr);
                } else {
                    this.filterServerTable.put(brokerAddr, filterServerList);
                }
            }
            // Return value: if the current Broker is a Slave, put haServerAddr and masterAddr into the result
            if (MixAll.MASTER_ID != brokerId) {
                String masterAddr = brokerData.getBrokerAddrs().get(MixAll.MASTER_ID);
                if (masterAddr != null) {
                    BrokerLiveInfo brokerLiveInfo = this.brokerLiveTable.get(masterAddr);
                    if (brokerLiveInfo != null) {
                        result.setHaServerAddr(brokerLiveInfo.getHaServerAddr());
                        result.setMasterAddr(masterAddr);
                    }
                }
            }
        } finally {
            this.lock.writeLock().unlock();
        }
    } catch (Exception e) {
        log.error("registerBroker Exception", e);
    }
    return result;
}
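Notes:
createAndUpdateQueueData simply maintains the data in topicQueueTable; careful readers will want to read it for themselves.
As the source shows, handling a Broker's route registration comes down to maintaining clusterAddrTable, brokerAddrTable, topicQueueTable, brokerLiveTable, and filterServerTable. The source really is that simple.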
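registerBroker only rebuilds the topic routes when isBrokerTopicConfigChanged reports a change or the Broker is new. The body of that method is not shown in this article, so the following is only a sketch of the idea, assuming the check compares the reported version with the one stored from the last heartbeat. `Version` is a simplified stand-in for DataVersion (a timestamp plus a counter, as in the brokerLiveTable sample), and `DataVersionCheck` is a hypothetical class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class DataVersionCheck {
    /** Simplified stand-in for DataVersion: a timestamp plus a change counter. */
    static final class Version {
        final long timestamp;
        final long counter;

        Version(long timestamp, long counter) {
            this.timestamp = timestamp;
            this.counter = counter;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Version)) return false;
            Version other = (Version) o;
            return timestamp == other.timestamp && counter == other.counter;
        }

        @Override
        public int hashCode() {
            return Objects.hash(timestamp, counter);
        }
    }

    // Last version seen per broker address (plays the role of brokerLiveTable here).
    private final Map<String, Version> lastSeen = new HashMap<>();

    /** True when the broker is unknown or reports a version different from the stored one. */
    boolean isChanged(String brokerAddr, Version reported) {
        Version previous = lastSeen.get(brokerAddr);
        return previous == null || !previous.equals(reported);
    }

    /** Records the version carried by the latest heartbeat. */
    void remember(String brokerAddr, Version reported) {
        lastSeen.put(brokerAddr, reported);
    }

    public static void main(String[] args) {
        DataVersionCheck check = new DataVersionCheck();
        Version v = new Version(1630907813571L, 2);
        System.out.println(check.isChanged("172.16.62.75:10911", v));
        check.remember("172.16.62.75:10911", v);
        System.out.println(check.isChanged("172.16.62.75:10911", v));
    }
}
```

A check of this kind keeps the periodic heartbeats cheap: most of them carry an unchanged version and skip the topicQueueTable update entirely.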
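3.3 Route removal
Routes are removed in two ways: the Broker reports its own removal, or NameServer removes the Broker proactively. NameServer handles the two slightly differently, but both are easy to follow:
1. Broker-initiated removal: when a Broker shuts down normally, it runs unregisterBroker and sends an unregister request to NameServer. The core source is:
RouteInfoManager#unregisterBroker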
public void unregisterBroker(
    final String clusterName, final String brokerAddr,
    final String brokerName, final long brokerId) {
    try {
        try {
            this.lock.writeLock().lockInterruptibly();
            // 1: remove the brokerLiveTable entry directly, no time check needed
            BrokerLiveInfo brokerLiveInfo = this.brokerLiveTable.remove(brokerAddr);
            log.info("unregisterBroker, remove from brokerLiveTable {}, {}",
                brokerLiveInfo != null ? "OK" : "Failed",
                brokerAddr
            );
            // 2: remove the filterServerTable entry
            this.filterServerTable.remove(brokerAddr);
            // 3: remove from brokerAddrTable
            boolean removeBrokerName = false;
            BrokerData brokerData = this.brokerAddrTable.get(brokerName);
            if (null != brokerData) {
                String addr = brokerData.getBrokerAddrs().remove(brokerId);
                log.info("unregisterBroker, remove addr from brokerAddrTable {}, {}",
                    addr != null ? "OK" : "Failed",
                    brokerAddr
                );
                if (brokerData.getBrokerAddrs().isEmpty()) {
                    this.brokerAddrTable.remove(brokerName);
                    log.info("unregisterBroker, remove name from brokerAddrTable OK, {}",
                        brokerName
                    );
                    removeBrokerName = true;
                }
            }
            // 4: remove from clusterAddrTable
            if (removeBrokerName) {
                Set<String> nameSet = this.clusterAddrTable.get(clusterName);
                if (nameSet != null) {
                    boolean removed = nameSet.remove(brokerName);
                    log.info("unregisterBroker, remove name from clusterAddrTable {}, {}",
                        removed ? "OK" : "Failed",
                        brokerName);
                    if (nameSet.isEmpty()) {
                        this.clusterAddrTable.remove(clusterName);
                        log.info("unregisterBroker, remove cluster from clusterAddrTable {}",
                            clusterName
                        );
                    }
                }
                // 5: remove from topicQueueTable
                this.removeTopicByBrokerName(brokerName);
            }
        } finally {
            this.lock.writeLock().unlock();
        }
    } catch (Exception e) {
        log.error("unregisterBroker Exception", e);
    }
}
Note:
removeTopicByBrokerName simply removes that Broker's data from topicQueueTable; careful readers will want to read it for themselves.
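For readers who want the shape of that cleanup without opening the source, here is a minimal sketch. It assumes the same logic as step 6 of onChannelDestroy (drop the Broker's queue entries, then drop any topic left with none) and simplifies each QueueData to just its broker name; the class `TopicCleanup` is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class TopicCleanup {
    /** Removes every queue entry owned by brokerName and drops topics left with no entries. */
    static void removeTopicByBrokerName(Map<String, List<String>> topicToBrokerNames, String brokerName) {
        Iterator<Map.Entry<String, List<String>>> it = topicToBrokerNames.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, List<String>> entry = it.next();
            entry.getValue().removeIf(owner -> owner.equals(brokerName));
            if (entry.getValue().isEmpty()) {
                it.remove(); // no broker hosts this topic any more
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> table = new HashMap<>();
        table.put("TopicTest", new ArrayList<>(Arrays.asList("broker-a")));
        table.put("TopicShared", new ArrayList<>(Arrays.asList("broker-a", "broker-b")));
        removeTopicByBrokerName(table, "broker-a");
        System.out.println(table);
    }
}
```

Removing through the iterator matters: calling `table.remove(...)` inside the loop would throw ConcurrentModificationException on a HashMap.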
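2. NameServer-initiated removal: NameServer scans brokerLiveTable on a timer (every 10 s) and compares each Broker's last heartbeat time with the current system time. If the gap exceeds 120 s, the Broker's information is removed. The core source is:
RouteInfoManager#scanNotActiveBroker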
public void scanNotActiveBroker() {
    Iterator<Entry<String, BrokerLiveInfo>> it = this.brokerLiveTable.entrySet().iterator();
    while (it.hasNext()) {
        Entry<String, BrokerLiveInfo> next = it.next();
        long last = next.getValue().getLastUpdateTimestamp();
        // 1: BROKER_CHANNEL_EXPIRED_TIME defaults to 120 s (1000 * 60 * 2); check whether more than 120 s have passed
        if ((last + BROKER_CHANNEL_EXPIRED_TIME) < System.currentTimeMillis()) {
            RemotingUtil.closeChannel(next.getValue().getChannel());
            it.remove();
            log.warn("The broker channel expired, {} {}ms", next.getKey(), BROKER_CHANNEL_EXPIRED_TIME);
            this.onChannelDestroy(next.getKey(), next.getValue().getChannel());
        }
    }
}
public void onChannelDestroy(String remoteAddr, Channel channel) {
    String brokerAddrFound = null;
    if (channel != null) {
        try {
            try { // 1: find the Broker whose data must be removed
                this.lock.readLock().lockInterruptibly();
                Iterator<Entry<String, BrokerLiveInfo>> itBrokerLiveTable =
                    this.brokerLiveTable.entrySet().iterator();
                while (itBrokerLiveTable.hasNext()) {
                    Entry<String, BrokerLiveInfo> entry = itBrokerLiveTable.next();
                    if (entry.getValue().getChannel() == channel) {
                        brokerAddrFound = entry.getKey();
                        break;
                    }
                }
            } finally {
                this.lock.readLock().unlock();
            }
        } catch (Exception e) {
            log.error("onChannelDestroy Exception", e);
        }
    }
    if (null == brokerAddrFound) {
        brokerAddrFound = remoteAddr;
    } else {
        log.info("the broker's channel destroyed, {}, clean it's data structure at once", brokerAddrFound);
    }
    if (brokerAddrFound != null && brokerAddrFound.length() > 0) {
        try {
            try {
                this.lock.writeLock().lockInterruptibly();
                this.brokerLiveTable.remove(brokerAddrFound); // 2: remove from brokerLiveTable
                this.filterServerTable.remove(brokerAddrFound); // 3: remove from filterServerTable
                String brokerNameFound = null;
                boolean removeBrokerName = false;
                Iterator<Entry<String, BrokerData>> itBrokerAddrTable =
                    this.brokerAddrTable.entrySet().iterator(); // 4: remove from brokerAddrTable
                while (itBrokerAddrTable.hasNext() && (null == brokerNameFound)) {
                    BrokerData brokerData = itBrokerAddrTable.next().getValue();
                    Iterator<Entry<Long, String>> it = brokerData.getBrokerAddrs().entrySet().iterator();
                    while (it.hasNext()) {
                        Entry<Long, String> entry = it.next();
                        Long brokerId = entry.getKey();
                        String brokerAddr = entry.getValue();
                        if (brokerAddr.equals(brokerAddrFound)) {
                            brokerNameFound = brokerData.getBrokerName();
                            it.remove();
                            log.info("remove brokerAddr[{}, {}] from brokerAddrTable, because channel destroyed",
                                brokerId, brokerAddr);
                            break;
                        }
                    }
                    if (brokerData.getBrokerAddrs().isEmpty()) {
                        removeBrokerName = true;
                        itBrokerAddrTable.remove();
                        log.info("remove brokerName[{}] from brokerAddrTable, because channel destroyed",
                            brokerData.getBrokerName());
                    }
                }
                if (brokerNameFound != null && removeBrokerName) {
                    Iterator<Entry<String, Set<String>>> it = this.clusterAddrTable.entrySet().iterator(); // 5: remove from clusterAddrTable
                    while (it.hasNext()) {
                        Entry<String, Set<String>> entry = it.next();
                        String clusterName = entry.getKey();
                        Set<String> brokerNames = entry.getValue();
                        boolean removed = brokerNames.remove(brokerNameFound);
                        if (removed) {
                            log.info("remove brokerName[{}], clusterName[{}] from clusterAddrTable, because channel destroyed",
                                brokerNameFound, clusterName);
                            if (brokerNames.isEmpty()) {
                                log.info("remove the clusterName[{}] from clusterAddrTable, because channel destroyed and no broker in this cluster",
                                    clusterName);
                                it.remove();
                            }
                            break;
                        }
                    }
                }
                if (removeBrokerName) {
                    Iterator<Entry<String, List<QueueData>>> itTopicQueueTable =
                        this.topicQueueTable.entrySet().iterator(); // 6: remove from topicQueueTable
                    while (itTopicQueueTable.hasNext()) {
                        Entry<String, List<QueueData>> entry = itTopicQueueTable.next();
                        String topic = entry.getKey();
                        List<QueueData> queueDataList = entry.getValue();
                        Iterator<QueueData> itQueueData = queueDataList.iterator();
                        while (itQueueData.hasNext()) {
                            QueueData queueData = itQueueData.next();
                            if (queueData.getBrokerName().equals(brokerNameFound)) {
                                itQueueData.remove();
                                log.info("remove topic[{} {}], from topicQueueTable, because channel destroyed",
                                    topic, queueData);
                            }
                        }
                        if (queueDataList.isEmpty()) {
                            itTopicQueueTable.remove();
                            log.info("remove topic[{}] all queue, from topicQueueTable, because channel destroyed",
                                topic);
                        }
                    }
                }
            } finally {
                this.lock.writeLock().unlock();
            }
        } catch (Exception e) {
            log.error("onChannelDestroy Exception", e);
        }
    }
}
As the source shows, both ways of unregistering a Broker come down to removing the related entries from clusterAddrTable, brokerAddrTable, topicQueueTable, brokerLiveTable, and filterServerTable.
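The timing of the NameServer-initiated path is easier to reason about with the clock made explicit. The sketch below reproduces only the expiry comparison from scanNotActiveBroker, with the current time passed in as a parameter; the class `LivenessScan`, the method `evictExpired`, and the simplified map of address to last-heartbeat time are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class LivenessScan {
    static final long EXPIRED_MS = 1000 * 60 * 2; // 120 s, as BROKER_CHANNEL_EXPIRED_TIME

    /** Removes brokers whose last heartbeat is more than EXPIRED_MS old; returns their addresses. */
    static List<String> evictExpired(Map<String, Long> lastHeartbeatByAddr, long nowMillis) {
        List<String> evicted = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastHeartbeatByAddr.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() + EXPIRED_MS < nowMillis) {
                evicted.add(entry.getKey());
                it.remove();
            }
        }
        return evicted;
    }

    public static void main(String[] args) {
        Map<String, Long> live = new HashMap<>();
        live.put("172.16.62.75:10911", 0L);       // last heartbeat at t = 0
        live.put("172.16.62.76:10911", 100_000L); // last heartbeat at t = 100 s
        System.out.println(evictExpired(live, 120_001L));
    }
}
```

Since the scan itself only runs every 10 s, a Broker that dies right after a heartbeat is evicted somewhere between 120 s and roughly 130 s later, depending on where the next scan falls.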
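3.4 Route discovery
Route discovery in RocketMQ is not real-time. When a Topic's route changes, NameServer does not push the change to clients; instead, producers and consumers periodically pull the latest route for each topic. The core source is:
RouteInfoManager#pickupTopicRouteData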
public TopicRouteData pickupTopicRouteData(final String topic) {
    TopicRouteData topicRouteData = new TopicRouteData();
    boolean foundQueueData = false;
    boolean foundBrokerData = false;
    Set<String> brokerNameSet = new HashSet<String>();
    List<BrokerData> brokerDataList = new LinkedList<BrokerData>();
    topicRouteData.setBrokerDatas(brokerDataList);
    HashMap<String, List<String>> filterServerMap = new HashMap<String, List<String>>();
    topicRouteData.setFilterServerTable(filterServerMap);
    try {
        try {
            this.lock.readLock().lockInterruptibly();
            List<QueueData> queueDataList = this.topicQueueTable.get(topic);
            if (queueDataList != null) {
                topicRouteData.setQueueDatas(queueDataList);
                foundQueueData = true;
                Iterator<QueueData> it = queueDataList.iterator();
                while (it.hasNext()) {
                    QueueData qd = it.next();
                    brokerNameSet.add(qd.getBrokerName());
                }
                // Build the BrokerData list
                for (String brokerName : brokerNameSet) {
                    BrokerData brokerData = this.brokerAddrTable.get(brokerName);
                    if (null != brokerData) {
                        BrokerData brokerDataClone = new BrokerData(brokerData.getCluster(), brokerData.getBrokerName(),
                            (HashMap<Long, String>) brokerData.getBrokerAddrs().clone());
                        brokerDataList.add(brokerDataClone);
                        foundBrokerData = true;
                        for (final String brokerAddr : brokerDataClone.getBrokerAddrs().values()) {
                            List<String> filterServerList = this.filterServerTable.get(brokerAddr);
                            filterServerMap.put(brokerAddr, filterServerList);
                        }
                    }
                }
            }
        } finally {
            this.lock.readLock().unlock();
        }
    } catch (Exception e) {
        log.error("pickupTopicRouteData Exception", e);
    }
    log.debug("pickupTopicRouteData {} {}", topic, topicRouteData);
    if (foundBrokerData && foundQueueData) {
        return topicRouteData;
    }
    return null;
}
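Note:
This code hardly needs comments: it looks up data in the topicQueueTable, brokerAddrTable, and filterServerTable maps, assembles it into a TopicRouteData, and returns it to the client.
The fields of TopicRouteData are listed below; they are just as simple: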
public class TopicRouteData extends RemotingSerializable {
    // Ordered-topic configuration, tied to the "ORDER_TOPIC_CONFIG" namespace;
    // see DefaultRequestProcessor#getRouteInfoByTopic
    private String orderTopicConf;
    // Queue metadata of the topic
    private List<QueueData> queueDatas;
    // Metadata of the Brokers that host the topic
    private List<BrokerData> brokerDatas;
    // Filter server addresses per Broker
    private HashMap<String/* brokerAddr */, List<String>/* Filter Server */> filterServerTable;
    // ... omitted ...
}
NameServer actually offers many other methods, for example:
- getBrokerClusterInfo (get cluster information)
- getAllTopicListFromNameserver (get all topics)
Most of them, however, revolve around these HashMaps:
- clusterAddrTable
- brokerAddrTable
- topicQueueTable
- brokerLiveTable
- filterServerTable
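The pull model described at the start of 3.4 means a client always works from a local copy that can lag behind NameServer until the next poll. A minimal sketch of that behaviour follows; `RouteCache`, `refresh`, and the `fetchFromNameServer` function are hypothetical names, and a route is reduced to a plain string for brevity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class RouteCache {
    private final Map<String, String> cached = new HashMap<>();
    // Hypothetical remote call: topic -> current route, or null when the topic is unknown.
    private final Function<String, String> fetchFromNameServer;

    RouteCache(Function<String, String> fetchFromNameServer) {
        this.fetchFromNameServer = fetchFromNameServer;
    }

    /** One polling round: returns true when the cached route was replaced. */
    boolean refresh(String topic) {
        String latest = fetchFromNameServer.apply(topic);
        if (latest == null || latest.equals(cached.get(topic))) {
            return false;
        }
        cached.put(topic, latest);
        return true;
    }

    /** What sends and pulls actually use between polls. */
    String routeOf(String topic) {
        return cached.get(topic);
    }

    public static void main(String[] args) {
        Map<String, String> nameServer = new HashMap<>();
        nameServer.put("TopicTest", "broker-a");
        RouteCache cache = new RouteCache(nameServer::get);
        cache.refresh("TopicTest");
        nameServer.put("TopicTest", "broker-b");        // route changes on the server
        System.out.println(cache.routeOf("TopicTest")); // client still holds the old route
        cache.refresh("TopicTest");                     // next poll picks up the change
        System.out.println(cache.routeOf("TopicTest"));
    }
}
```

The trade-off is simplicity on the NameServer side (no push channel, no subscriber bookkeeping) in exchange for a window of staleness on the client side.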
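IV. Conclusion
Functionally, NameServer is the "brain" of RocketMQ: it holds the cluster's routing information, that is, it records and maintains Topic and Broker information and monitors Broker liveness, so that clients can route messages. At the source level, NameServer simply maintains several HashMaps; Broker registration and client lookups all operate on those maps, with a ReentrantReadWriteLock added to handle concurrency. This article covers only part of NameServer's key code; its startup flow and other parts of the source are also worth analyzing and studying.
V. An open question
Attentive readers may have noticed the following weakness in NameServer:
Suppose a Broker crashes. NameServer waits at least 120 s before removing that Broker from the routing information. During the outage, the routes a Producer fetches for a topic still include the dead Broker, so sends can fail for a short time. What should be done in that case? Does this mean message sending is not highly available? Is consumption affected as well?
Keep these questions in mind; the follow-up articles on the sending side and the consuming side will answer them one by one.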