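This article aims to be simple and easy to follow, for learning and discussion, so it avoids loosely related material where possible.
Related microservice topics will be covered in separate posts. Questions are welcome in the comments and will be answered as soon as possible.
If the article is not to your taste, please be kind.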
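Nacos is dedicated to helping you discover, configure, and manage microservices. It provides a simple, easy-to-use feature set for dynamic service discovery, service configuration, service metadata, and traffic management.
Nacos helps you build, deliver, and manage microservice platforms with more agility and less effort. It is service infrastructure for building modern, service-centric application architectures (for example the microservice paradigm and the cloud-native paradigm).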
[Figure: Nacos map]
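A service is one or a group of software functions (for example retrieving specific information or performing a set of operations) that different clients can reuse for different purposes (for example through cross-process network calls). Nacos supports the mainstream service ecosystems, such as Kubernetes Service, gRPC or Dubbo RPC Service, and Spring Cloud RESTful Service.
A service registry is a database of services, their instances, and their metadata. A service instance registers with the registry on startup and deregisters on shutdown. Clients of services and routers query the registry to find the available instances of a service. The registry may call an instance's health check API to verify that it can handle requests.
Service metadata is the data that describes a service, including service endpoints, service labels, service version, instance weights, routing rules, and security policies.
A service provider is an application that offers reusable, callable services.
A service consumer is an application that initiates calls to a service.
During system development, parameters and variables that need to change are usually separated from the code and managed independently as configuration files. The goal is to let static system artifacts or deliverables (WAR or JAR packages, for example) adapt better to the actual physical runtime environment. Configuration management is usually part of system deployment and is carried out by system administrators or operations staff. Changing configuration is one of the effective ways to adjust a system's runtime behavior.
In a data center, all configuration-related activities, such as editing, storage, distribution, change management, version history, and change auditing, are collectively called configuration management.
A naming service manages the mapping from the "names" of all objects and entities in a distributed system to their associated metadata, for example ServiceName -> Endpoints Info, Distributed Lock Name -> Lock Owner/Status Info, and DNS Domain Name -> IP List. Service discovery and DNS are the two main scenarios for a naming service.
A configuration service is the provider that supplies dynamic configuration or metadata, together with configuration management, while a service or application is running.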
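A key in the Nacos data model is uniquely identified by a triple. The namespace defaults to the empty string, which denotes the public namespace (public), and the group defaults to DEFAULT_GROUP.
Two entities are associated with a configuration: the configuration change history and the service tags (used for labeling and classification, which makes indexing easier). Both are linked to the configuration by ID.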
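Nacos is distributed as a standard Docker image (TODO: supported from version 0.2) and as zip (tar.gz) archives.
Nacos supports two deployment modes: running the Service Registry and the Config Center together in one process, or deploying the two separately.
Besides deploying and running Nacos yourself, in the cloud era Nacos also supports a public cloud mode: Alibaba Cloud's commercial products (such as MSE and EDAS) provide a free public cloud Nacos service. Other public cloud providers are also welcome and supported in offering Nacos as a public cloud service.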
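Service registration and discovery is the core capability that keeps a microservice architecture running. It provides no business functionality of its own; it only registers and discovers services, and monitors and manages their health.
Its core working principle:
For service registration and discovery, Nacos provides the SDK interface NamingService. Through it you can register and discover services, subscribe to service changes, fetch service instances, query the health of the registry, and so on.
The methods of NamingService fall into the following categories:
Creating a NamingService: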
NamingService namingService = NacosFactory.createNamingService(properties);
The factory looks up the NacosNamingService constructor that takes a Properties parameter and creates the instance through reflection.
public class NamingFactory {
public static NamingService createNamingService(Properties properties) throws NacosException {
try {
Class<?> driverImplClass = Class.forName("com.alibaba.nacos.client.naming.NacosNamingService");
Constructor constructor = driverImplClass.getConstructor(Properties.class);
NamingService vendorImpl = (NamingService)constructor.newInstance(properties);
return vendorImpl;
} catch (Throwable e) {
throw new NacosException(NacosException.CLIENT_INVALID_PARAM, e);
}
}
}
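The reflective lookup above can be tried in isolation. The following is a minimal sketch, not Nacos code: `ReflectiveFactoryDemo` and `DemoNamingService` are hypothetical names, and the `serverAddr` key with its default value is only an illustrative assumption.

```java
import java.lang.reflect.Constructor;
import java.util.Properties;

final class ReflectiveFactoryDemo {

    // Hypothetical stand-in for an implementation class such as NacosNamingService.
    public static class DemoNamingService {
        final String serverAddr;

        public DemoNamingService(Properties properties) {
            // The key and the default value are assumptions made for this sketch.
            this.serverAddr = properties.getProperty("serverAddr", "localhost:8848");
        }
    }

    // Loads a class by name and calls its public (Properties) constructor.
    static Object create(String className, Properties properties) {
        try {
            Class<?> implClass = Class.forName(className);
            Constructor<?> constructor = implClass.getConstructor(Properties.class);
            return constructor.newInstance(properties);
        } catch (ReflectiveOperationException e) {
            // Mirrors the walkthrough: any reflection failure is reported as a bad parameter.
            throw new IllegalStateException("cannot create " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties properties = new Properties();
        properties.setProperty("serverAddr", "127.0.0.1:8848");
        Object service = create(DemoNamingService.class.getName(), properties);
        System.out.println(service.getClass().getSimpleName());
    }
}
```

Resolving the implementation by class name keeps the factory free of a compile-time dependency on the implementation class, at the cost of moving errors from compile time to run time.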
The constructor then builds the NacosNamingService object:
public class NacosNamingService implements NamingService {
public NacosNamingService(Properties properties) {
init(properties);
}
private void init(Properties properties) {
// Initialize the namespace
namespace = InitUtils.initNamespaceForNaming(properties);
// Initialize the server address list
initServerAddr(properties);
// Initialize the web root context
InitUtils.initWebRootContext();
// Initialize the cache directory used for failover: user.home + /nacos/naming/ + namespace
initCacheDir();
initLogName(properties);
// Event dispatcher: a while(true) thread; when a service's instance list changes, it sends a NamingEvent to every listener of that service
eventDispatcher = new EventDispatcher();
// Create the server proxy: registers/deregisters instances, sends heartbeats, fetches instances, updates/deletes/creates instances, etc.
// Every request sent to or fetched from the server goes through NamingProxy.
serverProxy = new NamingProxy(namespace, endpoint, serverList);
serverProxy.setProperties(properties);
// Heartbeats: registering an ephemeral instance with Nacos requires a heartbeat task
beatReactor = new BeatReactor(serverProxy, initClientBeatThreadCount(properties));
// Fetches registration data from the server, handles failover, periodically writes backups to disk, etc.
hostReactor = new HostReactor(eventDispatcher, serverProxy, cacheDir, isLoadCacheAtStart(properties), initPollingThreadCount(properties));
}
}
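NamingProxy is the server proxy: it registers and deregisters instances, sends heartbeats, fetches instances, and updates, deletes, and creates instances.
Every request sent to or fetched from the server goes through NamingProxy.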
public class NamingProxy {
public NamingProxy(String namespaceId, String endpoint, String serverList) {
this.namespaceId = namespaceId;
this.endpoint = endpoint;
if (StringUtils.isNotEmpty(serverList)) {
this.serverList = Arrays.asList(serverList.split(","));
if (this.serverList.size() == 1) {
this.nacosDomain = serverList;
}
}
// Initialize the server list from the endpoint and refresh it every 30 seconds
initRefreshSrvIfNeed();
}
// Initialize the server list from the endpoint
private void initRefreshSrvIfNeed() {
// No endpoint configured: return immediately
if (StringUtils.isEmpty(endpoint)) {
return;
}
ScheduledExecutorService executorService = new ScheduledThreadPoolExecutor(1, new ThreadFactory() {
@Override
public Thread newThread(Runnable r) {
Thread t = new Thread(r);
t.setName("com.alibaba.nacos.client.naming.serverlist.updater");
t.setDaemon(true);
return t;
}
});
// Fixed-delay task: refresh the server list from the endpoint, vipSrvRefInterMillis = 30s
executorService.scheduleWithFixedDelay(new Runnable() {
@Override
public void run() {
// Refresh the server list
refreshSrvIfNeed();
}
}, 0, vipSrvRefInterMillis, TimeUnit.MILLISECONDS);
// Refresh the server list from the endpoint once, right away
refreshSrvIfNeed();
}
// Refresh the server list from the endpoint
private void refreshSrvIfNeed() {
try {
if (!CollectionUtils.isEmpty(serverList)) {
NAMING_LOGGER.debug("server list provided by user: " + serverList);
return;
}
if (System.currentTimeMillis() - lastSrvRefTime < vipSrvRefInterMillis) {
return;
}
// Fetch the server list from the endpoint
List<String> list = getServerListFromEndpoint();
if (CollectionUtils.isEmpty(list)) {
throw new Exception("Can not acquire Nacos list");
}
if (!CollectionUtils.isEqualCollection(list, serversFromEndpoint)) {
NAMING_LOGGER.info("[SERVER-LIST] server list is updated: " + list);
}
serversFromEndpoint = list;
lastSrvRefTime = System.currentTimeMillis();
} catch (Throwable e) {
NAMING_LOGGER.warn("failed to update server list", e);
}
}
// Fetch the server list from the endpoint: http://endpoint/nacos/serverlist
public List<String> getServerListFromEndpoint() {
try {
String urlString = "http://" + endpoint + "/nacos/serverlist";
List<String> headers = builderHeaders();
HttpClient.HttpResult result = HttpClient.httpGet(urlString, headers, null, UtilAndComs.ENCODING);
if (HttpURLConnection.HTTP_OK != result.code) {
throw new IOException("Error while requesting: " + urlString + "'. Server returned: "
+ result.code);
}
String content = result.content;
List<String> list = new ArrayList<String>();
for (String line : IoUtils.readLines(new StringReader(content))) {
if (!line.trim().isEmpty()) {
list.add(line.trim());
}
}
return list;
} catch (Exception e) {
e.printStackTrace();
}
return null;
}
}
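The response handling in getServerListFromEndpoint boils down to "one server address per line, blanks ignored". A minimal sketch of that parsing step follows; `ServerListParser` is a hypothetical name, and unlike the listing above it returns an empty list rather than null when there is no content.

```java
import java.util.ArrayList;
import java.util.List;

final class ServerListParser {

    // Splits the endpoint response into trimmed, non-empty lines.
    static List<String> parse(String content) {
        List<String> servers = new ArrayList<>();
        if (content == null) {
            return servers;
        }
        // \R matches any line break (\n, \r\n, \r, ...).
        for (String line : content.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                servers.add(trimmed);
            }
        }
        return servers;
    }

    public static void main(String[] args) {
        System.out.println(parse("10.0.0.1:8848\n\n 10.0.0.2:8848 \r\n"));
    }
}
```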
BeatReactor mainly sends heartbeats; the server uses the time of the last heartbeat to decide whether an instance is online or offline.
public class BeatReactor {
public BeatReactor(NamingProxy serverProxy, int threadCount) {
this.serverProxy = serverProxy;
executorService = new ScheduledThreadPoolExecutor(threadCount, new ThreadFactory() {
@Override
public Thread newThread(Runnable r) {
Thread thread = new Thread(r);
thread.setDaemon(true);
thread.setName("com.alibaba.nacos.naming.beat.sender");
return thread;
}
});
}
// On registration, an ephemeral instance needs a heartbeat entry
public void addBeatInfo(String serviceName, BeatInfo beatInfo) {
NAMING_LOGGER.info("[BEAT] adding beat: {} to beat map.", beatInfo);
dom2Beat.put(buildKey(serviceName, beatInfo.getIp(), beatInfo.getPort()), beatInfo);
// Schedule the heartbeat task; beatInfo.getPeriod() is the heartbeat interval
executorService.schedule(new BeatTask(beatInfo), beatInfo.getPeriod(), TimeUnit.MILLISECONDS);
MetricsMonitor.getDom2BeatSizeMonitor().set(dom2Beat.size());
}
// Beat Task
class BeatTask implements Runnable {
BeatInfo beatInfo;
public BeatTask(BeatInfo beatInfo) {
this.beatInfo = beatInfo;
}
@Override
public void run() {
if (beatInfo.isStopped()) {
return;
}
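// Call NamingProxy.sendBeat to send the heartbeat, url=/instance/beat
long result = serverProxy.sendBeat(beatInfo);
// Delay until the next heartbeat: the server's value if positive, otherwise the configured period
long nextTime = result > 0 ? result : beatInfo.getPeriod();
// Schedule the next heartbeat as a new delayed task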
executorService.schedule(new BeatTask(beatInfo), nextTime, TimeUnit.MILLISECONDS);
}
}
}
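BeatTask does not use a fixed-rate schedule; each run schedules the next one, so the server's reply can change the interval. A minimal sketch of that pattern follows, under stated assumptions: `SelfReschedulingBeat` and `runBeats` are hypothetical names, `sendBeat` stands in for `serverProxy.sendBeat`, and the sketch stops after a fixed number of beats (at least one) so that it terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

final class SelfReschedulingBeat {

    // Next delay: the interval suggested by the server if positive, else the default period.
    static long nextDelay(long serverInterval, long defaultPeriod) {
        return serverInterval > 0 ? serverInterval : defaultPeriod;
    }

    // Sends `count` beats (count >= 1); returns true if all were sent within 5 seconds.
    static boolean runBeats(int count, long periodMillis, LongSupplier sendBeat) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread thread = new Thread(r, "demo.beat.sender");
            thread.setDaemon(true);
            return thread;
        });
        CountDownLatch remaining = new CountDownLatch(count);
        Runnable task = new Runnable() {
            @Override
            public void run() {
                long serverInterval = sendBeat.getAsLong();
                remaining.countDown();
                if (remaining.getCount() > 0) {
                    // The task re-schedules itself instead of relying on a fixed-rate schedule.
                    executor.schedule(this, nextDelay(serverInterval, periodMillis), TimeUnit.MILLISECONDS);
                }
            }
        };
        executor.schedule(task, periodMillis, TimeUnit.MILLISECONDS);
        try {
            return remaining.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        boolean done = runBeats(3, 10L, () -> {
            System.out.println("beat");
            return 0L; // 0 means "no suggestion from the server" in this sketch
        });
        System.out.println("all beats sent: " + done);
    }
}
```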
HostReactor fetches registration data from the server, handles failover, and periodically writes backups to disk.
public class HostReactor {
public HostReactor(EventDispatcher eventDispatcher, NamingProxy serverProxy, String cacheDir,
boolean loadCacheAtStart, int pollingThreadCount) {
executor = new ScheduledThreadPoolExecutor(pollingThreadCount, new ThreadFactory() {
@Override
public Thread newThread(Runnable r) {
Thread thread = new Thread(r);
thread.setDaemon(true);
thread.setName("com.alibaba.nacos.client.naming.updater");
return thread;
}
});
this.eventDispatcher = eventDispatcher;
this.serverProxy = serverProxy;
this.cacheDir = cacheDir;
if (loadCacheAtStart) { // whether to load the cache at startup
// DiskCache.read(this.cacheDir): read every file in the cache directory into the in-memory serviceInfoMap
this.serviceInfoMap = new ConcurrentHashMap<String, ServiceInfo>(DiskCache.read(this.cacheDir));
} else {
this.serviceInfoMap = new ConcurrentHashMap<String, ServiceInfo>(16);
}
this.updatingMap = new ConcurrentHashMap<String, Object>();
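// Provides failover when the registry is unavailable.
this.failoverReactor = new FailoverReactor(this, cacheDir);
// Receives registration change notifications from the server over UDP and updates the in-memory and on-disk caches.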
this.pushReceiver = new PushReceiver(this);
}
}
FailoverReactor is a failover safety mechanism: when the registry is unavailable, the client still obtains instance data from it.
public class FailoverReactor {
public FailoverReactor(HostReactor hostReactor, String cacheDir) {
this.hostReactor = hostReactor;
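// Failover directory: user.home + /nacos/naming/ + namespace + /failover
this.failoverDir = cacheDir + "/failover";
// Initialize
this.init();
}
// Initialize
public void init() {
// Failover switch: runs immediately and then every 5s; checks whether the failoverDir + FAILOVER_SWITCH file exists
// If the file is missing or its content is 0, failover is turned off
// If the file exists and its content is 1, failover is turned on
// When failover is on, a FailoverFileReader thread reads every file under failoverDir into memory
executorService.scheduleWithFixedDelay(new SwitchRefresher(), 0L, 5000L, TimeUnit.MILLISECONDS);
// Start a delayed flush task that backs up service data to the failover directory (failoverDir)
// DAY_PERIOD_MINUTES = 24 * 60: first run after 30 minutes, then once every 24 hours
// Flush: iterate over serviceInfoMap, skip the special entries (see the source), and write the remaining entries to failoverDir
executorService.scheduleWithFixedDelay(new DiskFileWriter(), 30, DAY_PERIOD_MINUTES, TimeUnit.MINUTES);
// backup file on startup if failover directory is empty.
// Runs 10s after startup: if the failover directory is empty, write a backup first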
executorService.schedule(new Runnable() {
@Override
public void run() {
try {
File cacheDir = new File(failoverDir);
if (!cacheDir.exists() && !cacheDir.mkdirs()) {
throw new IllegalStateException("failed to create cache dir: " + failoverDir);
}
File[] files = cacheDir.listFiles();
if (files == null || files.length <= 0) {
new DiskFileWriter().run();
}
} catch (Throwable e) {
NAMING_LOGGER.error("[NA] failed to backup file on startup.", e);
}
}
}, 10000L, TimeUnit.MILLISECONDS);
}
}
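The switch check described in the comments above (file missing or "0" means off, "1" means on) can be sketched as follows. This is an illustration under assumptions, not the SwitchRefresher implementation: `FailoverSwitchDemo`, `parseSwitch`, and the file name `switch-file` are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class FailoverSwitchDemo {

    // Failover is on only if some line of the switch content is exactly "1".
    static boolean parseSwitch(String content) {
        for (String line : content.split("\\R")) {
            if ("1".equals(line.trim())) {
                return true;
            }
        }
        return false;
    }

    // A missing or unreadable switch file means failover stays off.
    static boolean isFailoverEnabled(Path switchFile) {
        if (!Files.isRegularFile(switchFile)) {
            return false;
        }
        try {
            byte[] bytes = Files.readAllBytes(switchFile);
            return parseSwitch(new String(bytes, StandardCharsets.UTF_8));
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "switch-file" is a placeholder name, not the real FAILOVER_SWITCH constant.
        Path switchFile = Paths.get(System.getProperty("user.home"), "nacos", "naming", "failover", "switch-file");
        System.out.println("failover enabled: " + isFailoverEnabled(switchFile));
    }
}
```

Treating every error as "off" keeps the client on live registry data unless an operator has explicitly switched failover on.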
PushReceiver receives registration change notifications from the server over UDP and updates the in-memory and on-disk caches.
public class PushReceiver implements Runnable {
public PushReceiver(HostReactor hostReactor) {
try {
this.hostReactor = hostReactor;
// Open a UDP socket
udpSocket = new DatagramSocket();
executorService = new ScheduledThreadPoolExecutor(1, new ThreadFactory() {
@Override
public Thread newThread(Runnable r) {
Thread thread = new Thread(r);
thread.setDaemon(true);
thread.setName("com.alibaba.nacos.naming.push.receiver");
return thread;
}
});
// Run this receiver on the executor
executorService.execute(this);
} catch (Exception e) {
NAMING_LOGGER.error("[NA] init udp socket failed", e);
}
}
@Override
public void run() {
while (true) {
try {
// byte[] is initialized with 0 full filled by default
byte[] buffer = new byte[UDP_MSS];
DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
// Blocks: when registration data changes, the server sends a UDP message to this socket
udpSocket.receive(packet);
// Registration data received from the server
String json = new String(IoUtils.tryDecompress(packet.getData()), "UTF-8").trim();
NAMING_LOGGER.info("received push data: " + json + " from " + packet.getAddress().toString());
PushPacket pushPacket = JSON.parseObject(json, PushPacket.class);
// Ack to send back to the server
String ack;
if ("dom".equals(pushPacket.type) || "service".equals(pushPacket.type)) {
// Process the registration data (JSON string) from the server:
// 1. update the in-memory cache
// 2. update the local cache file
hostReactor.processServiceJSON(pushPacket.data);
// send ack to server
ack = "{\"type\": \"push-ack\""
+ ", \"lastRefTime\":\"" + pushPacket.lastRefTime
+ "\", \"data\":" + "\"\"}";
} else if ("dump".equals(pushPacket.type)) {
// dump data to server
ack = "{\"type\": \"dump-ack\""
+ ", \"lastRefTime\": \"" + pushPacket.lastRefTime
+ "\", \"data\":" + "\""
+ StringUtils.escapeJavaScript(JSON.toJSONString(hostReactor.getServiceInfoMap()))
+ "\"}";
} else {
// do nothing send ack only
ack = "{\"type\": \"unknown-ack\""
+ ", \"lastRefTime\":\"" + pushPacket.lastRefTime
+ "\", \"data\":" + "\"\"}";
}
// Send the ack back to the server
udpSocket.send(new DatagramPacket(ack.getBytes(Charset.forName("UTF-8")),
ack.getBytes(Charset.forName("UTF-8")).length, packet.getSocketAddress()));
} catch (Exception e) {
NAMING_LOGGER.error("[NA] error while receiving push data", e);
}
}
}
}
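The three branches of the receive loop differ only in the ack type and the data payload. A minimal sketch of that mapping follows; `PushAckDemo` is a hypothetical name, and the sketch assumes `data` is already escaped for embedding in a JSON string (the listing above does this with `StringUtils.escapeJavaScript` for the dump case).

```java
final class PushAckDemo {

    // Maps the type of a pushed packet to the type of the ack sent back.
    static String ackType(String pushType) {
        if ("dom".equals(pushType) || "service".equals(pushType)) {
            return "push-ack";
        }
        if ("dump".equals(pushType)) {
            return "dump-ack";
        }
        return "unknown-ack";
    }

    // Builds the ack JSON by hand; `data` must already be escaped.
    static String buildAck(String pushType, long lastRefTime, String data) {
        return "{\"type\": \"" + ackType(pushType) + "\""
            + ", \"lastRefTime\": \"" + lastRefTime + "\""
            + ", \"data\": \"" + data + "\"}";
    }

    public static void main(String[] args) {
        System.out.println(buildAck("service", 42L, ""));
    }
}
```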
namingService.registerInstance(serviceName, "localhost", 8080);
Every registerInstance overload in namingService ends up calling registerInstance(String serviceName, String groupName, Instance instance).
public class NacosNamingService implements NamingService {
@Override
public void registerInstance(String serviceName, String groupName, Instance instance) throws NacosException {
// Ephemeral instance? Ephemeral instances must send heartbeats to stay registered; the default is true
if (instance.isEphemeral()) {
// Build the heartbeat parameters
BeatInfo beatInfo = new BeatInfo();
beatInfo.setServiceName(NamingUtils.getGroupedName(serviceName, groupName));
beatInfo.setIp(instance.getIp());
beatInfo.setPort(instance.getPort());
beatInfo.setCluster(instance.getClusterName());
beatInfo.setWeight(instance.getWeight());
beatInfo.setMetadata(instance.getMetadata());
beatInfo.setScheduled(false);
// Read the instance's heartbeat interval
long instanceInterval = instance.getInstanceHeartBeatInterval();
// Set the period, defaulting to DEFAULT_HEART_BEAT_INTERVAL = 5s
beatInfo.setPeriod(instanceInterval == 0 ? DEFAULT_HEART_BEAT_INTERVAL : instanceInterval);
// Use the BeatReactor (section 2.2.2) to send heartbeats on a schedule
beatReactor.addBeatInfo(NamingUtils.getGroupedName(serviceName, groupName), beatInfo);
}
// serverProxy is the NamingProxy: register the service
serverProxy.registerService(NamingUtils.getGroupedName(serviceName, groupName), groupName, instance);
}
}
public class NamingProxy {
// Register the service
public void registerService(String serviceName, String groupName, Instance instance) throws NacosException {
NAMING_LOGGER.info("[REGISTER-SERVICE] {} registering service {} with instance: {}",
namespaceId, serviceName, instance);
final Map<String, String> params = new HashMap<String, String>(9);
params.put(CommonParams.NAMESPACE_ID, namespaceId);
params.put(CommonParams.SERVICE_NAME, serviceName);
params.put(CommonParams.GROUP_NAME, groupName); // group, DEFAULT_GROUP by default
params.put(CommonParams.CLUSTER_NAME, instance.getClusterName()); // cluster name, DEFAULT by default
params.put("ip", instance.getIp());
params.put("port", String.valueOf(instance.getPort()));
params.put("weight", String.valueOf(instance.getWeight())); // weight
params.put("enable", String.valueOf(instance.isEnabled()));
params.put("healthy", String.valueOf(instance.isHealthy()));
params.put("ephemeral", String.valueOf(instance.isEphemeral())); // ephemeral instance or not, true by default
params.put("metadata", JSON.toJSONString(instance.getMetadata())); // metadata
// Send the registration request: /nacos/v1/ns/instance
// reqAPI sends the request to the server; on failure it walks through the whole serverList
// If every server fails and only one server is configured, it retries (three times by default)
reqAPI(UtilAndComs.NACOS_URL_INSTANCE, params, HttpMethod.POST);
}
}
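To make the request concrete, the sketch below builds a few of the same parameters and encodes them as a form string, which is roughly what ends up on the wire for the POST. It is only an illustration: `RegisterParamsDemo` is a hypothetical name, the literal keys `namespaceId` and `serviceName` are assumed values of the CommonParams constants, and the `DEFAULT_GROUP@@demo` service name assumes that the grouped name joins group and service with `@@`.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class RegisterParamsDemo {

    // A subset of the registration parameters, kept in insertion order.
    static Map<String, String> registerParams(String namespaceId, String groupedServiceName, String ip, int port) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("namespaceId", namespaceId);
        params.put("serviceName", groupedServiceName);
        params.put("ip", ip);
        params.put("port", String.valueOf(port));
        return params;
    }

    // Encodes the parameters as application/x-www-form-urlencoded.
    static String encode(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(urlEncode(entry.getKey())).append('=').append(urlEncode(entry.getValue()));
        }
        return body.toString();
    }

    private static String urlEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode(registerParams("public", "DEFAULT_GROUP@@demo", "127.0.0.1", 8080)));
    }
}
```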
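When the server receives the client's POST request to /nacos/v1/ns/instance, it first registers the service in the Map<String, Map<String, Service>> held by ServiceManager, then adds the instance to that Service, and finally hands the data to the Nacos cluster consistency service to keep the cluster consistent.
Nacos stores services in the following structure: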
@RestController
@RequestMapping(UtilsAndCommons.NACOS_NAMING_CONTEXT + "/instance")
public class InstanceController {
@CanDistro
@RequestMapping(value = "", method = RequestMethod.POST)
public String register(HttpServletRequest request) throws Exception {
// Read the service name
String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
// Read the namespace
String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
// Register the instance
// parseInstance(request): parses the request parameters into an Instance object
serviceManager.registerInstance(namespaceId, serviceName, parseInstance(request));
return "ok";
}
}
@Component
@DependsOn("nacosApplicationContext")
public class ServiceManager implements RecordListener<Service> {
/**
 * Map<namespaceId, Map<serviceName, Service>>
*/
private Map<String, Map<String, Service>> serviceMap = new ConcurrentHashMap<>();
public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
// Create an empty Service
createEmptyService(namespaceId, serviceName, instance.isEphemeral());
// Look up the Service in Map<String, Map<String, Service>> serviceMap
Service service = getService(namespaceId, serviceName);
if (service == null) {
throw new NacosException(NacosException.INVALID_PARAM,
"service not found, namespace: " + namespaceId + ", service: " + serviceName);
}
// Add the instance to this Service
addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
}
// Create an empty Service
public void createEmptyService(String namespaceId, String serviceName, boolean local) throws NacosException {
createServiceIfAbsent(namespaceId, serviceName, local, null);
}
// Create the Service if it does not exist yet
public void createServiceIfAbsent(String namespaceId, String serviceName, boolean local, Cluster cluster) throws NacosException {
// Look up the Service in the cache: Map<String, Map<String, Service>> serviceMap
Service service = getService(namespaceId, serviceName);
// Not found: create the Service
if (service == null) {
Loggers.SRV_LOG.info("creating empty service {}:{}", namespaceId, serviceName);
service = new Service();
service.setName(serviceName);
service.setNamespaceId(namespaceId);
service.setGroupName(NamingUtils.getGroupName(serviceName));
// now validate the service. if failed, exception will be thrown
service.setLastModifiedMillis(System.currentTimeMillis());
// Recalculate the checksum
service.recalculateChecksum();
if (cluster != null) {
cluster.setService(service);
// Attach the cluster
service.getClusterMap().put(cluster.getName(), cluster);
}
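// Validate the parameters
service.validate();
// Ephemeral or not; the client default is true
if (local) {
// Ephemeral: put the service and initialize it
putServiceAndInit(service);
} else {
// Persistent: add or replace the service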
addOrReplaceService(service);
}
}
}
// Ephemeral: put the service and initialize it
private void putServiceAndInit(Service service) throws NacosException {
// Put: add the service to Map<String, Map<String, Service>> serviceMap
putService(service);
// Initialize the service: see 2.3.1.2, the Service object
service.init();
// Ephemeral instances
// Consistency service: keeps Service data consistent across the Nacos cluster
// listen(String key, RecordListener listener)
// key -> com.alibaba.nacos.naming.iplist.ephemeral. + namespaceId + ## + serviceName
// listener -> this Service object
// See the follow-up article on the Nacos cluster consistency algorithm Raft
consistencyService.listen(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), true), service);
// Persistent instances
// Consistency service: keeps Service data consistent across the Nacos cluster
// listen(String key, RecordListener listener)
// key -> com.alibaba.nacos.naming.iplist. + namespaceId + ## + serviceName
// listener -> this Service object
// See the follow-up article on the Nacos cluster consistency algorithm Raft
consistencyService.listen(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), false), service);
Loggers.SRV_LOG.info("[NEW-SERVICE] {}", service.toJSON());
}
// Put: add the service to Map<String, Map<String, Service>> serviceMap
public void putService(Service service) {
if (!serviceMap.containsKey(service.getNamespaceId())) {
synchronized (putServiceLock) {
if (!serviceMap.containsKey(service.getNamespaceId())) {
serviceMap.put(service.getNamespaceId(), new ConcurrentHashMap<>(16));
}
}
}
serviceMap.get(service.getNamespaceId()).put(service.getName(), service);
}
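// Persistent: add or replace the service
public void addOrReplaceService(Service service) throws NacosException {
// Forward the persistent service's registration to the cluster Leader for processing; Followers only handle non-transactional requests
// put(String key, Record value)
// key -> com.alibaba.nacos.naming.domains.meta. + namespaceId + ## + serviceName
// value -> this Service object
// See the follow-up article on the Nacos cluster consistency algorithm Raft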
consistencyService.put(KeyBuilder.buildServiceMetaKey(service.getNamespaceId(), service.getName()), service);
}
// Add instances to this Service
public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips) throws NacosException {
// Instance list key
// Ephemeral: com.alibaba.nacos.naming.iplist.ephemeral. + namespaceId + ## + serviceName
// Persistent: com.alibaba.nacos.naming.iplist. + namespaceId + ## + serviceName
String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);
// Look up the Service in the cache: Map<String, Map<String, Service>> serviceMap
Service service = getService(namespaceId, serviceName);
// Merge the new instances into the Service's instance list
List<Instance> instanceList = addIpAddresses(service, ephemeral, ips);
Instances instances = new Instances();
instances.setInstanceList(instanceList);
// Store the list in the consistency service (its datums map)
consistencyService.put(key, instances);
}
// Add instances to the Service
public List<Instance> addIpAddresses(Service service, boolean ephemeral, Instance... ips) throws NacosException {
return updateIpAddresses(service, UtilsAndCommons.UPDATE_INSTANCE_ACTION_ADD, ephemeral, ips);
}
public List<Instance> updateIpAddresses(Service service, String action, boolean ephemeral, Instance... ips) throws NacosException {
// Fetch the Datum from the consistency service (its datums map)
// Instance list key
// Ephemeral: com.alibaba.nacos.naming.iplist.ephemeral. + namespaceId + ## + serviceName
// Persistent: com.alibaba.nacos.naming.iplist. + namespaceId + ## + serviceName
Datum datum = consistencyService.get(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), ephemeral));
Map<String, Instance> oldInstanceMap = new HashMap<>(16);
// All Instance objects currently held by this Service
List<Instance> currentIPs = service.allIPs(ephemeral);
Map<String, Instance> map = new ConcurrentHashMap<>(currentIPs.size());
for (Instance instance : currentIPs) {
map.put(instance.toIPAddr(), instance);
}
if (datum != null) {
// Build the map of existing (old) instances
oldInstanceMap = setValid(((Instances) datum.value).getInstanceList(), map);
}
// use HashMap for deep copy:
HashMap<String, Instance> instanceMap = new HashMap<>(oldInstanceMap.size());
instanceMap.putAll(oldInstanceMap);
for (Instance instance : ips) {
if (!service.getClusterMap().containsKey(instance.getClusterName())) {
Cluster cluster = new Cluster(instance.getClusterName(), service);
// Initialize the cluster: see 2.3.1.3, the Cluster object
cluster.init();
service.getClusterMap().put(instance.getClusterName(), cluster);
Loggers.SRV_LOG.warn("cluster: {} not found, ip: {}, will create new cluster with default configuration.",
instance.getClusterName(), instance.toJSON());
}
// Add or remove
if (UtilsAndCommons.UPDATE_INSTANCE_ACTION_REMOVE.equals(action)) {
instanceMap.remove(instance.getDatumKey());
} else {
instanceMap.put(instance.getDatumKey(), instance);
}
}
if (instanceMap.size() <= 0 && UtilsAndCommons.UPDATE_INSTANCE_ACTION_ADD.equals(action)) {
throw new IllegalArgumentException("ip list can not be empty, service: " + service.getName() + ", ip list: "
+ JSON.toJSONString(instanceMap.values()));
}
return new ArrayList<>(instanceMap.values());
}
}
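The registry structure used by ServiceManager is a two-level map: namespace ID first, then service name. The sketch below shows the same shape under stated assumptions: `ServiceRegistryDemo` is a hypothetical name, a String stands in for the Service object, and `computeIfAbsent` replaces the explicit lock and double check that putService uses in the listing above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ServiceRegistryDemo {

    // namespaceId -> (serviceName -> service); a String stands in for the Service object.
    private final Map<String, Map<String, String>> serviceMap = new ConcurrentHashMap<>();

    // Creates the inner map on first use, atomically, then stores the service.
    void putService(String namespaceId, String serviceName, String service) {
        serviceMap.computeIfAbsent(namespaceId, ns -> new ConcurrentHashMap<>())
                  .put(serviceName, service);
    }

    // Returns null when either the namespace or the service is unknown.
    String getService(String namespaceId, String serviceName) {
        Map<String, String> services = serviceMap.get(namespaceId);
        return services == null ? null : services.get(serviceName);
    }

    public static void main(String[] args) {
        ServiceRegistryDemo registry = new ServiceRegistryDemo();
        registry.putService("public", "DEFAULT_GROUP@@demo", "demo-service");
        System.out.println(registry.getService("public", "DEFAULT_GROUP@@demo"));
    }
}
```

Both approaches guarantee that the inner map for a namespace is created only once; `computeIfAbsent` simply moves the synchronization into ConcurrentHashMap.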
public class Service extends com.alibaba.nacos.api.naming.pojo.Service implements Record, RecordListener<Instances> {
// Checks whether this service's instances are still alive, based on the time of the last client heartbeat
@JSONField(serialize = false)
private ClientBeatCheckTask clientBeatCheckTask = new ClientBeatCheckTask(this);
// Cluster map: key -> cluster name
private Map<String, Cluster> clusterMap = new HashMap<>();
// Update instance data
public void onChange(String key, Instances value) throws Exception{
......
}
// Batch update
public void updateIPs(Collection<Instance> instances, boolean ephemeral){
......
}
// Initialize
public void init() {
// Start the client heartbeat check task
HealthCheckReactor.scheduleCheck(clientBeatCheckTask);
for (Map.Entry<String, Cluster> entry : clusterMap.entrySet()) {
entry.getValue().setService(this);
entry.getValue().init();
}
}
// Return all instances
public List<Instance> allIPs(){
......
}
}
public class Cluster extends com.alibaba.nacos.api.naming.pojo.Cluster implements Cloneable {
// Health check task
@JSONField(serialize = false)
private HealthCheckTask checkTask;
// Storage for persistent instances
@JSONField(serialize = false)
private Set<Instance> persistentInstances = new HashSet<>();
// Storage for ephemeral instances
@JSONField(serialize = false)
private Set<Instance> ephemeralInstances = new HashSet<>();
// Initialize
public void init() {
if (inited) {
return;
}
// Health check task
checkTask = new HealthCheckTask(this);
// Start the health check task
HealthCheckReactor.scheduleCheck(checkTask);
inited = true;
}
}
List<Instance> allInstances = namingService.getAllInstances("serviceName");
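Every getAllInstances overload in namingService ends up calling List<Instance> getAllInstances(String serviceName, String groupName, List<String> clusters, boolean subscribe).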
public class NacosNamingService implements NamingService {
@Override
public List<Instance> getAllInstances(String serviceName, String groupName, List<String> clusters, boolean subscribe) throws NacosException {
ServiceInfo serviceInfo;
// Whether to subscribe to changes of the service: true by default
if (subscribe) {
// Fetch the service and subscribe to its changes
serviceInfo = hostReactor.getServiceInfo(NamingUtils.getGroupedName(serviceName, groupName), StringUtils.join(clusters, ","));
} else {
// Sends the request straight to the server through NamingProxy to fetch the service info
// GET url=/nacos/v1/ns/instance/list
serviceInfo = hostReactor.getServiceInfoDirectlyFromServer(NamingUtils.getGroupedName(serviceName, groupName), StringUtils.join(clusters, ","));
}
List<Instance> list;
// Extract and return the instances
if (serviceInfo == null || CollectionUtils.isEmpty(list = serviceInfo.getHosts())) {
return new ArrayList<Instance>();
}
return list;
}
}
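As covered above, HostReactor fetches registration data from the server, handles failover, and periodically writes backups to disk. The following looks at its methods in detail.
getServiceInfo returns the service info from the local cache, the failover store, or the remote server.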
public class HostReactor {
public ServiceInfo getServiceInfo(final String serviceName, final String clusters) {
NAMING_LOGGER.debug("failover-mode: " + failoverReactor.isFailoverSwitch());
// serviceName + @@ + clusters
String key = ServiceInfo.getKey(serviceName, clusters);
// Is failover switched on?
if (failoverReactor.isFailoverSwitch()) {
// Read the service info from the failover store
return failoverReactor.getService(key);
}
// Read the service info from the cache: Map<String, ServiceInfo> serviceInfoMap
ServiceInfo serviceObj = getServiceInfo0(serviceName, clusters);
if (null == serviceObj) {
// Create an empty ServiceInfo
serviceObj = new ServiceInfo(serviceName, clusters);
// Put it into the serviceInfoMap cache
serviceInfoMap.put(serviceObj.getKey(), serviceObj);
// Mark serviceName as updating: under high concurrency this keeps the number of threads issuing GET /nacos/v1/ns/instance/list at the same moment as small as possible,
// which protects the Nacos server
updatingMap.put(serviceName, new Object());
// Update this service right away
updateServiceNow(serviceName, clusters);
updatingMap.remove(serviceName);
} else if (updatingMap.containsKey(serviceName)) { // an update is already in progress
if (UPDATE_HOLD_INTERVAL > 0) { // 5000 > 0, so always true
// hold a moment waiting for update finish
// High concurrency: block the other threads that want this service until they are notified
synchronized (serviceObj) {
try {
// Wait at most 5000 ms, i.e. 5s
serviceObj.wait(UPDATE_HOLD_INTERVAL);
} catch (InterruptedException e) {
NAMING_LOGGER.error("[getServiceInfo] serviceName:" + serviceName + ", clusters:" + clusters, e);
}
}
}
}
// Scheduled update: add an UpdateTask that periodically queries the server and refreshes the in-memory and on-disk caches
scheduleUpdateIfAbsent(serviceName, clusters);
return serviceInfoMap.get(serviceObj.getKey());
}
}
updateServiceNow updates the service info immediately.
It sends a GET request to url=/nacos/v1/ns/instance/list.
public class HostReactor {
public void updateServiceNow(String serviceName, String clusters) {
ServiceInfo oldService = getServiceInfo0(serviceName, clusters);
try {
// Send a GET request to url=/nacos/v1/ns/instance/list
String result = serverProxy.queryList(serviceName, clusters, pushReceiver.getUDPPort(), false);
if (StringUtils.isNotEmpty(result)) {
// Process the response: update the cache, notify event listeners, etc.
processServiceJSON(result);
}
} catch (Exception e) {
NAMING_LOGGER.error("[NA] failed to update serviceName: " + serviceName, e);
} finally {
if (oldService != null) {
synchronized (oldService) {
// Wake up the waiting threads
// This pairs with the blocking wait in hostReactor.getServiceInfo:
// synchronized (serviceObj) {
// try {
// Wait at most 5000 ms, i.e. 5s
// serviceObj.wait(UPDATE_HOLD_INTERVAL);
// } catch (InterruptedException e) {
// NAMING_LOGGER.error("[getServiceInfo] serviceName:" + serviceName + ", clusters:" + clusters, e);
// }
// }
oldService.notifyAll();
}
}
}
}
}
public class HostReactor {
// Process the response
public ServiceInfo processServiceJSON(String json) {
ServiceInfo serviceInfo = JSON.parseObject(json, ServiceInfo.class);
ServiceInfo oldService = serviceInfoMap.get(serviceInfo.getKey());
// Ignore empty or invalid pushes
if (serviceInfo.getHosts() == null || !serviceInfo.validate()) {
//empty or error push, just ignore
return oldService;
}
boolean changed = false;
// If oldService exists, the old and new data must be compared and merged
if (oldService != null) {
// getLastRefTime(): time of the last refresh
if (oldService.getLastRefTime() > serviceInfo.getLastRefTime()) {
NAMING_LOGGER.warn("out of date data received, old-t: " + oldService.getLastRefTime()
+ ", new-t: " + serviceInfo.getLastRefTime());
}
// Update the in-memory cache
serviceInfoMap.put(serviceInfo.getKey(), serviceInfo);
/**************** start: compare oldHostMap with the current serviceInfo and merge the differences **************/
Map<String, Instance> oldHostMap = new HashMap<String, Instance>(oldService.getHosts().size());
for (Instance host : oldService.getHosts()) {
oldHostMap.put(host.toInetAddr(), host);
}
Map<String, Instance> newHostMap = new HashMap<String, Instance>(serviceInfo.getHosts().size());
for (Instance host : serviceInfo.getHosts()) {
newHostMap.put(host.toInetAddr(), host);
}
Set<Instance> modHosts = new HashSet<Instance>();
Set<Instance> newHosts = new HashSet<Instance>();
Set<Instance> remvHosts = new HashSet<Instance>();
List<Map.Entry<String, Instance>> newServiceHosts = new ArrayList<Map.Entry<String, Instance>>(
newHostMap.entrySet());
for (Map.Entry<String, Instance> entry : newServiceHosts) {
Instance host = entry.getValue();
String key = entry.getKey();
if (oldHostMap.containsKey(key) && !StringUtils.equals(host.toString(),
oldHostMap.get(key).toString())) {
modHosts.add(host);
continue;
}
if (!oldHostMap.containsKey(key)) {
newHosts.add(host);
}
}
for (Map.Entry<String, Instance> entry : oldHostMap.entrySet()) {
Instance host = entry.getValue();
String key = entry.getKey();
if (newHostMap.containsKey(key)) {
continue;
}
if (!newHostMap.containsKey(key)) {
remvHosts.add(host);
}
}
if (newHosts.size() > 0) {
changed = true;
NAMING_LOGGER.info("new ips(" + newHosts.size() + ") service: "
+ serviceInfo.getKey() + " -> " + JSON.toJSONString(newHosts));
}
if (remvHosts.size() > 0) {
changed = true;
NAMING_LOGGER.info("removed ips(" + remvHosts.size() + ") service: "
+ serviceInfo.getKey() + " -> " + JSON.toJSONString(remvHosts));
}
if (modHosts.size() > 0) {
changed = true;
NAMING_LOGGER.info("modified ips(" + modHosts.size() + ") service: "
+ serviceInfo.getKey() + " -> " + JSON.toJSONString(modHosts));
}
serviceInfo.setJsonFromServer(json);
/**************** end: compare oldHostMap with the current serviceInfo and merge the differences **************/
// Did anything change?
if (newHosts.size() > 0 || remvHosts.size() > 0 || modHosts.size() > 0) {
// Something changed: publish a service-changed event to notify all listeners
eventDispatcher.serviceChanged(serviceInfo);
// Update the disk cache
DiskCache.write(serviceInfo, cacheDir);
}
} else {
// No oldService: update the in-memory cache directly, publish a service-changed event, and update the disk cache
changed = true;
NAMING_LOGGER.info("init new ips(" + serviceInfo.ipCount() + ") service: " + serviceInfo.getKey() + " -> " + JSON
.toJSONString(serviceInfo.getHosts()));
// Update the in-memory cache
serviceInfoMap.put(serviceInfo.getKey(), serviceInfo);
// Publish the service-changed event
eventDispatcher.serviceChanged(serviceInfo);
serviceInfo.setJsonFromServer(json);
// Update the disk cache
DiskCache.write(serviceInfo, cacheDir);
}
MetricsMonitor.getServiceInfoMapSizeMonitor().set(serviceInfoMap.size());
if (changed) {
NAMING_LOGGER.info("current ips:(" + serviceInfo.ipCount() + ") service: " + serviceInfo.getKey() +
" -> " + JSON.toJSONString(serviceInfo.getHosts()));
}
return serviceInfo;
}
}
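The comparison in the middle of processServiceJSON classifies each host as new, removed, or modified by comparing two maps keyed by address. A minimal sketch of that classification follows; `HostDiffDemo` is a hypothetical name, and a plain String stands in for the Instance (the listing above compares `toString()` output in the same way).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class HostDiffDemo {

    final Set<String> added = new TreeSet<>();
    final Set<String> removed = new TreeSet<>();
    final Set<String> modified = new TreeSet<>();

    // Keys are host addresses (ip:port); values describe the instance.
    static HostDiffDemo diff(Map<String, String> oldHosts, Map<String, String> newHosts) {
        HostDiffDemo result = new HostDiffDemo();
        for (Map.Entry<String, String> entry : newHosts.entrySet()) {
            String key = entry.getKey();
            if (!oldHosts.containsKey(key)) {
                result.added.add(key);
            } else if (!entry.getValue().equals(oldHosts.get(key))) {
                result.modified.add(key);
            }
        }
        for (String key : oldHosts.keySet()) {
            if (!newHosts.containsKey(key)) {
                result.removed.add(key);
            }
        }
        return result;
    }

    // Listeners only need to be notified when at least one set is non-empty.
    boolean changed() {
        return !added.isEmpty() || !removed.isEmpty() || !modified.isEmpty();
    }

    public static void main(String[] args) {
        Map<String, String> oldHosts = new HashMap<>();
        oldHosts.put("10.0.0.1:8080", "weight=1");
        Map<String, String> newHosts = new HashMap<>();
        newHosts.put("10.0.0.2:8080", "weight=1");
        HostDiffDemo diff = diff(oldHosts, newHosts);
        System.out.println("added=" + diff.added + " removed=" + diff.removed + " modified=" + diff.modified);
    }
}
```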
scheduleUpdateIfAbsent adds an UpdateTask that periodically queries the server and refreshes the in-memory and on-disk caches.
public class HostReactor {
public void scheduleUpdateIfAbsent(String serviceName, String clusters) {
// Already scheduled?
if (futureMap.get(ServiceInfo.getKey(serviceName, clusters)) != null) {
return;
}
synchronized (futureMap) {
// Double check under the lock: already scheduled?
if (futureMap.get(ServiceInfo.getKey(serviceName, clusters)) != null) {
return;
}
// Add an UpdateTask
ScheduledFuture<?> future = addTask(new UpdateTask(serviceName, clusters));
futureMap.put(ServiceInfo.getKey(serviceName, clusters), future);
}
}
}
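The check, lock, re-check sequence above ensures that each service key gets exactly one update task even when several threads ask at once. A minimal sketch of the same idiom follows; `ScheduleIfAbsentDemo` is a hypothetical name, and the `Supplier` stands in for `addTask(new UpdateTask(...))` and must return a non-null handle.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

final class ScheduleIfAbsentDemo {

    // key -> handle of the scheduled task (a ScheduledFuture in the listing above).
    private final Map<String, Object> futureMap = new ConcurrentHashMap<>();

    // Returns true only for the call that actually scheduled the task.
    boolean scheduleIfAbsent(String key, Supplier<Object> scheduleTask) {
        if (futureMap.get(key) != null) {
            return false; // fast path without taking the lock
        }
        synchronized (futureMap) {
            if (futureMap.get(key) != null) {
                return false; // another thread scheduled it while we waited for the lock
            }
            futureMap.put(key, scheduleTask.get());
            return true;
        }
    }

    public static void main(String[] args) {
        ScheduleIfAbsentDemo demo = new ScheduleIfAbsentDemo();
        System.out.println(demo.scheduleIfAbsent("demo@@c1", () -> "handle"));
        System.out.println(demo.scheduleIfAbsent("demo@@c1", () -> "handle"));
    }
}
```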
UpdateTask keeps the local cache consistent with the data on the Nacos server.
public class HostReactor {
public class UpdateTask implements Runnable {
long lastRefTime = Long.MAX_VALUE;
private String clusters;
private String serviceName;
public UpdateTask(String serviceName, String clusters) {
this.serviceName = serviceName;
this.clusters = clusters;
}
@Override
public void run() {
try {
ServiceInfo serviceObj = serviceInfoMap.get(ServiceInfo.getKey(serviceName, clusters));
// Not cached yet: update directly
if (serviceObj == null) {
// Update the service info immediately
updateServiceNow(serviceName, clusters);
// Re-schedule this UpdateTask, DEFAULT_DELAY=1000
executor.schedule(this, DEFAULT_DELAY, TimeUnit.MILLISECONDS);
return;
}
// Condition: the cached lastRefTime <= the lastRefTime this task recorded on its previous run
if (serviceObj.getLastRefTime() <= lastRefTime) {
// Update the service info immediately
updateServiceNow(serviceName, clusters);
serviceObj = serviceInfoMap.get(ServiceInfo.getKey(serviceName, clusters));
} else {
// if serviceName already updated by push, we should not override it
// since the push data may be different from pull through force push
// Already updated: just send one GET request to url=/nacos/v1/ns/instance/list
refreshOnly(serviceName, clusters);
}
// Re-schedule this UpdateTask
executor.schedule(this, serviceObj.getCacheMillis(), TimeUnit.MILLISECONDS);
// Record the last refresh time
lastRefTime = serviceObj.getLastRefTime();
} catch (Throwable e) {
NAMING_LOGGER.warn("[NA] failed to update serviceName: " + serviceName, e);
}
}
}
}
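getServiceInfoDirectlyFromServer sends the request straight to the server through NamingProxy and returns the service info.
It sends a GET request to url=/nacos/v1/ns/instance/list.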
public class HostReactor {
public ServiceInfo getServiceInfoDirectlyFromServer(final String serviceName, final String clusters) throws NacosException {
// Send GET /nacos/v1/ns/instance/list
String result = serverProxy.queryList(serviceName, clusters, 0, false);
if (StringUtils.isNotEmpty(result)) {
// Parse and return the result
return JSON.parseObject(result, ServiceInfo.class);
}
return null;
}
}
The client queries the current service info with GET /nacos/v1/ns/instance/list. The server handles the request as follows:
@RestController
@RequestMapping(UtilsAndCommons.NACOS_NAMING_CONTEXT + "/instance")
public class InstanceController {
/**
* Fetch service info according to the request parameters
*/
@RequestMapping(value = "/list", method = RequestMethod.GET)
public JSONObject list(HttpServletRequest request) throws Exception {
String namespaceId = WebUtils.optional(request, CommonParams.NAMESPACE_ID,
Constants.DEFAULT_NAMESPACE_ID);
String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
String agent = request.getHeader("Client-Version");
if (StringUtils.isBlank(agent)) {
agent = request.getHeader("User-Agent");
}
String clusters = WebUtils.optional(request, "clusters", StringUtils.EMPTY);
String clientIP = WebUtils.optional(request, "clientIP", StringUtils.EMPTY);
// UDP port the client listens on for pushes
Integer udpPort = Integer.parseInt(WebUtils.optional(request, "udpPort", "0"));
String env = WebUtils.optional(request, "env", StringUtils.EMPTY);
boolean isCheck = Boolean.parseBoolean(WebUtils.optional(request, "isCheck", "false"));
String app = WebUtils.optional(request, "app", StringUtils.EMPTY);
String tenant = WebUtils.optional(request, "tid", StringUtils.EMPTY);
boolean healthyOnly = Boolean.parseBoolean(WebUtils.optional(request, "healthyOnly", "false"));
// Fetch the service info; if a UDP port was supplied, also register a UDP push client
return doSrvIPXT(namespaceId, serviceName, agent, clusters, clientIP, udpPort, env, isCheck, app, tenant, healthyOnly);
}
}
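To make the request concrete, the following sketch assembles the query URL from the parameters the controller reads. It is illustrative only: `InstanceListUrl` is a hypothetical helper, the parameter subset is chosen for brevity, and the real client builds its request inside NamingProxy.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical URL builder for GET /nacos/v1/ns/instance/list; not the Nacos client code.
class InstanceListUrl {
    static String build(String serverAddr, String namespaceId, String serviceName,
                        String clusters, int udpPort, boolean healthyOnly) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("namespaceId", namespaceId);
        params.put("serviceName", serviceName);
        params.put("clusters", clusters);
        params.put("udpPort", String.valueOf(udpPort));
        params.put("healthyOnly", String.valueOf(healthyOnly));

        StringBuilder sb = new StringBuilder("http://").append(serverAddr)
                .append("/nacos/v1/ns/instance/list?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            first = false;
            sb.append(e.getKey()).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```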
BeatReactor sends heartbeats; the server uses the time of the last heartbeat to decide whether an instance is online or offline.
public class BeatReactor {
public BeatReactor(NamingProxy serverProxy, int threadCount) {
this.serverProxy = serverProxy;
executorService = new ScheduledThreadPoolExecutor(threadCount, new ThreadFactory() {
@Override
public Thread newThread(Runnable r) {
Thread thread = new Thread(r);
thread.setDaemon(true);
thread.setName("com.alibaba.nacos.naming.beat.sender");
return thread;
}
});
}
// When a service is registered as an ephemeral instance, heartbeat info must be added
public void addBeatInfo(String serviceName, BeatInfo beatInfo) {
NAMING_LOGGER.info("[BEAT] adding beat: {} to beat map.", beatInfo);
dom2Beat.put(buildKey(serviceName, beatInfo.getIp(), beatInfo.getPort()), beatInfo);
// Schedule the heartbeat task; beatInfo.getPeriod() is the heartbeat interval
executorService.schedule(new BeatTask(beatInfo), beatInfo.getPeriod(), TimeUnit.MILLISECONDS);
MetricsMonitor.getDom2BeatSizeMonitor().set(dom2Beat.size());
}
// Beat Task
class BeatTask implements Runnable {
BeatInfo beatInfo;
public BeatTask(BeatInfo beatInfo) {
this.beatInfo = beatInfo;
}
@Override
public void run() {
if (beatInfo.isStopped()) {
return;
}
// Call sendBeat in NamingProxy to send the heartbeat, url=/instance/beat
long result = serverProxy.sendBeat(beatInfo);
// Delay until the next heartbeat: the value returned by the server if positive, otherwise the default period
long nextTime = result > 0 ? result : beatInfo.getPeriod();
// Schedule the next heartbeat
executorService.schedule(new BeatTask(beatInfo), nextTime, TimeUnit.MILLISECONDS);
}
}
}
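The self-rescheduling pattern used by BeatTask can be tried in isolation. The sketch below is an assumption-laden miniature: `HeartbeatLoop`, `nextDelay`, and `demo` are hypothetical names, and the `LongSupplier` stands in for `serverProxy.sendBeat`.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical miniature of BeatReactor's loop; not the Nacos class.
class HeartbeatLoop {
    private final ScheduledExecutorService executor;
    private final LongSupplier sendBeat; // returns the server-suggested interval in ms, or <= 0 if none
    private final long periodMillis;
    private volatile boolean stopped;

    HeartbeatLoop(ScheduledExecutorService executor, LongSupplier sendBeat, long periodMillis) {
        this.executor = executor;
        this.sendBeat = sendBeat;
        this.periodMillis = periodMillis;
    }

    // Same rule as in BeatTask: prefer the server's value when it is positive.
    static long nextDelay(long suggestedMillis, long periodMillis) {
        return suggestedMillis > 0 ? suggestedMillis : periodMillis;
    }

    void start() {
        executor.schedule(this::beatOnce, periodMillis, TimeUnit.MILLISECONDS);
    }

    void stop() {
        stopped = true;
    }

    private void beatOnce() {
        if (stopped) {
            return;
        }
        long suggested = sendBeat.getAsLong();
        // Each run schedules the next one, so the interval can change between beats.
        executor.schedule(this::beatOnce, nextDelay(suggested, periodMillis), TimeUnit.MILLISECONDS);
    }

    // Runs the loop until `beats` heartbeats were sent; returns false if that takes over 2 seconds.
    static boolean demo(int beats) {
        ScheduledExecutorService ex = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch latch = new CountDownLatch(beats);
        HeartbeatLoop loop = new HeartbeatLoop(ex, () -> { latch.countDown(); return 5L; }, 10L);
        loop.start();
        try {
            return latch.await(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            loop.stop();
            ex.shutdownNow();
        }
    }
}
```

Scheduling one-shot tasks instead of using `scheduleAtFixedRate` is what lets the server adjust the interval through the value returned by each heartbeat.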
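Each Service object on the server has a ClientBeatCheckTask, the client heartbeat check task.
It checks and updates the state of ephemeral instances and removes those that have expired.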
public class ClientBeatCheckTask implements Runnable {
private Service service;
public ClientBeatCheckTask(Service service) {
this.service = service;
}
@JSONField(serialize = false)
public PushService getPushService() {
return SpringContext.getAppContext().getBean(PushService.class);
}
@JSONField(serialize = false)
public DistroMapper getDistroMapper() {
return SpringContext.getAppContext().getBean(DistroMapper.class);
}
public GlobalConfig getGlobalConfig() {
return SpringContext.getAppContext().getBean(GlobalConfig.class);
}
public String taskKey() {
return service.getName();
}
@Override
public void run() {
try {
// Only the server node that is responsible for this service (according to DistroMapper) runs the check
if (!getDistroMapper().responsible(service.getName())) {
return;
}
// Get all ephemeral instances of this service
List<Instance> instances = service.allIPs(true);
// first set health status of instances:
for (Instance instance : instances) {
// now - last heartbeat time > heartbeat timeout of this instance
// true: the instance has timed out
if (System.currentTimeMillis() - instance.getLastBeat() > instance.getInstanceHeartBeatTimeOut()) {
// Skip instances that are marked
if (!instance.isMarked()) {
if (instance.isHealthy()) {
// Mark the instance as unhealthy
instance.setHealthy(false);
Loggers.EVT_LOG.info("{POS} {IP-DISABLED} valid: {}:{}@{}@{}, region: {}, msg: client timeout after {}, last beat: {}",
instance.getIp(), instance.getPort(), instance.getClusterName(), service.getName(),
UtilsAndCommons.LOCALHOST_SITE, instance.getInstanceHeartBeatTimeOut(), instance.getLastBeat());
// Publish a ServiceChangeEvent (service info changed)
getPushService().serviceChanged(service);
// Publish an InstanceHeartbeatTimeoutEvent (instance heartbeat timed out)
SpringContext.getAppContext().publishEvent(new InstanceHeartbeatTimeoutEvent(this, instance));
}
}
}
}
if (!getGlobalConfig().isExpireInstance()) {
return;
}
// then remove obsolete instances:
// Remove expired instances
for (Instance instance : instances) {
if (instance.isMarked()) {
continue;
}
// now - last heartbeat time > delete timeout of this instance
if (System.currentTimeMillis() - instance.getLastBeat() > instance.getIpDeleteTimeout()) {
// delete instance
Loggers.SRV_LOG.info("[AUTO-DELETE-IP] service: {}, ip: {}", service.getName(), JSON.toJSONString(instance));
// Delete the instance by calling the local instance-delete endpoint
deleteIP(instance);
}
}
} catch (Exception e) {
Loggers.SRV_LOG.warn("Exception while processing client beat time out.", e);
}
}
// Delete the instance by calling the local instance-delete endpoint
private void deleteIP(Instance instance) {
try {
NamingProxy.Request request = NamingProxy.Request.newRequest();
request.appendParam("ip", instance.getIp())
.appendParam("port", String.valueOf(instance.getPort()))
.appendParam("ephemeral", "true")
.appendParam("clusterName", instance.getClusterName())
.appendParam("serviceName", service.getName())
.appendParam("namespaceId", service.getNamespaceId());
String url = "http://127.0.0.1:" + RunningConfig.getServerPort() + RunningConfig.getContextPath()
+ UtilsAndCommons.NACOS_NAMING_CONTEXT + "/instance?" + request.toUrl();
// delete instance asynchronously:
HttpClient.asyncHttpDelete(url, null, null, new AsyncCompletionHandler() {
@Override
public Object onCompleted(Response response) throws Exception {
if (response.getStatusCode() != HttpURLConnection.HTTP_OK) {
Loggers.SRV_LOG.error("[IP-DEAD] failed to delete ip automatically, ip: {}, caused {}, resp code: {}",
instance.toJSON(), response.getResponseBody(), response.getStatusCode());
}
return null;
}
});
} catch (Exception e) {
Loggers.SRV_LOG.error("[IP-DEAD] failed to delete ip automatically, ip: {}, error: {}", instance.toJSON(), e);
}
}
}
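The two thresholds in `run()` can be summarized in one small function. This is a simplification for illustration: `BeatCheck` and `Verdict` are hypothetical names, the sketch ignores the `isMarked()` and `isExpireInstance()` switches, and the 15 s / 30 s values used in the usage note are example inputs rather than values read from the code above.

```java
// Hypothetical summary of the two checks in ClientBeatCheckTask; not the Nacos class.
class BeatCheck {
    enum Verdict { HEALTHY, UNHEALTHY, DELETE }

    static Verdict check(long nowMillis, long lastBeatMillis,
                         long heartBeatTimeoutMillis, long deleteTimeoutMillis) {
        long silence = nowMillis - lastBeatMillis; // time since the last heartbeat
        if (silence > deleteTimeoutMillis) {
            return Verdict.DELETE;    // silent for too long: remove the instance
        }
        if (silence > heartBeatTimeoutMillis) {
            return Verdict.UNHEALTHY; // missed heartbeats: keep it, but mark it unhealthy
        }
        return Verdict.HEALTHY;
    }
}
```

With a 15 s heartbeat timeout and a 30 s delete timeout, an instance that has been silent for 20 s is reported as `UNHEALTHY`, and one that has been silent for 40 s as `DELETE`. Both comparisons are strict, as in the original code.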
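When service info changes (registration, deregistration, heartbeat timeout, and so on), the Nacos server publishes a ServiceChangeEvent. PushService receives the event and notifies all subscribed clients over UDP.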
@Component
public class PushService implements ApplicationContextAware, ApplicationListener<ServiceChangeEvent> {
// Cache of UDP push clients
private static ConcurrentMap<String, ConcurrentMap<String, PushClient>> clientMap
= new ConcurrentHashMap<String, ConcurrentMap<String, PushClient>>();
/**
* Listens for ServiceChangeEvent
*/
@Override
public void onApplicationEvent(ServiceChangeEvent event) {
Service service = event.getService();
String serviceName = service.getName();
String namespaceId = service.getNamespaceId();
Future future = udpSender.schedule(new Runnable() {
@Override
public void run() {
try {
Loggers.PUSH.info(serviceName + " is changed, add it to push queue.");
// Get the PushClients registered for this service
ConcurrentMap<String, PushClient> clients = clientMap.get(UtilsAndCommons.assembleFullServiceName(namespaceId, serviceName));
if (MapUtils.isEmpty(clients)) {
return;
}
Map<String, Object> cache = new HashMap<>(16);
long lastRefTime = System.nanoTime();
for (PushClient client : clients.values()) {
// Drop "zombie" clients, that is, connections that are no longer valid
if (client.zombie()) {
Loggers.PUSH.debug("client is zombie: " + client.toString());
clients.remove(client.toString());
Loggers.PUSH.debug("client is zombie: " + client.toString());
continue;
}
Receiver.AckEntry ackEntry;
Loggers.PUSH.debug("push serviceName: {} to client: {}", serviceName, client.toString());
// Build the push cache key
String key = getPushCacheKey(serviceName, client.getIp(), client.getAgent());
byte[] compressData = null;
Map<String, Object> data = null;
if (switchDomain.getDefaultPushCacheMillis() >= 20000 && cache.containsKey(key)) {
org.javatuples.Pair pair = (org.javatuples.Pair) cache.get(key);
compressData = (byte[]) (pair.getValue0());
data = (Map<String, Object>) pair.getValue1();
Loggers.PUSH.debug("[PUSH-CACHE] cache hit: {}:{}", serviceName, client.getAddrStr());
}
// Prepare (build) the packet to send
if (compressData != null) {
ackEntry = prepareAckEntry(client, compressData, data, lastRefTime);
} else {
ackEntry = prepareAckEntry(client, prepareHostsData(client), lastRefTime);
if (ackEntry != null) {
cache.put(key, new org.javatuples.Pair<>(ackEntry.origin.getData(), ackEntry.data));
}
}
Loggers.PUSH.info("serviceName: {} changed, schedule push for: {}, agent: {}, key: {}",
client.getServiceName(), client.getAddrStr(), client.getAgent(), (ackEntry == null ? null : ackEntry.key));
// Send the packet over UDP
udpPush(ackEntry);
}
} catch (Exception e) {
Loggers.PUSH.error("[NACOS-PUSH] failed to push serviceName: {} to client, error: {}", serviceName, e);
} finally {
futureMap.remove(UtilsAndCommons.assembleFullServiceName(namespaceId, serviceName));
}
}
}, 1000, TimeUnit.MILLISECONDS);
futureMap.put(UtilsAndCommons.assembleFullServiceName(namespaceId, serviceName), future);
}
}
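For readers who have not used UDP from Java, the sketch below shows the bare mechanics of such a push on the loopback interface. It is only an illustration: `UdpPushDemo` is a hypothetical class, the payload is an arbitrary string, and the real PushService adds compression, acknowledgements, and retries that are left out here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical loopback demo of a UDP push; not the Nacos PushService.
class UdpPushDemo {
    // Sends `payload` to a local receiver socket and returns what the receiver got.
    static String pushOnce(String payload) {
        try (DatagramSocket receiver = new DatagramSocket(0, InetAddress.getLoopbackAddress());
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // UDP gives no delivery guarantee, so never wait forever

            // "Server" side: one datagram addressed to the port the client listens on.
            byte[] data = payload.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(data, data.length,
                    InetAddress.getLoopbackAddress(), receiver.getLocalPort()));

            // "Client" side: receive the datagram and decode it.
            byte[] buffer = new byte[64 * 1024];
            DatagramPacket in = new DatagramPacket(buffer, buffer.length);
            receiver.receive(in);
            return new String(in.getData(), in.getOffset(), in.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because a datagram may be lost, a push like this is best treated as a hint; the periodic UpdateTask pull remains the safety net.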
Note: this document is based on Nacos 1.1.3.
<dependency>
<groupId>com.alibaba.nacos</groupId>
<artifactId>nacos-client</artifactId>
<version>1.1.3</version>
</dependency>
Calling Nacos through the SDK:
public class NacosTest {
public static void main(String[] args) throws Exception {
Properties properties = new Properties();
properties.put("serverAddr", "127.0.0.1:8848"); // Nacos server address; replace with your own
properties.put("namespace", "namespace");
// Create a ConfigService from the given properties
ConfigService configService = NacosFactory.createConfigService(properties);
String dataId = "testId";
String group = "testGroup";
// Get the config by dataId and group
String config = configService.getConfig(dataId, group, 3000);
System.out.println(config);
// Listen for config changes on the server
configService.addListener(dataId, group, new Listener() {
@Override
public Executor getExecutor() {
return null;
}
@Override
public void receiveConfigInfo(String configInfo) {
System.out.println("------- config changed, new content: " + configInfo);
}
});
CountDownLatch countDownLatch = new CountDownLatch(1);
countDownLatch.await();
}
}
public class ConfigFactory {
/**
* Create Config
*
* @param properties init param
* @return ConfigService
* @throws NacosException Exception
*/
public static ConfigService createConfigService(Properties properties) throws NacosException {
try {
Class<?> driverImplClass = Class.forName("com.alibaba.nacos.client.config.NacosConfigService");
// Get the constructor that takes a Properties argument
Constructor constructor = driverImplClass.getConstructor(Properties.class);
// Create the instance reflectively
ConfigService vendorImpl = (ConfigService) constructor.newInstance(properties);
return vendorImpl;
} catch (Throwable e) {
throw new NacosException(NacosException.CLIENT_INVALID_PARAM, e);
}
}
}
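The same reflective-factory pattern can be reproduced without Nacos on the classpath. In this sketch every name (`DemoService`, `DemoServiceImpl`, `DemoFactory`) is hypothetical; the only thing taken from the code above is the technique of looking up a constructor that takes `Properties`.

```java
import java.lang.reflect.Constructor;
import java.util.Properties;

// Hypothetical service interface and implementation, used only for this sketch.
interface DemoService {
    String serverAddr();
}

class DemoServiceImpl implements DemoService {
    private final String serverAddr;

    // getConstructor only finds public constructors, so this one must be public.
    public DemoServiceImpl(Properties properties) {
        this.serverAddr = properties.getProperty("serverAddr");
    }

    @Override
    public String serverAddr() {
        return serverAddr;
    }
}

class DemoFactory {
    // Loads the implementation by name so that callers depend on the interface only.
    static DemoService create(String implClassName, Properties properties) {
        try {
            Class<?> implClass = Class.forName(implClassName);
            Constructor<?> constructor = implClass.getConstructor(Properties.class);
            return (DemoService) constructor.newInstance(properties);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot create " + implClassName, e);
        }
    }
}
```

A likely motivation for this indirection is that the API module does not need a compile-time dependency on the implementation class.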
As the code shows, the factory looks up the constructor that takes a Properties argument and creates the NacosConfigService object reflectively.
The NacosConfigService constructor:
public class NacosConfigService implements ConfigService {
private static final long POST_TIMEOUT = 3000L;
private static final String EMPTY = "";
// HTTP request agent
private HttpAgent agent;
// Long-polling worker
private ClientWorker worker;
private String namespace;
private String encode;
// Config filter chain manager
private ConfigFilterChainManager configFilterChainManager = new ConfigFilterChainManager();
public NacosConfigService(Properties properties) throws NacosException {
String encodeTmp = properties.getProperty(PropertyKeyConst.ENCODE);
if (StringUtils.isBlank(encodeTmp)) {
encode = Constants.ENCODE;
} else {
encode = encodeTmp.trim();
}
// Initialize the namespace
initNamespace(properties);
// Create the HTTP agent
agent = new MetricsHttpAgent(new ServerHttpAgent(properties));
agent.start();
// Create the client worker, which drives the long-polling mechanism
worker = new ClientWorker(agent, configFilterChainManager, properties);
}
}
ClientWorker is the class that implements long polling in Nacos.
public class ClientWorker implements Closeable {
public ClientWorker(final HttpAgent agent, final ConfigFilterChainManager configFilterChainManager, final Properties properties) {
this.agent = agent;
this.configFilterChainManager = configFilterChainManager;
// Initialize the ClientWorker fields from properties
init(properties);
executor = Executors.newScheduledThreadPool(1, new ThreadFactory() {
@Override
public Thread newThread(Runnable r) {
Thread t = new Thread(r);
t.setName("com.alibaba.nacos.client.Worker." + agent.getName());
t.setDaemon(true);
return t;
}
});
executorService = Executors.newScheduledThreadPool(Runtime.getRuntime().availableProcessors(), new ThreadFactory() {
@Override
public Thread newThread(Runnable r) {
Thread t = new Thread(r);
t.setName("com.alibaba.nacos.client.Worker.longPolling." + agent.getName());
t.setDaemon(true);
return t;
}
});
// Periodically check the config info
executor.scheduleWithFixedDelay(new Runnable() {
@Override
public void run() {
try {
checkConfigInfo();
} catch (Throwable e) {
LOGGER.error("[" + agent.getName() + "] [sub-check] rotate check error", e);
}
}
}, 1L, 10L, TimeUnit.MILLISECONDS);
}
// Check the config info
public void checkConfigInfo() {
// Split the listeners into tasks (batches)
int listenerSize = cacheMap.get().size();
// Round up to get the number of batches
int longingTaskCount = (int) Math.ceil(listenerSize / ParamUtil.getPerTaskConfigSize());
if (longingTaskCount > currentLongingTaskCount) {
for (int i = (int) currentLongingTaskCount; i < longingTaskCount; i++) {
/****************长轮询************/
executorService.execute(new LongPollingRunnable(i));
}
currentLongingTaskCount = longingTaskCount;
}
}
// Initialize the ClientWorker fields from properties
private void init(Properties properties) {
// Long-poll timeout: 30 s by default, configurable through configLongPollTimeout, never less than 10 s
timeout = Math.max(ConvertUtils.toInt(properties.getProperty(PropertyKeyConst.CONFIG_LONG_POLL_TIMEOUT),
Constants.CONFIG_LONG_POLL_TIMEOUT), Constants.MIN_CONFIG_LONG_POLL_TIMEOUT);
// Delay before a failed task is retried: 2 s by default
taskPenaltyTime = ConvertUtils
.toInt(properties.getProperty(PropertyKeyConst.CONFIG_RETRY_TIME), Constants.CONFIG_RETRY_TIME);
// Whether to fetch the config from the server when a listener is first added
this.enableRemoteSyncConfig = Boolean
.parseBoolean(properties.getProperty(PropertyKeyConst.ENABLE_REMOTE_SYNC_CONFIG));
}
}
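Two small calculations in ClientWorker deserve a closer look: the number of long-polling tasks and the lower bound on the timeout. The sketch below reproduces both with hypothetical names (`LongPollMath` and its methods). It also shows why the rounding only works with floating-point division; that `ParamUtil.getPerTaskConfigSize()` returns a `double` is an assumption here, since its declaration is not shown above.

```java
// Hypothetical helpers that mirror two calculations in ClientWorker; not Nacos code.
class LongPollMath {
    // Number of batches needed for `listenerSize` configs, rounding up.
    static int taskCount(int listenerSize, int perTaskConfigSize) {
        return (int) Math.ceil(listenerSize / (double) perTaskConfigSize);
    }

    // Pitfall: with two ints the division truncates before Math.ceil ever sees the value.
    static int truncatedTaskCount(int listenerSize, int perTaskConfigSize) {
        return (int) Math.ceil(listenerSize / perTaskConfigSize);
    }

    // The configured timeout is used unless it is below the minimum.
    static int effectiveTimeout(int configuredMillis, int minimumMillis) {
        return Math.max(configuredMillis, minimumMillis);
    }
}
```

With a batch size of 3000, 3001 listeners need two tasks; the truncating variant would start only one and leave the last config unwatched.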
The Nacos config center exposes one core API interface, ConfigService.
public interface ConfigService {
// Get a config
String getConfig(String dataId, String group, long timeoutMs) throws NacosException;
// Get a config and register a listener for it
String getConfigAndSignListener(String dataId, String group, long timeoutMs, Listener listener)
throws NacosException;
// Add a listener
void addListener(String dataId, String group, Listener listener) throws NacosException;
// Publish a config
boolean publishConfig(String dataId, String group, String content) throws NacosException;
// Publish a config with an explicit content type
boolean publishConfig(String dataId, String group, String content, String type) throws NacosException;
// Publish a config together with an MD5 value (compare-and-swap)
boolean publishConfigCas(String dataId, String group, String content, String casMd5) throws NacosException;
// Publish a config together with an MD5 value and an explicit content type
boolean publishConfigCas(String dataId, String group, String content, String casMd5, String type)
throws NacosException;
// Remove a config
boolean removeConfig(String dataId, String group) throws NacosException;
// Remove a listener
void removeListener(String dataId, String group, Listener listener);
// Get the server status
String getServerStatus();
// Shut down
void shutDown() throws NacosException;
}
configService.getConfig(String dataId, String group, long timeoutMs)
Gets a config, with a request timeout.
public class NacosConfigService implements ConfigService {
@Override
public String getConfig(String dataId, String group, long timeoutMs) throws NacosException {
return getConfigInner(namespace, dataId, group, timeoutMs);
}
private String getConfigInner(String tenant, String dataId, String group, long timeoutMs) throws NacosException {
// Fall back to DEFAULT_GROUP when group is empty
group = null2defaultGroup(group);
ParamUtils.checkKeyParam(dataId, group);
ConfigResponse cr = new ConfigResponse();
cr.setDataId(dataId);
cr.setTenant(tenant);
cr.setGroup(group);
// use local config first
// First try the local failover file
String content = LocalConfigInfoProcessor.getFailover(agent.getName(), dataId, group, tenant);
if (content != null) {
LOGGER.warn("[{}] [get-config] get failover ok, dataId={}, group={}, tenant={}, config={}", agent.getName(),
dataId, group, tenant, ContentUtils.truncateContent(content));
cr.setContent(content);
configFilterChainManager.doFilter(null, cr);
content = cr.getContent();
return content;
}
// No local failover file exists
try {
// Fetch the config from the Nacos server
content = worker.getServerConfig(dataId, group, tenant, timeoutMs);
cr.setContent(content);
// Run the filter chain
configFilterChainManager.doFilter(null, cr);
content = cr.getContent();
return content;
} catch (NacosException ioe) {
if (NacosException.NO_RIGHT == ioe.getErrCode()) {
throw ioe;
}
LOGGER.warn("[{}] [get-config] get from server error, dataId={}, group={}, tenant={}, msg={}",
agent.getName(), dataId, group, tenant, ioe.toString());
}
LOGGER.warn("[{}] [get-config] get snapshot ok, dataId={}, group={}, tenant={}, config={}", agent.getName(),
dataId, group, tenant, ContentUtils.truncateContent(content));
// Neither the failover file nor the server delivered a config: read the local snapshot file
content = LocalConfigInfoProcessor.getSnapshot(agent.getName(), dataId, group, tenant);
cr.setContent(content);
configFilterChainManager.doFilter(null, cr);
content = cr.getContent();
return content;
}
}
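The lookup order in `getConfigInner` (failover file, then server, then snapshot) can be expressed compactly with suppliers. This is a sketch under assumptions: `ConfigLookup` is a hypothetical class, a `RuntimeException` stands in for the `NacosException` thrown when the server cannot be reached, and the special handling of `NO_RIGHT` and the filter chain are omitted.

```java
import java.util.function.Supplier;

// Hypothetical condensation of the lookup order in getConfigInner; not the Nacos class.
class ConfigLookup {
    static String getConfig(Supplier<String> failover, Supplier<String> server, Supplier<String> snapshot) {
        // 1. A local failover file always wins, even if the server is reachable.
        String content = failover.get();
        if (content != null) {
            return content;
        }
        try {
            // 2. Otherwise ask the server. A null result is a valid answer (config does not exist).
            return server.get();
        } catch (RuntimeException serverUnavailable) {
            // 3. Only when the server call fails is the last known snapshot used.
            return snapshot.get();
        }
    }
}
```

The snapshot is a fallback for failures, not for missing configs: when the server answers "not found", the caller gets `null` rather than stale snapshot content.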
Reading the config from the local failover file:
String content = LocalConfigInfoProcessor.getFailover(agent.getName(), dataId, group, tenant);
public class LocalConfigInfoProcessor {
public static String getFailover(String serverName, String dataId, String group, String tenant) {
// Locate the local failover file under {user.home}/nacos/config/{serverName}_nacos
File localPath = getFailoverFile(serverName, dataId, group, tenant);
if (!localPath.exists() || !localPath.isFile()) {
return null;
}
try {
// Read the file content
return readFile(localPath);
} catch (IOException ioe) {
LOGGER.error("[" + serverName + "] get failover error, " + localPath, ioe);
return null;
}
}
// Build the path of the failover file
static File getFailoverFile(String serverName, String dataId, String group, String tenant) {
File tmp = new File(LOCAL_SNAPSHOT_PATH, serverName + SUFFIX);
tmp = new File(tmp, FAILOVER_FILE_CHILD_1);
if (StringUtils.isBlank(tenant)) {
tmp = new File(tmp, FAILOVER_FILE_CHILD_2);
} else {
tmp = new File(tmp, FAILOVER_FILE_CHILD_3);
tmp = new File(tmp, tenant);
}
return new File(new File(tmp, group), dataId);
}
}
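To see what path `getFailoverFile` produces, here is a standalone sketch of the same directory layout. The literal directory names (`_nacos`, `data`, `config-data`, `config-data-tenant`) are assumptions about the values of `SUFFIX` and the `FAILOVER_FILE_CHILD_*` constants, which are not shown above; `FailoverPath` itself is a hypothetical class.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical re-creation of the failover path layout; constant values are assumed.
class FailoverPath {
    static Path resolve(String root, String serverName, String dataId, String group, String tenant) {
        // {root}/{serverName}_nacos/data
        Path base = Paths.get(root, serverName + "_nacos", "data");
        if (tenant == null || tenant.trim().isEmpty()) {
            // No tenant: .../config-data/{group}/{dataId}
            base = base.resolve("config-data");
        } else {
            // With tenant: .../config-data-tenant/{tenant}/{group}/{dataId}
            base = base.resolve("config-data-tenant").resolve(tenant);
        }
        return base.resolve(group).resolve(dataId);
    }
}
```

Dropping a file at that location is how an operator can override a config locally, since `getConfig` reads the failover file before it contacts the server.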
String content = worker.getServerConfig(dataId, group, tenant, timeoutMs);
This calls the getServerConfig method in ClientWorker, which fetches the config from the Nacos server.
public class ClientWorker implements Closeable {
public String getServerConfig(String dataId, String group, String tenant, long readTimeout)
throws NacosException {
if (StringUtils.isBlank(group)) {
// Use the default group
group = Constants.DEFAULT_GROUP;
}
HttpResult result = null;
try {
List<String> params = null;
if (StringUtils.isBlank(tenant)) {
params = Arrays.asList("dataId", dataId, "group", group);
} else {
params = Arrays.asList("dataId", dataId, "group", group, "tenant", tenant);
}
// Send a GET request to /v1/cs/configs
result = agent.httpGet(Constants.CONFIG_CONTROLLER_PATH, null, params, agent.getEncode(), readTimeout);
} catch (IOException e) {
String message = String.format(
"[%s] [sub-server] get server config exception, dataId=%s, group=%s, tenant=%s", agent.getName(),
dataId, group, tenant);
LOGGER.error(message, e);
throw new NacosException(NacosException.SERVER_ERROR, e);
}
switch (result.code) {
case HttpURLConnection.HTTP_OK:
// Update the local snapshot
LocalConfigInfoProcessor.saveSnapshot(agent.getName(), dataId, group, tenant, result.content);
return result.content;
case HttpURLConnection.HTTP_NOT_FOUND:
// Config not found: clear the local snapshot
LocalConfigInfoProcessor.saveSnapshot(agent.getName(), dataId, group, tenant, null);
return null;
case HttpURLConnection.HTTP_CONFLICT: {
LOGGER.error(
"[{}] [sub-server-error] get server config being modified concurrently, dataId={}, group={}, "
+ "tenant={}", agent.getName(), dataId, group, tenant);
throw new NacosException(NacosException.CONFLICT,
"data being modified, dataId=" + dataId + ",group=" + group + ",tenant=" + tenant);
}
case HttpURLConnection.HTTP_FORBIDDEN: {
LOGGER.error("[{}] [sub-server-error] no right, dataId={}, group={}, tenant={}", agent.getName(), dataId,
group, tenant);
throw new NacosException(result.code, result.content);
}
default: {
LOGGER.error("[{}] [sub-server-error] dataId={}, group={}, tenant={}, code={}", agent.getName(), dataId,
group, tenant, result.code);
throw new NacosException(result.code,
"http error, code=" + result.code + ",dataId=" + dataId + ",group=" + group + ",tenant=" + tenant);
}
}
}
}
configService.addListener(String dataId, String group, Listener listener)
Adds a listener for the config identified by dataId and group, so that the listener is notified when the config changes.
public class NacosConfigService implements ConfigService {
@Override
public void addListener(String dataId, String group, Listener listener) throws NacosException {
worker.addTenantListeners(dataId, group, Arrays.asList(listener));
}
}
addListener(String dataId, String group, Listener listener) in NacosConfigService ultimately calls addTenantListeners(String dataId, String group, List<? extends Listener> listeners) in ClientWorker.
public class ClientWorker {
public void addTenantListeners(String dataId, String group, List<? extends Listener> listeners) throws NacosException {
// Fall back to DEFAULT_GROUP when group is empty
group = null2defaultGroup(group);
String tenant = agent.getTenant();
// Add the CacheData if it does not exist yet
CacheData cache = addCacheDataIfAbsent(dataId, group, tenant);
for (Listener listener : listeners) {
// Register the listener
cache.addListener(listener);
}
}
// Add the CacheData if it does not exist yet
public CacheData addCacheDataIfAbsent(String dataId, String group, String tenant) throws NacosException {
// Look up the cache
CacheData cache = getCache(dataId, group, tenant);
if (null != cache) {
// Already cached: return it
return cache;
}
String key = GroupKey.getKeyTenant(dataId, group, tenant);
synchronized (cacheMap) { // lock for thread safety
CacheData cacheFromMap = getCache(dataId, group, tenant);
// multiple listeners on the same dataid+group and race condition,so
// double check again
// other listener thread beat me to set to cacheMap
// Double-checked locking
if (null != cacheFromMap) {
cache = cacheFromMap;
// reset so that server not hang this check
cache.setInitializing(true);
} else {
// Not cached yet: create the CacheData
cache = new CacheData(configFilterChainManager, agent.getName(), dataId, group, tenant);
// fix issue # 1317
// Is remote sync enabled?
if (enableRemoteSyncConfig) {
// Fetch the config from the server
String content = getServerConfig(dataId, group, tenant, 3000L);
cache.setContent(content);
}
}
// Publish the new cache map (copy-on-write)
Map<String, CacheData> copy = new HashMap<String, CacheData>(cacheMap.get());
copy.put(key, cache);
cacheMap.set(copy);
}
LOGGER.info("[{}] [subscribe] {}", agent.getName(), key);
MetricsMonitor.getListenConfigCountMonitor().set(cacheMap.get().size());
return cache;
}
}
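The cache update above is a copy-on-write: readers always see an immutable snapshot, and writers replace the whole map under a lock. A minimal generic sketch, with the hypothetical name `CopyOnWriteRegistry` and a `Supplier` standing in for the `CacheData` constructor:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical generic version of the cacheMap handling in ClientWorker; not Nacos code.
class CopyOnWriteRegistry<V> {
    private final AtomicReference<Map<String, V>> ref =
            new AtomicReference<>(Collections.<String, V>emptyMap());

    V addIfAbsent(String key, Supplier<V> factory) {
        // Fast path without locking: most calls find an existing entry.
        V existing = ref.get().get(key);
        if (existing != null) {
            return existing;
        }
        synchronized (ref) {
            // Check again: another thread may have added the entry in the meantime.
            Map<String, V> current = ref.get();
            existing = current.get(key);
            if (existing != null) {
                return existing;
            }
            V created = factory.get();
            // Copy, modify the copy, then publish it; the old map is never mutated.
            Map<String, V> copy = new HashMap<>(current);
            copy.put(key, created);
            ref.set(Collections.unmodifiableMap(copy));
            return created;
        }
    }

    // Readers iterate over a stable snapshot and never see a half-updated map.
    Map<String, V> snapshot() {
        return ref.get();
    }
}
```

This trade-off suits the config cache: listeners are added rarely, while the long-polling tasks iterate over the map constantly.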
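Before digging into the code, it helps to recall what long polling means in Nacos.
Nacos relies on long polling: the client starts a thread that keeps asking the server whether any config has changed (30 s timeout by default). If nothing has changed when the request arrives, the server does not answer right away. It holds the request and writes the response back to the client as soon as a config changes, or returns an empty result when the hold time runs out.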
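The "hold the request until something changes or time runs out" behaviour described above can be sketched in a few lines. This is not the Nacos server code: `LongPollHolder`, `hold`, and `configChanged` are hypothetical names, and a `CompletableFuture` stands in for the suspended HTTP request.

```java
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical model of a long-polling server; not the Nacos LongPollingService.
class LongPollHolder {
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "long-poll-timer");
        t.setDaemon(true);
        return t;
    });
    private final Queue<CompletableFuture<List<String>>> waiting = new ConcurrentLinkedQueue<>();

    // Called when a poll arrives and nothing has changed yet: park the request.
    CompletableFuture<List<String>> hold(long holdMillis) {
        CompletableFuture<List<String>> pending = new CompletableFuture<>();
        waiting.add(pending);
        // If nothing changes within holdMillis, answer with "no changes".
        timer.schedule(() -> {
            waiting.remove(pending);
            pending.complete(Collections.<String>emptyList());
        }, holdMillis, TimeUnit.MILLISECONDS);
        return pending;
    }

    // Called when a config changes: answer every parked request immediately.
    void configChanged(String groupKey) {
        CompletableFuture<List<String>> pending;
        while ((pending = waiting.poll()) != null) {
            pending.complete(Collections.singletonList(groupKey));
        }
    }
}
```

Compared with short polling, the client learns about a change almost immediately while still sending only one request per hold period when nothing happens.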
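The client therefore sees one of two outcomes: either nothing changed and the request comes back with an empty list once it times out, or something changed and the client fetches the new content with getServerConfig, updates its local cache, and notifies the listeners.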
The long-polling logic is split between the client and the server. The core code is as follows.
LongPollingRunnable is an inner class of ClientWorker:
public class ClientWorker {
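// Check the config info in batches, at most 3000 configs per batch
public void checkConfigInfo() {
// Split into tasks
int listenerSize = cacheMap.get().size();
// Round up to get the number of batches
int longingTaskCount = (int) Math.ceil(listenerSize / ParamUtil.getPerTaskConfigSize());
if (longingTaskCount > currentLongingTaskCount) {
for (int i = (int) currentLongingTaskCount; i < longingTaskCount; i++) {
// Whether a task is already running needs more thought: the task list is currently unordered, so changes to it may cause problems
// i is the batch id, used to pick the CacheData entries that belong to this batch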
executorService.execute(new LongPollingRunnable(i));
}
currentLongingTaskCount = longingTaskCount;
}
}
class LongPollingRunnable implements Runnable {
// Batch id, used to pick the CacheData entries that belong to this batch
private int taskId;
public LongPollingRunnable(int taskId) {
this.taskId = taskId;
}
@Override
public void run() {
List<CacheData> cacheDatas = new ArrayList<CacheData>();
List<String> inInitializingCacheList = new ArrayList<String>();
try {
// check failover config
// Collect the CacheData entries that belong to this batch
for (CacheData cacheData : cacheMap.get().values()) {
if (cacheData.getTaskId() == taskId) {
cacheDatas.add(cacheData);
try {
// Check the local failover config
checkLocalConfig(cacheData);
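if (cacheData.isUseLocalConfigInfo()) { // the local failover config is in use
// Compare the MD5 of the CacheData with the MD5 each listener was last notified with; notify the listeners whose MD5 differs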
cacheData.checkListenerMd5();
}
} catch (Exception e) {
LOGGER.error("get local config info error", e);
}
}
}
// check server config
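// Long polling: send all CacheData of this batch to the server over HTTP, with a 30 s timeout
// 1. Nothing changed on the server: the request times out and changedGroupKeys = Collections.emptyList()
// 2. Something changed: loop over the keys, fetch each config with getServerConfig, update the local cache, and trigger the listeners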
List<String> changedGroupKeys = checkUpdateDataIds(cacheDatas, inInitializingCacheList);
// Iterate over the changed groupKeys
for (String groupKey : changedGroupKeys) {
String[] key = GroupKey.parseKey(groupKey);
String dataId = key[0];
String group = key[1];
String tenant = null;
if (key.length == 3) {
tenant = key[2];
}
try {
// Fetch the config from the server again, which also updates the local snapshot file
String content = getServerConfig(dataId, group, tenant, 3000L);
// Update the in-memory config
CacheData cache = cacheMap.get().get(GroupKey.getKeyTenant(dataId, group, tenant));
cache.setContent(content);
LOGGER.info("[{}] [data-received] dataId={}, group={}, tenant={}, md5={}, content={}",
agent.getName(), dataId, group, tenant, cache.getMd5(),
ContentUtils.truncateContent(content));
} catch (NacosException ioe) {
String message = String.format(
"[%s] [get-update] get changed config exception. dataId=%s, group=%s, tenant=%s",
agent.getName(), dataId, group, tenant);
LOGGER.error(message, ioe);
}
}
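// For every CacheData that is past initialization, or that was just reported as initializing, check the listeners and clear the initializing flag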
for (CacheData cacheData : cacheDatas) {
if (!cacheData.isInitializing() || inInitializingCacheList
.contains(GroupKey.getKeyTenant(cacheData.dataId, cacheData.group, cacheData.tenant))) {
// Compare the MD5 of the CacheData with the MD5 each listener was last notified with; notify the listeners whose MD5 differs
cacheData.checkListenerMd5();
cacheData.setInitializing(false);
}
}
inInitializingCacheList.clear();
executorService.execute(this);
} catch (Throwable e) {
// If the long-polling task fails, its next run is delayed as a penalty
LOGGER.error("longPolling error : ", e);
// On failure, rerun this task after taskPenaltyTime
executorService.schedule(this, taskPenaltyTime, TimeUnit.MILLISECONDS);
}
}
}
/**
* Fetch from the server the list of dataIds whose value changed. Only dataId and group are meaningful in the returned entries. Never returns null.
*/
List<String> checkUpdateDataIds(List<CacheData> cacheDatas, List<String> inInitializingCacheList) throws IOException {
StringBuilder sb = new StringBuilder();
for (CacheData cacheData : cacheDatas) {
if (!cacheData.isUseLocalConfigInfo()) {
sb.append(cacheData.dataId).append(WORD_SEPARATOR);
sb.append(cacheData.group).append(WORD_SEPARATOR);
if (StringUtils.isBlank(cacheData.tenant)) {
sb.append(cacheData.getMd5()).append(LINE_SEPARATOR);
} else {
sb.append(cacheData.getMd5()).append(WORD_SEPARATOR);
sb.append(cacheData.getTenant()).append(LINE_SEPARATOR);
}
if (cacheData.isInitializing()) {
// The CacheData appears in cacheMap for the first time and this is its first update check
inInitializingCacheList
.add(GroupKey.getKeyTenant(cacheData.dataId, cacheData.group, cacheData.tenant));
}
}
}
boolean isInitializingCacheList = !inInitializingCacheList.isEmpty();
// Send the probe string to the server and get the changed keys
return checkUpdateConfigStr(sb.toString(), isInitializingCacheList);
}
/**
* Fetch from the server the list of dataIds whose value changed. Only dataId and group are meaningful in the returned entries. Never returns null.
*/
List<String> checkUpdateConfigStr(String probeUpdateString, boolean isInitializingCacheList) throws IOException {
List<String> params = Arrays.asList(Constants.PROBE_MODIFY_REQUEST, probeUpdateString);
List<String> headers = new ArrayList<String>(2);
headers.add("Long-Pulling-Timeout");
// Timeout, 30 s by default
headers.add("" + timeout);
// told server do not hang me up if new initializing cacheData added in
// Does the batch contain CacheData that is still initializing?
if (isInitializingCacheList) {
headers.add("Long-Pulling-Timeout-No-Hangup");
headers.add("true");
}
// Nothing to probe: return right away
if (StringUtils.isBlank(probeUpdateString)) {
return Collections.emptyList();
}
try {
// Send an HTTP POST with a timeout to /v1/cs/configs/listener
HttpResult result = agent.httpPost(Constants.CONFIG_CONTROLLER_PATH + "/listener", headers, params,
agent.getEncode(), timeout);
if (HttpURLConnection.HTTP_OK == result.code) {
setHealthServer(true);
// Parse the response into the list of changed groupKeys
return parseUpdateDataIdResponse(result.content);
} else {
setHealthServer(false);
LOGGER.error("[{}] [check-update] get changed dataId error, code: {}", agent.getName(), result.code);
}
} catch (IOException e) {
setHealthServer(false);
LOGGER.error("[" + agent.getName() + "] [check-update] get changed dataId exception", e);
throw e;
}
// Non-OK response: report no changes
return Collections.emptyList();
}
// Check the local failover config
private void checkLocalConfig(CacheData cacheData) {
final String dataId = cacheData.dataId;
final String group = cacheData.group;
final String tenant = cacheData.tenant;
File path = LocalConfigInfoProcessor.getFailoverFile(agent.getName(), dataId, group, tenant);
// Failover file did not exist before -> exists now
if (!cacheData.isUseLocalConfigInfo() && path.exists()) {
String content = LocalConfigInfoProcessor.getFailover(agent.getName(), dataId, group, tenant);
String md5 = MD5.getInstance().getMD5String(content);
cacheData.setUseLocalConfigInfo(true);
cacheData.setLocalConfigInfoVersion(path.lastModified());
cacheData.setContent(content);
LOGGER.warn("[{}] [failover-change] failover file created. dataId={}, group={}, tenant={}, md5={}, content={}",
agent.getName(), dataId, group, tenant, md5, ContentUtils.truncateContent(content));
return;
}
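// Failover file existed -> is gone now. Listeners are not notified here; they are notified once the config has been fetched from the server.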
if (cacheData.isUseLocalConfigInfo() && !path.exists()) {
cacheData.setUseLocalConfigInfo(false);
LOGGER.warn("[{}] [failover-change] failover file deleted. dataId={}, group={}, tenant={}", agent.getName(),
dataId, group, tenant);
return;
}
// Failover file was modified
if (cacheData.isUseLocalConfigInfo() && path.exists()
&& cacheData.getLocalConfigInfoVersion() != path.lastModified()) {
String content = LocalConfigInfoProcessor.getFailover(agent.getName(), dataId, group, tenant);
String md5 = MD5.getInstance().getMD5String(content);
cacheData.setUseLocalConfigInfo(true);
cacheData.setLocalConfigInfoVersion(path.lastModified());
cacheData.setContent(content);
LOGGER.warn("[{}] [failover-change] failover file changed. dataId={}, group={}, tenant={}, md5={}, content={}",
agent.getName(), dataId, group, tenant, md5, ContentUtils.truncateContent(content));
}
}
}
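The probe string that `checkUpdateDataIds` builds is just a list of `dataId, group, md5[, tenant]` records. The sketch below reproduces the format with hypothetical names (`ProbeString`, `md5Hex`, `entry`). That the word and line separators are the control characters 2 and 1 is an assumption, since the constants are not shown above.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical re-creation of the probe string format; separator values are assumed.
class ProbeString {
    static final char WORD_SEPARATOR = (char) 2; // assumed value of WORD_SEPARATOR
    static final char LINE_SEPARATOR = (char) 1; // assumed value of LINE_SEPARATOR

    // Lower-case hex MD5 of the content, 32 characters long.
    static String md5Hex(String content) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(content.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is required on every Java platform
        }
    }

    // One record: dataId, group, md5 and, only if present, the tenant.
    static String entry(String dataId, String group, String tenant, String content) {
        StringBuilder sb = new StringBuilder();
        sb.append(dataId).append(WORD_SEPARATOR);
        sb.append(group).append(WORD_SEPARATOR);
        sb.append(md5Hex(content));
        if (tenant != null && !tenant.trim().isEmpty()) {
            sb.append(WORD_SEPARATOR).append(tenant);
        }
        return sb.append(LINE_SEPARATOR).toString();
    }
}
```

Sending only the MD5 keeps the request small: the server compares digests, and the content itself travels only when a change has been detected.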
As the client code above shows, the client sends a request to /v1/cs/configs/listener.
// Send an HTTP POST with a timeout to /v1/cs/configs/listener
HttpResult result = agent.httpPost(Constants.CONFIG_CONTROLLER_PATH + "/listener", headers, params,
agent.getEncode(), timeout);
The server handles the request as follows:
@Controller
// Constants.CONFIG_CONTROLLER_PATH = /v1/cs/configs
@RequestMapping(Constants.CONFIG_CONTROLLER_PATH)
public class ConfigController {
/**
* Compare MD5 values
*/
@RequestMapping(value = "/listener", method = RequestMethod.POST)
public void listener(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
request.setAttribute("org.apache.catalina.ASYNC_SUPPORTED", true);
// Read the probe string that lists the configs to compare
String probeModify = request.getParameter("Listening-Configs");
if (StringUtils.isBlank(probeModify)) {
throw new IllegalArgumentException("invalid probeModify");
}
// URL-decode it
probeModify = URLDecoder.decode(probeModify, Constants.ENCODE);
// key -> groupKey, value -> md5
Map<String, String> clientMd5Map;
try {
// Parse the MD5 values sent by the client
clientMd5Map = MD5Util.getClientMd5Map(probeModify);
} catch (Throwable e) {
throw new IllegalArgumentException("invalid probeModify");
}
// do long-polling
inner.doPollingConfig(request, response, clientMd5Map, probeModify.length());
}
}
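The controller turns the `Listening-Configs` string into a groupKey -> md5 map via `MD5Util.getClientMd5Map`, whose body is not shown here. The following is a rough sketch of what such parsing could look like. It rests on assumptions that the code above does not confirm: that fields are separated by the control character `(char) 2`, entries by `(char) 1`, that an optional fourth field carries the tenant, and that the group key is the fields joined with `+`. `ProbeParser` is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser; separators and key format are assumptions, see the text above.
class ProbeParser {
    static final char WORD_SEPARATOR = (char) 2; // assumed field separator
    static final char LINE_SEPARATOR = (char) 1; // assumed entry separator

    /** Parses "dataId^2group^2md5[^2tenant]^1..." into groupKey -> md5. */
    static Map<String, String> parse(String probe) {
        Map<String, String> md5ByGroupKey = new LinkedHashMap<>();
        for (String entry : probe.split(String.valueOf(LINE_SEPARATOR))) {
            if (entry.isEmpty()) {
                continue;
            }
            // Limit -1 keeps trailing empty fields, e.g. an empty md5.
            String[] fields = entry.split(String.valueOf(WORD_SEPARATOR), -1);
            if (fields.length < 3 || fields.length > 4) {
                throw new IllegalArgumentException("invalid probeModify");
            }
            String groupKey = fields[0] + "+" + fields[1];
            if (fields.length == 4 && !fields[3].isEmpty()) {
                groupKey += "+" + fields[3]; // tenant (namespace), when present
            }
            md5ByGroupKey.put(groupKey, fields[2]);
        }
        return md5ByGroupKey;
    }

    public static void main(String[] args) {
        String probe = "app.yaml" + WORD_SEPARATOR + "DEFAULT_GROUP" + WORD_SEPARATOR
                + "md5-1" + LINE_SEPARATOR;
        System.out.println(parse(probe));
    }
}
```

Whatever the exact wire format is, the outcome is what matters for the rest of the flow: one MD5 per watched config, keyed by the same group key that `DataChangeTask` later looks up.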
@Service
public class ConfigServletInner {
/**
* Polling entry point
*/
public String doPollingConfig(HttpServletRequest request, HttpServletResponse response,
Map<String, String> clientMd5Map, int probeRequestSize)
throws IOException, ServletException {
// Long polling
if (LongPollingService.isSupportLongPolling(request)) {
// Register the long-polling client
longPollingService.addLongPollingClient(request, response, clientMd5Map, probeRequestSize);
return HttpServletResponse.SC_OK + "";
}
// Otherwise fall back to short polling for compatibility
List<String> changedGroups = MD5Util.compareMd5(request, response, clientMd5Map);
// Build the short-polling result in both the old and the new format
String oldResult = MD5Util.compareMd5OldResult(changedGroups);
String newResult = MD5Util.compareMd5ResultString(changedGroups);
String version = request.getHeader(Constants.CLIENT_VERSION_HEADER);
if (version == null) {
version = "2.0.0";
}
int versionNum = Protocol.getVersionNumber(version);
/**
* Before version 2.0.4, the result is returned in response headers
*/
if (versionNum < START_LONGPOLLING_VERSION_NUM) {
response.addHeader(Constants.PROBE_MODIFY_RESPONSE, oldResult);
response.addHeader(Constants.PROBE_MODIFY_RESPONSE_NEW, newResult);
} else {
request.setAttribute("content", newResult);
}
// Disable caching
response.setHeader("Pragma", "no-cache");
response.setDateHeader("Expires", 0);
response.setHeader("Cache-Control", "no-cache,no-store");
response.setStatus(HttpServletResponse.SC_OK);
return HttpServletResponse.SC_OK + "";
}
}
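Both the short-polling branch and the long-polling path rely on `MD5Util.compareMd5` to decide which group keys have changed. Its body is not shown here, so the following is only a sketch of the idea: `Md5Compare` is a hypothetical helper, the server-side MD5 values are passed in as a plain map, and treating a missing server entry as an empty MD5 is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper illustrating MD5-based change detection.
class Md5Compare {
    /** Returns every group key whose client MD5 differs from the server MD5. */
    static List<String> changedGroups(Map<String, String> clientMd5Map,
                                      Map<String, String> serverMd5Map) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, String> entry : clientMd5Map.entrySet()) {
            // Assumption: a config unknown to the server counts as an empty MD5.
            String serverMd5 = serverMd5Map.getOrDefault(entry.getKey(), "");
            if (!serverMd5.equals(entry.getValue())) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<String, String> client = new HashMap<>();
        client.put("app.yaml+DEFAULT_GROUP", "md5-old");
        Map<String, String> server = new HashMap<>();
        server.put("app.yaml+DEFAULT_GROUP", "md5-new");
        System.out.println(changedGroups(client, server));
    }
}
```

Only group keys travel back to the client. The client then fetches the new content for those keys in a separate request, which keeps the polling response small.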
@Service
public class LongPollingService extends AbstractEventListener {
public void addLongPollingClient(HttpServletRequest req, HttpServletResponse rsp, Map<String, String> clientMd5Map,
int probeRequestSize) {
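// Long-polling timeout requested by the client, read from the request header
String str = req.getHeader(LongPollingService.LONG_POLLING_HEADER);
// No-hang-up flag: when "true" and nothing has changed, the server returns immediately instead of holding the request
String noHangUpFlag = req.getHeader(LongPollingService.LONG_POLLING_NO_HANG_UP_HEADER);
// Application name
String appName = req.getHeader(RequestUtil.CLIENT_APPNAME_HEADER);
String tag = req.getHeader("Vipserver-Tag");
// Margin reserved for server-side processing, 500 ms by default
int delayTime = SwitchService.getSwitchInteger(SwitchService.FIXED_DELAY_TIME, 500);
/**
* Respond 500 ms early so that the client does not time out first
*/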
long timeout = Math.max(10000, Long.parseLong(str) - delayTime);
if (isFixedPolling()) {
timeout = Math.max(10000, getFixedPollingInterval());
// do nothing but set fix polling timeout
} else {
// Check for changes first; if there are any, respond immediately
long start = System.currentTimeMillis();
List<String> changedGroups = MD5Util.compareMd5(req, rsp, clientMd5Map);
if (changedGroups.size() > 0) {
generateResponse(req, rsp, changedGroups);
LogUtil.clientLog.info("{}|{}|{}|{}|{}|{}|{}",
System.currentTimeMillis() - start, "instant", RequestUtil.getRemoteIp(req), "polling",
clientMd5Map.size(), probeRequestSize, changedGroups.size());
return;
} else if (noHangUpFlag != null && noHangUpFlag.equalsIgnoreCase(TRUE_STR)) {
LogUtil.clientLog.info("{}|{}|{}|{}|{}|{}|{}", System.currentTimeMillis() - start, "nohangup",
RequestUtil.getRemoteIp(req), "polling", clientMd5Map.size(), probeRequestSize,
changedGroups.size());
return;
}
}
/************************* No changes: handle asynchronously via Servlet 3.0 ***************************/
// Client IP
String ip = RequestUtil.getRemoteIp(req);
// Must be called on the HTTP thread; otherwise the container sends the response as soon as that thread returns
final AsyncContext asyncContext = req.startAsync();
// AsyncContext.setTimeout() is not accurate, so the timeout is managed by our own scheduled task
asyncContext.setTimeout(0L);
// Hand the held request over to the scheduler
// With a client timeout of 30000 ms: timeout = Math.max(10000, 30000 - 500) = 29500 ms (29.5 s)
scheduler.execute(
new ClientLongPolling(asyncContext, clientMd5Map, ip, probeRequestSize, timeout, appName, tag));
}
}
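The hold time used by `addLongPollingClient` is a one-line formula, and it is worth trying it with a few inputs. The following sketch isolates it. `LongPollTimeout` and `holdMillis` are hypothetical names, and the 30000 ms client timeout is the value assumed in the comment above, not something the server enforces.

```java
// Hypothetical helper isolating the hold-time formula from addLongPollingClient.
class LongPollTimeout {
    /** Server-side hold time: client timeout minus a safety margin, never below 10 s. */
    static long holdMillis(long clientTimeoutMillis, long delayMillis) {
        return Math.max(10000, clientTimeoutMillis - delayMillis);
    }

    public static void main(String[] args) {
        System.out.println(holdMillis(30000, 500)); // client asks for 30 s
        System.out.println(holdMillis(5000, 500));  // client asks for 5 s
    }
}
```

One consequence of the 10000 ms floor, as the formula is written: a client that sends a timeout below 10500 ms is still held for 10 s, so the "respond 500 ms early" guarantee only applies to client timeouts of at least 10.5 s.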
ClientLongPolling is an inner class of LongPollingService. Its code is as follows:
@Service
public class LongPollingService extends AbstractEventListener {
class ClientLongPolling implements Runnable {
@Override
public void run() {
// Schedule a delayed timeout task, timeoutTime = 29.5 s
asyncTimeoutFuture = scheduler.schedule(new Runnable() {
@Override
public void run() {
try {
// Record the client IP together with the current time
getRetainIps().put(ClientLongPolling.this.ip, System.currentTimeMillis());
/**
* Remove the subscription
*/
allSubs.remove(ClientLongPolling.this);
// Fixed-interval polling?
if (isFixedPolling()) {
LogUtil.clientLog.info("{}|{}|{}|{}|{}|{}",
(System.currentTimeMillis() - createTime),
"fix", RequestUtil.getRemoteIp((HttpServletRequest)asyncContext.getRequest()),
"polling",
clientMd5Map.size(), probeRequestSize);
// Compare MD5 values to find every group key that has changed
List<String> changedGroups = MD5Util.compareMd5(
(HttpServletRequest)asyncContext.getRequest(),
(HttpServletResponse)asyncContext.getResponse(), clientMd5Map);
if (changedGroups.size() > 0) {
// Send the changed group keys
sendResponse(changedGroups);
} else {
// Nothing changed: complete the request with an empty response
sendResponse(null);
}
} else {
LogUtil.clientLog.info("{}|{}|{}|{}|{}|{}",
(System.currentTimeMillis() - createTime),
"timeout", RequestUtil.getRemoteIp((HttpServletRequest)asyncContext.getRequest()),
"polling",
clientMd5Map.size(), probeRequestSize);
// Timed out without a change: complete the request with an empty response
sendResponse(null);
}
} catch (Throwable t) {
LogUtil.defaultLog.error("long polling error:" + t.getMessage(), t.getCause());
}
}
}, timeoutTime, TimeUnit.MILLISECONDS);
// Register the subscription
allSubs.add(this);
}
void sendResponse(List<String> changedGroups) {
/**
* Cancel the timeout task
*/
if (null != asyncTimeoutFuture) {
asyncTimeoutFuture.cancel(false);
}
generateResponse(changedGroups);
}
void generateResponse(List<String> changedGroups) {
if (null == changedGroups) {
/**
* Tell the container to send the HTTP response
*/
asyncContext.complete();
return;
}
HttpServletResponse response = (HttpServletResponse)asyncContext.getResponse();
try {
// Build the response body from the changed group keys
String respString = MD5Util.compareMd5ResultString(changedGroups);
// Disable caching
response.setHeader("Pragma", "no-cache");
response.setDateHeader("Expires", 0);
response.setHeader("Cache-Control", "no-cache,no-store");
response.setStatus(HttpServletResponse.SC_OK);
// Write the body back to the client
response.getWriter().println(respString);
asyncContext.complete();
} catch (Exception se) {
pullLog.error(se.toString(), se);
asyncContext.complete();
}
}
}
}
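The core of `ClientLongPolling` is a race between a timeout task and an early `sendResponse` call, where the early call cancels the timeout. The following is a minimal sketch of that pattern without the servlet API. `HeldRequest` is a hypothetical class, and a `CompletableFuture` stands in for the `AsyncContext`, so completing the future plays the role of sending the HTTP response.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for ClientLongPolling.
class HeldRequest {
    // Stands in for the AsyncContext: completing it "sends" the response.
    final CompletableFuture<List<String>> response = new CompletableFuture<>();
    private final ScheduledFuture<?> timeoutTask;

    HeldRequest(ScheduledExecutorService scheduler, long holdMillis) {
        // If nothing changes within holdMillis, answer with an empty list.
        timeoutTask = scheduler.schedule(() -> {
            response.complete(Collections.emptyList());
        }, holdMillis, TimeUnit.MILLISECONDS);
    }

    /** Called when a watched config changes before the timeout fires. */
    void sendResponse(List<String> changedGroups) {
        timeoutTask.cancel(false);        // the timeout task is no longer needed
        response.complete(changedGroups); // no effect if the timeout already completed it
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        HeldRequest timedOut = new HeldRequest(scheduler, 100);
        System.out.println("timeout -> " + timedOut.response.join());
        HeldRequest notified = new HeldRequest(scheduler, 29_500);
        notified.sendResponse(Collections.singletonList("app.yaml+DEFAULT_GROUP"));
        System.out.println("change  -> " + notified.response.join());
        scheduler.shutdownNow();
    }
}
```

Because a `CompletableFuture` can only be completed once, the sketch tolerates the timeout and the change notification firing at nearly the same moment. The original code handles the same situation by cancelling `asyncTimeoutFuture` and removing the subscription from `allSubs`.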
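Note: what does Nacos do if the configuration changes within those 29.5 s?
Suppose that during those 29.5 s you open the Nacos console, edit a configuration, and publish it. What does the Nacos server do at that point?
The browser's developer console shows that clicking Save sends a request to /v1/cs/configs/ on the Nacos server, and that handling this request ends with a LocalDataChangeEvent being published.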
@Service
public class LongPollingService extends AbstractEventListener {
/**
* Long-polling subscriptions
*/
final Queue<ClientLongPolling> allSubs;
@Override
public void onEvent(Event event) {
if (isFixedPolling()) {
// ignore
} else {
if (event instanceof LocalDataChangeEvent) {
// Handle the LocalDataChangeEvent
LocalDataChangeEvent evt = (LocalDataChangeEvent)event;
// Run a DataChangeTask for the changed groupKey
scheduler.execute(new DataChangeTask(evt.groupKey, evt.isBeta, evt.betaIps));
}
}
}
class DataChangeTask implements Runnable {
@Override
public void run() {
try {
ConfigService.getContentBetaMd5(groupKey);
// Iterate over allSubs, the queue of held long-polling clients
for (Iterator<ClientLongPolling> iter = allSubs.iterator(); iter.hasNext(); ) {
ClientLongPolling clientSub = iter.next();
// Notify this client only if its clientMd5Map contains the changed groupKey
if (clientSub.clientMd5Map.containsKey(groupKey)) {
// Beta release: skip clients whose IP is not in the beta list
if (isBeta && !betaIps.contains(clientSub.ip)) {
continue;
}
// Tag release: skip clients whose tag does not match
if (StringUtils.isNotBlank(tag) && !tag.equals(clientSub.tag)) {
continue;
}
getRetainIps().put(clientSub.ip, System.currentTimeMillis());
// Remove the subscription
iter.remove();
LogUtil.clientLog.info("{}|{}|{}|{}|{}|{}|{}",
(System.currentTimeMillis() - changeTime),
"in-advance",
RequestUtil.getRemoteIp((HttpServletRequest)clientSub.asyncContext.getRequest()),
"polling",
clientSub.clientMd5Map.size(), clientSub.probeRequestSize, groupKey);
// Send the changed groupKey to the client, which completes the real-time notification
clientSub.sendResponse(Arrays.asList(groupKey));
}
}
} catch (Throwable t) {
LogUtil.defaultLog.error("data change error:" + t.getMessage(), t.getCause());
}
}
DataChangeTask(String groupKey) {
this(groupKey, false, null);
}
DataChangeTask(String groupKey, boolean isBeta, List<String> betaIps) {
this(groupKey, isBeta, betaIps, null);
}
DataChangeTask(String groupKey, boolean isBeta, List<String> betaIps, String tag) {
this.groupKey = groupKey;
this.isBeta = isBeta;
this.betaIps = betaIps;
this.tag = tag;
}
final String groupKey;
final long changeTime = System.currentTimeMillis();
final boolean isBeta;
final List<String> betaIps;
final String tag;
}
}
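`DataChangeTask` is the other half of the long-polling mechanism: it walks the subscription queue and answers every held client that watches the changed group key. The following sketch shows that fan-out in isolation. `LongPollHub` and `Subscriber` are hypothetical names, a `CompletableFuture` again stands in for the held HTTP response, and the beta and tag filters from the original are left out.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of the allSubs queue and the DataChangeTask fan-out.
class LongPollHub {
    static final class Subscriber {
        final Set<String> watchedKeys; // plays the role of clientMd5Map.keySet()
        final CompletableFuture<List<String>> response = new CompletableFuture<>();

        Subscriber(Set<String> watchedKeys) {
            this.watchedKeys = watchedKeys;
        }
    }

    private final Queue<Subscriber> allSubs = new ConcurrentLinkedQueue<>();

    /** Holds a request until one of its watched keys changes. */
    Subscriber hold(Set<String> watchedKeys) {
        Subscriber subscriber = new Subscriber(watchedKeys);
        allSubs.add(subscriber);
        return subscriber;
    }

    /** Answers and removes every held subscriber that watches groupKey. */
    int onDataChange(String groupKey) {
        int notified = 0;
        for (Iterator<Subscriber> it = allSubs.iterator(); it.hasNext(); ) {
            Subscriber subscriber = it.next();
            if (subscriber.watchedKeys.contains(groupKey)) {
                it.remove(); // drop the subscription before responding
                subscriber.response.complete(Collections.singletonList(groupKey));
                notified++;
            }
        }
        return notified;
    }

    int pending() {
        return allSubs.size();
    }

    public static void main(String[] args) {
        LongPollHub hub = new LongPollHub();
        Subscriber a = hub.hold(Collections.singleton("app.yaml+DEFAULT_GROUP"));
        hub.hold(Collections.singleton("db.yaml+DEFAULT_GROUP"));
        System.out.println("notified: " + hub.onDataChange("app.yaml+DEFAULT_GROUP"));
        System.out.println("response: " + a.response.join());
        System.out.println("still held: " + hub.pending());
    }
}
```

A subscriber is answered at most once and then leaves the queue, which matches the one-shot nature of a long-polling request: after each response the client has to send a new request to keep listening.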
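That completes the analysis of how the Nacos config center works. To recap the overall flow: the client posts the MD5 values of the configs it watches to /v1/cs/configs/listener. If any MD5 differs from the server's, the server responds immediately with the changed group keys. Otherwise it holds the request asynchronously for up to 29.5 s. If a config is published during that window, a LocalDataChangeEvent triggers a DataChangeTask, which responds to every held client that watches the changed group key. If nothing changes, the timeout task completes the request with an empty response.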
https://nacos.io/zh-cn/docs/v2/what-is-nacos.html
https://nacos.io/zh-cn/docs/v2/architecture.html
https://blog.csdn.net/qq_33375499/article/details/125710182
https://blog.csdn.net/qq_33375499/article/details/125703382