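As a service registry and configuration center for microservices, Nacos supports both AP and CP modes. By default, instances registered with Nacos are ephemeral. This post focuses on the AP mode, which Nacos supports through its own protocol called Distro. The source code discussed here is from Nacos 1.4.x.
I won't go over the CAP theorem here; let's go straight to the source. The Nacos server is simply a Spring Boot application, and a client registers a service by calling the /nacos/v1/ns/instance endpoint. Here is the server-side handler for instance registration: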
@CanDistro
@PostMapping
@Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE)
public String register(HttpServletRequest request) throws Exception {
//parse namespaceId from the request
final String namespaceId = WebUtils
.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
//parse serviceName from the request
final String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
NamingUtils.checkServiceNameFormat(serviceName);
//build a service instance object from the request parameters
final Instance instance = parseInstance(request);
//register the instance
serviceManager.registerInstance(namespaceId, serviceName, instance);
return "ok";
}
//ServiceManager#registerInstance
public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
//create an empty service in the in-memory registry if needed; the registry is a nested ConcurrentMap
createEmptyService(namespaceId, serviceName, instance.isEphemeral());
//get the corresponding Service from the registry
Service service = getService(namespaceId, serviceName);
if (service == null) {
throw new NacosException(NacosException.INVALID_PARAM,
"service not found, namespace: " + namespaceId + ", service: " + serviceName);
}
//add the instance to the service
addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
}
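For context, here is a minimal sketch of the request a client would send to reach this handler. It only builds the URI and sends nothing. The class name RegisterRequestSketch and the sample address and service values are made up for illustration. The query parameters (serviceName, ip, port, ephemeral) are the ones I understand the registration endpoint to accept, so treat them as an assumption rather than a full parameter list.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Nacos: builds the registration URI only.
class RegisterRequestSketch {

    /** Builds the URI a client would POST to in order to register one instance. */
    static URI buildRegisterUri(String serverAddr, String serviceName, String ip, int port, boolean ephemeral) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("serviceName", serviceName);
        params.put("ip", ip);
        params.put("port", String.valueOf(port));
        params.put("ephemeral", String.valueOf(ephemeral));
        // URL-encode each value and join the pairs into a query string.
        String query = params.entrySet().stream()
                .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
        return URI.create("http://" + serverAddr + "/nacos/v1/ns/instance?" + query);
    }

    public static void main(String[] args) {
        System.out.println(buildRegisterUri("127.0.0.1:8848", "demo-service", "10.0.0.5", 8080, true));
    }
}
```

On the server side these parameters are what WebUtils.required, WebUtils.optional and parseInstance read from the HttpServletRequest in the handler above.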
First, look at the createEmptyService method:
public void createEmptyService(String namespaceId, String serviceName, boolean local) throws NacosException {
createServiceIfAbsent(namespaceId, serviceName, local, null);
}
public void createServiceIfAbsent(String namespaceId, String serviceName, boolean local, Cluster cluster)
throws NacosException {
Service service = getService(namespaceId, serviceName);
if (service == null) {
//the Service does not exist yet, so it has to be initialized
Loggers.SRV_LOG.info("creating empty service {}:{}", namespaceId, serviceName);
service = new Service();
service.setName(serviceName);
service.setNamespaceId(namespaceId);
service.setGroupName(NamingUtils.getGroupName(serviceName));
// now validate the service. if failed, exception will be thrown
service.setLastModifiedMillis(System.currentTimeMillis());
service.recalculateChecksum();
if (cluster != null) {
cluster.setService(service);
service.getClusterMap().put(cluster.getName(), cluster);
}
service.validate();
//initialize the service
putServiceAndInit(service);
if (!local) {
//local tells whether the instance is ephemeral or persistent; ephemeral instances never enter this if
addOrReplaceService(service);
}
}
}
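Here is the structure of the Nacos service registry, which is really just a nested Map:
/**
 * Map(namespace, Map(group::serviceName, Service)).
 */
private final Map<String, Map<String, Service>> serviceMap = new ConcurrentHashMap<>();
Each Service holds a map of Clusters, and each Cluster holds the collection of Instances.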
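To make the nesting concrete, here is a minimal sketch of a registry with the same shape. It is not the Nacos implementation: RegistrySketch is a made-up class, instances are reduced to plain "ip:port" strings, and the Service and Cluster objects are flattened into nested maps.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the registry: namespace -> service key -> cluster name -> instance addresses.
class RegistrySketch {
    private final Map<String, Map<String, Map<String, Set<String>>>> registry = new ConcurrentHashMap<>();

    /** Adds one instance address, creating the namespace, service and cluster levels on demand. */
    void register(String namespace, String service, String cluster, String address) {
        registry.computeIfAbsent(namespace, ns -> new ConcurrentHashMap<>())
                .computeIfAbsent(service, s -> new ConcurrentHashMap<>())
                .computeIfAbsent(cluster, c -> ConcurrentHashMap.newKeySet())
                .add(address);
    }

    /** Returns the addresses registered under the given path, or an empty set if any level is missing. */
    Set<String> lookup(String namespace, String service, String cluster) {
        return registry.getOrDefault(namespace, Map.of())
                .getOrDefault(service, Map.of())
                .getOrDefault(cluster, Set.of());
    }
}
```

Calling register twice with the same namespace, service and cluster but different addresses leaves two entries under one cluster, which is the same picture as one Service with one Cluster holding two Instances.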
Service initialization:
private void putServiceAndInit(Service service) throws NacosException {
putService(service);
service = getService(service.getNamespaceId(), service.getName());
//initialize the service
service.init();
//register the service object as a listener with the consistency service (DelegateConsistencyServiceImpl here), once under the ephemeral key and once under the persistent key
consistencyService
.listen(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), true), service);
consistencyService
.listen(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), false), service);
Loggers.SRV_LOG.info("[NEW-SERVICE] {}", service.toJson());
}
putService creates the namespace entry if needed and adds the empty service to the registry:
public void putService(Service service) {
if (!serviceMap.containsKey(service.getNamespaceId())) {
synchronized (putServiceLock) {
if (!serviceMap.containsKey(service.getNamespaceId())) {
//create the namespace entry if it does not exist
serviceMap.put(service.getNamespaceId(), new ConcurrentSkipListMap<>());
}
}
}
//add the empty service to this namespace's map
serviceMap.get(service.getNamespaceId()).putIfAbsent(service.getName(), service);
}
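The check, lock, check-again pattern in putService makes sure only one thread creates the map for a namespace. As a point of comparison, the same effect can be sketched with computeIfAbsent. This is my own simplification, not what Nacos does: PutServiceSketch is hypothetical, and the service is reduced to a String.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical simplification of putService; a String stands in for the Service object.
class PutServiceSketch {
    private final Map<String, Map<String, String>> serviceMap = new ConcurrentHashMap<>();

    /** Returns the value stored for the service afterwards: the existing one, or the new one if absent. */
    String putService(String namespaceId, String serviceName, String service) {
        // Atomically create the per-namespace map the first time the namespace is seen.
        Map<String, String> services =
                serviceMap.computeIfAbsent(namespaceId, ns -> new ConcurrentSkipListMap<>());
        // Keep the first service registered under this name; later calls do not overwrite it.
        String previous = services.putIfAbsent(serviceName, service);
        return previous != null ? previous : service;
    }
}
```

Either way the important property is the same: a second putService for an existing name leaves the original service in place, which is why putServiceAndInit reads the service back with getService before calling init.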
Below is the init method of the Service class, which mainly starts the client health check:
public void init() {
//start the client heartbeat check task
HealthCheckReactor.scheduleCheck(clientBeatCheckTask);
for (Map.Entry<String, Cluster> entry : clusterMap.entrySet()) {
entry.getValue().setService(this);
entry.getValue().init();
}
}
The run method of the heartbeat check task:
public void run() {
try {
if (!getDistroMapper().responsible(service.getName())) {
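//if this server is not responsible for checking this service's heartbeats (hash the serviceName and take it modulo the number of servers in the cluster), return immediately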
return;
}
//whether health checks are enabled
if (!getSwitchDomain().isHealthCheckEnabled()) {
return;
}
List<Instance> instances = service.allIPs(true);
// first set health status of instances:
//iterate over all instances and check their heartbeats
for (Instance instance : instances) {
if (System.currentTimeMillis() - instance.getLastBeat() > instance.getInstanceHeartBeatTimeOut()) {
//no heartbeat received within the heartbeat timeout (15s by default)
if (!instance.isMarked()) {
if (instance.isHealthy()) {
//mark as unhealthy
instance.setHealthy(false);
Loggers.EVT_LOG
.info("{POS} {IP-DISABLED} valid: {}:{}@{}@{}, region: {}, msg: client timeout after {}, last beat: {}",
instance.getIp(), instance.getPort(), instance.getClusterName(),
service.getName(), UtilsAndCommons.LOCALHOST_SITE,
instance.getInstanceHeartBeatTimeOut(), instance.getLastBeat());
//publish a service-changed event, which goes through Spring's applicationContext
getPushService().serviceChanged(service);
//publish a heartbeat-timeout event
ApplicationUtils.publishEvent(new InstanceHeartbeatTimeoutEvent(this, instance));
}
}
}
}
if (!getGlobalConfig().isExpireInstance()) {
return;
}
// then remove obsolete instances:
//iterate over the instance list and delete expired instances (30s by default)
for (Instance instance : instances) {
if (instance.isMarked()) {
continue;
}
if (System.currentTimeMillis() - instance.getLastBeat() > instance.getIpDeleteTimeout()) {
// delete instance
Loggers.SRV_LOG.info("[AUTO-DELETE-IP] service: {}, ip: {}", service.getName(),
JacksonUtils.toJson(instance));
//delete the instance; this actually sends an HTTP deregistration request to this same server
deleteIp(instance);
}
}
} catch (Exception e) {
Loggers.SRV_LOG.warn("Exception while processing client beat time out.", e);
}
}
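The task above boils down to two decisions: is this server the one responsible for the service, and how long has each instance been silent. Here is a minimal sketch of both. BeatCheckSketch and its method names are made up. The 15s and 30s values are the defaults mentioned in the comments above. The hash-and-modulo in responsibleIndex is a simplification of the idea, not the exact DistroMapper code.

```java
// Hypothetical sketch of the two decisions made by the heartbeat check task.
class BeatCheckSketch {
    enum Action { NONE, MARK_UNHEALTHY, DELETE }

    static final long HEARTBEAT_TIMEOUT_MS = 15_000L; // default quoted above
    static final long IP_DELETE_TIMEOUT_MS = 30_000L; // default quoted above

    /** Picks the index of the server responsible for a service: hash the name, modulo the server count. */
    static int responsibleIndex(String serviceName, int serverCount) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(serviceName.hashCode(), serverCount);
    }

    /**
     * Returns the strongest action that applies to an instance given its last heartbeat time.
     * In the real task an instance past the delete timeout has also been marked unhealthy.
     */
    static Action classify(long lastBeatMs, long nowMs) {
        long silence = nowMs - lastBeatMs;
        if (silence > IP_DELETE_TIMEOUT_MS) {
            return Action.DELETE;
        }
        if (silence > HEARTBEAT_TIMEOUT_MS) {
            return Action.MARK_UNHEALTHY;
        }
        return Action.NONE;
    }
}
```

Because every server computes the same index for the same service name, only one server in the cluster ends up running the timeout checks for that service.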
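At this point an empty service exists in the registry (created if necessary, with no instances in it yet), so the registry now has to be updated with the instance. That happens in ServiceManager#addInstance: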
public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips)
throws NacosException {
//build the key for this service, e.g. com.alibaba.nacos.naming.iplist.ephemeral.{namespaceId}##{serviceName}
String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);
//get the service object from the registry
Service service = getService(namespaceId, serviceName);
synchronized (service) {
//get the new, merged instance list
List<Instance> instanceList = addIpAddresses(service, ephemeral, ips);
Instances instances = new Instances();
instances.setInstanceList(instanceList);
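//update the instance data in the registry; ephemeral instances go through DistroConsistencyServiceImpl
//persistent instances are also written to persistent storage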
consistencyService.put(key, instances);
}
}
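The key built by KeyBuilder.buildInstanceListKey is what decides later whether a write is treated as ephemeral or persistent. Here is a minimal sketch of that idea. InstanceListKeySketch is hypothetical, and the prefix and separator literals are my reading of the key format in the comment above, so treat them as assumptions rather than a copy of KeyBuilder.

```java
// Hypothetical sketch of building and classifying instance-list keys.
class InstanceListKeySketch {
    // Assumed literals, modeled on the example key quoted in the walkthrough.
    static final String PREFIX = "com.alibaba.nacos.naming.iplist.";
    static final String EPHEMERAL = "ephemeral.";
    static final String CONNECTOR = "##";

    /** Builds the key; only ephemeral keys carry the extra "ephemeral." segment. */
    static String build(String namespaceId, String serviceName, boolean ephemeral) {
        return PREFIX + (ephemeral ? EPHEMERAL : "") + namespaceId + CONNECTOR + serviceName;
    }

    /** A key is treated as ephemeral when it starts with the ephemeral prefix. */
    static boolean isEphemeral(String key) {
        return key.startsWith(PREFIX + EPHEMERAL);
    }
}
```

A prefix check like isEphemeral is the kind of test that matchEphemeralInstanceListKey performs further down in onPut, and it is also how a delegating consistency service can route a key to the AP or CP implementation.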
First, the service's existing instance list is merged with the new instances being registered, in addIpAddresses:
private List<Instance> addIpAddresses(Service service, boolean ephemeral, Instance... ips) throws NacosException {
return updateIpAddresses(service, UtilsAndCommons.UPDATE_INSTANCE_ACTION_ADD, ephemeral, ips);
}
public List<Instance> updateIpAddresses(Service service, String action, boolean ephemeral, Instance... ips)
throws NacosException {
//get the Datum holding the service's previous instance list from the DataStore
Datum datum = consistencyService
.get(KeyBuilder.buildInstanceListKey(service.getNamespaceId(), service.getName(), ephemeral));
//get the instances currently in the registry for this service
List<Instance> currentIPs = service.allIPs(ephemeral);
Map<String, Instance> currentInstances = new HashMap<>(currentIPs.size());
Set<String> currentInstanceIds = Sets.newHashSet();
//record the current instances in currentInstances and currentInstanceIds
for (Instance instance : currentIPs) {
currentInstances.put(instance.toIpAddr(), instance);
currentInstanceIds.add(instance.getInstanceId());
}
Map<String, Instance> instanceMap;
if (datum != null && null != datum.value) {
//refresh heartbeat and related data
instanceMap = setValid(((Instances) datum.value).getInstanceList(), currentInstances);
} else {
instanceMap = new HashMap<>(ips.length);
}
//iterate over the instances being added
for (Instance instance : ips) {
if (!service.getClusterMap().containsKey(instance.getClusterName())) {
//create the new instance's cluster if it does not exist yet
Cluster cluster = new Cluster(instance.getClusterName(), service);
cluster.init();
service.getClusterMap().put(instance.getClusterName(), cluster);
Loggers.SRV_LOG
.warn("cluster: {} not found, ip: {}, will create new cluster with default configuration.",
instance.getClusterName(), instance.toJson());
}
if (UtilsAndCommons.UPDATE_INSTANCE_ACTION_REMOVE.equals(action)) {
instanceMap.remove(instance.getDatumKey());
} else {
//look up the existing instance that has the same key as the new one
Instance oldInstance = instanceMap.get(instance.getDatumKey());
if (oldInstance != null) {
//it already exists, so reuse the original instance id
instance.setInstanceId(oldInstance.getInstanceId());
} else {
//it does not exist, so generate an id
instance.setInstanceId(instance.generateInstanceId(currentInstanceIds));
}
//put the instance into the map of existing instances
instanceMap.put(instance.getDatumKey(), instance);
}
}
if (instanceMap.size() <= 0 && UtilsAndCommons.UPDATE_INSTANCE_ACTION_ADD.equals(action)) {
throw new IllegalArgumentException(
"ip list can not be empty, service: " + service.getName() + ", ip list: " + JacksonUtils
.toJson(instanceMap.values()));
}
//return the new, merged instance list
return new CopyOnWriteArrayList<>(instanceMap.values());
}
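Stripped of cluster creation and instance ids, the merge in updateIpAddresses is a map keyed by the instance's datum key: add overwrites or inserts, remove deletes. Here is a minimal sketch under that simplification. InstanceMergeSketch is hypothetical, and each instance is just an "ip:port" string that doubles as its own key.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the add/remove merge; instances are plain strings used as their own key.
class InstanceMergeSketch {

    /** Applies the changes to a copy of the stored list and returns the merged result. */
    static List<String> merge(List<String> stored, boolean remove, String... changes) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (String instance : stored) {
            merged.put(instance, instance);
        }
        for (String change : changes) {
            if (remove) {
                merged.remove(change);
            } else {
                merged.put(change, change);
            }
        }
        // Same guard as the original: an add must never produce an empty list.
        if (merged.isEmpty() && !remove) {
            throw new IllegalArgumentException("ip list can not be empty");
        }
        return new ArrayList<>(merged.values());
    }
}
```

Because the result is always a fresh list, the caller can hand it to the consistency service without touching the list that readers of the registry are currently using.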
Note that the merged instance list is returned as a CopyOnWriteArrayList. Next comes the actual registry update, ConsistencyService#put. In AP mode the DistroConsistencyServiceImpl implementation is used:
public void put(String key, Record value) throws NacosException {
//update the registry
onPut(key, value);
//sync the instance data to the other Nacos nodes
distroProtocol.sync(new DistroKey(key, KeyBuilder.INSTANCE_LIST_KEY_PREFIX), DataOperation.CHANGE,
globalConfig.getTaskDispatchPeriod() / 2);
}
public void onPut(String key, Record value) {
if (KeyBuilder.matchEphemeralInstanceListKey(key)) {
Datum<Instances> datum = new Datum<>();
datum.value = (Instances) value;
datum.key = key;
datum.timestamp.incrementAndGet();
//first update the instance data in dataStore (a plain replacement in its map); the registry is later updated from this data
dataStore.put(key, datum);
}
if (!listeners.containsKey(key)) {
return;
}
//publish a task to update the registry
notifier.addTask(key, DataOperation.CHANGE);
}
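The task is published through Notifier. Notifier both accepts tasks and processes them: it keeps a queue of tasks plus a map recording which service keys already have a pending task, and it runs on its own thread: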
public void addTask(String datumKey, DataOperation action) {
if (services.containsKey(datumKey) && action == DataOperation.CHANGE) {
//the key is already in the notifier's services map, so there is no need to publish it again; return
return;
}
if (action == DataOperation.CHANGE) {
//add the key to the notifier's services map
services.put(datumKey, StringUtils.EMPTY);
}
//enqueue the task on the blocking queue
tasks.offer(Pair.with(datumKey, action));
}
public void run() {
Loggers.DISTRO.info("distro notifier started");
for (; ; ) {
try {
//take a task from the blocking queue
Pair<String, DataOperation> pair = tasks.take();
//handle it
handle(pair);
} catch (Throwable e) {
Loggers.DISTRO.error("[NACOS-DISTRO] Error while handling notifying task", e);
}
}
}
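The queue-plus-map combination is a small de-duplication trick: while a change for a key is still waiting in the queue, further changes for the same key are dropped, because the consumer will read the latest data from the store anyway. Here is a minimal sketch of that pattern. NotifierSketch is hypothetical, it tracks keys only, and it uses a non-blocking poll so it can be called directly, whereas the real run loop blocks on take. Clearing the pending mark when a task is picked up is my assumption about what handle does, since handle is not shown in this part.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of a de-duplicating task queue, keyed by datum key.
class NotifierSketch {
    private final ConcurrentHashMap<String, Boolean> pending = new ConcurrentHashMap<>();
    private final BlockingQueue<String> tasks = new LinkedBlockingQueue<>();

    /** Returns true if a task was queued, false if one for the same key is still pending. */
    boolean addTask(String key) {
        if (pending.putIfAbsent(key, Boolean.TRUE) != null) {
            return false;
        }
        return tasks.offer(key);
    }

    /** Returns the next key, or null if the queue is empty, and clears its pending mark. */
    String pollNext() {
        String key = tasks.poll();
        if (key != null) {
            // Assumed behavior: once picked up, later changes for this key may be queued again.
            pending.remove(key);
        }
        return key;
    }

    int queued() {
        return tasks.size();
    }
}
```

With this shape, a burst of registrations for one service results in a single queued task rather than one task per registration.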
To be continued...