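Yesterday I walked through the source code of Nacos dynamic configuration; today let's look at the source code behind Nacos service registration.
1. How the registry works
2. Client-side service registration source code
To use Nacos service registration, first add the corresponding dependency: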
<dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-starter-alibaba-nacos-discovery</artifactId>
</dependency>
As usual, start from the spring.factories file to find the auto-configuration entry point.
In com.alibaba.cloud.nacos.registry.NacosServiceRegistryAutoConfiguration, find the method that creates the NacosAutoServiceRegistration bean:
@Bean
@ConditionalOnBean({AutoServiceRegistrationProperties.class})
public NacosAutoServiceRegistration nacosAutoServiceRegistration(NacosServiceRegistry registry, AutoServiceRegistrationProperties autoServiceRegistrationProperties, NacosRegistration registration) {
return new NacosAutoServiceRegistration(registry, autoServiceRegistrationProperties, registration);
}
NacosAutoServiceRegistration extends AbstractAutoServiceRegistration. Looking at the full inheritance hierarchy, AbstractAutoServiceRegistration implements the ApplicationListener interface.
That makes things easy: look for the event-handling method, which turns out to be onApplicationEvent(WebServerInitializedEvent event) in AbstractAutoServiceRegistration. The code is as follows:
public void onApplicationEvent(WebServerInitializedEvent event) {
this.bind(event);
}
/** @deprecated */
@Deprecated
public void bind(WebServerInitializedEvent event) {
ApplicationContext context = event.getApplicationContext();
if (!(context instanceof ConfigurableWebServerApplicationContext) || !"management".equals(((ConfigurableWebServerApplicationContext)context).getServerNamespace())) {
this.port.compareAndSet(0, event.getWebServer().getPort());
// calls start()
this.start();
}
}
The start() method looks like this:
public void start() {
// check whether registration is enabled
if (!this.isEnabled()) {
if (logger.isDebugEnabled()) {
logger.debug("Discovery Lifecycle disabled. Not starting");
}
} else {
if (!this.running.get()) {
this.context.publishEvent(new InstancePreRegisteredEvent(this, this.getRegistration()));
// call register()
this.register();
if (this.shouldRegisterManagement()) {
this.registerManagement();
}
this.context.publishEvent(new InstanceRegisteredEvent(this, this.getConfiguration()));
this.running.compareAndSet(false, true);
}
}
}
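The two atomic guards used by bind() and start() can be looked at in isolation. The sketch below is not Spring Cloud code: StartGuardSketch and its members are hypothetical stand-ins, and registration is reduced to a counter. It only illustrates how compareAndSet keeps the first reported port and how the running flag turns repeated start() calls into no-ops.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for AbstractAutoServiceRegistration; not the real class.
class StartGuardSketch {
    private final AtomicInteger port = new AtomicInteger(0);
    private final AtomicBoolean running = new AtomicBoolean(false);
    private int registerCalls = 0;

    // Mirrors bind(): the port is only set while it is still 0, then start() runs.
    void bind(int webServerPort) {
        port.compareAndSet(0, webServerPort);
        start();
    }

    // Mirrors start(): register only when not already running.
    void start() {
        if (!running.get()) {
            registerCalls++; // stands in for register()
            running.compareAndSet(false, true);
        }
    }

    int port() {
        return port.get();
    }

    int registerCalls() {
        return registerCalls;
    }

    public static void main(String[] args) {
        StartGuardSketch sketch = new StartGuardSketch();
        sketch.bind(8080);
        sketch.bind(9090); // a second event neither changes the port nor re-registers
        System.out.println(sketch.port() + " " + sketch.registerCalls());
    }
}
```

With this shape, a second WebServerInitializedEvent would leave both the port and the registration state untouched.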
This register() call eventually reaches com.alibaba.cloud.nacos.registry.NacosServiceRegistry#register, shown below:
public void register(Registration registration) {
if (StringUtils.isEmpty(registration.getServiceId())) {
log.warn("No service to register for nacos client...");
} else {
// ... remaining code omitted
try {
// call registerInstance()
namingService.registerInstance(serviceId, group, instance);
log.info("nacos registry, {} {} {}:{} register finished", new Object[]{group, serviceId, instance.getIp(), instance.getPort()});
} catch (Exception var7) {
log.error("nacos registry, {} register failed...{},", new Object[]{serviceId, registration.toString(), var7});
ReflectionUtils.rethrowRuntimeException(var7);
}
}
}
That in turn ends up in com.alibaba.nacos.client.naming.net.NamingProxy#registerService:
public void registerService(String serviceName, String groupName, Instance instance) throws NacosException {
NAMING_LOGGER.info("[REGISTER-SERVICE] {} registering service {} with instance: {}", namespaceId, serviceName,
instance);
// assemble the request parameters
final Map<String, String> params = new HashMap<String, String>(16);
params.put(CommonParams.NAMESPACE_ID, namespaceId);
params.put(CommonParams.SERVICE_NAME, serviceName);
params.put(CommonParams.GROUP_NAME, groupName);
params.put(CommonParams.CLUSTER_NAME, instance.getClusterName());
params.put("ip", instance.getIp());
params.put("port", String.valueOf(instance.getPort()));
params.put("weight", String.valueOf(instance.getWeight()));
params.put("enable", String.valueOf(instance.isEnabled()));
params.put("healthy", String.valueOf(instance.isHealthy()));
params.put("ephemeral", String.valueOf(instance.isEphemeral()));
params.put("metadata", JacksonUtils.toJson(instance.getMetadata()));
// call the server endpoint /nacos/v1/ns/instance
reqApi(UtilAndComs.nacosUrlInstance, params, HttpMethod.POST);
}
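To make the shape of the request more concrete, here is a minimal sketch of how such a parameter map could be serialized as key=value pairs. It is an illustration only: RegisterParamsSketch is a hypothetical class, the key names namespaceId, serviceName and groupName are assumed values of the CommonParams constants, the sample values are made up, and the excerpt above does not show how reqApi actually encodes or sends the parameters.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper; it does not reproduce NamingProxy#reqApi.
class RegisterParamsSketch {
    // Joins the parameters as URL-encoded key=value pairs separated by '&'.
    static String encode(Map<String, String> params) {
        return params.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Sample values are invented for the illustration.
    static Map<String, String> sampleParams() {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("namespaceId", "public");       // assumed key name
        params.put("serviceName", "demo-service"); // assumed key name
        params.put("groupName", "DEFAULT_GROUP");  // assumed key name
        params.put("ip", "127.0.0.1");
        params.put("port", String.valueOf(8080));
        params.put("ephemeral", String.valueOf(true));
        return params;
    }

    public static void main(String[] args) {
        System.out.println(encode(sampleParams()));
    }
}
```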
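That completes the client-side code that registers the service at startup. Next, let's see how the server handles the request.
3. Server-side source code for receiving a registration
From the request path /nacos/v1/ns/instance, we can locate the server-side handler: com.alibaba.nacos.naming.controllers.InstanceController#register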
@CanDistro
@PostMapping
@Secured(parser = NamingResourceParser.class, action = ActionTypes.WRITE)
public String register(HttpServletRequest request) throws Exception {
// get the namespaceId
final String namespaceId = WebUtils
.optional(request, CommonParams.NAMESPACE_ID, Constants.DEFAULT_NAMESPACE_ID);
// get the serviceName
final String serviceName = WebUtils.required(request, CommonParams.SERVICE_NAME);
NamingUtils.checkServiceNameFormat(serviceName);
final Instance instance = HttpRequestInstanceBuilder.newBuilder()
.setDefaultInstanceEphemeral(switchDomain.isDefaultInstanceEphemeral()).setRequest(request).build();
// call registerInstance
getInstanceOperator().registerInstance(namespaceId, serviceName, instance);
return "ok";
}
getInstanceOperator() is implemented as follows:
private InstanceOperator getInstanceOperator() {
// decide whether the gRPC features are in use
return upgradeJudgement.isUseGrpcFeatures() ? instanceServiceV2 : instanceServiceV1;
}
We are not using the gRPC path here, so the call ends up in com.alibaba.nacos.naming.core.InstanceOperatorServiceImpl#registerInstance:
@Override
public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
com.alibaba.nacos.naming.core.Instance coreInstance = parseInstance(instance);
// call registerInstance
serviceManager.registerInstance(namespaceId, serviceName, coreInstance);
}
serviceManager.registerInstance is implemented as follows:
public void registerInstance(String namespaceId, String serviceName, Instance instance) throws NacosException {
// create an empty service
createEmptyService(namespaceId, serviceName, instance.isEphemeral());
// get the service
Service service = getService(namespaceId, serviceName);
checkServiceIsNull(service, namespaceId, serviceName);
// the actual registration logic
addInstance(namespaceId, serviceName, instance.isEphemeral(), instance);
}
Tracing createEmptyService(), it first tries to fetch the service from an object called serviceMap and, if the service does not exist, creates one and puts it into serviceMap. The data structure of serviceMap is:
/**
* Map(namespace, Map(group::serviceName, Service)).
* In essence, the whole registration process exists to store data in this serviceMap object
*/
private final Map<String, Map<String, Service>> serviceMap = new ConcurrentHashMap<>();
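The get-or-create behaviour described above can be sketched on a two-level map. This is an illustration under assumptions: ServiceMapSketch and its nested Service class are hypothetical, and the sketch uses computeIfAbsent, which may differ from how ServiceManager actually creates and stores a service.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified registry; not the real ServiceManager.
class ServiceMapSketch {
    // Minimal stand-in for the real Service class.
    static class Service {
        final String name;

        Service(String name) {
            this.name = name;
        }
    }

    // Map(namespace, Map(group::serviceName, Service)), as in the comment above.
    private final Map<String, Map<String, Service>> serviceMap = new ConcurrentHashMap<>();

    // Returns the existing service, or creates and stores a new empty one.
    Service getOrCreate(String namespaceId, String serviceName) {
        return serviceMap
                .computeIfAbsent(namespaceId, ns -> new ConcurrentHashMap<>())
                .computeIfAbsent(serviceName, Service::new);
    }

    // Returns null when the namespace or the service is unknown.
    Service get(String namespaceId, String serviceName) {
        Map<String, Service> services = serviceMap.get(namespaceId);
        return services == null ? null : services.get(serviceName);
    }

    public static void main(String[] args) {
        ServiceMapSketch registry = new ServiceMapSketch();
        Service first = registry.getOrCreate("public", "DEFAULT_GROUP::demo-service");
        Service second = registry.getOrCreate("public", "DEFAULT_GROUP::demo-service");
        System.out.println(first == second); // same object on repeated registration
    }
}
```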
Now, back to the implementation of the addInstance method mentioned above:
public void addInstance(String namespaceId, String serviceName, boolean ephemeral, Instance... ips)
throws NacosException {
// build a key from namespaceId, serviceName and the ephemeral flag
String key = KeyBuilder.buildInstanceListKey(namespaceId, serviceName, ephemeral);
// not sure why getService is called a second time here ...
Service service = getService(namespaceId, serviceName);
// lock on the service object so that only one thread at a time can modify this service
synchronized (service) {
List<Instance> instanceList = addIpAddresses(service, ephemeral, ips);
Instances instances = new Instances();
instances.setInstanceList(instanceList);
// the important call is here
consistencyService.put(key, instances);
}
}
Keep this key in mind; it is used again later.
This put() call goes to com.alibaba.nacos.naming.consistency.DelegateConsistencyServiceImpl#put:
@Override
public void put(String key, Record value) throws NacosException {
mapConsistencyService(key).put(key, value);
}
Instances is an implementation of the Record interface.
This put() is then dispatched to com.alibaba.nacos.naming.consistency.ephemeral.distro.DistroConsistencyServiceImpl#put:
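The delegation in put() presumably picks a concrete consistency service based on the key. The sketch below shows that idea with a hypothetical key prefix; DelegateSketch, MapBackedService and EPHEMERAL_PREFIX are invented for the illustration, and the real matching rule lives in KeyBuilder, which is not shown here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical delegate; it only illustrates routing by key.
class DelegateSketch {
    interface ConsistencyService {
        void put(String key, String value);
    }

    // Trivial in-memory implementation used for both backends.
    static class MapBackedService implements ConsistencyService {
        final Map<String, String> store = new HashMap<>();

        @Override
        public void put(String key, String value) {
            store.put(key, value);
        }
    }

    static final String EPHEMERAL_PREFIX = "ephemeral."; // hypothetical marker

    final MapBackedService ephemeralService = new MapBackedService();
    final MapBackedService persistentService = new MapBackedService();

    // Chooses the backend from the key alone.
    ConsistencyService mapConsistencyService(String key) {
        return key.startsWith(EPHEMERAL_PREFIX) ? ephemeralService : persistentService;
    }

    void put(String key, String value) {
        mapConsistencyService(key).put(key, value);
    }

    public static void main(String[] args) {
        DelegateSketch delegate = new DelegateSketch();
        delegate.put("ephemeral.public##demo-service", "instances-1");
        delegate.put("persistent.public##demo-service", "instances-2");
        System.out.println(delegate.ephemeralService.store.keySet());
        System.out.println(delegate.persistentService.store.keySet());
    }
}
```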
@Override
public void put(String key, Record value) throws NacosException {
// this is the method to focus on
onPut(key, value);
// If upgrade to 2.0.X, do not sync for v1.
if (ApplicationUtils.getBean(UpgradeJudgement.class).isUseGrpcFeatures()) {
return;
}
distroProtocol.sync(new DistroKey(key, KeyBuilder.INSTANCE_LIST_KEY_PREFIX), DataOperation.CHANGE,
DistroConfig.getInstance().getSyncDelayMillis());
}
The onPut method looks like this:
public void onPut(String key, Record value) {
if (KeyBuilder.matchEphemeralInstanceListKey(key)) {
Datum<Instances> datum = new Datum<>();
datum.value = (Instances) value;
datum.key = key;
datum.timestamp.incrementAndGet();
dataStore.put(key, datum);
}
if (!listeners.containsKey(key)) {
return;
}
// add a task to the notifier
notifier.addTask(key, DataOperation.CHANGE);
}
notifier is an instance of an inner class that implements the Runnable interface. Its addTask method is as follows:
public void addTask(String datumKey, DataOperation action) {
if (services.containsKey(datumKey) && action == DataOperation.CHANGE) {
return;
}
if (action == DataOperation.CHANGE) {
services.put(datumKey, StringUtils.EMPTY);
}
// finally, enqueue the task
tasks.offer(Pair.with(datumKey, action));
}
Here, tasks is a blocking queue:
private BlockingQueue<Pair<String, DataOperation>> tasks = new ArrayBlockingQueue<>(1024 * 1024);
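At this point, once the key built from namespaceId and serviceName has been put into the blocking queue, the registration request is essentially finished. So where does the actual registration happen? The answer lies in the Notifier and DistroConsistencyServiceImpl classes.
Looking at the source of DistroConsistencyServiceImpl, there is an init() method: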
@PostConstruct
public void init() {
GlobalExecutor.submitDistroNotifyTask(notifier);
}
This simply hands the work to a thread pool, and the object passed in is the notifier mentioned earlier, so next we only need to look at the notifier's run() method:
@Override
public void run() {
Loggers.DISTRO.info("distro notifier started");
// infinite loop
for (; ; ) {
try {
// take a Pair from the blocking queue and pass it to handle()
Pair<String, DataOperation> pair = tasks.take();
handle(pair);
} catch (Throwable e) {
Loggers.DISTRO.error("[NACOS-DISTRO] Error while handling notifying task", e);
}
}
}
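Taken together, addTask() and run() form a producer-consumer pair around a blocking queue, with a map that suppresses duplicate change notifications for the same key. Below is a simplified, self-contained sketch of that pattern. NotifierSketch is hypothetical: it queues plain keys instead of Pair objects, uses putIfAbsent instead of the containsKey/put sequence above, and removing the key from the pending map on the consumer side is an assumption of the sketch, since that part of handle() is omitted in this article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified stand-in for the Notifier inner class.
class NotifierSketch implements Runnable {
    // Keys that already have a queued change (plays the role of `services`).
    private final ConcurrentHashMap<String, String> pending = new ConcurrentHashMap<>();
    private final BlockingQueue<String> tasks = new ArrayBlockingQueue<>(1024);
    private final List<String> handled = new CopyOnWriteArrayList<>();
    private final CountDownLatch done;

    NotifierSketch(int expectedTasks) {
        this.done = new CountDownLatch(expectedTasks);
    }

    // Producer side: skip the key if a change for it is already queued.
    boolean addTask(String key) {
        if (pending.putIfAbsent(key, "") != null) {
            return false;
        }
        return tasks.offer(key);
    }

    // Consumer side: block on the queue and handle keys one at a time.
    @Override
    public void run() {
        try {
            while (true) {
                String key = tasks.take();
                pending.remove(key); // assumption: the key may be queued again afterwards
                handled.add(key);    // stands in for handle(pair)
                done.countDown();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop when interrupted
        }
    }

    // Queues three requests (one duplicate) and returns the keys actually handled.
    static List<String> demo() {
        NotifierSketch notifier = new NotifierSketch(2);
        notifier.addTask("svc-a");
        notifier.addTask("svc-a"); // dropped: already pending
        notifier.addTask("svc-b");
        Thread worker = new Thread(notifier);
        worker.setDaemon(true);
        worker.start();
        try {
            notifier.done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        worker.interrupt();
        return new ArrayList<>(notifier.handled);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The request thread only pays for an offer() on the queue, while the potentially slower listener callbacks run on the single consumer thread.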
Next, the handle() method:
private void handle(Pair<String, DataOperation> pair) {
try {
// ... some code omitted
for (RecordListener listener : listeners.get(datumKey)) {
count++;
try {
if (action == DataOperation.CHANGE) {
// this calls onChange on the Service class
listener.onChange(datumKey, dataStore.get(datumKey).value);
continue;
}
// ... some code omitted
} catch (Throwable e) {
Loggers.DISTRO.error("[NACOS-DISTRO] error while notifying listener of key: {}", datumKey, e);
}
}
// ... some code omitted
} catch (Throwable e) {
Loggers.DISTRO.error("[NACOS-DISTRO] Error while handling notifying task", e);
}
}
This calls com.alibaba.nacos.naming.core.Service#onChange:
@Override
public void onChange(String key, Instances value) throws Exception {
Loggers.SRV_LOG.info("[NACOS-RAFT] datum is changed, key: {}, value: {}", key, value);
for (Instance instance : value.getInstanceList()) {
if (instance == null) {
// Reject this abnormal instance list:
throw new RuntimeException("got null instance " + key);
}
// clamp the weight into the allowed range
if (instance.getWeight() > 10000.0D) {
instance.setWeight(10000.0D);
}
if (instance.getWeight() < 0.01D && instance.getWeight() > 0.0D) {
instance.setWeight(0.01D);
}
}
// the important call
updateIPs(value.getInstanceList(), KeyBuilder.matchEphemeralInstanceListKey(key));
recalculateChecksum();
}
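The weight handling in onChange boils down to a small clamping rule, which is easier to see as a pure function. WeightClampSketch and its constant names are hypothetical; the bounds 10000.0 and 0.01 are the ones used in the code above.

```java
// Hypothetical helper that isolates the weight rule from onChange.
class WeightClampSketch {
    static final double MAX_WEIGHT = 10000.0D;
    static final double MIN_POSITIVE_WEIGHT = 0.01D;

    // Caps large weights and lifts tiny positive weights; 0 and negatives pass through.
    static double clamp(double weight) {
        if (weight > MAX_WEIGHT) {
            return MAX_WEIGHT;
        }
        if (weight < MIN_POSITIVE_WEIGHT && weight > 0.0D) {
            return MIN_POSITIVE_WEIGHT;
        }
        return weight;
    }

    public static void main(String[] args) {
        System.out.println(clamp(20000.0D));
        System.out.println(clamp(0.001D));
        System.out.println(clamp(0.0D));
    }
}
```

Note that a weight of exactly 0 is left alone by this rule, as are negative values, because only the open interval between 0 and 0.01 is raised.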
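The updateIPs method is fairly easy to follow: put simply, it stores the registered instances in an object called clusterMap.
public void updateIPs(Collection<Instance> instances, boolean ephemeral) {
Map<String, List<Instance>> ipMap = new HashMap<>(clusterMap.size());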
for (String clusterName : clusterMap.keySet()) {
ipMap.put(clusterName, new ArrayList<>());
}
for (Instance instance : instances) {
try {
if (instance == null) {
Loggers.SRV_LOG.error("[NACOS-DOM] received malformed ip: null");
continue;
}
if (StringUtils.isEmpty(instance.getClusterName())) {
instance.setClusterName(UtilsAndCommons.DEFAULT_CLUSTER_NAME);
}
// if the cluster does not exist yet, create it
if (!clusterMap.containsKey(instance.getClusterName())) {
Loggers.SRV_LOG
.warn("cluster: {} not found, ip: {}, will create new cluster with default configuration.",
instance.getClusterName(), instance.toJson());
Cluster cluster = new Cluster(instance.getClusterName(), this);
cluster.init();
getClusterMap().put(instance.getClusterName(), cluster);
}
List<Instance> clusterIPs = ipMap.get(instance.getClusterName());
if (clusterIPs == null) {
clusterIPs = new LinkedList<>();
ipMap.put(instance.getClusterName(), clusterIPs);
}
clusterIPs.add(instance);
} catch (Exception e) {
Loggers.SRV_LOG.error("[NACOS-DOM] failed to process ip: " + instance, e);
}
}
for (Map.Entry<String, List<Instance>> entry : ipMap.entrySet()) {
//make every ip mine
List<Instance> entryIPs = entry.getValue();
// this in turn calls Cluster#updateIps
clusterMap.get(entry.getKey()).updateIps(entryIPs, ephemeral);
}
setLastModifiedMillis(System.currentTimeMillis());
getPushService().serviceChanged(this);
ApplicationUtils.getBean(DoubleWriteEventListener.class).doubleWriteToV2(this, ephemeral);
StringBuilder stringBuilder = new StringBuilder();
for (Instance instance : allIPs()) {
stringBuilder.append(instance.toIpAddr()).append('_').append(instance.isHealthy()).append(',');
}
Loggers.EVT_LOG.info("[IP-UPDATED] namespace: {}, service: {}, ips: {}", getNamespaceId(), getName(),
stringBuilder.toString());
}
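The first half of updateIPs is a group-by on the cluster name, seeded with every known cluster. The sketch below isolates that step. GroupByClusterSketch and its nested Instance class are hypothetical, and "DEFAULT" is an assumed value for UtilsAndCommons.DEFAULT_CLUSTER_NAME. Seeding the map with empty lists matters: a cluster that receives no instances in the new list still gets an update with an empty list.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified version of the grouping step in updateIPs.
class GroupByClusterSketch {
    static class Instance {
        final String ip;
        String clusterName;

        Instance(String ip, String clusterName) {
            this.ip = ip;
            this.clusterName = clusterName;
        }
    }

    static final String DEFAULT_CLUSTER_NAME = "DEFAULT"; // assumed default name

    static Map<String, List<Instance>> group(Collection<Instance> instances, Set<String> knownClusters) {
        Map<String, List<Instance>> ipMap = new HashMap<>();
        // Every known cluster starts with an empty list.
        for (String clusterName : knownClusters) {
            ipMap.put(clusterName, new ArrayList<>());
        }
        for (Instance instance : instances) {
            if (instance == null) {
                continue; // malformed entry, skipped as in the original
            }
            if (instance.clusterName == null || instance.clusterName.isEmpty()) {
                instance.clusterName = DEFAULT_CLUSTER_NAME;
            }
            ipMap.computeIfAbsent(instance.clusterName, k -> new ArrayList<>()).add(instance);
        }
        return ipMap;
    }

    public static void main(String[] args) {
        List<Instance> instances = new ArrayList<>();
        instances.add(new Instance("1.1.1.1", "A"));
        instances.add(new Instance("2.2.2.2", ""));
        Map<String, List<Instance>> grouped = group(instances, Set.of("A", "B"));
        grouped.forEach((cluster, list) -> System.out.println(cluster + " -> " + list.size()));
    }
}
```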
Now let's look at the com.alibaba.nacos.naming.core.Cluster#updateIps method:
public void updateIps(List<Instance> ips, boolean ephemeral) {
Set<Instance> toUpdateInstances = ephemeral ? ephemeralInstances : persistentInstances;
HashMap<String, Instance> oldIpMap = new HashMap<>(toUpdateInstances.size());
for (Instance ip : toUpdateInstances) {
oldIpMap.put(ip.getDatumKey(), ip);
}
// collect the instances that have been updated
List<Instance> updatedIps = updatedIps(ips, oldIpMap.values());
if (updatedIps.size() > 0) {
for (Instance ip : updatedIps) {
Instance oldIP = oldIpMap.get(ip.getDatumKey());
// do not update the ip validation status of updated ips
// because the checker has the most precise result
// Only when ip is not marked, don't we update the health status of IP:
if (!ip.isMarked()) {
ip.setHealthy(oldIP.isHealthy());
}
if (ip.isHealthy() != oldIP.isHealthy()) {
// ip validation status updated
Loggers.EVT_LOG.info("{} {SYNC} IP-{} {}:{}@{}", getService().getName(),
(ip.isHealthy() ? "ENABLED" : "DISABLED"), ip.getIp(), ip.getPort(), getName());
}
if (ip.getWeight() != oldIP.getWeight()) {
// ip validation status updated
Loggers.EVT_LOG.info("{} {SYNC} {IP-UPDATED} {}->{}", getService().getName(), oldIP, ip);
}
}
}
// collect the newly added instances
List<Instance> newIPs = subtract(ips, oldIpMap.values());
if (newIPs.size() > 0) {
Loggers.EVT_LOG
.info("{} {SYNC} {IP-NEW} cluster: {}, new ips size: {}, content: {}", getService().getName(),
getName(), newIPs.size(), newIPs);
for (Instance ip : newIPs) {
// reset the health-check status of each new instance
HealthCheckStatus.reset(ip);
}
}
// collect the instances that are no longer present
List<Instance> deadIPs = subtract(oldIpMap.values(), ips);
if (deadIPs.size() > 0) {
Loggers.EVT_LOG
.info("{} {SYNC} {IP-DEAD} cluster: {}, dead ips size: {}, content: {}", getService().getName(),
getName(), deadIPs.size(), deadIPs);
for (Instance ip : deadIPs) {
// remove the health-check status of instances that are gone
HealthCheckStatus.remv(ip);
}
}
toUpdateInstances = new HashSet<>(ips);
// swap the final result in for the existing set; the idea is similar to copy-on-write (COW), so reads are not disturbed by the write
if (ephemeral) {
ephemeralInstances = toUpdateInstances;
} else {
persistentInstances = toUpdateInstances;
}
}
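The core of updateIps is a diff between the old and the new instance list followed by a reference swap. Here is a minimal sketch of that idea. ClusterSwapSketch is hypothetical: instances are plain strings instead of Instance objects, subtract is a simple removeAll rather than the real helper, and the volatile keyword is a choice made for this sketch, not something shown in the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified version of the diff-and-swap step in Cluster#updateIps.
class ClusterSwapSketch {
    // Readers always see either the old set or the new set, never a half-built one.
    private volatile Set<String> instances = new HashSet<>();

    // Elements of a that are not in b.
    static List<String> subtract(Collection<String> a, Collection<String> b) {
        List<String> result = new ArrayList<>(a);
        result.removeAll(b);
        return result;
    }

    // Returns [newIps, deadIps] and replaces the current set with a fresh copy.
    List<List<String>> update(List<String> ips) {
        Set<String> old = instances;
        List<String> newIps = subtract(ips, old);
        List<String> deadIps = subtract(old, ips);
        instances = new HashSet<>(ips); // build a new set, then swap the reference
        return Arrays.asList(newIps, deadIps);
    }

    Set<String> snapshot() {
        return instances;
    }

    public static void main(String[] args) {
        ClusterSwapSketch cluster = new ClusterSwapSketch();
        System.out.println(cluster.update(Arrays.asList("a", "b")));
        System.out.println(cluster.update(Arrays.asList("b", "c")));
    }
}
```

Because the new set is fully built before the reference is replaced, a reader holding the old reference can keep iterating over it safely while the update is in progress.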
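4. Summary
The service registration source code is relatively easy to follow. This write-up is far from perfect, so if you spot any mistakes, corrections are very welcome. Thanks!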