Marco's Java [Eureka Series: Eureka Cluster Setup and Source Code Analysis]

Introduction to Spring Cloud Eureka

When service registries come up, ZooKeeper is usually the first thing that comes to mind. Today's subject, Eureka, is a service registry just like ZooKeeper.

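Eureka comes from an ancient Greek word meaning "I have found it". In software, Eureka is a service registration and discovery component open-sourced by Netflix, the online video company. Together with Netflix's other service components (load balancing, circuit breaker, gateway, and so on), it was integrated by the Spring Cloud community into the Spring Cloud Netflix module. Eureka is one of the many frameworks Netflix has contributed to Spring Cloud.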


Differences between Spring Cloud Eureka and ZooKeeper

We compare Eureka and ZooKeeper mainly in terms of the CAP properties of distributed systems.

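Consistency: data consistency (nodes A, B and C all hold the same data)
ZooKeeper emphasizes data consistency; Eureka does not.
Availability: service availability
In ZooKeeper, if the leader goes down, the cluster as a whole stops serving requests until a new leader has been elected (30-120 s); no service is provided during the election. Eureka emphasizes availability: as long as a single node of the Eureka cluster is alive, it can keep serving requests.
Partition tolerance: tolerance of network partitions (within a cluster, network or data-center problems may prevent data from being synchronized between nodes). This is a property every distributed product has to provide.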

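There are other differences as well. ZooKeeper is more general-purpose: it can serve as the registry for Dubbo to manage distributed services, as a configuration center that centrally manages the configuration files of a Solr cluster, as the manager of an ActiveMQ cluster by relying on the removal of ephemeral nodes, and as the basis for distributed locks. Eureka, as a Spring Cloud component, focuses on being a service registry: like a net, it ties the components of the cloud together so that each can do its job. Eureka also provides self-preservation and monitoring of its own cluster, and combines them with caching to achieve high availability.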

In short, ZooKeeper favors data consistency (CP), while Eureka favors service availability (AP).


How Eureka achieves high availability

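If an Eureka client finds that the connection fails while registering with one Eureka node, it automatically switches to another node. As long as one Eureka node is still up, the registry remains usable (availability is guaranteed), although the information returned may not be the latest (strong consistency is not guaranteed). In addition, Eureka has a self-preservation mechanism: if fewer than 85% of the expected heartbeats arrive within a 15-minute window, Eureka assumes that a network failure has occurred between the clients and the registry, and the following happens:
1) Eureka no longer removes services from the registry that should have expired because no heartbeat was received for a long time.
2) Eureka still accepts registration and query requests for new services, but they are not synchronized to the other nodes (so the current node stays available).
3) Once the network is stable again, the new registrations held by the current node are synchronized to the other nodes.

Eureka can therefore cope well with network failures that cut off some of the nodes, without the whole registry going down the way it can with ZooKeeper.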
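The threshold behind self-preservation can be sketched as follows. This is a simplified illustration, not Eureka's actual class: SelfPreservationSketch and its method names are hypothetical, and the 30 s heartbeat interval and 0.85 ratio are the defaults quoted elsewhere in this article.

```java
// Simplified sketch of the self-preservation check; all names are illustrative.
class SelfPreservationSketch {

    /** Renewals expected per minute, scaled by the threshold ratio. */
    static int renewsPerMinThreshold(int expectedClients,
                                     int renewalIntervalSeconds,
                                     double renewalPercentThreshold) {
        double renewsPerClientPerMinute = 60.0 / renewalIntervalSeconds;
        return (int) (expectedClients * renewsPerClientPerMinute * renewalPercentThreshold);
    }

    /** Expired leases may only be evicted while renewals stay above the threshold. */
    static boolean isLeaseExpirationEnabled(int renewsLastMinute, int threshold) {
        return renewsLastMinute > threshold;
    }

    public static void main(String[] args) {
        // 10 clients, one heartbeat every 30 s, 85% threshold.
        int threshold = renewsPerMinThreshold(10, 30, 0.85);
        System.out.println("threshold = " + threshold);
        System.out.println("20 renews/min, eviction allowed: " + isLeaseExpirationEnabled(20, threshold));
        System.out.println("12 renews/min, eviction allowed: " + isLeaseExpirationEnabled(12, threshold));
    }
}
```

With 10 clients the server expects 20 heartbeats per minute; once the observed count is no longer above 17, eviction is suspended and the registry is kept as it is.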


Other registry options for Spring Cloud

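Eureka has changed its license: earlier versions were released under Apache 2.0, and later versions may use a different license.
Spring Cloud supports other registries as well, such as Consul and Alibaba's Nacos, so there are plenty of options.
For learning purposes we still choose Eureka, because it is comparatively mature.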


Eureka configuration

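Eureka Server configuration

Importing the eureka-server jar

<dependency>
	<groupId>org.springframework.cloud</groupId>
	<artifactId>spring-cloud-starter-netflix-eureka-server</artifactId>
</dependency>

application.yml configuration

server:
  port: 8761
spring:
  application:
    name: eureka-server
eureka:
  client:
    # With false, this server neither registers itself nor fetches the registry.
    # Leave both at their default (true) if the servers should register with each other.
    register-with-eureka: false
    fetch-registry: false
    serviceUrl:
      defaultZone: http://peer1:8761/eureka/,http://peer2:8762/eureka/,http://peer3:8763/eureka/

A note on why Eureka Server itself needs eureka.client.serviceUrl: Eureka Server not only lets others register with it, it can also register itself with other servers.
According to the source code, Eureka Server also registers itself on startup. This mechanism of mutual registration and mutual discovery makes it easy to build a server-side cluster.

Supplementary settings

eureka:
  server:
    # Response cache settings (so that eureka-server can respond to clients quickly)
    response-cache-update-interval-ms: 3000
    responseCacheAutoExpirationInSeconds: 180
    # Periodically remove instances that are no longer needed (based on lastUpdateTime)
    evictionIntervalTimerInMs: 3000
    # Server self-preservation mode
    enableSelfPreservation: true # Can be set to false for local debugging, but should stay enabled in production to avoid evicting services by mistake when the network is unstable.
    renewalPercentThreshold: 0.85 # Default 85%

Eureka Client configuration

Importing the eureka-client jar

<dependency>
	<groupId>org.springframework.cloud</groupId>
	<artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
</dependency>

application.yml configuration

server:
  port: 8080
spring:
  application:
    name: eureka-client-a
eureka:
  client:
    serviceUrl:
      defaultZone: http://peer1:8761/eureka/,http://peer2:8762/eureka/,http://peer3:8763/eureka/
  instance:
    instance-id: ${spring.application.name}:${server.port}
    prefer-ip-address: true # Default false; should always be set to true. In container-based deployments such as Docker, the container gets a random hostname that does not exist in DNS and cannot be resolved. (John Carnell)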

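Once the Eureka Servers have registered with each other pairwise, the client registers against all servers of the cluster, which closes the loop of a highly available cluster.
When everything is running, the three Eureka Servers form a cluster. After a client registers with any one server, the other servers replicate that data, so the data is shared.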
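How a client might work through the comma-separated defaultZone list can be sketched as follows. This is only an illustration under simplifying assumptions: FailoverSketch, parseServiceUrls and firstReachable are hypothetical names, the tryCall predicate stands in for a real HTTP call, and the retry logic of the real Eureka client is more elaborate.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch of client-side failover across several Eureka servers.
class FailoverSketch {

    /** Splits a defaultZone value into individual server URLs. */
    static List<String> parseServiceUrls(String defaultZone) {
        return Arrays.stream(defaultZone.split(","))
                .map(String::trim)
                .filter(url -> !url.isEmpty())
                .collect(Collectors.toList());
    }

    /** Tries each server in order and returns the first one for which the call succeeds. */
    static Optional<String> firstReachable(List<String> serviceUrls, Predicate<String> tryCall) {
        for (String url : serviceUrls) {
            try {
                if (tryCall.test(url)) {
                    return Optional.of(url);
                }
            } catch (RuntimeException e) {
                // Treat an exception as a connection failure and move on to the next server.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String defaultZone = "http://peer1:8761/eureka/,http://peer2:8762/eureka/,http://peer3:8763/eureka/";
        List<String> urls = parseServiceUrls(defaultZone);
        // Pretend that peer1 is down and every other server answers.
        Optional<String> chosen = firstReachable(urls, url -> !url.contains("peer1"));
        System.out.println(chosen.orElse("no server reachable")); // prints http://peer2:8762/eureka/
    }
}
```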
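Supplementary settings

eureka:
  instance:
    lease-renewal-interval-in-seconds: 30 # Every 30 s the client sends a renewal request to the server to signal that it is still alive
    lease-expiration-duration-in-seconds: 90 # How long the server waits without a request from the client before removing it
    instanceId: ${spring.cloud.client.hostname}:${spring.application.name}:${spring.application.instance_id:${server.port}}
  # See EurekaClientConfigBean (implements EurekaClientConfig)
  client:
    # Whether the Eureka client is enabled. Default true
    enabled: true # For local debugging without Eureka, set this to false instead of commenting out @EnableDiscoveryClient.
    registerWithEureka: true # Default true, so it can be omitted.
    fetchRegistry: true # Default true, so it can be omitted here.
    registry-fetch-interval-seconds: 30 # Shorten this if instances evicted by eureka-server should disappear from the client sooner.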

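Understanding Eureka's concepts

Service registration
As soon as an application starts, it sends its metadata (IP, port, health-monitoring data) to eureka-server, which stores this metadata internally.
Service renewal
Once the client has started and registered with eureka-server, it keeps reporting to eureka-server at regular intervals to signal that it is still alive.
Service cancellation (going offline)
When the application shuts down, it notifies eureka-server that it is going offline.
Service eviction
When a service has not reported its status to eureka-server for longer than a certain period, eureka-server considers it dead and evicts it.

Next, we analyze the Eureka source code around these concepts.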


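Eureka source code analysis

Eureka Server exposes RESTful services (HTTP, a specific request method, and a specific URL). Using these services is all it takes to register and discover services in a project. Because HTTP is cross-platform, a client written in any language can implement registration and discovery itself, as long as it can issue HTTP requests.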
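The three requests that the rest of this article traces through the source (register, renew, cancel) can be sketched with the JDK's own HTTP client (Java 11+). This is a hedged illustration: EurekaRestSketch and BASE_URL are hypothetical, the JSON body is a placeholder rather than a complete instance payload, and the requests are only built, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: builds (but does not send) the REST requests discussed in this article.
class EurekaRestSketch {

    // Hypothetical server address; replace it with a real Eureka Server URL.
    static final String BASE_URL = "http://localhost:8761/eureka/";

    /** Registration: POST apps/{appName} with the instance info as JSON. */
    static HttpRequest register(String appName, String instanceJson) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "apps/" + appName))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(instanceJson))
                .build();
    }

    /** Renewal (heartbeat): PUT apps/{appName}/{instanceId}. */
    static HttpRequest heartbeat(String appName, String instanceId) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "apps/" + appName + "/" + instanceId))
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    /** Going offline: DELETE apps/{appName}/{instanceId}. */
    static HttpRequest cancel(String appName, String instanceId) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "apps/" + appName + "/" + instanceId))
                .DELETE()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder payload, not a complete InstanceInfo document.
        HttpRequest reg = register("ORDER-SERVICE", "{\"instance\":{\"app\":\"ORDER-SERVICE\"}}");
        HttpRequest beat = heartbeat("ORDER-SERVICE", "order-1");
        HttpRequest bye = cancel("ORDER-SERVICE", "order-1");
        System.out.println(reg.method() + " " + reg.uri());
        System.out.println(beat.method() + " " + beat.uri());
        System.out.println(bye.method() + " " + bye.uri());
    }
}
```

Sending any of these would only need HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) against a running server.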

Service registration: source code analysis

First, locate the DiscoveryClient class in the eureka-client-1.9.13.jar sources.
DiscoveryClient has a registration method, register().
instanceInfo holds the instance information. Under the hood, the register() method of AbstractJerseyEurekaHttpClient issues a POST request and the response is returned as httpResponse. A response status of NO_CONTENT (204) means the registration succeeded.

/**
 * Register with the eureka service by making the appropriate REST call.
 */
boolean register() throws Throwable {
    logger.info(PREFIX + "{}: registering service...", appPathIdentifier);
    EurekaHttpResponse<Void> httpResponse;
    try {
        // Register the instance info; this sends an HTTP registration request to the server
        httpResponse = eurekaTransport.registrationClient.register(instanceInfo);
    } catch (Exception e) {
        logger.warn(PREFIX + "{} - registration failed {}", appPathIdentifier, e.getMessage(), e);
        throw e;
    }
    if (logger.isInfoEnabled()) {
        logger.info(PREFIX + "{} - registration status: {}", appPathIdentifier, httpResponse.getStatusCode());
    }
    return httpResponse.getStatusCode() == Status.NO_CONTENT.getStatusCode();
}

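The call first passes through EurekaHttpClientDecorator and its register(final InstanceInfo info) method. That method is only a stepping stone: it executes delegate.register(info), where delegate is the wrapped EurekaHttpClient, and the call ultimately ends up in the register method of AbstractJerseyEurekaHttpClient.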

@Override
  public EurekaHttpResponse<Void> register(final InstanceInfo info) {
      return execute(new RequestExecutor<Void>() {
          @Override
          public EurekaHttpResponse<Void> execute(EurekaHttpClient delegate) {
              return delegate.register(info);
          }

          @Override
          public RequestType getRequestType() {
              return RequestType.Register;
          }
      });
  }

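Following on into the register method of AbstractJerseyEurekaHttpClient, the structure of urlPath shows that this is a RESTful URL. Eureka uses the builder pattern to construct a Builder for the HTTP request, and this is where the POST request is actually issued.
The server then returns a response object; in Eureka, status code 204 means the registration succeeded.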

@Override
public EurekaHttpResponse<Void> register(InstanceInfo info) {
    String urlPath = "apps/" + info.getAppName();
    ClientResponse response = null;
    try {
        Builder resourceBuilder = jerseyClient.resource(serviceUrl).path(urlPath).getRequestBuilder();
        addExtraHeaders(resourceBuilder);
        // The HTTP POST request is actually issued here
        response = resourceBuilder
                .header("Accept-Encoding", "gzip")
                .type(MediaType.APPLICATION_JSON_TYPE)
                .accept(MediaType.APPLICATION_JSON)
                .post(ClientResponse.class, info);
        return anEurekaHttpResponse(response.getStatus()).headers(headersOf(response)).build();
    } finally {
        if (logger.isDebugEnabled()) {
            logger.debug("Jersey HTTP POST {}/{} with instance {}; statusCode={}", serviceUrl, urlPath, info.getId(),
                    response == null ? "N/A" : response.getStatus());
        }
        if (response != null) {
            response.close();
        }
    }
}

Next, look at the server-side InstanceRegistry class, which is responsible for processing the requests sent by clients and producing the result.

@Override
public void register(InstanceInfo info, int leaseDuration, boolean isReplication) {
	handleRegistration(info, leaseDuration, isReplication);
	super.register(info, leaseDuration, isReplication);
}

@Override
public void register(final InstanceInfo info, final boolean isReplication) {
	handleRegistration(info, resolveInstanceLeaseDuration(info), isReplication);
	super.register(info, isReplication);
}

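register(info, leaseDuration, isReplication) stores the instance in the local registry, while replicateToPeers() propagates the instance's info to the other nodes of the cluster. The server keeps the registered services in a two-level map, ConcurrentHashMap<String, Map<String, Lease<InstanceInfo>>>: the first String is the service name, the second String is the instanceId, and the InstanceInfo contains the instance's ip:port and status. (HashMap is not thread-safe, so a ConcurrentMap is used to guarantee thread safety.)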

/**
 * Registers a new instance with a given duration.
 *
 * @see com.netflix.eureka.lease.LeaseManager#register(java.lang.Object, int, boolean)
 */
public void register(InstanceInfo registrant, int leaseDuration, boolean isReplication) {
    try {
        read.lock();
        // Look up the instance map for this service name in the registry
        Map<String, Lease<InstanceInfo>> gMap = registry.get(registrant.getAppName());
        REGISTER.increment(isReplication);
        // If the instance map is null (first registration for this service)
        if (gMap == null) {
            // Create a new instance map to hold the instance information
            final ConcurrentHashMap<String, Lease<InstanceInfo>> gNewMap = new ConcurrentHashMap<String, Lease<InstanceInfo>>();
            // Put the new gNewMap into the registry
            gMap = registry.putIfAbsent(registrant.getAppName(), gNewMap);
            if (gMap == null) {
                gMap = gNewMap;
            }
        }
        // gMap holds the instances of this service; registrant.getId() is the instance id
        Lease<InstanceInfo> existingLease = gMap.get(registrant.getId());
        // Retain the last dirty timestamp without overwriting it, if there is already a lease
        // A non-null existingLease means the instance is already registered
        if (existingLease != null && (existingLease.getHolder() != null)) {
            Long existingLastDirtyTimestamp = existingLease.getHolder().getLastDirtyTimestamp();
            Long registrationLastDirtyTimestamp = registrant.getLastDirtyTimestamp();
            logger.debug("Existing lease found (existing={}, provided={}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);

            // this is a > instead of a >= because if the timestamps are equal, we still take the remote transmitted
            // InstanceInfo instead of the server local copy.
            if (existingLastDirtyTimestamp > registrationLastDirtyTimestamp) {
                logger.warn("There is an existing lease and the existing lease's dirty timestamp {} is greater" +
                        " than the one that is being registered {}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);
                logger.warn("Using the existing instanceInfo instead of the new instanceInfo as the registrant");
                registrant = existingLease.getHolder();
            }
        } else {
            // The lease does not exist and hence it is a new registration
            synchronized (lock) {
                if (this.expectedNumberOfClientsSendingRenews > 0) {
                    // Since the client wants to register it, increase the number of clients sending renews
                    this.expectedNumberOfClientsSendingRenews = this.expectedNumberOfClientsSendingRenews + 1;
                    updateRenewsPerMinThreshold();
                }
            }
            logger.debug("No previous lease information found; it is new registration");
        }
        // Create the lease and put the instance into gMap (a new registration, or a replacement of the existing lease)
        Lease<InstanceInfo> lease = new Lease<InstanceInfo>(registrant, leaseDuration);
        if (existingLease != null) {
            lease.setServiceUpTimestamp(existingLease.getServiceUpTimestamp());
        }
        gMap.put(registrant.getId(), lease);
        synchronized (recentRegisteredQueue) {
            recentRegisteredQueue.add(new Pair<Long, String>(
                    System.currentTimeMillis(),
                    registrant.getAppName() + "(" + registrant.getId() + ")"));
        }
        // This is where the initial state transfer of overridden status happens
        if (!InstanceStatus.UNKNOWN.equals(registrant.getOverriddenStatus())) {
            logger.debug("Found overridden status {} for instance {}. Checking to see if needs to be add to the "
                            + "overrides", registrant.getOverriddenStatus(), registrant.getId());
            if (!overriddenInstanceStatusMap.containsKey(registrant.getId())) {
                logger.info("Not found overridden id {} and hence adding it", registrant.getId());
                overriddenInstanceStatusMap.put(registrant.getId(), registrant.getOverriddenStatus());
            }
        }
        InstanceStatus overriddenStatusFromMap = overriddenInstanceStatusMap.get(registrant.getId());
        if (overriddenStatusFromMap != null) {
            logger.info("Storing overridden status {} from map", overriddenStatusFromMap);
            registrant.setOverriddenStatus(overriddenStatusFromMap);
        }

        // Set the status based on the overridden status rules
        InstanceStatus overriddenInstanceStatus = getOverriddenInstanceStatus(registrant, existingLease, isReplication);
        registrant.setStatusWithoutDirty(overriddenInstanceStatus);

        // If the lease is registered with UP status, set lease service up timestamp
        if (InstanceStatus.UP.equals(registrant.getStatus())) {
            lease.serviceUp();
        }
        registrant.setActionType(ActionType.ADDED);
        recentlyChangedQueue.add(new RecentlyChangedItem(lease));
        registrant.setLastUpdatedTimestamp();
        invalidateCache(registrant.getAppName(), registrant.getVIPAddress(), registrant.getSecureVipAddress());
        logger.info("Registered instance {}/{} with status {} (replication={})",
                registrant.getAppName(), registrant.getId(), registrant.getStatus(), isReplication);
    } finally {
        read.unlock();
    }
}
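
Summary of service registration

1) Service registration is initiated by the register method of DiscoveryClient (eureka-client).
2) The Eureka client ultimately uses the register method of AbstractJerseyEurekaHttpClient, which sends an HTTP POST request to the server to ask for registration.
3) On the Eureka Server, the register method of InstanceRegistry performs the actual registration of the client; the client information is kept in a ConcurrentHashMap.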
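The two-level storage summarized above can be sketched as follows. It is a minimal illustration, assuming plain String addresses in place of Lease<InstanceInfo>; RegistrySketch and its methods are hypothetical names, and only the get / putIfAbsent pattern mirrors the register method shown earlier.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a two-level registry: appName -> (instanceId -> address).
class RegistrySketch {

    private final ConcurrentHashMap<String, Map<String, String>> registry = new ConcurrentHashMap<>();

    /** Registers an instance, creating the inner map on first registration of a service. */
    void register(String appName, String instanceId, String address) {
        Map<String, String> instances = registry.get(appName);
        if (instances == null) {
            Map<String, String> fresh = new ConcurrentHashMap<>();
            // putIfAbsent returns the existing map if another thread created it first.
            instances = registry.putIfAbsent(appName, fresh);
            if (instances == null) {
                instances = fresh;
            }
        }
        instances.put(instanceId, address);
    }

    /** Returns the instances of a service, or an empty map if the service is unknown. */
    Map<String, String> instancesOf(String appName) {
        return registry.getOrDefault(appName, Map.of());
    }

    public static void main(String[] args) {
        RegistrySketch registry = new RegistrySketch();
        registry.register("ORDER-SERVICE", "order-1", "10.0.0.1:8080");
        registry.register("ORDER-SERVICE", "order-2", "10.0.0.2:8080");
        System.out.println(registry.instancesOf("ORDER-SERVICE").size()); // prints 2
        System.out.println(registry.instancesOf("UNKNOWN").isEmpty());    // prints true
    }
}
```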


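Service renewal

The Eureka client sends renewal requests (renew) to the Eureka Server. The entry point is again DiscoveryClient: a scheduled task sends a heartbeat to the server every 30 s to renew the lease.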

/**
 * Renew with the eureka service by making the appropriate REST call
 */
boolean renew() {
    EurekaHttpResponse<InstanceInfo> httpResponse;
    try {
        // Send a heartbeat to the server, including the client's basic information
        httpResponse = eurekaTransport.registrationClient.sendHeartBeat(instanceInfo.getAppName(), instanceInfo.getId(), instanceInfo, null);
        logger.debug(PREFIX + "{} - Heartbeat status: {}", appPathIdentifier, httpResponse.getStatusCode());
        if (httpResponse.getStatusCode() == Status.NOT_FOUND.getStatusCode()) {
            REREGISTER_COUNTER.increment();
            logger.info(PREFIX + "{} - Re-registering apps/{}", appPathIdentifier, instanceInfo.getAppName());
            long timestamp = instanceInfo.setIsDirtyWithTime();
            boolean success = register();
            if (success) {
                instanceInfo.unsetIsDirty(timestamp);
            }
            return success;
        }
        return httpResponse.getStatusCode() == Status.OK.getStatusCode();
    } catch (Throwable e) {
        logger.error(PREFIX + "{} - was unable to send heartbeat!", appPathIdentifier, e);
        return false;
    }
}

The heartbeat code is shown below; it sends the information of the client to be renewed to the server over HTTP.

@Override
public EurekaHttpResponse<InstanceInfo> sendHeartBeat(String appName, String id, InstanceInfo info, InstanceStatus overriddenStatus) {
    String urlPath = "apps/" + appName + '/' + id;
    ClientResponse response = null;
    try {
        WebResource webResource = jerseyClient.resource(serviceUrl)
                .path(urlPath)
                .queryParam("status", info.getStatus().toString())
                .queryParam("lastDirtyTimestamp", info.getLastDirtyTimestamp().toString());
        if (overriddenStatus != null) {
            webResource = webResource.queryParam("overriddenstatus", overriddenStatus.name());
        }
        Builder requestBuilder = webResource.getRequestBuilder();
        addExtraHeaders(requestBuilder);
        response = requestBuilder.put(ClientResponse.class);
        EurekaHttpResponseBuilder<InstanceInfo> eurekaResponseBuilder = anEurekaHttpResponse(response.getStatus(), InstanceInfo.class).headers(headersOf(response));
        if (response.hasEntity()) {
            eurekaResponseBuilder.entity(response.getEntity(InstanceInfo.class));
        }
        return eurekaResponseBuilder.build();
    } finally {
        if (logger.isDebugEnabled()) {
            logger.debug("Jersey HTTP PUT {}/{}; statusCode={}", serviceUrl, urlPath, response == null ? "N/A" : response.getStatus());
        }
        if (response != null) {
            response.close();
        }
    }
}

The server side has a renew() method as well, and the entry point is again InstanceRegistry.
This renew method calls the one in InstanceRegistry's parent class, PeerAwareInstanceRegistryImpl, whose renew in turn calls the renew method of AbstractInstanceRegistry.

public boolean renew(String appName, String id, boolean isReplication) {
	RENEW.increment(isReplication);
	// Get the instance map for this service
	Map<String, Lease<InstanceInfo>> gMap = registry.get(appName);
	Lease<InstanceInfo> leaseToRenew = null;
	if (gMap != null) {
		// Look up the lease to renew by instance id
		leaseToRenew = gMap.get(id);
	}
	if (leaseToRenew == null) {
		// No such service instance was found
		RENEW_NOT_FOUND.increment(isReplication);
		logger.warn("DS: Registry: lease doesn't exist, registering resource: {} - {}", appName, id);
		return false;
	} else {
		// The instance whose lease is being renewed
		InstanceInfo instanceInfo = leaseToRenew.getHolder();
		if (instanceInfo != null) {
			// touchASGCache(instanceInfo.getASGName());
			InstanceStatus overriddenInstanceStatus = this.getOverriddenInstanceStatus(instanceInfo, leaseToRenew,
					isReplication);
			if (overriddenInstanceStatus == InstanceStatus.UNKNOWN) {
				logger.info("Instance status UNKNOWN possibly due to deleted override for instance {}"
						+ "; re-register required", instanceInfo.getId());
				RENEW_NOT_FOUND.increment(isReplication);
				return false;
			}
			if (!instanceInfo.getStatus().equals(overriddenInstanceStatus)) {
				logger.info(
						"The instance status {} is different from overridden instance status {} for instance {}. "
								+ "Hence setting the status to overridden status",
						instanceInfo.getStatus().name(), instanceInfo.getOverriddenStatus().name(),
						instanceInfo.getId());
				instanceInfo.setStatusWithoutDirty(overriddenInstanceStatus);

			}
		}
		renewsLastMin.increment();
		// Normal case: renew the lease
		leaseToRenew.renew();
		return true;
	}
}

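In essence, renewing a lease just updates its last-update timestamp.
duration is the server's "tolerance" period: after a client misses its 30 s heartbeat, the server does not evict it right away. Because renew() already adds duration to lastUpdateTimestamp and isExpired() adds duration once more, a renewed lease is only treated as expired about 2 × duration after the last heartbeat.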

/**
 * Renew the lease, use renewal duration if it was specified by the
 * associated {@link T} during registration, otherwise default duration is
 * {@link #DEFAULT_DURATION_IN_SECS}.
 */
public void renew() {
    lastUpdateTimestamp = System.currentTimeMillis() + duration;
}

Note that lastUpdateTimestamp is declared volatile so that the variable is visible across threads: when one thread modifies lastUpdateTimestamp, all other threads see the new value.
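The interplay of renew() and the expiry check can be sketched as follows. LeaseSketch is a hypothetical, stripped-down stand-in for Eureka's Lease class: the clock is passed in explicitly to keep the example deterministic, and the evictionTimestamp check of the real class is left out.

```java
// Hypothetical sketch of a lease with the same renew/expiry arithmetic as discussed above.
class LeaseSketch {

    private final long durationMs;
    // volatile: a renewal made by one thread is visible to the eviction thread.
    private volatile long lastUpdateTimestamp;

    LeaseSketch(long durationMs, long nowMs) {
        this.durationMs = durationMs;
        this.lastUpdateTimestamp = nowMs;
    }

    /** Mirrors renew(): the timestamp is pushed one duration into the future. */
    void renew(long nowMs) {
        lastUpdateTimestamp = nowMs + durationMs;
    }

    /** Mirrors isExpired(): the duration is added a second time here. */
    boolean isExpired(long nowMs, long additionalLeaseMs) {
        return nowMs > lastUpdateTimestamp + durationMs + additionalLeaseMs;
    }

    public static void main(String[] args) {
        LeaseSketch lease = new LeaseSketch(90_000, 0);
        lease.renew(0); // last heartbeat at t = 0
        System.out.println(lease.isExpired(100_000, 0)); // prints false
        System.out.println(lease.isExpired(180_001, 0)); // prints true
    }
}
```

With a 90 s duration, a lease renewed at t = 0 is still valid at t = 100 s and only expires after t = 180 s.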


Service eviction

When eureka-server finds that an instance has not renewed its lease for a long time (no heartbeat), it evicts that client. AbstractInstanceRegistry implements this in its evict method.

public void evict(long additionalLeaseMs) {
     logger.debug("Running the evict task");

     if (!isLeaseExpirationEnabled()) {
         logger.debug("DS: lease expiration is currently disabled.");
         return;
     }

     // We collect first all expired items, to evict them in random order. For large eviction sets,
     // if we do not that, we might wipe out whole apps before self preservation kicks in. By randomizing it,
     // the impact should be evenly distributed across all applications.
     // List that collects the expired leases
     List<Lease<InstanceInfo>> expiredLeases = new ArrayList<>();
     // Iterate over the registry to detect expired instances
     for (Entry<String, Map<String, Lease<InstanceInfo>>> groupEntry : registry.entrySet()) {
         Map<String, Lease<InstanceInfo>> leaseMap = groupEntry.getValue();
         if (leaseMap != null) {
             for (Entry<String, Lease<InstanceInfo>> leaseEntry : leaseMap.entrySet()) {
                 // Get the lease of this instance
                 Lease<InstanceInfo> lease = leaseEntry.getValue();
                 // Check whether the lease has expired
                 if (lease.isExpired(additionalLeaseMs) && lease.getHolder() != null) {
                    // Add it to the list of expired leases
                    expiredLeases.add(lease);
                 }
             }
         }
     }

     // To compensate for GC pauses or drifting local time, we need to use current registry size as a base for
     // triggering self-preservation. Without that we would wipe out full registry.
     int registrySize = (int) getLocalRegistrySize();
     int registrySizeThreshold = (int) (registrySize * serverConfig.getRenewalPercentThreshold());
     int evictionLimit = registrySize - registrySizeThreshold;

     int toEvict = Math.min(expiredLeases.size(), evictionLimit);
     if (toEvict > 0) {
         logger.info("Evicting {} items (expired={}, evictionLimit={})", toEvict, expiredLeases.size(), evictionLimit);

         Random random = new Random(System.currentTimeMillis());
         for (int i = 0; i < toEvict; i++) {
             // Pick a random item (Knuth shuffle algorithm)
             int next = i + random.nextInt(expiredLeases.size() - i);
             // Swap elements so that eviction is fair (eviction order is independent of insertion order)
             Collections.swap(expiredLeases, i, next);
             // After this Knuth shuffle step, take the expired lease at position i
             Lease<InstanceInfo> lease = expiredLeases.get(i);

             String appName = lease.getHolder().getAppName();
             String id = lease.getHolder().getId();
             EXPIRED.increment();
             logger.warn("DS: Registry: expired lease for {}/{}", appName, id);
             // This is the call that actually removes the client
             internalCancel(appName, id, false);
         }
     }
 }

internalCancel(String appName, String id, boolean isReplication) ultimately executes leaseToCancel = gMap.remove(id), removing the instance from the map that stores the client information. Whether a lease has expired is determined as follows.

public boolean isExpired(long additionalLeaseMs) {
    return (evictionTimestamp > 0 
    || System.currentTimeMillis() > (lastUpdateTimestamp + duration + additionalLeaseMs));
}

Tracing this back to its origin, Eureka Server uses a periodic-deletion strategy to evict Eureka clients.
By default, this scheduled task runs its check every 60 s.
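The eviction cap and the random selection used by evict() can be sketched in isolation. EvictionSketch, evictionLimit and pickToEvict are hypothetical names, and instance ids are plain strings; only the arithmetic and the partial Knuth shuffle follow the method shown above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: cap the number of evictions and pick the victims at random.
class EvictionSketch {

    /** At most (registrySize - registrySize * threshold) leases may be evicted per run. */
    static int evictionLimit(int registrySize, double renewalPercentThreshold) {
        int registrySizeThreshold = (int) (registrySize * renewalPercentThreshold);
        return registrySize - registrySizeThreshold;
    }

    /** Chooses up to evictionLimit expired ids using a partial Knuth shuffle. */
    static List<String> pickToEvict(List<String> expired, int registrySize,
                                    double renewalPercentThreshold, Random random) {
        List<String> candidates = new ArrayList<>(expired); // leave the caller's list untouched
        int toEvict = Math.min(candidates.size(), evictionLimit(registrySize, renewalPercentThreshold));
        List<String> evicted = new ArrayList<>();
        for (int i = 0; i < toEvict; i++) {
            // Pick a random element from the not-yet-chosen tail and move it to position i.
            int next = i + random.nextInt(candidates.size() - i);
            Collections.swap(candidates, i, next);
            evicted.add(candidates.get(i));
        }
        return evicted;
    }

    public static void main(String[] args) {
        List<String> expired = new ArrayList<>();
        for (int i = 0; i < 40; i++) {
            expired.add("instance-" + i);
        }
        // 100 registered instances, 40 expired, threshold 0.85: at most 15 are evicted.
        List<String> evicted = pickToEvict(expired, 100, 0.85, new Random());
        System.out.println(evicted.size()); // prints 15
    }
}
```

Capping each run at 15% of the registry is what keeps a large batch of expired leases from wiping out whole applications before self-preservation can kick in.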


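Service cancellation (going offline)

When eureka-client goes offline, the application is not shut down immediately; it finishes its clean-up work first.
To begin with, eureka-client sends a request to eureka-server announcing that the client is going offline. In DiscoveryClient, the unregister() method issues this request.
Following on into the cancel method (AbstractJerseyEurekaHttpClient), urlPath is built from the name of the service going offline and its instance id, a resourceBuilder object is constructed, and resourceBuilder.delete(ClientResponse.class) finally sends the HTTP request to the server.
How does eureka-server handle this request? The corresponding cancel method is in AbstractInstanceRegistry.
On the eureka-server side, going offline is handled by a direct call to internalCancel: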

protected boolean internalCancel(String appName, String id, boolean isReplication) {
    try {
        read.lock();
        CANCEL.increment(isReplication);
        // Look up the gMap for the requested application name in the registry
        Map<String, Lease<InstanceInfo>> gMap = registry.get(appName);
        Lease<InstanceInfo> leaseToCancel = null;
        if (gMap != null) {
            // Remove the instance from gMap
            leaseToCancel = gMap.remove(id);
        }
        ....
    }   
}

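Service discovery

Service discovery already came up when we hand-wrote a Dubbo RPC framework, and it is simple to understand: the instances of a service are found through the service's name. For example, if service A wants to call service B, B must have registered beforehand in the form {serviceName, {instanceId: instanceObject}} (the map-within-a-map structure mentioned above). Given the service name, the service can be found and its instance information retrieved.
In CompositeDiscoveryClient, an implementation of Spring Cloud's DiscoveryClient interface, find the getInstances method.
Stepping on into EurekaDiscoveryClient shows that it essentially calls the getInstancesByVipAddress(serviceId, false) method of Netflix's DiscoveryClient.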

 @Override
public List<InstanceInfo> getInstancesByVipAddress(String vipAddress, boolean secure,
                                                   @Nullable String region) {
    if (vipAddress == null) {
        throw new IllegalArgumentException(
                "Supplied VIP Address cannot be null");
    }
    // The cached list of services
    Applications applications;
    if (instanceRegionChecker.isLocalRegion(region)) {
        applications = this.localRegionApps.get();
    } else {
        applications = remoteRegionVsApps.get(region);
        if (null == applications) {
            logger.debug("No applications are defined for region {}, so returning an empty instance list for vip "
                    + "address {}.", region, vipAddress);
            return Collections.emptyList();
        }
    }

    if (!secure) {
        // This is where service discovery actually happens
        return applications.getInstancesByVirtualHostName(vipAddress);
    } else {
        return applications.getInstancesBySecureVirtualHostName(vipAddress);

    }
}

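The key object in the code above is applications, so let's look at how Applications is structured. It holds a List dedicated to InstanceInfo (service instance information), wrapped in an AtomicReference so that the list can be replaced safely under high concurrency.
Back in the code above, applications.getInstancesByVirtualHostName(vipAddress) is the method that actually performs service discovery. Opening it (it is a chain of lambda expressions and not easy to read) shows that it essentially looks up the list of registered instances by virtualHostName (the service's application name).
Applications contains four Map collections that hold the same data for different use cases. All of this data comes from the server, which sends it to the client.
The keys of all four maps are upper-case service names.
At this point no service discovery has been performed yet, but the collections already contain values. This means that the application fetches the service list automatically after startup and caches it. So when exactly is the cache filled?
The DiscoveryClient constructor creates a thread pool that is used for fetching the service list.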

cacheRefreshExecutor = new ThreadPoolExecutor(
                   1, clientConfig.getCacheRefreshExecutorThreadPoolSize(), 
                   0, TimeUnit.SECONDS,
                   new SynchronousQueue<Runnable>(),
                   new ThreadFactoryBuilder()
                           .setNameFormat("DiscoveryClient-CacheRefreshExecutor-%d")
                           .setDaemon(true)
                           .build()
           );  // use direct handoff

This thread pool runs a scheduled task, so the service list is already fetched when the application starts.
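The idea of a background task that keeps a local copy of the registry up to date can be sketched as follows. This is an assumption-laden simplification: CacheRefreshSketch is a hypothetical class, the Supplier stands in for the HTTP call to the server, and the real client supervises its refresh task in a more elaborate way.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch: a local cache refreshed by a scheduled background task.
class CacheRefreshSketch {

    private final AtomicReference<List<String>> localCache = new AtomicReference<>(List.of());
    private final Supplier<List<String>> remoteFetch; // stands in for the call to eureka-server

    CacheRefreshSketch(Supplier<List<String>> remoteFetch) {
        this.remoteFetch = remoteFetch;
    }

    /** Fetches the list once and swaps it into the cache atomically. */
    void refresh() {
        localCache.set(List.copyOf(remoteFetch.get()));
    }

    /** Lookups read the cache only; they never call the server. */
    List<String> cachedInstances() {
        return localCache.get();
    }

    /** Starts the periodic refresh on a daemon thread, with the first run right away. */
    ScheduledExecutorService start(long intervalSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "cache-refresh-sketch");
            thread.setDaemon(true);
            return thread;
        });
        scheduler.scheduleWithFixedDelay(this::refresh, 0, intervalSeconds, TimeUnit.SECONDS);
        return scheduler;
    }

    public static void main(String[] args) {
        CacheRefreshSketch client = new CacheRefreshSketch(() -> List.of("10.0.0.1:8080", "10.0.0.2:8080"));
        client.refresh(); // first fetch at startup
        System.out.println(client.cachedInstances());
        ScheduledExecutorService scheduler = client.start(30); // later refreshes every 30 s
        scheduler.shutdownNow();
    }
}
```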
Let's open the method that refreshes the registry, refreshRegistry().

@VisibleForTesting
void refreshRegistry() {
      boolean isFetchingRemoteRegionRegistries = isFetchingRemoteRegionRegistries();

      boolean remoteRegionsModified = false;
      // This makes sure that a dynamic change to remote regions to fetch is honored.
      String latestRemoteRegions = clientConfig.fetchRegistryForRemoteRegions();
      if (null != latestRemoteRegions) {
          String currentRemoteRegions = remoteRegionsToFetch.get();
          if (!latestRemoteRegions.equals(currentRemoteRegions)) {
              // Both remoteRegionsToFetch and AzToRegionMapper.regionsToFetch need to be in sync
              synchronized (instanceRegionChecker.getAzToRegionMapper()) {
                  if (remoteRegionsToFetch.compareAndSet(currentRemoteRegions, latestRemoteRegions)) {
                      String[] remoteRegions = latestRemoteRegions.split(",");
                      remoteRegionsRef.set(remoteRegions);
                      instanceRegionChecker.getAzToRegionMapper().setRegionsToFetch(remoteRegions);
                      remoteRegionsModified = true;
                  } else {
                      logger.info("Remote regions to fetch modified concurrently," +
                              " ignoring change from {} to {}", currentRemoteRegions, latestRemoteRegions);
                  }
              }
          } else {
              // Just refresh mapping to reflect any DNS/Property change
              instanceRegionChecker.getAzToRegionMapper().refreshMapping();
          }
      }
      // The service list is actually fetched in here
      boolean success = fetchRegistry(remoteRegionsModified);
      if (success) {
          registrySize = localRegionApps.get().size();
          lastSuccessfulRegistryFetchTimestamp = System.currentTimeMillis();
      }
}

Next comes fetchRegistry, the method that pulls the registry:

private boolean fetchRegistry(boolean forceFullRegistryFetch) {
	Stopwatch tracer = FETCH_REGISTRY_TIMER.start();
	try {
	    // If the delta is disabled or if it is the first time, get all
	    // applications
	    Applications applications = getApplications();
	
	    if (clientConfig.shouldDisableDelta()
	            || (!Strings.isNullOrEmpty(clientConfig.getRegistryRefreshSingleVipAddress()))
	            || forceFullRegistryFetch
	            || (applications == null)
	            || (applications.getRegisteredApplications().size() == 0)
	            || (applications.getVersion() == -1)) //Client application does not have latest library supporting delta
	    {
	        logger.info("Disable delta property : {}", clientConfig.shouldDisableDelta());
	        logger.info("Single vip registry refresh property : {}", clientConfig.getRegistryRefreshSingleVipAddress());
	        logger.info("Force full registry fetch : {}", forceFullRegistryFetch);
	        logger.info("Application is null : {}", (applications == null));
	        logger.info("Registered Applications size is zero : {}",
	                (applications.getRegisteredApplications().size() == 0));
	        logger.info("Application version is -1: {}", (applications.getVersion() == -1));
	        // Full fetch: pull the entire registry.
	        // Performed when the cache is null or empty.
	        getAndStoreFullRegistry();
	    } else {
	        // Delta fetch: pull only the changes to the registry.
	        getAndUpdateDelta(applications);
	    }
	    applications.setAppsHashCode(applications.getReconcileHashCode());
	    logTotalInstances();
	} catch (Throwable e) {
	    logger.error(PREFIX + "{} - was unable to refresh its cache! status = {}", appPathIdentifier, e.getMessage(), e);
	    return false;
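	} finally {
	    if (tracer != null) {
	        tracer.stop();
	    }
	}

	// Notify about cache refresh before updating the instance remote status
	onCacheRefreshed();

	// Update remote status based on refreshed data held in the cache
	updateInstanceRemoteStatus();

	// registry was fetched successfully, so return true
	return true;
}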

The code above shows that Eureka offers two ways of fetching the service list: a full fetch and a delta (incremental) fetch.

Full fetch:

After a full fetch, the applications object is stored in localRegionApps.

Delta fetch:

private void getAndUpdateDelta(Applications applications) throws Throwable {
  long currentUpdateGeneration = fetchRegistryGeneration.get();

    Applications delta = null;
    // Issue an HTTP request to perform a delta fetch
    EurekaHttpResponse<Applications> httpResponse = eurekaTransport.queryClient.getDelta(remoteRegionsRef.get());
    if (httpResponse.getStatusCode() == Status.OK.getStatusCode()) {
        // The set of instances modified or added on eureka-server
        delta = httpResponse.getEntity();
    }
    // If the fetched delta is null
    if (delta == null) {
        logger.warn("The server does not allow the delta revision to be applied because it is not safe. "
                + "Hence got the full registry.");
        // Fall back to a full fetch
        getAndStoreFullRegistry();
    } else if (fetchRegistryGeneration.compareAndSet(currentUpdateGeneration, currentUpdateGeneration + 1)) {
        logger.debug("Got delta update with apps hashcode {}", delta.getAppsHashCode());
        String reconcileHashCode = "";
        if (fetchRegistryUpdateLock.tryLock()) {
            try {
                // Update the local cache with the changes reported by eureka-server
                updateDelta(delta);
                // Reconcile hash code: used to check whether the registry on eureka-server
                // and the local copy contain the same instances
                reconcileHashCode = getReconcileHashCode(applications);
            } finally {
                fetchRegistryUpdateLock.unlock();
            }
        } else {
            logger.warn("Cannot acquire update lock, aborting getAndUpdateDelta");
        }
        // If the hash codes match, nothing more is fetched; otherwise the registry is fetched once more
        // There is a diff in number of instances for some reason
        if (!reconcileHashCode.equals(delta.getAppsHashCode()) || clientConfig.shouldLogDeltaDiff()) {
            reconcileAndLogDifference(delta, reconcileHashCode);  // this makes a remoteCall
        }
    } else {
        logger.warn("Not updating application delta as another thread is updating it already");
        logger.debug("Ignoring delta update with apps hashcode {}, as another thread is updating it already", delta.getAppsHashCode());
    }
}
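
Summary of Eureka service discovery

1) The service list is not fetched at the moment a service is called; a scheduled task already fetches it when the application starts, as can be seen in the DiscoveryClient constructor.
2) The service instances a client sees are not the live data on eureka-server but a local in-memory cache.
3) Stale reads of the cache are resolved by updating it:
A full fetch happens when the service list is null or empty.
A delta fetch happens when the list is not null; only the changes on eureka-server (newly registered services, services coming online) are fetched.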
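The delta-plus-reconciliation flow summarized above can be sketched as follows. It is a simplified model under stated assumptions: DeltaFetchSketch is hypothetical, the cache maps instance ids to status strings instead of holding an Applications object, and the hash only imitates the idea of counting instances per status.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: apply a delta to the local cache, then verify it with a hash.
class DeltaFetchSketch {

    // instanceId -> status ("UP", "DOWN", ...), standing in for the cached registry.
    final Map<String, String> localCache = new ConcurrentHashMap<>();

    /** Hash of the form "DOWN_1_UP_2_": instance count per status, sorted by status. */
    static String reconcileHash(Map<String, String> instances) {
        Map<String, Integer> countByStatus = new TreeMap<>();
        for (String status : instances.values()) {
            countByStatus.merge(status, 1, Integer::sum);
        }
        StringBuilder hash = new StringBuilder();
        countByStatus.forEach((status, count) -> hash.append(status).append('_').append(count).append('_'));
        return hash.toString();
    }

    /**
     * Applies the delta and compares the local hash with the server's hash.
     * Returns true if the delta was sufficient, false if the full registry had to be used.
     */
    boolean applyDelta(Map<String, String> changed, Set<String> deleted,
                       String serverHash, Map<String, String> fullRegistry) {
        localCache.putAll(changed);
        for (String instanceId : deleted) {
            localCache.remove(instanceId);
        }
        if (reconcileHash(localCache).equals(serverHash)) {
            return true;
        }
        // Hashes differ: the local copy has drifted, so replace it with the full registry.
        localCache.clear();
        localCache.putAll(fullRegistry);
        return false;
    }

    public static void main(String[] args) {
        DeltaFetchSketch client = new DeltaFetchSketch();
        client.localCache.put("a", "UP");
        client.localCache.put("b", "UP");
        boolean deltaWasEnough = client.applyDelta(Map.of("c", "UP"), Set.of("a"), "UP_2_", Map.of());
        System.out.println(deltaWasEnough + " " + reconcileHash(client.localCache)); // prints true UP_2_
    }
}
```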
