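When people hear "registry", ZooKeeper usually comes to mind. Today's subject, Eureka, is a registry as well.
The name Eureka comes from ancient Greek and means "I have found it". In software, Eureka is a service registration and discovery component open-sourced by Netflix, the online video company. Together with other Netflix components (load balancing, circuit breaker, gateway, and so on), it was integrated by the Spring Cloud community into the Spring Cloud Netflix module. Eureka is only one of the many frameworks Netflix has contributed to Spring Cloud.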
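As for the differences between Eureka and ZooKeeper, we focus on the CAP properties of distributed systems:
Consistency: data consistency (nodes A, B, and C all hold the same data)
ZooKeeper emphasizes data consistency; Eureka does not give strong consistency priority.
Availability: service availability
In ZooKeeper, if the leader goes down, the cluster stops serving requests until a new leader has been elected (30-120 s); nothing is served during the election. Eureka emphasizes availability: as long as one node of the Eureka cluster is alive, it can still serve requests.
Partition Tolerance: tolerance of network partitions (nodes in a cluster may fail to synchronize data with each other because of network or data-center problems). Every distributed product has to provide this property.
There are other differences as well. ZooKeeper is more general-purpose: it can act as the registry for Dubbo to manage distributed services, serve as a configuration center that manages the configuration files of a Solr cluster, manage an ActiveMQ cluster by relying on the removal of ephemeral nodes, and back applications such as distributed locks. Eureka, as a Spring Cloud component, concentrates on being a service registry: like a net, it connects the components in the cloud so that each can do its job. Eureka also provides self-preservation and monitoring of its own cluster, and combines these with caching to achieve high availability.
Overall, ZooKeeper favors data consistency (CP), while Eureka favors service availability (AP).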
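When an Eureka client registers with an Eureka node, or finds that a connection has failed, it automatically switches to another node. As long as one Eureka node is still up, the registry remains usable (availability is guaranteed), although the information returned may not be the latest (strong consistency is not guaranteed). In addition, Eureka has a self-preservation mechanism: if the heartbeats received within 15 minutes fall below 85% of the expected number, Eureka assumes a network failure between the clients and the registry, and the following happens:
1) Eureka no longer removes services from the registry that should have expired because no heartbeat was received for a long time
2) Eureka still accepts registrations and queries for new services, but they are not replicated to other nodes (so the current node stays available)
3) Once the network is stable again, the new registrations on the current instance are replicated to the other nodes
Eureka can therefore cope well with network failures that cut off some of the nodes, and the whole registry does not go down the way it would with ZooKeeper.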
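The self-preservation check described above can be sketched as a small calculation. This is a minimal illustration, not Eureka's actual class: the name SelfPreservationSketch and its methods are hypothetical, the 30 s heartbeat interval and the 0.85 threshold are the defaults mentioned in this article, and the 15-minute window is left out for brevity.

```java
// Hypothetical sketch of the self-preservation check; not Eureka's real implementation.
final class SelfPreservationSketch {
    private final double renewalPercentThreshold; // e.g. 0.85
    private final int renewalIntervalSeconds;     // e.g. 30

    SelfPreservationSketch(double renewalPercentThreshold, int renewalIntervalSeconds) {
        this.renewalPercentThreshold = renewalPercentThreshold;
        this.renewalIntervalSeconds = renewalIntervalSeconds;
    }

    /** Minimum number of heartbeats per minute the server expects to see. */
    int renewsPerMinThreshold(int registeredClients) {
        double expectedPerMinute = registeredClients * (60.0 / renewalIntervalSeconds);
        return (int) (expectedPerMinute * renewalPercentThreshold);
    }

    /** true: expired leases may be evicted; false: self-preservation is active. */
    boolean leaseExpirationEnabled(int registeredClients, int renewsInLastMinute) {
        int threshold = renewsPerMinThreshold(registeredClients);
        return threshold > 0 && renewsInLastMinute > threshold;
    }
}
```

With 10 clients that each send a heartbeat every 30 s, the server expects 20 heartbeats per minute, so the threshold is 17. Receiving only 12 heartbeats in the last minute would keep every lease in the registry.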
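Eureka has changed its license: it used to be released under Apache 2.0, and later versions may use a different license.
Spring Cloud supports other registries too, such as Consul and Alibaba's Nacos, so there is plenty of choice.
We still study Eureka here because it is comparatively mature.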
Importing the eureka-server jar (Maven dependency)
groupId: org.springframework.cloud
artifactId: spring-cloud-starter-netflix-eureka-server
application.yml configuration
server:
  port: 8080
spring:
  application:
    name: eureka-client-a
eureka:
  client:
    serviceUrl:
      defaultZone: http://peer1:8761/eureka/,http://peer2:8762/eureka/,http://peer3:8763/eureka/
  instance:
    instance-id: ${spring.application.name}:${server.port}
    prefer-ip-address: true
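A note on why the Eureka Server needs eureka.client.serviceUrl: an Eureka Server not only lets others register with it, it can also register with other servers.
According to the source code, an Eureka Server also registers itself at startup. This mutual registration and discovery makes it easy to build a server-side cluster.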
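Additional settings:
eureka:
  server:
    # response cache settings (so that eureka-server can answer clients quickly)
    response-cache-update-interval-ms: 3000
    responseCacheAutoExpirationInSeconds: 180
    # periodically remove instances that are no longer needed (based on lastUpdateTimestamp)
    evictionIntervalTimerInMs: 3000
    # the server's self-preservation mode
    enableSelfPreservation: true # can be set to false for local debugging; keep it on in production to avoid evicting services by mistake when the network is unstable
    renewalPercentThreshold: 0.85 # default 85%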
Importing the eureka-client jar (Maven dependency)
groupId: org.springframework.cloud
artifactId: spring-cloud-starter-netflix-eureka-client
application.yml configuration
server:
  port: 8761
spring:
  application:
    name: eureka-server
eureka:
  client:
    register-with-eureka: false
    fetch-registry: false
    serviceUrl:
      defaultZone: http://peer1:8761/eureka/,http://peer2:8762/eureka/,http://peer3:8763/eureka/
  instance:
    preferIpAddress: true # default false. Should always be set to true. In container-based deployments such as Docker, a container gets a randomly generated hostname that has no DNS entry and cannot be resolved - John Carnell
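Once the Eureka Servers have registered with each other pairwise, and the clients register against all servers of the cluster, the result is a highly available cluster loop.
Once everything is running, the three Eureka Servers form a cluster. When a client registers with any one server, the other servers replicate the data, so the registry is shared.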
Additional settings:
eureka:
  instance:
    lease-renewal-interval-in-seconds: 30 # every 30 s the client sends a renewal request to the server, meaning "I am still alive"
    lease-expiration-duration-in-seconds: 90 # how long the client may stay silent before the server removes it
    instanceId: ${spring.cloud.client.hostname}:${spring.application.name}:${spring.application.instance_id:${server.port}}
  # see EurekaClientConfigBean (implements EurekaClientConfig)
  client:
    # whether the eureka client is enabled; default true
    enabled: true # for local debugging without eureka, set this to false instead of commenting out @EnableDiscoveryClient
    registerWithEureka: true # default true, so it can be omitted
    fetchRegistry: true # default true, so it can be omitted
    registry-fetch-interval-seconds: 30 # shorten this if evictions on the eureka server should show up on the client sooner
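Service registration
As soon as the project starts, it sends its metadata (IP, port, health-check data) to eureka-server, which stores the metadata internally.
Service renewal
After the client project has started and registered with eureka-server, it keeps reporting to eureka-server at a fixed interval to say "I am still alive".
Service cancellation
When the project shuts down, it notifies eureka-server that it is going offline.
Service eviction
When a service has not reported its status to eureka-server for longer than a certain period, eureka-server considers it dead and evicts it.
The rest of this article analyzes the Eureka source code around these four behaviors.
Eureka Server exposes RESTful services (HTTP + a specific request method + a specific URL). Using these services is all it takes to register and discover services in a project. Because HTTP is cross-platform, a client written in any language can implement registration and discovery itself, as long as it can send HTTP requests.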
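As a rough map of those REST calls, the sketch below only builds the request lines that the client code analyzed later in this article ends up sending (POST for register, PUT for heartbeat, DELETE for cancel). The class EurekaRestSketch and its methods are hypothetical helpers, nothing is sent over the network, and path segments are not URL-encoded.

```java
// Hypothetical helper: builds "METHOD url" strings only, it does not perform any HTTP call.
final class EurekaRestSketch {
    private final String serviceUrl; // e.g. "http://peer1:8761/eureka/"

    EurekaRestSketch(String serviceUrl) {
        this.serviceUrl = serviceUrl.endsWith("/") ? serviceUrl : serviceUrl + "/";
    }

    /** Register: POST the instance info to apps/{appName}; status 204 means success. */
    String register(String appName) {
        return "POST " + serviceUrl + "apps/" + appName;
    }

    /** Heartbeat: PUT to apps/{appName}/{instanceId}; status 404 means "register again". */
    String heartbeat(String appName, String instanceId, String status, long lastDirtyTimestamp) {
        return "PUT " + serviceUrl + "apps/" + appName + "/" + instanceId
                + "?status=" + status + "&lastDirtyTimestamp=" + lastDirtyTimestamp;
    }

    /** Cancel: DELETE apps/{appName}/{instanceId} when the instance goes offline. */
    String cancel(String appName, String instanceId) {
        return "DELETE " + serviceUrl + "apps/" + appName + "/" + instanceId;
    }
}
```

Any HTTP client in any language could issue these three requests, which is the point made above about cross-platform registration.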
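First, find the DiscoveryClient class in the eureka-client-1.9.13.jar source package. DiscoveryClient has a register() method that sends instanceInfo, the instance information. Under the hood it uses the register() method of AbstractJerseyEurekaHttpClient to issue a POST request and receives the result as httpResponse. A response status of NO_CONTENT (204) means the registration succeeded.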
/**
* Register with the eureka service by making the appropriate REST call.
*/
boolean register() throws Throwable {
logger.info(PREFIX + "{}: registering service...", appPathIdentifier);
EurekaHttpResponse<Void> httpResponse;
try {
// register the instance info; this actually sends an HTTP registration request to the server
httpResponse = eurekaTransport.registrationClient.register(instanceInfo);
} catch (Exception e) {
logger.warn(PREFIX + "{} - registration failed {}", appPathIdentifier, e.getMessage(), e);
throw e;
}
if (logger.isInfoEnabled()) {
logger.info(PREFIX + "{} - registration status: {}", appPathIdentifier, httpResponse.getStatusCode());
}
return httpResponse.getStatusCode() == Status.NO_CONTENT.getStatusCode();
}
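The call first goes through EurekaHttpClientDecorator and its register(final InstanceInfo info). This method is only a stepping stone: it actually executes delegate.register(info), where delegate is the EurekaHttpClient wrapped by the decorator, so in the end the register method of AbstractJerseyEurekaHttpClient is invoked.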
@Override
public EurekaHttpResponse<Void> register(final InstanceInfo info) {
return execute(new RequestExecutor<Void>() {
@Override
public EurekaHttpResponse<Void> execute(EurekaHttpClient delegate) {
return delegate.register(info);
}
@Override
public RequestType getRequestType() {
return RequestType.Register;
}
});
}
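Next, follow the register method in AbstractJerseyEurekaHttpClient. The structure of urlPath shows that this is a RESTful URL. Eureka uses the builder pattern to create a Builder for the HTTP request, and this is where the POST request is actually sent.
The server then returns a response object; in Eureka, status code 204 means the registration succeeded.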
@Override
public EurekaHttpResponse<Void> register(InstanceInfo info) {
String urlPath = "apps/" + info.getAppName();
ClientResponse response = null;
try {
Builder resourceBuilder = jerseyClient.resource(serviceUrl).path(urlPath).getRequestBuilder();
addExtraHeaders(resourceBuilder);
// the HTTP request (POST) is actually sent here
response = resourceBuilder
.header("Accept-Encoding", "gzip")
.type(MediaType.APPLICATION_JSON_TYPE)
.accept(MediaType.APPLICATION_JSON)
.post(ClientResponse.class, info);
return anEurekaHttpResponse(response.getStatus()).headers(headersOf(response)).build();
} finally {
if (logger.isDebugEnabled()) {
logger.debug("Jersey HTTP POST {}/{} with instance {}; statusCode={}", serviceUrl, urlPath, info.getId(),
response == null ? "N/A" : response.getStatus());
}
if (response != null) {
response.close();
}
}
}
Next, look at the InstanceRegistry class on the server side, which processes the registration request sent by the client.
@Override
public void register(InstanceInfo info, int leaseDuration, boolean isReplication) {
handleRegistration(info, leaseDuration, isReplication);
super.register(info, leaseDuration, isReplication);
}
@Override
public void register(final InstanceInfo info, final boolean isReplication) {
handleRegistration(info, resolveInstanceLeaseDuration(info), isReplication);
super.register(info, isReplication);
}
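register(info, leaseDuration, isReplication) registers the instance on the current node, while replicateToPeers() replicates the instance info to the other nodes of the cluster. The server stores the registered services in a two-level map, ConcurrentHashMap<String, Map<String, Lease<InstanceInfo>>>: the outer String key is the service name, the inner String key is the instanceId, and the InstanceInfo held by each Lease contains the instance's ip:port and status. (HashMap is not thread-safe, so a ConcurrentMap is used to keep access thread-safe.)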
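A minimal sketch of such a two-level registry is shown below. RegistrySketch is a hypothetical class, and the inner value is a plain "ip:port" string instead of Lease<InstanceInfo> to keep the example short.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical two-level registry: appName -> (instanceId -> "ip:port").
final class RegistrySketch {
    private final ConcurrentHashMap<String, Map<String, String>> registry = new ConcurrentHashMap<>();

    /** Adds or overwrites an instance; the inner map is created atomically on first use. */
    void register(String appName, String instanceId, String address) {
        registry.computeIfAbsent(appName, k -> new ConcurrentHashMap<>()).put(instanceId, address);
    }

    /** Removes an instance; returns false if the app or the instance is unknown. */
    boolean cancel(String appName, String instanceId) {
        Map<String, String> instances = registry.get(appName);
        return instances != null && instances.remove(instanceId) != null;
    }

    /** Returns a copy of all instances registered under the given service name. */
    Map<String, String> instancesOf(String appName) {
        Map<String, String> instances = registry.get(appName);
        return instances == null ? Collections.<String, String>emptyMap() : new HashMap<>(instances);
    }
}
```

computeIfAbsent plays the same role here as the get / putIfAbsent pair in the Eureka code that follows: two threads registering the first instance of a service at the same time still end up sharing one inner map.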
/**
* Registers a new instance with a given duration.
*
* @see com.netflix.eureka.lease.LeaseManager#register(java.lang.Object, int, boolean)
*/
public void register(InstanceInfo registrant, int leaseDuration, boolean isReplication) {
try {
read.lock();
// look up the instance map for this service name in the registry
Map<String, Lease<InstanceInfo>> gMap = registry.get(registrant.getAppName());
REGISTER.increment(isReplication);
// if the instance map is null (first registration for this service)
if (gMap == null) {
// create a new instance map that will hold the instance information
final ConcurrentHashMap<String, Lease<InstanceInfo>> gNewMap = new ConcurrentHashMap<String, Lease<InstanceInfo>>();
// put the new gNewMap into the registry (unless another thread got there first)
gMap = registry.putIfAbsent(registrant.getAppName(), gNewMap);
if (gMap == null) {
gMap = gNewMap;
}
}
// gMap holds the instances of this service; registrant.getId() is the instance id
Lease<InstanceInfo> existingLease = gMap.get(registrant.getId());
// Retain the last dirty timestamp without overwriting it, if there is already a lease
// a non-null existingLease means this instance is already registered
if (existingLease != null && (existingLease.getHolder() != null)) {
Long existingLastDirtyTimestamp = existingLease.getHolder().getLastDirtyTimestamp();
Long registrationLastDirtyTimestamp = registrant.getLastDirtyTimestamp();
logger.debug("Existing lease found (existing={}, provided={}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);
// this is a > instead of a >= because if the timestamps are equal, we still take the remote transmitted
// InstanceInfo instead of the server local copy.
if (existingLastDirtyTimestamp > registrationLastDirtyTimestamp) {
logger.warn("There is an existing lease and the existing lease's dirty timestamp {} is greater" +
" than the one that is being registered {}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);
logger.warn("Using the existing instanceInfo instead of the new instanceInfo as the registrant");
registrant = existingLease.getHolder();
}
} else {
// The lease does not exist and hence it is a new registration
synchronized (lock) {
if (this.expectedNumberOfClientsSendingRenews > 0) {
// Since the client wants to register it, increase the number of clients sending renews
this.expectedNumberOfClientsSendingRenews = this.expectedNumberOfClientsSendingRenews + 1;
updateRenewsPerMinThreshold();
}
}
logger.debug("No previous lease information found; it is new registration");
}
// create a new lease for the registrant and put it into gMap (an existing lease is replaced)
Lease<InstanceInfo> lease = new Lease<InstanceInfo>(registrant, leaseDuration);
if (existingLease != null) {
lease.setServiceUpTimestamp(existingLease.getServiceUpTimestamp());
}
gMap.put(registrant.getId(), lease);
synchronized (recentRegisteredQueue) {
recentRegisteredQueue.add(new Pair<Long, String>(
System.currentTimeMillis(),
registrant.getAppName() + "(" + registrant.getId() + ")"));
}
// This is where the initial state transfer of overridden status happens
if (!InstanceStatus.UNKNOWN.equals(registrant.getOverriddenStatus())) {
logger.debug("Found overridden status {} for instance {}. Checking to see if needs to be add to the "
+ "overrides", registrant.getOverriddenStatus(), registrant.getId());
if (!overriddenInstanceStatusMap.containsKey(registrant.getId())) {
logger.info("Not found overridden id {} and hence adding it", registrant.getId());
overriddenInstanceStatusMap.put(registrant.getId(), registrant.getOverriddenStatus());
}
}
InstanceStatus overriddenStatusFromMap = overriddenInstanceStatusMap.get(registrant.getId());
if (overriddenStatusFromMap != null) {
logger.info("Storing overridden status {} from map", overriddenStatusFromMap);
registrant.setOverriddenStatus(overriddenStatusFromMap);
}
// Set the status based on the overridden status rules
InstanceStatus overriddenInstanceStatus = getOverriddenInstanceStatus(registrant, existingLease, isReplication);
registrant.setStatusWithoutDirty(overriddenInstanceStatus);
// If the lease is registered with UP status, set lease service up timestamp
if (InstanceStatus.UP.equals(registrant.getStatus())) {
lease.serviceUp();
}
registrant.setActionType(ActionType.ADDED);
recentlyChangedQueue.add(new RecentlyChangedItem(lease));
registrant.setLastUpdatedTimestamp();
invalidateCache(registrant.getAppName(), registrant.getVIPAddress(), registrant.getSecureVipAddress());
logger.info("Registered instance {}/{} with status {} (replication={})",
registrant.getAppName(), registrant.getId(), registrant.getStatus(), isReplication);
} finally {
read.unlock();
}
}
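1) Registration starts in the register method of DiscoveryClient (eureka-client)
2) The Eureka Client actually uses the register method of AbstractJerseyEurekaHttpClient, which sends an HTTP POST request to the server
3) On the Eureka Server, the register method of InstanceRegistry performs the actual registration; the client information is kept in a ConcurrentMap
Renewal is a request (renew) sent from the Eureka Client to the Eureka Server. The entry point is again DiscoveryClient: a scheduled task sends a heartbeat to the server every 30 s.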
/**
* Renew with the eureka service by making the appropriate REST call
*/
boolean renew() {
EurekaHttpResponse<InstanceInfo> httpResponse;
try {
// send the heartbeat to the server, including the client's basic information
httpResponse = eurekaTransport.registrationClient.sendHeartBeat(instanceInfo.getAppName(), instanceInfo.getId(), instanceInfo, null);
logger.debug(PREFIX + "{} - Heartbeat status: {}", appPathIdentifier, httpResponse.getStatusCode());
if (httpResponse.getStatusCode() == Status.NOT_FOUND.getStatusCode()) {
REREGISTER_COUNTER.increment();
logger.info(PREFIX + "{} - Re-registering apps/{}", appPathIdentifier, instanceInfo.getAppName());
long timestamp = instanceInfo.setIsDirtyWithTime();
boolean success = register();
if (success) {
instanceInfo.unsetIsDirty(timestamp);
}
return success;
}
return httpResponse.getStatusCode() == Status.OK.getStatusCode();
} catch (Throwable e) {
logger.error(PREFIX + "{} - was unable to send heartbeat!", appPathIdentifier, e);
return false;
}
}
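The control flow of renew() above boils down to three cases: 200 means the lease was renewed, 404 means the server no longer knows the instance and it has to register again, and anything else counts as a failure. The sketch below isolates that decision; HeartbeatSketch and its Transport interface are hypothetical stand-ins for the real transport classes.

```java
// Hypothetical sketch of the renew decision; Transport stands in for the real HTTP client.
final class HeartbeatSketch {
    /** Both methods return the HTTP status code of the call. */
    interface Transport {
        int sendHeartbeat();
        int register();
    }

    /** A fake transport that always answers with the given status codes. */
    static Transport fixed(final int heartbeatStatus, final int registerStatus) {
        return new Transport() {
            @Override public int sendHeartbeat() { return heartbeatStatus; }
            @Override public int register() { return registerStatus; }
        };
    }

    static boolean renew(Transport transport) {
        int status = transport.sendHeartbeat();
        if (status == 404) {
            // The server has no lease for this instance (for example after an eviction): register again.
            return transport.register() == 204;
        }
        return status == 200;
    }
}
```

This is why an instance that was evicted during a network glitch reappears in the registry on its next heartbeat without any manual action.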
The code that sends the heartbeat is shown below; it sends the information of the client to be renewed to the server over HTTP.
@Override
public EurekaHttpResponse<InstanceInfo> sendHeartBeat(String appName, String id, InstanceInfo info, InstanceStatus overriddenStatus) {
String urlPath = "apps/" + appName + '/' + id;
ClientResponse response = null;
try {
WebResource webResource = jerseyClient.resource(serviceUrl)
.path(urlPath)
.queryParam("status", info.getStatus().toString())
.queryParam("lastDirtyTimestamp", info.getLastDirtyTimestamp().toString());
if (overriddenStatus != null) {
webResource = webResource.queryParam("overriddenstatus", overriddenStatus.name());
}
Builder requestBuilder = webResource.getRequestBuilder();
addExtraHeaders(requestBuilder);
response = requestBuilder.put(ClientResponse.class);
EurekaHttpResponseBuilder<InstanceInfo> eurekaResponseBuilder = anEurekaHttpResponse(response.getStatus(), InstanceInfo.class).headers(headersOf(response));
if (response.hasEntity()) {
eurekaResponseBuilder.entity(response.getEntity(InstanceInfo.class));
}
return eurekaResponseBuilder.build();
} finally {
if (logger.isDebugEnabled()) {
logger.debug("Jersey HTTP PUT {}/{}; statusCode={}", serviceUrl, urlPath, response == null ? "N/A" : response.getStatus());
}
if (response != null) {
response.close();
}
}
}
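The server side also has a renew() method, and the entry point is again InstanceRegistry.
That renew method calls the one in PeerAwareInstanceRegistryImpl, the parent class of InstanceRegistry, and the renew in PeerAwareInstanceRegistryImpl in turn calls the renew method of AbstractInstanceRegistry.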
public boolean renew(String appName, String id, boolean isReplication) {
RENEW.increment(isReplication);
// get the instance map of this service
Map<String, Lease<InstanceInfo>> gMap = registry.get(appName);
Lease<InstanceInfo> leaseToRenew = null;
if (gMap != null) {
// look up the lease to renew by instance id
leaseToRenew = gMap.get(id);
}
if (leaseToRenew == null) {
// no such service instance was found
RENEW_NOT_FOUND.increment(isReplication);
logger.warn("DS: Registry: lease doesn't exist, registering resource: {} - {}", appName, id);
return false;
} else {
// get the instance whose lease is being renewed
InstanceInfo instanceInfo = leaseToRenew.getHolder();
if (instanceInfo != null) {
// touchASGCache(instanceInfo.getASGName());
InstanceStatus overriddenInstanceStatus = this.getOverriddenInstanceStatus(instanceInfo, leaseToRenew,
isReplication);
if (overriddenInstanceStatus == InstanceStatus.UNKNOWN) {
logger.info("Instance status UNKNOWN possibly due to deleted override for instance {}"
+ "; re-register required", instanceInfo.getId());
RENEW_NOT_FOUND.increment(isReplication);
return false;
}
if (!instanceInfo.getStatus().equals(overriddenInstanceStatus)) {
logger.info(
"The instance status {} is different from overridden instance status {} for instance {}. "
+ "Hence setting the status to overridden status",
instanceInfo.getStatus().name(), instanceInfo.getOverriddenStatus().name(),
instanceInfo.getId());
instanceInfo.setStatusWithoutDirty(overriddenInstanceStatus);
}
}
renewsLastMin.increment();
// the normal case: the lease is renewed
leaseToRenew.renew();
return true;
}
}
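In essence, a renewal just updates the lease's last-update timestamp.
duration is the lease duration, that is, how long the server tolerates missing heartbeats (lease-expiration-duration-in-seconds, 90 s by default). The server therefore does not evict a client as soon as one 30 s heartbeat is missed. Note that renew() already adds duration to the timestamp and isExpired() adds it once more, so a lease only counts as expired roughly 2 × duration after the last renewal.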
/**
* Renew the lease, use renewal duration if it was specified by the
* associated {@link T} during registration, otherwise default duration is
* {@link #DEFAULT_DURATION_IN_SECS}.
*/
public void renew() {
lastUpdateTimestamp = System.currentTimeMillis() + duration;
}
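Note that lastUpdateTimestamp is declared volatile so that the variable is visible across threads: when one thread modifies lastUpdateTimestamp, all other threads see the new value.
When eureka-server finds that an instance has not renewed its lease for a long time (no heartbeat), it evicts that client. AbstractInstanceRegistry implements eviction in its evict method.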
public void evict(long additionalLeaseMs) {
logger.debug("Running the evict task");
if (!isLeaseExpirationEnabled()) {
logger.debug("DS: lease expiration is currently disabled.");
return;
}
// We collect first all expired items, to evict them in random order. For large eviction sets,
// if we do not that, we might wipe out whole apps before self preservation kicks in. By randomizing it,
// the impact should be evenly distributed across all applications.
// create a list that collects the expired leases
List<Lease<InstanceInfo>> expiredLeases = new ArrayList<>();
// iterate over the registry to find expired instances
for (Entry<String, Map<String, Lease<InstanceInfo>>> groupEntry : registry.entrySet()) {
Map<String, Lease<InstanceInfo>> leaseMap = groupEntry.getValue();
if (leaseMap != null) {
for (Entry<String, Lease<InstanceInfo>> leaseEntry : leaseMap.entrySet()) {
// get the lease of the instance
Lease<InstanceInfo> lease = leaseEntry.getValue();
// check whether the lease has expired
if (lease.isExpired(additionalLeaseMs) && lease.getHolder() != null) {
// add the lease to the list of expired leases
expiredLeases.add(lease);
}
}
}
}
// To compensate for GC pauses or drifting local time, we need to use current registry size as a base for
// triggering self-preservation. Without that we would wipe out full registry.
int registrySize = (int) getLocalRegistrySize();
int registrySizeThreshold = (int) (registrySize * serverConfig.getRenewalPercentThreshold());
int evictionLimit = registrySize - registrySizeThreshold;
int toEvict = Math.min(expiredLeases.size(), evictionLimit);
if (toEvict > 0) {
logger.info("Evicting {} items (expired={}, evictionLimit={})", toEvict, expiredLeases.size(), evictionLimit);
Random random = new Random(System.currentTimeMillis());
for (int i = 0; i < toEvict; i++) {
// Pick a random item (Knuth shuffle algorithm)
int next = i + random.nextInt(expiredLeases.size() - i);
// swap the elements so that eviction is fair (the eviction order does not depend on the order in which leases were added to the list)
Collections.swap(expiredLeases, i, next);
// after the Knuth shuffle step, take the expired lease that now sits at position i
Lease<InstanceInfo> lease = expiredLeases.get(i);
String appName = lease.getHolder().getAppName();
String id = lease.getHolder().getId();
EXPIRED.increment();
logger.warn("DS: Registry: expired lease for {}/{}", appName, id);
// the method that actually evicts the element (client)
internalCancel(appName, id, false);
}
}
}
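The two ideas in evict(), capping the number of evictions per run and picking the victims at random, can be tried out in isolation. The sketch below mirrors the arithmetic of the method above; the class EvictionSketch and its method names are hypothetical, and it works on a copy of the list instead of calling internalCancel.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the eviction cap and the partial Knuth (Fisher-Yates) shuffle.
final class EvictionSketch {
    /** At most (1 - threshold) of the registry may be evicted in one run. */
    static int evictionLimit(int registrySize, double renewalPercentThreshold) {
        int mustKeep = (int) (registrySize * renewalPercentThreshold);
        return registrySize - mustKeep;
    }

    /** Picks min(expired.size(), evictionLimit) leases uniformly at random. */
    static <T> List<T> pickToEvict(List<T> expired, int registrySize,
                                   double renewalPercentThreshold, Random random) {
        List<T> pool = new ArrayList<>(expired); // leave the caller's list untouched
        int toEvict = Math.min(pool.size(), evictionLimit(registrySize, renewalPercentThreshold));
        for (int i = 0; i < toEvict; i++) {
            // Choose a random element from the not-yet-picked tail and move it to position i.
            int next = i + random.nextInt(pool.size() - i);
            Collections.swap(pool, i, next);
        }
        return new ArrayList<>(pool.subList(0, toEvict));
    }
}
```

With 100 registered instances and a threshold of 0.85, at most 15 leases are evicted per run even if 40 have expired; the random choice spreads those 15 across all applications instead of wiping out one of them.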
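In internalCancel(String appName, String id, boolean isReplication), the statement that finally runs is leaseToCancel = gMap.remove(id), which removes the instance from the Map that stores the client information. The method that decides whether an instance has expired is shown below.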
public boolean isExpired(long additionalLeaseMs) {
return (evictionTimestamp > 0
|| System.currentTimeMillis() > (lastUpdateTimestamp + duration + additionalLeaseMs));
}
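To see how renew() and isExpired() interact, here is a stripped-down lease with an explicit clock. LeaseSketch is a hypothetical class: it takes the current time as a parameter so the behavior is deterministic, and it omits evictionTimestamp.

```java
// Hypothetical, simplified lease; time is passed in explicitly instead of using System.currentTimeMillis().
final class LeaseSketch {
    private final long durationMs;
    private volatile long lastUpdateTimestamp;

    LeaseSketch(long durationMs, long nowMs) {
        this.durationMs = durationMs;
        this.lastUpdateTimestamp = nowMs;
    }

    /** Same arithmetic as renew() in this article: the timestamp already includes the duration. */
    void renew(long nowMs) {
        lastUpdateTimestamp = nowMs + durationMs;
    }

    /** Same comparison as isExpired(): the duration is added a second time. */
    boolean isExpired(long nowMs, long additionalLeaseMs) {
        return nowMs > lastUpdateTimestamp + durationMs + additionalLeaseMs;
    }
}
```

With a 90 s duration and a renewal at time 0, the lease is still valid at 100 s and at 180 s, and only expires after 180 s, which is twice the configured duration.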
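Tracing this back, Eureka Server essentially uses a periodic-deletion strategy to evict Eureka Clients.
The scheduled task runs its check every 60 s by default.
When an eureka-client goes offline, the project is not shut down immediately; it finishes its cleanup work first.
First, the eureka-client sends a request to eureka-server saying that it is about to go offline. DiscoveryClient issues this request in its unregister() method.
Following the cancel method (AbstractJerseyEurekaHttpClient), we can see that it builds urlPath from the name of the service going offline, creates a resourceBuilder, and finally calls resourceBuilder.delete(ClientResponse.class) to send the HTTP cancellation request to the server.
How does eureka-server handle this request? The corresponding cancel method is in AbstractInstanceRegistry.
There, the cancellation directly calls internalCancel: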
protected boolean internalCancel(String appName, String id, boolean isReplication) {
try {
read.lock();
CANCEL.increment(isReplication);
// look up the gMap that corresponds to the requested application name in the registry
Map<String, Lease<InstanceInfo>> gMap = registry.get(appName);
Lease<InstanceInfo> leaseToCancel = null;
if (gMap != null) {
// remove the instance from gMap
leaseToCancel = gMap.remove(id);
}
....
}
}
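The earlier article on hand-writing a Dubbo RPC framework already touched on service discovery. It is simple to understand: an instance of a service is found by the service name. For example, if service A wants to call service B, then B must have registered beforehand in the format {serviceName,{instanceid:instanceObject}} (the Map-inside-a-Map structure mentioned above). Given the service name, the service can be found and its instance information retrieved.
In CompositeDiscoveryClient, an implementation of Spring Cloud's DiscoveryClient interface, find the getInstances method.
Going into EurekaDiscoveryClient, it essentially still calls the getInstancesByVipAddress(serviceId, false) method of the Eureka DiscoveryClient.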
@Override
public List<InstanceInfo> getInstancesByVipAddress(String vipAddress, boolean secure,
@Nullable String region) {
if (vipAddress == null) {
throw new IllegalArgumentException(
"Supplied VIP Address cannot be null");
}
// the stored service list
Applications applications;
if (instanceRegionChecker.isLocalRegion(region)) {
applications = this.localRegionApps.get();
} else {
applications = remoteRegionVsApps.get(region);
if (null == applications) {
logger.debug("No applications are defined for region {}, so returning an empty instance list for vip "
+ "address {}.", region, vipAddress);
return Collections.emptyList();
}
}
if (!secure) {
// the method that actually performs the service discovery
return applications.getInstancesByVirtualHostName(vipAddress);
} else {
return applications.getInstancesBySecureVirtualHostName(vipAddress);
}
}
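The key object in the code above is applications, so let's look at how Applications is structured. It holds a List dedicated to InstanceInfo (the service instance information), wrapped in an AtomicReference so that the list reference can be replaced safely under high concurrency.
Back in the code above, applications.getInstancesByVirtualHostName(vipAddress) is the method that actually performs the discovery. Opening it (it is a chain of lambda expressions and somewhat hard to read) shows that it essentially fetches the registered instances by virtualHostName (the application name of the service).
Applications contains four Map collections. They store the same data for different usage scenarios, and all of the data comes from the server, which sends it to the client.
The keys of all four Maps are the upper-case service names.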
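The client-side lookup described above can be approximated with a map keyed by the upper-case service name whose values are atomically replaceable snapshots. LocalCacheSketch is a hypothetical class and stores plain address strings instead of InstanceInfo objects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical local cache: upper-case service name -> snapshot of instance addresses.
final class LocalCacheSketch {
    private final Map<String, AtomicReference<List<String>>> byVipAddress = new ConcurrentHashMap<>();

    /** Replaces the snapshot for a service in one atomic step. */
    void store(String serviceName, List<String> instances) {
        List<String> snapshot = Collections.unmodifiableList(new ArrayList<>(instances));
        byVipAddress
                .computeIfAbsent(key(serviceName),
                        k -> new AtomicReference<List<String>>(Collections.<String>emptyList()))
                .set(snapshot);
    }

    /** Looks up instances by service name, ignoring case; never returns null. */
    List<String> getInstances(String serviceName) {
        AtomicReference<List<String>> ref = byVipAddress.get(key(serviceName));
        return ref == null ? Collections.<String>emptyList() : ref.get();
    }

    private static String key(String serviceName) {
        return serviceName.toUpperCase(Locale.ROOT);
    }
}
```

Readers always see either the old snapshot or the new one, never a half-updated list, because a refresh swaps the whole reference instead of mutating the list in place.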
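At this point we have not performed any service discovery yet, but the collections already contain values. This means that after startup the project pulls the services automatically and caches them. So when exactly are they put in?
The constructor of DiscoveryClient creates a thread pool for scheduled tasks, and this pool is used to pull the service list.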
cacheRefreshExecutor = new ThreadPoolExecutor(
1, clientConfig.getCacheRefreshExecutorThreadPoolSize(),
0, TimeUnit.SECONDS,
new SynchronousQueue<Runnable>(),
new ThreadFactoryBuilder()
.setNameFormat("DiscoveryClient-CacheRefreshExecutor-%d")
.setDaemon(true)
.build()
); // use direct handoff
This thread pool runs the scheduled cache-refresh task, so the service list is pulled as soon as the project starts.
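The idea of refreshing a cache on a background thread can be sketched with the JDK scheduler. This is a simplification under stated assumptions: CacheRefreshSketch is a hypothetical class, the "fetch" only increments a counter, and scheduleWithFixedDelay on a ScheduledExecutorService is swapped in for Eureka's own scheduling code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: a daemon thread runs a fake registry fetch at a fixed delay.
final class CacheRefreshSketch {
    /** Runs the fake fetch until it has happened `rounds` times and returns the number of fetches. */
    static int runRefreshes(int rounds, long intervalMs) {
        AtomicInteger fetches = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(rounds);
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "cache-refresh-sketch");
            t.setDaemon(true); // do not keep the JVM alive, like the daemon threads in the pool above
            return t;
        });
        // Initial delay 0: the first fetch happens right at startup, later ones after each interval.
        scheduler.scheduleWithFixedDelay(() -> {
            if (done.getCount() > 0) {
                fetches.incrementAndGet(); // stands in for the real registry fetch
                done.countDown();
            }
        }, 0, intervalMs, TimeUnit.MILLISECONDS);
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return fetches.get();
    }
}
```

Calling CacheRefreshSketch.runRefreshes(3, 10) performs three fetches about 10 ms apart; in Eureka the corresponding interval is registry-fetch-interval-seconds.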
Let's open refreshRegistry(), the method that refreshes the registry:
@VisibleForTesting
void refreshRegistry() {
boolean isFetchingRemoteRegionRegistries = isFetchingRemoteRegionRegistries();
boolean remoteRegionsModified = false;
// This makes sure that a dynamic change to remote regions to fetch is honored.
String latestRemoteRegions = clientConfig.fetchRegistryForRemoteRegions();
if (null != latestRemoteRegions) {
String currentRemoteRegions = remoteRegionsToFetch.get();
if (!latestRemoteRegions.equals(currentRemoteRegions)) {
// Both remoteRegionsToFetch and AzToRegionMapper.regionsToFetch need to be in sync
synchronized (instanceRegionChecker.getAzToRegionMapper()) {
if (remoteRegionsToFetch.compareAndSet(currentRemoteRegions, latestRemoteRegions)) {
String[] remoteRegions = latestRemoteRegions.split(",");
remoteRegionsRef.set(remoteRegions);
instanceRegionChecker.getAzToRegionMapper().setRegionsToFetch(remoteRegions);
remoteRegionsModified = true;
} else {
logger.info("Remote regions to fetch modified concurrently," +
" ignoring change from {} to {}", currentRemoteRegions, latestRemoteRegions);
}
}
} else {
// Just refresh mapping to reflect any DNS/Property change
instanceRegionChecker.getAzToRegionMapper().refreshMapping();
}
}
// the service list is actually fetched in here
boolean success = fetchRegistry(remoteRegionsModified);
if (success) {
registrySize = localRegionApps.get().size();
lastSuccessfulRegistryFetchTimestamp = System.currentTimeMillis();
}
}
Now find fetchRegistry, the method that pulls the registry:
private boolean fetchRegistry(boolean forceFullRegistryFetch) {
Stopwatch tracer = FETCH_REGISTRY_TIMER.start();
try {
// If the delta is disabled or if it is the first time, get all
// applications
Applications applications = getApplications();
if (clientConfig.shouldDisableDelta()
|| (!Strings.isNullOrEmpty(clientConfig.getRegistryRefreshSingleVipAddress()))
|| forceFullRegistryFetch
|| (applications == null)
|| (applications.getRegisteredApplications().size() == 0)
|| (applications.getVersion() == -1)) //Client application does not have latest library supporting delta
{
logger.info("Disable delta property : {}", clientConfig.shouldDisableDelta());
logger.info("Single vip registry refresh property : {}", clientConfig.getRegistryRefreshSingleVipAddress());
logger.info("Force full registry fetch : {}", forceFullRegistryFetch);
logger.info("Application is null : {}", (applications == null));
logger.info("Registered Applications size is zero : {}",
(applications.getRegisteredApplications().size() == 0));
logger.info("Application version is -1: {}", (applications.getVersion() == -1));
// full fetch (pulls the entire registry)
// performed when the cache is null or contains no data
getAndStoreFullRegistry();
} else {
// delta fetch (pulls only the registry entries that changed)
getAndUpdateDelta(applications);
}
applications.setAppsHashCode(applications.getReconcileHashCode());
logTotalInstances();
} catch (Throwable e) {
logger.error(PREFIX + "{} - was unable to refresh its cache! status = {}", appPathIdentifier, e.getMessage(), e);
return false;
} finally {
if (tracer != null) {
tracer.stop();
}
}
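The code above shows that Eureka offers two ways to pull the service list: a full fetch and a delta (incremental) fetch.
After a full fetch, the applications object is stored in localRegionApps.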
private void getAndUpdateDelta(Applications applications) throws Throwable {
long currentUpdateGeneration = fetchRegistryGeneration.get();
Applications delta = null;
// send an HTTP request to perform a delta fetch
EurekaHttpResponse<Applications> httpResponse = eurekaTransport.queryClient.getDelta(remoteRegionsRef.get());
if (httpResponse.getStatusCode() == Status.OK.getStatusCode()) {
// the set of entries that were modified or added on eureka-server
delta = httpResponse.getEntity();
}
// if the fetched delta is null
if (delta == null) {
logger.warn("The server does not allow the delta revision to be applied because it is not safe. "
+ "Hence got the full registry.");
// fall back to a full fetch
getAndStoreFullRegistry();
} else if (fetchRegistryGeneration.compareAndSet(currentUpdateGeneration, currentUpdateGeneration + 1)) {
logger.debug("Got delta update with apps hashcode {}", delta.getAppsHashCode());
String reconcileHashCode = "";
if (fetchRegistryUpdateLock.tryLock()) {
try {
// the entries that changed on eureka-server; use them to update the local cache
updateDelta(delta);
// the reconcile hash code
// it is used to check whether the registry on the remote eureka-server and the local copy are the same
reconcileHashCode = getReconcileHashCode(applications);
} finally {
fetchRegistryUpdateLock.unlock();
}
} else {
logger.warn("Cannot acquire update lock, aborting getAndUpdateDelta");
}
// if the hash codes are equal, nothing more is fetched; otherwise the registry is fetched once more
// There is a diff in number of instances for some reason
if (!reconcileHashCode.equals(delta.getAppsHashCode()) || clientConfig.shouldLogDeltaDiff()) {
reconcileAndLogDifference(delta, reconcileHashCode); // this makes a remoteCall
}
} else {
logger.warn("Not updating application delta as another thread is updating it already");
logger.debug("Ignoring delta update with apps hashcode {}, as another thread is updating it already", delta.getAppsHashCode());
}
}
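The delta-plus-reconcile idea can be sketched as follows. Everything here is a simplified assumption: DeltaSketch is a hypothetical class, the local cache is a map from instance id to status, a null status in the delta stands for a deleted instance, and the hash is assumed to be a string of sorted status counts such as "DOWN_1_UP_2_".

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of applying a delta and reconciling it against the server's hash.
final class DeltaSketch {
    /** Builds a hash such as "DOWN_1_UP_2_" from instanceId -> status (statuses sorted by name). */
    static String reconcileHash(Map<String, String> instances) {
        TreeMap<String, Integer> counts = new TreeMap<>();
        for (String status : instances.values()) {
            counts.merge(status, 1, Integer::sum);
        }
        StringBuilder hash = new StringBuilder();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            hash.append(entry.getKey()).append('_').append(entry.getValue()).append('_');
        }
        return hash.toString();
    }

    /**
     * Applies the delta to the local cache in place.
     * Returns true if the local cache now matches the server's hash;
     * false means the caller should fall back to a full fetch.
     */
    static boolean applyDelta(Map<String, String> local, Map<String, String> delta, String serverHash) {
        for (Map.Entry<String, String> change : delta.entrySet()) {
            if (change.getValue() == null) {
                local.remove(change.getKey());                // instance was removed on the server
            } else {
                local.put(change.getKey(), change.getValue()); // instance was added or modified
            }
        }
        return reconcileHash(local).equals(serverHash);
    }
}
```

Comparing a cheap summary instead of the whole registry lets the client detect a missed delta without transferring everything on every refresh.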
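1) The service list is not pulled at the moment a service is called; a scheduled task starts pulling it as soon as the project starts, as the constructor of DiscoveryClient shows
2) The service instances we see are not the live data on eureka-server but a local in-memory cache
3) Stale cache data is handled by refreshing the cache:
A full fetch happens when the local service list is null
A delta fetch happens when the list is not null; it pulls only the data that changed on eureka-server (for example newly registered services or services that came online)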