我们从服务的发现,服务下线,服务租约,服务剔出等方面分析源码
LeaseManager(写操作接口)
public interface LeaseManager<T> {
void register(T r, int leaseDuration, boolean isReplication);
boolean cancel(String appName, String id, boolean isReplication);
boolean renew(String appName, String id, boolean isReplication);
void evict();
}
根据idea的启动日志找到了AbstractInstanceRegistry该处日志打印,跟踪到了 com.netflix.eureka.resources.ApplicationResource#addInstance
@POST
@Consumes({"application/json", "application/xml"})
public Response addInstance(InstanceInfo info,@HeaderParam(PeerEurekaNode.HEADER_REPLICATION) String isReplication) {
//对instanceInfo的信息进行非空校验()
//....省略部分代码....
//同步到注册中心其他节点
registry.register(info, "true".equals(isReplication));
}
public void register(final InstanceInfo info, final boolean isReplication) {
//默认续约时间
int leaseDuration = Lease.DEFAULT_DURATION_IN_SECS;
if (info.getLeaseInfo() != null && info.getLeaseInfo().getDurationInSecs() > 0) {
leaseDuration = info.getLeaseInfo().getDurationInSecs();
}
//3.1.2服务注册
super.register(info, leaseDuration, isReplication);
//3.1.3复制到其他节点
replicateToPeers(Action.Register, info.getAppName(), info.getId(), info, null, isReplication);
}
3.1.2 注册过程
在initEurekaServerContext()调用了,用于获取并初始化InstanceInfo
public void register(InstanceInfo registrant, int leaseDuration, boolean isReplication) {
try {
//使用读写锁保证线程的安全性
read.lock();
//本质上是ConcurrentHashMap
Map<String, Lease<InstanceInfo>> gMap = registry.get(registrant.getAppName());
REGISTER.increment(isReplication);
if (gMap == null) {
final ConcurrentHashMap<String, Lease<InstanceInfo>> gNewMap = new ConcurrentHashMap<String, Lease<InstanceInfo>>();
gMap = registry.putIfAbsent(registrant.getAppName(), gNewMap);
if (gMap == null) {
gMap = gNewMap;
}
}
Lease<InstanceInfo> existingLease = gMap.get(registrant.getId());
// 如果已经有租约,则保留最后一个脏的时间戳而不覆盖它
if (existingLease != null && (existingLease.getHolder() != null)) {
Long existingLastDirtyTimestamp = existingLease.getHolder().getLastDirtyTimestamp();
Long registrationLastDirtyTimestamp = registrant.getLastDirtyTimestamp();
logger.debug("Existing lease found (existing={}, provided={}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);
//如果本次注册时间小于注册中心保存的该实例注册最近一次注册时间,说明之前已经注册成功了,那么就用已存在的替换请求进来的;
if (existingLastDirtyTimestamp > registrationLastDirtyTimestamp) {
logger.warn("There is an existing lease and the existing lease's dirty timestamp {} is greater" +
" than the one that is being registered {}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);
logger.warn("Using the existing instanceInfo instead of the new instanceInfo as the registrant");
registrant = existingLease.getHolder();
}
} else {
// 租约不存在,因此是新登记
synchronized (lock) {
if (this.expectedNumberOfRenewsPerMin > 0) {
// (1
// for 30 seconds, 2 for a minute)
this.expectedNumberOfRenewsPerMin = this.expectedNumberOfRenewsPerMin + 2;
this.numberOfRenewsPerMinThreshold =
(int) (this.expectedNumberOfRenewsPerMin * serverConfig.getRenewalPercentThreshold());
}
}
logger.debug("No previous lease information found; it is new registration");
}
Lease<InstanceInfo> lease = new Lease<InstanceInfo>(registrant, leaseDuration);
if (existingLease != null) {
//设置服务注册时间戳
lease.setServiceUpTimestamp(existingLease.getServiceUpTimestamp());
}
gMap.put(registrant.getId(), lease);
synchronized (recentRegisteredQueue) {
recentRegisteredQueue.add(new Pair<Long, String>(
System.currentTimeMillis(),
registrant.getAppName() + "(" + registrant.getId() + ")"));
}
// 重写状态的初始状态
if (!InstanceStatus.UNKNOWN.equals(registrant.getOverriddenStatus())) {
logger.debug("Found overridden status {} for instance {}. Checking to see if needs to be add to the "
+ "overrides", registrant.getOverriddenStatus(), registrant.getId());
if (!overriddenInstanceStatusMap.containsKey(registrant.getId())) {
logger.info("Not found overridden id {} and hence adding it", registrant.getId());
overriddenInstanceStatusMap.put(registrant.getId(), registrant.getOverriddenStatus());
}
}
InstanceStatus overriddenStatusFromMap = overriddenInstanceStatusMap.get(registrant.getId());
if (overriddenStatusFromMap != null) {
logger.info("Storing overridden status {} from map", overriddenStatusFromMap);
registrant.setOverriddenStatus(overriddenStatusFromMap);
}
// 设置状态规则
InstanceStatus overriddenInstanceStatus = getOverriddenInstanceStatus(registrant, existingLease, isReplication);
registrant.setStatusWithoutDirty(overriddenInstanceStatus);
// If the lease is registered with UP status, set lease service up timestamp
if (InstanceStatus.UP.equals(registrant.getStatus())) {
lease.serviceUp();
}
registrant.setActionType(ActionType.ADDED);
recentlyChangedQueue.add(new RecentlyChangedItem(lease));
registrant.setLastUpdatedTimestamp();
invalidateCache(registrant.getAppName(), registrant.getVIPAddress(), registrant.getSecureVipAddress());
logger.info("Registered instance {}/{} with status {} (replication={})",
registrant.getAppName(), registrant.getId(), registrant.getStatus(), isReplication);
} finally {
read.unlock();
}
}
3.1.3:用于将client信息复制到其他peer中
当注册中心集群时,每个节点只会向其配置文件所配置的其他节点同步信息,且是单向的。从ServerCodecs 获取配置文件信息
private void replicateToPeers(Action action, String appName, String id,
InstanceInfo info /* optional */,
InstanceStatus newStatus /* optional */, boolean isReplication) {
Stopwatch tracer = action.getTimer().start();
try {
if (isReplication) {
numberOfReplicationsLastMin.increment();
}
// 如果已经复制client信息则不再复制
if (peerEurekaNodes == Collections.EMPTY_LIST || isReplication) {
return;
}
for (final PeerEurekaNode node : peerEurekaNodes.getPeerEurekaNodes()) {
// 如果目标节点和本机的hostName一致则不会同步
if (peerEurekaNodes.isThisMyUrl(node.getServiceUrl())) {
continue;
}
//复制到别的Server注册,指定了要同步到其他节点信息的类型,如实例的注册,续约等,再根据不同动作执行不同的方法,所有的动作都通过这里分发同步到其他节点;
replicateInstanceActionsToPeers(action, appName, id, info, newStatus, node);
}
} finally {
tracer.stop();
}
}
3.2 续约
3.2.1:通过跟踪方法调用找到了续约的接收客户端请求的入口:com.netflix.eureka.resources.InstanceResource#renewLease
使用registry.renew()方法更新注册表
@PUT
public Response renewLease(
@HeaderParam(PeerEurekaNode.HEADER_REPLICATION) String isReplication,
@QueryParam("overriddenstatus") String overriddenStatus,
@QueryParam("status") String status,
@QueryParam("lastDirtyTimestamp") String lastDirtyTimestamp) {
boolean isFromReplicaNode = "true".equals(isReplication);
//renew()用于更新注册表逻辑
boolean isSuccess = registry.renew(app.getName(), id, isFromReplicaNode);
// 在注册表中找不到,请立即请求注册
if (!isSuccess) {
logger.warn("Not Found (Renew): {} - {}", app.getName(), id);
return Response.status(Status.NOT_FOUND).build();
}
// 检查是否需要基于脏时间戳进行同步,客户端实例可能更改了一些值
Response response = null;
if (lastDirtyTimestamp != null && serverConfig.shouldSyncWhenTimestampDiffers()) {
response = this.validateDirtyTimestamp(Long.valueOf(lastDirtyTimestamp), isFromReplicaNode);
// Store the overridden status since the validation found out the node that replicates wins
if (response.getStatus() == Response.Status.NOT_FOUND.getStatusCode()
&& (overriddenStatus != null)
&& !(InstanceStatus.UNKNOWN.name().equals(overriddenStatus))
&& isFromReplicaNode) {
registry.storeOverriddenStatusIfRequired(app.getAppName(), id, InstanceStatus.valueOf(overriddenStatus));
}
} else {
response = Response.ok().build();
}
logger.debug("Found (Renew): {} - {}; reply status={}", app.getName(), id, response.getStatus());
return response;
}
3.2.2:最终调用AbstractInstanceRegistry#renew方法进行更新,最终调用 leaseToRenew.renew()更新到最新的更新时间
public boolean renew(String appName, String id, boolean isReplication) {
RENEW.increment(isReplication);
//从注册列表中获取对应实例信息
Map<String, Lease<InstanceInfo>> gMap = registry.get(appName);
Lease<InstanceInfo> leaseToRenew = null;
if (gMap != null) {
leaseToRenew = gMap.get(id);
}
if (leaseToRenew == null) {
RENEW_NOT_FOUND.increment(isReplication);
logger.warn("DS: Registry: lease doesn't exist, registering resource: {} - {}", appName, id);
return false;
} else {
InstanceInfo instanceInfo = leaseToRenew.getHolder();
if (instanceInfo != null) {
// touchASGCache(instanceInfo.getASGName());
InstanceStatus overriddenInstanceStatus = this.getOverriddenInstanceStatus(
instanceInfo, leaseToRenew, isReplication);
if (overriddenInstanceStatus == InstanceStatus.UNKNOWN) {
logger.info("Instance status UNKNOWN possibly due to deleted override for instance {}"
+ "; re-register required", instanceInfo.getId());
RENEW_NOT_FOUND.increment(isReplication);
return false;
}
if (!instanceInfo.getStatus().equals(overriddenInstanceStatus)) {
logger.info(
"The instance status {} is different from overridden instance status {} for instance {}. "
+ "Hence setting the status to overridden status", instanceInfo.getStatus().name(),
instanceInfo.getOverriddenStatus().name(),
instanceInfo.getId());
instanceInfo.setStatusWithoutDirty(overriddenInstanceStatus);
}
}
renewsLastMin.increment();
//最终的方法:更新最近的更新时间
leaseToRenew.renew();
return true;
}
}
3.3 剔除,服务下线(EvictionTask)
首先我们看到这个方法,
if (!isLeaseExpirationEnabled()) {
logger.debug("DS: lease expiration is currently disabled.");
return;
}
看日志可以看出当前租户过期不可用,也就是说不会因为有实例续约过期而被剔除,那么我们看一下是取决于哪些条件,点进去看一下:
@Override
public boolean isSelfPreservationModeEnabled() {
return serverConfig.shouldEnableSelfPreservation();
}
一是来自于配置文件
eureka.server.enableSelfPreservation自我保护模式是否开启,自我保护模式,当出现出现网络分区、eureka在短时间内丢失过多客户端时,会进入自我保护模式,即一个服务长时间没有发送心跳,eureka也不会将其删除,默认为true.如果该值配置为false,则永远不会进入保护模式,那么一旦遇到网络波动,会有大量的服务实例被剔除,但是他们却都是可用的,这是很危险的;如果是内网则另当别论了;
二是通过阈值控制;
如果Eureka Server最近1分钟收到renew的次数小于阈值(即预期的最小值),则会触发自我保护模式,此时Eureka
Server此时会认为这是网络问题,它不会注销任何过期的实例。等到最近收到renew的次数大于阈值后,则Eureka
Server退出自我保护模式。 自我保护模式阈值计算:每个instance的预期心跳数目 = 60/每个instance的心跳间隔秒数 阈值 =
所有注册到服务的instance的数量的预期心跳之和 *自我保护系数 以上的参数都可配置的:
instance的心跳间隔秒数:eureka.instance.lease-renewal-interval-in-seconds
自我保护系数:eureka.server.renewal-percent-threshold
3.3.2 随机剔除服务实例
遍历服务列表,并判断是否过期,若过期将其add到expiredLeases中;
List<Lease<InstanceInfo>> expiredLeases = new ArrayList<>();
for (Entry<String, Map<String, Lease<InstanceInfo>>> groupEntry : registry.entrySet()) {
Map<String, Lease<InstanceInfo>> leaseMap = groupEntry.getValue();
if (leaseMap != null) {
for (Entry<String, Lease<InstanceInfo>> leaseEntry : leaseMap.entrySet()) {
Lease<InstanceInfo> lease = leaseEntry.getValue();
if (lease.isExpired(additionalLeaseMs) && lease.getHolder() != null) {
expiredLeases.add(lease);
}
}
}
}
接下来注册中心有执行了一个保护的操作:根据本地服务的数量重新计算了续约阈值,然后与注册的服务数量做差作为本次剔除服务数量的最大值,再对比放在过期待剔除服务列表中的数量,取最小值作为本次剔除过期服务的数量.该计算过程为了避免某些原因使得该注册中心节点服务实例被全部剔除;
若计算后最终要剔除的服务数量小于待剔除服务列表中的数量,则采取随机方式剔除;
`
//重新计算剔除数量
int registrySize = (int) getLocalRegistrySize();
int registrySizeThreshold = (int) (registrySize * serverConfig.getRenewalPercentThreshold());
int evictionLimit = registrySize - registrySizeThreshold;
int toEvict = Math.min(expiredLeases.size(), evictionLimit);
if (toEvict > 0) {
logger.info("Evicting {} items (expired={}, evictionLimit={})", toEvict, expiredLeases.size(), evictionLimit);
//随机剔除
Random random = new Random(System.currentTimeMillis());
for (int i = 0; i < toEvict; i++) {
// Pick a random item (Knuth shuffle algorithm)
int next = i + random.nextInt(expiredLeases.size() - i);
Collections.swap(expiredLeases, i, next);
Lease<InstanceInfo> lease = expiredLeases.get(i);
String appName = lease.getHolder().getAppName();
String id = lease.getHolder().getId();
EXPIRED.increment();
logger.warn("DS: Registry: expired lease for {}/{}", appName, id);
internalCancel(appName, id, false);
}
}
剔除后通知到其他注册中心节点;