This article walks through the Eureka source code step by step to analyze how the Client refreshes its cached registry information.
Preface
What is cache refresh?
The Client starts a scheduled task that periodically pulls the latest registry information from the Server over HTTP and caches it locally.
How to enable cache refresh
The Eureka Client enables cache refresh by default, which means it will fetch registry information from the Eureka Server.
It is controlled by the following configuration:
eureka:
client:
    fetch-registry: true # enabled by default
Refresh interval
By default, the refresh task runs every 30 seconds.
eureka:
client:
registry-fetch-interval-seconds: 30
This parameter customizes the interval between refresh task runs.
Fetching only a single VIP's registry information
If the Client is only interested in the registry information for a single VIP address (rather than the full registry), it can be configured with the following parameter:
eureka:
client:
    registry-refresh-single-vip-address: xxx # fetch only the registry information for this single VIP address
What is a full fetch?
A full fetch means that every time the registry cache refresh task runs, it retrieves the entire registry from the Server. There is no dedicated switch for full fetches: once delta fetching is disabled, every fetch is a full fetch.
What is a delta (incremental) fetch?
In Eureka, registry instances change far less often than clients request registry information. If every request pulled all instance information from the Server's registry, it would waste bandwidth and degrade transfer performance.
For this reason Eureka offers a delta fetch feature, which improves transfer efficiency and reduces network congestion; this is the incremental update.
Configuration:
eureka:
  client:
    disable-delta: false # delta fetching of the registry cache is enabled by default, i.e. not disabled
This parameter adjusts the cache refresh strategy.
Sequence diagram
Source code
DiscoveryClient
localRegionApps.set(new Applications());
@Override
public Applications getApplications() {
return localRegionApps.get();
}
@Inject
DiscoveryClient(ApplicationInfoManager applicationInfoManager, EurekaClientConfig config, AbstractDiscoveryClientOptionalArgs<?> args,
                Provider<BackupRegistry> backupRegistryProvider) {
    // ...
    localRegionApps.set(new Applications());
    // 1. If registry fetching is enabled, proactively fetch the registry once while DiscoveryClient
    //    is being initialized, before any of the timers are started
    if (clientConfig.shouldFetchRegistry() && !fetchRegistry(false)) {
        // If fetchRegistry(false) returns false, it means none of the Eureka Servers are available,
        // so the data is fetched from the backup registry instead
        fetchRegistryFromBackup();
    }
    // 2. Initialize the scheduled tasks
    initScheduledTasks();
    // ...
}
As you can see, during DiscoveryClient initialization, before the heartbeat renewal, cache refresh and instance info replication timers are started,
the client immediately makes one proactive registry fetch from the Eureka Server; if that fails, fetchRegistryFromBackup() is called to fetch the registry from a backup source.
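For context, the cache refresh timer that initScheduledTasks() registers looks roughly like the following, abridged from the Eureka 1.x DiscoveryClient (details may differ slightly between versions). CacheRefreshThread.run() simply calls refreshRegistry(), which ends up in the same fetchRegistry(...) method analyzed below.
// Abridged sketch of the cache refresh scheduling inside initScheduledTasks()
if (clientConfig.shouldFetchRegistry()) {
    // registry cache refresh timer: fires every registry-fetch-interval-seconds (default 30s)
    int registryFetchIntervalSeconds = clientConfig.getRegistryFetchIntervalSeconds();
    int expBackOffBound = clientConfig.getCacheRefreshExecutorExponentialBackOffBound();
    scheduler.schedule(
            new TimedSupervisorTask(
                    "cacheRefresh",
                    scheduler,
                    cacheRefreshExecutor,
                    registryFetchIntervalSeconds,
                    TimeUnit.SECONDS,
                    expBackOffBound,
                    new CacheRefreshThread() // its run() calls refreshRegistry() -> fetchRegistry(...)
            ),
            registryFetchIntervalSeconds, TimeUnit.SECONDS);
}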
Setting the rest of the scheduled-task initialization aside, let's go straight to the code of that first proactive fetch, fetchRegistry(false).
/**
 * Fetch the registry information.
 *
 * @param forceFullRegistryFetch whether to force a full registry fetch
 *
 * @return true if the registry was fetched, false otherwise
 */
private boolean fetchRegistry(boolean forceFullRegistryFetch) {
    // 1. Start the stopwatch that times the registry fetch
Stopwatch tracer = FETCH_REGISTRY_TIMER.start();
try {
        // 2. Get the registry information already cached locally
Applications applications = getApplications();
        // 3. If delta fetching is disabled, a full fetch is always performed; otherwise, once some registry
        //    instances have been fetched (applications.getRegisteredApplications().size() > 0), delta fetches are used
        if (clientConfig.shouldDisableDelta() // delta fetching disabled via configuration
                || (!Strings.isNullOrEmpty(clientConfig.getRegistryRefreshSingleVipAddress()))
                || forceFullRegistryFetch // full fetch forced via the forceFullRegistryFetch parameter
                || (applications == null) // always false here
                || (applications.getRegisteredApplications().size() == 0) // keep doing full fetches until at least one registered instance has been fetched locally
                || (applications.getVersion() == -1)) //Client application does not have latest library supporting delta
{
logger.info("Disable delta property : {}", clientConfig.shouldDisableDelta());
logger.info("Single vip registry refresh property : {}", clientConfig.getRegistryRefreshSingleVipAddress());
logger.info("Force full registry fetch : {}", forceFullRegistryFetch);
logger.info("Application is null : {}", (applications == null));
logger.info("Registered Applications size is zero : {}",
(applications.getRegisteredApplications().size() == 0));
logger.info("Application version is -1: {}", (applications.getVersion() == -1));
            // 3.1. Do a full fetch and store the registry locally
getAndStoreFullRegistry();
} else {
            // 3.2. Do a delta fetch and merge it into the existing local registry
getAndUpdateDelta(applications);
}
applications.setAppsHashCode(applications.getReconcileHashCode());
logTotalInstances();
} catch (Throwable e) {
logger.error(PREFIX + "{} - was unable to refresh its cache! status = {}", appPathIdentifier, e.getMessage(), e);
return false;
} finally {
if (tracer != null) {
tracer.stop();
}
}
// Notify about cache refresh before updating the instance remote status
onCacheRefreshed();
// Update remote status based on refreshed data held in the cache
updateInstanceRemoteStatus();
// registry was fetched successfully, so return true
return true;
}
A quick analysis: a full registry fetch is performed when any one of the following conditions is met:
- Delta fetching is disabled, i.e. disableDelta = true
- The forceFullRegistryFetch parameter is true
- applications == null: this is always false here, because localRegionApps was already initialized before this method is called:
  localRegionApps.set(new Applications());
- applications.getRegisteredApplications().size() == 0, i.e. no registry information has been fetched locally yet
- applications.getVersion() == -1, or clientConfig.getRegistryRefreshSingleVipAddress() is non-empty (not null)
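Otherwise the delta branch, getAndUpdateDelta(applications), runs; its body is not expanded in this article, but conceptually it fetches a delta from the server, patches the local copy instance by instance according to each instance's ActionType, then compares a reconcile hash code with the server's and falls back to a full fetch if they drift apart. A simplified sketch of the merge step (a hypothetical helper condensing DiscoveryClient.updateDelta, not the verbatim Eureka code):
// Hypothetical helper: apply a fetched delta to the local registry copy
private void applyDelta(Applications applications, Applications delta) {
    for (Application app : delta.getRegisteredApplications()) {
        for (InstanceInfo instance : app.getInstances()) {
            switch (instance.getActionType()) {
                case ADDED:
                case MODIFIED:
                    // ADDED / MODIFIED: insert or overwrite the instance in the local copy
                    applications.addApplication(app);
                    applications.getRegisteredApplications(instance.getAppName()).addInstance(instance);
                    break;
                case DELETED:
                    // DELETED: drop the instance from the local copy
                    Application existing = applications.getRegisteredApplications(instance.getAppName());
                    if (existing != null) {
                        existing.removeInstance(instance);
                    }
                    break;
            }
        }
    }
}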
Full registry fetch: getAndStoreFullRegistry()
This method is also defined in the DiscoveryClient class.
/**
* Gets the full registry information from the eureka server and stores it locally.
* When applying the full registry, the following flow is observed:
*
* if (update generation have not advanced (due to another thread))
* atomically set the registry to the new registry
* fi
*
* @return the full registry information.
* @throws Throwable
* on error.
*/
private void getAndStoreFullRegistry() throws Throwable {
    // 1. The current registry fetch generation (i.e. how many fetches have happened)
long currentUpdateGeneration = fetchRegistryGeneration.get();
logger.info("Getting all instance registry info from the eureka server");
Applications apps = null;
    // 2. Depending on whether single-VIP fetching is configured, call the appropriate queryClient method to fetch the registry
    EurekaHttpResponse<Applications> httpResponse = clientConfig.getRegistryRefreshSingleVipAddress() == null
? eurekaTransport.queryClient.getApplications(remoteRegionsRef.get())
: eurekaTransport.queryClient.getVip(clientConfig.getRegistryRefreshSingleVipAddress(), remoteRegionsRef.get());
    // 3. If the response status code indicates success, extract the fetched registry, i.e. the Applications object
if (httpResponse.getStatusCode() == Status.OK.getStatusCode()) {
apps = httpResponse.getEntity();
}
logger.info("The response status is {}", httpResponse.getStatusCode());
    // If the whole Applications object is null, something went wrong on the server side; log an error
if (apps == null) {
logger.error("The application is null for some reason. Not storing this information");
} else if (fetchRegistryGeneration.compareAndSet(currentUpdateGeneration, currentUpdateGeneration + 1)) {
        // 3.1 CAS the fetch generation; if it succeeds, no other thread is racing with us.
        // Filter the registry down to UP instances, shuffle them,
        // and then replace the local registry with the processed result
localRegionApps.set(this.filterAndShuffle(apps));
logger.debug("Got full registry with apps hashcode {}", apps.getAppsHashCode());
} else {
logger.warn("Not updating applications as another thread is updating it already");
}
}
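A note on the filterAndShuffle(apps) call above: in the common single-region case it essentially reduces to the sketch below (the real method also handles remote-region registries). Since shouldFilterOnlyUpInstances defaults to true, only UP instances are kept, and the instance order is shuffled so that clients do not all hit the same instance first.
// Simplified sketch of DiscoveryClient.filterAndShuffle(apps); remote-region handling omitted
private Applications filterAndShuffle(Applications apps) {
    if (apps != null) {
        // keep only UP instances (by default) and randomize their order
        apps.shuffleInstances(clientConfig.shouldFilterOnlyUpInstances());
    }
    return apps;
}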
AbstractJerseyEurekaHttpClient.getApplications(): issuing the full registry fetch request
// This method simply builds an HTTP GET request to the server to fetch the registry
private EurekaHttpResponse<Applications> getApplicationsInternal(String urlPath, String[] regions) {
ClientResponse response = null;
String regionsParamValue = null;
try {
WebResource webResource = jerseyClient.resource(serviceUrl).path(urlPath);
if (regions != null && regions.length > 0) {
regionsParamValue = StringUtil.join(regions);
webResource = webResource.queryParam("regions", regionsParamValue);
}
Builder requestBuilder = webResource.getRequestBuilder();
addExtraHeaders(requestBuilder);
response = requestBuilder.accept(MediaType.APPLICATION_JSON_TYPE).get(ClientResponse.class);
Applications applications = null;
        // 1. If the status code is 200 and a response body is present, convert the body into an Applications object
if (response.getStatus() == Status.OK.getStatusCode() && response.hasEntity()) {
applications = response.getEntity(Applications.class);
}
        // return the result to the caller
return anEurekaHttpResponse(response.getStatus(), Applications.class)
.headers(headersOf(response))
.entity(applications)
.build();
} finally {
if (logger.isDebugEnabled()) {
logger.debug("Jersey HTTP GET {}/{}?{}; statusCode={}",
serviceUrl, urlPath,
regionsParamValue == null ? "" : "regions=" + regionsParamValue,
response == null ? "N/A" : response.getStatus()
);
}
        // close the response
if (response != null) {
response.close();
}
}
}
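For completeness, the public entry points used in getAndStoreFullRegistry() delegate to this internal method roughly as follows (abridged), so a full fetch is a GET against {serviceUrl}/apps and a single-VIP fetch is a GET against {serviceUrl}/vips/{vipAddress}. On the server side, the apps path is handled by the ApplicationsResource shown next.
// Abridged from AbstractJerseyEurekaHttpClient: both fetch variants end up in getApplicationsInternal(...)
@Override
public EurekaHttpResponse<Applications> getApplications(String... regions) {
    return getApplicationsInternal("apps/", regions);
}

@Override
public EurekaHttpResponse<Applications> getVip(String vipAddress, String... regions) {
    return getApplicationsInternal("vips/" + vipAddress, regions);
}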
The Eureka Server side
ApplicationsResource
@Path("/{version}/apps")
@Produces({"application/xml", "application/json"})
public class ApplicationsResource {
// ...
@GET
public Response getContainers(@PathParam("version") String version,
@HeaderParam(HEADER_ACCEPT) String acceptHeader,
@HeaderParam(HEADER_ACCEPT_ENCODING) String acceptEncoding,
@HeaderParam(EurekaAccept.HTTP_X_EUREKA_ACCEPT) String eurekaAccept,
@Context UriInfo uriInfo,
@Nullable @QueryParam("regions") String regionsStr) {
boolean isRemoteRegionRequested = null != regionsStr && !regionsStr.isEmpty();
String[] regions = null;
if (!isRemoteRegionRequested) {
EurekaMonitors.GET_ALL.increment();
} else {
regions = regionsStr.toLowerCase().split(",");
Arrays.sort(regions); // So we don't have different caches for same regions queried in different order.
EurekaMonitors.GET_ALL_WITH_REMOTE_REGIONS.increment();
}
// Check if the server allows the access to the registry. The server can
// restrict access if it is not
// ready to serve traffic depending on various reasons.
if (!registry.shouldAllowAccess(isRemoteRegionRequested)) {
return Response.status(Status.FORBIDDEN).build();
}
CurrentRequestVersion.set(Version.toEnum(version));
        // 1. keyType and returnMediaType default to JSON
KeyType keyType = Key.KeyType.JSON;
String returnMediaType = MediaType.APPLICATION_JSON;
        // 1.1. If the Accept header is null or does not contain JSON, switch the keyType to XML
if (acceptHeader == null || !acceptHeader.contains(HEADER_JSON_VALUE)) {
keyType = Key.KeyType.XML;
returnMediaType = MediaType.APPLICATION_XML;
}
        // 2. Build a new cache key used to look up the corresponding value; it is composed of entityType + entityName + requestType + requestVersion + eurekaAccept
Key cacheKey = new Key(Key.EntityType.Application,
ResponseCacheImpl.ALL_APPS,
keyType, CurrentRequestVersion.get(), EurekaAccept.fromString(eurekaAccept), regions
);
Response response;
        // 3. Return a differently encoded payload depending on the Accept-Encoding header
if (acceptEncoding != null && acceptEncoding.contains(HEADER_GZIP_VALUE)) {
            // 3.1. Return the GZIP-compressed registry payload
response = Response.ok(responseCache.getGZIP(cacheKey))
.header(HEADER_CONTENT_ENCODING, HEADER_GZIP_VALUE)
.header(HEADER_CONTENT_TYPE, returnMediaType)
.build();
} else {
            // 3.2. Return the plain (uncompressed) registry payload
response = Response.ok(responseCache.get(cacheKey))
.build();
}
return response;
}
}
As you can see, the Server builds a cacheKey from the client's request parameters and, depending on whether the client's acceptEncoding asks for compression, returns the value cached for that key in ResponseCacheImpl.
ResponseCacheImpl: the response cache class
The Eureka Server caches the results of clients' registry fetch requests instead of reading the registry on every request; the data is held in a ResponseCache instance. The first time a given key is requested, its cache entry is generated; subsequent requests are served directly from the cache, which improves request performance.
As for keeping the cached data up to date, Eureka uses a background thread to refresh it periodically.
How cache keys are classified
Based on the type of client request, cache entries fall into three categories:
all applications: the full registry
delta changes: the incremental changes
individual applications: a single application's registrations
For each category, the cached data comes in both compressed and uncompressed forms.
The compressed form provides efficient network transfer, especially when the full registry is queried.
Depending on the media type the client accepts, the cached payload is further split into JSON and XML variants.
Finally, different request versions also produce different cache entries.
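As an illustration (hypothetical values, mirroring the Key constructor used in ApplicationsResource above), the three categories differ only in the entity name baked into the key:
// Hypothetical illustration: ALL_APPS and ALL_APPS_DELTA are the names used by ResponseCacheImpl,
// while an individual application is keyed by its app name
Key fullKey  = new Key(Key.EntityType.Application, ResponseCacheImpl.ALL_APPS,
        Key.KeyType.JSON, Version.V2, EurekaAccept.full, null);
Key deltaKey = new Key(Key.EntityType.Application, ResponseCacheImpl.ALL_APPS_DELTA,
        Key.KeyType.JSON, Version.V2, EurekaAccept.full, null);
Key appKey   = new Key(Key.EntityType.Application, "MY-SERVICE",
        Key.KeyType.JSON, Version.V2, EurekaAccept.full, null);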
Cache-related configuration
For the response cache (ResponseCache), the Eureka Server uses a two-level cache: a readWrite (read-write) cache with an expiry time and a readonly (read-only) cache without one.
eureka:
server:
    use-read-only-response-cache: true # whether to serve responses from the read-only cache, enabled by default
    response-cache-update-interval-ms: 30000 # how often the level-1 (read-only) response cache is refreshed, default 30 seconds
    initial-capacity-of-response-cache: 1000 # initial capacity of the level-2 (read-write) response cache, default 1000
    response-cache-auto-expiration-in-seconds: 180 # auto-expiry time of the level-2 (read-write) response cache, default 180 seconds
How the level-1 and level-2 caches are consulted
When the level-1 (read-only) cache is enabled, values are served from it; otherwise they are read directly from the level-2 (read-write) cache.
On the first lookup for a key (when the level-1 cache has no entry yet), the value is loaded from the level-2 cache and then stored into the level-1 (read-only) cache.
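A simplified sketch of that lookup path, based on ResponseCacheImpl.getValue and shown here in lightly simplified form:
// Consult the level-1 (read-only) map first when enabled; on a miss, load from the level-2 (read-write)
// Guava cache (its CacheLoader generates the payload on demand) and promote the value into the read-only map
Value getValue(final Key key, boolean useReadOnlyCache) {
    Value payload = null;
    try {
        if (useReadOnlyCache) {
            final Value currentPayload = readOnlyCacheMap.get(key);
            if (currentPayload != null) {
                payload = currentPayload;
            } else {
                payload = readWriteCacheMap.get(key);
                readOnlyCacheMap.put(key, payload);
            }
        } else {
            payload = readWriteCacheMap.get(key);
        }
    } catch (Throwable t) {
        logger.error("Cannot get value for key : {}", key, t);
    }
    return payload;
}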
Source code
// The response cache class
public class ResponseCacheImpl implements ResponseCache {
    // Timer that, when the level-1 cache is enabled, periodically refreshes the level-1 cache
    private final java.util.Timer timer = new java.util.Timer("Eureka-CacheFillTimer", true);
    // Level-1 cache (read-only), implemented with a ConcurrentHashMap
    private final ConcurrentMap<Key, Value> readOnlyCacheMap = new ConcurrentHashMap<Key, Value>();
    // Level-2 cache (read-write), implemented with Google Guava's LoadingCache
    private final LoadingCache<Key, Value> readWriteCacheMap;
    // Whether the level-1 cache is preferred for serving responses, customizable via configuration
    private final boolean shouldUseReadOnlyResponseCache;
    // The peer-aware instance registry, PeerAwareInstanceRegistryImpl
    private final AbstractInstanceRegistry registry;
    // The Eureka Server configuration
    private final EurekaServerConfig serverConfig;
    // other fields omitted
    ...
    // Constructor; invoked when the PeerAwareInstanceRegistryImpl instance is initialized
    ResponseCacheImpl(EurekaServerConfig serverConfig, ServerCodecs serverCodecs, AbstractInstanceRegistry registry) {
        // other initialization omitted
        ...
        // 1. Whether the level-1 cache is enabled
        this.shouldUseReadOnlyResponseCache = serverConfig.shouldUseReadOnlyResponseCache();
        // 2. The instance registry held by the cache
        this.registry = registry;
        // 3. The refresh interval of the level-1 response cache, default 30s
        long responseCacheUpdateIntervalMs = serverConfig.getResponseCacheUpdateIntervalMs();
        // 4. The most important step: initialize the level-2 cache, since the level-1 cache always reads from and is refreshed from it
        this.readWriteCacheMap = initReadWriteCacheMap();
        // 5. If the level-1 cache is enabled, start a timer task that periodically copies data from the level-2 cache into the level-1 cache
if (shouldUseReadOnlyResponseCache) {
timer.schedule(getCacheUpdateTask(),
new Date(((System.currentTimeMillis() / responseCacheUpdateIntervalMs) * responseCacheUpdateIntervalMs)
+ responseCacheUpdateIntervalMs),
responseCacheUpdateIntervalMs);
}
try {
Monitors.registerObject(this);
} catch (Throwable e) {
logger.warn("Cannot register the JMX monitor for the InstanceRegistry", e);
}
}
    // The level-1 cache refresh timer task:
    // for every key in the level-1 cache, read that key's value from the level-2 cache, compare it with the level-1 value, and replace it if they differ
private TimerTask getCacheUpdateTask() {
return new TimerTask() {
@Override
public void run() {
logger.debug("Updating the client cache from response cache");
for (Key key : readOnlyCacheMap.keySet()) {
if (logger.isDebugEnabled()) {
logger.debug("Updating the client cache from response cache for key : {} {} {} {}",
key.getEntityType(), key.getName(), key.getVersion(), key.getType());
}
try {
CurrentRequestVersion.set(key.getVersion());
Value cacheValue = readWriteCacheMap.get(key);
Value currentCacheValue = readOnlyCacheMap.get(key);
if (cacheValue != currentCacheValue) {
readOnlyCacheMap.put(key, cacheValue);
}
} catch (Throwable th) {
logger.error("Error while updating the client cache from response cache for key {}", key.toStringCompact(), th);
}
}
}
};
}
Let's single out the most important step, the initialization of the level-2 (read-write) cache:
this.readWriteCacheMap =
        // build the level-2 cache with Google Guava's CacheBuilder
        CacheBuilder.newBuilder()
                // 1. initial capacity of the level-2 cache, default 1000
                .initialCapacity(serverConfig.getInitialCapacityOfResponseCache())
                // 2. auto-expiry time, i.e. how long after a write an entry expires, default 180 seconds
                .expireAfterWrite(serverConfig.getResponseCacheAutoExpirationInSeconds(), TimeUnit.SECONDS)
                // 3. removal listener added to address an old GitHub issue: https://github.com/Netflix/eureka/issues/118
                //    (cached keys containing regions were not being evicted within the default 180 seconds)
                .removalListener(new RemovalListener<Key, Value>() {
                    @Override
                    public void onRemoval(RemovalNotification<Key, Value> notification) {
                        Key removedKey = notification.getKey();
                        if (removedKey.hasRegions()) {
                            Key cloneWithNoRegions = removedKey.cloneWithoutRegions();
                            regionSpecificKeys.remove(cloneWithNoRegions, removedKey);
                        }
                    }
                })
                // 4. The key part is this CacheLoader: when readWriteCacheMap.get(key) is called and the cache has no entry
                //    for that key, this load method runs and calls generatePayload(key) to generate the value,
                //    which is then stored in the cache and returned; this is Guava's loading-cache mechanism
                .build(new CacheLoader<Key, Value>() {
                    @Override
                    public Value load(Key key) throws Exception {
                        if (key.hasRegions()) {
                            Key cloneWithNoRegions = key.cloneWithoutRegions();
                            regionSpecificKeys.put(cloneWithNoRegions, key);
                        }
                        Value value = generatePayload(key);
                        return value;
                    }
                });
generatePayload(key): generating the value for a cache key
// Based on the key's entityType and name, call the appropriate registry method to obtain the corresponding Applications,
// then call getPayLoad to serialize the Applications as JSON or XML according to the key's type,
// and finally wrap the result in a Value object. Note that when the Value is constructed, a gzipped version of the payload is generated internally as well.
private Value generatePayload(Key key) {
Stopwatch tracer = null;
try {
String payload;
switch (key.getEntityType()) {
case Application:
boolean isRemoteRegionRequested = key.hasRegions();
if (ALL_APPS.equals(key.getName())) {
if (isRemoteRegionRequested) {
tracer = serializeAllAppsWithRemoteRegionTimer.start();
payload = getPayLoad(key, registry.getApplicationsFromMultipleRegions(key.getRegions()));
} else {
tracer = serializeAllAppsTimer.start();
payload = getPayLoad(key, registry.getApplications());
}
} else if (ALL_APPS_DELTA.equals(key.getName())) {
if (isRemoteRegionRequested) {
tracer = serializeDeltaAppsWithRemoteRegionTimer.start();
versionDeltaWithRegions.incrementAndGet();
versionDeltaWithRegionsLegacy.incrementAndGet();
payload = getPayLoad(key,
registry.getApplicationDeltasFromMultipleRegions(key.getRegions()));
} else {
tracer = serializeDeltaAppsTimer.start();
versionDelta.incrementAndGet();
versionDeltaLegacy.incrementAndGet();
payload = getPayLoad(key, registry.getApplicationDeltas());
}
} else {
tracer = serializeOneApptimer.start();
payload = getPayLoad(key, registry.getApplication(key.getName()));
}
break;
case VIP:
case SVIP:
tracer = serializeViptimer.start();
payload = getPayLoad(key, getApplicationsForVip(key, registry));
break;
default:
logger.error("Unidentified entity type: {} found in the cache key.", key.getEntityType());
payload = "";
break;
}
return new Value(payload);
} finally {
if (tracer != null) {
tracer.stop();
}
}
}
Cache invalidation
The level-1 (read-only) cache has no auto-expiry and is never invalidated directly.
The level-2 (read-write) cache auto-expires entries after the default 180 seconds, and its entries are also evicted whenever a service instance registers, is cancelled (goes offline), expires, or changes status (see the sketch below).
Moreover, once data has been evicted from the level-2 cache, the level-1 cache can only pick up the change through the periodic refresh task, which means it may take up to the default 30 seconds before the level-1 cache is updated.
The level-1 cache is enabled by default; if this 30-second delay in propagating response cache changes is unacceptable, you can disable the read-only cache.
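The eviction mentioned above happens because AbstractInstanceRegistry calls invalidateCache(appName, vipAddress, secureVipAddress) on register/cancel/expiry/status change, which ultimately evicts the affected keys (the application itself plus ALL_APPS and ALL_APPS_DELTA, across all key types and versions) from the read-write cache. A simplified sketch of the final step, based on ResponseCacheImpl.invalidate:
// Only the level-2 (read-write) cache is invalidated here; the level-1 cache catches up
// on the next run of the 30-second refresh timer
public void invalidate(Key... keys) {
    for (Key key : keys) {
        logger.debug("Invalidating the response cache key : {} {} {} {}, {}",
                key.getEntityType(), key.getName(), key.getVersion(), key.getType(), key.getEurekaAccept());
        readWriteCacheMap.invalidate(key);
        Collection<Key> keysWithRegions = regionSpecificKeys.get(key);
        if (null != keysWithRegions && !keysWithRegions.isEmpty()) {
            for (Key keysWithRegion : keysWithRegions) {
                readWriteCacheMap.invalidate(keysWithRegion);
            }
        }
    }
}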
That wraps up the periodic refresh flow; a flow chart of how the server-side level-1/level-2 caches are read and written is provided below.
In closing
A few small questions to ponder:
Does the client pull the registry from the server once right at startup?
How is the client-side cache refresh mechanism implemented?
What optimizations does the server make for clients' frequent registry fetch requests? When a client registers, goes offline, expires, or changes status, will other clients see the updated registry immediately?
Is it possible that a client has already gone offline while other clients still believe its service is up? How could that be improved?