Apollo-配置更新原理与源码解析

文章目录

  • 1. 基础模型
  • 2. 用户配置修改并发布页面
  • 3. 服务端设计-配置发布后的实时推送
    • 3.1 用户在Portal操作配置发布
    • 3.2 Portal调用Admin Service的接口操作发布
      • 3.2.1 RetryableRestTemplate如何轮询
      • 3.2.3 请求到admin service
    • 3.3 Admin Service发布配置后,发送ReleaseMessage给各个Config Service
      • 3.3.1 ReleaseMessageScanner是通过@Bean方式生成的bean
      • 3.3.2 ReleaseMessageScanner
    • 3.4 Config Service收到ReleaseMessage后,通知对应的客户端
  • 4.客户端设计-Apollo客户端从配置中心拉取最新的配置、更新本地配置并通知到应用
    • 4.1 RemoteConfigRepository定时拉取配置
    • 4.2 RemoteConfigLongPollService发起长连接
  • 参考资料

1. 基础模型

如下图所示,

  1. 用户在配置中心对配置进行修改并发布
  2. 配置中心通知Apollo客户端有配置更新
  3. Apollo客户端从配置中心拉取最新的配置、更新本地配置并通知到应用

Apollo-配置更新原理与源码解析_第1张图片

2. 用户配置修改并发布页面

Apollo-配置更新原理与源码解析_第2张图片

3. 服务端设计-配置发布后的实时推送

如下图,大体有四个步骤:

  1. 用户在Portal操作配置发布
  2. Portal调用Admin Service的接口操作发布
  3. Admin Service发布配置后,发送ReleaseMessage给各个Config Service
  4. Config Service收到ReleaseMessage后,通知对应的客户端

如需debug,搭建教程如下;

  1. Apollo-本地开发环境搭建

  2. Apollo-SpringBoot应用集成Apollo客户端Apollo-配置更新原理与源码解析_第3张图片

3.1 用户在Portal操作配置发布

Portal端的类com.ctrip.framework.apollo.portal.controller.ReleaseController。

// @PreAuthorize 权限校验
@PreAuthorize(value = "@permissionValidator.hasReleaseNamespacePermission(#appId, #namespaceName, #env)")
  @PostMapping(value = "/apps/{appId}/envs/{env}/clusters/{clusterName}/namespaces/{namespaceName}/releases")
  public ReleaseDTO createRelease(@PathVariable String appId,
                                  @PathVariable String env, @PathVariable String clusterName,
                                  @PathVariable String namespaceName, @RequestBody NamespaceReleaseModel model) {
    model.setAppId(appId);
    model.setEnv(env);
    model.setClusterName(clusterName);
    model.setNamespaceName(namespaceName);

    if (model.isEmergencyPublish() && !portalConfig.isEmergencyPublishAllowed(Env.valueOf(env))) {
      throw new BadRequestException(String.format("Env: %s is not supported emergency publish now", env));
    }
	// 发布配置
    ReleaseDTO createdRelease = releaseService.publish(model);

    ConfigPublishEvent event = ConfigPublishEvent.instance();
    event.withAppId(appId)
        .withCluster(clusterName)
        .withNamespace(namespaceName)
        .withReleaseId(createdRelease.getId())
        .setNormalPublishEvent(true)
        .setEnv(Env.valueOf(env));
	// 触发配置发布事件,主要是触发webhook,发邮件等,不是本次关注的重点
    publisher.publishEvent(event);

    return createdRelease;
  }

com.ctrip.framework.apollo.portal.service.ReleaseService,
接着看releaseService.publish(model)方法,

 public ReleaseDTO publish(NamespaceReleaseModel model) {
    Env env = model.getEnv();
    boolean isEmergencyPublish = model.isEmergencyPublish();
    String appId = model.getAppId();
    String clusterName = model.getClusterName();
    String namespaceName = model.getNamespaceName();
    // 获取当前用户的userId
    String releaseBy = StringUtils.isEmpty(model.getReleasedBy()) ?
                       userInfoHolder.getUser().getUserId() : model.getReleasedBy();

	// 发布配置
    ReleaseDTO releaseDTO = releaseAPI.createRelease(appId, env, clusterName, namespaceName,
                                                     model.getReleaseTitle(), model.getReleaseComment(),
                                                     releaseBy, isEmergencyPublish);

    Tracer.logEvent(TracerEventType.RELEASE_NAMESPACE,
                    String.format("%s+%s+%s+%s", appId, env, clusterName, namespaceName));

    return releaseDTO;
  }

3.2 Portal调用Admin Service的接口操作发布

com.ctrip.framework.apollo.portal.api.AdminServiceAPI,接着看releaseAPI.createRelease(appId, env, clusterName,…),

 public ReleaseDTO createRelease(String appId, Env env, String clusterName, String namespace,
        String releaseName, String releaseComment, String operator,
        boolean isEmergencyPublish) {
      // 封装http请求
      HttpHeaders headers = new HttpHeaders();
      headers.setContentType(MediaType.parseMediaType(MediaType.APPLICATION_FORM_URLENCODED_VALUE + ";charset=UTF-8"));
      MultiValueMap<String, String> parameters = new LinkedMultiValueMap<>();
      parameters.add("name", releaseName);
      parameters.add("comment", releaseComment);
      parameters.add("operator", operator);
      parameters.add("isEmergencyPublish", String.valueOf(isEmergencyPublish));
      HttpEntity<MultiValueMap<String, String>> entity =
          new HttpEntity<>(parameters, headers);
      // http远程调用,restTemplate实际是RetryableRestTemplate,apoll自己封装的轮询重试的RestTemplate
      ReleaseDTO response = restTemplate.post(
          env, "apps/{appId}/clusters/{clusterName}/namespaces/{namespaceName}/releases", entity,
          ReleaseDTO.class, appId, clusterName, namespace);
      return response;
    }

3.2.1 RetryableRestTemplate如何轮询

该类上面官方的注释:封装RestTemplate. admin server集群在某些机器宕机或者超时的情况下轮询重试。那么它是如何轮询重试的,带着好奇,我debug看看。
Apollo-配置更新原理与源码解析_第4张图片
继续F7,
Apollo-配置更新原理与源码解析_第5张图片
继续F7,
Apollo-配置更新原理与源码解析_第6张图片
进入类AdminServiceAddressLocator的方法getServiceList(Env env),
Apollo-配置更新原理与源码解析_第7张图片
那么缓存cache中的服务又是怎么来的? 看AdminServiceAddressLocator 的源码。

  • @PostConstruct修饰的init()方法,在bean实例化后执行,方法中创建了定时任务线程池refreshServiceAddressService,并且在1ms后执行RefreshAdminServerAddressTask任务。
  • RefreshAdminServerAddressTask的run()方法如果获取服务正常,就每5min重新刷新服务,如果不正常就每10s刷新服务。这个地方的设计很妙。
/*
 * Copyright 2022 Apollo Authors
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 *
 */
package com.ctrip.framework.apollo.portal.component;

import com.ctrip.framework.apollo.portal.environment.PortalMetaDomainService;
import com.ctrip.framework.apollo.core.dto.ServiceDTO;
import com.ctrip.framework.apollo.portal.environment.Env;
import com.ctrip.framework.apollo.core.utils.ApolloThreadFactory;
import com.ctrip.framework.apollo.tracer.Tracer;
import com.google.common.collect.Lists;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.autoconfigure.http.HttpMessageConverters;
import org.springframework.stereotype.Component;
import org.springframework.util.CollectionUtils;
import org.springframework.web.client.RestTemplate;

import javax.annotation.PostConstruct;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

@Component
public class AdminServiceAddressLocator {

  private static final long NORMAL_REFRESH_INTERVAL = 5 * 60 * 1000;
  private static final long OFFLINE_REFRESH_INTERVAL = 10 * 1000;
  private static final int RETRY_TIMES = 3;
  private static final String ADMIN_SERVICE_URL_PATH = "/services/admin";
  private static final Logger logger = LoggerFactory.getLogger(AdminServiceAddressLocator.class);

  private ScheduledExecutorService refreshServiceAddressService;
  private RestTemplate restTemplate;
  private List<Env> allEnvs;
  private Map<Env, List<ServiceDTO>> cache = new ConcurrentHashMap<>();

  private final PortalSettings portalSettings;
  private final RestTemplateFactory restTemplateFactory;
  private final PortalMetaDomainService portalMetaDomainService;

// 构造函数注入bean
  public AdminServiceAddressLocator(
      final HttpMessageConverters httpMessageConverters,
      final PortalSettings portalSettings,
      final RestTemplateFactory restTemplateFactory,
      final PortalMetaDomainService portalMetaDomainService
  ) {
    this.portalSettings = portalSettings;
    this.restTemplateFactory = restTemplateFactory;
    this.portalMetaDomainService = portalMetaDomainService;
  }
// bean实例化后,执行该方法
  @PostConstruct
  public void init() {
    allEnvs = portalSettings.getAllEnvs();

    //init restTemplate
    restTemplate = restTemplateFactory.getObject();

	// 创建线程数量为1的定时任务线程池
    refreshServiceAddressService =
        Executors.newScheduledThreadPool(1, ApolloThreadFactory.create("ServiceLocator", true));
	// 1ms之后, 执行RefreshAdminServerAddressTask任务一次
    refreshServiceAddressService.schedule(new RefreshAdminServerAddressTask(), 1, TimeUnit.MILLISECONDS);
  }

  public List<ServiceDTO> getServiceList(Env env) {
  	// 从缓存获取服务
    List<ServiceDTO> services = cache.get(env);
    if (CollectionUtils.isEmpty(services)) {
      return Collections.emptyList();
    }
    List<ServiceDTO> randomConfigServices = Lists.newArrayList(services);
    // 随机排序后,返回list
    Collections.shuffle(randomConfigServices);
    return randomConfigServices;
  }

  //maintain admin server address
  private class RefreshAdminServerAddressTask implements Runnable {

    @Override
    public void run() {
      boolean refreshSuccess = true;
      //refresh fail if get any env address fail
      for (Env env : allEnvs) {
      // 刷新cache里面的服务地址,只要有一个env的服务获取失败,refreshSuccess 就为false
        boolean currentEnvRefreshResult = refreshServerAddressCache(env);
        refreshSuccess = refreshSuccess && currentEnvRefreshResult;
      }
	// 这段代码相当于定时任务里面又创建新的定时任务,不停的执行刷新服务地址。这里妙啊!
      if (refreshSuccess) {
      // 如果服务获取成功,就每5min再执行RefreshAdminServerAddressTask任务
        refreshServiceAddressService
            .schedule(new RefreshAdminServerAddressTask(), NORMAL_REFRESH_INTERVAL, TimeUnit.MILLISECONDS);
      } else {
      // 如果有任一服务获取失败,就每10s再执行RefreshAdminServerAddressTask任务
        refreshServiceAddressService
            .schedule(new RefreshAdminServerAddressTask(), OFFLINE_REFRESH_INTERVAL, TimeUnit.MILLISECONDS);
      }
    }
  }

  private boolean refreshServerAddressCache(Env env) {
	// RETRY_TIMES重试3次,这里好像是硬编码,没看到哪里可以改RETRY_TIMES
    for (int i = 0; i < RETRY_TIMES; i++) {

      try {
        ServiceDTO[] services = getAdminServerAddress(env);
        // 获取服务地址数据为空或者大小为0,继续执行循环。
        if (services == null || services.length == 0) {
          continue;
        }
        cache.put(env, Arrays.asList(services));
        return true;
      } catch (Throwable e) {
        logger.error(String.format("Get admin server address from meta server failed. env: %s, meta server address:%s",
                                   env, portalMetaDomainService.getDomain(env)), e);
        Tracer
            .logError(String.format("Get admin server address from meta server failed. env: %s, meta server address:%s",
                                    env, portalMetaDomainService.getDomain(env)), e);
      }
    }
    return false;
  }
// 获取服务地址
  private ServiceDTO[] getAdminServerAddress(Env env) {
    String domainName = portalMetaDomainService.getDomain(env);
    String url = domainName + ADMIN_SERVICE_URL_PATH;
    return restTemplate.getForObject(url, ServiceDTO[].class);
  }
}

3.2.3 请求到admin service

com.ctrip.framework.apollo.adminservice.controller.ReleaseController.

@Transactional
  @PostMapping("/apps/{appId}/clusters/{clusterName}/namespaces/{namespaceName}/releases")
  public ReleaseDTO publish(@PathVariable("appId") String appId,
                            @PathVariable("clusterName") String clusterName,
                            @PathVariable("namespaceName") String namespaceName,
                            @RequestParam("name") String releaseName,
                            @RequestParam(name = "comment", required = false) String releaseComment,
                            @RequestParam("operator") String operator,
                            @RequestParam(name = "isEmergencyPublish", defaultValue = "false") boolean isEmergencyPublish) {
    Namespace namespace = namespaceService.findOne(appId, clusterName, namespaceName);
    if (namespace == null) {
      throw new NotFoundException("Could not find namespace for %s %s %s", appId, clusterName,
          namespaceName);
    }
    // 发布配置,保存数据到release,releasehistory等表。
    Release release = releaseService.publish(namespace, releaseName, releaseComment, operator, isEmergencyPublish);

    //send release message
    Namespace parentNamespace = namespaceService.findParentNamespace(namespace);
    String messageCluster;
    if (parentNamespace != null) {
      messageCluster = parentNamespace.getClusterName();
    } else {
      messageCluster = clusterName;
    }
    // 发送ReleaseMessage的实现方式在这里
    messageSender.sendMessage(ReleaseMessageKeyGenerator.generate(appId, messageCluster, namespaceName),
                              Topics.APOLLO_RELEASE_TOPIC);
    return BeanUtils.transform(ReleaseDTO.class, release);
  }
  • ReleaseMessageKeyGenerator,生成ReleaseMessage,形式:appId + cluster + namespace。例子:test-app+default+application-dev.xml。
public class ReleaseMessageKeyGenerator {

  private static final Joiner STRING_JOINER = Joiner.on(ConfigConsts.CLUSTER_NAMESPACE_SEPARATOR);

  public static String generate(String appId, String cluster, String namespace) {
    return STRING_JOINER.join(appId, cluster, namespace);
  }
}

Apollo-配置更新原理与源码解析_第8张图片

  • DatabaseMessageSender,保存ReleaseMessage到releasemessage表。
@Component
public class DatabaseMessageSender implements MessageSender {
  private static final Logger logger = LoggerFactory.getLogger(DatabaseMessageSender.class);
  private static final int CLEAN_QUEUE_MAX_SIZE = 100;
  private BlockingQueue<Long> toClean = Queues.newLinkedBlockingQueue(CLEAN_QUEUE_MAX_SIZE);
  private final ExecutorService cleanExecutorService;
  private final AtomicBoolean cleanStopped;

  private final ReleaseMessageRepository releaseMessageRepository;

  public DatabaseMessageSender(final ReleaseMessageRepository releaseMessageRepository) {
    cleanExecutorService = Executors.newSingleThreadExecutor(ApolloThreadFactory.create("DatabaseMessageSender", true));
    cleanStopped = new AtomicBoolean(false);
    this.releaseMessageRepository = releaseMessageRepository;
  }

  @Override
  @Transactional
  public void sendMessage(String message, String channel) {
    logger.info("Sending message {} to channel {}", message, channel);
    if (!Objects.equals(channel, Topics.APOLLO_RELEASE_TOPIC)) {
      logger.warn("Channel {} not supported by DatabaseMessageSender!", channel);
      return;
    }

    Tracer.logEvent("Apollo.AdminService.ReleaseMessage", message);
    Transaction transaction = Tracer.newTransaction("Apollo.AdminService", "sendMessage");
    try {
    // 保存ReleaseMessage信息
      ReleaseMessage newMessage = releaseMessageRepository.save(new ReleaseMessage(message));
       // 将ReleaseMessage信息的id添加进将清理队列,offer方法类似add方法。
      toClean.offer(newMessage.getId());
      transaction.setStatus(Transaction.SUCCESS);
    } catch (Throwable ex) {
      logger.error("Sending message to database failed", ex);
      transaction.setStatus(ex);
      throw ex;
    } finally {
      transaction.complete();
    }
  }
	// bean实例化后,初始化一个拉姆达方法
  @PostConstruct
  private void initialize() {
    cleanExecutorService.submit(() -> {
    // 清除标志没停,并且线程没被打断
      while (!cleanStopped.get() && !Thread.currentThread().isInterrupted()) {
        try {
        // 从队列取出id
          Long rm = toClean.poll(1, TimeUnit.SECONDS);
          if (rm != null) {
          // 清理ReleaseMessage
            cleanMessage(rm);
          } else {
            TimeUnit.SECONDS.sleep(5);
          }
        } catch (Throwable ex) {
          Tracer.logError(ex);
        }
      }
    });
  }

  private void cleanMessage(Long id) {
    //double check in case the release message is rolled back
    // 因为admin service和config service可能是多个实例运行,因此message被清理掉,所以这里作双重判断
    ReleaseMessage releaseMessage = releaseMessageRepository.findById(id).orElse(null);
    if (releaseMessage == null) {
      return;
    }
    boolean hasMore = true;
    while (hasMore && !Thread.currentThread().isInterrupted()) {
      // 根据id和message查询小于当前id的信息,按照id降序排序的前100个message,因为同个配置可能多次发布,有多个版本,需要把旧版本删除
      List<ReleaseMessage> messages = releaseMessageRepository.findFirst100ByMessageAndIdLessThanOrderByIdAsc(
          releaseMessage.getMessage(), releaseMessage.getId());
     // 删除message
      releaseMessageRepository.deleteAll(messages);
      // 如果size不等于100,说明旧版本删除完毕
      hasMore = messages.size() == 100;

      messages.forEach(toRemove -> Tracer.logEvent(
          String.format("ReleaseMessage.Clean.%s", toRemove.getMessage()), String.valueOf(toRemove.getId())));
    }
  }

  void stopClean() {
    cleanStopped.set(true);
  }
}

3.3 Admin Service发布配置后,发送ReleaseMessage给各个Config Service

官方文档:Config Service有一个线程会每秒扫描一次ReleaseMessage表,看看是否有新的消息记录,参见ReleaseMessageScanner。那看看ReleaseMessageScanner究竟是个什么东西。
示意图:
在这里插入图片描述

3.3.1 ReleaseMessageScanner是通过@Bean方式生成的bean

具体看:com.ctrip.framework.apollo.configservice.ConfigServiceAutoConfiguration。

@Bean
    public ReleaseMessageScanner releaseMessageScanner() {
      ReleaseMessageScanner releaseMessageScanner = new ReleaseMessageScanner();
	// 注册了很多监听器
      //0. handle release message cache
      releaseMessageScanner.addMessageListener(releaseMessageServiceWithCache);
      //1. handle gray release rule
      releaseMessageScanner.addMessageListener(grayReleaseRulesHolder);
      //2. handle server cache
      releaseMessageScanner.addMessageListener(configService);
      releaseMessageScanner.addMessageListener(configFileController);
      //3. notify clients 重点关注NotificationControllerV2
      releaseMessageScanner.addMessageListener(notificationControllerV2);
      releaseMessageScanner.addMessageListener(notificationController);
      return releaseMessageScanner;
    }

3.3.2 ReleaseMessageScanner

public class ReleaseMessageScanner implements InitializingBean {
  private static final Logger logger = LoggerFactory.getLogger(ReleaseMessageScanner.class);
  private static final int missingReleaseMessageMaxAge = 10; // hardcoded to 10, could be configured via BizConfig if necessary
  @Autowired
  private BizConfig bizConfig;
  @Autowired
  private ReleaseMessageRepository releaseMessageRepository;
  private int databaseScanInterval;
  private final List<ReleaseMessageListener> listeners;
  private final ScheduledExecutorService executorService;
  private final Map<Long, Integer> missingReleaseMessages; // missing release message id => age counter
  private long maxIdScanned;

  public ReleaseMessageScanner() {
    listeners = Lists.newCopyOnWriteArrayList();
    // 创建一个定时任务线程池,线程数为1
    executorService = Executors.newScheduledThreadPool(1, ApolloThreadFactory
        .create("ReleaseMessageScanner", true));
    missingReleaseMessages = Maps.newHashMap();
  }
 // ReleaseMessageScanner 实现了接口InitializingBean 
// bean在初始化后,即相关成员变量已经完成注入,比如bizConfig完成注入,执行如下的方法
  @Override
  public void afterPropertiesSet() throws Exception {
  // 获取间隔时长,默认为1000,单位毫秒,可以通过apollo.message-scan.interval自己设置时长
    databaseScanInterval = bizConfig.releaseMessageScanIntervalInMilli();
    // 获取最大id
    maxIdScanned = loadLargestMessageId();
    // 等1000ms后,每隔1000ms执行定时线程任务。
    executorService.scheduleWithFixedDelay(() -> {
      Transaction transaction = Tracer.newTransaction("Apollo.ReleaseMessageScanner", "scanMessage");
      try {
 
        scanMissingMessages();
       // 扫描信息
        scanMessages();
        transaction.setStatus(Transaction.SUCCESS);
      } catch (Throwable ex) {
        transaction.setStatus(ex);
        logger.error("Scan and send message failed", ex);
      } finally {
        transaction.complete();
      }
    }, databaseScanInterval, databaseScanInterval, TimeUnit.MILLISECONDS);

  }

  /**
   * add message listeners for release message
   * @param listener
   */
  public void addMessageListener(ReleaseMessageListener listener) {
    if (!listeners.contains(listener)) {
      listeners.add(listener);
    }
  }

  /**
   * Scan messages, continue scanning until there is no more messages
   */
  private void scanMessages() {
    boolean hasMoreMessages = true;
    while (hasMoreMessages && !Thread.currentThread().isInterrupted()) {
    // 扫描发送信息
      hasMoreMessages = scanAndSendMessages();
    }
  }

  /**
   * scan messages and send
   *
   * @return whether there are more messages
   */
  private boolean scanAndSendMessages() {
    //current batch is 500
   // 查找当前大于最大id的500条信息
    List<ReleaseMessage> releaseMessages =
        releaseMessageRepository.findFirst500ByIdGreaterThanOrderByIdAsc(maxIdScanned);
    if (CollectionUtils.isEmpty(releaseMessages)) {
      return false;
    }
   // 向监听器发送信息
    fireMessageScanned(releaseMessages);
    int messageScanned = releaseMessages.size();
    // 获取新的最大id
    long newMaxIdScanned = releaseMessages.get(messageScanned - 1).getId();
    // check id gaps, possible reasons are release message not committed yet or already rolled back
    // 有可能message没有提交或回滚,记录missing message
    if (newMaxIdScanned - maxIdScanned > messageScanned) {
      recordMissingReleaseMessageIds(releaseMessages, maxIdScanned);
    }
    maxIdScanned = newMaxIdScanned;
    return messageScanned == 500;
  }

  private void scanMissingMessages() {
    Set<Long> missingReleaseMessageIds = missingReleaseMessages.keySet();
    Iterable<ReleaseMessage> releaseMessages = releaseMessageRepository
        .findAllById(missingReleaseMessageIds);
    fireMessageScanned(releaseMessages);
    releaseMessages.forEach(releaseMessage -> {
      missingReleaseMessageIds.remove(releaseMessage.getId());
    });
    growAndCleanMissingMessages();
  }

  private void growAndCleanMissingMessages() {
    Iterator<Entry<Long, Integer>> iterator = missingReleaseMessages.entrySet()
        .iterator();
    while (iterator.hasNext()) {
      Entry<Long, Integer> entry = iterator.next();
      if (entry.getValue() > missingReleaseMessageMaxAge) {
        iterator.remove();
      } else {
        entry.setValue(entry.getValue() + 1);
      }
    }
  }

  private void recordMissingReleaseMessageIds(List<ReleaseMessage> messages, long startId) {
    for (ReleaseMessage message : messages) {
      long currentId = message.getId();
      if (currentId - startId > 1) {
        for (long i = startId + 1; i < currentId; i++) {
          missingReleaseMessages.putIfAbsent(i, 1);
        }
      }
      startId = currentId;
    }
  }

  /**
   * find largest message id as the current start point
   * @return current largest message id
   */
  private long loadLargestMessageId() {
    ReleaseMessage releaseMessage = releaseMessageRepository.findTopByOrderByIdDesc();
    return releaseMessage == null ? 0 : releaseMessage.getId();
  }

  /**
   * Notify listeners with messages loaded
   * @param messages
   */
  private void fireMessageScanned(Iterable<ReleaseMessage> messages) {
    for (ReleaseMessage message : messages) {
      for (ReleaseMessageListener listener : listeners) {
        try {
        // 通知监听器
          listener.handleMessage(message, Topics.APOLLO_RELEASE_TOPIC);
        } catch (Throwable ex) {
          Tracer.logError(ex);
          logger.error("Failed to invoke message listener {}", listener.getClass(), ex);
        }
      }
    }
  }
}

3.4 Config Service收到ReleaseMessage后,通知对应的客户端

官方描述的实现方式如下:

  • 客户端会发起一个Http请求到Config Service的notifications/v2接口,也就是NotificationControllerV2,参见RemoteConfigLongPollService
  • NotificationControllerV2不会立即返回结果,而是通过Spring DeferredResult把请求挂起
  • 如果在60秒内没有该客户端关心的配置发布,那么会返回Http状态码304给客户端
  • 如果有该客户端关心的配置发布,NotificationControllerV2会调用DeferredResult的setResult方法,传入有配置变化的namespace信息,同时该请求会立即返回。客户端从返回的结果中获取到配置变化的namespace后,会立即请求Config Service获取该namespace的最新配置。

com.ctrip.framework.apollo.configservice.controller.NotificationControllerV2。

监听器处理消息的逻辑。

@Override
  public void handleMessage(ReleaseMessage message, String channel) {
    logger.info("message received - channel: {}, message: {}", channel, message);
	// content的形式:freedom-code+default+application-dev.yml
    String content = message.getMessage();
    Tracer.logEvent("Apollo.LongPoll.Messages", content);
    if (!Topics.APOLLO_RELEASE_TOPIC.equals(channel) || Strings.isNullOrEmpty(content)) {
      return;
    }
    // 根据+号切割成属组,并且取第三个元素,即application-dev.yml
    String changedNamespace = retrieveNamespaceFromReleaseMessage.apply(content);

    if (Strings.isNullOrEmpty(changedNamespace)) {
      logger.error("message format invalid - {}", content);
      return;
    }
	// 判断有没有客户端的请求
    if (!deferredResults.containsKey(content)) {
      return;
    }

    //create a new list to avoid ConcurrentModificationException
    List<DeferredResultWrapper> results = Lists.newArrayList(deferredResults.get(content));

    ApolloConfigNotification configNotification = new ApolloConfigNotification(changedNamespace, message.getId());
    configNotification.addMessage(content, message.getId());

    //do async notification if too many clients
    // 如果客户端很多,异步推送通知。NotificationBatch默认是100,可以通过apollo.release-message.notification.batch设定
    if (results.size() > bizConfig.releaseMessageNotificationBatch()) {
      largeNotificationBatchExecutorService.submit(() -> {
        logger.debug("Async notify {} clients for key {} with batch {}", results.size(), content,
            bizConfig.releaseMessageNotificationBatch());
        for (int i = 0; i < results.size(); i++) {
          if (i > 0 && i % bizConfig.releaseMessageNotificationBatch() == 0) {
            try {
              TimeUnit.MILLISECONDS.sleep(bizConfig.releaseMessageNotificationBatchIntervalInMilli());
            } catch (InterruptedException e) {
              //ignore
            }
          }
          logger.debug("Async notify {}", results.get(i));
          results.get(i).setResult(configNotification);
        }
      });
      return;
    }

    logger.debug("Notify {} clients for key {}", results.size(), content);
	// DeferredResultWrapper是DeferredResult的包装类
    for (DeferredResultWrapper result : results) {
    // 响应请求
      result.setResult(configNotification);
    }
    logger.debug("Notification completed");
  }
 private static final Function<String, String> retrieveNamespaceFromReleaseMessage =
      releaseMessage -> {
        if (Strings.isNullOrEmpty(releaseMessage)) {
          return null;
        }
        根据+号切割成list
        List<String> keys = STRING_SPLITTER.splitToList(releaseMessage);
        //message should be appId+cluster+namespace
        if (keys.size() != 3) {
          logger.error("message format invalid - {}", releaseMessage);
          return null;
        }
        return keys.get(2);
      };

4.客户端设计-Apollo客户端从配置中心拉取最新的配置、更新本地配置并通知到应用

官方描述:
在这里插入图片描述
上图简要描述了Apollo客户端的实现原理:

  • 1.客户端和服务端保持了一个长连接,从而能第一时间获得配置更新的推送。(通过Http Long Polling实现)(什么是long polling,参考:Long Polling长轮询详解,Long Polling长轮询实现进阶)
  • 2.客户端还会定时从Apollo配置中心服务端拉取应用的最新配置。
    • 这是一个fallback机制,为了防止推送机制失效导致配置不更新
    • 客户端定时拉取会上报本地版本,所以一般情况下,对于定时拉取的操作,服务端都会返回304 - Not Modified
    • 定时频率默认为每5分钟拉取一次,客户端也可以通过在运行时指定System Property: apollo.refreshInterval来覆盖,单位为分钟。
  • 3.客户端从Apollo配置中心服务端获取到应用的最新配置后,会保存在内存中
  • 4.客户端会把从服务端获取到的配置在本地文件系统缓存一份
    • 在遇到服务不可用,或网络不通的时候,依然能从本地恢复配置
  • 5.应用程序可以从Apollo客户端获取最新的配置、订阅配置更新通知

我们的应用获取配置更新有apollo服务端主动推送(3.4点),也有客户端定时轮询。为什么要有客户端定时轮询,个人理解:服务端主动推送可能因为网络中断或者其他原因而导致配置更新通知不到应用,这时客户端定时轮询就能保证应用一定能收到配置更新。

4.1 RemoteConfigRepository定时拉取配置

每个namespace对应一个RemoteConfigRepository。

  • 构造函数
/**
   * Constructor.
   *
   * @param namespace the namespace
   */
  public RemoteConfigRepository(String namespace) {
    m_namespace = namespace;
    m_configCache = new AtomicReference<>();
    m_configUtil = ApolloInjector.getInstance(ConfigUtil.class);
    m_httpClient = ApolloInjector.getInstance(HttpClient.class);
    m_serviceLocator = ApolloInjector.getInstance(ConfigServiceLocator.class);
    remoteConfigLongPollService = ApolloInjector.getInstance(RemoteConfigLongPollService.class);
    m_longPollServiceDto = new AtomicReference<>();
    m_remoteMessages = new AtomicReference<>();
    m_loadConfigRateLimiter = RateLimiter.create(m_configUtil.getLoadConfigQPS());
    m_configNeedForceRefresh = new AtomicBoolean(true);
    m_loadConfigFailSchedulePolicy = new ExponentialSchedulePolicy(m_configUtil.getOnErrorRetryInterval(),
        m_configUtil.getOnErrorRetryInterval() * 8);
    // 试着拉配置
    this.trySync();
    // 客户端定时拉配置
    this.schedulePeriodicRefresh();
    // 客户端发起长连接(通过Http Long Polling实现)
    this.scheduleLongPollingRefresh();
  }
// 客户端定时拉配置
private void schedulePeriodicRefresh() {
    logger.debug("Schedule periodic refresh with interval: {} {}",
        m_configUtil.getRefreshInterval(), m_configUtil.getRefreshIntervalTimeUnit());
        // 定时任务线程池
    m_executorService.scheduleAtFixedRate(
        new Runnable() {
          @Override
          public void run() {
            Tracer.logEvent("Apollo.ConfigService", String.format("periodicRefresh: %s", m_namespace));
            logger.debug("refresh config for namespace: {}", m_namespace);
            // 试着获取配置
            trySync();
            Tracer.logEvent("Apollo.Client.Version", Apollo.VERSION);
          }
        }, m_configUtil.getRefreshInterval(), m_configUtil.getRefreshInterval(),
        m_configUtil.getRefreshIntervalTimeUnit());
  }
// 客户端发起长连接(通过Http Long Polling实现)
 private void scheduleLongPollingRefresh() {
    remoteConfigLongPollService.submit(m_namespace, this);
  }
// 这个在抽象类AbstractConfigRepository中,sync()实现在RemoteConfigRepository,典型的模板方法设计模式
 protected boolean trySync() {
    try {
      sync();
      return true;
    } catch (Throwable ex) {
      Tracer.logEvent("ApolloConfigException", ExceptionUtil.getDetailMessage(ex));
      logger
          .warn("Sync config failed, will retry. Repository {}, reason: {}", this.getClass(), ExceptionUtil
              .getDetailMessage(ex));
    }
    return false;
  }
 @Override
  protected synchronized void sync() {
    Transaction transaction = Tracer.newTransaction("Apollo.ConfigService", "syncRemoteConfig");

    try {
      ApolloConfig previous = m_configCache.get();
      ApolloConfig current = loadApolloConfig();

      //reference equals means HTTP 304
      if (previous != current) {
        logger.debug("Remote Config refreshed!");
        m_configCache.set(current);
        // 通知监听器
        this.fireRepositoryChange(m_namespace, this.getConfig());
      }

      if (current != null) {
        Tracer.logEvent(String.format("Apollo.Client.Configs.%s", current.getNamespaceName()),
            current.getReleaseKey());
      }

      transaction.setStatus(Transaction.SUCCESS);
    } catch (Throwable ex) {
      transaction.setStatus(ex);
      throw ex;
    } finally {
      transaction.complete();
    }
  }

Apollo-配置更新原理与源码解析_第9张图片

 protected void fireRepositoryChange(String namespace, Properties newProperties) {
    for (RepositoryChangeListener listener : m_listeners) {
      try {
        listener.onRepositoryChange(namespace, newProperties);
      } catch (Throwable ex) {
        Tracer.logError(ex);
        logger.error("Failed to invoke repository change listener {}", listener.getClass(), ex);
      }
    }
  }

// LocalFileConfigRepository

 @Override
  public void onRepositoryChange(String namespace, Properties newProperties) {
    if (newProperties.equals(m_fileProperties)) {
      return;
    }
    Properties newFileProperties = propertiesFactory.getPropertiesInstance();
    newFileProperties.putAll(newProperties);
    // 配置保存到缓存文件夹
    updateFileProperties(newFileProperties, m_upstream.getSourceType());
    this.fireRepositoryChange(namespace, newProperties);
  }

Apollo-配置更新原理与源码解析_第10张图片
// YmlConfigFile的父类AbstractConfigFile

 @Override
  public synchronized void onRepositoryChange(String namespace, Properties newProperties) {
    if (newProperties.equals(m_configProperties.get())) {
      return;
    }
    Properties newConfigProperties = propertiesFactory.getPropertiesInstance();
    newConfigProperties.putAll(newProperties);

    String oldValue = getContent();

    update(newProperties);
    m_sourceType = m_configRepository.getSourceType();

    String newValue = getContent();

    PropertyChangeType changeType = PropertyChangeType.MODIFIED;

    if (oldValue == null) {
      changeType = PropertyChangeType.ADDED;
    } else if (newValue == null) {
      changeType = PropertyChangeType.DELETED;
    }
	// 触发
    this.fireConfigChange(new ConfigFileChangeEvent(m_namespace, oldValue, newValue, changeType));

    Tracer.logEvent("Apollo.Client.ConfigChanges", m_namespace);
  }

Apollo-配置更新原理与源码解析_第11张图片

private void fireConfigChange(final ConfigFileChangeEvent changeEvent) {
    for (final ConfigFileChangeListener listener : m_listeners) {
      m_executorService.submit(new Runnable() {
        @Override
        public void run() {
          String listenerName = listener.getClass().getName();
          Transaction transaction = Tracer.newTransaction("Apollo.ConfigFileChangeListener", listenerName);
          try {
            listener.onChange(changeEvent);
            transaction.setStatus(Transaction.SUCCESS);
          } catch (Throwable ex) {
            transaction.setStatus(ex);
            Tracer.logError(ex);
            logger.error("Failed to invoke config file change listener {}", listenerName, ex);
          } finally {
            transaction.complete();
          }
        }
      });
    }
  }

Apollo-配置更新原理与源码解析_第12张图片
// PropertiesCompatibleFileConfigRepository

  @Override
  public void onChange(ConfigFileChangeEvent changeEvent) {
    this.trySync();
  }
@Override
  protected synchronized void sync() {
    Properties current = configFile.asProperties();

    Preconditions.checkState(current != null, "PropertiesCompatibleConfigFile.asProperties should never return null");

    if (cachedProperties != current) {
      cachedProperties = current;
      this.fireRepositoryChange(configFile.getNamespace(), cachedProperties);
    }
  }
 protected void fireRepositoryChange(String namespace, Properties newProperties) {
    for (RepositoryChangeListener listener : m_listeners) {
      try {
        listener.onRepositoryChange(namespace, newProperties);
      } catch (Throwable ex) {
        Tracer.logError(ex);
        logger.error("Failed to invoke repository change listener {}", listener.getClass(), ex);
      }
    }
  }

Apollo-配置更新原理与源码解析_第13张图片
// DefaultConfig

@Override
  public synchronized void onRepositoryChange(String namespace, Properties newProperties) {
    if (newProperties.equals(m_configProperties.get())) {
      return;
    }

    ConfigSourceType sourceType = m_configRepository.getSourceType();
    Properties newConfigProperties = propertiesFactory.getPropertiesInstance();
    newConfigProperties.putAll(newProperties);

    Map<String, ConfigChange> actualChanges = updateAndCalcConfigChanges(newConfigProperties,
        sourceType);

    //check double checked result
    if (actualChanges.isEmpty()) {
      return;
    }

    this.fireConfigChange(m_namespace, actualChanges);

    Tracer.logEvent("Apollo.Client.ConfigChanges", m_namespace);
  }

Apollo-配置更新原理与源码解析_第14张图片

 /**
   * @param changes map's key is config property's key
   */
  protected void fireConfigChange(String namespace, Map<String, ConfigChange> changes) {
    final Set<String> changedKeys = changes.keySet();
    final List<ConfigChangeListener> listeners = this.findMatchedConfigChangeListeners(changedKeys);

    // notify those listeners
    for (ConfigChangeListener listener : listeners) {
      Set<String> interestedChangedKeys = resolveInterestedChangedKeys(listener, changedKeys);
      InterestedConfigChangeEvent interestedConfigChangeEvent = new InterestedConfigChangeEvent(
          namespace, changes, interestedChangedKeys);
      this.notifyAsync(listener, interestedConfigChangeEvent);
    }
  }

Apollo-配置更新原理与源码解析_第15张图片

// AutoUpdateConfigChangeListener

@Override
  public void onChange(ConfigChangeEvent changeEvent) {
    Set<String> keys = changeEvent.changedKeys();
    if (CollectionUtils.isEmpty(keys)) {
      return;
    }
    for (String key : keys) {
      // 1. check whether the changed key is relevant
      Collection<SpringValue> targetValues = springValueRegistry.get(beanFactory, key);
      if (targetValues == null || targetValues.isEmpty()) {
        continue;
      }

      // 2. update the value
      for (SpringValue val : targetValues) {
        updateSpringValue(val);
      }
    }
  }

Apollo-配置更新原理与源码解析_第16张图片

private void updateSpringValue(SpringValue springValue) {
    try {
      Object value = resolvePropertyValue(springValue);
      springValue.update(value);

      logger.info("Auto update apollo changed value successfully, new value: {}, {}", value,
          springValue);
    } catch (Throwable ex) {
      logger.error("Auto update apollo changed value failed, {}", springValue.toString(), ex);
    }
  }

Apollo-配置更新原理与源码解析_第17张图片
// SpringValue
Apollo-配置更新原理与源码解析_第18张图片
Apollo-配置更新原理与源码解析_第19张图片
对于通过@Value注入的变量,最终是通过发射改变值的。

原来如此!!!!!

4.2 RemoteConfigLongPollService发起长连接

客户端提交长连接。

public boolean submit(String namespace, RemoteConfigRepository remoteConfigRepository) {
    boolean added = m_longPollNamespaces.put(namespace, remoteConfigRepository);
    m_notifications.putIfAbsent(namespace, INIT_NOTIFICATION_ID);
    if (!m_longPollStarted.get()) {
      startLongPolling();
    }
    return added;
  }
private void startLongPolling() {
// m_longPollStarted是AtomicBoolean,这里限制只发起一个长连接
    if (!m_longPollStarted.compareAndSet(false, true)) {
      //already started
      return;
    }
    try {
      final String appId = m_configUtil.getAppId();
      final String cluster = m_configUtil.getCluster();
      final String dataCenter = m_configUtil.getDataCenter();
      final String secret = m_configUtil.getAccessKeySecret();
      final long longPollingInitialDelayInMills = m_configUtil.getLongPollingInitialDelayInMills();
      m_longPollingService.submit(new Runnable() {
        @Override
        public void run() {
          if (longPollingInitialDelayInMills > 0) {
            try {
              logger.debug("Long polling will start in {} ms.", longPollingInitialDelayInMills);
              TimeUnit.MILLISECONDS.sleep(longPollingInitialDelayInMills);
            } catch (InterruptedException e) {
              //ignore
            }
          }
          // 真正发起长连接的方法
          doLongPollingRefresh(appId, cluster, dataCenter, secret);
        }
      });
    } catch (Throwable ex) {
      m_longPollStarted.set(false);
      ApolloConfigException exception =
          new ApolloConfigException("Schedule long polling refresh failed", ex);
      Tracer.logError(exception);
      logger.warn(ExceptionUtil.getDetailMessage(exception));
    }
  }
private void doLongPollingRefresh(String appId, String cluster, String dataCenter, String secret) {
    final Random random = new Random();
    ServiceDTO lastServiceDto = null;
    // m_longPollingStopped标志为false,线程每被中断,就不停的执行
    while (!m_longPollingStopped.get() && !Thread.currentThread().isInterrupted()) {
	// 限流
      if (!m_longPollRateLimiter.tryAcquire(5, TimeUnit.SECONDS)) {
        //wait at most 5 seconds
        try {
          TimeUnit.SECONDS.sleep(5);
        } catch (InterruptedException e) {
        }
      }
      Transaction transaction = Tracer.newTransaction("Apollo.ConfigService", "pollNotification");
      String url = null;
      try {
      // 随机获得一个config service服务地址
        if (lastServiceDto == null) {
          List<ServiceDTO> configServices = getConfigServices();
          lastServiceDto = configServices.get(random.nextInt(configServices.size()));
        }
		// 组件访问的url,比如http://192.168.2.104:8080/notifications/v2?cluster=default&appId=freedom-code&ip=192.168.2.104¬ifications=%5B%7B%22namespaceName%22%3A%22application-dev.yml%22%2C%22notificationId%22%3A28%7D%5D,注意这里访问接口/notifications/v2
        url =
            assembleLongPollRefreshUrl(lastServiceDto.getHomepageUrl(), appId, cluster, dataCenter,
                m_notifications);

        logger.debug("Long polling from {}", url);

        HttpRequest request = new HttpRequest(url);
        // 这里LONG_POLLING_READ_TIMEOUT默认超时时间是90s,比服务端DeferredResultWrapper默认的60s长。
        request.setReadTimeout(LONG_POLLING_READ_TIMEOUT);
        if (!StringUtils.isBlank(secret)) {
          Map<String, String> headers = Signature.buildHttpHeaders(url, appId, secret);
          request.setHeaders(headers);
        }

        transaction.addData("Url", url);
		// 发起http请求,改请求将在服务端配置即时发布后返回response,或者服务端在60s没有配置发布返回response
        final HttpResponse<List<ApolloConfigNotification>> response =
            m_httpClient.doGet(request, m_responseType);

        logger.debug("Long polling response: {}, url: {}", response.getStatusCode(), url);
        // 如果返回成功,通知监听器
        if (response.getStatusCode() == 200 && response.getBody() != null) {
          updateNotifications(response.getBody());
          updateRemoteNotifications(response.getBody());
          transaction.addData("Result", response.getBody().toString());
          // 通知监听器
          notify(lastServiceDto, response.getBody());
        }

        //try to load balance
        if (response.getStatusCode() == 304 && random.nextBoolean()) {
          lastServiceDto = null;
        }

        m_longPollFailSchedulePolicyInSecond.success();
        transaction.addData("StatusCode", response.getStatusCode());
        transaction.setStatus(Transaction.SUCCESS);
      } catch (Throwable ex) {
        lastServiceDto = null;
        Tracer.logEvent("ApolloConfigException", ExceptionUtil.getDetailMessage(ex));
        transaction.setStatus(ex);
        long sleepTimeInSecond = m_longPollFailSchedulePolicyInSecond.fail();
        logger.warn(
            "Long polling failed, will retry in {} seconds. appId: {}, cluster: {}, namespaces: {}, long polling url: {}, reason: {}",
            sleepTimeInSecond, appId, cluster, assembleNamespaces(), url, ExceptionUtil.getDetailMessage(ex));
        try {
          TimeUnit.SECONDS.sleep(sleepTimeInSecond);
        } catch (InterruptedException ie) {
          //ignore
        }
      } finally {
        transaction.complete();
      }
    }
  }
 // 通知监听器
private void notify(ServiceDTO lastServiceDto, List<ApolloConfigNotification> notifications) {
    if (notifications == null || notifications.isEmpty()) {
      return;
    }
    for (ApolloConfigNotification notification : notifications) {
      String namespaceName = notification.getNamespaceName();
      //create a new list to avoid ConcurrentModificationException
      List<RemoteConfigRepository> toBeNotified =
          Lists.newArrayList(m_longPollNamespaces.get(namespaceName));
      ApolloNotificationMessages originalMessages = m_remoteNotificationMessages.get(namespaceName);
      ApolloNotificationMessages remoteMessages = originalMessages == null ? null : originalMessages.clone();
      //since .properties are filtered out by default, so we need to check if there is any listener for it
      toBeNotified.addAll(m_longPollNamespaces
          .get(String.format("%s.%s", namespaceName, ConfigFileFormat.Properties.getValue())));
      for (RemoteConfigRepository remoteConfigRepository : toBeNotified) {
        try {
        // 通知
          remoteConfigRepository.onLongPollNotified(lastServiceDto, remoteMessages);
        } catch (Throwable ex) {
          Tracer.logError(ex);
        }
      }
    }
  }

RemoteConfigRepository 的代码

 public void onLongPollNotified(ServiceDTO longPollNotifiedServiceDto, ApolloNotificationMessages remoteMessages) {
    m_longPollServiceDto.set(longPollNotifiedServiceDto);
    m_remoteMessages.set(remoteMessages);
    m_executorService.submit(new Runnable() {
      @Override
      public void run() {
        m_configNeedForceRefresh.set(true);
        // 试着获取配置
        trySync();
      }
    });
  }

参考资料

  • https://www.apolloconfig.com/#/zh/design/apollo-design

你可能感兴趣的:(Microservice,apollo源码解析)