[Troubleshooting Notes] A Dubbo Generic Invocation Pitfall: Runaway Growth of ZooKeeper Ephemeral Nodes

Symptom

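    Developers who use Dubbo will be familiar with its generic invocation. We use it in a scheduled-task management system to call Dubbo interfaces generically. On one occasion, a colleague configured a task in the test environment to run every 10 seconds, but the configured interface had no provider. Before long, ZooKeeper held more than ten thousand consumer nodes.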

     The official Dubbo generic-invocation example looks like this: [Figure 1: official Dubbo generic-invocation example]

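When ReferenceConfig is asked for the "service reference", it first returns the one that has already been instantiated; if none exists, it calls init to create it. By default, instantiation checks whether a provider exists and throws an exception if none does, so instantiation fails (by which point a consumer node has already been created on ZooKeeper). The next attempt to get the "service reference" through ReferenceConfig fails in the same way and creates another consumer node; because the node path includes a timestamp, every attempt creates a new node. Repeated over and over, this creates zk nodes without bound. Here is the ReferenceConfig source for getting the "service reference" (I have added comments at the key points):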

public class ReferenceConfig<T> extends AbstractReferenceConfig {
    /**
     * The interface proxy reference
     */
    private transient volatile T ref;

    ...
    
    public synchronized T get() {
        checkAndUpdateSubConfigs();

        if (destroyed) {
            throw new IllegalStateException("The invoker of ReferenceConfig(" + url + ") has already destroyed!");
        }

        // Author's note: has the "service reference" been instantiated? If not, call init to instantiate it
        if (ref == null) {
            init();
        }
        return ref;
    }

    // Author's note: initializes the "service reference"
    private void init() {
        if (initialized) {
            return;
        }
        checkStubAndLocal(interfaceClass);
        checkMock(interfaceClass);
        Map<String, String> map = new HashMap<String, String>();

        map.put(SIDE_KEY, CONSUMER_SIDE);

        // Author's note: the "timestamp" parameter is appended here
        appendRuntimeParameters(map);
        if (!isGeneric()) {
            String revision = Version.getVersion(interfaceClass, version);
            if (revision != null && revision.length() > 0) {
                map.put(REVISION_KEY, revision);
            }

            String[] methods = Wrapper.getWrapper(interfaceClass).getMethodNames();
            if (methods.length == 0) {
                logger.warn("No method found in service interface " + interfaceClass.getName());
                map.put(METHODS_KEY, ANY_VALUE);
            } else {
                map.put(METHODS_KEY, StringUtils.join(new HashSet<String>(Arrays.asList(methods)), COMMA_SEPARATOR));
            }
        }
        map.put(INTERFACE_KEY, interfaceName);
        appendParameters(map, metrics);
        appendParameters(map, application);
        appendParameters(map, module);
        // remove 'default.' prefix for configs from ConsumerConfig
        // appendParameters(map, consumer, Constants.DEFAULT_KEY);
        appendParameters(map, consumer);
        appendParameters(map, this);
        Map<String, Object> attributes = null;
        if (CollectionUtils.isNotEmpty(methods)) {
            attributes = new HashMap<String, Object>();
            for (MethodConfig methodConfig : methods) {
                appendParameters(map, methodConfig, methodConfig.getName());
                String retryKey = methodConfig.getName() + ".retry";
                if (map.containsKey(retryKey)) {
                    String retryValue = map.remove(retryKey);
                    if ("false".equals(retryValue)) {
                        map.put(methodConfig.getName() + ".retries", "0");
                    }
                }
                attributes.put(methodConfig.getName(), convertMethodConfig2AyncInfo(methodConfig));
            }
        }

        String hostToRegistry = ConfigUtils.getSystemProperty(DUBBO_IP_TO_REGISTRY);
        if (StringUtils.isEmpty(hostToRegistry)) {
            hostToRegistry = NetUtils.getLocalHost();
        } else if (isInvalidLocalHost(hostToRegistry)) {
            throw new IllegalArgumentException("Specified invalid registry ip from property:" + DUBBO_IP_TO_REGISTRY + ", value:" + hostToRegistry);
        }
        map.put(REGISTER_IP_KEY, hostToRegistry);
        
        // Author's note: if createProxy throws, ref is never assigned and stays null
        ref = createProxy(map);

        String serviceKey = URL.buildKey(interfaceName, group, version);
        ApplicationModel.initConsumerModel(serviceKey, buildConsumerModel(serviceKey, attributes));
        initialized = true;
    }

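    // Author's note: creates the "service reference" instance:
    // (1) first creates a consumer node on zk
    // (2) checks whether the service is available (if checking is configured); throws if it is not
    // (3) creates a proxy object
    private T createProxy(Map<String, String> map) {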
        if (shouldJvmRefer(map)) {
            URL url = new URL(LOCAL_PROTOCOL, LOCALHOST_VALUE, 0, interfaceClass.getName()).addParameters(map);
            invoker = REF_PROTOCOL.refer(interfaceClass, url);
            if (logger.isInfoEnabled()) {
                logger.info("Using injvm service " + interfaceClass.getName());
            }
        } else {
            urls.clear(); // reference retry init will add url to urls, lead to OOM
            if (url != null && url.length() > 0) { // user specified URL, could be peer-to-peer address, or register center's address.
                String[] us = SEMICOLON_SPLIT_PATTERN.split(url);
                if (us != null && us.length > 0) {
                    for (String u : us) {
                        URL url = URL.valueOf(u);
                        if (StringUtils.isEmpty(url.getPath())) {
                            url = url.setPath(interfaceName);
                        }
                        if (REGISTRY_PROTOCOL.equals(url.getProtocol())) {
                            urls.add(url.addParameterAndEncoded(REFER_KEY, StringUtils.toQueryString(map)));
                        } else {
                            urls.add(ClusterUtils.mergeUrl(url, map));
                        }
                    }
                }
            } else { // assemble URL from register center's configuration
                // if protocols not injvm checkRegistry
                if (!LOCAL_PROTOCOL.equalsIgnoreCase(getProtocol())){
                    checkRegistry();
                    List<URL> us = loadRegistries(false);
                    if (CollectionUtils.isNotEmpty(us)) {
                        for (URL u : us) {
                            URL monitorUrl = loadMonitor(u);
                            if (monitorUrl != null) {
                                map.put(MONITOR_KEY, URL.encode(monitorUrl.toFullString()));
                            }
                            urls.add(u.addParameterAndEncoded(REFER_KEY, StringUtils.toQueryString(map)));
                        }
                    }
                    if (urls.isEmpty()) {
                        throw new IllegalStateException("No such any registry to reference " + interfaceName + " on the consumer " + NetUtils.getLocalHost() + " use dubbo version " + Version.getVersion() + ", please config <dubbo:registry address=\"...\" /> to your spring config.");
                    }
                }
            }

            // Author's note: REF_PROTOCOL is an SPI extension; with a zookeeper registry, the call ends up in RegistryProtocol.refer.
            // RegistryProtocol creates the consumer node, and the node's path includes the current timestamp
            if (urls.size() == 1) {
                invoker = REF_PROTOCOL.refer(interfaceClass, urls.get(0));
            } else {
                List<Invoker<?>> invokers = new ArrayList<Invoker<?>>();
                URL registryURL = null;
                for (URL url : urls) {
                    invokers.add(REF_PROTOCOL.refer(interfaceClass, url));
                    if (REGISTRY_PROTOCOL.equals(url.getProtocol())) {
                        registryURL = url; // use last registry url
                    }
                }
                if (registryURL != null) { // registry url is available
                    // use RegistryAwareCluster only when register's CLUSTER is available
                    URL u = registryURL.addParameter(CLUSTER_KEY, RegistryAwareCluster.NAME);
                    // The invoker wrap relation would be: RegistryAwareClusterInvoker(StaticDirectory) -> FailoverClusterInvoker(RegistryDirectory, will execute route) -> Invoker
                    invoker = CLUSTER.join(new StaticDirectory(u, invokers));
                } else { // not a registry url, must be direct invoke.
                    invoker = CLUSTER.join(new StaticDirectory(invokers));
                }
            }
        }
        
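        // Author's note: if the provider check is enabled, verify that the service is available. If it is not, throw immediately;
        // ref is never assigned, yet the consumer node has already been created on zk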
        if (shouldCheck() && !invoker.isAvailable()) {
            throw new IllegalStateException("Failed to check the status of the service " + interfaceName + ". No provider available for the service " + (group == null ? "" : group + "/") + interfaceName + (version == null ? "" : ":" + version) + " from the url " + invoker.getUrl() + " to the consumer " + NetUtils.getLocalHost() + " use dubbo version " + Version.getVersion());
        }
        if (logger.isInfoEnabled()) {
            logger.info("Refer dubbo service " + interfaceClass.getName() + " from url " + invoker.getUrl());
        }
        /**
         * @since 2.7.0
         * ServiceData Store
         */
        MetadataReportService metadataReportService = null;
        if ((metadataReportService = getMetadataReportService()) != null) {
            URL consumerURL = new URL(CONSUMER_PROTOCOL, map.remove(REGISTER_IP_KEY), 0, map.get(INTERFACE_KEY), map);
            metadataReportService.publishConsumer(consumerURL);
        }
        // create service proxy
        return (T) PROXY_FACTORY.getProxy(invoker);
    }
...
}
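
To make the loop in the source above concrete, here is a minimal simulation that runs on the JDK alone. It is a sketch, not Dubbo code: MiniReferenceConfig, FakeRegistry and the injected clock are hypothetical stand-ins that only mimic the order of operations described above (register the consumer node, then check for a provider, then assign ref).

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the registry: each distinct URL string is one consumer node. */
class FakeRegistry {
    final Set<String> consumerNodes = new LinkedHashSet<>();
    boolean providerAvailable = false; // the scenario in this article: no provider
}

/** Hypothetical miniature of ReferenceConfig; it copies only the ordering of the steps. */
class MiniReferenceConfig {
    private final FakeRegistry registry;
    private final LongSupplier clock; // injected so the timestamps are deterministic
    private final boolean check;
    private Object ref;               // stays null whenever createProxy throws

    MiniReferenceConfig(FakeRegistry registry, LongSupplier clock, boolean check) {
        this.registry = registry;
        this.clock = clock;
        this.check = check;
    }

    synchronized Object get() {
        if (ref == null) {
            ref = createProxy(); // not assigned if createProxy throws
        }
        return ref;
    }

    private Object createProxy() {
        // (1) register the consumer node; its path carries the current timestamp
        registry.consumerNodes.add("consumer://host/TestService?timestamp=" + clock.getAsLong());
        // (2) the availability check runs only after the node exists
        if (check && !registry.providerAvailable) {
            throw new IllegalStateException("No provider available");
        }
        // (3) create the proxy object
        return new Object();
    }
}

class LeakDemo {
    /** Calls get() the given number of times and returns how many consumer nodes exist afterwards. */
    static int nodesAfter(int attempts, boolean check) {
        FakeRegistry registry = new FakeRegistry();
        long[] tick = {0};
        MiniReferenceConfig config = new MiniReferenceConfig(registry, () -> ++tick[0], check);
        for (int i = 0; i < attempts; i++) {
            try {
                config.get();
            } catch (IllegalStateException e) {
                // a scheduled task would log this and try again on its next run
            }
        }
        return registry.consumerNodes.size();
    }

    public static void main(String[] args) {
        System.out.println("check=true:  " + nodesAfter(6, true));
        System.out.println("check=false: " + nodesAfter(6, false));
    }
}
```

With the check enabled, every failed get() leaves one more node behind. With the check disabled, the first call succeeds, ref is cached, and later calls register nothing.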

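Below are the zk nodes after six generic invocations of the service "com.test.dubbogeneric.TestService" (which has no provider). Six consumer nodes were created, and their paths are identical except for the timestamp.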

They are easier to read after URL decoding (the IP has been replaced with xx.xx.xx.xx):

[
    consumer://xx.xx.xx.xx/org.apache.dubbo.rpc.service.GenericService?application=test&category=consumers&check=false&dubbo=2.0.2&generic=true&interface=com.test.dubbogeneric.TestService&lazy=false&pid=7905&release=2.7.3&retries=0&side=consumer&sticky=false&timeout=3000&timestamp=1592036013311,
    consumer://xx.xx.xx.xx/org.apache.dubbo.rpc.service.GenericService?application=test&category=consumers&check=false&dubbo=2.0.2&generic=true&interface=com.test.dubbogeneric.TestService&lazy=false&pid=7905&release=2.7.3&retries=0&side=consumer&sticky=false&timeout=3000&timestamp=1592036035633,
    consumer://xx.xx.xx.xx/org.apache.dubbo.rpc.service.GenericService?application=test&category=consumers&check=false&dubbo=2.0.2&generic=true&interface=com.test.dubbogeneric.TestService&lazy=false&pid=7905&release=2.7.3&retries=0&side=consumer&sticky=false&timeout=3000&timestamp=1592036036773,
    consumer://xx.xx.xx.xx/org.apache.dubbo.rpc.service.GenericService?application=test&category=consumers&check=false&dubbo=2.0.2&generic=true&interface=com.test.dubbogeneric.TestService&lazy=false&pid=7905&release=2.7.3&retries=0&side=consumer&sticky=false&timeout=3000&timestamp=1592036037948,
    consumer://xx.xx.xx.xx/org.apache.dubbo.rpc.service.GenericService?application=test&category=consumers&check=false&dubbo=2.0.2&generic=true&interface=com.test.dubbogeneric.TestService&lazy=false&pid=7905&release=2.7.3&retries=0&side=consumer&sticky=false&timeout=3000&timestamp=1592036039105,
    consumer://xx.xx.xx.xx/org.apache.dubbo.rpc.service.GenericService?application=test&category=consumers&check=false&dubbo=2.0.2&generic=true&interface=com.test.dubbogeneric.TestService&lazy=false&pid=7905&release=2.7.3&retries=0&side=consumer&sticky=false&timeout=3000&timestamp=1592036040226]
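
The decoding step can be reproduced with the JDK alone. The sketch below assumes that a node name is the consumer URL percent-encoded as UTF-8; the shortened sample string and the helper names decode and timestampOf are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class ConsumerNodeNames {
    /** Percent-decodes a node name into a readable consumer URL. */
    static String decode(String nodeName) {
        try {
            return URLDecoder.decode(nodeName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available on the JVM
        }
    }

    /** Returns the value of the "timestamp" query parameter, or -1 if it is absent. */
    static long timestampOf(String decodedUrl) {
        int q = decodedUrl.indexOf('?');
        if (q < 0) {
            return -1;
        }
        for (String pair : decodedUrl.substring(q + 1).split("&")) {
            if (pair.startsWith("timestamp=")) {
                return Long.parseLong(pair.substring("timestamp=".length()));
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Shortened, made-up sample in the same shape as the nodes listed above
        String encoded = "consumer%3A%2F%2Fxx.xx.xx.xx%2Forg.apache.dubbo.rpc.service.GenericService"
                + "%3Fgeneric%3Dtrue%26timestamp%3D1592036013311";
        String decoded = decode(encoded);
        System.out.println(decoded);
        System.out.println(timestampOf(decoded));
    }
}
```

Comparing the decoded URLs with the timestamp parameter removed is a quick way to confirm that a batch of nodes all come from the same reference.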

How to fix it

There are three possible fixes:

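  • 1. [Recommended] Upgrade Dubbo to 2.7.7 or later. From 2.7.7 on, when the service is found to be unavailable, a destroy call removes the consumer node that was created earlier: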
            if (shouldCheck() && !invoker.isAvailable()) {
                // Author's note: the destroy call added in 2.7.7
                invoker.destroy();
                throw new IllegalStateException("Failed to check the status of the service "
                        + interfaceName
                        + ". No provider available for the service "
                        + (group == null ? "" : group + "/")
                        + interfaceName +
                        (version == null ? "" : ":" + version)
                        + " from the url "
                        + invoker.getUrl()
                        + " to the consumer "
                        + NetUtils.getLocalHost() + " use dubbo version " +    Version.getVersion());
            }

     

  • 2. [Recommended] Set check=false when creating the ReferenceConfig for generic invocation, i.e.
    reference.setCheck(false);

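    With check set to false, init does not verify that the service is available, so no exception is thrown and ref is not left null. The 2nd, 3rd, ... nth requests for the "service reference" all return the ref from the first call, so no further zk nodes are created.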

  • 3. Catch the exception, obtain the invoker via Java reflection, and call the invoker's destroy to delete the zk node.
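
Option 3 can be sketched as follows. So that the example runs without Dubbo on the classpath, StubReferenceConfig and StubInvoker are hypothetical stand-ins. The only things carried over from the source quoted earlier are assumptions: that the config object keeps its invoker in a private field named invoker, and that the invoker exposes a destroy() method.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

/** Hypothetical stand-in for a Dubbo invoker. */
class StubInvoker {
    boolean destroyed = false;

    public void destroy() {
        destroyed = true; // in Dubbo this is where the consumer node would be removed
    }
}

/** Hypothetical stand-in for ReferenceConfig: sets the invoker, then fails the provider check. */
class StubReferenceConfig {
    private StubInvoker invoker;

    Object get() {
        invoker = new StubInvoker(); // assigned before the check, as in createProxy
        throw new IllegalStateException("No provider available");
    }
}

class ReflectiveCleanup {
    /**
     * Reads the private field "invoker" from the given config object and calls
     * destroy() on it. Returns the invoker, or null if the field was not set.
     */
    static Object destroyInvoker(Object config) {
        try {
            Field field = config.getClass().getDeclaredField("invoker");
            field.setAccessible(true);
            Object invoker = field.get(config);
            if (invoker != null) {
                Method destroy = invoker.getClass().getMethod("destroy");
                destroy.setAccessible(true); // the implementing class may not be public
                destroy.invoke(invoker);
            }
            return invoker;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not destroy invoker reflectively", e);
        }
    }

    public static void main(String[] args) {
        StubReferenceConfig config = new StubReferenceConfig();
        try {
            config.get();
        } catch (IllegalStateException e) {
            // clean up the half-initialized reference before the next retry
            StubInvoker invoker = (StubInvoker) destroyInvoker(config);
            System.out.println("destroyed: " + invoker.destroyed);
        }
    }
}
```

Reflection ties the cleanup to a private field name that may change between releases, which is one reason options 1 and 2 are the recommended ones.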
