Eureka Source Code Analysis: Replication Algorithm (Part 3)

I. Preface

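This article follows a eureka heartbeat through its complete replication flow in order to explain eureka's replication algorithm.
First, note that eureka differs from zookeeper: in CAP terms, eureka chooses A (availability) and P (partition tolerance), whereas zookeeper chooses C (consistency) and P (partition tolerance). That choice is why the two are implemented so differently. Because eureka does not have to guarantee consistency, it does not need a central node; all nodes are peers.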

[Figure: peer nodes]

II. Analysis

1. eureka-client reports a heartbeat to eureka-server

The request looks like this:

Hypertext Transfer Protocol
    PUT /eureka/v2/apps/SAMPLEREGISTERINGSERVICE/201709-07262?status=UP&lastDirtyTimestamp=1552742035025 HTTP/1.1\r\n
    DiscoveryIdentity-Name: DefaultClient\r\n
    DiscoveryIdentity-Version: 1.4\r\n
    DiscoveryIdentity-Id: 172.19.10.230\r\n
    Accept-Encoding: gzip\r\n
    Content-Length: 0\r\n
    Host: localhost:8080\r\n
    Connection: Keep-Alive\r\n
    User-Agent: Java-EurekaClient/v\r\n
    \r\n
    [Full request URI: http://localhost:8080/eureka/v2/apps/SAMPLEREGISTERINGSERVICE/201709-07262?status=UP&lastDirtyTimestamp=1552742035025]
    [HTTP request 1/1]
    [Response in frame: 7296]

With this request the client tells eureka-server that the application instance is still alive.
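To make the shape of the heartbeat request concrete, here is a minimal sketch that rebuilds the URI from the capture above. The class and method names are hypothetical and are not part of eureka; the base URL, application name, instance id and timestamp are simply the values from the capture.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Illustrative helper (hypothetical, not eureka code): rebuilds the heartbeat URI. */
final class HeartbeatUriSketch {

    /** Composes {base}apps/{appName}/{instanceId}?status=...&lastDirtyTimestamp=... */
    static URI heartbeatUri(String baseUrl, String appName, String instanceId,
                            String status, long lastDirtyTimestamp) {
        String path = "apps/" + encode(appName) + "/" + encode(instanceId);
        String query = "status=" + encode(status)
                + "&lastDirtyTimestamp=" + lastDirtyTimestamp;
        return URI.create(baseUrl + path + "?" + query);
    }

    // Simplification: URLEncoder does form encoding, which is good enough
    // for the plain ASCII identifiers used in this example.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        URI uri = heartbeatUri("http://localhost:8080/eureka/v2/",
                "SAMPLEREGISTERINGSERVICE", "201709-07262", "UP", 1552742035025L);
        // The heartbeat is a PUT with an empty body (Content-Length: 0).
        System.out.println("PUT " + uri);
    }
}
```

The request has no body; everything the server needs is in the path and the two query parameters.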
For details, see the previous article: Eureka Source Code Analysis: DiscoveryClient
Server-side entry point: InstanceResource.renewLease

/**
     * A put request for renewing lease from a client instance.
     *
     * @param isReplication
     *            a header parameter containing information whether this is
     *            replicated from other nodes.
     * @param overriddenStatus
     *            overridden status if any.
     * @param status
     *            the {@link InstanceStatus} of the instance.
     * @param lastDirtyTimestamp
     *            last timestamp when this instance information was updated.
     * @return response indicating whether the operation was a success or
     *         failure.
     */
    @PUT
    public Response renewLease(
            @HeaderParam(PeerEurekaNode.HEADER_REPLICATION) String isReplication,
            @QueryParam("overriddenstatus") String overriddenStatus,
            @QueryParam("status") String status,
            @QueryParam("lastDirtyTimestamp") String lastDirtyTimestamp) {
        // ... method body omitted in this excerpt ...
    }

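Through this method, eureka applies the information reported by the client to its in-memory registry. Conflicts between instance statuses also come into play here, but they are not discussed for now.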
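As a rough mental model of "applying the heartbeat to memory", the sketch below keeps a map from instance to the time of its last heartbeat. It is a deliberate simplification with hypothetical names, not eureka's actual registry classes, and it ignores status conflicts entirely.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy in-memory lease table (hypothetical; eureka's real registry is richer). */
final class LeaseTableSketch {

    // key: "APPNAME/instanceId", value: time of the last heartbeat in epoch millis
    private final Map<String, Long> leases = new ConcurrentHashMap<>();

    void register(String appName, String id, long nowMillis) {
        leases.put(appName + "/" + id, nowMillis);
    }

    /** Returns true if the lease existed and was refreshed, false if it is unknown. */
    boolean renew(String appName, String id, long nowMillis) {
        return leases.computeIfPresent(appName + "/" + id, (key, old) -> nowMillis) != null;
    }

    /** Returns the last heartbeat time, or null if the instance is unknown. */
    Long lastRenewalMillis(String appName, String id) {
        return leases.get(appName + "/" + id);
    }

    public static void main(String[] args) {
        LeaseTableSketch table = new LeaseTableSketch();
        System.out.println(table.renew("APP", "i-1", 1L));   // unknown instance
        table.register("APP", "i-1", 1L);
        System.out.println(table.renew("APP", "i-1", 5L));   // known instance
        System.out.println(table.lastRenewalMillis("APP", "i-1"));
    }
}
```

The boolean result matters: a renewal for an instance the server does not know about is reported as a failure instead of silently creating a lease.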

2. Incremental replication

[Figure: incremental replication]

eureka-server then replicates the instance information to the other eureka-server nodes.
Following the call chain leads to the method PeerAwareInstanceRegistryImpl.renew.

[Figure: call stack]

/*
     * (non-Javadoc)
     *
     * @see com.netflix.eureka.registry.InstanceRegistry#renew(java.lang.String,
     * java.lang.String, long, boolean)
     */
    public boolean renew(final String appName, final String id, final boolean isReplication) {
        if (super.renew(appName, id, isReplication)) {
            // Replicate the heartbeat to the other peer nodes
            replicateToPeers(Action.Heartbeat, appName, id, null, null, isReplication);
            return true;
        }
        return false;
    }

/**
     * Replicates all eureka actions to peer eureka nodes except for replication
     * traffic to this node.
     *
     */
    private void replicateToPeers(Action action, String appName, String id,
                                  InstanceInfo info /* optional */,
                                  InstanceStatus newStatus /* optional */, boolean isReplication) {
        Stopwatch tracer = action.getTimer().start();
        try {
            if (isReplication) {
                numberOfReplicationsLastMin.increment();
            }
            // If it is a replication already, do not replicate again as this will create a poison replication
            if (peerEurekaNodes == Collections.EMPTY_LIST || isReplication) {
                return;
            }

            for (final PeerEurekaNode node : peerEurekaNodes.getPeerEurekaNodes()) {
                // If the url represents this host, do not replicate to yourself.
                if (peerEurekaNodes.isThisMyUrl(node.getServiceUrl())) {
                    continue;
                }
                replicateInstanceActionsToPeers(action, appName, id, info, newStatus, node);
            }
        } finally {
            tracer.stop();
        }
    }

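On the eureka-server that first receives the heartbeat, isReplication is false, so that server forwards the instance information to the other eureka-server nodes.
When the information reaches the next node, isReplication is already true, which marks it as having been replicated from another eureka-server node, so that node does not forward it any further. The main purpose is to avoid an endless replication loop.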
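The effect of the isReplication flag can be seen in a small simulation. This is a toy model with hypothetical class names, in which peers call each other directly instead of over HTTP; it only mirrors the two checks in replicateToPeers (stop if the call is already a replication, and skip yourself).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of one-hop replication (hypothetical; not eureka's classes). */
final class ReplicationFlagSketch {

    static final class Node {
        final String name;
        final List<Node> peers = new ArrayList<>();
        int heartbeatsHandled = 0;

        Node(String name) {
            this.name = name;
        }

        void renew(String instanceId, boolean isReplication) {
            heartbeatsHandled++;
            if (isReplication) {
                return; // already a replicated call: do not forward again
            }
            for (Node peer : peers) {
                if (peer == this) {
                    continue; // do not replicate to yourself
                }
                peer.renew(instanceId, true);
            }
        }
    }

    /** Builds nodes that all share the same peer list, including themselves. */
    static List<Node> fullyConnected(String... names) {
        List<Node> nodes = new ArrayList<>();
        for (String name : names) {
            nodes.add(new Node(name));
        }
        for (Node node : nodes) {
            node.peers.addAll(nodes);
        }
        return nodes;
    }

    public static void main(String[] args) {
        List<Node> nodes = fullyConnected("a", "b", "c");
        nodes.get(0).renew("i-1", false); // the client heartbeat arrives at node a
        for (Node node : nodes) {
            System.out.println(node.name + " handled " + node.heartbeatsHandled);
        }
    }
}
```

Every node handles the heartbeat exactly once. If peers forwarded replicated calls as well, the calls in this model would bounce between nodes without ever terminating.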
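What happens if replication fails partway (a node restart? a network partition)? Besides incremental replication, eureka-server also periodically performs a full replication between nodes (once every 30s) to keep the data consistent.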

3. Full replication

For the source code, see the earlier analysis:
Eureka Source Code Analysis: Replication Algorithm (Part 1)
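To illustrate why a full pass repairs what incremental replication missed, here is a toy reconciliation between two in-memory tables. It is only a sketch: the class name is hypothetical, and the "newer timestamp wins" rule is an assumption of this example, not a statement about how eureka resolves conflicts.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy full-sync step (hypothetical; merge rule is an assumption of this sketch). */
final class FullSyncSketch {

    /**
     * Copies into target every entry that is missing there or older than in source.
     * Returns the number of entries that were added or updated.
     */
    static int fullSync(Map<String, Long> source, Map<String, Long> target) {
        int changed = 0;
        for (Map.Entry<String, Long> entry : source.entrySet()) {
            Long current = target.get(entry.getKey());
            if (current == null || current < entry.getValue()) {
                target.put(entry.getKey(), entry.getValue());
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<String, Long> nodeA = new HashMap<>();
        nodeA.put("APP/i-1", 100L);
        nodeA.put("APP/i-2", 200L);

        // nodeB was unreachable when i-2 was replicated incrementally.
        Map<String, Long> nodeB = new HashMap<>();
        nodeB.put("APP/i-1", 100L);

        System.out.println("repaired entries: " + fullSync(nodeA, nodeB));
        System.out.println("tables equal: " + nodeA.equals(nodeB));
    }
}
```

Running the sync a second time changes nothing, which is what makes it safe to repeat on a timer.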


III. Summary

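eureka-server's replication algorithm relies on incremental replication plus full replication. Unlike zookeeper, it has no notion of a leader; all nodes are equal, so data consistency is not guaranteed.
How does eureka-server resolve conflicting instance status? I plan to cover that in a separate article.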