ES5.6.4源码解析--批量索引bulk

#引言
ES的批量索引操作,可以把多条索引请求合成一次请求,每个请求可以指定不同的索引。当往ES中索引大量数据的时候,使用批量索引能够大大增加索引的数据。接下来让我们通过阅读批量索引的源码来揭开其神秘的面纱。
#索引请求的预处理
批量索引的入口位于TransportBulkAction#protected void doExecute(Task task, BulkRequest bulkRequest, ActionListener listener)

// 检查请求中是否含有 包含管道的所有请求(管道是5.0新增的,用于在索引前处理数据)
        if (bulkRequest.hasIndexRequestsWithPipelines()) {
            if (clusterService.localNode().isIngestNode()) { //1
                // 根据管道对索引请求做处理,处理后继续走批量索引
                processBulkIndexIngestRequest(task, bulkRequest, listener);//2
            } else {
                // 本节点不是Ingest节点,将请求发给下一个节点
                ingestForwarder.forwardIngestRequest(BulkAction.INSTANCE, bulkRequest, listener);
            }
            return;
        }

ES 5.0开始增加了IngestNode的概念,Ingest节点的作用是在实际索引之前,对文档进行预处理。它对索引请求或者批量索引请求进行拦截,对文档进行预处理,再将处理完毕的文档放回请求中,最后根据改变后的请求索引。
代码1: 是判断当前节点是否为ingest节点(可在elasticsearch.yml文件中配置node.ingest: false 来设置节点是否为ingest节点,默认为true)。
代码2 :如果当前节点是ingest节点,就根据定义好的管道对索引请求进行拦截处理,处理完后继续走批量索引。
代码3:如果当前节点不是ingest节点,将请求发给下一个ingest节点。
接下来进入processBulkIndexIngestRequest阅读

void processBulkIndexIngestRequest(Task task, BulkRequest original, ActionListener listener) {
        long ingestStartTimeInNanos = System.nanoTime();
        BulkRequestModifier bulkRequestModifier = new BulkRequestModifier(original);
        //1
        ingestService.getPipelineExecutionService().executeBulkRequest(() -> bulkRequestModifier, (indexRequest, exception) -> {
            logger.debug((Supplier) () -> new ParameterizedMessage("failed to execute pipeline [{}] for document [{}/{}/{}]",
                indexRequest.getPipeline(), indexRequest.index(), indexRequest.type(), indexRequest.id()), exception);
            bulkRequestModifier.markCurrentItemAsFailed(exception);
        }, (exception) -> {
            if (exception != null) {
                logger.error("failed to execute pipeline for a bulk request", exception);
                listener.onFailure(exception);
            } else {
                long ingestTookInMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - ingestStartTimeInNanos);
                BulkRequest bulkRequest = bulkRequestModifier.getBulkRequest();
                ActionListener actionListener = bulkRequestModifier.wrapActionListenerIfNeeded(ingestTookInMillis, listener);
                if (bulkRequest.requests().isEmpty()) {
                    // at this stage, the transport bulk action can't deal with a bulk request with no requests,
                    // so we stop and send an empty response back to the client.
                    // (this will happen if pre-processing all items in the bulk failed)
                    actionListener.onResponse(new BulkResponse(new BulkItemResponse[0], 0));
                } else {
                    doExecute(task, bulkRequest, actionListener);
                }
            }
        });
    }

这个方法主要就是executeBulkRequest,用于预处理文档,参数包含:批量索引请求,预处理失败的回调函数,预处理成功的回调函数。接着看executeBulkRequest

public void executeBulkRequest(Iterable actionRequests,
                                   BiConsumer itemFailureHandler,
                                   Consumer completionHandler) {
        threadPool.executor(ThreadPool.Names.BULK).execute(new AbstractRunnable() {

            @Override
            public void onFailure(Exception e) {
                completionHandler.accept(e);
            }

            @Override
            protected void doRun() throws Exception {
                for (DocWriteRequest actionRequest : actionRequests) {
                    if ((actionRequest instanceof IndexRequest)) {
                        IndexRequest indexRequest = (IndexRequest) actionRequest;
                        if (Strings.hasText(indexRequest.getPipeline())) {
                            try {
                                innerExecute(indexRequest, getPipeline(indexRequest.getPipeline()));//1
                                //this shouldn't be needed here but we do it for consistency with index api
                                // which requires it to prevent double execution
                                indexRequest.setPipeline(null);//2
                            } catch (Exception e) {
                                itemFailureHandler.accept(indexRequest, e);
                            }
                        }
                    }
                }
                completionHandler.accept(null);
            }
        });
    }

代码1:进行预处理需要指定具体的处理过程,而具体的处理过程就是pipeline定义的,getPipeline(indexRequest.getPipeline())用于获取pipeline。以下是定义pipeline的demo:

PUT _ingest/pipeline/my-pipeline-id
{
  "description" : "describe pipeline",
  "processors" : [
    {
      "set" : {
        "field": "foo",
        "value": "bar"
      }
    }
  ]
}

获取到pipeline作为参数传入innerExecute方法,开始预处理数据。

代码2:预处理完成后,需要给请求的pipeline置空,这样做的目的是为了不重复执行预处理:成功回调函数会再次调用doExecute,执行bulkRequest.hasIndexRequestsWithPipelines() 来查看是否有pipeline,这时候发现pipeline=null就不会再次执行预处理操作了。

当前节点的预处理到此结束,如果当前节点不是ingest节点,就转发给ingest节点

public void forwardIngestRequest(Action action, ActionRequest request, ActionListener listener) {
        transportService.sendRequest(randomIngestNode(), action.name(), request,
            new ActionListenerResponseHandler(listener, action::newResponse));
    }

通过randomIngestNode 来选择请求转发的目标节点

private DiscoveryNode randomIngestNode() {
        final DiscoveryNode[] nodes = ingestNodes;//1
        if (nodes.length == 0) {
            throw new IllegalStateException("There are no ingest nodes in this cluster, unable to forward request to an ingest node.");
        }

        return nodes[Math.floorMod(ingestNodeGenerator.incrementAndGet(), nodes.length)];//2
    }

代码1:已经维护了一个ingest节点列表,如果这个列表为空就会抛异常
代码2:通过取模的方式来选择节点

#创建不存在的索引

if (needToCheck()) {//1
            // Attempt to create all the indices that we're going to need during the bulk before we start.
            // Step 1: collect all the indices in the request
            final Set indices = bulkRequest.requests.stream()//2
                .map(DocWriteRequest::index)
                .collect(Collectors.toSet());
            /* Step 2: filter that to indices that don't exist and we can create. At the same time build a map of indices we can't create
             * that we'll use when we try to run the requests. */
            final Map indicesThatCannotBeCreated = new HashMap<>();//3
            Set autoCreateIndices = new HashSet<>();
            ClusterState state = clusterService.state();
            for (String index : indices) {
                boolean shouldAutoCreate;
                try {
                    // 根据配置的action.auto_create_index
                    // index.mapper.dynamic  判断是否要自动创建索引
                    shouldAutoCreate = shouldAutoCreate(index, state);
                } catch (IndexNotFoundException e) {
                    shouldAutoCreate = false;
                    indicesThatCannotBeCreated.put(index, e);
                }
                if (shouldAutoCreate) {
                    autoCreateIndices.add(index);
                }
            }
            // Step 3: create all the indices that are missing, if there are any missing. start the bulk after all the creates come back.
            if (autoCreateIndices.isEmpty()) {
                executeBulk(task, bulkRequest, startTime, listener, responses, indicesThatCannotBeCreated);
            } else {
                final AtomicInteger counter = new AtomicInteger(autoCreateIndices.size());
                for (String index : autoCreateIndices) {
                //4
                    createIndex(index, bulkRequest.timeout(), new ActionListener() {
                        @Override
                        public void onResponse(CreateIndexResponse result) {
                            if (counter.decrementAndGet() == 0) {
                                executeBulk(task, bulkRequest, startTime, listener, responses, indicesThatCannotBeCreated);
                            }
                        }

                        @Override
                        public void onFailure(Exception e) {
                            if (!(ExceptionsHelper.unwrapCause(e) instanceof ResourceAlreadyExistsException)) {
                                // fail all requests involving this index, if create didn't work
                                for (int i = 0; i < bulkRequest.requests.size(); i++) {
                                    DocWriteRequest request = bulkRequest.requests.get(i);
                                    if (request != null && setResponseFailureIfIndexMatches(responses, i, request, index, e)) {
                                        bulkRequest.requests.set(i, null);
                                    }
                                }
                            }
                            if (counter.decrementAndGet() == 0) {
                                executeBulk(task, bulkRequest, startTime, ActionListener.wrap(listener::onResponse, inner -> {
                                    inner.addSuppressed(e);
                                    listener.onFailure(inner);
                                }), responses, indicesThatCannotBeCreated);
                            }
                        }
                    });
                }
            }
        }

代码1:首先要通过needToCheck 方法确定ES是否支持自动创建索引(可在配置文件中设置action.auto_create_index参数配置),如果不支持就不去创建索引,如果支持进入进一步的判断

代码2:获取请求中包含的所有索引

代码3:过滤出需要创建索引,且允许创建索引的索引:过滤条件主要看action.auto_create_index(是否允许自动创建索引),和index.mapper.dynamic(是否允许自动创建映射) 两个配置。只有即可以自动创建索引,又可以自动创建映射,且索引不存在,才能够创建索引。

代码4:开始创建索引

接下来就是调用executeBulk去执行批量索引操作。

void executeBulk(Task task, final BulkRequest bulkRequest, final long startTimeNanos, final ActionListener listener,
            final AtomicArray responses, Map indicesThatCannotBeCreated) {
        new BulkOperation(task, bulkRequest, listener, responses, startTimeNanos, indicesThatCannotBeCreated).run();
    }

创建BulkOperation对象实例,并且运行其dorun方法

@Override
        protected void doRun() throws Exception {
            final ClusterState clusterState = observer.setAndGetObservedState();
            if (handleBlockExceptions(clusterState)) {
                return;
            }
            final ConcreteIndices concreteIndices = new ConcreteIndices(clusterState, indexNameExpressionResolver);
            MetaData metaData = clusterState.metaData();
            for (int i = 0; i < bulkRequest.requests.size(); i++) {
                DocWriteRequest docWriteRequest = bulkRequest.requests.get(i);
                //the request can only be null because we set it to null in the previous step, so it gets ignored
                if (docWriteRequest == null) {
                    continue;
                }
                // 排除索引关闭或者索引不存在的
                if (addFailureIfIndexIsUnavailable(docWriteRequest, i, concreteIndices, metaData)) {
                    continue;
                }
                Index concreteIndex = concreteIndices.resolveIfAbsent(docWriteRequest);
                try {
                    switch (docWriteRequest.opType()) {
                        case CREATE:
                        case INDEX:
                            IndexRequest indexRequest = (IndexRequest) docWriteRequest;
                            MappingMetaData mappingMd = null;
                            // 获取索引元数据
                            final IndexMetaData indexMetaData = metaData.index(concreteIndex);
                            if (indexMetaData != null) {
                                // 根据type获取映射
                                mappingMd = indexMetaData.mappingOrDefault(indexRequest.type());
                            }
                            indexRequest.resolveRouting(metaData);
                            // 生成_id 和时间戳
                            indexRequest.process(mappingMd, allowIdGeneration, concreteIndex.getName());
                            break;
                        case UPDATE:
                            TransportUpdateAction.resolveAndValidateRouting(metaData, concreteIndex.getName(), (UpdateRequest) docWriteRequest);
                            break;
                        case DELETE:
                            docWriteRequest.routing(metaData.resolveIndexRouting(docWriteRequest.parent(), docWriteRequest.routing(), docWriteRequest.index()));
                            // check if routing is required, if so, throw error if routing wasn't specified
                            if (docWriteRequest.routing() == null && metaData.routingRequired(concreteIndex.getName(), docWriteRequest.type())) {
                                throw new RoutingMissingException(concreteIndex.getName(), docWriteRequest.type(), docWriteRequest.id());
                            }
                            break;
                        default: throw new AssertionError("request type not supported: [" + docWriteRequest.opType() + "]");
                    }
                } catch (ElasticsearchParseException | IllegalArgumentException | RoutingMissingException e) {
                    BulkItemResponse.Failure failure = new BulkItemResponse.Failure(concreteIndex.getName(), docWriteRequest.type(), docWriteRequest.id(), e);
                    BulkItemResponse bulkItemResponse = new BulkItemResponse(i, docWriteRequest.opType(), failure);
                    responses.set(i, bulkItemResponse);
                    // make sure the request gets never processed again
                    bulkRequest.requests.set(i, null);
                }
            }

            // first, go over all the requests and create a ShardId -> Operations mapping
            // 将请求根据路由分组
            Map> requestsByShard = new HashMap<>();
            for (int i = 0; i < bulkRequest.requests.size(); i++) {
                DocWriteRequest request = bulkRequest.requests.get(i);
                if (request == null) {
                    continue;
                 }
                String concreteIndex = concreteIndices.getConcreteIndex(request.index()).getName();
                // 用路由字段的hash值取模的方式获取偏移量,路由字段为空则取id值
                ShardId shardId = clusterService.operationRouting().indexShards(clusterState, concreteIndex, request.id(), request.routing()).shardId();
                List shardRequests = requestsByShard.computeIfAbsent(shardId, shard -> new ArrayList<>());
                shardRequests.add(new BulkItemRequest(i, request));
             }

            if (requestsByShard.isEmpty()) {
                listener.onResponse(new BulkResponse(responses.toArray(new BulkItemResponse[responses.length()]), buildTookInMillis(startTimeNanos)));
                return;
            }

            final AtomicInteger counter = new AtomicInteger(requestsByShard.size());
            String nodeId = clusterService.localNode().getId();
            for (Map.Entry> entry : requestsByShard.entrySet()) {
                final ShardId shardId = entry.getKey();
                final List requests = entry.getValue();
                BulkShardRequest bulkShardRequest = new BulkShardRequest(shardId, bulkRequest.getRefreshPolicy(),
                        requests.toArray(new BulkItemRequest[requests.size()]));
                // 在写入文档之前要求多少分片可用,在这里做设置
                bulkShardRequest.waitForActiveShards(bulkRequest.waitForActiveShards());
                // 设置超时等待时间默认一分钟
                bulkShardRequest.timeout(bulkRequest.timeout());
                if (task != null) {
                    bulkShardRequest.setParentTask(nodeId, task.getId());
                }
                shardBulkAction.execute(bulkShardRequest, new ActionListener() {
                    @Override
                    public void onResponse(BulkShardResponse bulkShardResponse) {
                        for (BulkItemResponse bulkItemResponse : bulkShardResponse.getResponses()) {
                            // we may have no response if item failed
                            if (bulkItemResponse.getResponse() != null) {
                                bulkItemResponse.getResponse().setShardInfo(bulkShardResponse.getShardInfo());
                            }
                            responses.set(bulkItemResponse.getItemId(), bulkItemResponse);
                        }
                        if (counter.decrementAndGet() == 0) {
                            finishHim();
                        }
                    }

                    @Override
                    public void onFailure(Exception e) {
                        // create failures for all relevant requests
                        for (BulkItemRequest request : requests) {
                            final String indexName = concreteIndices.getConcreteIndex(request.index()).getName();
                            DocWriteRequest docWriteRequest = request.request();
                            responses.set(request.id(), new BulkItemResponse(request.id(), docWriteRequest.opType(),
                                    new BulkItemResponse.Failure(indexName, docWriteRequest.type(), docWriteRequest.id(), e)));
                        }
                        if (counter.decrementAndGet() == 0) {
                            finishHim();
                        }
                    }

                    private void finishHim() {
                        listener.onResponse(new BulkResponse(responses.toArray(new BulkItemResponse[responses.length()]), buildTookInMillis(startTimeNanos)));
                    }
                });
            }
        }

这里对索引请求的参数进一步做了预处理,并且根据路由参数,计算出请求的目标分片,按分片分组请求,挨个用 shardBulkAction.execute处理每组请求。
TransportShardBulkAction.execute -> Transport.execute -> requestFilterChain.proceed(task, actionName, request, listener) -> this.action.doExecute(task, request, listener) -> TransportReplicationAction.doExecute
ES5.6.4源码解析--批量索引bulk_第1张图片
由于TransportShardBulkAction 继承于TransportReplicationAction,因此最后执行的是TransportReplicationAction.doExecute。

@Override
    protected void doExecute(Task task, Request request, ActionListener listener) {
        new ReroutePhase((ReplicationTask) task, request, listener).run();
    }

#请求的路由
继续执行ReroutePhase的doRun方法,将请求分发到各个路由节点

@Override
        protected void doRun() {
            setPhase(task, "routing");
            final ClusterState state = observer.setAndGetObservedState();
            if (handleBlockExceptions(state)) {
                return;
            }

            // request does not have a shardId yet, we need to pass the concrete index to resolve shardId
            final String concreteIndex = concreteIndex(state);
            final IndexMetaData indexMetaData = state.metaData().index(concreteIndex);
            if (indexMetaData == null) {
                retry(new IndexNotFoundException(concreteIndex));
                return;
            }
            if (indexMetaData.getState() == IndexMetaData.State.CLOSE) {
                throw new IndexClosedException(indexMetaData.getIndex());
            }

            // resolve all derived request fields, so we can route and apply it
            resolveRequest(state.metaData(), indexMetaData, request);
            assert request.shardId() != null : "request shardId must be set in resolveRequest";
            assert request.waitForActiveShards() != ActiveShardCount.DEFAULT : "request waitForActiveShards must be set in resolveRequest";

            // 找到路由指向的主分片
            final ShardRouting primary = primary(state);
            if (retryIfUnavailable(state, primary)) {
                return;
            }
            // 找到主分片所在的节点
            final DiscoveryNode node = state.nodes().get(primary.currentNodeId());
            if (primary.currentNodeId().equals(state.nodes().getLocalNodeId())) {
                // 如果这个节点就是本节点
                performLocalAction(state, primary, node, indexMetaData);
            } else {
                performRemoteAction(state, primary, node);
            }
        }

如果路由到本节点就执行performLocalAction,路由到其他节点就执行performRemoteAction,不过最终都要执行performAction

private void performLocalAction(ClusterState state, ShardRouting primary, DiscoveryNode node, IndexMetaData indexMetaData) {
            setPhase(task, "waiting_on_primary");
            if (logger.isTraceEnabled()) {
                logger.trace("send action [{}] on primary [{}] for request [{}] with cluster state version [{}] to [{}] ",
                    transportPrimaryAction, request.shardId(), request, state.version(), primary.currentNodeId());
            }
            performAction(node, transportPrimaryAction, true,
                new ConcreteShardRequest<>(request, primary.allocationId().getId(), indexMetaData.primaryTerm(primary.id()))); //1
        }
        
private void performAction(final DiscoveryNode node, final String action, final boolean isPrimaryAction,
                                   final TransportRequest requestToPerform) {
            transportService.sendRequest(node, action, requestToPerform, transportOptions, new TransportResponseHandler() {

               ...            });
        }

代码1处显示请求由transportPrimaryAction处理,而处理的handler已经在TransportReplicationAction类的构造函数中注册

transportService.registerRequestHandler(transportPrimaryAction, () -> new ConcreteShardRequest<>(request), executor,
            new PrimaryOperationTransportHandler());

即PrimaryOperationTransportHandler,由其messageReceived接收请求进行处理

@Override
        public void messageReceived(ConcreteShardRequest request, TransportChannel channel, Task task) {
            // incoming primary term can be 0 if request is coming from a < 5.6 node (relocated primary)
            // just use as speculative term the one from the current cluster state, it's validated against the actual primary term
            // within acquirePrimaryShardReference
            final long primaryTerm;
            if (request.primaryTerm > 0L) {
                primaryTerm = request.primaryTerm;
            } else {
                ShardId shardId = request.request.shardId();
                primaryTerm = clusterService.state().metaData().getIndexSafe(shardId.getIndex()).primaryTerm(shardId.id());
            }
            new AsyncPrimaryAction(request.request, request.targetAllocationID, primaryTerm, channel, (ReplicationTask) task).run();
        }

#获取分片锁
在索引数据之前需要获取分片的操作锁,保证线程安全。
new AsyncPrimaryAction(request.request, request.targetAllocationID, primaryTerm, channel, (ReplicationTask) task).run(); ->
acquirePrimaryShardReference(request.shardId(), targetAllocationID, primaryTerm, this); ->
indexShard.acquirePrimaryOperationLock(onAcquired, executor);->
indexShardOperationsLock.acquire(onLockAcquired, executorOnDelay, false);->
onAcquired.onResponse(releasable);
获取锁成功后调用AsyncPrimaryAction的 onResponse

@Override
        public void onResponse(PrimaryShardReference primaryShardReference) {
            try {
                if (primaryShardReference.isRelocated()) {
                    primaryShardReference.close(); // release shard operation lock as soon as possible
                    setPhase(replicationTask, "primary_delegation");
                    // delegate primary phase to relocation target
                    // it is safe to execute primary phase on relocation target as there are no more in-flight operations where primary
                    // phase is executed on local shard and all subsequent operations are executed on relocation target as primary phase.
                    final ShardRouting primary = primaryShardReference.routingEntry();
                    assert primary.relocating() : "indexShard is marked as relocated but routing isn't" + primary;
                    DiscoveryNode relocatingNode = clusterService.state().nodes().get(primary.relocatingNodeId());
                    if (relocatingNode != null && relocatingNode.getVersion().major > Version.CURRENT.major) {
                        // ES 6.x requires a primary context hand-off during primary relocation which is not implemented on ES 5.x,
                        // otherwise it might not be aware of a replica that finished recovery and became activated on the master before
                        // the new primary became in charge of replicating operations, as the cluster state with that in-sync information
                        // might not be applied yet on the primary relocation target before it would be in charge of replicating operations.
                        // This would mean that the new primary could advance the global checkpoint too quickly, not taking into account
                        // the newly in-sync replica.
                        // ES 6.x detects that the primary is relocating from a 5.x node, and activates the primary mode of the global
                        // checkpoint tracker only after activation of the relocation target, which means, however, that requests cannot
                        // be handled as long as the relocation target shard has not been activated.
                        throw new ReplicationOperation.RetryOnPrimaryException(request.shardId(),
                            "waiting for 6.x primary to be activated");
                    }
                    transportService.sendRequest(relocatingNode, transportPrimaryAction,
                        new ConcreteShardRequest<>(request, primary.allocationId().getRelocationId(), primaryTerm),
                        transportOptions,
                        new TransportChannelResponseHandler(logger, channel, "rerouting indexing to target primary " + primary,
                            TransportReplicationAction.this::newResponseInstance) {

                            @Override
                            public void handleResponse(Response response) {
                                setPhase(replicationTask, "finished");
                                super.handleResponse(response);
                            }

                            @Override
                            public void handleException(TransportException exp) {
                                setPhase(replicationTask, "finished");
                                super.handleException(exp);
                            }
                        });
                } else {
                    setPhase(replicationTask, "primary");
                    final IndexMetaData indexMetaData = clusterService.state().getMetaData().index(request.shardId().getIndex());
                    final boolean executeOnReplicas = (indexMetaData == null) || shouldExecuteReplication(indexMetaData);
                    final ActionListener listener = createResponseListener(primaryShardReference);
                    createReplicatedOperation(request,
                            ActionListener.wrap(result -> result.respond(listener), listener::onFailure),
                            primaryShardReference, executeOnReplicas)
                            .execute();
                }
            } catch (Exception e) {
                Releasables.closeWhileHandlingException(primaryShardReference); // release shard operation lock before responding to caller
                onFailure(e);
            }
        }

#在分片上执行索引请求
如果分片被转移,就重新路由转发,如果还是在本节点,就直接执行。

public void execute() throws Exception {
        final String activeShardCountFailure = checkActiveShardCount();
        final ShardRouting primaryRouting = primary.routingEntry();
        final ShardId primaryId = primaryRouting.shardId();
        if (activeShardCountFailure != null) {
            finishAsFailed(new UnavailableShardsException(primaryId,
                "{} Timeout: [{}], request: [{}]", activeShardCountFailure, request.timeout(), request));
            return;
        }

        totalShards.incrementAndGet();
        pendingActions.incrementAndGet();
        // 先在主分片上执行请求
        primaryResult = primary.perform(request);
        final ReplicaRequest replicaRequest = primaryResult.replicaRequest();
        if (replicaRequest != null) {
            if (logger.isTraceEnabled()) {
                logger.trace("[{}] op [{}] completed on primary for request [{}]", primaryId, opType, request);
            }

            // we have to get a new state after successfully indexing into the primary in order to honour recovery semantics.
            // we have to make sure that every operation indexed into the primary after recovery start will also be replicated
            // to the recovery target. If we use an old cluster state, we may miss a relocation that has started since then.
            ClusterState clusterState = clusterStateSupplier.get();
            final List shards = getShards(primaryId, clusterState);
            Set inSyncAllocationIds = getInSyncAllocationIds(primaryId, clusterState);

            markUnavailableShardsAsStale(replicaRequest, inSyncAllocationIds, shards);
			// 请求发到副本上执行
            performOnReplicas(replicaRequest, shards);
        }

        successfulShards.incrementAndGet();
        decPendingAndFinishIfNeeded();
    }

主分片执行请求成功后,请求再发到副本上执行。
我们看主分片的操作,副本上的操作与主分片流程大致一致。
perform -> TransportShardBulkAction.shardOperationOnPrimary

@Override
    public WritePrimaryResult shardOperationOnPrimary(
            BulkShardRequest request, IndexShard primary) throws Exception {
        final IndexMetaData metaData = primary.indexSettings().getIndexMetaData();

        long[] preVersions = new long[request.items().length];
        VersionType[] preVersionTypes = new VersionType[request.items().length];
        Translog.Location location = null;
        // 将批量请求逐个执行
        for (int requestIndex = 0; requestIndex < request.items().length; requestIndex++) {
            if (isAborted(request.items()[requestIndex].getPrimaryResponse()) == false) {
                location = executeBulkItemRequest(metaData, primary, request, preVersions, preVersionTypes, location, requestIndex);
            }
        }
        // 保存每个请求的结果
        BulkItemResponse[] responses = new BulkItemResponse[request.items().length];
        BulkItemRequest[] items = request.items();
        for (int i = 0; i < items.length; i++) {
            responses[i] = items[i].getPrimaryResponse();
        }
        BulkShardResponse response = new BulkShardResponse(request.shardId(), responses);
        return new WritePrimaryResult<>(request, response, location, null, primary, logger);
    }

将批量请求逐个执行,并且保存每个请求的结果

/** Executes bulk item requests and handles request execution exceptions */
    private Translog.Location executeBulkItemRequest(IndexMetaData metaData, IndexShard primary,
                                                     BulkShardRequest request,
                                                     long[] preVersions, VersionType[] preVersionTypes,
                                                     Translog.Location location, int requestIndex) throws Exception {
        final DocWriteRequest itemRequest = request.items()[requestIndex].request();
        preVersions[requestIndex] = itemRequest.version();
        preVersionTypes[requestIndex] = itemRequest.versionType();
        DocWriteRequest.OpType opType = itemRequest.opType();
        try {
            // execute item request
            final Engine.Result operationResult;
            final DocWriteResponse response;
            final BulkItemRequest replicaRequest;
            switch (itemRequest.opType()) {
                case CREATE:
                case INDEX:
                    final IndexRequest indexRequest = (IndexRequest) itemRequest;
                    Engine.IndexResult indexResult = executeIndexRequestOnPrimary(indexRequest, primary, mappingUpdatedAction);
                    if (indexResult.hasFailure()) {
                        response = null;
                    } else {
                        // update the version on request so it will happen on the replicas
                        final long version = indexResult.getVersion();
                        indexRequest.version(version);
                        indexRequest.versionType(indexRequest.versionType().versionTypeForReplicationAndRecovery());
                        assert indexRequest.versionType().validateVersionForWrites(indexRequest.version());
                        response = new IndexResponse(primary.shardId(), indexRequest.type(), indexRequest.id(),
                                indexResult.getVersion(), indexResult.isCreated());
                    }
                    operationResult = indexResult;
                    replicaRequest = request.items()[requestIndex];
                    break;
                case UPDATE:
                  ...        }
        return location;
    }

根据请求的类型具体执行,包括,新建,更新,删除,这里新建请求调用executeIndexRequestOnPrimary

public static Engine.IndexResult executeIndexRequestOnPrimary(IndexRequest request, IndexShard primary,
                                                                  MappingUpdatedAction mappingUpdatedAction) throws Exception {
        Engine.Index operation;
        try {
            operation = prepareIndexOperationOnPrimary(request, primary);
        } catch (MapperParsingException | IllegalArgumentException e) {
            return new Engine.IndexResult(e, request.version());
        }
        Mapping update = operation.parsedDoc().dynamicMappingsUpdate();
        final ShardId shardId = primary.shardId();
        if (update != null) {
            // can throw timeout exception when updating mappings or ISE for attempting to update default mappings
            // which are bubbled up
            try {
                mappingUpdatedAction.updateMappingOnMaster(shardId.getIndex(), request.type(), update);
            } catch (IllegalArgumentException e) {
                // throws IAE on conflicts merging dynamic mappings
                return new Engine.IndexResult(e, request.version());
            }
            try {
                operation = prepareIndexOperationOnPrimary(request, primary);
            } catch (MapperParsingException | IllegalArgumentException e) {
                return new Engine.IndexResult(e, request.version());
            }
            update = operation.parsedDoc().dynamicMappingsUpdate();
            if (update != null) {
                throw new ReplicationOperation.RetryOnPrimaryException(shardId,
                        "Dynamic mappings are not available on the node that holds the primary yet");
            }
        }
        return primary.index(operation);
    }

在真正索引数据之前需要更新一下映射,例如文档中可能会新增之前映射中没有的域,这时候就要更新映射,加入该域。

private Engine.IndexResult index(Engine engine, Engine.Index index) throws IOException {
        active.set(true);
        final Engine.IndexResult result;
        index = indexingOperationListeners.preIndex(shardId, index);
        try {
            if (logger.isTraceEnabled()) {
                logger.trace("index [{}][{}] (v# [{}])",  index.type(), index.id(), index.version());
            }
            result = engine.index(index);
        } catch (Exception e) {
            indexingOperationListeners.postIndex(shardId, index, e);
            throw e;
        }
        indexingOperationListeners.postIndex(shardId, index, result);
        return result;
    }

创建索引引擎去向索引添加文档,该引擎封装了对lucene的调用。
#总结
批量索引的过程主要有如下:

  1. 利用管道对请求做预处理(5.X新增特性)
  2. 校验索引,并且创建不存在的索引
  3. 将请求路由到各个目标节点
  4. 获取锁
  5. 通过调用Lucene接口执行请求。

ES批量索引的源码解析就到这里,具体lucene的源码,之后将另外解析。

你可能感兴趣的:(elasticsearch)