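Official API docs: https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-document-bulk.html
A while ago a colleague from ES support told me that the bulk call I was using was already asynchronous: that it would check the size and count thresholds automatically and then submit the data, rather than running synchronously. I was busy at the time and never looked into it. When I finally had time to investigate, it turned out this colleague had misread it.
The bulk API of the ES REST client is synchronous by default, not asynchronous.
bulkAsync is the asynchronous one, so I went back through the code I had dug into and laid it out again to make my own thinking clearer.
But there is a question here: why go asynchronous at all? There is a concept behind it, so let's start with how ES handles inserts.
As I understand it, writes in ES end up as segment files (strictly speaking a new segment is written on refresh rather than on every single request), so submitting 1000 documents as 1000 requests performs very differently from submitting 1 request that contains 1000 documents. ES does merge segments, but merging has its own performance cost.
Admittedly I have not read that part of the code yet, so you can refer to this article by another author: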
https://blog.csdn.net/jiaojiao521765146514/article/details/83753215
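I ran into a similar problem in my own usage. The upstream caller cannot send its data in large batches and instead splits it into many small requests, which causes occasional request spikes. I do have a retry mechanism, but in the long run this is clearly bad for scaling. That is what led to the investigation below.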
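Before the source analysis, let me share my thoughts on ES query optimization. ES queries use an inverted index. I used to assume the inverted index was simply kept in memory and spilled to disk when it did not fit, but that never felt right. After checking the Lucene documentation I found the design really clever: the term dictionary is large, so a term index is built on top of it.
The term index is then organized as a tree-like structure (I think of it as a binary tree; in Lucene it is actually an FST, a finite state transducer), which is great: the lookup cost drops to roughly log N.
The upshot of log N is that however large the data gets, the number of lookup steps stays small.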
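To make the log N point concrete, here is a minimal sketch of the "small index over a big dictionary" idea. It is not Lucene's actual structure (Lucene uses an FST for the term index); `SparseTermIndex` and its block size are hypothetical names chosen for illustration. Only every k-th term is kept in the in-memory index, a binary search picks the block, and the block is then scanned.

```java
import java.util.Arrays;

/** Toy sparse term index over a sorted term dictionary (hypothetical, for illustration only). */
final class SparseTermIndex {
    private final String[] dictionary; // full sorted term dictionary (would live on disk)
    private final String[] index;      // first term of every block (kept in memory)
    private final int blockSize;

    SparseTermIndex(String[] sortedTerms, int blockSize) {
        this.dictionary = sortedTerms.clone();
        this.blockSize = blockSize;
        int blocks = (sortedTerms.length + blockSize - 1) / blockSize;
        this.index = new String[blocks];
        for (int b = 0; b < blocks; b++) {
            index[b] = sortedTerms[b * blockSize];
        }
    }

    /** Returns the position of the term in the dictionary, or -1 if it is absent. */
    int lookup(String term) {
        int pos = Arrays.binarySearch(index, term); // O(log(number of blocks))
        // On a miss binarySearch returns -(insertionPoint) - 1; the candidate block is the one before it.
        int block = pos >= 0 ? pos : -pos - 2;
        if (block < 0) {
            return -1; // term sorts before the first indexed term
        }
        int start = block * blockSize;
        int end = Math.min(start + blockSize, dictionary.length);
        for (int i = start; i < end; i++) { // short scan inside one block
            if (dictionary[i].equals(term)) {
                return i;
            }
        }
        return -1;
    }
}
```

The in-memory part is only 1/blockSize of the dictionary, which is the trade-off the term index makes: a little memory buys a lookup that touches one small block instead of the whole dictionary.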
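For this topic I recommend the algorithms course on Geek Time (极客时间) and the book 漫画算法.
Introduction to Algorithms truly humbles me; I can never get through it. The other very important use of algorithm problems is interviews. Haha. Wishing myself luck with the interviews.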
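One more addition, on caching: ES caches are divided into the query cache, the request cache, and the fielddata cache.
So it makes sense to consolidate frequently used indices where possible to raise the hit rate, and to give the caches more memory.
That way, what gets evicted each time is the data of rarely used indices.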
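The eviction behaviour described above can be sketched with a tiny LRU cache. This is only an illustration built on `java.util.LinkedHashMap`, not how ES implements its caches; `LruCache` and the key names in the usage are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU cache: the least recently used entry is evicted once capacity is exceeded. */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: get() moves an entry to the "most recent" end
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // called after each put; true drops the least recently used entry
    }
}
```

With a capacity of 2, putting "hot-index" and "cold-index", reading "hot-index", and then putting a third key evicts "cold-index", because it is the entry that was touched least recently.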
/**
* @deprecated If creating a new HLRC ReST API call, consider creating new actions instead of reusing server actions. The Validation
* layer has been added to the ReST client, and requests should extend {@link Validatable} instead of {@link ActionRequest}.
*/
@Deprecated
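protected final <Req extends ActionRequest, Resp> Resp performRequest(Req request,
CheckedFunction<Req, Request, IOException> requestConverter,
RequestOptions options,
CheckedFunction<Response, Resp, IOException> responseConverter,
Set<Integer> ignores) throws IOException {
// the request is validated here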
ActionRequestValidationException validationException = request.validate();
if (validationException != null && validationException.validationErrors().isEmpty() == false) {
throw validationException;
}
return internalPerformRequest(request, requestConverter, options, responseConverter, ignores);
}
@Override
// Note: validate() is declared on the parent type; what runs here is the subclass override (BulkRequest in this case)
public ActionRequestValidationException validate() {
ActionRequestValidationException validationException = null;
if (requests.isEmpty()) {
validationException = addValidationError("no requests added", validationException);
}
for (DocWriteRequest<?> request : requests) {
// We first check if refresh has been set
if (((WriteRequest<?>) request).getRefreshPolicy() != RefreshPolicy.NONE) {
validationException = addValidationError(
"RefreshPolicy is not supported on an item request. Set it on the BulkRequest instead.", validationException);
}
// this validate() is the one on the individual item request (IndexRequest in my case)
ActionRequestValidationException ex = ((WriteRequest<?>) request).validate();
if (ex != null) {
if (validationException == null) {
validationException = new ActionRequestValidationException();
}
validationException.addValidationErrors(ex.validationErrors());
}
}
return validationException;
}
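For example, everything I send here is an IndexRequest. Its validate() only does some basic field checks, which are not particularly interesting: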
@Override
public ActionRequestValidationException validate() {
ActionRequestValidationException validationException = super.validate();
if (source == null) {
validationException = addValidationError("source is missing", validationException);
}
if (Strings.isEmpty(type())) {
validationException = addValidationError("type is missing", validationException);
}
if (contentType == null) {
validationException = addValidationError("content type is missing", validationException);
}
final long resolvedVersion = resolveVersionDefaults();
if (opType() == OpType.CREATE) {
if (versionType != VersionType.INTERNAL) {
validationException = addValidationError("create operations only support internal versioning. use index instead",
validationException);
return validationException;
}
if (resolvedVersion != Versions.MATCH_DELETED) {
validationException = addValidationError("create operations do not support explicit versions. use index instead",
validationException);
return validationException;
}
if (ifSeqNo != UNASSIGNED_SEQ_NO || ifPrimaryTerm != UNASSIGNED_PRIMARY_TERM) {
validationException = addValidationError("create operations do not support compare and set. use index instead",
validationException);
return validationException;
}
}
if (opType() != OpType.INDEX && id == null) {
addValidationError("an id is required for a " + opType() + " operation", validationException);
}
validationException = DocWriteRequest.validateSeqNoBasedCASParams(this, validationException);
if (id != null && id.getBytes(StandardCharsets.UTF_8).length > 512) {
validationException = addValidationError("id is too long, must be no longer than 512 bytes but was: " +
id.getBytes(StandardCharsets.UTF_8).length, validationException);
}
if (id == null && (versionType == VersionType.INTERNAL && resolvedVersion == Versions.MATCH_ANY) == false) {
validationException = addValidationError("an id must be provided if version type or value are set", validationException);
}
if (pipeline != null && pipeline.isEmpty()) {
validationException = addValidationError("pipeline cannot be an empty string", validationException);
}
return validationException;
}
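Next comes
internalPerformRequest(request, requestConverter, options, responseConverter, ignores);
Follow it down to the code that actually wraps the call. Although it calls the async client, it ends with get(), which turns the call into a blocking one.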
private Response performRequest(final NodeTuple<Iterator<Node>> nodeTuple,
final InternalRequest request,
Exception previousException) throws IOException {
RequestContext context = request.createContextForNextAttempt(nodeTuple.nodes.next(), nodeTuple.authCache);
HttpResponse httpResponse;
try {
// this client is a CloseableHttpAsyncClient, but get() is called on the returned Future, so the call is effectively synchronous
httpResponse = client.execute(context.requestProducer, context.asyncResponseConsumer, context.context, null).get();
} catch(Exception e) {
RequestLogger.logFailedRequest(logger, request.httpRequest, context.node, e);
onFailure(context.node);
Exception cause = extractAndWrapCause(e);
addSuppressedException(previousException, cause);
if (nodeTuple.nodes.hasNext()) {
return performRequest(nodeTuple, request, cause);
}
if (cause instanceof IOException) {
throw (IOException) cause;
}
if (cause instanceof RuntimeException) {
throw (RuntimeException) cause;
}
throw new IllegalStateException("unexpected exception type: must be either RuntimeException or IOException", cause);
}
ResponseOrResponseException responseOrResponseException = convertResponse(request, context.node, httpResponse);
if (responseOrResponseException.responseException == null) {
return responseOrResponseException.response;
}
addSuppressedException(previousException, responseOrResponseException.responseException);
if (nodeTuple.nodes.hasNext()) {
return performRequest(nodeTuple, request, responseOrResponseException.responseException);
}
throw responseOrResponseException.responseException;
}
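The "async client plus get() equals blocking" pattern is easy to reproduce with the JDK alone. The sketch below uses `CompletableFuture` as a stand-in for the HTTP async client; `BlockingOverAsync`, `executeAsync` and `executeSync` are hypothetical names, and the response string is made up.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

final class BlockingOverAsync {
    /** Stand-in for an async client call: the work runs on another thread. */
    static CompletableFuture<String> executeAsync(String request) {
        return CompletableFuture.supplyAsync(() -> "response:" + request);
    }

    /** Synchronous facade: get() parks the calling thread until the async work is done. */
    static String executeSync(String request) {
        try {
            return executeAsync(request).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag for callers
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("async call failed", e.getCause());
        }
    }
}
```

The transport underneath is still asynchronous, but from the caller's point of view `executeSync` behaves like any blocking call, which matches what the synchronous bulk path does.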
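The asynchronous path flushes based on 3 dimensions:
1. the number of actions added
2. the total size of the buffered data
3. the flush interval (how often to flush)
It is used through BulkProcessor: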
public static BulkProcessor getBulkProcessor(RestHighLevelClient restHighLevelClient) {
BiConsumer<BulkRequest, ActionListener<BulkResponse>> bulkConsumer =
(request, bulkListener) -> restHighLevelClient.bulkAsync(request, RequestOptions.DEFAULT, bulkListener);
return BulkProcessor.builder(bulkConsumer, new BulkProcessor.Listener() {
@Override
public void beforeBulk(long executionId, BulkRequest bulkRequest) {
}
@Override
public void afterBulk(long executionId, BulkRequest bulkRequest, BulkResponse bulkResponse) {
}
@Override
public void afterBulk(long executionId, BulkRequest bulkRequest, Throwable throwable) {
}
}).setBulkActions(5000)
.setFlushInterval(TimeValue.timeValueSeconds(10))
.build();
}
public static Builder builder(BiConsumer<BulkRequest, ActionListener<BulkResponse>> consumer, Listener listener) {
Objects.requireNonNull(consumer, "consumer");
Objects.requireNonNull(listener, "listener");
// thread pool for scheduled tasks
final ScheduledThreadPoolExecutor scheduledThreadPoolExecutor = Scheduler.initScheduler(Settings.EMPTY);
return new Builder(consumer, listener,
buildScheduler(scheduledThreadPoolExecutor),
() -> Scheduler.terminate(scheduledThreadPoolExecutor, 10, TimeUnit.SECONDS));
}
static ScheduledThreadPoolExecutor initScheduler(Settings settings) {
// core pool size is 1
final ScheduledThreadPoolExecutor scheduler = new SafeScheduledThreadPoolExecutor(1,
EsExecutors.daemonThreadFactory(settings, "scheduler"), new EsAbortPolicy());
// false: after shutdown, pending delayed tasks are NOT executed
scheduler.setExecuteExistingDelayedTasksAfterShutdownPolicy(false);
// false: after shutdown, periodic tasks do NOT keep running
scheduler.setContinueExistingPeriodicTasksAfterShutdownPolicy(false);
// true: a cancelled task is removed from the work queue immediately
scheduler.setRemoveOnCancelPolicy(true);
return scheduler;
}
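The effect of these three policies can be checked with the plain JDK executor. This is a minimal sketch using `java.util.concurrent.ScheduledThreadPoolExecutor` directly (not the ES `SafeScheduledThreadPoolExecutor`); `SchedulerPolicyDemo` is a hypothetical name.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class SchedulerPolicyDemo {
    /** Single-threaded scheduler configured with the same three policies as initScheduler. */
    static ScheduledThreadPoolExecutor newScheduler() {
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
        scheduler.setExecuteExistingDelayedTasksAfterShutdownPolicy(false);
        scheduler.setContinueExistingPeriodicTasksAfterShutdownPolicy(false);
        scheduler.setRemoveOnCancelPolicy(true);
        return scheduler;
    }

    /** Schedules a far-future task, cancels it, and reports how many tasks are still queued. */
    static int queuedAfterCancel() {
        ScheduledThreadPoolExecutor scheduler = newScheduler();
        try {
            ScheduledFuture<?> future = scheduler.schedule(() -> { }, 1, TimeUnit.HOURS);
            future.cancel(false);
            return scheduler.getQueue().size(); // removeOnCancel=true: the cancelled task is gone
        } finally {
            scheduler.shutdownNow();
        }
    }

    /** Schedules a far-future task, shuts down, and reports how many tasks are still queued. */
    static int queuedAfterShutdown() {
        ScheduledThreadPoolExecutor scheduler = newScheduler();
        scheduler.schedule(() -> { }, 1, TimeUnit.HOURS);
        scheduler.shutdown(); // delayed tasks are dropped because the policy is false
        return scheduler.getQueue().size();
    }
}
```

Both methods report an empty queue, which is why a closed BulkProcessor does not leave a pending flush task holding the scheduler alive.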
// ES thread names: searching a thread dump for the "elasticsearch" prefix makes it easy to see which threads belong to ES
public static String threadName(Settings settings, String namePrefix) {
if (Node.NODE_NAME_SETTING.exists(settings)) {
return threadName(Node.NODE_NAME_SETTING.get(settings), namePrefix);
} else {
// TODO this should only be allowed in tests
return threadName("", namePrefix);
}
}
public static String threadName(final String nodeName, final String namePrefix) {
// TODO missing node names should only be allowed in tests
return "elasticsearch" + (nodeName.isEmpty() ? "" : "[") + nodeName + (nodeName.isEmpty() ? "" : "]") + "[" + namePrefix + "]";
}
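To see what these names look like in a thread dump, the string-building logic can be copied into a standalone helper. `ThreadNames` is a hypothetical wrapper class, and "node-1" is a made-up node name; the expression itself is the one from threadName(String, String) above.

```java
final class ThreadNames {
    /** Same expression as threadName(String, String) above, copied for illustration. */
    static String threadName(String nodeName, String namePrefix) {
        return "elasticsearch" + (nodeName.isEmpty() ? "" : "[") + nodeName
                + (nodeName.isEmpty() ? "" : "]") + "[" + namePrefix + "]";
    }
}
```

With `Settings.EMPTY` there is no node name, so `threadName("", "scheduler")` gives `elasticsearch[scheduler]`, while `threadName("node-1", "scheduler")` gives `elasticsearch[node-1][scheduler]`.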
static class EsThreadFactory implements ThreadFactory {
final ThreadGroup group;
final AtomicInteger threadNumber = new AtomicInteger(1);
final String namePrefix;
EsThreadFactory(String namePrefix) {
this.namePrefix = namePrefix;
SecurityManager s = System.getSecurityManager();
group = (s != null) ? s.getThreadGroup() :
Thread.currentThread().getThreadGroup();
}
@Override
public Thread newThread(Runnable r) {
// the thread is created as a daemon thread
Thread t = new Thread(group, r,
namePrefix + "[T#" + threadNumber.getAndIncrement() + "]",
0);
t.setDaemon(true);
return t;
}
}
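// The rejection policy is custom: for runnables marked force-execution it forces the task into the queue. That path is not used here; my guess is that it exists for high-concurrency situations.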
public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
if (r instanceof AbstractRunnable && ((AbstractRunnable)r).isForceExecution()) {
BlockingQueue queue = executor.getQueue();
if (!(queue instanceof SizeBlockingQueue)) {
throw new IllegalStateException("forced execution, but expected a size queue");
} else {
try {
((SizeBlockingQueue)queue).forcePut(r);
} catch (InterruptedException var5) {
Thread.currentThread().interrupt();
throw new IllegalStateException("forced execution, but got interrupted", var5);
}
}
} else {
this.rejected.inc();
throw new EsRejectedExecutionException("rejected execution of " + r + " on " + executor, executor.isShutdown());
}
}
The key part of the builder method:
return new Builder(consumer, listener,
buildScheduler(scheduledThreadPoolExecutor),
() -> Scheduler.terminate(scheduledThreadPoolExecutor, 10, TimeUnit.SECONDS));
I find buildScheduler and terminate here very interesting.
Scheduler.terminate shuts the thread pool down.
bulkProcessor.add(new IndexRequest())
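private void internalAdd(DocWriteRequest<?> request) {
//bulkRequest and instance swapping is not threadsafe, so execute the mutations under a lock.
//once the bulk request is ready to be shipped swap the instance reference unlock and send the local reference to the handler.
Tuple<BulkRequest, Long> bulkRequestToExecute = null;
lock.lock();
try {
// ensureOpen() checks whether the processor has already been closed; if so it throws immediately
ensureOpen();
// BulkRequest keeps its items in plain, non-thread-safe collections, so the lock is what provides thread safety
bulkRequest.add(request);
// this is where it decides whether bulkSize / bulkActions has been reached
// so a send is only kicked off once enough data has piled up to hit a limit. Nice. Similar in spirit to how a thread pool only adds threads under load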
bulkRequestToExecute = newBulkRequestIfNeeded();
} finally {
lock.unlock();
}
//execute sending the local reference outside the lock to allow handler to control the concurrency via it's configuration.
if (bulkRequestToExecute != null) {
execute(bulkRequestToExecute.v1(), bulkRequestToExecute.v2());
}
}
private Tuple<BulkRequest, Long> newBulkRequestIfNeeded() {
ensureOpen();
if (!isOverTheLimit()) {
return null;
}
final BulkRequest bulkRequest = this.bulkRequest;
this.bulkRequest = bulkRequestSupplier.get();
return new Tuple<>(bulkRequest,executionIdGen.incrementAndGet()) ;
}
private boolean isOverTheLimit() {
if (bulkActions != -1 && bulkRequest.numberOfActions() >= bulkActions) {
return true;
}
if (bulkSize != -1 && bulkRequest.estimatedSizeInBytes() >= bulkSize) {
return true;
}
return false;
}
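The add path above boils down to "append under a lock, swap the buffer when a limit is hit, send outside the lock". Here is a minimal standalone sketch of that pattern; `ThresholdBatcher`, its field names and the use of plain strings as documents are assumptions for illustration, not the ES classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

/** Toy batcher (hypothetical): hands off a batch once a count or byte-size limit is reached. */
final class ThresholdBatcher {
    private final int maxActions; // -1 disables the count limit
    private final long maxBytes;  // -1 disables the size limit
    private final Consumer<List<String>> sender;
    private final ReentrantLock lock = new ReentrantLock();
    private List<String> current = new ArrayList<>();
    private long currentBytes = 0;

    ThresholdBatcher(int maxActions, long maxBytes, Consumer<List<String>> sender) {
        this.maxActions = maxActions;
        this.maxBytes = maxBytes;
        this.sender = sender;
    }

    void add(String doc) {
        List<String> toSend = null;
        lock.lock();
        try {
            current.add(doc);
            currentBytes += doc.getBytes(StandardCharsets.UTF_8).length;
            if (isOverTheLimit()) {
                toSend = current;            // keep a local reference to the full batch
                current = new ArrayList<>(); // swap in a fresh buffer while still holding the lock
                currentBytes = 0;
            }
        } finally {
            lock.unlock();
        }
        if (toSend != null) {
            sender.accept(toSend); // send outside the lock so other threads can keep adding
        }
    }

    int pending() {
        lock.lock();
        try {
            return current.size();
        } finally {
            lock.unlock();
        }
    }

    private boolean isOverTheLimit() {
        if (maxActions != -1 && current.size() >= maxActions) {
            return true;
        }
        return maxBytes != -1 && currentBytes >= maxBytes;
    }
}
```

Sending outside the lock is the important design choice: the lock only protects the cheap buffer swap, so a slow network call never blocks other threads that are adding documents.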
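Next is execute. I find the way it implements the before/after "aspects" very straightforward, and it is the style I like best: plain static code that calls the hooks directly, with no AOP machinery.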
good work
public void execute(BulkRequest bulkRequest, long executionId) {
Runnable toRelease = () -> {};
boolean bulkRequestSetupSuccessful = false;
try {
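// beforeBulk hook
listener.beforeBulk(executionId, bulkRequest);
// the semaphore caps how many bulk requests may be in flight at the same time
semaphore.acquire();
toRelease = semaphore::release;
// the latch lets the caller wait for this bulk to finish before moving on; it is only awaited when concurrentRequests == 0 (see below), which is how the synchronous mode gets its success or failure before returning
CountDownLatch latch = new CountDownLatch(1);
// the actual submission happens here, wrapped in retry with backoff
retry.withBackoff(consumer, bulkRequest, ActionListener.runAfter(new ActionListener<BulkResponse>() {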
@Override
public void onResponse(BulkResponse response) {
listener.afterBulk(executionId, bulkRequest, response);
}
@Override
public void onFailure(Exception e) {
listener.afterBulk(executionId, bulkRequest, e);
}
}, () -> {
semaphore.release();
latch.countDown();
}));
bulkRequestSetupSuccessful = true;
if (concurrentRequests == 0) {
latch.await();
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
logger.info(() -> new ParameterizedMessage("Bulk request {} has been cancelled.", executionId), e);
listener.afterBulk(executionId, bulkRequest, e);
} catch (Exception e) {
logger.warn(() -> new ParameterizedMessage("Failed to execute bulk request {}.", executionId), e);
listener.afterBulk(executionId, bulkRequest, e);
} finally {
if (bulkRequestSetupSuccessful == false) { // if we fail on client.bulk() release the semaphore
toRelease.run();
}
}
}
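The semaphore-plus-latch arrangement in execute can be sketched on its own. In the sketch below `BoundedSender` is a hypothetical class; giving the semaphore one permit when `concurrentRequests` is 0 is an assumption on my part about how the handler is set up, and the cached thread pool merely stands in for the async HTTP client.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy handler (hypothetical): a semaphore caps in-flight sends, a latch makes mode 0 synchronous. */
final class BoundedSender {
    private final int concurrentRequests;
    private final Semaphore semaphore;
    // Daemon threads stand in for the async client's I/O threads.
    private final ExecutorService io = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "bounded-sender-io");
        t.setDaemon(true);
        return t;
    });
    final AtomicInteger completed = new AtomicInteger();

    BoundedSender(int concurrentRequests) {
        this.concurrentRequests = concurrentRequests;
        // Assumption: with 0 concurrent requests there is still exactly one permit.
        this.semaphore = new Semaphore(concurrentRequests > 0 ? concurrentRequests : 1);
    }

    void execute(Runnable send) {
        try {
            semaphore.acquire(); // blocks while the in-flight cap is reached
            CountDownLatch latch = new CountDownLatch(1);
            io.execute(() -> {
                try {
                    send.run();
                } finally {
                    completed.incrementAndGet();
                    semaphore.release();
                    latch.countDown();
                }
            });
            if (concurrentRequests == 0) {
                latch.await(); // synchronous mode: wait until this send has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sending", e);
        }
    }

    void shutdown() {
        io.shutdown();
    }
}
```

With `new BoundedSender(0)` each `execute` returns only after its send has completed; with `new BoundedSender(2)` up to two sends overlap and a third caller waits at `acquire()`.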
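// Note which class this belongs to: it is the retry handler, not BulkProcessor itself
// A scheduler is passed in here as well, but note that it is only used to retry failures, not for the 10s periodic flush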
public void execute(BulkRequest bulkRequest) {
this.currentBulkRequest = bulkRequest;
consumer.accept(bulkRequest, this);
}
// the callback methods below
@Override
public void onResponse(BulkResponse bulkItemResponses) {
if (!bulkItemResponses.hasFailures()) {
// we're done here, include all responses
addResponses(bulkItemResponses, (r -> true));
finishHim();
} else {
if (canRetry(bulkItemResponses)) {
addResponses(bulkItemResponses, (r -> !r.isFailed()));
retry(createBulkRequestForRetry(bulkItemResponses));
} else {
addResponses(bulkItemResponses, (r -> true));
finishHim();
}
}
}
@Override
public void onFailure(Exception e) {
if (e instanceof RemoteTransportException && ((RemoteTransportException) e).status() == RETRY_STATUS && backoff.hasNext()) {
retry(currentBulkRequest);
} else {
try {
listener.onFailure(e);
} finally {
if (retryCancellable != null) {
retryCancellable.cancel();
}
}
}
}
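The retry flow in onResponse (keep what succeeded, resend only what failed, stop when the backoff is exhausted) can be written as a small synchronous loop. This is a simplified sketch, not the ES `Retry` class: `PartialRetry`, the string items and the `send` function that returns the failed subset are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

/** Toy retry loop (hypothetical): resend only the failed items while backoff delays remain. */
final class PartialRetry {
    /**
     * @param items   items to send
     * @param send    sends a batch and returns the subset that failed
     * @param backoff delays in milliseconds; once exhausted, the remaining failures are returned
     * @return the items that still failed after all retries (empty on full success)
     */
    static List<String> sendWithRetry(List<String> items,
                                      Function<List<String>, List<String>> send,
                                      Iterator<Long> backoff) {
        List<String> pending = new ArrayList<>(items);
        while (true) {
            List<String> failed = send.apply(pending);
            if (failed.isEmpty() || !backoff.hasNext()) {
                return failed; // either everything succeeded or retries are used up
            }
            sleep(backoff.next());
            pending = new ArrayList<>(failed); // next attempt carries only the failed items
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt, then carry on
        }
    }
}
```

The real code does the same thing asynchronously: instead of sleeping, it schedules the next attempt on the scheduler, which is why that scheduler shows up in the retry handler.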
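Tracing further leads to the method that was passed in earlier through the lambda: restHighLevelClient.bulkAsync(request, RequestOptions.DEFAULT, bulkListener)
It is not much different from before: the same CloseableHttpAsyncClient is used.
So what is the difference from the synchronous path? The listener: the result is delivered through a callback instead of a blocking get().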
client.execute(context.requestProducer, context.asyncResponseConsumer, context.context, new FutureCallback<HttpResponse>() {
@Override
public void completed(HttpResponse httpResponse) {
try {
ResponseOrResponseException responseOrResponseException = convertResponse(request, context.node, httpResponse);
if (responseOrResponseException.responseException == null) {
listener.onSuccess(responseOrResponseException.response);
} else {
if (nodeTuple.nodes.hasNext()) {
listener.trackFailure(responseOrResponseException.responseException);
performRequestAsync(nodeTuple, request, listener);
} else {
listener.onDefinitiveFailure(responseOrResponseException.responseException);
}
}
} catch(Exception e) {
listener.onDefinitiveFailure(e);
}
}
@Override
public void failed(Exception failure) {
try {
RequestLogger.logFailedRequest(logger, request.httpRequest, context.node, failure);
onFailure(context.node);
if (nodeTuple.nodes.hasNext()) {
listener.trackFailure(failure);
performRequestAsync(nodeTuple, request, listener);
} else {
listener.onDefinitiveFailure(failure);
}
} catch(Exception e) {
listener.onDefinitiveFailure(e);
}
}
@Override
public void cancelled() {
listener.onDefinitiveFailure(new ExecutionException("request was cancelled", null));
}
});
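For contrast with the blocking get() variant, the callback style can be sketched with the JDK alone. `CallbackStyle` and its `Listener` interface are hypothetical stand-ins for the client and its response listener; `CompletableFuture` plays the role of the async HTTP client.

```java
import java.util.concurrent.CompletableFuture;

final class CallbackStyle {
    /** Hypothetical listener, playing the role of the response listener in the client. */
    interface Listener {
        void onSuccess(String response);
        void onFailure(Throwable failure);
    }

    /** Returns immediately; the listener is invoked when the async work completes. */
    static CompletableFuture<String> executeAsync(String request, Listener listener) {
        return CompletableFuture
                .supplyAsync(() -> "response:" + request)
                .whenComplete((response, failure) -> {
                    if (failure == null) {
                        listener.onSuccess(response);
                    } else {
                        listener.onFailure(failure);
                    }
                });
    }
}
```

The caller's thread is free as soon as `executeAsync` returns. The returned future is only there so that a test or a shutdown hook can wait for the callback; normal callers would ignore it.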
Now here is the question. We know the size and count dimensions are evaluated when add is called.
So how is the time-based flush dimension handled?
.setBulkActions(5000)
.setFlushInterval(TimeValue.timeValueSeconds(10))
.build();
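The code analysed above includes a closed check: a later add can only go through while the processor has not been closed. So what keeps the processor running, and where does the timed flush get started?
The answer is in build(): when it calls the constructor, the periodic flush task is started.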
BulkProcessor(BiConsumer<BulkRequest, ActionListener<BulkResponse>> consumer, BackoffPolicy backoffPolicy, Listener listener,
int concurrentRequests, int bulkActions, ByteSizeValue bulkSize, @Nullable TimeValue flushInterval,
Scheduler scheduler, Runnable onClose, Supplier<BulkRequest> bulkRequestSupplier) {
this.bulkActions = bulkActions;
this.bulkSize = bulkSize.getBytes();
this.bulkRequest = bulkRequestSupplier.get();
this.bulkRequestSupplier = bulkRequestSupplier;
this.bulkRequestHandler = new BulkRequestHandler(consumer, backoffPolicy, listener, scheduler, concurrentRequests);
// Start period flushing task after everything is setup
this.cancellableFlushTask = startFlushTask(flushInterval, scheduler);
this.onClose = onClose;
}
// the periodic flush task is started here
private Scheduler.Cancellable startFlushTask(TimeValue flushInterval, Scheduler scheduler) {
if (flushInterval == null) {
return new Scheduler.Cancellable() {
@Override
public boolean cancel() {
return false;
}
@Override
public boolean isCancelled() {
return true;
}
};
}
final Runnable flushRunnable = scheduler.preserveContext(new Flush());
return scheduler.scheduleWithFixedDelay(flushRunnable, flushInterval, ThreadPool.Names.GENERIC);
}
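The time dimension can be sketched the same way as the other two: a fixed-delay task that swaps out the buffer and sends whatever has accumulated. `PeriodicFlusher` is a hypothetical class, and skipping the flush when the buffer is empty is my assumption about what a sensible flush task does; the scheduling call is the JDK's `scheduleWithFixedDelay`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Toy time-based flusher (hypothetical): sends buffered items every intervalMillis. */
final class PeriodicFlusher {
    private final Object lock = new Object();
    private List<String> buffer = new ArrayList<>();
    private final Consumer<List<String>> sender;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "periodic-flusher");
                t.setDaemon(true); // like the ES scheduler thread, it must not keep the JVM alive
                return t;
            });
    private final ScheduledFuture<?> flushTask;

    PeriodicFlusher(long intervalMillis, Consumer<List<String>> sender) {
        this.sender = sender;
        // Start the periodic task last, after all other fields are set up.
        this.flushTask = scheduler.scheduleWithFixedDelay(
                this::flush, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    void add(String doc) {
        synchronized (lock) {
            buffer.add(doc);
        }
    }

    void flush() {
        List<String> toSend;
        synchronized (lock) {
            if (buffer.isEmpty()) {
                return; // assumption: nothing buffered means nothing to send
            }
            toSend = buffer;
            buffer = new ArrayList<>();
        }
        sender.accept(toSend);
    }

    void close() {
        flushTask.cancel(false);
        scheduler.shutdown();
        flush(); // send whatever is left
    }
}
```

This is why a BulkProcessor with only a flush interval configured still delivers a half-full batch: the timer does not care whether the count or size limits were reached.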
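From the code analysis above I got the conclusions I was after:
1. The 3 dimensions that trigger a flush.
2. How concurrent sending is enabled.
3. How errors are retried.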