http_code : 409
Caused by: org.elasticsearch.search.query.QueryPhaseExecutionException: Result window is too large, from + size must be less than or equal to: [20000] but was [83440000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.
at org.elasticsearch.search.DefaultSearchContext.preProcess(DefaultSearchContext.java:203) ~[elasticsearch-5.6.4.jar:5.6.4]
at org.elasticsearch.search.query.QueryPhase.preProcess(QueryPhase.java:95) ~[elasticsearch-5.6.4.jar:5.6.4]
at org.elasticsearch.search.SearchService.createContext(SearchService.java:497) ~[elasticsearch-5.6.4.jar:5.6.4]
at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:461) ~[elasticsearch-5.6.4.jar:5.6.4]
at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:257) ~[elasticsearch-5.6.4.jar:5.6.4]
at org.elasticsearch.action.search.SearchTransportService$6.messageReceived(SearchTransportService.java:343) ~[elasticsearch-5.6.4.jar:5.6.4]
at org.elasticsearch.action.search.SearchTransportService$6.messageReceived(SearchTransportService.java:340) ~[elasticsearch-5.6.4.jar:5.6.4]
at com.floragunn.searchguard.ssl.transport.SearchGuardSSLRequestHandler.messageReceivedDecorate(SearchGuardSSLRequestHandler.java:178) ~[?:?]
at com.floragunn.searchguard.transport.SearchGuardRequestHandler.messageReceivedDecorate(SearchGuardRequestHandler.java:107) ~[?:?]
at com.floragunn.searchguard.ssl.transport.SearchGuardSSLRequestHandler.messageReceived(SearchGuardSSLRequestHandler.java:92) ~[?:?]
at com.floragunn.searchguard.SearchGuardPlugin$5$1.messageReceived(SearchGuardPlugin.java:493) ~[?:?]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-5.6.4.jar:5.6.4]
source={
"from" : 83438000,
"size" : 2000,
"query" : {
"bool" : {
"must" : [
{
"range" : {
"createTime" : {
"from" : 1631807999000,
"to" : 1631894399000,
"include_lower" : true,
"include_upper" : false,
"boost" : 1.0
}
}
}
],
"disable_coord" : false,
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
"_source" : {
"includes" : [
"companyId",
"riskLabel"
],
"excludes" : [ ]
}
}}]
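When querying an index, Elasticsearch paginates results with two parameters, from and size. from sets the offset of the first hit to return; size sets the maximum number of hits to return.
from and size can be passed as request parameters or set in the search request body. from defaults to 0 and size defaults to 10.
Note that from + size cannot exceed the index.max_result_window index setting, which defaults to 10,000. On this ES cluster the limit is currently set to 20,000.
The business request, however, used from = 83438000 and size = 2000, so from + size = 83,440,000, and ES threw the exception.
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/5.1/search-request-from-size.html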
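The limit check described above can be reproduced on the client side before a request is sent. The following is a minimal sketch, not the cluster's own code: the function name check_result_window is hypothetical, and the 20,000 limit is taken from the error message in this log.

```python
# Limit reported in the error message above for this cluster
# (the Elasticsearch default for index.max_result_window is 10,000).
MAX_RESULT_WINDOW = 20000


def check_result_window(from_, size, max_result_window=MAX_RESULT_WINDOW):
    """Return (ok, window), where window = from_ + size.

    ok is False when the request would be rejected with
    "Result window is too large".
    """
    window = from_ + size
    return window <= max_result_window, window


# The failing request from the log: from = 83438000, size = 2000.
ok, window = check_result_window(83438000, 2000)
print(ok, window)  # prints: False 83440000
```

A caller could run this check first and switch to the scroll API whenever ok is False.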
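from/size pagination is also what is called shallow pagination, i.e. pagination in the simplest sense. The principle is simple: to return hits 10-20, ES fetches the first 20 hits, drops the first 10, and returns only the remaining 10. The work spent on the first 10 hits is wasted.
The deeper the page, the less efficient the query. Overall, latency grows as from grows, and the larger the data set, the more pronounced the effect.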
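The growth in cost can be illustrated with some rough arithmetic. This sketch assumes that every shard has to collect from + size hits and that the coordinating node then merges the hits from all shards. The shard count of 5 is a hypothetical value, not taken from this cluster.

```python
def hits_collected(from_, size, num_shards):
    """Return (per_shard, total) hits that must be collected for one page.

    Assumption: each shard builds its own top (from_ + size) list, and the
    coordinating node merges num_shards such lists before discarding the
    first from_ hits.
    """
    per_shard = from_ + size
    return per_shard, per_shard * num_shards


# Same page size, deeper and deeper pages (num_shards=5 is hypothetical).
for from_ in (0, 10000, 18000):
    per_shard, total = hits_collected(from_, 2000, num_shards=5)
    print("from=%d per_shard=%d merged=%d returned=2000" % (from_, per_shard, total))
```

Only 2000 hits are returned in each case, while the number of hits collected keeps growing with from.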
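scroll: a scroll (cursor) query lets us initialize a search first and then pull the results in batches. This is somewhat like a cursor in a traditional database.
A scroll query works on a point-in-time snapshot. Any change made to the index after the search is initialized is ignored. It does this by retaining the old data files, so the results look as if the view of the index at initialization time had been preserved.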
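The initialize-then-pull pattern can be sketched as follows. This is an illustration under stated assumptions, not a definitive client: build_scroll_body, scroll_all and make_fake_cluster are hypothetical names, and the three callables stand in for whatever HTTP client is used to call POST /<index>/_search?scroll=1m, POST /_search/scroll and DELETE /_search/scroll. The in-memory stand-in exists only so the sketch runs without a cluster.

```python
def build_scroll_body(start_ms, end_ms, batch_size=2000):
    """Request body for the initial scroll search.

    There is no "from" key: a scroll pulls batches of batch_size instead.
    Sorting by _doc is the cheapest order when ranking does not matter.
    """
    return {
        "size": batch_size,
        "sort": ["_doc"],
        "query": {"range": {"createTime": {"gte": start_ms, "lt": end_ms}}},
        "_source": {"includes": ["companyId", "riskLabel"]},
    }


def scroll_all(search, scroll, clear_scroll, body, keep_alive="1m"):
    """Yield every hit, one batch at a time, then release the scroll context."""
    resp = search(body, keep_alive)
    scroll_id = resp.get("_scroll_id")
    try:
        while resp["hits"]["hits"]:
            for hit in resp["hits"]["hits"]:
                yield hit
            resp = scroll(scroll_id, keep_alive)
            scroll_id = resp.get("_scroll_id", scroll_id)
    finally:
        if scroll_id is not None:
            clear_scroll(scroll_id)


def make_fake_cluster(docs):
    """In-memory stand-in for the three HTTP calls (demo only)."""
    state = {"pos": 0, "size": 0, "cleared": False}

    def page():
        batch = docs[state["pos"]:state["pos"] + state["size"]]
        state["pos"] += state["size"]
        return {"_scroll_id": "fake-id", "hits": {"hits": batch}}

    def search(body, keep_alive):
        state["size"] = body["size"]
        return page()

    def scroll(scroll_id, keep_alive):
        return page()

    def clear_scroll(scroll_id):
        state["cleared"] = True

    return search, scroll, clear_scroll, state


docs = [{"_id": str(i)} for i in range(5)]
search, scroll, clear_scroll, state = make_fake_cluster(docs)
body = build_scroll_body(1631807999000, 1631894399000, batch_size=2)
hits = list(scroll_all(search, scroll, clear_scroll, body))
print(len(hits), state["cleared"])  # prints: 5 True
```

The gte/lt bounds correspond to include_lower = true and include_upper = false in the failing request above. Clearing the scroll in the finally block releases the snapshot even when the consumer stops early.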
Reference: scroll API
http_code : 500
Caused by: java.lang.IllegalArgumentException: Fielddata is disabled on text fields by default. Set fielddata=true on [order_record_id] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.
{
"from" : 0,
"size" : 10,
"query" : {
"bool" : {
"filter" : [
{
"bool" : {
"must" : [
{
"bool" : {
"must" : [
{
"bool" : {
"must" : [
{
"script" : {
"script" : {
"source" : "1 == 1",
"lang" : "painless"
},
"boost" : 1.0
}
}
],
"disable_coord" : false,
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
{
"range" : {
"create_time" : {
"from" : 1631894400,
"to" : null,
"include_lower" : true,
"include_upper" : true,
"boost" : 1.0
}
}
},
{
"range" : {
"create_time" : {
"from" : null,
"to" : 1631980800,
"include_lower" : true,
"include_upper" : true,
"boost" : 1.0
}
}
},
{
"range" : {
"begin_time" : {
"from" : null,
"to" : 1632002400,
"include_lower" : true,
"include_upper" : true,
"boost" : 1.0
}
}
},
{
"bool" : {
"should" : [
{
"match_phrase" : {
"mis_user_id" : {
"query" : 3230,
"slop" : 0,
"boost" : 1.0
}
}
},
{
"match_phrase" : {
"mis_user_id" : {
"query" : 505150,
"slop" : 0,
"boost" : 1.0
}
}
},
{
"match_phrase" : {
"mis_user_id" : {
"query" : 505150,
"slop" : 0,
"boost" : 1.0
}
}
},
{
"match_phrase" : {
"mis_user_id" : {
"query" : 505153,
"slop" : 0,
"boost" : 1.0
}
}
},
{
"match_phrase" : {
"mis_user_id" : {
"query" : 505154,
"slop" : 0,
"boost" : 1.0
}
}
},
{
"match_phrase" : {
"mis_user_id" : {
"query" : 505155,
"slop" : 0,
"boost" : 1.0
}
}
},
{
"match_phrase" : {
"mis_user_id" : {
"query" : 505156,
"slop" : 0,
"boost" : 1.0
}
}
},
{
"match_phrase" : {
"mis_user_id" : {
"query" : 2663,
"slop" : 0,
"boost" : 1.0
}
}
},
{
"match_phrase" : {
"mis_user_id" : {
"query" : 3216,
"slop" : 0,
"boost" : 1.0
}
}
},
{
"match_phrase" : {
"mis_user_id" : {
"query" : 3217,
"slop" : 0,
"boost" : 1.0
}
}
}
],
"disable_coord" : false,
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
{
"bool" : {
"should" : [
{
"match_phrase" : {
"type" : {
"query" : 1,
"slop" : 0,
"boost" : 1.0
}
}
},
{
"match_phrase" : {
"type" : {
"query" : 12,
"slop" : 0,
"boost" : 1.0
}
}
}
],
"disable_coord" : false,
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
],
"disable_coord" : false,
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
],
"disable_coord" : false,
"adjust_pure_negative" : true,
"boost" : 1.0
}
}
],
"disable_coord" : false,
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
"_source" : {
"includes" : [
"id",
"begin_time",
"end_time",
"mobile",
"order_record_id",
"product_apply_order_id",
"category_id",
"product_id",
"content",
"status",
"zifang",
"product_detail_id",
"user_type",
"level1_type",
"level2_type",
"level3_type",
"subteams",
"ext_type",
"mis_user_id",
"name",
"call_system",
"hotline",
"record_id",
"create_time",
"satisfaction_score",
"transfer_queue",
"ext_id",
"call_start_time",
"end_reason"
],
"excludes" : [ ]
},
"sort" : [
{
"order_record_id" : {
"order" : "desc"
}
}
]
}}]
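The order_record_id field is mapped as text. Fielddata is disabled by default on text fields, so Elasticsearch cannot sort or aggregate on a text field unless fielddata is explicitly enabled. As the error message suggests, use a keyword field for sorting instead.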
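One way to follow the error message's advice is to sort on a keyword sub-field. The sketch below assumes that the index mapping defines (or is changed to define) a sub-field named order_record_id.keyword; that name and the helper sort_on_keyword are hypothetical and must be checked against the real mapping.

```python
import copy

# Mapping fragment for the field (assumption: the real mapping may differ).
# The text field stays searchable; the keyword sub-field is used for sorting.
ORDER_RECORD_ID_MAPPING = {
    "order_record_id": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    }
}


def sort_on_keyword(body, text_fields, suffix=".keyword"):
    """Return a copy of body whose sort clauses use the keyword sub-field."""
    fixed = copy.deepcopy(body)
    if "sort" not in fixed:
        return fixed
    new_sort = []
    for clause in fixed["sort"]:
        if isinstance(clause, str) and clause in text_fields:
            clause = clause + suffix
        elif isinstance(clause, dict):
            clause = {
                (field + suffix if field in text_fields else field): opts
                for field, opts in clause.items()
            }
        new_sort.append(clause)
    fixed["sort"] = new_sort
    return fixed


failing = {"from": 0, "size": 10, "sort": [{"order_record_id": {"order": "desc"}}]}
fixed = sort_on_keyword(failing, {"order_record_id"})
print(fixed["sort"])  # prints: [{'order_record_id.keyword': {'order': 'desc'}}]
```

A keyword field sorts lexicographically, so "10" sorts before "9". If the ids are numeric, a numeric field type may be the better fit.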
HTTP/1.1 401 Unauthorized
[2021-09-18T12:18:06,243][WARN ][o.e.t.LoggingTaskListener] 75886912359 failed with exception
org.elasticsearch.ElasticsearchSecurityException: no permissions for [indices:admin/create] and User [name=sentiment_rw, roles=[]]
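The sentiment_rw user has no roles (roles=[]) and therefore lacks the indices:admin/create permission needed to create the index it is writing to.
Business teams should check their own permissions before writing to an index.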
http_code : 400
Caused by: org.elasticsearch.index.query.QueryShardException: failed to create query
Caused by: org.elasticsearch.common.io.stream.NotSerializableExceptionWrapper: too_many_clauses: maxClauseCount is set to 1024
The query is extremely long; only a small excerpt is shown here.
{
"bool" : {
"must" : [
{
"range" : {
"createTime" : {
"from" : 1631635199000,
"to" : 1631721599000,
"include_lower" : true,
"include_upper" : false,
"boost" : 1.0
}
}
}
],
"must_not" : [
{
"term" : {
"companyId" : {
"value" : "5433f1a77a775684071d5034208a9b21",
"boost" : 1.0
}
}
},
....
]
}
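The query statement is too long: it exceeds Lucene's maximum clause count (1024).
Standardize queries, and test them before going live wherever possible.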
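One possible way to stay under the clause limit is to merge the many must_not term clauses on the same field into a single terms clause. The sketch below rests on two assumptions that should be verified on the target cluster: that a terms query is not expanded into one boolean clause per value on this ES version, and that only the must_not list needs rewriting. The helper name collapse_must_not_terms and the two extra company ids are hypothetical.

```python
import copy


def collapse_must_not_terms(bool_query):
    """Merge same-field term clauses under must_not into terms clauses.

    must_not [term a, term b] excludes the same documents as
    must_not [terms [a, b]]. This is NOT valid for "must" clauses,
    so only must_not is rewritten.
    """
    fixed = copy.deepcopy(bool_query)
    grouped = {}  # field -> list of values, in first-seen order
    others = []   # clauses that are not simple term queries
    for clause in fixed["bool"].get("must_not", []):
        term = clause.get("term") if len(clause) == 1 else None
        if term is not None and len(term) == 1:
            field, spec = next(iter(term.items()))
            value = spec["value"] if isinstance(spec, dict) else spec
            grouped.setdefault(field, []).append(value)
        else:
            others.append(clause)
    merged = [{"terms": {field: values}} for field, values in grouped.items()]
    if merged or others:
        fixed["bool"]["must_not"] = merged + others
    return fixed


query = {
    "bool": {
        "must": [{"range": {"createTime": {"gte": 1631635199000, "lt": 1631721599000}}}],
        "must_not": [
            {"term": {"companyId": {"value": "5433f1a77a775684071d5034208a9b21", "boost": 1.0}}},
            {"term": {"companyId": {"value": "hypothetical-company-id-2", "boost": 1.0}}},
            {"term": {"companyId": {"value": "hypothetical-company-id-3", "boost": 1.0}}},
        ],
    }
}
fixed = collapse_must_not_terms(query)
print(len(query["bool"]["must_not"]), len(fixed["bool"]["must_not"]))  # prints: 3 1
```

Raising the indices.query.bool.max_clause_count node setting is another option, at the price of heavier queries; merging the clauses keeps the query itself small.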