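Background: we use Elasticsearch (ES) as our log search engine. The business logs collected from customers are very large, so a single query often returns many pages of results. Our web UI returns 150 records per page, and customers get an error once they page past page 66.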
Preface
I. Introduction to search after
II. Experiment
1. Generate 20 records
2. Default query
3. Search after query
4. Second search after query
Summary
Preface
The error message is as follows:
I. Introduction to search after
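This article looks at using search after for these queries. By default we page through results with from + size, which returns the data in batches of a given size; Elasticsearch caps from + size to keep a request for too many results from consuming excessive memory and CPU.
Paginating results with from and size works, but the cost becomes prohibitive once pagination gets deep. The default limit of 10000 (index.max_result_window) is a safeguard, because the heap memory and time a search request consumes are proportional to from + size. The scroll API is recommended for efficient deep scrolling, but scroll contexts are costly and it is not recommended for real-time user requests. Search after circumvents this problem by providing a live cursor: the idea is to use the results of the previous page to help retrieve the next page.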
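To see why the error appears right after page 66, the from + size arithmetic can be worked through in a few lines. This is only a sketch: the helper names `from_size_for_page` and `first_failing_page` are made up for illustration, and it assumes pages are numbered from 1 and that the index keeps the default window of 10000.

```python
from typing import Tuple


def from_size_for_page(page: int, page_size: int) -> Tuple[int, int]:
    """Return the (from, size) pair a from + size request needs for a 1-based page."""
    return (page - 1) * page_size, page_size


def first_failing_page(page_size: int, max_result_window: int = 10_000) -> int:
    """Return the first 1-based page whose from + size exceeds the window."""
    # Page p needs from + size = p * page_size, and the request is rejected
    # as soon as that sum is larger than max_result_window.
    return max_result_window // page_size + 1


if __name__ == "__main__":
    offset, size = from_size_for_page(66, 150)
    print(offset + size)            # 9900, still inside the window
    offset, size = from_size_for_page(67, 150)
    print(offset + size)            # 10050, over the window
    print(first_failing_page(150))  # 67
```

With 150 records per page, page 66 ends at record 9900 and page 67 would need from + size = 10050, which matches the behaviour customers reported.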
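II. Experiment
1. Generate 20 records
# Send 20 test messages, one per second, to the log input over UDP.
# Assumes util-linux logger (-n server, -d UDP, -P port) as the sender.
for i in $(seq 1 20)
do
    logger -n 192.168.113.195 -d -P 54000 "zhongguoren $i"
    sleep 1
done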
2. Default query
GET index_1/_search
{
  "query": {
    "match": {
      "message": "zhongguoren"
    }
  },
  "sort": [
    {
      "timestamp": "asc"
    }
  ]
}
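This query returns the 20 records. It is only used to confirm the ingestion time and the number of records: they are currently sorted from 1 to 20, and no timestamp is duplicated.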
3. Search after query
GET index_1/_search
{
  "size": 10,
  "query": {
    "match": {
      "message": "zhongguoren"
    }
  },
  "sort": [
    {
      "timestamp": "asc"
    },
    {
      "_id": "asc"
    }
  ]
}
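First, query the first 10 records. The difference from the default query above is that the results are sorted by timestamp and _id together. Each hit in the response then carries a sort field; for the last (tenth) hit its value is
"sort": [
  1680770417895,
  "a894bf71-d456-11ed-aa52-000c29f6b211"
]
The full response is: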
{
  "took": 223,
  "timed_out": false,
  "_shards": {
    "total": 4,
    "successful": 4,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 20,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "2df06a80-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 35834,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVC899ZZXDJXF453F0RVXJ",
          "message": "<5>Apr 6 16:36:52 root: zhongguoren 1",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:36:52.135"
        },
        "sort": [
          1680770212135,
          "2df06a80-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "5244ccf0-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 52885,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVE3XCWZK71D74YVF58PZ2",
          "message": "<5>Apr 6 16:37:53 root: zhongguoren 2",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:37:53.086"
        },
        "sort": [
          1680770273086,
          "5244ccf0-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "5e46d610-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 57417,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVEQFJP3CN17A2NG41YYHY",
          "message": "<5>Apr 6 16:38:13 root: zhongguoren 3",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:38:13.232"
        },
        "sort": [
          1680770293232,
          "5e46d610-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "644eb870-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 52483,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVF1BQNJZ48GW39G48DQNQ",
          "message": "<5>Apr 6 16:38:23 root: zhongguoren 4",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:38:23.350"
        },
        "sort": [
          1680770303350,
          "644eb870-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "a562e070-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 45404,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVJC1YT9SPVWG81CJGBMXG",
          "message": "<5>Apr 6 16:40:12 root: zhongguoren 5",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:40:12.534"
        },
        "sort": [
          1680770412534,
          "a562e070-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "a609f5e0-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 33127,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVJD1Y45HAS4N7ZN2YXWS0",
          "message": "<5>Apr 6 16:40:13 root: zhongguoren 6",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:40:13.629"
        },
        "sort": [
          1680770413629,
          "a609f5e0-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "a6a328a0-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 41301,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVJE1A0ER5S5BZ8DJKW2A6",
          "message": "<5>Apr 6 16:40:14 root: zhongguoren 7",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:40:14.633"
        },
        "sort": [
          1680770414633,
          "a6a328a0-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "a73c3451-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 36916,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVJF0PXMNPGKVZ7YEJ1JNY",
          "message": "<5>Apr 6 16:40:15 root: zhongguoren 8",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:40:15.637"
        },
        "sort": [
          1680770415637,
          "a73c3451-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "a7dfc750-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 42778,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVJG26BRZG09JSSA96S329",
          "message": "<5>Apr 6 16:40:16 root: zhongguoren 9",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:40:16.708"
        },
        "sort": [
          1680770416708,
          "a7dfc750-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "a894bf71-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 141,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 53090,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVJH78H4GCQ44KNZQ1MQ0H",
          "message": "<5>Apr 6 16:40:17 root: zhongguoren 10",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:40:17.895"
        },
        "sort": [
          1680770417895,
          "a894bf71-d456-11ed-aa52-000c29f6b211"
        ]
      }
    ]
  }
}
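The first element of each sort array appears to be the hit's timestamp expressed as milliseconds since the Unix epoch, and the second is its _id. A quick way to check this against the response is to convert the number back to a date. The helper name `sort_millis_to_timestamp` is hypothetical, and the sketch assumes the timestamp field is stored in UTC, which is consistent with the 8-hour offset from the local time in each message.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch in UTC; adding an integer number of milliseconds avoids float rounding.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def sort_millis_to_timestamp(millis: int) -> str:
    """Format epoch milliseconds the same way the timestamp field is displayed."""
    moment = EPOCH + timedelta(milliseconds=millis)
    # %f yields six digits of microseconds; drop the last three to keep milliseconds.
    return moment.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]


if __name__ == "__main__":
    print(sort_millis_to_timestamp(1680770212135))  # 2023-04-06 08:36:52.135
    print(sort_millis_to_timestamp(1680770417895))  # 2023-04-06 08:40:17.895
```

Both values match the timestamp fields of records 1 and 10 in the response.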
4. Second search after query
The sort values of the last hit can be passed as search_after in the next request, so that the query resumes from that point.
GET index_1/_search
{
  "size": 10,
  "query": {
    "match": {
      "message": "zhongguoren"
    }
  },
  "search_after": [
    1680770417895,
    "a894bf71-d456-11ed-aa52-000c29f6b211"
  ],
  "sort": [
    {
      "timestamp": "asc"
    },
    {
      "_id": "asc"
    }
  ]
}
The response contains records 11 to 20.
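The step from one page to the next can be automated by copying the sort values of the last hit into the next request. The following is a minimal sketch that works on plain dictionaries shaped like the request and response shown earlier; `next_page_body` is a hypothetical helper, not part of any Elasticsearch client.

```python
import copy
from typing import Any, Dict, Optional


def next_page_body(body: Dict[str, Any], response: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Build the request body for the page after `response`, or None when no hits are left."""
    hits = response["hits"]["hits"]
    if not hits:
        return None
    next_body = copy.deepcopy(body)  # leave the caller's request untouched
    # A search_after request should not also carry a non-zero "from" offset.
    next_body.pop("from", None)
    # The cursor is the sort array of the last hit on the current page.
    next_body["search_after"] = hits[-1]["sort"]
    return next_body


if __name__ == "__main__":
    first = {
        "size": 10,
        "query": {"match": {"message": "zhongguoren"}},
        "sort": [{"timestamp": "asc"}, {"_id": "asc"}],
    }
    page = {"hits": {"hits": [
        {"_id": "a894bf71-d456-11ed-aa52-000c29f6b211",
         "sort": [1680770417895, "a894bf71-d456-11ed-aa52-000c29f6b211"]},
    ]}}
    print(next_page_body(first, page)["search_after"])
```

The sort clause has to stay identical between requests, which is why the helper copies the previous body rather than rebuilding it.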
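Summary
I also tried querying directly from the sort values of record 16, and that likewise returned records 17 to 20. So we can adapt our implementation as follows: keep the default max_result_window of 10000, which still serves queries and searches over small result sets, and once a query goes beyond page 66, make sure that every page turn or page jump can obtain the _id and timestamp of the last record before the requested page. That resolves the ES deep pagination problem.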
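One way this hybrid approach might look in code is sketched below. It is only an illustration of the idea, not our actual implementation: `plan_page_request` and its `cursor` parameter are hypothetical, pages are assumed to be numbered from 1, and the sort clause is the timestamp plus _id pair used in the experiment.

```python
from typing import Any, Dict, Optional, Sequence


def plan_page_request(
    query: Dict[str, Any],
    page: int,
    page_size: int,
    cursor: Optional[Sequence[Any]] = None,
    max_result_window: int = 10_000,
) -> Dict[str, Any]:
    """Choose from + size inside the window and search_after beyond it."""
    body: Dict[str, Any] = {
        "size": page_size,
        "query": query,
        "sort": [{"timestamp": "asc"}, {"_id": "asc"}],
    }
    if page * page_size <= max_result_window:
        # Shallow page: plain from + size still fits inside the window.
        body["from"] = (page - 1) * page_size
        return body
    if cursor is None:
        # Deep page: the sort values of the record just before it are required.
        raise ValueError("a cursor is required beyond max_result_window")
    body["search_after"] = list(cursor)
    return body


if __name__ == "__main__":
    q = {"match": {"message": "zhongguoren"}}
    print(plan_page_request(q, 66, 150)["from"])  # 9750
    deep = plan_page_request(q, 67, 150, cursor=[1680770417895, "a894bf71-d456-11ed-aa52-000c29f6b211"])
    print(deep["search_after"])
```

The sketch covers turning to the next page, where the cursor comes from the last record of the page just shown. Jumping several pages ahead would still need the sort values of the record immediately before the target page, and how to obtain them is left open here.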
Tomorrow I will talk through the implementation logic with the product team.