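[elasticsearch] Thoughts on working around Elasticsearch's max_result_window limit

Background: we use ES as our log search engine. Customers collect very large volumes of business logs, so a single query often returns many pages. Our web UI returns 150 records per page, and once a customer pages past page 66 the query fails with an error.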

Contents

Preface

I. Introduction to search_after

II. Experiment

1. Generate 20 test records

2. Default query

3. search_after query

4. Second search_after query

Summary


Preface

The error message is as follows:

  • Elasticsearch limits the search result to 10000 messages. With a page size of 150 messages, you can use the first 66 pages. Unable to perform search query: Elasticsearch exception [type=illegal_argument_exception, reason=Result window is too large, from + size must be less than or equal to: [10000] but was [34050]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting.].
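The numbers in this message can be checked with a little arithmetic. The sketch below assumes the UI requests page p with from = (p - 1) * 150 and size = 150, which is an assumption about our web layer rather than something the message states. The helper name is illustrative.

```python
MAX_RESULT_WINDOW = 10_000  # default index.max_result_window
PAGE_SIZE = 150             # page size of the web UI


def window_end(page: int, page_size: int = PAGE_SIZE) -> int:
    """Return from + size for a 1-based page number."""
    offset = (page - 1) * page_size  # assumed "from" for this page
    return offset + page_size


# Last page that still fits inside the window.
last_ok_page = MAX_RESULT_WINDOW // PAGE_SIZE

print(last_ok_page)     # 66
print(window_end(66))   # 9900, allowed
print(window_end(67))   # 10050, rejected
print(window_end(227))  # 34050, the value reported in the error
```

Under that assumption, the customer in this case had jumped to page 227.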

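I. Introduction to search_after

The fix explored here is to query with search_after. By default a search pages with from + size, and Elasticsearch rejects any request whose from + size exceeds index.max_result_window. The limit exists to keep a request for too many results from consuming excessive memory and CPU.

Paginating results with from and size works, but the cost becomes prohibitive once you reach deep pages. The default limit of 10000 is a safeguard, because a search request takes heap memory and time proportional to from + size. The scroll API is the recommended way to scroll through deep result sets efficiently, but scroll contexts are expensive, so it is not recommended for real-time user requests. search_after avoids this problem by providing a live cursor: the idea is to use the results of the previous page to help retrieve the next page.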
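To make the cost concrete: with from + size paging, each shard has to collect its own top from + size hits, and the coordinating node merges all of them before discarding everything except the requested page. A rough count, assuming an index with 4 shards and the 150-record page size from our UI (the function name is illustrative):

```python
def entries_merged(page: int, page_size: int, shards: int) -> int:
    """Upper bound on sort entries the coordinating node merges for one page."""
    per_shard = page * page_size  # each shard returns up to from + size entries
    return per_shard * shards


# Page 227 at 150 records per page across 4 shards.
print(entries_merged(227, 150, 4))  # 136200 entries to return 150 records

# With search_after, each shard only returns the next page_size entries.
print(150 * 4)  # 600
```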

II. Experiment

1. Generate 20 test records

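# Send 20 test messages, one per second, as UDP syslog datagrams to port 54000
# (assumes the util-linux logger command is available)
for i in $(seq 1 20)
do
    logger -n 192.168.113.195 -d -P 54000 "zhongguoren $i"
    sleep 1
done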

2. Default query

GET index_1/_search
{
  "query": {
    "match": {
      "message": "zhongguoren"
    }
  },
  "sort": [
    {
      "timestamp": "asc"
    }
  ]
}

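The default query matches all 20 records. Its only purpose is to confirm the arrival time and count of the messages: they sort in order from 1 to 20, and no two share a timestamp.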

3. search_after query

GET index_1/_search
{
  "size": 10,
  "query": {
    "match": {
      "message": "zhongguoren"
    }
  },
  "sort": [
    {
      "timestamp": "asc"
    },
    {
      "_id": "asc"
    }
  ]
}

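First fetch the first 10 records. The difference from the default query is that the sort now uses both timestamp and _id, so ties on timestamp are broken deterministically. Every hit in the response carries a sort field. For the last hit its value is:

"sort": [
  1680770417895,
  "a894bf71-d456-11ed-aa52-000c29f6b211"
]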
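The first sort value is the hit's timestamp expressed as milliseconds since the Unix epoch, which is how Elasticsearch returns sort values for date fields. A quick check in Python, assuming the timestamp field is stored in UTC (the function name is illustrative):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def sort_value_to_timestamp(millis: int) -> str:
    """Format an epoch-millisecond sort value like the timestamp field."""
    moment = EPOCH + timedelta(milliseconds=millis)
    # %f yields microseconds; drop the last three digits to keep milliseconds.
    return moment.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]


print(sort_value_to_timestamp(1680770417895))  # 2023-04-06 08:40:17.895
```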

The full response is:

{
  "took": 223,
  "timed_out": false,
  "_shards": {
    "total": 4,
    "successful": 4,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 20,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "2df06a80-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 35834,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVC899ZZXDJXF453F0RVXJ",
          "message": "<5>Apr 6 16:36:52 root: zhongguoren 1",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:36:52.135"
        },
        "sort": [
          1680770212135,
          "2df06a80-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "5244ccf0-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 52885,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVE3XCWZK71D74YVF58PZ2",
          "message": "<5>Apr 6 16:37:53 root: zhongguoren 2",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:37:53.086"
        },
        "sort": [
          1680770273086,
          "5244ccf0-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "5e46d610-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 57417,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVEQFJP3CN17A2NG41YYHY",
          "message": "<5>Apr 6 16:38:13 root: zhongguoren 3",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:38:13.232"
        },
        "sort": [
          1680770293232,
          "5e46d610-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "644eb870-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 52483,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVF1BQNJZ48GW39G48DQNQ",
          "message": "<5>Apr 6 16:38:23 root: zhongguoren 4",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:38:23.350"
        },
        "sort": [
          1680770303350,
          "644eb870-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "a562e070-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 45404,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVJC1YT9SPVWG81CJGBMXG",
          "message": "<5>Apr 6 16:40:12 root: zhongguoren 5",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:40:12.534"
        },
        "sort": [
          1680770412534,
          "a562e070-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "a609f5e0-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 33127,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVJD1Y45HAS4N7ZN2YXWS0",
          "message": "<5>Apr 6 16:40:13 root: zhongguoren 6",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:40:13.629"
        },
        "sort": [
          1680770413629,
          "a609f5e0-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "a6a328a0-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 41301,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVJE1A0ER5S5BZ8DJKW2A6",
          "message": "<5>Apr 6 16:40:14 root: zhongguoren 7",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:40:14.633"
        },
        "sort": [
          1680770414633,
          "a6a328a0-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "a73c3451-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 36916,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVJF0PXMNPGKVZ7YEJ1JNY",
          "message": "<5>Apr 6 16:40:15 root: zhongguoren 8",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:40:15.637"
        },
        "sort": [
          1680770415637,
          "a73c3451-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "a7dfc750-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 140,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 42778,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVJG26BRZG09JSSA96S329",
          "message": "<5>Apr 6 16:40:16 root: zhongguoren 9",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:40:16.708"
        },
        "sort": [
          1680770416708,
          "a7dfc750-d456-11ed-aa52-000c29f6b211"
        ]
      },
      {
        "_index": "12_149",
        "_type": "_doc",
        "_id": "a894bf71-d456-11ed-aa52-000c29f6b211",
        "_score": null,
        "_source": {
          "elap_accounted_message_size": 141,
          "elap_source_input": "636c90d3246ca975ec999b24",
          "streams": [
            "6969696969697379736c6f67",
            "6969696969727379736c6f67"
          ],
          "elap_remote_port": 53090,
          "elap_source_node": "5bb5d963-7fdd-4e2d-bfdf-a5069bb07097",
          "source": "192.168.113.195",
          "elap_message_id": "01GXAVJH78H4GCQ44KNZQ1MQ0H",
          "message": "<5>Apr 6 16:40:17 root: zhongguoren 10",
          "elap_remote_ip": "192.168.113.195",
          "timestamp": "2023-04-06 08:40:17.895"
        },
        "sort": [
          1680770417895,
          "a894bf71-d456-11ed-aa52-000c29f6b211"
        ]
      }
    ]
  }
}

4. Second search_after query

The sort values of a hit can be passed as search_after in a follow-up query, so that the search resumes from that point.

GET index_1/_search
{
  "size": 10,
  "query": {
    "match": {
      "message": "zhongguoren"
    }
  },
  "search_after": [
    1680770417895,
    "a894bf71-d456-11ed-aa52-000c29f6b211"
  ],
  "sort": [
    {
      "timestamp": "asc"
    },
    {
      "_id": "asc"
    }
  ]
}

The response contains records 11 through 20.
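The two manual steps above generalize to a loop: run the query, take the sort values of the last hit, and feed them back as search_after until a page comes back empty. The sketch below is one way to write that loop, not a definitive implementation. `iter_hits` and `fake_search` are hypothetical names, and `fake_search` is an in-memory stand-in for Elasticsearch so the example runs without a cluster. In real use, `search` would wrap whatever client call you already have.

```python
from typing import Callable, Iterator, List


def iter_hits(search: Callable[[dict], dict], body: dict) -> Iterator[dict]:
    """Yield every hit by following the sort values page by page.

    `search` is any callable that takes a request body and returns a
    response shaped like an Elasticsearch search response (hypothetical).
    """
    request = dict(body)  # shallow copy so the caller's body is untouched
    while True:
        hits = search(request)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # The last hit's sort values become the cursor for the next page.
        request["search_after"] = hits[-1]["sort"]


def fake_search(docs: List[dict]) -> Callable[[dict], dict]:
    """In-memory stand-in for Elasticsearch, sorted by (timestamp, _id)."""
    ordered = sorted(docs, key=lambda d: (d["timestamp"], d["_id"]))

    def search(body: dict) -> dict:
        after = body.get("search_after")
        rows = [d for d in ordered
                if after is None or [d["timestamp"], d["_id"]] > after]
        page = rows[: body["size"]]
        return {"hits": {"hits": [
            {"_id": d["_id"], "sort": [d["timestamp"], d["_id"]]} for d in page
        ]}}

    return search


docs = [{"_id": f"id-{i:02d}", "timestamp": 1000 + i} for i in range(1, 21)]
body = {"size": 10, "sort": [{"timestamp": "asc"}, {"_id": "asc"}]}
ids = [hit["_id"] for hit in iter_hits(fake_search(docs), body)]
print(len(ids), ids[0], ids[-1])  # 20 id-01 id-20
```

The loop never sends a from value, so the size of each request stays constant no matter how deep the cursor goes.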


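Summary

I also tried querying directly from the sort value of record 16, and that returned records 17 through 20 as expected. So we can adapt the product as follows: keep the default max_result_window of 10000, which still serves small queries and searches well, and when the user pages or jumps past page 66, obtain the _id and timestamp of the last record before the target page, even when it lies beyond the 10000 mark, and pass them to search_after. That resolves the deep pagination problem in ES.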
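One possible shape for that switch is sketched below. It is an illustration of the idea only, not a decided design: the function name, the 1-based page numbering, and the rule for choosing between the two modes are all assumptions. The caller is responsible for supplying the sort values of the record just before the requested page.

```python
from typing import List, Optional

MAX_RESULT_WINDOW = 10_000  # default index.max_result_window


def build_page_request(page: int, page_size: int,
                       last_sort: Optional[List] = None) -> dict:
    """Build the paging part of a search body for a 1-based page number.

    Shallow pages use from + size. Deep pages need `last_sort`, the sort
    values of the record just before the requested page.
    """
    request = {
        "size": page_size,
        "sort": [{"timestamp": "asc"}, {"_id": "asc"}],
    }
    offset = (page - 1) * page_size
    if offset + page_size <= MAX_RESULT_WINDOW:
        request["from"] = offset
    elif last_sort is None:
        raise ValueError("deep page requires the previous record's sort values")
    else:
        request["search_after"] = last_sort
    return request


shallow = build_page_request(66, 150)
print(shallow["from"])  # 9750

deep = build_page_request(
    67, 150, [1680770417895, "a894bf71-d456-11ed-aa52-000c29f6b211"])
print("from" in deep, deep["search_after"][0])  # False 1680770417895
```

Turning to the next page is the easy case, because the current page's last hit already carries the needed sort values. Jumping to an arbitrary deep page is harder, since the sort values before that page have to come from somewhere.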

Tomorrow I'll talk through the implementation logic with the product team.
