Boolean query | Elasticsearch Guide [8.15] | Elastic
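A bool query matches documents that match boolean combinations of other queries. It maps to the Lucene BooleanQuery and is built from one or more boolean clauses, each with a typed occurrence. The occurrence types are: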
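must | The clause (query) must appear in matching documents and contributes to the score. |
filter | The clause (query) must appear in matching documents. Unlike must, however, the score of the query is ignored. Filter clauses are executed in filter context, meaning that scoring is ignored and the clauses are considered for caching. |
should | The clause (query) should appear in the matching document. |
must_not | The clause (query) must not appear in matching documents. Clauses are executed in filter context, meaning that scoring is ignored and the clauses are considered for caching. Because scoring is ignored, a score of 0 is returned for all documents. |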
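The bool query takes a "more matches is better" approach: the scores from each matching must or should clause are added together to give the final _score of each document.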
POST _search
{
  "query": {
    "bool" : {
      "must" : {
        "term" : { "user.id" : "kimchy" }
      },
      "filter": {
        "term" : { "tags" : "production" }
      },
      "must_not" : {
        "range" : {
          "age" : { "gte" : 10, "lte" : 20 }
        }
      },
      "should" : [
        { "term" : { "tags" : "env1" } },
        { "term" : { "tags" : "deployed" } }
      ],
      "minimum_should_match" : 1,
      "boost" : 1.0
    }
  }
}
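The additive scoring rule and the example request above can be modelled with a small Python sketch. This is a toy model, not how Lucene computes scores: the helpers `term`, `range_` and `bool_score` are hypothetical names, the per-clause scores (1.0 and 0.5) are made-up stand-ins for real relevance scores, and the bool-level `boost` is left out.

```python
from typing import Callable, Dict, Optional

Doc = Dict[str, object]
# A clause returns a score when the document matches and None when it does not.
Clause = Callable[[Doc], Optional[float]]


def term(field: str, value: object, score: float = 1.0) -> Clause:
    """Toy term clause; `score` is an assumed stand-in for the real relevance score."""
    def clause(doc: Doc) -> Optional[float]:
        stored = doc.get(field)
        values = stored if isinstance(stored, list) else [stored]
        return score if value in values else None
    return clause


def range_(field: str, gte: float, lte: float) -> Clause:
    """Toy range clause matching gte <= doc[field] <= lte."""
    def clause(doc: Doc) -> Optional[float]:
        stored = doc.get(field)
        if isinstance(stored, (int, float)) and gte <= stored <= lte:
            return 1.0
        return None
    return clause


def bool_score(doc, must=(), filter_=(), should=(), must_not=(),
               minimum_should_match=0) -> Optional[float]:
    """Return the toy _score of `doc`, or None if the document does not match."""
    # must_not and filter only decide whether the document matches.
    if any(c(doc) is not None for c in must_not):
        return None
    if any(c(doc) is None for c in filter_):
        return None
    # Every must clause has to match, and each one contributes its score.
    must_scores = [c(doc) for c in must]
    if any(s is None for s in must_scores):
        return None
    # should clauses are optional, but enough of them have to match.
    should_scores = [s for s in (c(doc) for c in should) if s is not None]
    if len(should_scores) < minimum_should_match:
        return None
    return sum(must_scores) + sum(should_scores)


# Same shape as the request above, with assumed per-clause scores.
QUERY = dict(
    must=[term("user.id", "kimchy", score=1.0)],
    filter_=[term("tags", "production")],
    must_not=[range_("age", gte=10, lte=20)],
    should=[term("tags", "env1", score=0.5), term("tags", "deployed", score=0.5)],
    minimum_should_match=1,
)

doc_a = {"user.id": "kimchy", "tags": ["production", "env1"], "age": 30}
doc_b = {"user.id": "kimchy", "tags": ["production", "env1", "deployed"], "age": 30}
doc_c = {"user.id": "kimchy", "tags": ["production", "env1"], "age": 15}

print(bool_score(doc_a, **QUERY))  # 1.5  (must + one should)
print(bool_score(doc_b, **QUERY))  # 2.0  (must + two should)
print(bool_score(doc_c, **QUERY))  # None (excluded by must_not)
```

In this model the filter and must_not clauses never change the number that is returned; they only decide whether a document is a hit at all.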
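Queries listed under the filter element have no effect on scoring; a query made up only of filter clauses returns a score of 0 for every hit. Scores are only affected by a query that is explicitly specified as a scoring query. The three examples below illustrate this.
First query: every returned document has a score of 0, because no scoring query has been specified.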
{
  "query": {
    "bool": {
      "filter": [
        {
          "match": {
            "scenery": "大栅栏"
          }
        }
      ]
    }
  }
}
Response:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0,
    "hits": [
      {
        "_index": "app_mark",
        "_type": "_doc",
        "_id": "20",
        "_score": 0,
        "_source": {
          "scenery": "大栅栏街道前门西大街前门西大街辅路-道路停车位",
          "description": "",
          "add_time": "2024-08-14T13:43:55+08:00"
        }
      }
    ]
  }
}
Second query: this bool query has a match_all query in its must clause, which assigns a score of 1.0 to every document.
{
  "query": {
    "bool": {
      "filter": [
        {
          "match": {
            "scenery": "大栅栏"
          }
        }
      ],
      "must" : {
        "match_all": {}
      }
    }
  }
}
Response:
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "app_mark",
        "_type": "_doc",
        "_id": "20",
        "_score": 1,
        "_source": {
          "scenery": "大栅栏街道前门西大街前门西大街辅路-道路停车位",
          "description": "",
          "add_time": "2024-08-14T13:43:55+08:00"
        }
      }
    ]
  }
}
Third query: a constant_score query behaves exactly like the second example above. The constant_score query assigns a score of 1.0 to every document matched by the filter.
{
  "query": {
    "constant_score": {
      "filter": {
        "match": {
          "scenery": "大栅栏"
        }
      }
    }
  }
}
Response:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "app_mark",
        "_type": "_doc",
        "_id": "20",
        "_score": 1,
        "_source": {
          "scenery": "大栅栏街道前门西大街前门西大街辅路-道路停车位",
          "description": "",
          "add_time": "2024-08-14T13:43:55+08:00"
        }
      }
    ]
  }
}
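The three cases above can be summarized in a short Python sketch that predicts the constant _score each query shape yields. The function name `expected_constant_score` is hypothetical, and the sketch only covers the shapes shown here (a bool query with filter and optional match_all must clauses, or a constant_score query). It assumes the default score of 1.0 unless a `boost` is set on match_all or constant_score, and it does not model a bool-level `boost`.

```python
def expected_constant_score(query: dict) -> float:
    """Predict the _score of every hit for the three query shapes shown above."""
    if "constant_score" in query:
        # constant_score uses its boost (1.0 if absent) as the score of each hit.
        return float(query["constant_score"].get("boost", 1.0))

    bool_query = query["bool"]
    if "should" in bool_query:
        raise ValueError("should clauses add relevance scores; not a constant")

    must = bool_query.get("must", [])
    clauses = must if isinstance(must, list) else [must]
    total = 0.0  # filter clauses contribute nothing, so filter-only gives 0.0
    for clause in clauses:
        if "match_all" not in clause:
            raise ValueError("score depends on relevance; not a constant")
        total += float(clause["match_all"].get("boost", 1.0))
    return total


scenery_filter = {"match": {"scenery": "大栅栏"}}

filter_only = {"bool": {"filter": [scenery_filter]}}
filter_and_match_all = {"bool": {"filter": [scenery_filter], "must": {"match_all": {}}}}
constant = {"constant_score": {"filter": scenery_filter}}

print(expected_constant_score(filter_only))           # 0.0
print(expected_constant_score(filter_and_match_all))  # 1.0
print(expected_constant_score(constant))              # 1.0
```

The predicted values agree with the max_score fields in the three responses above (0, 1 and 1).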
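Each query accepts a _name in its top-level definition. You can use named queries to track which queries matched each returned document. If named queries are used, every hit in the response includes a matched_queries property.
For example, the query below searches for documents whose scenery field contains "解放" or whose description field contains "测试". A hit that matches the first clause has first in matched_queries, a hit that matches the second clause has second, and a hit that matches both has both first and second.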
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "scenery": {
              "query": "解放",
              "_name": "first"
            }
          }
        },
        {
          "match": {
            "description": {
              "query": "测试",
              "_name": "second"
            }
          }
        }
      ]
    }
  }
}
Example response:
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1.0880616,
    "hits": [
      {
        "_index": "app_mark",
        "_type": "_doc",
        "_id": "16",
        "_score": 1.0880616,
        "_source": {
          "scenery": "解放街道紫禁城(天元金都店)天元金都",
          "description": "测试描述1",
          "add_time": "2024-08-13T12:01:35+08:00"
        },
        "matched_queries": [
          "first",
          "second"
        ]
      },
      {
        "_index": "app_mark",
        "_type": "_doc",
        "_id": "17",
        "_score": 0.7392724,
        "_source": {
          "scenery": "解放街道紫禁城(天元金都店)天元金都",
          "description": "",
          "add_time": "2024-08-13T13:25:22+08:00"
        },
        "matched_queries": [
          "first"
        ]
      }
    ]
  }
}
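On the client side, the matched_queries property can be used to group hits by the named query they matched. The following Python sketch works on an already parsed response; the function name `ids_by_named_query` is hypothetical, and the sample `response` is a trimmed copy of the example response above that keeps only the fields the function reads.

```python
from collections import defaultdict
from typing import Dict, List


def ids_by_named_query(response: dict) -> Dict[str, List[str]]:
    """Map each query _name to the _id of every hit that matched it."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for hit in response["hits"]["hits"]:
        # Hits that matched no named query may have no matched_queries key.
        for name in hit.get("matched_queries", []):
            grouped[name].append(hit["_id"])
    return dict(grouped)


# Trimmed copy of the example response above.
response = {
    "hits": {
        "hits": [
            {"_id": "16", "matched_queries": ["first", "second"]},
            {"_id": "17", "matched_queries": ["first"]},
        ]
    }
}

print(ids_by_named_query(response))
# {'first': ['16', '17'], 'second': ['16']}
```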