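The bool query is the default way to combine leaf queries or compound query clauses, using must, should, must_not, or filter clauses. The scores of the matching must and should clauses are added together to produce the final score, while must_not and filter clauses run in filter context.
Under the hood, the bool query is implemented by Lucene's BooleanQuery class. It consists of one or more boolean clauses, each declared with a specific type: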
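No. | Type | Description |
---|---|---|
1 | must | The clause must appear in matching documents and contributes to the relevance score |
2 | filter | The clause must appear in matching documents and runs in filter context. Unlike must, its relevance score is ignored and its result can be cached |
3 | should | The clause should appear in matching documents |
4 | must_not | The clause must not appear in matching documents. It runs in filter context, so no relevance score is computed (the score is 0) and its result can be cached |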
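A document that matches more must or should clauses gets a higher score: the final relevance score (_score) is the sum of the scores of the must and should clauses it matches.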
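The additive rule above can be sketched in a few lines of Python. This is only a toy model of the arithmetic, not how Lucene actually implements BooleanQuery; the function name `bool_score` and the per-clause scores are hypothetical and introduced purely for illustration.

```python
from typing import Optional, Sequence


def bool_score(
    must_scores: Sequence[Optional[float]],
    should_scores: Sequence[Optional[float]],
    filter_matches: Sequence[bool] = (),
    must_not_matches: Sequence[bool] = (),
    minimum_should_match: int = 0,
) -> Optional[float]:
    """Toy model of bool scoring.

    A clause score of None means the clause did not match.
    Returns the combined score, or None if the document is not a hit.
    """
    # Every must clause has to match.
    if any(score is None for score in must_scores):
        return None
    # Every filter clause has to match, but contributes no score.
    if not all(filter_matches):
        return None
    # No must_not clause may match.
    if any(must_not_matches):
        return None
    matched_should = [s for s in should_scores if s is not None]
    if len(matched_should) < minimum_should_match:
        return None
    # Only must and should clauses contribute to the final score.
    return sum(must_scores) + sum(matched_should)
```

For instance, `bool_score([1.5], [2.0, None], [True], [False], 1)` adds the must score and the single matching should score, while filter and must_not only decide whether the document is a hit at all.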
// Request
POST bank/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"gender.keyword": "M"
}
}
],
"filter": {
"term": {
"state.keyword": "MO"
}
},
"must_not": [
{
"range": {
"age": {
"gte": 20,
"lte": 30
}
}
}
],
"should": [
{
"match": {
"email": "comcubine.com"
}
},
{
"match": {
"address": "Avenue"
}
}
],
"minimum_should_match": 1,
"boost": 1
}
}
}
// Response
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 7.1838775,
"hits" : [
{
"_index" : "bank",
"_type" : "_doc",
"_id" : "58",
"_score" : 7.1838775,
"_source" : {
"account_number" : 58,
"balance" : 31697,
"firstname" : "Marva",
"lastname" : "Cannon",
"age" : 40,
"gender" : "M",
"address" : "993 Highland Place",
"employer" : "Comcubine",
"email" : "[email protected]",
"city" : "Orviston",
"state" : "MO"
}
},
{
"_index" : "bank",
"_type" : "_doc",
"_id" : "286",
"_score" : 2.2192826,
"_source" : {
"account_number" : 286,
"balance" : 39063,
"firstname" : "Rosetta",
"lastname" : "Turner",
"age" : 35,
"gender" : "M",
"address" : "169 Jefferson Avenue",
"employer" : "Spacewax",
"email" : "[email protected]",
"city" : "Stewart",
"state" : "MO"
}
}
]
}
}
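The minimum_should_match parameter
The minimum_should_match parameter specifies the number or percentage of should clauses that a returned document must match. If a bool query contains at least one should clause and no must or filter clauses, minimum_should_match defaults to 1; otherwise it defaults to 0.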
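The default rule can be written down as a small Python helper. This is a sketch of the rule as stated above, not code from Elasticsearch; the function name `default_minimum_should_match` and its parameters are hypothetical.

```python
def default_minimum_should_match(num_should: int, num_must: int, num_filter: int) -> int:
    """Return the default minimum_should_match for a bool query,
    given how many should, must and filter clauses it contains."""
    if num_should >= 1 and num_must == 0 and num_filter == 0:
        # Only should clauses: at least one of them has to match.
        return 1
    # A must or filter clause already restricts the hits,
    # so should clauses become optional and only affect the score.
    return 0
```

This is why the first example explicitly sets `"minimum_should_match": 1`: with must and filter clauses present, the should clauses would otherwise be optional.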
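Clauses placed under filter have no effect on scoring. If the query contains no scoring clause at all, every hit is returned with a _score of 0.
The following three examples all return documents whose state field is WA.
1) Every document gets a score of 0, because no scoring query is specified.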
// Request
GET bank/_search
{
"size": 2,
"query": {
"bool": {
"filter": {
"term": {
"state.keyword": "WA"
}
}
}
}
}
// Response
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 19,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : "bank",
"_type" : "_doc",
"_id" : "20",
"_score" : 0.0,
"_source" : {
"account_number" : 20,
"balance" : 16418,
"firstname" : "Elinor",
"lastname" : "Ratliff",
"age" : 36,
"gender" : "M",
"address" : "282 Kings Place",
"employer" : "Scentric",
"email" : "[email protected]",
"city" : "Ribera",
"state" : "WA"
}
},
{
"_index" : "bank",
"_type" : "_doc",
"_id" : "284",
"_score" : 0.0,
"_source" : {
"account_number" : 284,
"balance" : 22806,
"firstname" : "Randolph",
"lastname" : "Banks",
"age" : 29,
"gender" : "M",
"address" : "875 Hamilton Avenue",
"employer" : "Caxt",
"email" : "[email protected]",
"city" : "Crawfordsville",
"state" : "WA"
}
}
]
}
}
2) Every document gets a score of 1.0, because the match_all query in the must clause assigns a score of 1.0 to all documents.
// Request
GET bank/_search
{
"size": 2,
"query": {
"bool": {
"must": {
"match_all":{}
},
"filter": {
"term": {
"state.keyword": "WA"
}
}
}
}
}
// Response: every score is 1.0
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 19,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "bank",
"_type" : "_doc",
"_id" : "20",
"_score" : 1.0,
"_source" : {
"account_number" : 20,
"balance" : 16418,
"firstname" : "Elinor",
"lastname" : "Ratliff",
"age" : 36,
"gender" : "M",
"address" : "282 Kings Place",
"employer" : "Scentric",
"email" : "[email protected]",
"city" : "Ribera",
"state" : "WA"
}
},
{
"_index" : "bank",
"_type" : "_doc",
"_id" : "284",
"_score" : 1.0,
"_source" : {
"account_number" : 284,
"balance" : 22806,
"firstname" : "Randolph",
"lastname" : "Banks",
"age" : 29,
"gender" : "M",
"address" : "875 Hamilton Avenue",
"employer" : "Caxt",
"email" : "[email protected]",
"city" : "Crawfordsville",
"state" : "WA"
}
}
]
}
}
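3) The constant_score query behaves like example 2, except that every matching document gets a score equal to boost. Here boost is 1.2, so every score is 1.2; with the default boost of 1.0 the scores would be the same as in example 2.
// Request, with boost set to 1.2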
GET bank/_search
{
"size": 2,
"query": {
"constant_score": {
"filter": {
"term": {
"state.keyword": "WA"
}
},
"boost": 1.2
}
}
}
// Response: every score is 1.2
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 19,
"relation" : "eq"
},
"max_score" : 1.2,
"hits" : [
{
"_index" : "bank",
"_type" : "_doc",
"_id" : "20",
"_score" : 1.2,
"_source" : {
"account_number" : 20,
"balance" : 16418,
"firstname" : "Elinor",
"lastname" : "Ratliff",
"age" : 36,
"gender" : "M",
"address" : "282 Kings Place",
"employer" : "Scentric",
"email" : "[email protected]",
"city" : "Ribera",
"state" : "WA"
}
},
{
"_index" : "bank",
"_type" : "_doc",
"_id" : "284",
"_score" : 1.2,
"_source" : {
"account_number" : 284,
"balance" : 22806,
"firstname" : "Randolph",
"lastname" : "Banks",
"age" : 29,
"gender" : "M",
"address" : "875 Hamilton Avenue",
"employer" : "Caxt",
"email" : "[email protected]",
"city" : "Crawfordsville",
"state" : "WA"
}
}
]
}
}
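Naming queries to see which clauses actually matched
Every filter or query clause can be given a _name parameter in its definition.
// Request: assign a name to each query clause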
GET bank/_search
{
"size": 3,
"query": {
"bool": {
"should": [
{
"match": {
"email": {
"query": "comcubine.com",
"_name": "q_n1"
}
}
},
{
"match": {
"address": {
"query": "Avenue",
"_name": "q_n2"
}
}
}
],
"filter": {
"terms": {
"age": [
40,
38
],
"_name": "q_a"
}
}
}
}
}
// Response: each hit lists the named queries it matched
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 85,
"relation" : "eq"
},
"max_score" : 6.5046196,
"hits" : [
{
"_index" : "bank",
"_type" : "_doc",
"_id" : "58",
"_score" : 6.5046196,
"_source" : {
"account_number" : 58,
"balance" : 31697,
"firstname" : "Marva",
"lastname" : "Cannon",
"age" : 40,
"gender" : "M",
"address" : "993 Highland Place",
"employer" : "Comcubine",
"email" : "[email protected]",
"city" : "Orviston",
"state" : "MO"
},
"matched_queries" : [
"q_a",
"q_n1"
]
},
{
"_index" : "bank",
"_type" : "_doc",
"_id" : "664",
"_score" : 1.5400248,
"_source" : {
"account_number" : 664,
"balance" : 16163,
"firstname" : "Hart",
"lastname" : "Mccormick",
"age" : 40,
"gender" : "M",
"address" : "144 Guider Avenue",
"employer" : "Dyno",
"email" : "[email protected]",
"city" : "Carbonville",
"state" : "ID"
},
"matched_queries" : [
"q_a",
"q_n2"
]
},
{
"_index" : "bank",
"_type" : "_doc",
"_id" : "791",
"_score" : 1.5400248,
"_source" : {
"account_number" : 791,
"balance" : 48249,
"firstname" : "Janine",
"lastname" : "Huber",
"age" : 38,
"gender" : "F",
"address" : "348 Porter Avenue",
"employer" : "Viocular",
"email" : "[email protected]",
"city" : "Fivepointville",
"state" : "MA"
},
"matched_queries" : [
"q_a",
"q_n2"
]
}
]
}
}
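Each hit in the response includes a matched_queries array listing the named queries it matched. Tagging queries and filters this way only makes sense inside a bool query.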
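On the client side, the matched_queries arrays can be turned into a per-name index. The following Python sketch assumes the response has already been parsed into a dict with the same shape as the one above; the helper name `group_ids_by_matched_query` is hypothetical, and the sample data is trimmed to the `_id` and `matched_queries` fields of the three hits shown.

```python
from collections import defaultdict


def group_ids_by_matched_query(response: dict) -> dict:
    """Map each query name to the _id values of the hits that matched it."""
    groups = defaultdict(list)
    for hit in response.get("hits", {}).get("hits", []):
        # Hits that matched no named query have no matched_queries key.
        for name in hit.get("matched_queries", []):
            groups[name].append(hit["_id"])
    return dict(groups)


# Trimmed copy of the three hits from the response above.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "58", "matched_queries": ["q_a", "q_n1"]},
            {"_id": "664", "matched_queries": ["q_a", "q_n2"]},
            {"_id": "791", "matched_queries": ["q_a", "q_n2"]},
        ]
    }
}

groups = group_ids_by_matched_query(sample_response)
```

Here `groups["q_a"]` contains all three ids because every hit passed the age filter, while `q_n1` and `q_n2` show which should clause produced each hit's score.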
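The boosting query returns documents that match the positive query and lowers the relevance score of those that also match the negative query.
This lets you demote certain documents without excluding them: they still appear in the search results, only with a lower relevance score than normal matches.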
GET bank/_search
{
"query": {
"boosting": {
"positive": {
"term": {
"state.keyword": {
"value": "DC"
}
}
},
"negative": {
"term": {
"age": {
"value": 23
}
}
},
"negative_boost": 0.2
}
}
}
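No. | Parameter | Description |
---|---|---|
1 | positive | Required, query object. The query you want to run; every returned document must match it |
2 | negative | Required, query object. The query used to lower the relevance score of matching documents |
3 | negative_boost | Required, float between 0 and 1.0. The factor used to lower the relevance score of documents that match the negative query |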
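If a returned document matches both the positive query and the negative query, the boosting query computes its relevance score as follows:
1) Take the original relevance score from the positive query.
2) Multiply that score by negative_boost to get the final score.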
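The two steps above amount to a single multiplication. A minimal Python sketch, assuming the positive score is already known (the function name `boosting_score` and the example scores are hypothetical):

```python
def boosting_score(positive_score: float, matches_negative: bool, negative_boost: float) -> float:
    """Final score of a document that matched the positive query."""
    if not 0.0 <= negative_boost <= 1.0:
        raise ValueError("negative_boost must be between 0 and 1.0")
    if matches_negative:
        # Step 2: demote the document instead of excluding it.
        return positive_score * negative_boost
    # Documents that do not match the negative query keep their score.
    return positive_score
```

With the request above (`negative_boost` of 0.2), a document in state DC whose age is 23 keeps only one fifth of its positive score, while other DC documents are unaffected.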