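I'm currently writing a tutorial series on Elasticsearch (ES). It will run to quite a few installments, so if you like it, give me a follow ❤️ ~
Picking up where the last installment left off, this one covers the conditional queries we did not get to.
To make it easy to follow along, every example here reuses the index from the previous installment. This one is more hands-on. OK, enough chit-chat, let's get straight to it~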
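ES uses the bool query to combine multiple conditions. A bool query supports the following parameters:
must: matching documents must satisfy this condition.
must_not: matching documents must not satisfy this condition.
should: matching documents should satisfy this condition. should clauses are used to adjust the score of the results. Note that if the bool query has no must (or filter) clause, a document has to match at least one should clause. If there is a must clause, a document does not need to match any should clause, and should is used purely to adjust the score.
filter: matching documents must satisfy this condition, but filter does not take part in scoring. It is used only for filtering.
Let's look at an example of how to use it: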
GET class_1/_search
{
"query": {
"bool": {
"must": [
{"match": {
"name": "apple"
}}
],
"must_not": [
{"term": {
"num": {
"value": "5"
}
}}
],
"should": [
{"match": {
"name": "k"
}}
],
"filter": [
{"range": {
"num": {
"gte": 0,
"lte": 10
}
}}
]
}
}
}
The response:
{
"took" : 9,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 0.752627,
"hits" : [
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "b8fcCoYB090miyjed7YE",
"_score" : 0.752627,
"_source" : {
"name" : "I eat apple so haochi1~",
"num" : 1
}
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "ccfcCoYB090miyjed7YE",
"_score" : 0.752627,
"_source" : {
"name" : "I eat apple so haochi3~",
"num" : 1
}
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "cMfcCoYB090miyjed7YE",
"_score" : 0.7389809,
"_source" : {
"name" : "I eat apple so zhen haochi2~",
"num" : 1
}
}
]
}
}
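To make the rules for must, must_not, should and filter a bit more concrete, here is a minimal in-memory sketch in Python. It is not how ES is implemented: it ignores scoring and text analysis, each clause is just a predicate function, and `matches_bool` as well as the "apple five" document are made up for illustration.

```python
# Simplified in-memory model of bool-query matching (no scoring, no analysis).

def matches_bool(doc, must=(), must_not=(), should=(), filters=()):
    """Return True if doc satisfies the bool clauses.

    Each clause is a plain predicate: a function doc -> bool.
    """
    # must and filter are both required; in ES they differ only in scoring.
    if not all(clause(doc) for clause in list(must) + list(filters)):
        return False
    # must_not excludes any document that matches one of its clauses.
    if any(clause(doc) for clause in must_not):
        return False
    # should is required only when there is no must/filter clause.
    if should and not must and not filters:
        return any(clause(doc) for clause in should)
    return True


docs = [
    {"name": "I eat apple so haochi1~", "num": 1},
    {"name": "b", "num": 6},
    {"name": "apple five", "num": 5},  # hypothetical document
]

has_apple = lambda d: "apple" in d["name"].split()
num_is_5 = lambda d: d["num"] == 5
has_k = lambda d: "k" in d["name"].split()
num_0_to_10 = lambda d: 0 <= d["num"] <= 10

# Same shape as the request above: should does not have to match here.
hits = [d for d in docs
        if matches_bool(d, must=[has_apple], must_not=[num_is_5],
                        should=[has_k], filters=[num_0_to_10])]

# With only should clauses, at least one of them has to match.
should_only = [d for d in docs if matches_bool(d, should=[has_apple])]
```

In the first call the should clause on "k" matches nothing, yet the apple document is still returned because a must clause is present. In the second call should is the only clause, so it decides which documents match.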
A constant_score query lets you assign a fixed score through boost. Generally speaking, constant_score is used in place of a bool query that contains only a filter clause.
Here is how to use it:
GET class_1/_search
{
"query": {
"constant_score": {
"filter": {
"term": {
"num": 6
}
},
"boost": 1.2
}
}
}
The response:
{
"took" : 7,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.2,
"hits" : [
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "h2Fg-4UBECmbBdQA6VLg",
"_score" : 1.2,
"_source" : {
"name" : "b",
"num" : 6
}
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.2,
"_source" : {
"name" : "l",
"num" : 6
}
}
]
}
}
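Since constant_score stands in for a filter-only bool query, it can help to see the two request bodies side by side. The following is only a sketch that builds them as Python dicts; `constant_score` and `bool_filter_only` are hypothetical helper names, not part of any ES client.

```python
import json


def constant_score(filter_clause, boost=1.0):
    """Wrap a filter clause in a constant_score request body."""
    return {"query": {"constant_score": {"filter": filter_clause, "boost": boost}}}


def bool_filter_only(filter_clause):
    """The filter-only bool request body that constant_score stands in for."""
    return {"query": {"bool": {"filter": [filter_clause]}}}


term_num_6 = {"term": {"num": 6}}

body = constant_score(term_num_6, boost=1.2)
alternative = bool_filter_only(term_num_6)

# Print the body in the same shape as the console request above.
print(json.dumps(body, indent=2))
```

Both bodies select the same documents. The difference is in the score: the constant_score form gives every hit the boost value, which is why the response above shows 1.2 for each document.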
ES provides the /_validate/query endpoint to check whether a query is valid. Note that it validates the query itself rather than executing the search.
Example:
GET class_1/_validate/query?explain
{
"query": {
"bool": {
"must": [
{"match": {
"name": "apple"
}}
]
}
}
}
A normal response:
{
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"valid" : true,
"explanations" : [
{
"index" : "class_1",
"valid" : true,
"explanation" : "+name:apple"
}
]
}
Now change the name field to name1 and run the request again:
{
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"valid" : true,
"explanations" : [
{
"index" : "class_1",
"valid" : true,
"explanation" : """+MatchNoDocsQuery("unmapped fields [name1]")"""
}
]
}
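Note that no error is reported and valid is still true. The explanation, however, has turned into MatchNoDocsQuery("unmapped fields [name1]"), which tells us that name1 is not a mapped field and that the query will match no documents.
ES also uses /_validate/query?explain to analyze how a query is interpreted.
Example: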
GET class_1/_validate/query?explain
{
"query": {
"bool": {
"must": [
{"match": {
"name": "apple so"
}}
]
}
}
}
The response:
{
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"valid" : true,
"explanations" : [
{
"index" : "class_1",
"valid" : true,
"explanation" : "+(name:apple name:so)"
}
]
}
As you can see from "explanation" : "+(name:apple name:so)", the query text apple so was analyzed and split into two terms, name:apple and name:so.
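To see where an explanation string like this comes from, here is a rough Python sketch that mimics the rewrite. It assumes that lowercasing plus splitting on whitespace is close enough to the analyzer for these inputs (the real analyzer does more), and `explain_match` / `explain_must` are hypothetical names.

```python
def explain_match(field, text):
    """Rough stand-in for how a match query is rewritten into term clauses.

    Assumption: lowercase + whitespace split approximates the analyzer.
    """
    terms = [f"{field}:{token}" for token in text.lower().split()]
    if len(terms) == 1:
        return terms[0]
    # Several terms are combined as optional clauses, shown as one group.
    return "(" + " ".join(terms) + ")"


def explain_must(field, text):
    # A leading "+" marks a required (must) clause in the explanation.
    return "+" + explain_match(field, text)


single = explain_must("name", "apple")
double = explain_must("name", "apple so")
```

With these inputs, `single` and `double` reproduce the two explanation strings returned by the validate requests above.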
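In the previous examples, the results were ordered by _score in descending order by default, so better matches come first. Computing _score has a real cost at query time, though, which is not hard to understand.
When we do not need _score, we can build the query with filter or constant_score instead.
filter example: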
GET class_1/_search
{
"query": {
"bool": {
"filter": [
{"term": {
"num": 1
}}
]
}
}
}
The response:
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "b8fcCoYB090miyjed7YE",
"_score" : 0.0,
"_source" : {
"name" : "I eat apple so haochi1~",
"num" : 1
}
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "ccfcCoYB090miyjed7YE",
"_score" : 0.0,
"_source" : {
"name" : "I eat apple so haochi3~",
"num" : 1
}
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "cMfcCoYB090miyjed7YE",
"_score" : 0.0,
"_source" : {
"name" : "I eat apple so zhen haochi2~",
"num" : 1
}
}
]
}
}
In the results every _score is 0.0, which shows that no score was computed.
constant_score example:
GET class_1/_search
{
"query": {
"constant_score": {
"filter": {
"term": {
"num": 1
}
},
"boost": 1.2
}
}
}
The response:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 1.2,
"hits" : [
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "b8fcCoYB090miyjed7YE",
"_score" : 1.2,
"_source" : {
"name" : "I eat apple so haochi1~",
"num" : 1
}
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "ccfcCoYB090miyjed7YE",
"_score" : 1.2,
"_source" : {
"name" : "I eat apple so haochi3~",
"num" : 1
}
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "cMfcCoYB090miyjed7YE",
"_score" : 1.2,
"_source" : {
"name" : "I eat apple so zhen haochi2~",
"num" : 1
}
}
]
}
}
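As you can see, every returned score is the value specified by boost.
Ordering by _score covers most scenarios, but how do we define a custom sort order in ES? ES uses the sort parameter for this. For a regular field the default order is ascending:
{"sort":["num"]}
For descending order, use desc:
{"sort":[{"num":{"order":"desc"}}]}
A few things to keep in mind:
ES relies on doc values, a columnar store, to sort on a field.
text fields do not have doc values, so you cannot sort on a text field directly.
Setting fielddata=true on a text field enables sorting on it, but this is not recommended: sorting on a text field is extremely expensive at query time and is usually not what you actually want.
GET class_1/_search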
{
"sort": [
"num"
]
}
The response:
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 11,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "b8fcCoYB090miyjed7YE",
"_score" : null,
"_source" : {
"name" : "I eat apple so haochi1~",
"num" : 1
},
"sort" : [
1
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "ccfcCoYB090miyjed7YE",
"_score" : null,
"_source" : {
"name" : "I eat apple so haochi3~",
"num" : 1
},
"sort" : [
1
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "cMfcCoYB090miyjed7YE",
"_score" : null,
"_source" : {
"name" : "I eat apple so zhen haochi2~",
"num" : 1
},
"sort" : [
1
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "h2Fg-4UBECmbBdQA6VLg",
"_score" : null,
"_source" : {
"name" : "b",
"num" : 6
},
"sort" : [
6
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "1",
"_score" : null,
"_source" : {
"name" : "l",
"num" : 6
},
"sort" : [
6
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "3",
"_score" : null,
"_source" : {
"num" : 9,
"name" : "e",
"age" : 9,
"desc" : [
"hhhh"
]
},
"sort" : [
9
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "4",
"_score" : null,
"_source" : {
"name" : "f",
"age" : 10,
"num" : 10
},
"sort" : [
10
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "RWlfBIUBDuA8yW5cu9wu",
"_score" : null,
"_source" : {
"name" : "一年级",
"num" : 20
},
"sort" : [
20
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "iGFt-4UBECmbBdQAnVJe",
"_score" : null,
"_source" : {
"name" : "g",
"age" : 8
},
"sort" : [
9223372036854775807
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "iWFt-4UBECmbBdQAnVJg",
"_score" : null,
"_source" : {
"name" : "h",
"age" : 9
},
"sort" : [
9223372036854775807
]
}
]
}
}
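As you can see, the results are sorted by num in the default ascending order. The documents without a num field come last, with 9223372036854775807 as their sort value.
Now let's look at descending order: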
GET class_1/_search
{
"sort": [
{"num": {"order":"desc"}}
]
}
The response:
{
"took" : 15,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 11,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "RWlfBIUBDuA8yW5cu9wu",
"_score" : null,
"_source" : {
"name" : "一年级",
"num" : 20
},
"sort" : [
20
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "4",
"_score" : null,
"_source" : {
"name" : "f",
"age" : 10,
"num" : 10
},
"sort" : [
10
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "3",
"_score" : null,
"_source" : {
"num" : 9,
"name" : "e",
"age" : 9,
"desc" : [
"hhhh"
]
},
"sort" : [
9
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "h2Fg-4UBECmbBdQA6VLg",
"_score" : null,
"_source" : {
"name" : "b",
"num" : 6
},
"sort" : [
6
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "1",
"_score" : null,
"_source" : {
"name" : "l",
"num" : 6
},
"sort" : [
6
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "b8fcCoYB090miyjed7YE",
"_score" : null,
"_source" : {
"name" : "I eat apple so haochi1~",
"num" : 1
},
"sort" : [
1
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "ccfcCoYB090miyjed7YE",
"_score" : null,
"_source" : {
"name" : "I eat apple so haochi3~",
"num" : 1
},
"sort" : [
1
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "cMfcCoYB090miyjed7YE",
"_score" : null,
"_source" : {
"name" : "I eat apple so zhen haochi2~",
"num" : 1
},
"sort" : [
1
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "iGFt-4UBECmbBdQAnVJe",
"_score" : null,
"_source" : {
"name" : "g",
"age" : 8
},
"sort" : [
-9223372036854775808
]
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "iWFt-4UBECmbBdQAnVJg",
"_score" : null,
"_source" : {
"name" : "h",
"age" : 9
},
"sort" : [
-9223372036854775808
]
}
]
}
}
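Now the results are sorted in descending order.
sort also accepts several fields. Documents are ordered by the first field, and ties are broken by the next one: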
GET class_1/_search
{
"sort": [
"num", "age"
]
}
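The ordering rules seen in the responses above (ascending by default, documents without the field placed last in both directions, later fields breaking ties) can be sketched in plain Python. This is only an in-memory illustration for numeric fields; `es_like_sort` is a made-up helper, and the documents are a trimmed copy of the ones in the index.

```python
def es_like_sort(docs, sort_spec):
    """Sort docs by several numeric fields, placing missing values last.

    sort_spec is a list of (field, order) pairs, order being "asc" or "desc".
    """
    result = list(docs)
    # Apply keys from least to most significant; Python's sort is stable.
    for field, order in reversed(sort_spec):
        sign = -1 if order == "desc" else 1

        def key(doc, field=field, sign=sign):
            value = doc.get(field)
            # (True, 0) sorts after every (False, x), so missing goes last.
            return (value is None, 0 if value is None else sign * value)

        result.sort(key=key)
    return result


docs = [
    {"name": "h", "age": 9},
    {"name": "l", "num": 6},
    {"name": "g", "age": 8},
    {"name": "f", "age": 10, "num": 10},
    {"name": "e", "age": 9, "num": 9},
]

asc = [d["name"] for d in es_like_sort(docs, [("num", "asc"), ("age", "asc")])]
desc = [d["name"] for d in es_like_sort(docs, [("num", "desc")])]
```

In `asc`, g and h both lack num, so they land at the end and age decides their relative order. In `desc`, the documents without num still come last rather than first.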
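Remember the from+size pagination covered earlier? By default ES limits from+size pagination to 10000 results (the index.max_result_window setting). When we want to pull a larger amount of data in bulk, from+size becomes very expensive.
Yet in many real-world scenarios the data volume is huge, for example when you need to query system logs. For this, ES offers scrolling: pass the scroll parameter on the search request and then page through the /_search/scroll endpoint. It works roughly as if a snapshot of the cluster's data (covering every shard) were taken at the moment of the initial query and kept alive for a while. Once the configured expiry time has passed, the snapshot is deleted when ES is idle.
Note that because the query runs against a snapshot, changes made to the data after the snapshot was created are not visible to the current scroll.
Query example: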
GET class_1/_search?scroll=10m
{
"query": {
"match_phrase": {
"name": "apple"
}
},
"size": 2
}
The response:
{
"_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==",
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 0.752627,
"hits" : [
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "b8fcCoYB090miyjed7YE",
"_score" : 0.752627,
"_source" : {
"name" : "I eat apple so haochi1~",
"num" : 1
}
},
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "ccfcCoYB090miyjed7YE",
"_score" : 0.752627,
"_source" : {
"name" : "I eat apple so haochi3~",
"num" : 1
}
}
]
}
}
As shown, 2 documents are returned along with a scroll ID (_scroll_id). Subsequent requests use this ID to keep scrolling:
GET /_search/scroll
{
"scroll": "10m",
"scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw=="
}
The response:
{
"_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==",
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 0.752627,
"hits" : [
{
"_index" : "class_1",
"_type" : "_doc",
"_id" : "cMfcCoYB090miyjed7YE",
"_score" : 0.7389809,
"_source" : {
"name" : "I eat apple so zhen haochi2~",
"num" : 1
}
}
]
}
}
Scroll once more with the same request:
{
"_scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAAAAXoFjEwWkdOMkxLUTVPZEMzM01ZdHhPc1EAAAAAAAACABZjUy1CemQwQVFfU3BUeGs2OGk0R1Z3AAAAAAAAAgEWY1MtQnpkMEFRX1NwVHhrNjhpNEdWdw==",
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 3,
"successful" : 3,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 0.752627,
"hits" : [ ]
}
}
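Some of you may wonder how the scrolling actually works, since every follow-up request uses the same scroll_id. The results make it fairly clear:
The initial request created the snapshot with a page size of 2 and returned the first 2 documents.
The first scroll request with the scroll_id returned 1 document, because the snapshot holds only 3 documents in total and 2 of them had already been returned.
The second scroll request returned an empty hits array, which means everything in the snapshot has been read.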
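If it still feels odd that the same scroll_id yields different pages, a toy model may help: the server side keeps a frozen copy of the hits plus a cursor, and each request just advances the cursor. The Python class below is only such a mental model, not how ES stores scroll contexts; `ScrollSnapshot` and the short document names are invented for the example.

```python
class ScrollSnapshot:
    """Toy model of a scroll context: a frozen copy of the hits plus a cursor."""

    def __init__(self, index_docs, size):
        self._hits = list(index_docs)  # copy taken at creation time
        self._size = size
        self._pos = 0

    def next_page(self):
        """Return the next page and advance the cursor."""
        page = self._hits[self._pos:self._pos + self._size]
        self._pos += len(page)
        return page


index = ["haochi1", "haochi3", "haochi2"]

scroll = ScrollSnapshot(index, size=2)
first = scroll.next_page()     # initial request: 2 documents

index.append("added later")    # written after the snapshot was taken

second = scroll.next_page()    # first scroll: the 1 remaining document
third = scroll.next_page()     # second scroll: nothing left
```

The document appended after the snapshot never shows up in any page, which mirrors the note above that later changes are invisible to a running scroll.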
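That's it for this installment. Be sure to practice a lot. Next time we move on to advanced queries ~
I try to share everything I know. If this article helped you, a like and a follow would be much appreciated~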