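I first ran into ES at work in 2017, but I still feel like I never got past the beginner stage. I have mostly used the ES Java API for simple business logic, never read the ES documentation carefully, and my understanding of ES is still shallow. Since teaching is the best way to learn, I really want to organize the commonly used ES query APIs. The urge got even stronger after I saw a breakdown of ES developers, because I fit none of the categories (me = old + low pay + little hair).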
PUT /pigg/_doc/1
{
"name": "老亚瑟",
"age": 31,
"sex": "男",
"word": "死亡骑士,不是死掉的骑士",
"weapon": ["黑切", "冰痕之握", "反伤刺甲","闪电匕首","破军"]
}
PUT /pigg/_doc/2
{
"name": "孙悟空",
"age": 40,
"sex": "男",
"word": "我就是吉吉国王",
"weapon": ["黑切", "冰痕之握", "无尽战刃", "宗师之力"]
}
PUT /pigg/_doc/3
{
"name": "安琪拉",
"age": 16,
"sex": "女",
"word": "我就是小萝莉",
"weapon": []
}
PUT /pigg/_doc/4
{
"name": "老夫子",
"age": 100,
"sex": "男",
"word": "我要定住你"
}
If you are new to ES, you may want to read "Elasticsearch Notes (9): term, terms, and exists query examples" first.
Query the documents where name = "老亚瑟":
GET /pigg/_search
{
"query": {
"term": {
"name": {
"value": "老亚瑟"
}
}
},
"_source": ["name"]
}
At this point we find that the result is empty:
{
"hits" : {
"total" : 0,
"max_score" : null,
"hits" : [ ]
}
}
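Because we never defined a mapping ourselves, name was dynamically mapped as a text field. At index time the default analyzer split "老亚瑟" into three single-character tokens: "老", "亚", and "瑟". A term query does not analyze its input, so it looks up the exact term "老亚瑟", which is not in the index, and finds nothing.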
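To see why the exact lookup misses, here is a minimal Python sketch of an inverted index. It assumes the default analyzer emits one token per Chinese character; `analyze`, `build_index`, and `term_query` are hypothetical helpers for illustration, not Elasticsearch APIs.

```python
from collections import defaultdict

NAMES = {"1": "老亚瑟", "2": "孙悟空", "3": "安琪拉", "4": "老夫子"}

def analyze(text):
    # Assumption: mimic the default analyzer on Chinese text, one token per character.
    return list(text)

def build_index(docs, analyzed=True):
    # Map each indexed term to the set of document ids that contain it.
    index = defaultdict(set)
    for doc_id, value in docs.items():
        terms = analyze(value) if analyzed else [value]
        for term in terms:
            index[term].add(doc_id)
    return index

def term_query(index, value):
    # A term query does not analyze its input; it looks up the exact term.
    return sorted(index.get(value, set()))

text_index = build_index(NAMES, analyzed=True)      # behaves like a text field
keyword_index = build_index(NAMES, analyzed=False)  # behaves like a keyword field

print(term_query(text_index, "老亚瑟"))     # []
print(term_query(text_index, "老"))         # ['1', '4']
print(term_query(keyword_index, "老亚瑟"))  # ['1']
```

The whole name only exists as a term when the value is indexed without analysis, which is what a keyword field does.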
On an analyzed text field, term therefore behaves like "contains this token". Query the documents whose name contains "老":
GET /pigg/_search
{
"query": {
"term": {
"name": {
"value": "老"
}
}
},
"_source": ["name"]
}
The result is below; both "老夫子" and "老亚瑟" match.
"hits" : [
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "4",
"_score" : 0.6931472,
"_source" : {
"name" : "老夫子"
}
},
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.2876821,
"_source" : {
"name" : "老亚瑟"
}
}
]
By default, name also gets a keyword sub-field, name.keyword, which is not analyzed.
GET /pigg/_search
{
"query": {
"term": {
"name.keyword": {
"value": "老亚瑟"
}
}
},
"_source": ["name"]
}
The result is below; the keyword type allows an exact match.
"hits" : [
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.2876821,
"_source" : {
"name" : "老亚瑟"
}
}
]
terms matches as soon as any one of the given values hits. Query the people who have 黑切 or 宗师之力:
GET /pigg/_search
{
"query": {
"terms": {
"weapon.keyword": [
"黑切",
"宗师之力"
]
}
},
"_source": ["name", "weapon"]
}
The result is below:
"hits" : [
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"weapon" : [
"黑切",
"冰痕之握",
"无尽战刃",
"宗师之力"
],
"name" : "孙悟空"
}
},
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"weapon" : [
"黑切",
"冰痕之握",
"反伤刺甲",
"闪电匕首",
"破军"
],
"name" : "老亚瑟"
}
}
]
The prefix query is very common at work; it is like LIKE 'abc%' in MySQL.
Query the people whose name starts with "老":
GET /pigg/_search
{
"query": {
"prefix": {
"name.keyword": {
"value": "老"
}
}
},
"_source": ["name"]
}
The result is below:
"hits" : [
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"name" : "老夫子"
}
},
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"name" : "老亚瑟"
}
}
]
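The wildcard query is like a MySQL LIKE query. It is relatively slow, especially with a leading *, so it is generally avoided.
Query the people whose name contains "亚":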
GET /pigg/_search
{
"query": {
"wildcard": {
"name.keyword": {
"value": "*亚*"
}
}
},
"_source": ["name"]
}
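The cost difference between prefix and wildcard can be sketched in plain Python. This is only an illustration under the assumption that the terms of a keyword field are kept in sorted order; `prefix_query` and `wildcard_query` are hypothetical helpers, and `fnmatchcase` merely stands in for wildcard matching (it also understands `[...]`, which is not used here).

```python
from bisect import bisect_left
from fnmatch import fnmatchcase

# Sorted, un-analyzed values, standing in for the terms of name.keyword.
TERMS = sorted(["老亚瑟", "孙悟空", "安琪拉", "老夫子"])

def prefix_query(terms, prefix):
    # Jump to the first term >= prefix, then read forward while the prefix still matches.
    hits = []
    i = bisect_left(terms, prefix)
    while i < len(terms) and terms[i].startswith(prefix):
        hits.append(terms[i])
        i += 1
    return hits

def wildcard_query(terms, pattern):
    # With a leading *, there is no starting point to seek to: every term is tested.
    return [t for t in terms if fnmatchcase(t, pattern)]

print(prefix_query(TERMS, "老"))      # ['老亚瑟', '老夫子']
print(wildcard_query(TERMS, "*亚*"))  # ['老亚瑟']
```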
range is the range query. Query the people whose age is in [10, 30]:
GET /pigg/_search
{
"query": {
"range": {
"age": {
"gte": 10,
"lte": 30
}
}
},
"_source": ["name"]
}
It returns:
"hits" : [
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"name" : "安琪拉"
}
}
]
Query the people whose weapon field has a value:
GET /pigg/_search
{
"query": {
"exists": {
"field": "weapon"
}
},
"_source": ["name"]
}
Query the people whose weapon field has no value:
GET /pigg/_search
{
"query": {
"bool": {
"must_not": [
{
"exists": {
"field": "weapon"
}
}
]
}
},
"_source": ["name"]
}
The result is below. 老夫子 has no weapon field at all, while 安琪拉 has weapon=[]; exists treats both as having no value.
"hits" : [
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"name" : "老夫子"
}
},
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"name" : "安琪拉"
}
}
]
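The rule behind that result can be written down as a small Python sketch. `has_value` is a hypothetical helper that mirrors how exists treats a missing field, null, an empty array, and an array holding only nulls; it is an illustration, not Elasticsearch code.

```python
DOCS = [
    {"name": "老亚瑟", "weapon": ["黑切", "冰痕之握", "反伤刺甲", "闪电匕首", "破军"]},
    {"name": "孙悟空", "weapon": ["黑切", "冰痕之握", "无尽战刃", "宗师之力"]},
    {"name": "安琪拉", "weapon": []},
    {"name": "老夫子"},
]

def has_value(doc, field):
    # Missing field or explicit null: no value.
    value = doc.get(field)
    if value is None:
        return False
    # An empty array, or an array holding only nulls, also counts as no value.
    if isinstance(value, list):
        return any(item is not None for item in value)
    return True

with_weapon = [d["name"] for d in DOCS if has_value(d, "weapon")]
without_weapon = [d["name"] for d in DOCS if not has_value(d, "weapon")]

print(with_weapon)     # ['老亚瑟', '孙悟空']
print(without_weapon)  # ['安琪拉', '老夫子']
```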
The bool query is a compound query: it accepts multiple other queries as clauses and combines them into all kinds of boolean (logical) combinations.
Its format is:
{
"bool" : {
"must" : [],
"should" : [],
"must_not" : []
}
}
Query the people whose name starts with "老" and whose age is at least 90:
GET /pigg/_search
{
"query": {
"bool": {
"must": [
{
"prefix": {
"name.keyword": {
"value": "老"
}
}
},
{
"range": {
"age": {
"gte": 90
}
}
}
]
}
},
"_source": ["name","age"]
}
The result is below. After all, how could our King Arthur be that old?
"hits" : [
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "4",
"_score" : 2.0,
"_source" : {
"name" : "老夫子",
"age" : 100
}
}
]
must_not is the opposite of must; it means NOT. Query the people who bought weapons but did not buy 无尽战刃:
GET /pigg/_search
{
"query": {
"bool": {
"must_not": [
{
"term": {
"weapon.keyword": {
"value": "无尽战刃"
}
}
}
],
"must": [
{
"exists": {
"field": "weapon"
}
}
]
}
},
"_source": ["name", "weapon"]
}
should means OR.
Query the people who are female, or whose word matches "吉吉国王":
GET /pigg/_search
{
"query": {
"bool": {
"should": [
{
"term": {
"sex": {
"value": "女"
}
}
},
{
"match": {
"word": "吉吉国王"
}
}
]
}
},
"_source": ["name","sex", "word"]
}
It returns:
"hits" : [
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "2",
"_score" : 3.1186123,
"_source" : {
"sex" : "男",
"name" : "孙悟空",
"word" : "我就是吉吉国王"
}
},
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "3",
"_score" : 0.2876821,
"_source" : {
"sex" : "女",
"name" : "安琪拉",
"word" : "我就是小萝莉"
}
}
]
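When should sits at the same level as must (or filter), it does not change which documents match; it only affects the relevance score. Without must or filter, at least one should clause has to match.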
GET /pigg/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"sex.keyword": {
"value": "男"
}
}
}
],
"should": [
{
"range": {
"age": {
"gte": 90
}
}
}
]
}
},
"_source": ["name","sex", "age"]
}
The result is below. They are all male, but 老夫子 also satisfies age>=90, so his _score of 1.1823215 is higher than the other two.
"hits" : [
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "4",
"_score" : 1.1823215,
"_source" : {
"sex" : "男",
"name" : "老夫子",
"age" : 100
}
},
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.2876821,
"_source" : {
"sex" : "男",
"name" : "老亚瑟",
"age" : 31
}
},
{
"_index" : "pigg",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.18232156,
"_source" : {
"sex" : "男",
"name" : "孙悟空",
"age" : 40
}
}
]
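How must, should, and must_not combine can be modeled with a toy Python evaluator. This is a sketch under simplifying assumptions: clauses are plain predicates, the score is just the number of matching must and should clauses (real scores come from the similarity model), and filter is left out. `bool_query` is a hypothetical helper.

```python
DOCS = [
    {"name": "老亚瑟", "age": 31, "sex": "男"},
    {"name": "孙悟空", "age": 40, "sex": "男"},
    {"name": "安琪拉", "age": 16, "sex": "女"},
    {"name": "老夫子", "age": 100, "sex": "男"},
]

def bool_query(docs, must=(), should=(), must_not=(), minimum_should_match=None):
    if minimum_should_match is None:
        # Default: should is optional when must is present, otherwise one must match.
        minimum_should_match = 1 if should and not must else 0
    hits = []
    for doc in docs:
        if not all(clause(doc) for clause in must):
            continue
        if any(clause(doc) for clause in must_not):
            continue
        matched_should = sum(1 for clause in should if clause(doc))
        if matched_should < minimum_should_match:
            continue
        # Toy score: one point per matching must or should clause.
        hits.append((doc["name"], len(must) + matched_should))
    return sorted(hits, key=lambda hit: -hit[1])

is_male = lambda d: d["sex"] == "男"
is_female = lambda d: d["sex"] == "女"
is_old = lambda d: d["age"] >= 90

print(bool_query(DOCS, must=[is_male], should=[is_old]))
# [('老夫子', 2), ('老亚瑟', 1), ('孙悟空', 1)]
print(bool_query(DOCS, should=[is_female, is_old]))
# [('安琪拉', 1), ('老夫子', 1)]
```

In the first call should only lifts 老夫子 to the top; in the second call, with no must, it decides who matches at all.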
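filter is the filtering clause: it skips scoring, so it is efficient. There are plenty of articles about filter online, so I won't ramble on here.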
GET /pigg/_search
{
"query": {
"bool": {
"filter": {
"term": {
"sex.keyword": "男"
}
}
}
},
"_source": ["name","sex"]
}
# Return only the two fields name and sex
GET /pigg/_search
{
"query": {
"match_all": {
}
},
"_source": ["name", "sex"]
}
# Return only the fields that start with w
GET /pigg/_search
{
"query": {
"match_all": {
}
},
"_source": ["w*"]
}
# Return only the fields that start with w and do not end with n
GET /pigg/_search
{
"query": {
"match_all": {
}
},
"_source": {
"includes": "w*",
"excludes": "*n"
}
}
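The includes/excludes patterns used in the three requests above can be mimicked in Python on a single document. `filter_source` is a hypothetical helper that only looks at top-level field names; it is a sketch of the idea, not of how Elasticsearch implements source filtering.

```python
from fnmatch import fnmatchcase

DOC = {
    "name": "老亚瑟",
    "age": 31,
    "sex": "男",
    "word": "死亡骑士,不是死掉的骑士",
    "weapon": ["黑切", "冰痕之握", "反伤刺甲", "闪电匕首", "破军"],
}

def filter_source(doc, includes=("*",), excludes=()):
    # Keep a field if it matches any include pattern and no exclude pattern.
    return {
        key: value
        for key, value in doc.items()
        if any(fnmatchcase(key, p) for p in includes)
        and not any(fnmatchcase(key, p) for p in excludes)
    }

print(list(filter_source(DOC, ["name", "sex"])))  # ['name', 'sex']
print(list(filter_source(DOC, ["w*"])))           # ['word', 'weapon']
print(list(filter_source(DOC, ["w*"], ["*n"])))   # ['word']
```

In this data set weapon is the only w-field ending in n, so the last request returns just word.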
GET /pigg/_search
{
"sort": [
{
"sex.keyword": {
"order": "desc"
}
},
{
"age": {
"order": "desc"
}
}
],
"_source": ["name","sex","age"]
}
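The request above sorts by sex.keyword descending and then by age descending. A rough Python equivalent, assuming keyword values compare by code point, applies two stable sorts with the secondary key first:

```python
DOCS = [
    {"name": "老亚瑟", "age": 31, "sex": "男"},
    {"name": "孙悟空", "age": 40, "sex": "男"},
    {"name": "安琪拉", "age": 16, "sex": "女"},
    {"name": "老夫子", "age": 100, "sex": "男"},
]

# Sort by the secondary key first; the second (stable) sort keeps that order within ties.
by_age = sorted(DOCS, key=lambda d: d["age"], reverse=True)
ordered = sorted(by_age, key=lambda d: d["sex"], reverse=True)

print([(d["sex"], d["name"], d["age"]) for d in ordered])
# [('男', '老夫子', 100), ('男', '孙悟空', 40), ('男', '老亚瑟', 31), ('女', '安琪拉', 16)]
```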
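Paging is used all the time. from starts at 0, and with a lot of data you run into the deep-paging problem. Some companies like to raise the max_result_window setting, often by a lot. Well... if it returns results and you're happy, fine.
If the data set is large and you need to read and process it page by page, consider scroll. There are plenty of articles about it online, so I won't ramble on.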
GET /pigg/_search
{
"from": 0,
"size": 2,
"sort": [
{
"sex.keyword": {
"order": "desc"
}
}
],
"_source": ["name","sex"]
}
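The deep-paging limit can be made concrete with a small Python sketch. `page_body` is a hypothetical helper that turns a zero-based page number into from/size and rejects pages whose from + size would exceed max_result_window, assumed here to be at its default of 10000.

```python
def page_body(page, page_size, max_result_window=10_000):
    # from is zero-based: page 0 starts at offset 0.
    start = page * page_size
    if start + page_size > max_result_window:
        raise ValueError(
            f"from + size = {start + page_size} exceeds max_result_window "
            f"({max_result_window}); consider scroll instead"
        )
    return {"from": start, "size": page_size}

print(page_body(0, 2))     # {'from': 0, 'size': 2}
print(page_body(4999, 2))  # {'from': 9998, 'size': 2}

try:
    page_body(5000, 2)
except ValueError as err:
    print(err)
```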
Count the documents that satisfy a condition:
GET /pigg/_count
{
"query": {
"term": {
"sex.keyword": {
"value": "男"
}
}
}
}
The terms aggregation is like GROUP BY:
POST /_xpack/sql?format=txt
{
"query": "SELECT sex, COUNT(*) num FROM pigg GROUP BY sex ORDER BY num desc"
}
Count how many people use each weapon, and sort by the count:
GET /pigg/_search
{
"aggs": {
"terms_by_weapon": {
"terms": {
"field": "weapon.keyword",
"size": 10,
"order" : {
"_count" : "asc"
}
}
}
}
}
The result is below:
"buckets" : [
{
"key" : "反伤刺甲",
"doc_count" : 1
},
{
"key" : "宗师之力",
"doc_count" : 1
},
{
"key" : "无尽战刃",
"doc_count" : 1
},
{
"key" : "破军",
"doc_count" : 1
},
{
"key" : "闪电匕首",
"doc_count" : 1
},
{
"key" : "冰痕之握",
"doc_count" : 2
},
{
"key" : "黑切",
"doc_count" : 2
}
]
Keep only the weapons used by at least 2 people:
GET /pigg/_search
{
"size": 0,
"aggs":{
"terms_by_weapon":{
"terms":{
"field":"weapon.keyword",
"size":10
},
"aggs":{
"having":{
"bucket_selector":{
"buckets_path":{
"weaponCount":"_count"
},
"script":{
"lang":"expression",
"inline":"weaponCount >= 2"
}
}
}
}
}
}
}
The result is below:
"buckets" : [
{
"key" : "冰痕之握",
"doc_count" : 2
},
{
"key" : "黑切",
"doc_count" : 2
}
]
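The two aggregations above amount to a GROUP BY with a HAVING clause. A rough in-memory Python equivalent, using `collections.Counter` as a stand-in for the terms buckets and a plain filter for bucket_selector:

```python
from collections import Counter

DOCS = [
    {"name": "老亚瑟", "weapon": ["黑切", "冰痕之握", "反伤刺甲", "闪电匕首", "破军"]},
    {"name": "孙悟空", "weapon": ["黑切", "冰痕之握", "无尽战刃", "宗师之力"]},
    {"name": "安琪拉", "weapon": []},
    {"name": "老夫子"},
]

# terms aggregation: one bucket per distinct weapon, doc_count = number of owners.
counts = Counter(w for d in DOCS for w in d.get("weapon", []))

# order by _count ascending.
ascending = sorted(counts.items(), key=lambda kv: kv[1])

# bucket_selector: keep only the buckets whose count is >= 2.
having = {weapon: n for weapon, n in counts.items() if n >= 2}

print(having)  # {'黑切': 2, '冰痕之握': 2}
```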
First restrict to age<=90, then group by sex, and compute the average age for each sex:
GET /pigg/_search
{
"size": 5,
"query": {
"bool": {
"filter": {
"range": {
"age": {
"lte": 90
}
}
}
}
},
"_source": ["name","sex","age"],
"aggs": {
"terms_by_sex": {
"terms": {
"field": "sex.keyword",
"size": 10
},
"aggs":{
"avg_age":{
"avg": {
"field": "age"
}
}
}
}
}
}
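The filter, terms, and avg steps of that request can be traced by hand with a small Python sketch over the same four documents (an illustration only, with the data copied from the top of the post):

```python
from collections import defaultdict

DOCS = [
    {"name": "老亚瑟", "age": 31, "sex": "男"},
    {"name": "孙悟空", "age": 40, "sex": "男"},
    {"name": "安琪拉", "age": 16, "sex": "女"},
    {"name": "老夫子", "age": 100, "sex": "男"},
]

ages_by_sex = defaultdict(list)
for doc in DOCS:
    if doc["age"] <= 90:                           # filter: range age lte 90
        ages_by_sex[doc["sex"]].append(doc["age"])  # terms bucket per sex

# avg sub-aggregation inside each bucket.
avg_age = {sex: sum(ages) / len(ages) for sex, ages in ages_by_sex.items()}

print(avg_age)  # {'男': 35.5, '女': 16.0}
```

老夫子 (age 100) is removed by the filter before bucketing, so he does not pull up the male average.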
GET /pigg/_search
{
"query": {
"range": {
"age": {
"gte": 10,
"lte": 90
}
}
},
"collapse": {
"field": "sex.keyword",
"inner_hits":{
"name": "old_age",
"size": 1,
"sort": [{
"age": "desc"}]
}
},
"sort": [
{
"age": {
"order": "desc"
}
}
]
}
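The request above uses collapse to keep one hit per sex, namely the oldest person with age in [10, 90]. A minimal Python sketch of that idea follows; `collapse` is a hypothetical helper that keeps the first document seen per group after sorting, and it ignores inner_hits.

```python
DOCS = [
    {"name": "老亚瑟", "age": 31, "sex": "男"},
    {"name": "孙悟空", "age": 40, "sex": "男"},
    {"name": "安琪拉", "age": 16, "sex": "女"},
    {"name": "老夫子", "age": 100, "sex": "男"},
]

def collapse(docs, field, sort_key):
    # Walk the documents in descending sort order and keep the first one per group.
    seen, result = set(), []
    for doc in sorted(docs, key=sort_key, reverse=True):
        if doc[field] not in seen:
            seen.add(doc[field])
            result.append(doc)
    return result

in_range = [d for d in DOCS if 10 <= d["age"] <= 90]
top = collapse(in_range, "sex", lambda d: d["age"])

print([(d["sex"], d["name"], d["age"]) for d in top])
# [('男', '孙悟空', 40), ('女', '安琪拉', 16)]
```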
The _validate API checks whether a DSL query is valid; with the explain parameter it also returns how the query was interpreted.
GET /pigg/_validate/query?explain
{
"query": {
"terms": {
"weapon.keyword": [
"黑切",
"宗师之力"
]
}
}
}
It returns:
"valid" : true,
"explanations" : [
{
"index" : "pigg",
"valid" : true,
"explanation" : "weapon.keyword:(宗师之力 黑切)"
}
]
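All of the above barely scratches the surface of ES. ES has a lot of features, and learning them all in one go is impossible; you can only pick them up bit by bit at work and in your spare time.
Play less Honor of Kings and study more.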