Elasticsearch Notes (11): A Detailed Summary of ES term, terms, prefix Searches and Aggregation Queries

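  • 1 Background
  • 2 Prepare the Data
  • 3 Cold Dishes: Can't Do These? Then You're Toast
    • 1 term
    • 2 terms
    • 3 prefix
    • 4 wildcard
    • 5 range
    • 6 exists
  • 4 Braised Prawns: Combine the Conditions and Braise Them in One Pot
    • 1 bool
    • 2 must
    • 3 must_not
    • 4 should
    • 5 filter
  • 5 Afanti Lamb Skewers: Season the Query
    • 1 Filtering fields with _source
    • 2 Sorting with sort
    • 3 Pagination with from + size
  • 6 Eggplant with Minced Pork: Aggs, Waiter, Come Count the Minced Pork
    • 1 count
    • 2 terms aggregation
    • 3 having: aggregate first, then filter
    • 4 Filter first, then aggregate
  • 7 Fan Bones: collapse, I Hear You Want to Fold Things Up
    • 1 collapse query
  • 8 Shredded Pork with Green Peppers: Explain, Waiter, Why Is It All Green Peppers?
    • 1 explain
  • 9 Summary in One Sentence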

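1 Background

I first ran into ES at work in 2017, yet I still feel I have not really gotten started. I have mostly used the ES Java API for simple business logic and never read the ES documentation carefully, so my understanding is still shallow. On the principle that "teaching is the best way to learn," I have wanted to organize the commonly used ES query APIs for a while. That urge got stronger after I saw the breakdown of ES developers below, because I do not fit any of the categories (me = old + low salary + little hair).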
[Image: breakdown of ES developers]

2 Prepare the Data

PUT /pigg/_doc/1
{
  "name": "老亚瑟",
  "age": 31,
  "sex": "男",
  "word": "死亡骑士,不是死掉的骑士",
  "weapon": ["黑切", "冰痕之握", "反伤刺甲", "闪电匕首", "破军"]
}

PUT /pigg/_doc/2
{
  "name": "孙悟空",
  "age": 40,
  "sex": "男",
  "word": "我就是吉吉国王",
  "weapon": ["黑切", "冰痕之握", "无尽战刃", "宗师之力"]
}

PUT /pigg/_doc/3
{
  "name": "安琪拉",
  "age": 16,
  "sex": "女",
  "word": "我就是小萝莉",
  "weapon": []
}

PUT /pigg/_doc/4
{
  "name": "老夫子",
  "age": 100,
  "sex": "男",
  "word": "我要定住你"
}

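3 Cold Dishes: Can't Do These? Then You're Toast

If you are not familiar with ES, first read Elasticsearch Notes (9): term, terms, and exists query examples.

1 term

Find the documents where name = "老亚瑟":

GET /pigg/_search
{
  "query": {
    "term": {
      "name": {
        "value": "老亚瑟"
      }
    }
  },
  "_source": ["name"]
}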

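At this point we find that the result is empty:

{
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

We never defined a mapping ourselves, so name is a text field, and ES's default analysis splits "老亚瑟" into the three single-character tokens "老", "亚", and "瑟". A term query does not analyze its input; it looks up the exact term "老亚瑟", which is not one of the indexed tokens, so nothing is found.
In effect, term on a text field means "contains this exact token." Find the documents whose name contains "老":

GET /pigg/_search
{
  "query": {
    "term": {
      "name": {
        "value": "老"
      }
    }
  },
  "_source": ["name"]
}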

The result is below; both "老夫子" and "老亚瑟" are matched.

    "hits" : [
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 0.6931472,
        "_source" : {
          "name" : "老夫子"
        }
      },
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "老亚瑟"
        }
      }
    ]

With the default dynamic mapping, name also gets a keyword sub-field, name.keyword, which is not analyzed.

GET /pigg/_search
{
  "query": {
    "term": {
      "name.keyword": {
        "value": "老亚瑟"
      }
    }
  },
  "_source": ["name"]
}

The result is below; querying the keyword field gives an exact match.

    "hits" : [
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "name" : "老亚瑟"
        }
      }
    ]
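
To make the text versus keyword difference concrete, here is a small Python model of the three term queries above. It is not Elasticsearch code, only a sketch under the assumption stated earlier that the default analysis yields one token per Chinese character; the helper names (text_tokens, keyword_tokens, term_query) are made up for illustration.

```python
# Toy model of a term query against a text field and its keyword sub-field.
# Assumption: the analyzed text field holds one token per Chinese character.

def text_tokens(value):
    """Tokens of the analyzed field `name`: one per character."""
    return set(value)

def keyword_tokens(value):
    """Tokens of `name.keyword`: the whole value as a single token."""
    return {value}

def term_query(docs, tokenizer, term):
    """Return ids of documents whose token set contains the exact term."""
    return sorted(doc_id for doc_id, value in docs.items()
                  if term in tokenizer(value))

names = {1: "老亚瑟", 2: "孙悟空", 3: "安琪拉", 4: "老夫子"}

full_name_on_text = term_query(names, text_tokens, "老亚瑟")
single_char_on_text = term_query(names, text_tokens, "老")
full_name_on_keyword = term_query(names, keyword_tokens, "老亚瑟")

print(full_name_on_text, single_char_on_text, full_name_on_keyword)
```

The term itself is never split; only the indexed side differs, which is why the same input behaves so differently on name and name.keyword.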

2 terms

terms matches a document as soon as any one of the listed values hits. Find the people who own 黑切 or 宗师之力:

GET /pigg/_search
{
  "query": {
    "terms": {
      "weapon.keyword": [
        "黑切",
        "宗师之力"
      ]
    }
  },
  "_source": ["name", "weapon"]
}

The result is:

    "hits" : [
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "weapon" : [
            "黑切",
            "冰痕之握",
            "无尽战刃",
            "宗师之力"
          ],
          "name" : "孙悟空"
        }
      },
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "weapon" : [
            "黑切",
            "冰痕之握",
            "反伤刺甲",
            "闪电匕首",
            "破军"
          ],
          "name" : "老亚瑟"
        }
      }
    ]

3 prefix

The prefix query is very common at work; it is like LIKE 'abc%' in MySQL.
Find the people whose name starts with "老":

GET /pigg/_search
{
  "query": {
    "prefix": {
      "name.keyword": {
        "value": "老"
      }
    }
  },
  "_source": ["name"]
}

The result is:

    "hits" : [
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "name" : "老夫子"
        }
      },
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "老亚瑟"
        }
      }
    ]

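4 wildcard

The wildcard query is like a MySQL LIKE with wildcards on both sides. It is slow, especially with a leading *, so it is generally avoided.
Find the people whose name contains "亚":

GET /pigg/_search
{
  "query": {
    "wildcard": {
      "name.keyword": {
        "value": "*亚*"
      }
    }
  },
  "_source": ["name"]
}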
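
As a rough mental model, prefix and wildcard on a keyword field behave like string tests on the whole stored value. The Python sketch below uses str.startswith and fnmatch.fnmatchcase as stand-ins; that is my own analogy rather than how ES implements these queries, and fnmatch also understands [...] patterns that the ES wildcard query does not.

```python
from fnmatch import fnmatchcase

# name.keyword values from the sample data, keyed by document id.
names = {1: "老亚瑟", 2: "孙悟空", 3: "安琪拉", 4: "老夫子"}

# prefix "老": the whole keyword value must start with the given string.
prefix_hits = sorted(i for i, n in names.items() if n.startswith("老"))

# wildcard "*亚*": '*' stands for any run of characters, as in LIKE '%亚%'.
wildcard_hits = sorted(i for i, n in names.items() if fnmatchcase(n, "*亚*"))

print(prefix_hits, wildcard_hits)
```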

5 range

range is a range query. Find the people whose age is in [10, 30]:

GET /pigg/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 10,
        "lte": 30
      }
    }
  },
  "_source": ["name"]
}

The result is:

    "hits" : [
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "name" : "安琪拉"
        }
      }
    ]

6 exists

Find the people whose weapon field has a value:

GET /pigg/_search
{
  "query": {
    "exists": {
      "field": "weapon"
    }
  },
  "_source": ["name"]
}

Find the people whose weapon field has no value:

GET /pigg/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "exists": {
            "field": "weapon"
          }
        }
      ]
    }
  },
  "_source": ["name"]
}

The result is below. 老夫子 has no weapon field at all, while 安琪拉 has weapon = []; both count as having no value.

    "hits" : [
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "name" : "老夫子"
        }
      },
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "name" : "安琪拉"
        }
      }
    ]
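
The rule behind that result can be written down in a few lines of Python. This is only a model of the exists semantics (a field "exists" when it holds at least one non-null value), with a hypothetical helper named exists; it is not ES code.

```python
def exists(doc, field):
    """Model of the exists query: at least one non-null value is present."""
    value = doc.get(field)                      # a missing field behaves like null
    values = value if isinstance(value, list) else [value]
    return any(v is not None for v in values)   # [] has no values at all

docs = {
    1: {"name": "老亚瑟", "weapon": ["黑切", "冰痕之握", "反伤刺甲", "闪电匕首", "破军"]},
    2: {"name": "孙悟空", "weapon": ["黑切", "冰痕之握", "无尽战刃", "宗师之力"]},
    3: {"name": "安琪拉", "weapon": []},
    4: {"name": "老夫子"},
}

with_weapon = sorted(i for i, d in docs.items() if exists(d, "weapon"))
without_weapon = sorted(i for i, d in docs.items() if not exists(d, "weapon"))

print(with_weapon, without_weapon)
```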

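4 Braised Prawns: Combine the Conditions and Braise Them in One Pot

1 bool

bool is a compound filter: it accepts several other filters as parameters and combines them into all kinds of Boolean (logical) combinations.
Its format is:

{
   "bool" : {
      "must" :     [],
      "should" :   [],
      "must_not" : [],
      "filter" :   []
   }
}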

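2 must

Find the people whose name starts with "老" and whose age >= 90. The prefix runs against name.keyword so that it tests the start of the whole name rather than the start of a single-character token:

GET /pigg/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "prefix": {
            "name.keyword": {
              "value": "老"
            }
          }
        },
        {
          "range": {
            "age": {
              "gte": 90
            }
          }
        }
      ]
    }
  },
  "_source": ["name", "age"]
}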

The result is below. After all, how could our King Arthur possibly be that old?

    "hits" : [
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 2.0,
        "_source" : {
          "name" : "老夫子",
          "age" : 100
        }
      }
    ]

3 must_not

must_not is the opposite of must; it means NOT. Find the people who bought weapons but did not buy 无尽战刃:

GET /pigg/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "term": {
            "weapon.keyword": {
              "value": "无尽战刃"
            }
          }
        }
      ],
      "must": [
        {
          "exists": {
            "field": "weapon"
          }
        }
      ]
    }
  },
  "_source": ["name", "weapon"]
}

4 should

should means OR.
Find the people who are female, or whose word matches "吉吉国王":

GET /pigg/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "sex": {
              "value": "女"
            }
          }
        },
        {
          "match": {
            "word": "吉吉国王"
          }
        }
      ]
    }
  },
  "_source": ["name", "sex", "word"]
}

The result is:

    "hits" : [
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 3.1186123,
        "_source" : {
          "sex" : "男",
          "name" : "孙悟空",
          "word" : "我就是吉吉国王"
        }
      },
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.2876821,
        "_source" : {
          "sex" : "女",
          "name" : "安琪拉",
          "word" : "我就是小萝莉"
        }
      }
    ]

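When should sits at the same level as must or filter, it does not change which documents match; it only affects the relevance score. (A bool query that has should but no must or filter still requires at least one should clause to match.)

GET /pigg/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "sex.keyword": {
              "value": "男"
            }
          }
        }
      ],
      "should": [
        {
          "range": {
            "age": {
              "gte": 90
            }
          }
        }
      ]
    }
  },
  "_source": ["name", "sex", "age"]
}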

The result is below. Everyone is male, but 老夫子 has age >= 90, so his _score of 1.1823215 is higher than that of the other two.

    "hits" : [
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.1823215,
        "_source" : {
          "sex" : "男",
          "name" : "老夫子",
          "age" : 100
        }
      },
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "sex" : "男",
          "name" : "老亚瑟",
          "age" : 31
        }
      },
      {
        "_index" : "pigg",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.18232156,
        "_source" : {
          "sex" : "男",
          "name" : "孙悟空",
          "age" : 40
        }
      }
    ]
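
The matching side of bool (ignoring scores) can be modeled in Python as below. It is a sketch, not ES code: each clause is a plain predicate, bool_matches and search are hypothetical helpers, and the match query is approximated as "any character of the query text appears in the field." The default for the number of required should clauses follows the rule that it is 1 when should appears without must or filter, and 0 otherwise.

```python
def bool_matches(doc, must=(), should=(), must_not=(), filter_=()):
    """Decide whether a document matches a bool query (scoring is ignored)."""
    if not all(clause(doc) for clause in (*must, *filter_)):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    # Default minimum_should_match: 1 if should stands alone, otherwise 0.
    required = 1 if should and not must and not filter_ else 0
    return sum(1 for clause in should if clause(doc)) >= required

docs = {
    1: {"sex": "男", "age": 31, "word": "死亡骑士,不是死掉的骑士"},
    2: {"sex": "男", "age": 40, "word": "我就是吉吉国王"},
    3: {"sex": "女", "age": 16, "word": "我就是小萝莉"},
    4: {"sex": "男", "age": 100, "word": "我要定住你"},
}

def is_female(d): return d["sex"] == "女"
def is_male(d): return d["sex"] == "男"
def is_90_plus(d): return d["age"] >= 90
def mentions_king(d): return any(ch in d["word"] for ch in "吉吉国王")

def search(**clauses):
    return sorted(i for i, d in docs.items() if bool_matches(d, **clauses))

should_only = search(should=[is_female, mentions_king])
must_with_should = search(must=[is_male], should=[is_90_plus])

print(should_only, must_with_should)
```

In the second search the should clause filters nothing out: all three men match, and in real ES the clause would only lift the score of 老夫子.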

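5 filter

filter is a filtering query: it does not compute scores, so it is efficient. There are plenty of articles about filter online, so I won't ramble on here.

GET /pigg/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "sex.keyword": "男"
        }
      }
    }
  },
  "_source": ["name", "sex"]
}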

5 Afanti Lamb Skewers: Season the Query

1 Filtering fields with _source

# Return only the two fields "name" and "sex"
GET /pigg/_search
{
  "query": {
    "match_all": {}
  },
  "_source": ["name", "sex"]
}

# Return only the fields that start with w
GET /pigg/_search
{
  "query": {
    "match_all": {}
  },
  "_source": ["w*"]
}

# Return only the fields that start with w and do not end with n
GET /pigg/_search
{
  "query": {
    "match_all": {}
  },
  "_source": {
    "includes": "w*",
    "excludes": "*n"
  }
}
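
To see which fields survive the includes and excludes patterns, here is a small Python model. The filter_source helper is made up for illustration, and fnmatch is only an approximation of the pattern syntax; with the sample fields, "w*" keeps word and weapon, and "*n" then drops weapon.

```python
from fnmatch import fnmatchcase

def filter_source(doc, includes=("*",), excludes=()):
    """Keep fields that match an include pattern and no exclude pattern."""
    return {
        key: value
        for key, value in doc.items()
        if any(fnmatchcase(key, p) for p in includes)
        and not any(fnmatchcase(key, p) for p in excludes)
    }

doc = {
    "name": "孙悟空",
    "age": 40,
    "sex": "男",
    "word": "我就是吉吉国王",
    "weapon": ["黑切", "冰痕之握", "无尽战刃", "宗师之力"],
}

only_w = filter_source(doc, includes=["w*"])
w_not_n = filter_source(doc, includes=["w*"], excludes=["*n"])

print(sorted(only_w), sorted(w_not_n))
```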

2 Sorting with sort

GET /pigg/_search
{
  "sort": [
    {
      "sex.keyword": {
        "order": "desc"
      }
    },
    {
      "age": {
        "order": "desc"
      }
    }
  ],
  "_source": ["name", "sex", "age"]
}

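3 Pagination with from + size

Pagination is used all the time; from starts at 0. With a large data set you run into the deep pagination problem. Some companies like to raise index.max_result_window (default 10000) to a very large value. Well... if the query returns, whatever makes you happy.
If the data set is large and you need to read and process it page by page, consider scroll. There are plenty of articles about it online, so I won't ramble on.

GET /pigg/_search
{
  "from": 0,
  "size": 2,
  "sort": [
    {
      "sex.keyword": {
        "order": "desc"
      }
    }
  ],
  "_source": ["name", "sex"]
}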
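
Why deep pages hurt can be shown with simple arithmetic: every shard has to collect its own top from + size hits, and the coordinating node then merges all of them just to return size documents. The sketch below is a back-of-the-envelope model; the function name and the shard count of 5 are assumptions for illustration, and 10000 is the default index.max_result_window.

```python
MAX_RESULT_WINDOW = 10_000  # default value of index.max_result_window

def candidates_collected(from_, size, shards):
    """Hits the coordinating node has to merge for one page of results."""
    if from_ + size > MAX_RESULT_WINDOW:
        raise ValueError("from + size exceeds index.max_result_window")
    return shards * (from_ + size)

first_page = candidates_collected(from_=0, size=2, shards=5)
deep_page = candidates_collected(from_=9998, size=2, shards=5)

print(first_page, deep_page)
```

Both calls return two documents to the user, but the deep page makes the cluster sort thousands of times more candidates, which is why raising the window is a poor substitute for scroll.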

6 Eggplant with Minced Pork: Aggs, Waiter, Come Count the Minced Pork

1 count

Count the documents that satisfy a condition:

GET /pigg/_count
{
  "query": {
    "term": {
      "sex.keyword": {
        "value": "男"
      }
    }
  }
}

2 terms aggregation

A terms aggregation is like GROUP BY in SQL:

POST /_xpack/sql?format=txt
{
  "query": "SELECT sex, COUNT(*) num FROM pigg GROUP BY sex ORDER BY num desc"
}

Count how many people use each weapon, sorted by count in ascending order:

GET /pigg/_search
{
  "aggs": {
    "terms_by_weapon": {
      "terms": {
        "field": "weapon.keyword",
        "size": 10,
        "order": {
          "_count": "asc"
        }
      }
    }
  }
}

The result is:

      "buckets" : [
        {
          "key" : "反伤刺甲",
          "doc_count" : 1
        },
        {
          "key" : "宗师之力",
          "doc_count" : 1
        },
        {
          "key" : "无尽战刃",
          "doc_count" : 1
        },
        {
          "key" : "破军",
          "doc_count" : 1
        },
        {
          "key" : "闪电匕首",
          "doc_count" : 1
        },
        {
          "key" : "冰痕之握",
          "doc_count" : 2
        },
        {
          "key" : "黑切",
          "doc_count" : 2
        }
      ]

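3 having: aggregate first, then filter

Find the weapons used by at least 2 people. A bucket_selector sub-aggregation plays the role of HAVING; the script body goes under source (the older inline key is deprecated):

GET /pigg/_search
{
    "size": 0,
    "aggs": {
        "terms_by_weapon": {
            "terms": {
                "field": "weapon.keyword",
                "size": 10
            },
            "aggs": {
                "having": {
                    "bucket_selector": {
                        "buckets_path": {
                            "weaponCount": "_count"
                        },
                        "script": {
                            "lang": "expression",
                            "source": "weaponCount >= 2"
                        }
                    }
                }
            }
        }
    }
}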

The result is:

      "buckets" : [
        {
          "key" : "冰痕之握",
          "doc_count" : 2
        },
        {
          "key" : "黑切",
          "doc_count" : 2
        }
      ]
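
The same "group, count, then keep the big groups" logic is easy to check by hand in Python against the sample data. This is just a model of what the terms aggregation plus the having step compute, not ES code; I assume buckets are ordered by count descending and then by key, which matches the output shown above.

```python
from collections import Counter

# weapon values per document; document 4 has no weapon field at all.
weapons_by_doc = {
    1: ["黑切", "冰痕之握", "反伤刺甲", "闪电匕首", "破军"],
    2: ["黑切", "冰痕之握", "无尽战刃", "宗师之力"],
    3: [],
}

# terms aggregation: count the documents that contain each distinct value.
doc_counts = Counter(w for ws in weapons_by_doc.values() for w in set(ws))

# Assumed bucket order: doc_count descending, then key ascending.
buckets = sorted(doc_counts.items(), key=lambda kv: (-kv[1], kv[0]))

# having step: keep only the buckets whose count is at least 2.
having = [(key, count) for key, count in buckets if count >= 2]

print(having)
```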

4 Filter first, then aggregate

First restrict to age <= 90, then group by sex, and finally compute the average age for each sex:

GET /pigg/_search
{
  "size": 5,
  "query": {
    "bool": {
      "filter": {
        "range": {
          "age": {
            "lte": 90
          }
        }
      }
    }
  },
  "_source": ["name", "sex", "age"],
  "aggs": {
    "terms_by_sex": {
      "terms": {
        "field": "sex.keyword",
        "size": 10
      },
      "aggs": {
        "avg_age": {
          "avg": {
            "field": "age"
          }
        }
      }
    }
  }
}
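
The post does not show the output of this request, so here is a quick Python check of what the three steps compute on the sample data. It is a hand calculation in code form, not ES output, and the variable names are my own.

```python
from collections import defaultdict

people = [
    ("老亚瑟", "男", 31),
    ("孙悟空", "男", 40),
    ("安琪拉", "女", 16),
    ("老夫子", "男", 100),
]

ages_by_sex = defaultdict(list)
for name, sex, age in people:
    if age <= 90:                      # the query filter runs first
        ages_by_sex[sex].append(age)   # then the terms bucket per sex

# the avg sub-aggregation inside each bucket
avg_age = {sex: sum(ages) / len(ages) for sex, ages in ages_by_sex.items()}

print(avg_age)
```

老夫子 is removed by the filter before any bucket is built, so he does not pull up the male average.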

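7 Fan Bones: collapse, I Hear You Want to Fold Things Up

1 collapse query

collapse folds the hits so that only the top hit for each distinct value of a field is returned. Here we take the people aged 10 to 90, keep one hit per sex, and use inner_hits to return the oldest person in each group:

GET /pigg/_search
{
  "query": {
    "range": {
      "age": {
        "gte": 10,
        "lte": 90
      }
    }
  },
  "collapse": {
    "field": "sex.keyword",
    "inner_hits": {
      "name": "old_age",
      "size": 1,
      "sort": [{"age": "desc"}]
    }
  },
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ]
}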
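
Since no result is shown for this request, here is a Python model of what the collapse should leave behind on the sample data: filter by age, sort by age descending, then keep the first hit seen for each sex. It is a sketch of the idea, not ES code or ES output.

```python
people = [
    {"name": "老亚瑟", "sex": "男", "age": 31},
    {"name": "孙悟空", "sex": "男", "age": 40},
    {"name": "安琪拉", "sex": "女", "age": 16},
    {"name": "老夫子", "sex": "男", "age": 100},
]

# query: 10 <= age <= 90, then sort by age descending
in_range = [p for p in people if 10 <= p["age"] <= 90]
ranked = sorted(in_range, key=lambda p: p["age"], reverse=True)

# collapse on sex: the first (oldest) hit per distinct value wins
collapsed = {}
for person in ranked:
    collapsed.setdefault(person["sex"], person)

top_hits = [(p["name"], p["sex"], p["age"]) for p in collapsed.values()]

print(top_hits)
```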

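8 Shredded Pork with Green Peppers: Explain, Waiter, Why Is It All Green Peppers?

1 explain

_validate/query checks whether a DSL statement is valid; with the explain parameter it also returns an explanation of the query it was given.

GET /pigg/_validate/query?explain
{
  "query": {
    "terms": {
      "weapon.keyword": [
        "黑切",
        "宗师之力"
      ]
    }
  }
}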

The result is:

  "valid" : true,
  "explanations" : [
    {
      "index" : "pigg",
      "valid" : true,
      "explanation" : "weapon.keyword:(宗师之力 黑切)"
    }
  ]

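9 Summary in One Sentence

Everything above only scratches the surface of ES. ES has a lot of features, and learning them all in one go is impossible; you can only build up knowledge bit by bit at work and in your spare time.
Play a little less Honor of Kings and study more.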
