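Elasticsearch Tutorial (6): Bucket Aggregations with Query DSL - Terms Aggregation

Elasticsearch Bucket Aggregations: Terms Aggregation

  • 1 Preparing the test data
    • 1.1 DSL for inserting the data
    • 1.2 The data as a table
  • 2 Terms Aggregation (grouping by field value)
    • 2.1 Group by dept and count the documents in each department
      • 2.1.1 SQL
      • 2.1.2 DSL
    • 2.2 Group by dept, count, and sort
      • 2.2.1 SQL
      • 2.2.2 DSL
    • 2.3 Filtering the aggregated result (HAVING)
      • 2.3.1 SQL
      • 2.3.2 Translating the SQL to DSL
    • 2.4 Restricting which documents are aggregated
      • 2.4.1 SQL
      • 2.4.2 DSL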

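There are many kinds of bucket aggregations, too many to cover in one short post. This one covers the Terms Aggregation, which groups documents by the values of a field, much like GROUP BY in MySQL, and is the most commonly used aggregation. Note that a text field cannot be used for a terms aggregation by default, because fielddata is disabled on text fields. Aggregate on a keyword field instead, such as the dept.keyword sub-field used in the examples in section 2.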

1 Preparing the test data

1.1 DSL for inserting the data

PUT /user/_doc/1
{
    "id":"1",
    "name":"张一",
    "dept":"web",
    "path":"dept1,man1",
    "birthday": "2008-11-16",
    "status":"1"
}

PUT /user/_doc/2
{
    "id":"2",
    "name":"张二",
    "dept":"web",
    "path":"dept1,man2",
    "birthday": "2008-12-17",
    "status":"1"
}

PUT /user/_doc/3
{
    "id":"3",
    "name":"张三",
    "dept":"web",
    "path":"dept1,man3",
    "birthday": "2009-10-10",
    "status":"1"
}

PUT /user/_doc/4
{
    "id":"4",
    "name":"李四",
    "dept":"java",
    "path":"dept2,man4",
    "birthday": "2012-01-01",
    "status":"1"
}

PUT /user/_doc/5
{
    "id":"5",
    "name":"王五",
    "dept":"java",
    "path":"dept2,man5",
    "birthday": "2012-07-01",
    "status":"0"
}

PUT /user/_doc/6
{
    "id":"6",
    "name":"王六",
    "dept":"data",
    "status":"0",
    "path":"dept3,man6",
    "birthday": "2009-12-12",
    "gender":"man"
}
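
No explicit mapping was created for the user index, so the string fields above are mapped dynamically. The sketch below shows one way to inspect the result. The excerpt in the comments assumes Elasticsearch's default dynamic mapping for strings and no index template or custom mapping, so the actual response may differ on your cluster.

```console
GET /user/_mapping

# Relevant excerpt for the dept field under default dynamic mapping
# (assumption: no index template or custom mapping overrides it):
#
#   "dept" : {
#     "type" : "text",
#     "fields" : {
#       "keyword" : {
#         "type" : "keyword",
#         "ignore_above" : 256
#       }
#     }
#   }
```

The text field dept is analyzed and not aggregatable by default, while its keyword sub-field dept.keyword holds the exact value. That is why the aggregations in section 2 use dept.keyword.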

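1.2 The data as a table

id | name | birthday   | gender | dept | path       | status
---+------+------------+--------+------+------------+-------
1  | 张一 | 2008-11-16 |        | web  | dept1,man1 | 1
2  | 张二 | 2008-12-17 |        | web  | dept1,man2 | 1
3  | 张三 | 2009-10-10 |        | web  | dept1,man3 | 1
4  | 李四 | 2012-01-01 |        | java | dept2,man4 | 1
5  | 王五 | 2012-07-01 |        | java | dept2,man5 | 0
6  | 王六 | 2009-12-12 | man    | data | dept3,man6 | 0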

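2 Terms Aggregation (grouping by field value)

A terms aggregation groups documents into buckets, one bucket per distinct value of the chosen field.

2.1 Group by dept and count the documents in each department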

2.1.1 SQL

#=======group by=======
POST /_xpack/sql?format=txt
{
  "query": "SELECT dept, COUNT(*) num FROM user GROUP BY dept" 
}

Result:

     dept      |      num      
---------------+---------------
data           |1              
java           |2              
web            |3              

2.1.2 DSL

GET /user/_doc/_search
{
    "size":0,
    "aggs":{
        "depts":{
            "terms":{
                "field":"dept.keyword"
            }
        }
    }
}

The aggregations part of the response:

  "aggregations" : {
    "depts" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "web",
          "doc_count" : 3
        },
        {
          "key" : "java",
          "doc_count" : 2
        },
        {
          "key" : "data",
          "doc_count" : 1
        }
      ]
    }
  }

2.2 Group by dept, count, and sort

2.2.1 SQL

POST /_xpack/sql?format=txt
{
    "query": "SELECT dept, COUNT(*) num FROM user GROUP BY dept ORDER BY num DESC" 
}

Result:

     dept      |      num      
---------------+---------------
web            |3              
java           |2              
data           |1              

2.2.2 DSL

GET /user/_doc/_search
{
    "size":0,
    "aggs":{
        "depts":{
            "terms":{
                "field":"dept.keyword",
                "order" : { "_count" : "desc" }
            }
        }
    }
}

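The request above sorts the buckets by document count in descending order, which is also the default order of a terms aggregation. Replacing "_count" with "_key" sorts by the bucket key (here, the dept value) instead. The aggregations part of the response: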

  "aggregations" : {
    "depts" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "web",
          "doc_count" : 3
        },
        {
          "key" : "java",
          "doc_count" : 2
        },
        {
          "key" : "data",
          "doc_count" : 1
        }
      ]
    }
  }
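
As a sketch of the "_key" variant, the request below sorts the buckets by the dept value in ascending order. It uses /user/_search rather than /user/_doc/_search, since the typeless endpoint is accepted across Elasticsearch versions. Everything else mirrors the request in 2.2.2.

```console
GET /user/_search
{
    "size":0,
    "aggs":{
        "depts":{
            "terms":{
                "field":"dept.keyword",
                "order" : { "_key" : "asc" }
            }
        }
    }
}
```

With the six test documents, the buckets should come back as data, java, web. Ascending order is used here because "_key" descending would give web, java, data, which for this data set is the same order as sorting by count.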

2.3 Filtering the aggregated result (HAVING)

2.3.1 SQL

POST /_xpack/sql?format=txt
{
    "query": "SELECT dept, COUNT(*) num  FROM user GROUP BY dept HAVING num > 1 ORDER BY num DESC" 
}

This keeps only the departments that have more than one document:

     dept      |      num      
---------------+---------------
web            |3              
java           |2              

2.3.2 Translating the SQL to DSL

POST /_xpack/sql/translate
{
    "query": "SELECT dept, COUNT(*) num  FROM user GROUP BY dept HAVING num > 1 ORDER BY num DESC" 
}
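
The translate API returns whatever DSL Elasticsearch SQL generates for the statement, which is not reproduced here. As a hand-written sketch of the same idea (not the translate output), a HAVING-style filter can be expressed with a bucket_selector pipeline aggregation nested under the terms aggregation. The aggregation names depts and having_num are illustrative, and the typeless /user/_search endpoint is used so the request does not depend on mapping types.

```console
GET /user/_search
{
    "size":0,
    "aggs":{
        "depts":{
            "terms":{
                "field":"dept.keyword",
                "order":{ "_count":"desc" }
            },
            "aggs":{
                "having_num":{
                    "bucket_selector":{
                        "buckets_path":{ "num":"_count" },
                        "script":"params.num > 1"
                    }
                }
            }
        }
    }
}
```

Here buckets_path binds the script variable num to each bucket's document count (_count), and any bucket for which the script returns false is dropped. For the test data this should leave web (3) and java (2), matching the SQL result in 2.3.1.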

2.4 Restricting which documents are aggregated

2.4.1 SQL

POST /_xpack/sql?format=txt
{
    "query": "SELECT dept, COUNT(*) num  FROM user  WHERE status = 1 GROUP BY dept ORDER BY num DESC" 
}

2.4.2 DSL

GET /user/_doc/_search
{
    "size":0,
    "query":{
        "term":{
            "status":1
        }
    },
    "aggs":{
        "dept_count":{
            "terms":{
                "field":"dept.keyword",
                "order":{
                    "_count":"desc"
                }
            }
        }
    }
}
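
In the test data, status was indexed as a string ("1" or "0"), so a variant that matches the exact stored value through the status.keyword sub-field may be easier to reason about. The sketch below assumes default dynamic mapping, so that status.keyword exists, and wraps the condition in a bool filter because an aggregation does not need relevance scores.

```console
GET /user/_search
{
    "size":0,
    "query":{
        "bool":{
            "filter":{
                "term":{
                    "status.keyword":"1"
                }
            }
        }
    },
    "aggs":{
        "dept_count":{
            "terms":{
                "field":"dept.keyword",
                "order":{
                    "_count":"desc"
                }
            }
        }
    }
}
```

Documents 1 to 4 have status "1", so the expected buckets are web with 3 documents and java with 1. The data bucket does not appear because its only document has status "0".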
