elasticsearch-dsl聚合-1

接续上篇,本篇介绍elasticsearch聚合查询,使用python库elasticsearch-dsl进行聚合查询操作。

7.3、聚合查询

高阶概念

  • Buckets(/集合):满足特定条件的文档的集合
  • Metrics(指标):对桶内的文档进行统计计算(例如最小值,求和,最大值等)

    • 新建一张测试表
       1 PUT cars
       2 {
       3   "mappings": {
       4     "transactions":{
       5       "properties": {
       6         "price":{
       7           "type": "integer"
       8         },
       9         "color":{
      10           "type": "text",
      11           "fielddata": true
      12         },
      13         "make":{
      14           "type": "text",
      15           "fielddata": true
      16         },
      17         "sold":{
      18           "type": "date",
      19           "format": "yyyy-MM-dd"
      20         }
      21       }
      22     }
      23   }
      24 }

      插入数据

       1 POST /cars/transactions/_bulk
       2 { "index": {"_index": "cars", "_type": "transactions"}} 
       3 { "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }
       4 { "index": {"_index": "cars", "_type": "transactions"}} 
       5 { "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" } 
       6 { "index": {"_index": "cars", "_type": "transactions"}} 
       7 { "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
       8 { "index": {"_index": "cars", "_type": "transactions"}} 
       9 { "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" } 
      10 { "index": {"_index": "cars", "_type": "transactions"}} 
      11 { "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }
      12 { "index": {"_index": "cars", "_type": "transactions"}} 
      13 { "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" } 
      14 { "index": {"_index": "cars", "_type": "transactions"}} 
      15 { "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" } 
      16 { "index": {"_index": "cars", "_type": "transactions"}} 
      17 { "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }
    • 查询哪个颜色的汽车销量最好(按颜色分类)
       1 GET cars/transactions/_search
       2 {
       3   "size": 0,
       4   "aggs": {
       5     "popular_colors": {
       6       "terms": {
       7         "field": "color"
       8       }
       9     }
      10   }
      11 }
      1 s = Search(index='cars')
      2 a = A("terms", field="color")
      3 s.aggs.bucket("popular_color", a)
      4 response = s.execute()

      或者

      1 s.aggs.bucket("popular_color", "terms", field="color")
    • 查询每种颜色车的平均价格
       1 GET cars/transactions/_search
       2 {
       3   "size": 0,
       4   "aggs": {
       5     "colors": {
       6       "terms": {
       7         "field": "color"
       8       },
       9       "aggs": {
      10         "avg_price": {
      11           "avg": {
      12             "field": "price"
      13           }
      14         }
      15       }
      16     }
      17   }
      18 }
      1 s = Search(index='cars')
      2 a1 = A("terms", field="color")
      3 a2 = A("avg", field="price")
      4 s.aggs.bucket("colors", a1).metric("avg_price", a2)
      5 response = s.execute()

      或者

      1 s = Search(index='cars')
      2 s.aggs.bucket("colors", "terms", field="color").metric("avg_price", "avg", field="price")
      3 response = s.execute()
    • 先按颜色分,再按品牌分,再求每种品牌的均价
       1 GET cars/transactions/_search
       2 {
       3   "size": 0,
       4   "aggs": {
       5     "colors": {
       6       "terms": {
       7         "field": "color"
       8       },
       9       "aggs": {
      10         "make": {
      11           "terms": {
      12             "field": "make"
      13           },
      14           "aggs": {
      15             "avg_price": {
      16               "avg": {
      17                 "field": "price"
      18               }
      19             }
      20           }
      21         }
      22       }
      23     }
      24   }
      25 }
      1 s = Search(index='cars')
      2 s.aggs.bucket("colors", "terms", field="color")
      3 s.aggs["colors"].bucket("make", "terms", field="make")
      4 s.aggs["colors"].aggs["make"].metric("avg_price", "avg", field="price")
      5 response = s.execute()
    • 先按颜色分,再按品牌分,再求每种品牌的最高和最低价
       1 GET cars/transactions/_search
       2 {
       3   "size": 0,
       4   "aggs": {
       5     "colors": {
       6       "terms": {
       7         "field": "color"
       8       },
       9       "aggs": {
      10         "make": {
      11           "terms": {
      12             "field": "make"
      13           },
      14           "aggs": {
      15             "min_price": {
      16               "min": {
      17                 "field": "price"
      18               }
      19             },
      20             "max_price": {
      21               "max": {
      22                 "field": "price"
      23               }
      24             }
      25           }
      26         }
      27       }
      28     }
      29   }
      30 }
      1 s = Search(index='cars')
      2 s.aggs.bucket("colors", "terms", field="color")
      3 s.aggs["colors"].bucket("make", "terms", field="make")
      4 s.aggs["colors"].aggs["make"].metric("min_price", "min", field="price")
      5 s.aggs["colors"].aggs["make"].metric("max_price", "max", field="price")
      6 response = s.execute()
    •  未完待续...

 

转载于:https://www.cnblogs.com/dowi/p/10100878.html

你可能感兴趣的:(python,大数据)