Elasticsearch Aggregation Analysis: Simple Aggregation (count)

Two key concepts

Consider the following data:

city      name
Beijing   Xiao Li
Beijing   Xiao Wang
Beijing   Xiao Zhang
Shanghai  Xiao Chen
Shanghai  Xiao Zheng
Shanghai  Xiao Zhao

Bucket

Partitioning the data by city gives two buckets: one for Beijing and one for Shanghai.

In other words, a bucket is a group.

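Metric

Once the buckets have been formed, we can compute a value over each one, such as a sum, a maximum, or a minimum.

Each of these per-bucket computations (sum, maximum, minimum) is called a metric.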
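The bucket and metric steps can be sketched outside Elasticsearch with plain Java streams. This is only an in-memory illustration of the idea, not how Elasticsearch computes aggregations internally; the class name BucketMetricSketch and the method countByCity are made up for this example, and the rows mirror the sample table.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper class, used only to illustrate bucket + metric.
public class BucketMetricSketch {

    // Each row is {city, name}, mirroring the sample table.
    public static final List<String[]> ROWS = Arrays.asList(
            new String[]{"Beijing", "Xiao Li"},
            new String[]{"Beijing", "Xiao Wang"},
            new String[]{"Beijing", "Xiao Zhang"},
            new String[]{"Shanghai", "Xiao Chen"},
            new String[]{"Shanghai", "Xiao Zheng"},
            new String[]{"Shanghai", "Xiao Zhao"});

    // Bucket step: group the rows by city (row[0]).
    // Metric step: count the rows that landed in each bucket.
    public static Map<String, Long> countByCity(List<String[]> rows) {
        return rows.stream()
                .collect(Collectors.groupingBy(
                        row -> row[0],          // bucket key
                        TreeMap::new,           // keep keys sorted for stable output
                        Collectors.counting())); // metric: number of rows per bucket
    }

    public static void main(String[] args) {
        countByCity(ROWS).forEach((city, count) -> System.out.println(city + ": " + count));
    }
}
```

Swapping Collectors.counting() for another downstream collector (for example a sum or a maximum over a numeric column) changes the metric while the buckets stay the same.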

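Data preparation

Create the tvs (orders) index. The mapping declares a type named sales, so the requests here assume an Elasticsearch version that still supports mapping types (6.x or earlier).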

PUT /tvs
{
  "mappings": {
    "sales":{
      "properties": {
        "price":{
          "type": "long"
        },
        "color":{
          "type": "keyword"
        },
        "brand":{
          "type": "keyword"
        },
        "saleDate":{
          "type": "date"
        }
      }
    }
  }
}

Insert data

POST /tvs/sales/_bulk
{"index":{}}
{"price":1000,"color":"红色","brand":"长虹","saleDate":"2017-10-28"}
{"index":{}}
{"price":1000,"color":"红色","brand":"长虹","saleDate":"2017-11-28"}
{"index":{}}
{"price":2000,"color":"绿色","brand":"小米","saleDate":"2017-10-28"}
{"index":{}}
{"price":2000,"color":"红色","brand":"小米","saleDate":"2017-11-28"}
{"index":{}}
{"price":3000,"color":"绿色","brand":"TCL","saleDate":"2017-10-28"}
{"index":{}}
{"price":3000,"color":"红色","brand":"TCL","saleDate":"2017-11-28"}
{"index":{}}
{"price":4000,"color":"绿色","brand":"三星","saleDate":"2017-10-28"}
{"index":{}}
{"price":4000,"color":"红色","brand":"三星","saleDate":"2017-11-28"}

Requirement

Which color of appliance sells the most units?

DSL implementation

GET /tvs/sales/_search
{
  "size": 0,
  "aggs": {
    "popular_colors": {
      "terms": {
        "field": "color"
      }
    }
  }
}

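Note: doc_count is a metric that Elasticsearch computes by default for every bucket operation.

size: set to 0 to return only the aggregation results, without the original documents the aggregation ran over;
aggs: fixed syntax; runs a grouping aggregation over the data;
popular_colors: the name of the aggregation, chosen by you;
terms: group by the values of a field;
field: the field to group by;
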

The response is as follows

{
  "took": 21,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 8,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "popular_colors": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "红色",
          "doc_count": 5
        },
        {
          "key": "绿色",
          "doc_count": 3
        }
      ]
    }
  }
}

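hits.hits: the original documents the aggregation ran over; empty here because size=0 above;
aggregations: the aggregation results;
popular_colors: the name given to this aggregation in the request;
buckets: the buckets produced by the grouping condition;

Default sort order: buckets are sorted by doc_count in descending order.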
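To make the bucket counts and their ordering concrete, here is a rough in-memory sketch in plain Java that counts the color values of the eight sample documents and sorts the buckets by count, descending. It only imitates the shape of the result; the class name TermsOrderSketch and the method termsByDocCountDesc are hypothetical, and no Elasticsearch API is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class: imitates a terms aggregation on "color" in memory.
public class TermsOrderSketch {

    // The color field of the eight sample documents, in insertion order
    // (红色 = red, 绿色 = green).
    public static final List<String> COLORS = Arrays.asList(
            "红色", "红色", "绿色", "红色", "绿色", "红色", "绿色", "红色");

    // Groups the values into buckets, counts each bucket, then sorts by count descending.
    public static List<Map.Entry<String, Long>> termsByDocCountDesc(List<String> values) {
        Map<String, Long> counts = values.stream()
                .collect(Collectors.groupingBy(v -> v, LinkedHashMap::new, Collectors.counting()));
        List<Map.Entry<String, Long>> buckets = new ArrayList<>(counts.entrySet());
        // Largest count first; equal counts keep first-seen order (a choice made for this sketch).
        buckets.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return buckets;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> bucket : termsByDocCountDesc(COLORS)) {
            System.out.println(bucket.getKey() + ":" + bucket.getValue());
        }
    }
}
```

With the sample data this yields 红色 with a count of 5 followed by 绿色 with a count of 3, matching the buckets in the response.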

Java High Level REST Client implementation

package com.elasticsearch.aggregation;


import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.junit.Test;

import java.io.IOException;
import java.util.List;

public class aggregation {

    /**
     * Which color of appliance sells the most units?
     */
    @Test
    public void testAggregation01(){

        RestHighLevelClient client = getClient();

        SearchRequest searchRequest = new SearchRequest("tvs").types("sales");

        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder.size(0);
        searchSourceBuilder.query(QueryBuilders.matchAllQuery());

        TermsAggregationBuilder aggregationColor = AggregationBuilders
                        .terms("popular_colors")  // aggregation name
                        .field("color");   // field to group by


        searchSourceBuilder.aggregation(aggregationColor);
        searchRequest.source(searchSourceBuilder);

        SearchResponse searchResponse;

        try{
            searchResponse = client.search(searchRequest);
        }catch (IOException e){
            // Without a response there is nothing to read, so fail the test here
            // instead of hitting a NullPointerException further down
            throw new RuntimeException(e);
        }

        Aggregations aggregations = searchResponse.getAggregations();
        Terms popular_colors = aggregations.get("popular_colors");
        // Print each bucket as key:doc_count
        List<? extends Terms.Bucket> buckets = popular_colors.getBuckets();
        for(Terms.Bucket bucket : buckets){
            System.out.println(bucket.getKey() + ":" + bucket.getDocCount());
        }

    }

    // Builds a client that talks to a local node on port 9200
    private RestHighLevelClient getClient()
    {
        HttpHost httpHost = new HttpHost("localhost", 9200, "http");
        RestClientBuilder builder = RestClient.builder(httpHost);
        RestClient build = builder.build();
        return new RestHighLevelClient(build);
    }

}
