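Consider the following data:
city      name
Beijing   Xiao Li
Beijing   Xiao Wang
Beijing   Xiao Zhang
Shanghai  Xiao Chen
Shanghai  Xiao Zheng
Shanghai  Xiao Zhao
Bucketing by city gives two buckets: one for Beijing and one for Shanghai.
A bucket is simply a group.
Once the data has been split into buckets, we can compute a sum, a maximum, or a minimum over each bucket.
These computations (sum, maximum, minimum, and so on) are called metrics.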
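To make the bucket/metric distinction concrete, here is a minimal sketch in plain Java, with no Elasticsearch involved. It groups the sample rows by city (the bucket step) and counts the rows in each group (the metric step). The class name CityBuckets and the method countByCity are made up for this illustration, and the city and person names are written in romanized form.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration class; not part of any Elasticsearch API.
class CityBuckets {
    // Each row is {city, name}, taken from the sample data.
    static final List<String[]> ROWS = Arrays.asList(
            new String[]{"Beijing", "Xiao Li"},
            new String[]{"Beijing", "Xiao Wang"},
            new String[]{"Beijing", "Xiao Zhang"},
            new String[]{"Shanghai", "Xiao Chen"},
            new String[]{"Shanghai", "Xiao Zheng"},
            new String[]{"Shanghai", "Xiao Zhao"});

    // Bucket step: group rows by city (row[0]).
    // Metric step: count the rows that fall into each bucket.
    static Map<String, Long> countByCity(List<String[]> rows) {
        return rows.stream()
                .collect(Collectors.groupingBy(
                        row -> row[0],
                        TreeMap::new,           // sorted keys, for stable output
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countByCity(ROWS)); // prints {Beijing=3, Shanghai=3}
    }
}
```

Swapping Collectors.counting() for another downstream collector would give a different metric over the same buckets, which is the same separation Elasticsearch makes between bucket aggregations and metric aggregations.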
PUT /tvs
{
  "mappings": {
    "sales": {
      "properties": {
        "price": {
          "type": "long"
        },
        "color": {
          "type": "keyword"
        },
        "brand": {
          "type": "keyword"
        },
        "saleDate": {
          "type": "date"
        }
      }
    }
  }
}
POST /tvs/sales/_bulk
{"index":{}}
{"price":1000,"color":"红色","brand":"长虹","saleDate":"2017-10-28"}
{"index":{}}
{"price":1000,"color":"红色","brand":"长虹","saleDate":"2017-11-28"}
{"index":{}}
{"price":2000,"color":"绿色","brand":"小米","saleDate":"2017-10-28"}
{"index":{}}
{"price":2000,"color":"红色","brand":"小米","saleDate":"2017-11-28"}
{"index":{}}
{"price":3000,"color":"绿色","brand":"TCL","saleDate":"2017-10-28"}
{"index":{}}
{"price":3000,"color":"红色","brand":"TCL","saleDate":"2017-11-28"}
{"index":{}}
{"price":4000,"color":"绿色","brand":"三星","saleDate":"2017-10-28"}
{"index":{}}
{"price":4000,"color":"红色","brand":"三星","saleDate":"2017-11-28"}
Which color of appliance sells best?
GET /tvs/sales/_search
{
  "size": 0,
  "aggs": {
    "popular_colors": {
      "terms": {
        "field": "color"
      }
    }
  }
}
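Note: doc_count is a metric that ES computes by default for every bucket operation.
size: 0 means only the aggregation results are returned, without the original documents the aggregation ran over;
aggs: fixed syntax; runs a grouping aggregation over the data;
popular_colors: the name of the aggregation, chosen by you;
terms: groups documents by the values of a field;
field: the field to group by.
The response looks like this: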
{
  "took": 21,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 8,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "popular_colors": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "红色",
          "doc_count": 5
        },
        {
          "key": "绿色",
          "doc_count": 3
        }
      ]
    }
  }
}
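hits.hits: the original documents the aggregation ran over; empty here because size was set to 0;
aggregations: the aggregation results;
popular_colors: the name given to the aggregation in the request;
buckets: the buckets produced by the grouping condition;
Default ordering: buckets are sorted by doc_count in descending order.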
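The bucketing and default ordering described above can be imitated in plain Java as a rough sketch. It only mimics the idea of a terms aggregation (group by value, count, sort by count descending) and is not how Elasticsearch implements it. The class TermsSketch and the method termsByDocCount are hypothetical names, and the two colors from the sample data (红色, 绿色) are written as "red" and "green" to keep the source ASCII-only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration class; it only imitates the shape of the result.
class TermsSketch {
    // Color of each of the eight sample documents, in bulk-insert order.
    static final List<String> COLORS = Arrays.asList(
            "red", "red", "green", "red", "green", "red", "green", "red");

    // Group equal values into buckets, count each bucket (the doc_count),
    // then order the buckets by that count, largest first.
    static Map<String, Long> termsByDocCount(List<String> values) {
        Map<String, Long> counts = values.stream()
                .collect(Collectors.groupingBy(v -> v, Collectors.counting()));

        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed());

        // LinkedHashMap preserves the sorted insertion order.
        Map<String, Long> ordered = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : entries) {
            ordered.put(e.getKey(), e.getValue());
        }
        return ordered;
    }

    public static void main(String[] args) {
        System.out.println(termsByDocCount(COLORS)); // prints {red=5, green=3}
    }
}
```

The counts match the doc_count values in the response: five documents in one color bucket and three in the other. The sketch makes no attempt to reproduce how Elasticsearch breaks ties between buckets with equal counts.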
package com.elasticsearch.aggregation;

import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.junit.Test;

import java.io.IOException;
import java.util.List;

public class aggregation {

    /**
     * Which color of appliance sells best?
     */
    @Test
    public void testAggregation01() throws IOException {
        RestHighLevelClient client = getClient();
        SearchRequest searchRequest = new SearchRequest("tvs").types("sales");

        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder.size(0); // return aggregation results only, no hits
        searchSourceBuilder.query(QueryBuilders.matchAllQuery());

        TermsAggregationBuilder aggregationColor = AggregationBuilders
                .terms("popular_colors") // aggregation name
                .field("color");         // field to group by
        searchSourceBuilder.aggregation(aggregationColor);
        searchRequest.source(searchSourceBuilder);

        // Let an IOException fail the test rather than continuing with a null response.
        SearchResponse searchResponse = client.search(searchRequest);

        // Read the buckets of the named aggregation and print key:doc_count for each.
        Aggregations aggregations = searchResponse.getAggregations();
        Terms popular_colors = aggregations.get("popular_colors");
        List<? extends Terms.Bucket> buckets = popular_colors.getBuckets();
        for (Terms.Bucket bucket : buckets) {
            System.out.println(bucket.getKey() + ":" + bucket.getDocCount());
        }
    }

    // Builds a client for a single local node on the default HTTP port.
    private RestHighLevelClient getClient() {
        HttpHost httpHost = new HttpHost("localhost", 9200, "http");
        RestClientBuilder builder = RestClient.builder(httpHost);
        RestClient restClient = builder.build();
        return new RestHighLevelClient(restClient);
    }
}