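Elasticsearch stores many documents, and each document contains multiple fields. To group documents by the value of a field and count how many fall into each group, use an aggregation query.
1. Aggregating on a single field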
curl -X GET "elasticsearch.in.netwa.cn:9200/my_index/_search" -H 'Content-Type: application/json' -d'
{
  "aggs": {
    "all_interests": {
      "terms": { "field": "first_tag" }
    }
  }
}
'
The response looks like this:
{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":123,"max_score":1.0,"hits":[...]},"aggregations":{"all_interests":{"doc_count_error_upper_bound":0,"sum_other_doc_count":1,"buckets":[{"key":"自然","doc_count":36},{"key":"地理","doc_count":32},{"key":"人文","doc_count":55}]}}}
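hits.total is the number of documents that matched the request and were therefore counted: 123 (no query was given, so this is every document in the index).
aggregations holds the result of the aggregation.
buckets contains one entry per distinct field value. The counts for the first_tag field are:
自然 (nature): 36
地理 (geography): 32
人文 (humanities): 55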
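To use these counts in a program rather than read them off the raw JSON, the buckets can be turned into a simple mapping. The following is a minimal Python sketch, not part of the original post: `response_text` is a trimmed copy of the response above (the elided `hits` array is left out), and the helper name `bucket_counts` is made up for illustration.

```python
import json

# Trimmed copy of the response shown above; the "hits" array is omitted.
response_text = """
{"hits": {"total": 123},
 "aggregations": {"all_interests": {
   "doc_count_error_upper_bound": 0,
   "sum_other_doc_count": 1,
   "buckets": [{"key": "自然", "doc_count": 36},
               {"key": "地理", "doc_count": 32},
               {"key": "人文", "doc_count": 55}]}}}
"""


def bucket_counts(response, agg_name):
    """Return {bucket key: doc_count} for a terms aggregation (hypothetical helper)."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


response = json.loads(response_text)
counts = bucket_counts(response, "all_interests")
for key, count in counts.items():
    print(f"{key}: {count}")
```

The aggregation name passed to `bucket_counts` must match the name used in the request body, here `all_interests`.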
2. Aggregating over the results of a query
First filter on the first_tag field with a query, then aggregate the matching documents by the second_tag field.
curl -X GET "elasticsearch.in.netwa.cn:9200/my_index/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "term": {
      "first_tag": "自然"
    }
  },
  "aggs": {
    "all_interests": {
      "terms": { "field": "second_tag" }
    }
  }
}
'
The response looks like this:
{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":36,"max_score":2.2581334,"hits":[...]},"aggregations":{"all_interests":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"生物","doc_count":12},{"key":"常识","doc_count":10},{"key":"天文","doc_count":14}]}}}
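hits.total is the number of documents that matched the query and were counted: 36.
buckets contains one entry per distinct field value. The counts for the second_tag field are:
生物 (biology): 12
常识 (general knowledge): 10
天文 (astronomy): 14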
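The same request can also be assembled in code, which helps when the filter value or the aggregated field changes. Below is a rough Python sketch using only the standard library; the function names `build_filtered_terms_body` and `build_search_request` are hypothetical, and the `http://` scheme is an assumption (the curl commands above omit it). The request is only constructed here, not sent, and it uses POST, which the `_search` endpoint accepts as well as GET.

```python
import json
import urllib.request


def build_filtered_terms_body(filter_field, filter_value, agg_field,
                              agg_name="all_interests"):
    """Build a search body: a term query plus a terms aggregation (hypothetical helper)."""
    return {
        "query": {"term": {filter_field: filter_value}},
        "aggs": {agg_name: {"terms": {"field": agg_field}}},
    }


def build_search_request(base_url, index, body):
    """Wrap the body in an HTTP request object for the _search endpoint."""
    data = json.dumps(body, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


body = build_filtered_terms_body("first_tag", "自然", "second_tag")
# Assumption: the cluster from the curl examples is reachable over plain HTTP.
request = build_search_request("http://elasticsearch.in.netwa.cn:9200", "my_index", body)
print(json.dumps(body, ensure_ascii=False, indent=2))
# To actually send it (requires a reachable cluster):
#   with urllib.request.urlopen(request) as resp:
#       result = json.load(resp)
```

If only the bucket counts are needed, adding `"size": 0` to the body keeps the matching documents themselves out of the response.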
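3. Aggregating on several fields at once
Aggregate on first_tag and, inside each first_tag bucket, aggregate again on second_tag by nesting a second aggs block. The nested aggregation reuses the name all_interests, which is why that name appears at both levels of the response.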
curl -X GET "elasticsearch.in.netwa.cn:9200/my_index/_search" -H 'Content-Type: application/json' -d'
{
  "aggs": {
    "all_interests": {
      "terms": { "field": "first_tag" },
      "aggs": {
        "all_interests": {
          "terms": { "field": "second_tag" }
        }
      }
    }
  }
}
'
The response looks like this:
{"took":3,"timed_out":false,"_shards":{"total":5,"successful":5,"skipped":0,"failed":0},"hits":{"total":123,"max_score":1.0,"hits":[...]},"aggregations":{"all_interests":{"doc_count_error_upper_bound":0,"sum_other_doc_count":1,"buckets":[{"key":"人文","doc_count":55,"all_interests":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"文学","doc_count":20},{"key":"艺术","doc_count":35}]}},{"key":"自然","doc_count":36,"all_interests":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"生物","doc_count":12},{"key":"常识","doc_count":10},{"key":"天文","doc_count":14}]}},{"key":"地理","doc_count":32,"all_interests":{"doc_count_error_upper_bound":0,"sum_other_doc_count":0,"buckets":[{"key":"人文地理","doc_count":11},{"key":"自然地理","doc_count":21}]}}]}}}
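hits.total is the number of documents that were counted: 123.
buckets contains one entry per first_tag value, and each of those entries carries its own nested buckets for second_tag. The counts are:
人文 (humanities): 55, {文学 (literature): 20, 艺术 (art): 35}
自然 (nature): 36, {生物 (biology): 12, 常识 (general knowledge): 10, 天文 (astronomy): 14}
地理 (geography): 32, {人文地理 (human geography): 11, 自然地理 (physical geography): 21}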
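A nested response like this is often easier to work with once flattened into (first_tag, second_tag, count) rows. The following Python sketch is one possible way to do that; `nested_response` is a trimmed copy of the aggregation result above (bookkeeping fields removed), and `flatten_nested_buckets` is a hypothetical helper name.

```python
# Trimmed copy of the nested aggregation result shown above.
nested_response = {
    "aggregations": {"all_interests": {"buckets": [
        {"key": "人文", "doc_count": 55, "all_interests": {"buckets": [
            {"key": "文学", "doc_count": 20},
            {"key": "艺术", "doc_count": 35}]}},
        {"key": "自然", "doc_count": 36, "all_interests": {"buckets": [
            {"key": "生物", "doc_count": 12},
            {"key": "常识", "doc_count": 10},
            {"key": "天文", "doc_count": 14}]}},
        {"key": "地理", "doc_count": 32, "all_interests": {"buckets": [
            {"key": "人文地理", "doc_count": 11},
            {"key": "自然地理", "doc_count": 21}]}},
    ]}}
}


def flatten_nested_buckets(response, outer_name, inner_name):
    """Return (outer key, inner key, doc_count) rows for a two-level terms aggregation."""
    rows = []
    for outer in response["aggregations"][outer_name]["buckets"]:
        for inner in outer[inner_name]["buckets"]:
            rows.append((outer["key"], inner["key"], inner["doc_count"]))
    return rows


rows = flatten_nested_buckets(nested_response, "all_interests", "all_interests")
for first_tag, second_tag, count in rows:
    print(f"{first_tag} / {second_tag}: {count}")
```

Both name arguments are `all_interests` only because the request above gives the outer and inner aggregations the same name; with distinct names, pass each one separately.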
References:
https://www.elastic.co/guide/en/elasticsearch/reference/2.4/search-aggregations-metrics-avg-aggregation.html
https://blog.csdn.net/xialei199023/article/details/48298635
https://blog.csdn.net/lpp_dd/article/details/73136059