elasticsearch (7): aggregating, summarizing, and grouping query results with TransportClient

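Aggregation is an important database feature: it computes aggregate values over the data set returned by a query, such as the maximum or minimum of a field (or of a computed expression), or a sum or an average. As both a search engine and a data store, ES also provides powerful aggregation capabilities.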

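Computing indicators such as the maximum, minimum, sum, and average over a data set is called a metric aggregation in ES. Relational databases, besides aggregate functions, can also group the queried rows with GROUP BY and then compute metrics within each group. In ES, the counterpart of GROUP BY is the bucket aggregation.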

See the Elasticsearch aggregations reference.

With TransportClient, the aggregation part of a query is built with AggregationBuilder objects.

1. Maximum, minimum, sum, and average over a data set

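import java.util.concurrent.ExecutionException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.search.aggregations.AbstractAggregationBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

// Build the aggregations
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
// Count of values
AbstractAggregationBuilder valueCountAggregationBuilder = AggregationBuilders.count("count").field("name");
// Sum, average, maximum, minimum
AbstractAggregationBuilder sumAggregationBuilder = AggregationBuilders.sum("sum").field("score");
AbstractAggregationBuilder avgAggregationBuilder = AggregationBuilders.avg("avg").field("score");
AbstractAggregationBuilder maxAggregationBuilder = AggregationBuilders.max("max").field("score");
AbstractAggregationBuilder minAggregationBuilder = AggregationBuilders.min("min").field("score");
// Add every metric at the top level so it is computed over the whole data set
sourceBuilder.aggregation(valueCountAggregationBuilder).aggregation(sumAggregationBuilder).aggregation(avgAggregationBuilder)
		.aggregation(maxAggregationBuilder).aggregation(minAggregationBuilder);
try {
	// Search request for the target index; client, index, and type are defined elsewhere
	SearchRequest searchRequest = new SearchRequest(index);
	searchRequest.types(type);
	searchRequest.source(sourceBuilder);
	SearchResponse response = client.search(searchRequest).get();
	System.out.println(response);
} catch (InterruptedException | ExecutionException e) {
	e.printStackTrace();
}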
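
To make concrete what the five metric aggregations above compute, here is a minimal in-memory sketch in plain Java. It does not call Elasticsearch; the class name MetricSketch and the sample scores are assumptions chosen for illustration.

```java
import java.util.DoubleSummaryStatistics;
import java.util.stream.DoubleStream;

// Hypothetical helper, not part of the Elasticsearch API: it computes locally
// the same five values that the count, sum, avg, max, and min aggregations report.
class MetricSketch {
    static DoubleSummaryStatistics summarize(double... scores) {
        return DoubleStream.of(scores).summaryStatistics();
    }

    public static void main(String[] args) {
        // Sample scores, assumed for illustration
        DoubleSummaryStatistics s = summarize(60, 80, 70, 90, 80);
        System.out.println("count=" + s.getCount() + " sum=" + s.getSum()
                + " avg=" + s.getAverage() + " max=" + s.getMax() + " min=" + s.getMin());
    }
}
```

For the five sample scores this gives a count of 5, a sum of 380, an average of 76, a maximum of 90, and a minimum of 60, which is the shape of result to expect from the query above when it runs over the whole data set.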

2. Grouping by a single field, then aggregating

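This uses a concept called sub-aggregation: once a first aggregation has grouped the data, the groups form an intermediate table, and a second aggregation runs on each group. For example, group by name, then compute the maximum, minimum, sum, and average score for each person.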

Grouping uses a TermsAggregationBuilder object. For example, to build a grouping:

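// Terms.Order.term(true) sorts buckets by key, ascending. Terms.Order.aggregation(name, asc)
// expects the name of a sub-aggregation, so it cannot refer to the terms aggregation itself.
TermsAggregationBuilder aggregation = AggregationBuilders.terms("name").field("name").order(Terms.Order.term(true));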

order() sorts the buckets, and size() controls how many buckets are returned (10 by default).

aggregation.size(10);
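
The effect of ordering buckets by key and capping them with size can be sketched in memory. This is plain Java under stated assumptions, not Elasticsearch code: BucketOrderSketch and topBuckets are hypothetical names, and each bucket value is simply the number of documents with that key.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// In-memory illustration only; none of this is Elasticsearch API.
class BucketOrderSketch {
    static Map<String, Long> topBuckets(List<String> names, int size) {
        // TreeMap keeps bucket keys in ascending order, like ordering by term.
        Map<String, Long> counts = names.stream()
                .collect(Collectors.groupingBy(n -> n, TreeMap::new, Collectors.counting()));
        // Keep only the first `size` buckets, preserving their order.
        Map<String, Long> top = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            if (top.size() >= size) break;
            top.put(e.getKey(), e.getValue());
        }
        return top;
    }

    public static void main(String[] args) {
        // prints {a=2, b=1}
        System.out.println(topBuckets(Arrays.asList("b", "a", "c", "a"), 2));
    }
}
```

With a size of 2, the third bucket (c) is dropped even though it exists, which is why a size that is too small silently hides groups.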

Aggregating after grouping:

Equivalent SQL:

select sum(score), avg(score), count(name), max(score), min(score) from table group by name

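Pay attention to key points 1, 2, and 3 in the code: if the builders are not wired together this way, the query returns a different result.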

// Additional imports for this example
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;

// Build the aggregations
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
// Count of values
AbstractAggregationBuilder valueCountAggregationBuilder = AggregationBuilders.count("count").field("name");
// Sum, average, maximum, minimum
AbstractAggregationBuilder sumAggregationBuilder = AggregationBuilders.sum("sum").field("score");
AbstractAggregationBuilder avgAggregationBuilder = AggregationBuilders.avg("avg").field("score");
AbstractAggregationBuilder maxAggregationBuilder = AggregationBuilders.max("max").field("score");
AbstractAggregationBuilder minAggregationBuilder = AggregationBuilders.min("min").field("score");

// Not used here: adding the metrics at the top level aggregates over the whole data set instead of per group
//		sourceBuilder.aggregation(valueCountAggregationBuilder).aggregation(sumAggregationBuilder).aggregation(avgAggregationBuilder)
//				.aggregation(maxAggregationBuilder).aggregation(minAggregationBuilder);

// Key point 1: group by name, with buckets sorted by key ascending
TermsAggregationBuilder aggregation = AggregationBuilders.terms("name").field("name").order(Terms.Order.term(true));
// Key point 2: attach the metrics as sub-aggregations of the grouping
aggregation.subAggregation(valueCountAggregationBuilder).subAggregation(sumAggregationBuilder).subAggregation(avgAggregationBuilder)
		.subAggregation(maxAggregationBuilder).subAggregation(minAggregationBuilder);
// Key point 3: add only the grouping aggregation to sourceBuilder
sourceBuilder.aggregation(aggregation);
try {
	// Search request for the target index
	SearchRequest searchRequest = new SearchRequest(index);
	searchRequest.types(type);
	searchRequest.source(sourceBuilder);
	SearchResponse response = client.search(searchRequest).get();
	System.out.println(response);
} catch (InterruptedException | ExecutionException e) {
	e.printStackTrace();
}
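
The grouping plus sub-aggregation pattern above can be mirrored in memory to show what each bucket should contain. This is a rough plain-Java sketch, not Elasticsearch code: GroupBySketch, Row, and byName are hypothetical names, and the rows are sample values with romanized names assumed for illustration.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// In-memory illustration only; none of this is Elasticsearch API.
class GroupBySketch {
    // Hypothetical row type standing in for an indexed document.
    static final class Row {
        final String name;
        final double score;
        Row(String name, double score) { this.name = name; this.score = score; }
    }

    // Bucket rows by name (the role of the terms aggregation), then compute the
    // metrics inside each bucket (the role of the sub-aggregations).
    static Map<String, DoubleSummaryStatistics> byName(List<Row> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                r -> r.name, TreeMap::new, Collectors.summarizingDouble(r -> r.score)));
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("xiaoming", 60), new Row("xiaohong", 80), new Row("xiaoming", 70),
                new Row("xiaohong", 90), new Row("xiaoming", 80));
        byName(rows).forEach((name, s) -> System.out.println(name + " sum=" + s.getSum()
                + " avg=" + s.getAverage() + " max=" + s.getMax() + " min=" + s.getMin()));
    }
}
```

Each map entry corresponds to one bucket, and its statistics correspond to the sub-aggregation values for that bucket. Attaching the metrics to the source builder directly instead would collapse all rows into a single set of statistics.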

3. Grouping by multiple fields, then aggregating

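As in the previous section, this relies on sub-aggregations: the data is first grouped into buckets, and the maximum, minimum, sum, and average score are then computed inside each bucket. The difference is that the bucket key is now built from several fields.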

Multi-field grouping uses the Script object; for details, see the official ES scripting documentation.

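Because the ES terms aggregation cannot group on several fields at once, the approach used here is to write a script that does the multi-field grouping. The idea:

1. Take the values of the grouping fields and join them with a special separator character into a single string field.

For example, combine the values of name, age, and sex with a special separator: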

id  name  age  sex  score
1   小明  3         60
2   小红  2         80
1   小明  3         70
2   小红  2         90
1   小明  3         80

After combining the fields and aggregating:

id  name   sum  avg  max  min
1   小明3  210  70   80   60
2   小红2  170  85   90   80

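Run the ES single-field aggregation on the combined key, then split the key back into its parts with name.split(""), passing the same separator. The resulting list restores the individual fields, so multi-field grouping is achieved in a roundabout way.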

id  name  age  sex  sum  avg  max  min
1   小明  3         210  70   80   60
2   小红  2         170  85   90   80
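
The join and split steps around the combined key can be sketched as follows. This is plain Java under stated assumptions: CompositeKeySketch is a hypothetical name, and the ";" separator is chosen only for illustration, so substitute a character that cannot occur in the field values.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// In-memory illustration only; none of this is Elasticsearch API.
class CompositeKeySketch {
    // Assumed separator; it must never appear inside name, age, or sex.
    static final String SEPARATOR = ";";

    // Build the single string key that the terms aggregation groups on.
    static String join(String name, long age, String sex) {
        return name + SEPARATOR + age + SEPARATOR + sex;
    }

    // Split a bucket key back into its fields. Pattern.quote treats the separator
    // literally, and the limit of -1 keeps trailing empty parts (for example an empty sex).
    static String[] split(String key) {
        return key.split(Pattern.quote(SEPARATOR), -1);
    }

    public static void main(String[] args) {
        String key = join("xiaoming", 3, "");
        System.out.println(key + " -> " + Arrays.toString(split(key)));
    }
}
```

Without the -1 limit, String.split drops trailing empty strings, so a row with an empty last field would come back with two parts instead of three.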

Equivalent SQL, implemented here with the single-field aggregation approach:

select sum(score), avg(score), max(score), min(score) from table group by name, age, sex
// Additional imports for this example
import java.util.HashMap;
import org.elasticsearch.script.Script;
import org.elasticsearch.script.ScriptType;

// Build the aggregations
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
// Sum, average, maximum, minimum
AbstractAggregationBuilder sumAggregationBuilder = AggregationBuilders.sum("sum").field("score");
AbstractAggregationBuilder avgAggregationBuilder = AggregationBuilders.avg("avg").field("score");
AbstractAggregationBuilder maxAggregationBuilder = AggregationBuilders.max("max").field("score");
AbstractAggregationBuilder minAggregationBuilder = AggregationBuilders.min("min").field("score");

// Define the separator here so the combined key can be split again later;
// use a character that cannot occur in the field values
String SEPARATOR = "";
// Key point 1: grouping, with buckets sorted by key ascending
TermsAggregationBuilder aggregation = AggregationBuilders.terms("name").field("name").order(Terms.Order.term(true));

// Script that concatenates name, age, and sex into one key.
// Each separator is wrapped in single quotes and joined with '+' on both sides.
String scriptStr = "doc['name'].value + '" + SEPARATOR + "' + doc['age'].value + '" + SEPARATOR + "' + doc['sex'].value";

Script script = new Script(ScriptType.INLINE, Script.DEFAULT_SCRIPT_LANG, scriptStr, new HashMap<>());

// Key point 2: set the script on the grouping and attach the metrics as sub-aggregations
aggregation.script(script).subAggregation(sumAggregationBuilder).subAggregation(avgAggregationBuilder)
		.subAggregation(maxAggregationBuilder).subAggregation(minAggregationBuilder);
// Key point 3: add only the grouping aggregation to sourceBuilder
sourceBuilder.aggregation(aggregation);
try {
	// Search request for the target index
	SearchRequest searchRequest = new SearchRequest(index);
	searchRequest.types(type);
	searchRequest.source(sourceBuilder);
	SearchResponse searchResponse = client.search(searchRequest).get();
	System.out.println(searchResponse);
} catch (InterruptedException | ExecutionException e) {
	e.printStackTrace();
}
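
Hand-writing the concatenation script is error-prone because every separator needs its own quotes and plus signs. A small builder can generate the same script text from a list of fields. This is a minimal sketch: KeyScriptSketch and build are hypothetical names, the ";" separator is an assumption, and the code only produces the string that would be passed to the Script constructor.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// String-building helper only; it does not call Elasticsearch.
class KeyScriptSketch {
    // Produces: doc['f1'].value + 'SEP' + doc['f2'].value + 'SEP' + ...
    // The separator must not contain a single quote or a backslash,
    // since it is placed inside a quoted script literal unescaped.
    static String build(List<String> fields, String separator) {
        return fields.stream()
                .map(f -> "doc['" + f + "'].value")
                .collect(Collectors.joining(" + '" + separator + "' + "));
    }

    public static void main(String[] args) {
        // prints doc['name'].value + ';' + doc['age'].value + ';' + doc['sex'].value
        System.out.println(build(Arrays.asList("name", "age", "sex"), ";"));
    }
}
```

Generating the text this way keeps the separator in one place, so the same constant can be reused when the bucket keys are split after the search returns.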

 
