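One piece of business data analysis was fairly complex. Because the data is displayed in Kibana, the result I wanted could not be produced by a single aggregation, so I split the aggregation into stages: aggregate part of the data first, post-process it, write it back into ES, and then let Kibana aggregate and display the stored result.
The ES search API is called from Java. The Maven dependencies are: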
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>6.4.3</version>
</dependency>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>6.4.3</version>
</dependency>
Configure the client that connects to the ES server:
@Configuration
public class ESClientConfig {
private static final Logger log = LoggerFactory.getLogger(ESClientConfig.class);
@Value("${es.host}")
private String host;
@Value("${es.port}")
private int port;
@Bean
public RestHighLevelClient restClient() {
RestHighLevelClient restClient = new RestHighLevelClient(RestClient.builder(
new HttpHost(host,port,"http")).
setMaxRetryTimeoutMillis(30000). // maximum time in milliseconds to spend retrying the same request
setFailureListener(new RestClient.FailureListener() { // listener that is notified whenever a node fails
@Override
public void onFailure (Node node){
log.warn("ES node failed: {}", node);
}
}));
return restClient;
}
}
Create an ES utility class:
@Component
public class ElasticsearchUtils {
private static final Logger log = LoggerFactory.getLogger(ElasticsearchUtils.class);
@Autowired
private RestHighLevelClient restClient;
private static RestHighLevelClient client;
@PostConstruct
public void init() throws IOException {
client = this.restClient;
}
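/**
* @Description: Search documents with a term query (termName and value are currently unused; the field and value are hard-coded below)
* @Param: termName the field to search, value the value to match
* @return: the hits returned by the search
* @Author: lijinliang
* @Date: 2019/2/27
*/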
public static SearchHits select (String termName,String value) throws IOException {
SearchRequest searchRequest = new SearchRequest("logstash-mfq-php-filebeat-*"); // the argument is the index pattern to search
// SearchSourceBuilder holds the body of the search (query, paging and so on)
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
/* bool compound query
filter: clauses must match, but do not contribute to the score
must: every clause must match (AND)
should: one or more of the clauses should match (OR)
must_not: the opposite of must, none of the clauses may match (NOT)
*/
// QueryStringQueryBuilder analyzes (tokenizes) the query text before matching
//BoolQueryBuilder queryBuilders = QueryBuilders.boolQuery().must(new QueryStringQueryBuilder("/Collage/getCurCollage").field("url"));
TermQueryBuilder queryBuilders = QueryBuilders.termQuery("url", "/Collage/getCurCollage");
// attach the QueryBuilder to the SearchSourceBuilder
searchSourceBuilder.query(queryBuilders);
// paging: start at offset 0 (the first hit) and return at most 10 documents
searchSourceBuilder.from(0);
searchSourceBuilder.size(10);
//searchSourceBuilder.fetchSource(false);
// attach the SearchSourceBuilder to the search request
searchRequest.source(searchSourceBuilder);
log.info(searchRequest.toString()); // log the outgoing request
SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
log.info(searchResponse.status().toString());
log.info(searchResponse.getTook().toString());
//log.info(searchResponse.isTerminatedEarly().toString());
//log.info(""+searchResponse.isTimedOut());
int totalShards = searchResponse.getTotalShards();
int successfulShards = searchResponse.getSuccessfulShards();
int failedShards = searchResponse.getFailedShards();
log.info("totalShards:"+totalShards);
log.info("successfulShards:"+successfulShards);
log.info("failedShards:"+failedShards);
for(ShardSearchFailure failure : searchResponse.getShardFailures()){
// shard failures should be handled here
log.info("failure.toString():"+failure.toString());
}
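// access the documents returned by the search
SearchHits hits = searchResponse.getHits();
// SearchHits carries global information about all hits, such as the total hit count and the maximum score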
long totalHits = hits.getTotalHits();
float maxScore = hits.getMaxScore();
log.info("totalHits:"+totalHits);
log.info("maxScore:"+maxScore);
// the individual results nested in SearchHits can be iterated
for (SearchHit hit:hits) {
// the document source is available as a JSON string or as a key/value map
String sourceAsString = hit.getSourceAsString();
log.info("sourceAsString:"+sourceAsString);
}
return hits;
}
}
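The query is a term query, which does not analyze (tokenize) the search value. With it, searching on the url field returned no documents, while searches on other fields did return documents.
The same document could be found in Kibana, so I decided to look at the request Kibana actually sends, and configured ES to write search requests to its log.
Apply the ES settings with this request: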
PUT /_settings
{
"index.search.slowlog.threshold.query.debug": "0s",
"index.search.slowlog.threshold.fetch.debug": "0s",
"index.indexing.slowlog.threshold.index.debug": "0s"
}
Watch the ES log file:
tail -f logs/cluster_index_search_slowlog.log
The log showed that Kibana uses query_string,
while my request, as recorded in the Tomcat log, was:
2019-03-01 14:02:48.659 INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils : SearchRequest{searchType=QUERY_THEN_FETCH, indices=[logstash-mfq-php-filebeat-*], indicesOptions=IndicesOptions[ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[], routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=0, batchedReduceSize=512, preFilterShardSize=128, allowPartialSearchResults=null, source={"from":0,"size":10,"query":{"term":{"url":{"value":"/Collage/getCurCollage","boost":1.0}}}}}
2019-03-01 14:02:48.668 INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils : OK
2019-03-01 14:02:48.668 INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils : 0s
2019-03-01 14:02:48.668 INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils : totalShards:5
2019-03-01 14:02:48.668 INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils : successfulShards:5
2019-03-01 14:02:48.669 INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils : failedShards:0
2019-03-01 14:02:48.669 INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils : totalHits:0
2019-03-01 14:02:48.669 INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils : maxScore:NaN
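In the ES request log:
query_string is an analyzed query: the query text is tokenized before it is matched.
I then tried building the request with BoolQueryBuilder instead:
BoolQueryBuilder queryBuilders = QueryBuilders.boolQuery().must(new QueryStringQueryBuilder("/Collage/getCurCollage").field("url"));
Calling it again, the Tomcat log showed this request: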
2019-03-01 14:09:50.282 INFO 33452 --- [nio-8080-exec-1] k.a.elk.common.utils.ElasticsearchUtils : SearchRequest{searchType=QUERY_THEN_FETCH, indices=[logstash-mfq-php-filebeat-*], indicesOptions=IndicesOptions[ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[], routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=0, batchedReduceSize=512, preFilterShardSize=128, allowPartialSearchResults=null, source={"from":0,"size":10,"query":{"bool":{"must":[{"query_string":{"query":"/Collage/getCurCollage","fields":["url^1.0"],"type":"best_fields","default_operator":"or","max_determinized_states":10000,"enable_position_increments":true,"fuzziness":"AUTO","fuzzy_prefix_length":0,"fuzzy_max_expansions":50,"phrase_slop":0,"escape":false,"auto_generate_synonyms_phrase_query":true,"fuzzy_transpositions":true,"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}}}}
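This time the document was found.
Checking the request in the ES log:
My guess was that ES analysis (tokenization) was the cause.
url is of type text and url.keyword is of type keyword. ES analyzes a text field into tokens when indexing it, but indexes a keyword field as the whole, unmodified value. A term query on url therefore looks for the full string "/Collage/getCurCollage" among the tokens and finds nothing.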
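The difference between the two field types can be illustrated without a cluster. The following is a minimal sketch in plain Java, not the real Elasticsearch code path: `analyze` is a hypothetical stand-in for the standard analyzer (it only splits on non-alphanumeric characters and lowercases), and `analyzedMatches` ignores query_string's own syntax such as operators and reserved characters. The class and method names are made up for this illustration.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

public class TermVsAnalyzedDemo {

    // Rough approximation of the standard analyzer (an assumption, not the
    // real implementation): split on anything that is not a letter or digit,
    // then lowercase each piece.
    static Set<String> analyze(String text) {
        Set<String> tokens = new LinkedHashSet<>();
        for (String part : text.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // term query: the search value is NOT analyzed, it must equal an indexed token.
    static boolean termMatches(Set<String> indexedTokens, String value) {
        return indexedTokens.contains(value);
    }

    // Analyzed query with OR semantics: the query text is analyzed first,
    // and one shared token is enough for a match.
    static boolean analyzedMatches(Set<String> indexedTokens, String queryText) {
        for (String token : analyze(queryText)) {
            if (indexedTokens.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String url = "/Collage/getCurCollage";
        Set<String> textTokens = analyze(url);                  // what a text field would index
        Set<String> keywordTokens = Collections.singleton(url); // a keyword field keeps the raw value

        System.out.println(textTokens);                         // [collage, getcurcollage]
        System.out.println(termMatches(textTokens, url));       // false: no token equals the full path
        System.out.println(analyzedMatches(textTokens, url));   // true: the tokens overlap
        System.out.println(termMatches(keywordTokens, url));    // true: exact value is indexed
    }
}
```

The three results line up with the three attempts in this post: term on url finds nothing, query_string on url finds the document, and term on url.keyword finds it as well.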
Using url.keyword as the query field then found the document.
Build the request with TermQueryBuilder: TermQueryBuilder queryBuilders = QueryBuilders.termQuery("url.keyword","/Collage/getCurCollage");
2019-03-01 16:30:44.158 INFO 34640 --- [nio-8080-exec-1] k.a.elk.common.utils.ElasticsearchUtils : SearchRequest{searchType=QUERY_THEN_FETCH, indices=[logstash-mfq-php-filebeat-*], indicesOptions=IndicesOptions[ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[], routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=0, batchedReduceSize=512, preFilterShardSize=128, allowPartialSearchResults=null, source={"from":0,"size":10,"query":{"term":{"url.keyword":{"value":"/Collage/getCurCollage","boost":1.0}}}}}
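To try the same exact-match query outside the Java client (for example by pasting the body into a search request), the body can be generated from a field name and a value. This is only a sketch under assumptions: `keywordField`, `jsonEscape` and `termQueryBody` are hypothetical helpers, the index is assumed to have a `.keyword` sub-field for the given field (as url does here), and the escaping covers only backslashes and double quotes.

```java
public class KeywordTermQuery {

    // Minimal escaping for a JSON string literal (assumption: the input
    // contains no control characters, which would need extra handling).
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Hypothetical helper: address the ".keyword" sub-field unless the
    // caller already did.
    static String keywordField(String field) {
        return field.endsWith(".keyword") ? field : field + ".keyword";
    }

    // Builds the body of a term query on the keyword sub-field.
    static String termQueryBody(String field, String value) {
        return "{\"query\":{\"term\":{\"" + jsonEscape(keywordField(field))
                + "\":{\"value\":\"" + jsonEscape(value) + "\"}}}}";
    }

    public static void main(String[] args) {
        // prints {"query":{"term":{"url.keyword":{"value":"/Collage/getCurCollage"}}}}
        System.out.println(termQueryBody("url", "/Collage/getCurCollage"));
    }
}
```

The same idea would let the select method above use its termName and value parameters instead of hard-coded strings.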