Elasticsearch Java API (elasticsearch-rest-high-level-client): a term query on the url text field finds no documents

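One piece of business data analysis was fairly complex. Kibana is used to display the data, but it could not aggregate the result I wanted directly, so I split the aggregation into stages: aggregate part of the data first, post-process it, write the result back into ES, and then let Kibana aggregate and display that.

The ES search API is called from Java. The Maven dependencies: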

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>6.4.3</version>
</dependency>

<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>6.4.3</version>
</dependency>

 

Configure the client that connects to the ES server:

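import org.apache.http.HttpHost;
import org.elasticsearch.client.Node;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// @Configuration already registers the class as a component, so @Component is not needed
@Configuration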
public class ESClientConfig {
    private static final Logger log = LoggerFactory.getLogger(ESClientConfig.class);

    @Value("${es.host}")
    private String host;

    @Value("${es.port}")
    private int port;

    @Bean
    public RestHighLevelClient restClient() {
        RestHighLevelClient restClient = new RestHighLevelClient(
                RestClient.builder(new HttpHost(host, port, "http"))
                        .setMaxRetryTimeoutMillis(30000) // maximum time to keep retrying a request, in ms
                        .setFailureListener(new RestClient.FailureListener() { // notified whenever a node fails
                            @Override
                            public void onFailure(Node node) {
                                log.info(node.toString());
                            }
                        }));
        return restClient;
    }
}

Create the ES utility class:

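import java.io.IOException;
import javax.annotation.PostConstruct;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.ShardSearchFailure;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.TermQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

@Component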
public class ElasticsearchUtils {
    private static final Logger log = LoggerFactory.getLogger(ElasticsearchUtils.class);

    @Autowired
    private RestHighLevelClient restClient;

    private static RestHighLevelClient client;

    @PostConstruct
    public void init() throws IOException {
        client = this.restClient;
    }

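    /**
     * @Description: runs a term query against the logstash-mfq-php-filebeat-* indices
     * @Param: termName, value (not used yet: the field and value are hardcoded below while debugging)
     * @return: the hits of the search response
     * @Author: lijinliang
     * @Date: 2019/2/27
     */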
    public static SearchHits select (String termName,String value) throws IOException {
        SearchRequest searchRequest = new SearchRequest("logstash-mfq-php-filebeat-*"); // indices to search
        // SearchSourceBuilder holds the search body: query, paging and so on
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        /* bool compound query
        filter: filters only, does not contribute to the score
        must: every clause has to match (AND)
        should: one or more clauses should match (OR)
        must_not: the opposite of must, no clause may match (NOT)
        */
        // QueryStringQueryBuilder: analyzed (full-text) search
        //BoolQueryBuilder queryBuilders = QueryBuilders.boolQuery().must(new QueryStringQueryBuilder("/Collage/getCurCollage").field("url"));
        TermQueryBuilder queryBuilders = QueryBuilders.termQuery("url", "/Collage/getCurCollage");

        // attach the QueryBuilder to the SearchSourceBuilder
        //searchSourceBuilder.query();
        searchSourceBuilder.query(queryBuilders);
        // paging: start at offset 0 and return at most 10 documents
        searchSourceBuilder.from(0);
        searchSourceBuilder.size(10);
        //searchSourceBuilder.fetchSource(false);
        // attach the SearchSourceBuilder to the search request
        searchRequest.source(searchSourceBuilder);
        log.info(searchRequest.toString()); // log the outgoing request
        SearchResponse searchResponse =  client.search(searchRequest);
        log.info(searchResponse.status().toString());
        log.info(searchResponse.getTook().toString());
        //log.info(searchResponse.isTerminatedEarly().toString());
        //log.info(""+searchResponse.isTimedOut());
        int totalShards = searchResponse.getTotalShards();
        int successfulShards = searchResponse.getSuccessfulShards();
        int failedShards = searchResponse.getFailedShards();
        log.info("totalShards:"+totalShards+"");
        log.info("successfulShards:"+successfulShards+"");
        log.info("failedShards:"+failedShards+"");
        for (ShardSearchFailure failure : searchResponse.getShardFailures()) {
            // shard failures should be handled here
            log.info("failure.toString():" + failure.toString());
        }
        // access the documents that were found
        SearchHits hits = searchResponse.getHits();
        // SearchHits carries global information about all hits, such as the total hit count and the maximum score
        long totalHits = hits.getTotalHits();
        float maxScore = hits.getMaxScore();
        log.info("totalHits:" + totalHits);
        log.info("maxScore:" + maxScore);
        // SearchHits is iterable; each element is a single search result
        for (SearchHit hit : hits) {
            // the document source is available as a JSON string or as a key/value map
            String sourceAsString = hit.getSourceAsString();
            log.info("sourceAsString:" + sourceAsString);
        }
        return hits;
    }
}
 

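The query above is a term query, which does not analyze its input. Searching on the url field returned no documents, although the same kind of query on other fields did find documents.

The same search worked in Kibana, so I decided to look at what Kibana's request looks like, and configured ES to write search requests to its log.

Apply the ES settings with this request: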

PUT /_settings
{

    "index.search.slowlog.threshold.query.debug": "0s",

    "index.search.slowlog.threshold.fetch.debug": "0s",

    "index.indexing.slowlog.threshold.index.debug": "0s"

}

Watch the ES log file:

tail -f logs/cluster_index_search_slowlog.log
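
A threshold of 0s logs every search and indexing operation, so it is worth reverting once the debugging is done. Below is a minimal sketch of that, assuming Java 11+ for java.net.http and an ES node whose base URL is passed on the command line. The class SlowlogSettings and its helpers are made up for illustration and are not part of the ES client. A threshold of -1 switches the corresponding slow log level off.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper: builds the PUT /_settings request for the three slow log thresholds used above.
final class SlowlogSettings {
    private static final String[] KEYS = {
        "index.search.slowlog.threshold.query.debug",
        "index.search.slowlog.threshold.fetch.debug",
        "index.indexing.slowlog.threshold.index.debug"
    };

    // JSON body that sets every key to the same threshold, e.g. "0s" or "-1"
    static String body(String threshold) {
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < KEYS.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"').append(KEYS[i]).append("\":\"").append(threshold).append('"');
        }
        return sb.append('}').toString();
    }

    static HttpRequest request(String baseUrl, String threshold) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body(threshold)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: SlowlogSettings <baseUrl>");
            return;
        }
        // "-1" disables the slow log level again
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request(args[0], "-1"), HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```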

It turns out Kibana uses query_string.

My own request, as it appears in the Tomcat log:

2019-03-01 14:02:48.659  INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils  : SearchRequest{searchType=QUERY_THEN_FETCH, indices=[logstash-mfq-php-filebeat-*], indicesOptions=IndicesOptions[ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[], routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=0, batchedReduceSize=512, preFilterShardSize=128, allowPartialSearchResults=null, source={"from":0,"size":10,"query":{"term":{"url":{"value":"/Collage/getCurCollage","boost":1.0}}}}}

2019-03-01 14:02:48.668  INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils  : OK

2019-03-01 14:02:48.668  INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils  : 0s

2019-03-01 14:02:48.668  INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils  : totalShards:5

2019-03-01 14:02:48.668  INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils  : successfulShards:5

2019-03-01 14:02:48.669  INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils  : failedShards:0

2019-03-01 14:02:48.669  INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils  : totalHits:0

2019-03-01 14:02:48.669  INFO 8540 --- [nio-8080-exec-4] k.a.elk.common.utils.ElasticsearchUtils  : maxScore:NaN

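The same request also shows up in the ES request log.

query_string, by contrast, is an analyzed query.

So I tried building the request with a BoolQueryBuilder that wraps a QueryStringQueryBuilder:

BoolQueryBuilder queryBuilders = QueryBuilders.boolQuery().must(new QueryStringQueryBuilder("/Collage/getCurCollage").field("url"));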

Calling the endpoint again, the Tomcat log shows this request:

2019-03-01 14:09:50.282  INFO 33452 --- [nio-8080-exec-1] k.a.elk.common.utils.ElasticsearchUtils  : SearchRequest{searchType=QUERY_THEN_FETCH, indices=[logstash-mfq-php-filebeat-*], indicesOptions=IndicesOptions[ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[], routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=0, batchedReduceSize=512, preFilterShardSize=128, allowPartialSearchResults=null, source={"from":0,"size":10,"query":{"bool":{"must":[{"query_string":{"query":"/Collage/getCurCollage","fields":["url^1.0"],"type":"best_fields","default_operator":"or","max_determinized_states":10000,"enable_position_increments":true,"fuzziness":"AUTO","fuzzy_prefix_length":0,"fuzzy_max_expansions":50,"phrase_slop":0,"escape":false,"auto_generate_synonyms_phrase_query":true,"fuzzy_transpositions":true,"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}}}}

This time the documents were found.
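
One caveat with this variant: in query_string syntax the forward slash is a reserved character (it delimits regular expressions), and the logged request has "escape":false, so /Collage/ may be parsed as a regular expression rather than as literal text. Below is a minimal sketch of escaping the input first, assuming the reserved-character list from the query_string documentation. QueryStringEscaper is a hypothetical helper, not part of the ES client.

```java
// Hypothetical helper: backslash-escapes characters that query_string treats as operators.
final class QueryStringEscaper {
    // Assumed reserved characters; & and | are escaped one at a time, which also covers && and ||.
    private static final String RESERVED = "+-=&|><!(){}[]^\"~*?:\\/";

    static String escape(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 8);
        for (char c : raw.toCharArray()) {
            if (RESERVED.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("/Collage/getCurCollage")); // prints \/Collage\/getCurCollage
    }
}
```

Escaping only stops the characters from being read as operators. The value is still analyzed, so query_string keeps matching on tokens rather than on the exact string.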

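The request shows up in the ES log as well.

My guess was that ES analysis was the cause.

url is a text field and url.keyword is a keyword field. ES analyzes a text field into tokens before indexing it, and with the default standard analyzer /Collage/getCurCollage is most likely indexed as the lowercase tokens collage and getcurcollage. A keyword field is indexed as the whole, unmodified string. A term query does not analyze its input, so the exact string /Collage/getCurCollage matches no token in url.

Querying url.keyword instead found the documents.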
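
The difference between the two fields can be illustrated without a cluster. The sketch below only approximates the standard analyzer (split on non-alphanumeric characters, then lowercase); it is not the real Lucene analysis chain, and the class AnalysisSketch is made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

final class AnalysisSketch {
    // Rough stand-in for the standard analyzer: split on non-alphanumerics, then lowercase.
    static List<String> approximateStandardAnalyzer(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // A term query compares its input, unanalyzed, against the indexed terms.
    static boolean termMatches(List<String> indexedTerms, String input) {
        return indexedTerms.contains(input);
    }

    public static void main(String[] args) {
        String url = "/Collage/getCurCollage";
        List<String> textTerms = approximateStandardAnalyzer(url);  // roughly what the text field indexes
        List<String> keywordTerms = Collections.singletonList(url); // keyword keeps the whole value
        System.out.println(textTerms);                              // [collage, getcurcollage]
        System.out.println(termMatches(textTerms, url));            // false
        System.out.println(termMatches(keywordTerms, url));         // true
    }
}
```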

Building the request with TermQueryBuilder:
TermQueryBuilder queryBuilders = QueryBuilders.termQuery("url.keyword","/Collage/getCurCollage");
2019-03-01 16:30:44.158  INFO 34640 --- [nio-8080-exec-1] k.a.elk.common.utils.ElasticsearchUtils  : SearchRequest{searchType=QUERY_THEN_FETCH, indices=[logstash-mfq-php-filebeat-*], indicesOptions=IndicesOptions[ignore_unavailable=false, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false], types=[], routing='null', preference='null', requestCache=null, scroll=null, maxConcurrentShardRequests=0, batchedReduceSize=512, preFilterShardSize=128, allowPartialSearchResults=null, source={"from":0,"size":10,"query":{"term":{"url.keyword":{"value":"/Collage/getCurCollage","boost":1.0}}}}}
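
The explanation above is still a guess about how url was indexed. One way to check it is the _analyze API, which returns the tokens a field's analyzer produces for a given text. Below is a minimal sketch, assuming Java 11+ for java.net.http. The base URL and a concrete index name (not the wildcard pattern) are command-line arguments, and AnalyzeCheck is a hypothetical class name.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper: asks ES which tokens the analyzer of a field produces for a text.
final class AnalyzeCheck {
    // Minimal JSON string escaping, enough for backslashes and double quotes
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    static String analyzeBody(String field, String text) {
        return "{\"field\":\"" + jsonEscape(field) + "\",\"text\":\"" + jsonEscape(text) + "\"}";
    }

    static HttpRequest analyzeRequest(String baseUrl, String index, String field, String text) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_analyze"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(analyzeBody(field, text)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: AnalyzeCheck <baseUrl> <concreteIndexName>");
            return;
        }
        HttpRequest request = analyzeRequest(args[0], args[1], "url", "/Collage/getCurCollage");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // the "tokens" array lists what was indexed for url
    }
}
```

If the returned tokens do not include the full path as a single token, that confirms why the term query on url found nothing while url.keyword did.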
