Elasticsearch: deduplicated queries with pagination

Two ways to run a deduplicated query:

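My original goal was to remove records from the results that are identical in every field. Everything I found by searching, however, only covered deduplication on a single field. In the end I simply added a new field that concatenates the other fields, and deduplicated on that concatenated field.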
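A minimal sketch of how such a concatenated field could be built before indexing. The class name, method name, separator and sample values are assumptions for illustration; the original post does not show how concat_field was produced. A separator is used so that ("ab", "c") and ("a", "bc") do not end up with the same key.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ConcatFieldSketch {
    // Unit separator; assumed never to occur inside the real field values
    static final String SEPARATOR = "\u001F";

    // Joins the field values in a fixed order; a null value is treated as an empty string
    static String buildConcatField(List<String> values) {
        return values.stream()
                .map(v -> v == null ? "" : v)
                .collect(Collectors.joining(SEPARATOR));
    }

    public static void main(String[] args) {
        // Hypothetical records: name, a second field, and a missing third field
        String a = buildConcatField(Arrays.asList("雷晶晶", "F", null));
        String b = buildConcatField(Arrays.asList("雷晶晶", "F", null));
        String c = buildConcatField(Arrays.asList("雷晶晶", "M", null));
        System.out.println(a.equals(b)); // true: identical records share a key
        System.out.println(a.equals(c)); // false: one differing field changes the key
    }
}
```

The resulting value would then be indexed as a keyword field so it can be used in terms, collapse and cardinality.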

1. Terms aggregation + top_hits aggregation

DSL:

GET sjck_personnel/_search
{
  "size": 0, 
  "aggs": {
    "query_agg": {
      "terms": {
        "field": "concat_field"
      },
      "aggs": {
        "query_top_hits": {
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  }
}

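Java API: too cumbersome, so I did not use this approach.

2. The collapse feature (simple and easy to use)

Deduplicating with collapse means folding documents that share the same field value into a single hit. The collapse field must be a single-valued keyword or numeric field with doc_values enabled.

DSL: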

GET sjck_personnel/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              {
                "term": {
                  "xm": {
                    "value": "雷晶晶"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  },
  "collapse": {
    "field": "concat_field"
  }
}

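Java API:

// Imports assume the 7.x high-level REST client
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.collapse.CollapseBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;

// Build the search request for the target index
SearchRequest searchRequest = new SearchRequest(indexName);
// Build the outermost bool query
BoolQueryBuilder query = QueryBuilders.boolQuery();
// Fill the bool query from the caller's conditions
synQueryPersonnelIndexBuilder(query, options);
// Highlight all fields, whether or not the query targeted them
HighlightBuilder highlightBuilder = new HighlightBuilder();
HighlightBuilder.Field highlightTitle = new HighlightBuilder.Field("*").requireFieldMatch(false);
highlightTitle.highlighterType("unified");
highlightBuilder.field(highlightTitle);
// Collapse duplicate documents on concat_field
CollapseBuilder collapseBuilder = new CollapseBuilder("concat_field");
// Offset of the first hit on the requested page (pageNum is 1-based)
int scol = param.getPageSize() * (param.getPageNum() - 1);
// Assemble query, highlighting, paging and collapse
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder()
        .query(query)
        .highlighter(highlightBuilder)
        .from(scol)
        .size(param.getPageSize())
        .collapse(collapseBuilder);

// Attach the source builder to the search request
searchRequest.source(sourceBuilder);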
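The offset calculation above can be isolated and guarded. The sketch below is an illustration, not part of the original code: the class and method names are hypothetical, and the limit assumes the index still uses Elasticsearch's default index.max_result_window of 10000, beyond which a from + size request is rejected.

```java
class PageOffsetSketch {
    // Default value of index.max_result_window; adjust if the index overrides it
    static final int MAX_RESULT_WINDOW = 10_000;

    // Converts a 1-based page number into the "from" offset of the search request
    static int fromOffset(int pageNum, int pageSize) {
        if (pageNum < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNum and pageSize must be >= 1");
        }
        // Use long arithmetic so large page numbers cannot overflow int
        long from = (long) pageSize * (pageNum - 1);
        if (from + pageSize > MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException("from + size exceeds the result window");
        }
        return (int) from;
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(1, 10)); // 0
        System.out.println(fromOffset(3, 20)); // 40
    }
}
```

Validating up front lets the caller return a clear error for deep pages instead of surfacing a search exception.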

==============================================================
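The above does deduplicate the query results, but there is a catch!

collapse only folds the hits that are returned, so the total in the response is still the number of matching documents before folding. That total is larger than the number of deduplicated results actually shown.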
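The mismatch can be reproduced in memory. This is only an illustration with made-up keys (k1 to k4) and hypothetical helper names; it does not call Elasticsearch. It shows how a page count derived from the raw total overshoots the page count derived from the number of distinct keys.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

class CollapseTotalSketch {
    // Number of matching documents before folding; plays the role of the response total
    static int rawTotal(List<String> concatValues) {
        return concatValues.size();
    }

    // Number of distinct keys; this is how many hits remain after folding
    static int distinctTotal(List<String> concatValues) {
        return new HashSet<>(concatValues).size();
    }

    // Ceiling division: pages needed to show `total` items
    static int pageCount(long total, int pageSize) {
        return (int) ((total + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        // concat_field values of seven matching documents, four of them distinct
        List<String> hits = Arrays.asList("k1", "k1", "k2", "k3", "k3", "k3", "k4");
        int pageSize = 2;
        System.out.println(pageCount(rawTotal(hits), pageSize));      // 4
        System.out.println(pageCount(distinctTotal(hits), pageSize)); // 2
    }
}
```

With the raw total, the front end would offer four pages, and pages three and four would come back empty.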

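So, in addition to CollapseBuilder, I run a cardinality aggregation to get the deduplicated total and return that instead, which keeps the front-end pagination correct. Note that cardinality is an approximate count (its accuracy is tunable via precision_threshold), so for very large result sets the total can be slightly off.

DSL: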

GET sjck_personnel/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              {
                "term": {
                  "xm": {
                    "value": "雷晶晶"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  },
  "aggs": {
    "total_size": {
      "cardinality": {
        "field": "concat_field"
      }
    }
  }
}

Java API:

// Imports assume the 7.x high-level REST client
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import org.elasticsearch.ElasticsearchStatusException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.metrics.ParsedCardinality;
import org.elasticsearch.search.builder.SearchSourceBuilder;

private long getCardinality(Elastic param, HashMap bindParams) {
    // Index to query
    String indexName = "sjck_personnel";
    // Query conditions; the generic type arguments are missing from the listing, so raw types are used
    List options = (List) bindParams.get("conditions");

    // Build the search request and the outermost bool query
    SearchRequest searchRequest = new SearchRequest(indexName);
    BoolQueryBuilder query = QueryBuilders.boolQuery();
    // Fill the bool query from the caller's conditions
    synQueryPersonnelIndexBuilder(query, options);
    // Count the distinct concat_field values among the matching documents
    AggregationBuilder aggregation = AggregationBuilders.cardinality("total_size").field("concat_field");
    // Only the aggregation is needed, so request no hits and skip highlighting and paging
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder().query(query).size(0).aggregation(aggregation);

    // Attach the source builder to the search request
    searchRequest.source(sourceBuilder);
    SearchResponse searchResponse;
    try {
        searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
    } catch (ElasticsearchStatusException e) {
        // Report zero results on failure instead of dereferencing a null response
        logger.error("Search failed; check that index {} exists", indexName, e);
        return 0;
    } catch (IOException e) {
        logger.error("Search on index {} failed", indexName, e);
        return 0;
    }
    // Look the aggregation up by name rather than by position
    ParsedCardinality parsedCardinality = searchResponse.getAggregations().get("total_size");
    return parsedCardinality.getValue();
}

Deduplicated query + pagination, done! (A clumsy approach, but it works.)
