Elasticsearch Notes

1. Elasticsearch Java

1.1 Querying for documents where a field is missing (null)

    queryBuilder.mustNot(QueryBuilders.existsQuery("recheckDetails.reCheckTime"));

1.2 Comparing two fields of a document from Java

// Match documents where reCheckTime is more than 3 days (259200000 ms) after qualityTime.
Map<String, Object> params = new HashMap<>();
Script script = new Script(ScriptType.INLINE, "painless",
    "doc['recheckDetails.reCheckTime'].value - doc['qualityTime'].value > 259200000", params);
ScriptQueryBuilder scriptQueryBuilder = new ScriptQueryBuilder(script);
queryBuilder.must(scriptQueryBuilder);
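
The literal 259200000 in the script above is three days in milliseconds (3 x 24 x 60 x 60 x 1000). The following is a minimal JDK-only sketch of deriving that threshold and passing it through the script parameters instead of hardcoding it. The class name RecheckDelayScript, its helper methods, and the parameter name maxDelayMillis are illustrative assumptions, not part of the Elasticsearch client; the resulting source string and map would be handed to the Script constructor shown above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the Elasticsearch client.
final class RecheckDelayScript {

    // Converts a number of days to milliseconds.
    static long daysToMillis(long days) {
        return TimeUnit.DAYS.toMillis(days);
    }

    // Painless source comparing two date fields against params.maxDelayMillis.
    static String source(String laterField, String earlierField) {
        return "doc['" + laterField + "'].value - doc['" + earlierField
                + "'].value > params.maxDelayMillis";
    }

    // Script parameters carrying the threshold (assumed name: maxDelayMillis).
    static Map<String, Object> params(long days) {
        Map<String, Object> params = new HashMap<>();
        params.put("maxDelayMillis", daysToMillis(days));
        return params;
    }

    public static void main(String[] args) {
        System.out.println(source("recheckDetails.reCheckTime", "qualityTime"));
        System.out.println(params(3));
    }
}
```

Keeping the threshold in params makes the unit explicit and lets the same script text be reused with a different number of days.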

1.3 Custom scripted update (upsert)

// Note: ScriptService.ScriptType and the "groovy" language belong to the older (pre-5.x) client API.
Map<String, Object> params = new HashMap<>();
Map<String, Object> data = new HashMap<>();
data.put("name", "12345");
Map<String, Object> newdata = new HashMap<>();
newdata.put("name", "789");
params.put("data", data);
params.put("newdata", newdata);
// On create, store "data" as the document source; otherwise replace the source with "newdata".
String scriptSource = "if (ctx.op == \"create\") ctx._source=data; else ctx._source=newdata";
Script script = new Script(scriptSource, ScriptService.ScriptType.INLINE, "groovy", params);
UpdateRequestBuilder urb = client.prepareUpdate("active2018-03-11", "active", "16");
urb.setScript(script);
urb.setScriptedUpsert(true);
urb.setUpsert("{}"); // required; without it a "document missing" exception is thrown

 

2. Elasticsearch sorting (★)

2.1 Random ordering

sb.setSize(20); // return 20 documents after the random sort
Map<String, Object> params = new HashMap<>();
Script script = new Script(ScriptType.INLINE, "painless",
        "Math.random()", params);
ScriptSortBuilder scriptSortBuilder = SortBuilders
        .scriptSort(script, ScriptSortBuilder.ScriptSortType.NUMBER).order(SortOrder.DESC);
sb.addSort(scriptSortBuilder);

Script-based sort (REST request body)

{
    "query": {
        "query_string": {
            "query": "",
            "default_operator": "and"
        }
    },
    "sort": {
        "_script": {
            "type": "number",
            "script": {
                "inline": "doc['field_name'].value * params.factor",
                "params": {
                    "factor": 1.1
                }
            },
            "order": "asc"
        }
    }
}

2.2 Sorting by a specific field

public void searchOrderBy() throws Exception {
    SearchRequestBuilder srb = client.prepareSearch("film").setTypes("dongzuo");
    SearchResponse sr = srb.setQuery(QueryBuilders.matchAllQuery())
            .addSort("publishDate", SortOrder.DESC)
            .execute()
            .actionGet(); // sorts all matches; only the first page (default size 10) is returned
    SearchHits hits = sr.getHits();
    for (SearchHit hit : hits) {
        System.out.println(hit.getSourceAsString());
    }
}

2.3 Custom ordering

Custom ordering with a script

Map<String, Object> params = new HashMap<>();
params.put("list", orderLst);
//"if(doc['orderNo.keyword'].value.equals(params.list[45]))return 1;return 0;"
// Documents whose orderNo is in the list get sort key 2, all others 0.
String idOrCode =
        "if(params.list.indexOf(doc['orderNo.keyword'].value)>-1){return 2;}else{return 0;}";
Script script = new Script(ScriptType.INLINE, "painless",
        idOrCode, params);
ScriptSortBuilder scriptSortBuilder = SortBuilders
        .scriptSort(script, ScriptSortBuilder.ScriptSortType.NUMBER).order(SortOrder.DESC);
searchRequestBuilder.addSort(scriptSortBuilder);


// Variant: use the position in the list as the sort key; documents not in the list get -2.
String idOrCode =
        "if(params.list.indexOf(doc['orderNo.keyword'].value)>-1)" +
                "{return params.list.indexOf(doc['orderNo.keyword'].value);}else{return -2;}";

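To see what order the two sort scripts above produce, the sort key can be reproduced in plain Java. The sketch below is only an in-memory analogue of the Painless logic combined with SortOrder.DESC; the class OrderRankDemo and its sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory analogue of the two sort scripts.
final class OrderRankDemo {

    // Mirrors the first script: 2 when the order number is in the list, else 0.
    static int membershipKey(List<String> list, String orderNo) {
        return list.indexOf(orderNo) > -1 ? 2 : 0;
    }

    // Mirrors the second script: position in the list, or -2 when absent.
    static int positionKey(List<String> list, String orderNo) {
        int idx = list.indexOf(orderNo);
        return idx > -1 ? idx : -2;
    }

    // Sorts by positionKey in descending order, like SortOrder.DESC.
    static List<String> sortByPositionDesc(List<String> list, List<String> orderNos) {
        List<String> sorted = new ArrayList<>(orderNos);
        sorted.sort(Comparator.comparingInt((String o) -> positionKey(list, o)).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        List<String> preferred = Arrays.asList("A1", "B2", "C3");
        List<String> hits = Arrays.asList("Z9", "A1", "C3", "B2");
        System.out.println(sortByPositionDesc(preferred, hits));
    }
}
```

With descending order, entries later in the list sort first and documents not in the list sort last, so the list order is effectively reversed in the result.
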
3. Elasticsearch aggregations

3.1 Counting by date

public void test(@RequestParam("isFilter") boolean isFilter) {
    try {
        client = ElasticSearchClient.getClient();
        BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
        if (isFilter) {
            // Keep only quality-check items whose score is below 0.
            // Caution: a range query on a keyword field compares strings, not numbers.
            queryBuilder.must(QueryBuilders.rangeQuery("itemScore.keyword").lt(0L));
        }
        // One bucket per day of createTime.
        AggregationBuilder aggregation = AggregationBuilders
                .dateHistogram("agg")
                .field("createTime")
                .format("yyyy-MM-dd")
                .dateHistogramInterval(DateHistogramInterval.DAY);
        // Group by order (declared here but not attached to the request below).
        AggregationBuilder nameAgg = AggregationBuilders.terms("order").field("orderNo.keyword").size(1000);
        // Group by item.
        AggregationBuilder itemAgg = AggregationBuilders.terms("item").field("item.keyword");
        // Distinct count of orderNo.
        AggregationBuilder countAgg = AggregationBuilders.cardinality("count").field("orderNo.keyword");
        aggregation.subAggregation(countAgg).subAggregation(itemAgg);

        SearchRequestBuilder reqBuilder = client.prepareSearch(ESConst.INDEX_VOICEPLATFORM)
                .setTypes(ESConst.TYPE_VOICE_ITEM_SCORE).setQuery(queryBuilder).setSize(0)
                .addAggregation(aggregation);

        SearchResponse resp = reqBuilder.execute().actionGet();
        logger.info("Aggregation result> " + resp);

        Histogram histogram = resp.getAggregations().get("agg");
        for (Histogram.Bucket entry : histogram.getBuckets()) {
            // Bucket key rendered with the "yyyy-MM-dd" format set on the aggregation.
            String key = entry.getKeyAsString();
            // Doc count
            long count = entry.getDocCount();
            // Distinct orderNo count for this day.
            Cardinality cardinality = entry.getAggregations().get("count");
            // Per-item buckets for this day.
            Terms terms = entry.getAggregations().get("item");
            for (Terms.Bucket term : terms.getBuckets()) {
                long value = term.getDocCount();
                // System.out.println("itemCount:" + value);
            }
            logger.info(key + ": " + cardinality.getValue() + " distinct orders");
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
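
As a rough illustration of what the date histogram with a cardinality sub-aggregation computes, here is a plain-Java, in-memory analogue: documents are grouped by day and the distinct order numbers are counted per day. The class DailyDistinctDemo, the Doc type, and the sample values are hypothetical. Elasticsearch's cardinality aggregation is approximate, whereas this sketch counts exactly.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical in-memory analogue of date_histogram + cardinality.
final class DailyDistinctDemo {

    // Stand-in for one indexed document.
    static final class Doc {
        final String day;     // createTime already formatted as yyyy-MM-dd
        final String orderNo;

        Doc(String day, String orderNo) {
            this.day = day;
            this.orderNo = orderNo;
        }
    }

    // Per day: number of distinct orderNo values, sorted by day.
    static Map<String, Integer> distinctOrdersPerDay(List<Doc> docs) {
        Map<String, Set<String>> seen = new TreeMap<>();
        for (Doc d : docs) {
            seen.computeIfAbsent(d.day, k -> new HashSet<>()).add(d.orderNo);
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : seen.entrySet()) {
            result.put(e.getKey(), e.getValue().size());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("2018-09-01", "o1"),
                new Doc("2018-09-01", "o1"),
                new Doc("2018-09-01", "o2"),
                new Doc("2018-09-02", "o3"));
        System.out.println(distinctOrdersPerDay(docs));
    }
}
```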

Aggregation request body for the Elasticsearch REST API:

{
  "aggs": {
    "sales": {
      "date_histogram": {
        "field": "createTime",
        "interval": "month",
        "format": "yyyy-MM-dd",
        "min_doc_count": 0,
        "extended_bounds": {
          "min": 1535731200000,
          "max": 1544544000000
        }
      },
      "aggs": {
        "item": {
          "terms": {
            "field": "item.keyword"
          }
        }
      }
    }
  }
}
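
The extended_bounds values in the request above are epoch milliseconds. Assuming the time zone is Asia/Shanghai (an inference, not stated in these notes), the two literals correspond to local midnight on 2018-09-01 and 2018-12-12. A minimal JDK-only sketch for computing such bounds follows; the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.ZoneId;

// Hypothetical helper for computing extended_bounds values.
final class ExtendedBoundsDemo {

    // Epoch milliseconds of local midnight at the start of the given date.
    static long startOfDayMillis(LocalDate date, ZoneId zone) {
        return date.atStartOfDay(zone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("Asia/Shanghai"); // assumed time zone
        System.out.println(startOfDayMillis(LocalDate.of(2018, 9, 1), zone));
        System.out.println(startOfDayMillis(LocalDate.of(2018, 12, 12), zone));
    }
}
```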

3.2 Group-by counts

POST request URL: http://IP:9200/index/type/_search/
Note: size must be set on the terms aggregation; otherwise only 10 buckets are returned.
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "actType.keyword": "login"
          }
        }
      ]
    }
  },
  "aggs": {
    "sales": {
      "terms": {
        "field": "userName.keyword",
        "size": 1000
      }
    }
  }
}
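Nested aggregations: http://IP:9200/index/type/_search/
Note: extended_bounds (together with min_doc_count: 0) forces buckets to be returned for the whole min..max range, for example a full year, even when they are empty.
{
  "aggs": {
    "sales": {
      "date_histogram": {
        "field": "createTime",
        "interval": "month",
        "format": "yyyy-MM-dd",
        "min_doc_count": 0,
        "extended_bounds": {
          "min": 1535731200000,
          "max": 1544544000000
        }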
      },
      "aggs": {
        "item": {
          "terms": {
            "field": "item.keyword",
            "size": 100
          },
          "aggs": {
            "groupBY": {
              "cardinality": {
                "field": "orderNo.keyword"
              }
            }
          }
        }
      }
    }
  }
}

 

3.3 Group-by counts with a script

TermsAggregationBuilder teamAgg = AggregationBuilders.terms("code_count").field("code.keyword");
// Plain per-field groupings (declared here but not attached to the request below).
TermsAggregationBuilder subAgg = AggregationBuilders.terms("type_count").field("type.keyword");
TermsAggregationBuilder sub1Agg = AggregationBuilders.terms("typeTag_count").field("typeTag.keyword");
// Group by the composite key "typeTag;type" built by a Painless script.
Script script = new Script(ScriptType.INLINE,
        "painless", "doc['typeTag.keyword'].value+';'+doc['type.keyword'].value",
        new HashMap<String, Object>());
TermsAggregationBuilder callTypeTeamAgg = AggregationBuilders.terms("type_count").script(script);
searchRequestBuilder.addAggregation(teamAgg.subAggregation(callTypeTeamAgg));
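
The script above groups on a composite key of the form typeTag;type. The following plain-Java sketch shows the same idea in memory: build the key, count per key, and split a bucket key back into its parts. The class CompositeKeyDemo and its sample rows are hypothetical, and the approach assumes the separator never occurs inside the field values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory analogue of grouping on a scripted composite key.
final class CompositeKeyDemo {

    static final String SEP = ";";

    // Builds the composite key the same way the script does.
    static String compositeKey(String typeTag, String type) {
        return typeTag + SEP + type;
    }

    // Counts rows per composite key; each row is {typeTag, type}.
    static Map<String, Long> countByKey(List<String[]> rows) {
        Map<String, Long> counts = new TreeMap<>();
        for (String[] row : rows) {
            counts.merge(compositeKey(row[0], row[1]), 1L, Long::sum);
        }
        return counts;
    }

    // Splits a bucket key back into {typeTag, type}.
    static String[] splitKey(String key) {
        return key.split(SEP, 2);
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"in", "call"},
                new String[] {"in", "call"},
                new String[] {"out", "call"});
        System.out.println(countByKey(rows));
        System.out.println(Arrays.toString(splitKey("in;call")));
    }
}
```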

3.4 Implementing a HAVING-style filter on an aggregation

client = ElasticSearchClient.getClient();
BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery();
SearchRequestBuilder searchRequestBuilder;

searchRequestBuilder = client.prepareSearch(ESConst.INDEX_DATA);
searchRequestBuilder.setTypes(ESConst.TYPE_KFGDXX);
searchRequestBuilder.setQuery(queryBuilder);
searchRequestBuilder.setSize(0);
TermsAggregationBuilder ldhm = AggregationBuilders.terms("ldhm_count")
        .field("ldhm.keyword").size(Integer.MAX_VALUE);

// Declare the buckets path used by the bucket selector below.
Map<String, String> bucketsPathsMap = new HashMap<>(8);
bucketsPathsMap.put("orderCount", "_count");
// Keep only buckets whose doc_count is greater than 1.
Script script = new Script("params.orderCount >1");
// Build the bucket selector.
BucketSelectorPipelineAggregationBuilder bs = PipelineAggregatorBuilders
        .bucketSelector("having", bucketsPathsMap, script);

searchRequestBuilder.addAggregation(ldhm.subAggregation(bs));
SearchResponse response = searchRequestBuilder.execute().actionGet();
Terms terms = response.getAggregations().get("ldhm_count");
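
In SQL terms, the terms aggregation plus bucket selector above behaves like GROUP BY ldhm HAVING COUNT(*) > 1. A plain-Java, in-memory analogue of that filter is sketched below; the class HavingDemo and its sample values are hypothetical and only illustrate the semantics, not the Elasticsearch API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory analogue of terms + bucket_selector.
final class HavingDemo {

    // Counts occurrences per value and keeps only groups with count > minExclusive.
    static Map<String, Long> groupsWithCountAbove(List<String> values, long minExclusive) {
        Map<String, Long> counts = new TreeMap<>();
        for (String v : values) {
            counts.merge(v, 1L, Long::sum);
        }
        // Drop the groups that fail the HAVING condition.
        counts.values().removeIf(c -> c <= minExclusive);
        return counts;
    }

    public static void main(String[] args) {
        List<String> numbers = Arrays.asList("a", "b", "a", "c", "b", "a");
        System.out.println(groupsWithCountAbove(numbers, 1));
    }
}
```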

3.5 Returning only selected fields from an aggregation

// Only these _source fields are returned for each top hit.
String[] includes = {"ejzlmc", "ejzlbs"};
TopHitsAggregationBuilder top = AggregationBuilders.topHits("top")
        .fetchSource(includes, Strings.EMPTY_ARRAY).size(20);
searchRequestBuilder.addAggregation(ldhm.subAggregation(top));
// "ejzlbs_count" must match the name of the terms aggregation in the request; the keys read
// below have the form "ejzlbs;ldhm", i.e. a script-built composite key as in section 3.3.
Terms terms = response.getAggregations().get("ejzlbs_count");

for (Terms.Bucket bucket : terms.getBuckets()) {
    KFGDAgg kfgdAgg = new KFGDAgg();
    String[] keyParts = bucket.getKeyAsString().split(";");
    kfgdAgg.setEjzlbs(keyParts[0]);
    kfgdAgg.setLdhm(keyParts[1]);
    TopHits topHits = bucket.getAggregations().get("top");
    // "ywlbdm" is only present if it is listed in includes; otherwise get(...) returns null.
    System.out.println(topHits.getHits().getAt(0).getSource().get("ywlbdm"));
    for (SearchHit hit : topHits.getHits()) {
        System.out.println(hit.getSource().get("ywlbdm"));
    }
}


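Notes:

Elasticsearch version: 5.5.1

topHits returns the top k documents of each bucket; k is set with size(k). For deduplication, use k = 1.

fetchSource(String[] includes, String[] excludes) returns only part of the fields. It takes two string arrays: includes lists the fields to return and excludes lists the fields to leave out. excludes may be empty, which is why it is set to Strings.EMPTY_ARRAY here.

size((1<<31)-1), i.e. Integer.MAX_VALUE, returns all groups. Versions before ES 5.x used size(0) to return all of them.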

 

4. Elasticsearch data backup tools

4.1 Installing elasticsearch-dump

Elasticsearch data backup tool:
https://github.com/taskrabbit/elasticsearch-dump
Installing elasticsearch-dump:
(local)
npm install elasticdump
./bin/elasticdump

(global)
npm install elasticdump -g
elasticdump

4.2 Using elasticdump

1. Export the data of a single type in an index:

elasticdump --input=http://IP:9200/Index/type --output=data.json --type=data

2. Export the data of a whole index:

elasticdump --input=http://IP:9200/Index --output=data.json --type=data

3. Import:

elasticdump --input=data.json --output=http://localhost:9200 --type=data
