1. Establish a connection
Client client = TransportClient.builder().build()
    .addTransportAddress(new InetSocketTransportAddress(
        InetAddress.getByName("localhost"), 9300));
2. Close the connection
client.close();
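3. Create an index
The index below is named "index_name" and its type is named "type_name". It has only two fields, name and age: name is of type string and age is of type long. store controls whether the field's value is stored separately from _source so that it can be retrieved on its own; it does not control whether the field is searchable, so store can stay at no for a field such as age that never needs to be fetched separately. index controls how the field is indexed: for string fields it defaults to analyzed (other types default to not_analyzed), which means the value is tokenized. Because we want exact matches at query time, the value should not be tokenized, so name is set to not_analyzed.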
client.admin().indices().prepareCreate("index_name").execute().actionGet();
XContentBuilder builder = XContentFactory
    .jsonBuilder()
    .startObject()
        .startObject("type_name")
            .startObject("properties")
                .startObject("name")
                    .field("type", "string")
                    .field("store", "yes")
                    .field("index", "not_analyzed")
                .endObject()
                .startObject("age")
                    .field("type", "long")
                    .field("store", "no")
                .endObject()
            .endObject()
        .endObject()
    .endObject();
PutMappingRequest mapping = Requests.putMappingRequest("index_name").type("type_name").source(builder);
client.admin().indices().putMapping(mapping).actionGet();
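To see why name is mapped as not_analyzed, here is a minimal plain-Java sketch that imitates the two indexing modes without calling Elasticsearch at all. The class AnalyzedSketch and its lowercase-and-split-on-whitespace tokenizer are assumptions made for illustration; the real analyzers do considerably more.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only: this is not Elasticsearch code.
class AnalyzedSketch {
    // "analyzed": the value is split into lowercase tokens and each token is indexed.
    static Set<String> analyzedTerms(String value) {
        return new HashSet<>(Arrays.asList(value.toLowerCase(Locale.ROOT).split("\\s+")));
    }

    // "not_analyzed": the whole value is indexed as a single, unchanged term.
    static Set<String> notAnalyzedTerms(String value) {
        return Collections.singleton(value);
    }

    // A term query matches only when the exact term is present in the index.
    static boolean termQueryMatches(Set<String> indexedTerms, String term) {
        return indexedTerms.contains(term);
    }

    public static void main(String[] args) {
        String name = "Hu Yi";
        System.out.println(termQueryMatches(analyzedTerms(name), "Hu Yi"));    // false
        System.out.println(termQueryMatches(notAnalyzedTerms(name), "Hu Yi")); // true
        System.out.println(termQueryMatches(analyzedTerms(name), "hu"));       // true
    }
}
```

Under these assumptions, an exact lookup of the full value only succeeds when the value was indexed as one term, which is the behaviour the mapping asks for on name.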
4. Delete an index
DeleteIndexResponse dResponse = client.admin().indices().prepareDelete(index)
.execute().actionGet();
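5. Insert data
You can actually skip the index-creation step above: inserting data into an index that does not exist yet creates the index automatically and then inserts the data. The only difference is that everything about that index uses the defaults.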
String json = "{\"name\":\"huyi\", \"age\":18}";
IndexResponse response = client.prepareIndex(index, type)
.setSource(json).execute().actionGet();
This inserts one JSON document into the index index under the type type. You can also insert a …
Bulk insertion is just as simple:
BulkRequestBuilder bulkRequest = client.prepareBulk();
for(String json : jsons){
bulkRequest.add(client.prepareIndex(index, type).setSource(json));
}
bulkRequest.execute().actionGet();
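6. Query by condition
First build the condition, which plays the role of the WHERE clause in SQL and filters the data.
For example, to find records where name = "huyi":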
QueryBuilder query = QueryBuilders.termQuery("name","huyi");
To find records where name LIKE "huyi" (a match query, which analyzes the search text before matching):
QueryBuilder query = QueryBuilders.matchQuery("name","huyi");
To find records with age between 18 and 20 (range queries work on age even though it was mapped with store set to no):
QueryBuilder query = QueryBuilders.rangeQuery("age").gte(18).lte(20);
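gte means "greater than or equal to"; use gt for strictly greater than. lte and lt work the same way.
Often the condition is a combination of many conditions joined with AND or OR. Before combining them, first define a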
BoolQueryBuilder query = QueryBuilders.boolQuery();
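Then use
query.must(query1);
query.must(query2);
or
query.should(query1);
query.should(query2);
which mean query1 AND query2 and query1 OR query2 respectively. This joins two conditions; more conditions can be added in the same way, and query1 can itself be a BoolQueryBuilder, so the combinations can be nested.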
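The AND/OR reading of must and should can be sketched with plain Java predicates, with no Elasticsearch calls involved. BoolSketch, Person, nameIs and ageBetween are hypothetical names introduced for this illustration, and the sketch only covers the matching logic, not scoring.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only: models must/should with plain predicates.
class BoolSketch {
    static final class Person {
        final String name;
        final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
    }

    // Stand-ins for a term query on name and a range query on age.
    static Predicate<Person> nameIs(String name) { return p -> p.name.equals(name); }
    static Predicate<Person> ageBetween(int lo, int hi) { return p -> p.age >= lo && p.age <= hi; }

    // must: every clause has to match (AND).
    static <T> Predicate<T> must(List<Predicate<T>> clauses) {
        return doc -> clauses.stream().allMatch(c -> c.test(doc));
    }

    // should, when there are no must clauses: at least one clause has to match (OR).
    static <T> Predicate<T> should(List<Predicate<T>> clauses) {
        return doc -> clauses.stream().anyMatch(c -> c.test(doc));
    }

    public static void main(String[] args) {
        Person p = new Person("huyi", 25);
        Predicate<Person> and = must(Arrays.asList(nameIs("huyi"), ageBetween(18, 20)));
        Predicate<Person> or = should(Arrays.asList(nameIs("huyi"), ageBetween(18, 20)));
        System.out.println(and.test(p)); // false: the age clause fails
        System.out.println(or.test(p));  // true: the name clause matches
    }
}
```

Because must and should both return a predicate, one combined condition can be passed as a clause of another, which mirrors nesting one BoolQueryBuilder inside another.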
Once the query condition query is built:
SearchRequestBuilder search = client
.prepareSearch(INDEX_NAME)
.setTypes(TYPE_NAME)
.setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
.setSize(10)
.setQuery(query);
SearchResponse response = search.execute().actionGet();
SearchHit[] hits = response.getHits().getHits();
hits is the array of results; iterate over it to get the actual data:
for(SearchHit hit : hits){
//hit.getSourceAsString()
//hit.field("name")
}
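Note that because Elasticsearch was built for search, setSize() is the maximum number of results returned, ranked by score; if it is not specified, the default is 10. What if you want the full data set? Use scroll.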
SearchResponse scrollResp = client.prepareSearch(INDEX_NAME)
.setTypes(TYPE_NAME)
.setScroll(new TimeValue(60000))
.setQuery(query)
.setSize(100).get(); //max of 100 hits will be returned for each scroll
//Scroll until no hits are returned
do {
for (SearchHit hit : scrollResp.getHits().getHits()) {
String json = hit.getSourceAsString();
//...
}
scrollResp = client.prepareSearchScroll(scrollResp.getScrollId()).setScroll(new TimeValue(60000)).execute().actionGet();
} while(scrollResp.getHits().getHits().length != 0); // Zero hits mark the end of the scroll and the while loop.
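The control flow of the scroll loop can be tried out without a cluster. The sketch below replaces the scroll with a hypothetical in-memory cursor, ScrollSketch, whose nextBatch method stands in for prepareSearchScroll; both names are assumptions and only the loop shape is taken from the code above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: an in-memory stand-in for a scroll cursor.
class ScrollSketch {
    private final List<String> docs;
    private final int size;
    private int position = 0;

    ScrollSketch(List<String> docs, int size) {
        this.docs = docs;
        this.size = size;
    }

    // Returns the next batch of at most `size` documents, or an empty list at the end.
    List<String> nextBatch() {
        int end = Math.min(position + size, docs.size());
        List<String> batch = new ArrayList<>(docs.subList(position, end));
        position = end;
        return batch;
    }

    // Same shape as the do/while loop: process a batch, fetch the next one, stop on empty.
    static List<String> drain(ScrollSketch scroll) {
        List<String> all = new ArrayList<>();
        List<String> batch = scroll.nextBatch();
        do {
            all.addAll(batch);
            batch = scroll.nextBatch();
        } while (!batch.isEmpty());
        return all;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            docs.add("doc" + i);
        }
        // Five documents fetched two at a time: batches of 2, 2, 1, then an empty batch.
        System.out.println(drain(new ScrollSketch(docs, 2)).size()); // 5
    }
}
```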
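7. Aggregation
Aggregation is one of the most important features of ES; it works much like GROUP BY in SQL.
Suppose you want SELECT name, SUM(age) ... GROUP BY name.
First build the aggregation:
//group by name; this aggregation is called name_term
TermsBuilder tb = AggregationBuilders.terms("name_term").field("name").size(0);
//sum age; this aggregation is called age_sum
SumBuilder sb = AggregationBuilders.sum("age_sum").field("age");
//sets the order: group by name first, then sum within each group
tb.subAggregation(sb);
//search is the SearchRequestBuilder built earlier; this adds the aggregation to that query
search.addAggregation(tb);
Then read the aggregation results:
//look up the result by the name given to the name aggregation
Terms terms = search.get().getAggregations().get("name_term");
for(Terms.Bucket bucket : terms.getBuckets()) {
    String name = bucket.getKeyAsString();
    //look up the result by the name given to the age sum
    Sum sum = bucket.getAggregations().get("age_sum");
    double ageSum = sum.getValue();
    //...
}
In the 2.x API used here, be sure to add size(0) to the terms aggregation so that all buckets are returned; otherwise only 10 buckets are returned by default. subAggregation expresses order: the data is grouped by the outer aggregation first and then by the inner one, and this can be nested to any depth.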
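For readers who think in SQL, the result of this terms-plus-sum aggregation is the same as a group-by computed in memory. The sketch below does that with Java streams; GroupBySketch and Person are hypothetical names and the sample data is made up for the illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration only: an in-memory SELECT name, SUM(age) ... GROUP BY name.
class GroupBySketch {
    static final class Person {
        final String name;
        final long age;
        Person(String name, long age) { this.name = name; this.age = age; }
    }

    // One map entry per distinct name (the terms buckets), holding the summed age (the sum sub-aggregation).
    static Map<String, Double> sumAgeByName(List<Person> people) {
        return people.stream().collect(Collectors.groupingBy(
                p -> p.name,
                TreeMap::new,
                Collectors.summingDouble(p -> p.age)));
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("huyi", 18),
                new Person("huyi", 20),
                new Person("lisi", 30));
        System.out.println(sumAgeByName(people)); // prints {huyi=38.0, lisi=30.0}
    }
}
```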
For SELECT SUM(a), SUM(b) ... GROUP BY c, d you can write:
TermsBuilder cTb = AggregationBuilders.terms("cAgg").field(c).size(0);
TermsBuilder dTb = AggregationBuilders.terms("dAgg").field(d).size(0);
SumBuilder aSb = AggregationBuilders.sum("aSum").field(a);
SumBuilder bSb = AggregationBuilders.sum("bSum").field(b);
dTb.subAggregation(aSb);
dTb.subAggregation(bSb);
cTb.subAggregation(dTb);
search.addAggregation(cTb);
To read the data, peel the layers off in the same order:
Terms cTerms = search.get().getAggregations().get("cAgg");
for(Terms.Bucket cBucket : cTerms.getBuckets()) {
    //dAgg is a sub-aggregation of cAgg, so read it from the current c bucket
    Terms dTerms = cBucket.getAggregations().get("dAgg");
    for(Terms.Bucket dBucket : dTerms.getBuckets()) {
        Sum aSum = dBucket.getAggregations().get("aSum");
        Sum bSum = dBucket.getAggregations().get("bSum");
        //...
    }
}
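The layer-by-layer reading can be mirrored with nested maps, where each outer entry owns its own inner entries in the same way each c bucket owns its own d buckets. This is only a sketch in plain Java: NestedGroupBySketch, Row and the sample rows are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: an in-memory SELECT SUM(a), SUM(b) ... GROUP BY c, d.
class NestedGroupBySketch {
    static final class Row {
        final String c;
        final String d;
        final double a;
        final double b;
        Row(String c, String d, double a, double b) { this.c = c; this.d = d; this.a = a; this.b = b; }
    }

    // Outer key: value of c. Inner key: value of d. Value: {SUM(a), SUM(b)}.
    static Map<String, Map<String, double[]>> group(List<Row> rows) {
        Map<String, Map<String, double[]>> byC = new TreeMap<>();
        for (Row r : rows) {
            double[] sums = byC
                    .computeIfAbsent(r.c, k -> new TreeMap<>())
                    .computeIfAbsent(r.d, k -> new double[2]);
            sums[0] += r.a;
            sums[1] += r.b;
        }
        return byC;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("x", "p", 1, 10),
                new Row("x", "p", 2, 20),
                new Row("x", "q", 3, 30),
                new Row("y", "p", 4, 40));
        // The inner loop reads from the current outer entry, not from the top-level result.
        for (Map.Entry<String, Map<String, double[]>> cEntry : group(rows).entrySet()) {
            for (Map.Entry<String, double[]> dEntry : cEntry.getValue().entrySet()) {
                System.out.println(cEntry.getKey() + ", " + dEntry.getKey()
                        + ": aSum=" + dEntry.getValue()[0] + ", bSum=" + dEntry.getValue()[1]);
            }
        }
    }
}
```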
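8. Nested
ES stores documents as JSON, so it supports nested queries. To create an index with a nested structure, add
.field("type","nested")
As in the example from the official documentation, resellers can be viewed as an array in which each element has a name and a price.
If type is not explicitly declared as nested, the field is not of nested type even when the data is written in a nested shape. To find records that contain a reseller whose name is huyi: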
QueryBuilders.nestedQuery("resellers",
QueryBuilders.termQuery("resellers.name", "huyi"))
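Why the explicit nested type matters can be shown with a small plain-Java model. It assumes the documented behaviour that an array of plain objects is flattened into one list of values per field, while nested elements are matched one at a time; NestedSketch, Reseller and the sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: contrasts flattened object arrays with nested elements.
class NestedSketch {
    static final class Reseller {
        final String name;
        final double price;
        Reseller(String name, double price) { this.name = name; this.price = price; }
    }

    // Plain object field: values are flattened per field, so the name and the price
    // may come from different resellers and still satisfy the query together.
    static boolean flattenedMatch(List<Reseller> resellers, String name, double price) {
        boolean anyName = resellers.stream().anyMatch(r -> r.name.equals(name));
        boolean anyPrice = resellers.stream().anyMatch(r -> r.price == price);
        return anyName && anyPrice;
    }

    // nested field: both conditions must hold inside the same reseller.
    static boolean nestedMatch(List<Reseller> resellers, String name, double price) {
        return resellers.stream().anyMatch(r -> r.name.equals(name) && r.price == price);
    }

    public static void main(String[] args) {
        List<Reseller> resellers = Arrays.asList(
                new Reseller("huyi", 10),
                new Reseller("other", 20));
        // Is there a reseller named huyi with price 20? No single element qualifies.
        System.out.println(flattenedMatch(resellers, "huyi", 20)); // true (cross-element match)
        System.out.println(nestedMatch(resellers, "huyi", 20));    // false
    }
}
```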
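To group by name and then sum price for each name:
AggregationBuilder aggregation = AggregationBuilders.nested("nestedAgg") //name of the aggregation
    .path("resellers") //descend into this nested structure
    .subAggregation(AggregationBuilders.terms("nameAgg")
        .field("resellers.name").size(0) //group identical names together
        .subAggregation( //sum price within each group
            AggregationBuilders.sum("priceSum").field("resellers.price")));
Results are still read layer by layer in aggregation order; the nested aggregation corresponds to the Nested result type.
If you aggregate inside the nested structure and then need to aggregate on the outer document again, use ReverseNested. Think of nested as descending into a subdirectory (/) and reverseNested as going back up to the parent directory (../). The official documentation covers this in detail: https://www.elastic.co/guide/en/elasticsearch/reference/2.4/search-aggregations-bucket-nested-aggregation.html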