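Operating ElasticSearch from Java
The default cluster name is elasticsearch. If the cluster name configured in the client does not match the actual cluster name, an error is thrown as soon as the client tries to use the nodes.
Setting client.transport.sniff enables sniffing: you only need to specify one node of the cluster (not necessarily the master node) and the client discovers the other nodes from it. As long as the program keeps running, it can still reach the other nodes even if that first node goes down.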
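ES has four search types. In the table, "coordinating node" means the node that receives the client request.
Search type | Description | Characteristics |
---|---|---|
QUERY_AND_FETCH | The coordinating node forwards the query to all shards. Each shard scores and sorts with its own local statistics (term frequency and document frequency) and returns full results. The coordinating node merges and re-sorts all of them and replies to the client. Only one round trip to the shards is needed | ① Data volume and ranking problems: the coordinating node collects the full results from every shard, so the volume is large ② the scoring statistics may differ between shards |
QUERY_THEN_FETCH | The coordinating node forwards the request to all shards. Each shard scores and sorts, then returns only document ids and scores. The coordinating node merges and sorts them, then reads the actual documents from the relevant nodes by id and returns them to the client. Two round trips to the shards are needed | Solves the data volume problem, but the ranking problem remains. This is the ES default |
DFS_QUERY_AND_FETCH | Like QUERY_AND_FETCH, except that the scoring statistics of all shards are unified before scoring | Solves the ranking problem, but the data volume problem remains |
DFS_QUERY_THEN_FETCH | Like QUERY_THEN_FETCH, except that the scoring statistics of all shards are unified before scoring | Solves both the ranking and the data volume problems, but is the slowest |
Characteristics:
The types differ along three axes: two round trips or one; unified scoring statistics or per-shard statistics; shards return full documents or only ids.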
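The two-phase flow of QUERY_THEN_FETCH can be sketched without a cluster. The following is a toy in-memory model, not Elasticsearch code: the class `QueryThenFetchSketch`, its methods, and the sample scores are all hypothetical, and each "shard" is just a map from document id to score.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of QUERY_THEN_FETCH; every name here is illustrative, not an ES API. */
final class QueryThenFetchSketch {

    /** What a shard returns in the query phase: only an id and a score. */
    static final class ShardHit {
        final int shard;
        final String id;
        final double score;

        ShardHit(int shard, String id, double score) {
            this.shard = shard;
            this.id = id;
            this.score = score;
        }
    }

    /** Query phase: a shard sorts locally and returns its best topN ids and scores. */
    static List<ShardHit> queryPhase(int shard, Map<String, Double> scores, int topN) {
        List<ShardHit> hits = new ArrayList<>();
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            hits.add(new ShardHit(shard, e.getKey(), e.getValue()));
        }
        hits.sort(Comparator.comparingDouble((ShardHit h) -> h.score).reversed());
        return new ArrayList<>(hits.subList(0, Math.min(topN, hits.size())));
    }

    /** Merge step on the coordinating node: global sort, then keep the requested page. */
    static List<ShardHit> merge(List<List<ShardHit>> perShard, int from, int size) {
        List<ShardHit> all = new ArrayList<>();
        for (List<ShardHit> hits : perShard) {
            all.addAll(hits);
        }
        all.sort(Comparator.comparingDouble((ShardHit h) -> h.score).reversed());
        int start = Math.min(from, all.size());
        int end = Math.min(from + size, all.size());
        return new ArrayList<>(all.subList(start, end));
    }

    /** Runs both steps and returns the ids a fetch phase would then load. */
    static List<String> search(List<Map<String, Double>> shards, int from, int size) {
        List<List<ShardHit>> perShard = new ArrayList<>();
        for (int i = 0; i < shards.size(); i++) {
            // every shard must return from + size entries, not just size
            perShard.add(queryPhase(i, shards.get(i), from + size));
        }
        List<String> ids = new ArrayList<>();
        for (ShardHit h : merge(perShard, from, size)) {
            ids.add(h.id);
        }
        return ids;
    }

    /** Two sample shards with made-up scores. */
    static List<Map<String, Double>> demoShards() {
        Map<String, Double> shard0 = new LinkedHashMap<>();
        shard0.put("a", 0.9);
        shard0.put("b", 0.2);
        Map<String, Double> shard1 = new LinkedHashMap<>();
        shard1.put("c", 0.7);
        shard1.put("d", 0.5);
        List<Map<String, Double>> shards = new ArrayList<>();
        shards.add(shard0);
        shards.add(shard1);
        return shards;
    }

    public static void main(String[] args) {
        System.out.println(search(demoShards(), 0, 3)); // prints [a, c, d]
    }
}
```

Only ids and scores cross the wire in the first step, which is why this type avoids the data volume problem. The scores are still computed per shard, so the ranking problem described in the table is untouched.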
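Paging can be specified when querying from both curl and Java, but the deeper the page, the higher the load on the server. Most search engines do not offer very deep pages, for two reasons: users rarely look at deep pages because the results become less relevant, and deep pages are expensive for the server.
For example, with 5 shards and a page size of 10, fetching records 10000 to 10010 requires a query on every shard. Each shard produces 10010 sorted records and returns them to the coordinating node, which receives 5 × 10010 = 50050 records in total, merges them, and finally returns only the 10 records from 10000 to 10010 to the client. The request appears to ask for just 10 records, but ES actually has to merge more than 50,000, so the larger the page number, the higher the load on the server.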
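The arithmetic above can be written down as a small helper. This is only a sketch of the counting argument; `DeepPagingCost` and `docsMerged` are hypothetical names, not part of any ES API.

```java
/** Counts how many entries the coordinating node has to merge for one page. */
final class DeepPagingCost {

    /**
     * Every shard returns its top (from + size) entries,
     * so the coordinating node merges shards * (from + size) entries.
     */
    static long docsMerged(int shards, long from, int size) {
        if (shards <= 0 || from < 0 || size < 0) {
            throw new IllegalArgumentException("shards must be positive, from and size non-negative");
        }
        return shards * (from + size);
    }

    public static void main(String[] args) {
        System.out.println(docsMerged(5, 0, 10));     // first page: prints 50
        System.out.println(docsMerged(5, 10000, 10)); // deep page: prints 50050
    }
}
```

The cost grows linearly with `from`, which is why the same page size is cheap on page 1 and expensive on page 1000.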
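When a query touches a large amount of data, a timeout can be specified. Once that time is reached, whatever has been found so far is returned and the connection is closed. This gives a better user experience; the drawback is that the returned data may be incomplete.
Both Java and curl can specify a timeout.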
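The trade-off (answer on time, possibly with incomplete data) can be illustrated with plain Java concurrency. This is an analogy under stated assumptions, not how ES implements its timeout: the simulated shards, the class `PartialResultsOnTimeout`, and the delays are all made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Gathers results from simulated shards and gives up on the slow ones. */
final class PartialResultsOnTimeout {

    /** Hits collected in time, plus a flag telling the caller the result may be incomplete. */
    static final class Result {
        final List<String> hits = new ArrayList<>();
        boolean timedOut;
    }

    /** A simulated shard that answers with the given documents after a delay. */
    static Callable<List<String>> shard(final long delayMillis, final String... docs) {
        return () -> {
            Thread.sleep(delayMillis);
            return Arrays.asList(docs);
        };
    }

    static Result gather(List<Callable<List<String>>> shards, long timeoutMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(shards.size());
        Result result = new Result();
        try {
            // invokeAll cancels every task that has not finished when the timeout expires
            List<Future<List<String>>> futures =
                    pool.invokeAll(shards, timeoutMillis, TimeUnit.MILLISECONDS);
            for (Future<List<String>> future : futures) {
                if (future.isCancelled()) {
                    result.timedOut = true; // this shard was too slow; its hits are missing
                    continue;
                }
                try {
                    result.hits.addAll(future.get());
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
        } finally {
            pool.shutdownNow();
        }
        return result;
    }

    /** Two fast shards and one that is far slower than the 500 ms timeout. */
    static Result demo() {
        List<Callable<List<String>>> shards = new ArrayList<>();
        shards.add(shard(0, "doc1"));
        shards.add(shard(0, "doc2"));
        shards.add(shard(5000, "doc3"));
        try {
            return gather(shards, 500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Result result = demo();
        System.out.println(result.hits + " timedOut=" + result.timedOut);
    }
}
```

A caller that sees the timed-out flag knows the hit list is a subset and can decide whether to show it as is or retry with a longer timeout.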
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>1.4.4</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.1.3</version>
</dependency>
Steps for operating an ES cluster from Java
① Configure the cluster settings
② Create the client
③ Inspect the cluster information
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import org.elasticsearch.action.admin.indices.create.CreateIndexRequestBuilder;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexResponse;
import org.elasticsearch.action.bulk.BulkItemResponse;
import org.elasticsearch.action.bulk.BulkRequestBuilder;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.delete.DeleteResponse;
import org.elasticsearch.action.deletebyquery.DeleteByQueryResponse;
import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.action.update.UpdateRequest;
import org.elasticsearch.action.update.UpdateResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.cluster.node.DiscoveryNode;
import org.elasticsearch.common.collect.ImmutableList;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentFactory;
import org.elasticsearch.index.query.FilterBuilders;
import org.elasticsearch.index.query.MatchQueryBuilder.Operator;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.Terms.Bucket;
import org.elasticsearch.search.aggregations.metrics.sum.Sum;
import org.elasticsearch.search.highlight.HighlightField;
import org.elasticsearch.search.sort.SortOrder;
import org.junit.Before;
import org.junit.Test;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
/**
* @description Operating ES from Java
* @version 2016-11-25
*/
public class ElasticSearchTest {
TransportClient transportClient;
// index name
String index = "sl0";
// type name
String type = "student";
@Before
public void before() {
/**
* 1. Specify the cluster configuration through a Settings object
*/
Settings settings = ImmutableSettings.settingsBuilder()
.put("cluster.name", "elasticsearch")// cluster name
.put("client.transport.sniff", true)// enable sniffing
.build();
/**
* 2. Create the client
* It is built from the settings; if none are given, it connects to a cluster named elasticsearch
* The connection uses the TCP transport protocol, i.e. port 9300
*/
transportClient = new TransportClient(settings);
TransportAddress transportAddress = new InetSocketTransportAddress("192.168.1.200", 9300);
transportClient.addTransportAddress(transportAddress);
/**
* 3. Inspect the cluster information
*/
ImmutableList<DiscoveryNode> connectedNodes = transportClient.connectedNodes();
for (DiscoveryNode discoveryNode : connectedNodes) {
System.out.println(discoveryNode.getHostAddress());
}
}
/**
* Fetch a single document with prepareGet
*/
@Test
public void testGet() {
GetResponse getResponse = transportClient.prepareGet(index, type, "1").get();
System.out.println(getResponse.getSourceAsString());
}
/**
* Update a document in the index with prepareUpdate; if the document does not exist, an error is thrown:
* org.elasticsearch.index.engine.DocumentMissingException: [sl01][2] [student][6]: document missing
at org.elasticsearch.action.update.UpdateHelper.prepare(UpdateHelper.java:82)
at org.elasticsearch.action.update.TransportUpdateAction.shardOperation(TransportUpdateAction.java:176)
at org.elasticsearch.action.update.TransportUpdateAction.shardOperation(TransportUpdateAction.java:170)
at org.elasticsearch.action.support.single.instance.TransportInstanceSingleOperationAction$AsyncSingleAction$1.run(TransportInstanceSingleOperationAction.java:187)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
* @throws IOException
*/
@Test
public void testUpdate() throws IOException {
XContentBuilder source = XContentFactory.jsonBuilder()
.startObject()
.field("name","will")
.endObject();
UpdateResponse updateResponse = transportClient.prepareUpdate(index, type, "3")
.setDoc(source).get();
System.out.println(updateResponse.getVersion());
}
/**
* Add a document with prepareIndex; the source is a JSON string
* If the id already exists, the document is updated instead
*/
@Test
public void testIndexJson() {
String source = "{\"name\":\"qill\",\"age\":33}";
IndexResponse indexResponse = transportClient.prepareIndex(index, type, "4")
.setSource(source).get();
System.out.println(indexResponse.getVersion());
}
/**
* Add a document with prepareIndex; the source is a Map
*/
@Test
public void testIndexMap() {
Map<String, Object> source = new HashMap<String, Object>();
source.put("name", "Alice");
source.put("age", 18);
IndexResponse indexResponse = transportClient.prepareIndex(index, type, "5")
.setSource(source).get();
System.out.println(indexResponse.getVersion());
}
/**
* Add a document with prepareIndex; the source is a JavaBean
* The JavaBean is simply converted to a JSON string first
* @throws JsonProcessingException
*/
@Test
public void testIndexBean() throws JsonProcessingException {
Student stu = new Student();
stu.setName("Fresh");
stu.setAge(22);
ObjectMapper mapper = new ObjectMapper();
String source = mapper.writeValueAsString(stu);
IndexResponse indexResponse = transportClient.prepareIndex(index, type, "6")
.setSource(source).get();
System.out.println(indexResponse.getVersion());
}
/**
* Add a document with prepareIndex; the source is an XContentBuilder
* @throws IOException
* @throws ExecutionException
* @throws InterruptedException
*/
@Test
public void testIndexBuilder() throws IOException, InterruptedException, ExecutionException {
XContentBuilder builder = XContentFactory.jsonBuilder()
.startObject()
.field("name","Avivi")
.field("age",30)
.endObject();
IndexResponse indexResponse = transportClient.prepareIndex(index, type, "7")
.setSource(builder)
.execute().get();// .execute().get() has the same effect as .get()
System.out.println(indexResponse.getVersion());
}
/**
* Add a document with prepareIndex; the source is passed directly as key-value pairs, and several pairs can be given
*/
@Test
public void testIndexDirect() {
IndexResponse indexResponse = transportClient.prepareIndex(index, type, "8")
.setSource("name", "GAGA", "age", 35).get();
System.out.println(indexResponse.getVersion());
}
/**
* Delete a document with prepareDelete
*/
@Test
public void testDelete() {
DeleteResponse deleteResponse = transportClient.prepareDelete(index, type, "9").get();
System.out.println(deleteResponse.getVersion());
// delete all documents of the type
DeleteByQueryResponse deleteByQueryResponse = transportClient.prepareDeleteByQuery(index)
.setTypes(type)
.setQuery(QueryBuilders.matchAllQuery())
.get();
System.out.println(deleteByQueryResponse.contextSize());//0
System.out.println(deleteByQueryResponse.isContextEmpty());//true
System.out.println(deleteByQueryResponse.status().getStatus());//200
}
/**
* Delete an index; this is irreversible, use with caution
* Afterwards, accessing the index from a browser returns
{
"error": "IndexMissingException[[sl01] missing]",
"status": 404
}
*/
@Test
public void testDeleteIndex() {
DeleteIndexResponse deleteIndexResponse = transportClient.admin().indices()
.prepareDelete("sl01").get();
// isAcknowledged() reports whether the cluster accepted the delete request
System.out.println(deleteIndexResponse.isAcknowledged());
}
/**
* Count the documents in an index with prepareCount
*/
@Test
public void testCount() {
long count = transportClient.prepareCount(index).get().getCount();
System.out.println(count);
}
/**
* Run a batch of operations with prepareBulk
* @throws IOException
*/
@Test
public void testBulk() throws IOException {
// 1. create the bulk request
BulkRequestBuilder bulk = transportClient.prepareBulk();
// 2. index (add) operation
IndexRequest add = new IndexRequest(index, type, "10");
add.source(XContentFactory.jsonBuilder()
.startObject()
.field("name", "Henrry")
.field("age", 28)
.endObject());
// 3. delete operation
DeleteRequest delete = new DeleteRequest(index, type, "4");
// 4. update operation
XContentBuilder source = XContentFactory.jsonBuilder()
.startObject()
.field("name", "jack")
.field("age",17)
.endObject();
UpdateRequest update = new UpdateRequest(index, type, "8");
update.doc(source);
bulk.add(delete);
bulk.add(add);
bulk.add(update);
// 5. execute the bulk request
BulkResponse bulkResponse = bulk.get();
if (bulkResponse.hasFailures()) {
BulkItemResponse[] items = bulkResponse.getItems();
for (BulkItemResponse item : items) {
// only failed items carry a failure message
if (item.isFailed()) {
System.out.println(item.getFailureMessage());
}
}
} else {
System.out.println("All operations succeeded!");
}
}
/**
* Search an index with prepareSearch
* setQuery(QueryBuilders.matchQuery("name","jack"))
* setSearchType(SearchType.QUERY_THEN_FETCH)
*/
@Test
public void testSearch() {
SearchResponse searchResponse = transportClient.prepareSearch(index)
.setTypes(type)
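// match all documents
//.setQuery(QueryBuilders.matchAllQuery())
// match query on name: the search text is analyzed; the default operator is OR, AND is set here
.setQuery(QueryBuilders.matchQuery("name", "Avivi").operator(Operator.AND))
// query-string syntax (the keywords AND and TO must be upper case): name supports wildcards, age is between 0 and 19 inclusive
//.setQuery(QueryBuilders.queryString("name:A* AND age:[0 TO 19]"))
// term query: the search text is not analyzed
//.setQuery(QueryBuilders.termQuery("name", "Avivi"))
.setSearchType(SearchType.QUERY_THEN_FETCH)// search type
.setFrom(0).setSize(10)// paging
.addSort("age", SortOrder.DESC)// sorting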
.get();
SearchHits hits = searchResponse.getHits();
long total = hits.getTotalHits();
System.out.println(total);
SearchHit[] searchHits = hits.hits();
for (SearchHit searchHit : searchHits) {
System.out.println(searchHit.getSourceAsString());
}
}
/**
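* Search across multiple indices and multiple types
* Timeout: bounds the search time; when it is reached, the hits collected so far are returned, so the result may be incomplete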
{"name":"Avivi0","age":1}
{"name":"Avivi5","age":6}
{"name":"Avivi12","age":13}
{"name":"Avivi17","age":18}
{"name":"Avivi19","age":20}
{"name":"Avivi24","age":25}
{"name":"Avivi31","age":32}
{"name":"Avivi36","age":37}
{"name":"Avivi43","age":44}
{"name":"Avivi48","age":49}
*
*/
@Test
public void testSearchsAndTimeout() {
SearchResponse searchResponse = transportClient.prepareSearch(index,"sl02")
.setTypes(type,"teacher")
.setQuery(QueryBuilders.matchAllQuery())
.setSearchType(SearchType.QUERY_THEN_FETCH)
.setTimeout("3")// no unit given; in ES 1.x a bare number is read as milliseconds, use e.g. "3s" for seconds
.get();
SearchHits hits = searchResponse.getHits();
long totalHits = hits.getTotalHits();
System.out.println(totalHits);
SearchHit[] hits2 = hits.getHits();
for (SearchHit searchHit : hits2) {
System.out.println(searchHit.getSourceAsString());
}
}
/**
* Filtering
* lt: less than
* gt: greater than
* lte: less than or equal to
* gte: greater than or equal to
*
*/
@Test
public void testFilter() {
SearchResponse searchResponse = transportClient.prepareSearch(index).setTypes(type)
.setQuery(QueryBuilders.matchAllQuery())// match all documents
.setSearchType(SearchType.QUERY_THEN_FETCH)
//.setPostFilter(FilterBuilders.rangeFilter("age")
// .from(0).to(19).includeLower(true).includeUpper(true))
.setPostFilter(FilterBuilders.rangeFilter("age").gte(18).lte(22))
.setExplain(true)// explain=true attaches a score explanation to each hit; it does not change the sort order
.get();
SearchHits hits = searchResponse.getHits();
long totalHits = hits.getTotalHits();
System.out.println(totalHits);
SearchHit[] hits2 = hits.getHits();
for (SearchHit searchHit : hits2) {
System.out.println(searchHit.getSourceAsString());
}
}
/**
* Highlighting
*/
@Test
public void testHighLight() {
SearchResponse searchResponse = transportClient.prepareSearch(index).setTypes(type)
.setQuery(QueryBuilders.matchQuery("name", "Fresh")) // documents whose name matches
.setSearchType(SearchType.QUERY_THEN_FETCH)
.addHighlightedField("name")
.setHighlighterPreTags("")
.setHighlighterPostTags("")
.get();
SearchHits hits = searchResponse.getHits();
long totalHits = hits.getTotalHits();
System.out.println(totalHits);
SearchHit[] hits2 = hits.getHits();
for (SearchHit searchHit : hits2) {
Map<String, HighlightField> highlightFields = searchHit.getHighlightFields();
HighlightField highlightField = highlightFields.get("name");
if (null != highlightField) {
Text[] fragments = highlightField.fragments();
System.out.println(fragments[0]);
}
System.out.println(searchHit.getSourceAsString());
}
}
/**
* Grouping (terms aggregation)
*/
@Test
public void testGroupBy() {
SearchResponse searchResponse = transportClient.prepareSearch(index).setTypes(type)
.setQuery(QueryBuilders.matchAllQuery())
.setSearchType(SearchType.QUERY_THEN_FETCH)
// group by age; 10 buckets are returned by default, and in ES 1.x size(0) requests all buckets
.addAggregation(AggregationBuilders.terms("group_age").field("age").size(0))
.get();
Terms terms = searchResponse.getAggregations().get("group_age");
List<Bucket> buckets = terms.getBuckets();
for (Bucket bucket : buckets) {
System.out.println(bucket.getKey() + " " + bucket.getDocCount());
}
}
/**
* Aggregate functions; only sum is shown here, the other aggregate functions work the same way
*/
@Test
public void testAggregationFunction() {
SearchResponse searchResponse = transportClient.prepareSearch(index).setTypes(type)
.setQuery(QueryBuilders.matchAllQuery())
.setSearchType(SearchType.QUERY_THEN_FETCH)
.addAggregation(AggregationBuilders.terms("group_name").field("name")
.subAggregation(AggregationBuilders.sum("sum_age").field("age")))
.get();
Terms terms = searchResponse.getAggregations().get("group_name");
List<Bucket> buckets = terms.getBuckets();
for (Bucket bucket : buckets) {
Sum sum = bucket.getAggregations().get("sum_age");
System.out.println(bucket.getKey()+" "+bucket.getDocCount()+" "+sum.getValue());
}
}
/**
* Generate test data for index sl02
* @throws IOException
*/
@Test
public void generateOtherIndexData() throws IOException {
for (int i = 25; i < 60; i++) {
XContentBuilder builder = XContentFactory.jsonBuilder()
.startObject()
.field("name","Teacher"+i)
.field("age",i+1)
.endObject();
transportClient.prepareIndex("sl02", "teacher", String.valueOf(i-24))
.setSource(builder)
.get();
}
}
/**
* Generate test data
* @throws IOException
*/
@Test
public void generateData() throws IOException {
for (int i = 0; i < 60; i++) {
XContentBuilder builder = XContentFactory.jsonBuilder()
.startObject()
.field("name","Avivi"+i)
.field("age",i+1)
.endObject();
transportClient.prepareIndex(index, type, String.valueOf(i+11))
.setSource(builder)
.get();
}
}
/**
* Working with settings and mappings from Java
* @throws IOException
*/
@Test
public void testSettingsMappings() throws IOException {
//1.settings
HashMap<String, Object> settings_map = new HashMap<String, Object>();
settings_map.put("number_of_shards", 3);
settings_map.put("number_of_replicas", 1);
//2.mappings
XContentBuilder builder = XContentFactory.jsonBuilder()
.startObject()
.field("dynamic","student")
.startObject("properties")
.startObject("id")
.field("type","integer")
.field("store","yes")
.endObject()
.startObject("name")
.field("type","string")
.field("store","yes")
.field("index","analyzed")
.field("analyzer","id")// analyzer name kept as in the original; an analyzer with this name must be defined on the cluster
.endObject()
.endObject()
.endObject();
CreateIndexRequestBuilder prepareCreate = transportClient.admin().indices().prepareCreate("sl04");
prepareCreate.setSettings(settings_map).addMapping("student", builder).execute().actionGet();
}
/**
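* Searching specific shards
*
* Shard preference
ES spreads data evenly across shards. A search can be directed to particular shards or nodes, which can make queries even faster.
1: randomize across shards
Shards are picked at random to serve the query; this is the ES default.
2: _local
Prefer shards on the local node, then fall back to shards on other nodes. Local shards avoid network I/O, but the load may become uneven.
The result set is complete.
3: _primary
Query primary shards only and never replicas; results are normally complete.
4: _primary_first
Prefer primary shards; if a primary is down, use a replica; results are normally complete.
5: _only_node
Query only the shards on the node with the given id; results may be incomplete.
6: _prefer_node
Prefer the node with the given id; results are normally complete.
7: _shards
Query only the listed shards; results may be incomplete.
8: _only_nodes
Query a custom set of several nodes; this ES version does not provide it, so it requires modifying the source.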
*
*/
@Test
public void testPreference()
{
SearchResponse searchResponse = transportClient.prepareSearch(index)
.setTypes(type)
//.setPreference("_local")
//.setPreference("_primary")
//.setPreference("_primary_first")
//.setPreference("_only_node:ZYYWXGZCSkSL7QD0bDVxYA")
//.setPreference("_prefer_node:ZYYWXGZCSkSL7QD0bDVxYA")
.setPreference("_shards:0,1,2")
.setQuery(QueryBuilders.matchAllQuery()).setExplain(true).get();
SearchHits hits = searchResponse.getHits();
System.out.println(hits.getTotalHits());
SearchHit[] hits2 = hits.getHits();
for(SearchHit h : hits2) {
System.out.println(h.getSourceAsString());
}
}
/**
* Tuning
* Merge index segments
*/
@Test
public void testOptimize() {
transportClient.admin().indices().prepareOptimize("sl01", "sl02")
.setMaxNumSegments(1).get();
}
/**
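* Expunge .del files
*
Deleting documents
When a document is deleted in ES it is not removed from disk immediately. It is only marked as deleted, and Lucene records this in a .del file.
Deleted documents still take part in searches and are only filtered out at the end, which hurts performance, so these files can be expunged periodically.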
*/
@Test
public void testOptimizeDel() {
transportClient.admin().indices().prepareOptimize("sl01", "sl02")
.setOnlyExpungeDeletes(true).get();
}
/**
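* Routing parameter
* Fast queries in ES
ES stores data in different shards and derives the target shard from the document id with an internal algorithm,
so a search restricted to the right shard is very fast, provided you know which shard holds the data.
A routing parameter can also be used to store related documents in the same shard: setRouting("")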
*
*/
@Test
public void testRoutingInsert() {
String source = "{\"name\":\"中山大学l\",\"num\":1800}";
IndexResponse indexResponse = transportClient.prepareIndex(index, "stu")
.setRouting("student")
.setSource(source).get();
System.out.println(indexResponse.getVersion());
}
/**
* Searching with routing parameters
*/
@Test
public void testRoutingSearch() {
SearchResponse searchResponse = transportClient.prepareSearch(index)
.setTypes("stu")
.setQuery(QueryBuilders.matchAllQuery())
//.setPreference("_shards:0,1,2")
.setRouting("student", "teacher")
.get();
SearchHits hits = searchResponse.getHits();
SearchHit[] hits2 = hits.getHits();
for(SearchHit h : hits2) {
System.out.println(h.getSourceAsString());
}
}
public class Student {
private String name;
private int age;
private String info;
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public int getAge() {
return age;
}
public void setAge(int age) {
this.age = age;
}
public String getInfo() {
return info;
}
public void setInfo(String info) {
this.info = info;
}
}
}
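The routing notes in the test class say that the target shard is derived from the document id, or from an explicit routing value, by an internal algorithm. A minimal sketch of that idea follows. It assumes the usual "hash modulo number of primary shards" scheme and uses `String.hashCode()` as a stand-in hash, so the shard numbers it prints are not the ones a real cluster would choose; `RoutingSketch` and its methods are hypothetical names.

```java
import java.util.Set;
import java.util.TreeSet;

/** Illustrates why equal routing values always land on the same shard. */
final class RoutingSketch {

    /** Stand-in for the internal algorithm: hash the routing value, take it modulo the shard count. */
    static int shardFor(String routing, int primaryShards) {
        if (primaryShards <= 0) {
            throw new IllegalArgumentException("primaryShards must be positive");
        }
        // floorMod keeps the result in [0, primaryShards) even for negative hash codes
        return Math.floorMod(routing.hashCode(), primaryShards);
    }

    /** A search with routing values only has to visit the shards those values map to. */
    static Set<Integer> shardsToSearch(int primaryShards, String... routings) {
        Set<Integer> shards = new TreeSet<>();
        for (String routing : routings) {
            shards.add(shardFor(routing, primaryShards));
        }
        return shards;
    }

    public static void main(String[] args) {
        int primaryShards = 5;
        System.out.println("student -> shard " + shardFor("student", primaryShards));
        System.out.println("shards searched: " + shardsToSearch(primaryShards, "student", "teacher"));
    }
}
```

With 5 primary shards, a search routed by two values visits at most 2 shards instead of all 5, which is the speed-up the routing notes describe. The scheme also shows why the number of primary shards cannot change without reindexing: a different modulus would send the same routing value to a different shard.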
Reference:
http://blog.csdn.net/ty4315/article/details/52434296