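Elasticsearch offers several APIs. You can use the official Java API directly (Elasticsearch Java API). For projects built on the Spring framework, spring-data-elasticsearch is another option: because it is Spring-based, it is annotation-driven, so indexing documents needs no XML-style configuration, and it is simple to use. Its storage and query interfaces follow the same Spring Data repository model as JpaRepository, so projects that already use JPA can pick it up quickly.
Reference: spring-data-elasticsearch Doc. Developers who know JPA and have used Spring Data Commons should get up to speed with spring-data-elasticsearch quickly. The first step is to add 'org.springframework.data:spring-data-elasticsearch:2.1.3.RELEASE' and 'org.springframework.boot:spring-boot-starter-data-elasticsearch:your_springboot_version' to the Gradle project. For CRUD operations on indexed data, the main API is the ElasticsearchRepository interface, which inherits from CrudRepository in Spring Data's base repository package. The main methods of that interface are: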
public interface CrudRepository<T, ID extends Serializable> extends Repository<T, ID> {

    /**
     * Saves a given entity. Use the returned instance for further operations as the save operation might have changed the
     * entity instance completely.
     * @param entity
     * @return the saved entity
     */
    <S extends T> S save(S entity);

    /**
     * Saves all given entities.
     * @param entities
     * @return the saved entities
     * @throws IllegalArgumentException in case the given entity is {@literal null}.
     */
    <S extends T> Iterable<S> save(Iterable<S> entities);

    /**
     * Retrieves an entity by its id.
     * @param id must not be {@literal null}.
     * @return the entity with the given id or {@literal null} if none found
     * @throws IllegalArgumentException if {@code id} is {@literal null}
     */
    T findOne(ID id);

    /**
     * Returns whether an entity with the given id exists.
     * @param id must not be {@literal null}.
     * @return true if an entity with the given id exists, {@literal false} otherwise
     * @throws IllegalArgumentException if {@code id} is {@literal null}
     */
    boolean exists(ID id);

    /**
     * Returns all instances of the type.
     * @return all entities
     */
    Iterable<T> findAll();

    /**
     * Returns all instances of the type with the given IDs.
     * @param ids
     * @return
     */
    Iterable<T> findAll(Iterable<ID> ids);

    /**
     * Returns the number of entities available.
     * @return the number of entities
     */
    long count();

    /**
     * Deletes the entity with the given id.
     * @param id must not be {@literal null}.
     * @throws IllegalArgumentException in case the given {@code id} is {@literal null}
     */
    void delete(ID id);

    /**
     * Deletes a given entity.
     * @param entity
     * @throws IllegalArgumentException in case the given entity is {@literal null}.
     */
    void delete(T entity);

    /**
     * Deletes the given entities.
     * @param entities
     * @throws IllegalArgumentException in case the given {@link Iterable} is {@literal null}.
     */
    void delete(Iterable<? extends T> entities);

    /**
     * Deletes all entities managed by the repository.
     */
    void deleteAll();
}
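Its operations on Elasticsearch documents (@Document) mirror the JPA interfaces for database tables (@Entity). Queries can be written as findByXX derived methods or as custom @Query() methods. For unusual query scenarios, the examples in the spring-data-elasticsearch source cover most cases; see the project's repository: spring-data-elasticsearch Git
With Spring Boot, many service beans can be injected at startup. The following uses spring-data-elasticsearch:2.1.3.RELEASE, wired through autowiring, to connect a Spring Boot microservice to the Elasticsearch cluster at startup and to provide the ElasticsearchTemplate that application code needs. The Configuration class looks like this: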
import org.apache.commons.lang3.StringUtils;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.elasticsearch.core.ElasticsearchTemplate;
import org.springframework.data.elasticsearch.repository.config.EnableElasticsearchRepositories;
import java.net.InetAddress;

/**
 * Uses ES 2.4.4, because Spring Boot 1.5.x and the current release support ES 2.x at most.
 * Created by lijingyao on 2017/5/17 16:32.
 */
@Configuration
@EnableElasticsearchRepositories(basePackages = "com.puregold.ms")
public class SearchConfig {

    // Assumes three nodes (one master, two replicas). In production, replace these values in the
    // properties file with the real IPs (internal or external).
    // The property keys below are placeholders; use the keys from your own properties file.
    @Value("${es.host}")
    private String esHost; // master node
    @Value("${es.host2:}")
    private String esHost2; // replica node
    @Value("${es.host3:}")
    private String esHost3; // replica node
    @Value("${es.port}")
    private int esPort;
    @Value("${es.cluster.name}")
    private String esClusterName;

    @Bean
    public TransportClient transportClient() throws Exception {
        Settings settings = Settings.settingsBuilder()
                .put("cluster.name", esClusterName)
                .build();
        TransportClient transportClient = TransportClient.builder()
                .settings(settings)
                .build()
                .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(esHost), esPort));
        // The second and third nodes are optional
        if (StringUtils.isNotEmpty(esHost2)) {
            transportClient.addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(esHost2), esPort));
        }
        if (StringUtils.isNotEmpty(esHost3)) {
            transportClient.addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(esHost3), esPort));
        }
        return transportClient;
    }

    @Bean
    public ElasticsearchTemplate elasticsearchTemplate() throws Exception {
        return new ElasticsearchTemplate(transportClient());
    }
}
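Creating the index and documents works like JPA's @Entity and @Table: add the @Document annotation to the entity class to be searched, and the Elasticsearch index and document mapping are created or updated when the Spring Boot application starts.
Here is an example with two documents, OrderDocument and DetailOrderDocument, linked by a parent-child relationship; see the official description of parent-child: indexing-parent-child. Elasticsearch supports several ways of relating document models. When setting up a parent-child relationship, note that the child is routed by the parent's id, and that both the parent's id and the child's parent id field must be of type String. Otherwise startup fails with: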
nested exception is java.lang.IllegalArgumentException: Parent ID property should be String
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldIndex;
import org.springframework.data.elasticsearch.annotations.FieldType;

@Document(indexName = OrderDocument.INDEX, type = OrderDocument.ORDER_TYPE, refreshInterval = "-1")
public class OrderDocument {
    public static final String INDEX = "orders-test";
    public static final String ORDER_TYPE = "order-document";
    public static final String DETAIL_TYPE = "order-detail-document";

    @Id
    private String id;
    // Order note: not analyzed, but searchable
    @Field(type = FieldType.String, index = FieldIndex.not_analyzed)
    private String note;
    // Order name: tokenized with the ik analyzer
    @Field(type = FieldType.String, searchAnalyzer = "ik", analyzer = "ik")
    private String name;
    // Order price
    @Field(type = FieldType.Long)
    private Long price;

    public String getId() {
        return id;
    }
    public void setId(String id) {
        this.id = id;
    }
    public String getNote() {
        return note;
    }
    public void setNote(String note) {
        this.note = note;
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public Long getPrice() {
        return price;
    }
    public void setPrice(Long price) {
        this.price = price;
    }
}
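import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;
import org.springframework.data.elasticsearch.annotations.Parent;

@Document(indexName = OrderDocument.INDEX, type = OrderDocument.DETAIL_TYPE, shards = 10, replicas = 2, refreshInterval = "-1")
public class DetailOrderDocument {
    @Id
    private String id;
    // Declares the parent-child link to the main order
    @Field(type = FieldType.String, store = true)
    @Parent(type = OrderDocument.ORDER_TYPE)
    private String parentId;
    // Detail order price
    @Field(type = FieldType.Long)
    private Long price;

    public String getId() {
        return id;
    }
    public void setId(String id) {
        this.id = id;
    }
    public String getParentId() {
        return parentId;
    }
    public void setParentId(String parentId) {
        this.parentId = parentId;
    }
    public Long getPrice() {
        return price;
    }
    public void setPrice(Long price) {
        this.price = price;
    }
}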
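The code above creates two document types in the "orders-test" index. The @Id annotation maps to the Elasticsearch id. The id can be generated automatically or assigned explicitly when the document is created, but it must be unique.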
Once the application has started, the index structure can be inspected with curl -XGET. The result is:
{
  "orders-test" : {
    "aliases" : { },
    "mappings" : {
      "order-detail-document" : {
        "_parent" : {
          "type" : "order-document" },
        "_routing" : {
          "required" : true },
        "properties" : {
          "parentId" : { "type" : "string", "store" : true },
          "price" : { "type" : "long" } }
      },
      "order-document" : {
        "properties" : {
          "name" : { "type" : "string", "analyzer" : "ik" },
          "note" : { "type" : "string", "index" : "not_analyzed" },
          "price" : { "type" : "long" } }
      }
    },
    "settings" : {
      "index" : {
        "refresh_interval" : "-1",
        "number_of_shards" : "10",
        "creation_date" : "1511403448676",
        "store" : {
          "type" : "fs" },
        "number_of_replicas" : "2",
        "uuid" : "sHA5s7kEQA2AWCAA8-aBlQ",
        "version" : {
          "created" : "2040499" }
      }
    },
    "warmers" : { }
  }
}
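Note also that the code above sets the shards and replicas attributes of @Document, which correspond to number_of_shards and number_of_replicas. The settings of the created index show the result: "number_of_shards" : "10", "number_of_replicas" : "2". When the attributes are not specified, the defaults are number_of_shards=5 and number_of_replicas=1. Other defaults can be found in the source of public @interface Document.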
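As a rough worked example of what these two settings imply, the sketch below counts the shard copies an index ends up with: each primary shard gets number_of_replicas additional copies. The class and method names are hypothetical and exist only for this illustration.

```java
// Hypothetical helper, for illustration only.
class ShardMath {

    // Total shard copies in the index: every primary plus its replicas.
    static int totalShardCopies(int primaryShards, int replicasPerPrimary) {
        if (primaryShards < 1 || replicasPerPrimary < 0) {
            throw new IllegalArgumentException("need at least 1 primary shard and 0 or more replicas");
        }
        return primaryShards * (1 + replicasPerPrimary);
    }

    public static void main(String[] args) {
        // Settings of the example index: shards = 10, replicas = 2
        System.out.println(totalShardCopies(10, 2)); // 30
        // Defaults of @Document: shards = 5, replicas = 1
        System.out.println(totalShardCopies(5, 1));  // 10
    }
}
```

With 10 primaries and 2 replicas, the cluster has to host 30 shard copies for this one index, which is worth keeping in mind on a three-node setup.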
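Calling findOne on the child document fails with the error below, because it fetches by id and a child type requires the parent id as routing. Use findById instead; it runs as a search, and an exact lookup through a term query works.
{
  "error" : {
    "root_cause" : [ {
      "type" : "routing_missing_exception",
      "reason" : "routing is required for [XX]/[YY]/[yourid]",
      "index" : "forests"
    } ],
    "type" : "routing_missing_exception",
    "reason" : "routing is required for [XX]/[YY]/[yourid]",
    "index" : "forests"
  },
  "status" : 400
}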
public interface DetailOrderDocumentRepository extends ElasticsearchRepository<DetailOrderDocument, String> {

    List<DetailOrderDocument> findByParentId(String parentId, Sort sort);

    DetailOrderDocument findById(String id);
}
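The routing requirement behind this error can be illustrated with a small sketch. Elasticsearch picks the shard for a document from its routing value, and a child document is routed by its parent's id so that both land on the same shard; a get that supplies only the child id cannot tell which shard that is. The code below is a simplified model under stated assumptions: all names are hypothetical, and String.hashCode() stands in for the hash Elasticsearch actually uses.

```java
import java.util.Objects;

// Hypothetical, simplified model of shard selection; not Elasticsearch code.
class RoutingSketch {

    // shard = hash(routing) mod number_of_primary_shards.
    // String.hashCode() is only a stand-in for the real hash function.
    static int shardFor(String routing, int primaryShards) {
        Objects.requireNonNull(routing, "routing is required");
        return Math.floorMod(routing.hashCode(), primaryShards);
    }

    // A parent document is routed by its own id.
    static int shardForParent(String parentId, int primaryShards) {
        return shardFor(parentId, primaryShards);
    }

    // A child document is routed by its parent's id, not by its own id.
    static int shardForChild(String childId, String parentId, int primaryShards) {
        return shardFor(parentId, primaryShards);
    }

    public static void main(String[] args) {
        int shards = 10;
        int parentShard = shardForParent("order-1", shards);
        int childShard = shardForChild("detail-7", "order-1", shards);
        System.out.println("parent and child share a shard: " + (parentShard == childShard));
    }
}
```

A search, by contrast, is sent to every shard unless a routing value narrows it, which is why the findById query method succeeds where findOne does not.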
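import static org.elasticsearch.index.query.QueryBuilders.*;

public class OrderManager {

    @Autowired
    private ElasticsearchTemplate elasticsearchTemplate;

    public Page<OrderDocument> queryPagedOrders(Integer pageNo, Integer pageSize, String name, Long minPrice, Long maxPrice) {
        // Default: price ascending (to support richer sorting, consider putting all possible sort rules in a single enum)
        Pageable pageable = new PageRequest(pageNo, pageSize, new Sort(new Sort.Order(Sort.Direction.ASC, "price")));
        // The source line is cut off after "withTypes(OrderDocument"; ORDER_TYPE and withPageable are assumed here
        NativeSearchQueryBuilder nbq = new NativeSearchQueryBuilder().withIndices(OrderDocument.INDEX).withTypes(OrderDocument.ORDER_TYPE).withPageable(pageable);
        BoolQueryBuilder bqb = boolQuery();
        // Match the order name
        if (StringUtils.isNotEmpty(name)) {
            bqb.must(termQuery("name", name));
        }
        // Price range: minPrice <= price <= maxPrice (range clauses assumed; the source shows only the conditions)
        if (minPrice != null && minPrice >= 0) {
            bqb.must(rangeQuery("price").gte(minPrice));
        }
        if (maxPrice != null && maxPrice >= 0) {
            bqb.must(rangeQuery("price").lte(maxPrice));
        }
        Page<OrderDocument> page = elasticsearchTemplate.queryForPage(nbq.withQuery(bqb).build(), OrderDocument.class);
        return page;
    }
}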
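The comment in queryPagedOrders suggests collecting all sort rules in one enum. A minimal sketch of that idea follows, in plain Java so that it runs on its own. OrderSort, its constants, and fromParam are hypothetical names, not part of the original code; in the service, each constant would be translated into a new Sort(new Sort.Order(direction, field)).

```java
import java.util.Locale;

// Hypothetical enum: one place for every sort rule the API accepts.
enum OrderSort {
    PRICE_ASC("price", true),
    PRICE_DESC("price", false),
    NOTE_ASC("note", true);

    private final String field;
    private final boolean ascending;

    OrderSort(String field, boolean ascending) {
        this.field = field;
        this.ascending = ascending;
    }

    String field() {
        return field;
    }

    boolean ascending() {
        return ascending;
    }

    // Parses a request parameter such as "price_desc". Unknown or missing values
    // fall back to PRICE_ASC, the default used in queryPagedOrders.
    static OrderSort fromParam(String param) {
        if (param == null) {
            return PRICE_ASC;
        }
        try {
            return valueOf(param.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return PRICE_ASC;
        }
    }

    public static void main(String[] args) {
        OrderSort sort = fromParam("price_desc");
        System.out.println(sort.field() + " ascending=" + sort.ascending());
    }
}
```

Keeping the field names in the enum means a caller can never sort on a field that is not mapped, and the fallback keeps a bad parameter from failing the request.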
curl -XDELETE 'http://yourip:9200/orders-test/?pretty'
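With a parent-child mapping, however, once the index has been deleted and needs to be rebuilt, Spring Boot startup fails with this exception:
can't add a _parent field that points to an already existing type, that isn't already a parent
The fix is to set createIndex = false (the default is true) in the @Document attributes; setting it on the parent document alone is enough. The index can then be deleted freely and is rebuilt at startup.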
@Field(type = FieldType.String, searchAnalyzer = "ik", analyzer = "ik")
private String name;
Failed to execute phase [query], all shards failed; shardFailures {[X-XXXXX][YYYY][0]: RemoteTransportException[[your-node][yourip:9300][indices:data/read/search[phase/query]]]; nested:
QueryPhaseExecutionException[Result window is too large, from + size must be less than or equal to: [10000] but was [99020].
See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level parameter.]; }{[X-XXXXX][YYYY][1]: RemoteTransportException[[
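The number in this error is from + size. With zero-based pages, as PageRequest uses, from is pageNo * pageSize, so deep pages cross the default limit of 10000 quickly. The sketch below reproduces that arithmetic; the class and method names are hypothetical, and the page size of 20 is only one combination that yields 99020.

```java
// Hypothetical helper that mirrors the from + size check, for illustration only.
class ResultWindow {

    // Default value of index.max_result_window, as quoted in the error message.
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    // from + size for a zero-based page number.
    static long requestedWindow(int pageNo, int pageSize) {
        return (long) pageNo * pageSize + pageSize;
    }

    static boolean fitsWindow(int pageNo, int pageSize, int maxResultWindow) {
        return requestedWindow(pageNo, pageSize) <= maxResultWindow;
    }

    public static void main(String[] args) {
        // Page 4950 with 20 items per page: from = 99000, size = 20
        System.out.println(requestedWindow(4950, 20));                          // 99020
        System.out.println(fitsWindow(4950, 20, DEFAULT_MAX_RESULT_WINDOW));    // false
        // The last page that still fits with 20 items per page
        System.out.println(fitsWindow(499, 20, DEFAULT_MAX_RESULT_WINDOW));     // true
    }
}
```

A check like this in the service layer lets the API reject or redirect a too-deep page request before it reaches the cluster.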
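The limit can be changed with the following command, where index_name is the index to modify. It is an index-level setting, but changing it is not recommended because it increases the memory load on the ES nodes.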
curl -XPUT "http://your_cluster:9200/index_name/_settings" -d '{ "index" : { "max_result_window" : 500000 } }'
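Raising the limit does get around the large result set, but the endpoint's performance suffers: the average response time rises by roughly 200-300 ms. The scroll API is the recommended alternative:
Elasticsearch can process large result sets with scan and scroll. In Spring Data Elasticsearch, ElasticsearchTemplate exposes scan and scroll for this purpose, as shown below. For background, see: About scroll.
The search API returns a single "page" of results, whereas the scroll API can retrieve a large number of results (even all of them), much like a cursor in a traditional database.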
String scrollId = elasticsearchTemplate.scan(nbq.withQuery(bqb).build(), 1000, false);
Page<OrderDocument> page = elasticsearchTemplate.scroll(scrollId, 2000L, OrderDocument.class);
curl -XDELETE 'http://your_cluster:9200/orders-test/order-detail-document/_query?routing=parent_order_id&pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "term" :
          { "id" : "detail_order_id" }
        }
      ]
    }
  }
}'
http://your_cluster:9200/orders-test/order-detail-document/_search?routing=parent_order_id&q=id:detail_order_id&pretty
