Spring Boot 2.2.2 + Elasticsearch 7.6.2: Autocomplete Suggestions for Chinese, Pinyin, and Pinyin Initials

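1. Introduction to Elasticsearch

1.1 What is Elasticsearch

Elasticsearch is a search server built on Lucene. It provides a distributed, multi-tenant full-text search engine exposed through a RESTful web interface. Elasticsearch is written in Java and released as open source under the Apache license, and it is a popular enterprise search engine. It is widely used in cloud environments and offers near-real-time search that is stable, reliable, fast, and easy to install and use.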

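1.2 Common Elasticsearch terms

Document: A document is the smallest unit of data that can be indexed and searched. It can be thought of as one row in a database table, with two differences: a document is hierarchical (it can contain other documents, and a field can contain other fields and values, as in JSON), and it is schema-free (not all documents need to have the same fields).

Type: A type is a logical container for documents, comparable to a table in a database. The definition of the fields in a type is called a mapping; for example, the name field can be mapped as a string. Note that mapping types are deprecated in Elasticsearch 7.x, where an index effectively holds a single type, _doc.

Index: An index is a container for mapping types, comparable to a database in a relational system. It is an independent collection of a large number of documents.

Node: A node is one Elasticsearch instance. Starting Elasticsearch on a server gives you one node; starting it on another server gives you another. You can also run several nodes on the same server by starting several Elasticsearch processes. For an application using Elasticsearch, it is transparent whether the cluster has one node or many: by default it can connect to any node in the cluster and access the complete data set, as if the cluster had only a single node.

Shard: A shard is the smallest unit Elasticsearch works with. Each shard is a Lucene index: a directory of files containing an inverted index. The inverted index lets Elasticsearch tell you which documents contain a given term (word) without scanning every document. It records which documents each term appears in, at which positions, and how many times.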
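To make the inverted index idea concrete, here is a minimal sketch in plain Java. It is not how Lucene stores its data; the class name ToyInvertedIndex and the whitespace tokenizer are assumptions made only for illustration. It maps each term to the documents and positions where it occurs, so a lookup never has to scan document text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy inverted index: term -> (document id -> positions of the term in that document). */
final class ToyInvertedIndex {
    private final Map<String, Map<Integer, List<Integer>>> postings = new TreeMap<>();

    /** Splits the text on whitespace (a stand-in for a real analyzer) and records each term position. */
    void add(int docId, String text) {
        String[] terms = text.toLowerCase().trim().split("\\s+");
        for (int position = 0; position < terms.length; position++) {
            postings.computeIfAbsent(terms[position], t -> new LinkedHashMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(position);
        }
    }

    /** Returns docId -> positions for the term; an empty map if the term was never indexed. */
    Map<Integer, List<Integer>> lookup(String term) {
        Map<Integer, List<Integer>> hit = postings.get(term.toLowerCase());
        return hit == null ? new LinkedHashMap<>() : hit;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add(1, "elasticsearch is built on lucene");
        index.add(2, "lucene stores an inverted index");
        // prints {1=[4], 2=[0]}: the term occurs at position 4 in doc 1 and position 0 in doc 2
        System.out.println(index.lookup("lucene"));
    }
}
```

The number of occurrences per document is simply the length of each position list.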
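Shards are divided into primary shards and replica shards. Replicas improve search performance and fault tolerance: Elasticsearch load-balances search requests across the primary and replica copies of an index, and when a primary shard is lost one of its replicas is promoted to be the new primary. Replicas can be added and removed at runtime, but primaries cannot: the number of primary shards must be decided when the index is created. Too few shards limit scalability; too many hurt performance.

Horizontal and vertical scaling: Horizontal scaling means adding more nodes to the cluster, usually by starting Elasticsearch instances on additional servers. Vertical scaling means giving existing nodes more hardware resources, such as allocating more processors to a virtual machine, or adding more memory, more CPUs, or faster disks to a physical machine.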

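1.3 Distributed indexing and search (multiple nodes, multiple shards)

Indexing: The node that receives an index request first decides which shard the document belongs to. By default documents are spread evenly across shards: the shard is chosen from a hash of the document's ID string, each shard covers an equal share of the hash range, and so each has an equal chance of receiving a new document. Once the target shard is determined, the receiving node forwards the document to the node that holds it. The document is indexed on the primary shard and then on all of its replica shards; once all available replicas have indexed the document, the index request returns successfully.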
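The routing rule can be sketched as shard = hash(routing value) mod number_of_primary_shards, where the routing value defaults to the document ID. The snippet below is only an illustration: Elasticsearch uses a Murmur3 hash internally, while this sketch substitutes String.hashCode(), and the class and method names are made up for the example.

```java
/** Sketch of hash-based document routing; String.hashCode() stands in for the real hash function. */
final class ShardRouting {

    /** Maps a routing value (by default the document id) to a primary shard number in [0, shards). */
    static int shardFor(String routingValue, int numberOfPrimaryShards) {
        if (numberOfPrimaryShards <= 0) {
            throw new IllegalArgumentException("numberOfPrimaryShards must be positive");
        }
        // floorMod keeps the result non-negative even when the hash is negative
        return Math.floorMod(routingValue.hashCode(), numberOfPrimaryShards);
    }

    public static void main(String[] args) {
        String[] ids = {"doc-1", "doc-2", "doc-3", "doc-4"};
        for (String id : ids) {
            System.out.println(id + " -> shard " + shardFor(id, 6));
        }
    }
}
```

Because the shard number depends on the modulus, changing the number of primary shards would send existing IDs to different shards, which is one way to see why the primary shard count is fixed when the index is created.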
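Searching: The node that receives a search request forwards it to a set of shards that together contain all the data. Elasticsearch picks an available copy of each shard (primary or replica), classically by round-robin (7.x uses adaptive replica selection by default), and forwards the search request to it. It then collects the results from those shards, merges them into a single response, and returns that response to the client application.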
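As a rough illustration of round-robin selection among the copies of one shard, consider the sketch below. The class name and the copy labels are hypothetical, and real Elasticsearch tracks shard state and node health that this toy ignores.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy round-robin selector over the copies (primary and replicas) of a single shard. */
final class RoundRobinCopySelector {
    private final AtomicInteger counter = new AtomicInteger();

    /** Returns the next copy in rotation; thread-safe because the counter is atomic. */
    String next(List<String> copies) {
        if (copies.isEmpty()) {
            throw new IllegalArgumentException("no available copies");
        }
        int index = Math.floorMod(counter.getAndIncrement(), copies.size());
        return copies.get(index);
    }

    public static void main(String[] args) {
        RoundRobinCopySelector selector = new RoundRobinCopySelector();
        List<String> copies = Arrays.asList("node1/primary", "node2/replica");
        for (int request = 0; request < 4; request++) {
            System.out.println("request " + request + " -> " + selector.next(copies));
        }
    }
}
```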

Goal: autocomplete and smart suggestions in the search box.

2. Installing Elasticsearch on Windows

Official Elasticsearch download page: https://www.elastic.co/cn/downloads/elasticsearch

Older versions can also be downloaded from the same site.

Installation steps: https://blog.csdn.net/kevlin_v/article/details/94616871

Installing the IK Chinese analyzer and the pinyin analyzer: https://blog.csdn.net/qq_28988969/article/details/79540620

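3. Setting up an Elasticsearch cluster

The project uses two Windows servers, 10.254.24.54 and 10.254.24.55, so I set up a simple cluster with just two nodes. A single-machine pseudo-cluster can be set up in the same way.

3.1 Edit the elasticsearch.yml configuration file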

Server 10.254.24.55

# Cluster name; must be identical on every node in the cluster.
cluster.name: my-esCluster
# Node name; must be unique within the cluster.
node.name: node1
# Whether this node is master-eligible: true or false.
node.master: true
# Whether this node can act as a data node: true or false.
node.data: true
# Where index data is stored.
path.data: D:\TPI\elastic\data
# Where log files are stored.
#path.logs: /opt/elasticsearch/logs
# Whether to lock the process memory: true or false.
#bootstrap.memory_lock: true
# Bind address used to reach this node.
network.host: 0.0.0.0
# HTTP port exposed by Elasticsearch; default 9200.
http.port: 9200
# TCP transport port; default 9300.
transport.tcp.port: 9300
# Legacy Zen discovery setting (minimum number of master-eligible nodes). Elasticsearch 7.x ignores it.
discovery.zen.minimum_master_nodes: 1
node.max_local_storage_nodes: 2
# Added in 7.x: addresses of the master-eligible nodes used for discovery.
discovery.seed_hosts: ["10.254.24.55:9300", "10.254.24.54:9301"] 
discovery.zen.fd.ping_timeout: 1m
discovery.zen.fd.ping_retries: 5
# Added in 7.x: needed to elect a master when bootstrapping a brand-new cluster.
cluster.initial_master_nodes: ["node1", "node2"]
# Whether to enable CORS; required when using the head plugin.
http.cors.enabled: true
# "*" allows all origins.
http.cors.allow-origin: "*"
action.destructive_requires_name: true
action.auto_create_index: .security,.monitoring*,.watches,.triggered_watches,.watcher-history*
xpack.security.enabled: false
xpack.monitoring.enabled: true
xpack.graph.enabled: false
xpack.watcher.enabled: false
xpack.ml.enabled: false

Server 10.254.24.54

# Cluster name; must be identical on every node in the cluster.
cluster.name: my-esCluster
# Node name; must be unique within the cluster.
node.name: node2
# Whether this node is master-eligible: true or false.
node.master: true
# Whether this node can act as a data node: true or false.
node.data: true
# Where index data is stored.
path.data: D:\elasticsearch\data
# Where log files are stored.
#path.logs: /opt/elasticsearch/logs
# Whether to lock the process memory: true or false.
#bootstrap.memory_lock: true
# Bind address used to reach this node.
network.host: 0.0.0.0
# HTTP port exposed by Elasticsearch; default 9200.
http.port: 9201
# TCP transport port; default 9300.
transport.tcp.port: 9301
# Legacy Zen discovery setting (minimum number of master-eligible nodes). Elasticsearch 7.x ignores it.
discovery.zen.minimum_master_nodes: 1
node.max_local_storage_nodes: 2
# Added in 7.x: addresses of the master-eligible nodes used for discovery. Include the port numbers here, otherwise master election may fail at startup.
discovery.seed_hosts: ["10.254.24.55:9300", "10.254.24.54:9301"] 
discovery.zen.fd.ping_timeout: 1m
discovery.zen.fd.ping_retries: 5
# Added in 7.x: needed to elect a master when bootstrapping a brand-new cluster.
cluster.initial_master_nodes: ["node1", "node2"]
# Whether to enable CORS; required when using the head plugin.
http.cors.enabled: true
# "*" allows all origins.
http.cors.allow-origin: "*"
action.destructive_requires_name: true
action.auto_create_index: .security,.monitoring*,.watches,.triggered_watches,.watcher-history*
xpack.security.enabled: false
xpack.monitoring.enabled: true
xpack.graph.enabled: false
xpack.watcher.enabled: false
xpack.ml.enabled: false

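Single-machine pseudo-cluster configuration: https://blog.csdn.net/csdn565973850/article/details/104772551/

3.2 Edit the jvm.options configuration file

Change the initial and maximum JVM heap size. Both servers in this project have 16 GB of RAM, so I set both the initial and the maximum heap size to 8 GB.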

-Xms8g
-Xmx8g
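
The 8 GB figure is half of each server's 16 GB, which matches the common guideline of giving the heap no more than about half of physical memory. A rough sketch of that rule in Java follows; the class name, the method names, and the 31 GB cap (a conservative figure often used to stay below the compressed object pointer threshold) are illustrative assumptions, not values taken from this project.

```java
/** Rough heap sizing helper: half of physical RAM, capped at a conservative upper bound. */
final class HeapSizing {
    /** Assumed cap that keeps the heap below the compressed object pointer threshold. */
    static final int COMPRESSED_OOPS_SAFE_GB = 31;

    /** Suggests a heap size in GB for a machine with the given amount of physical RAM. */
    static int suggestedHeapGb(int physicalRamGb) {
        if (physicalRamGb <= 0) {
            throw new IllegalArgumentException("physicalRamGb must be positive");
        }
        return Math.max(1, Math.min(physicalRamGb / 2, COMPRESSED_OOPS_SAFE_GB));
    }

    /** Renders the two jvm.options lines; initial and maximum heap are kept equal. */
    static String jvmOptions(int heapGb) {
        return "-Xms" + heapGb + "g\n-Xmx" + heapGb + "g";
    }

    public static void main(String[] args) {
        System.out.println(jvmOptions(suggestedHeapGb(16)));
    }
}
```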

Cluster planning and management: https://www.cnblogs.com/leeSmall/p/9220535.html

3.3 Start and access

Start Elasticsearch as a service, then access each of the two servers:

10.254.24.54:9201
10.254.24.55:9200
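
Besides opening the two addresses in a browser, the cluster health endpoint (/_cluster/health) can be queried to confirm that both nodes joined the same cluster. A minimal sketch using only the JDK; the class and method names are made up for illustration, and the timeouts are arbitrary.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Queries the cluster health endpoint of one node over plain HTTP. */
final class ClusterHealthCheck {

    /** Builds the health URL for a node, e.g. http://10.254.24.55:9200/_cluster/health?pretty */
    static String healthUrl(String host, int httpPort) {
        return "http://" + host + ":" + httpPort + "/_cluster/health?pretty";
    }

    /** Performs a GET request and returns the response body as text. */
    static String get(String url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setConnectTimeout(1000);
        connection.setReadTimeout(5000);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // With both nodes up, the response should report "number_of_nodes" : 2
        System.out.println(get(healthUrl("10.254.24.55", 9200)));
    }
}
```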

4. Integrating the Elasticsearch Java High Level REST Client with Spring Boot

4.1 pom.xml

        <!-- elasticsearch -->
        <!-- elasticsearch-rest-high-level-client -->
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>7.6.2</version>
            <exclusions>
                <exclusion>
                    <groupId>org.elasticsearch</groupId>
                    <artifactId>elasticsearch</artifactId>
                </exclusion>
                <exclusion>
                    <groupId>org.elasticsearch.client</groupId>
                    <artifactId>elasticsearch-rest-client</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch -->
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>7.6.2</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.elasticsearch.client/elasticsearch-rest-client -->
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-client</artifactId>
            <version>7.6.2</version>
        </dependency>

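The Elasticsearch Java High Level REST Client is fairly demanding about the Spring Boot version. My project was originally on Spring Boot 2.0.0; after integrating the client the project would not start and reported no error at all. After switching to 2.2.2 it started successfully. In addition, if the project already declares an httpclient dependency, a version conflict with it can also cause startup errors.

Comment out the httpclient dependency that the project previously declared: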

<!-- httpclient begin -->
        <!--<dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
            <version>4.3.5</version>
        </dependency>-->
        <!-- httpclient end -->

4.2 application.yml

elasticsearch:
  nodes: 10.254.24.55:9200, 10.254.24.54:9201
  schema: http
  max-connect-total: 50
  max-connect-per-route: 10
  connection-request-timeout-millis: 500
  socket-timeout-millis: 30000
  connect-timeout-millis: 1000
  # Name of the index used for completion suggest in my project
  indexName: suggest

4.3 Integrating RestHighLevelClient

The builder:

package cnki.tpi.config;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
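import java.util.List;

/**
 * ClassName: EsClientBuilder
 * Description: Fluent builder that creates a RestHighLevelClient with the configured
 * timeouts and connection limits. It is used through the static build(...) factory,
 * so it does not need to be registered as a Spring bean.
 *
 * @author 小黄
 * date: 2020/4/13 20:07
 */
public class EsClientBuilder {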
    private int connectTimeoutMillis = 1000;
    private int socketTimeoutMillis = 30000;
    private int connectionRequestTimeoutMillis = 500;
    private int maxConnectPerRoute = 10;
    private int maxConnectTotal = 30;

    private final List<HttpHost> httpHosts;


    private EsClientBuilder(List<HttpHost> httpHosts) {
        this.httpHosts = httpHosts;
    }


    public EsClientBuilder setConnectTimeoutMillis(int connectTimeoutMillis) {
        this.connectTimeoutMillis = connectTimeoutMillis;
        return this;
    }

    public EsClientBuilder setSocketTimeoutMillis(int socketTimeoutMillis) {
        this.socketTimeoutMillis = socketTimeoutMillis;
        return this;
    }

    public EsClientBuilder setConnectionRequestTimeoutMillis(int connectionRequestTimeoutMillis) {
        this.connectionRequestTimeoutMillis = connectionRequestTimeoutMillis;
        return this;
    }

    public EsClientBuilder setMaxConnectPerRoute(int maxConnectPerRoute) {
        this.maxConnectPerRoute = maxConnectPerRoute;
        return this;
    }

    public EsClientBuilder setMaxConnectTotal(int maxConnectTotal) {
        this.maxConnectTotal = maxConnectTotal;
        return this;
    }


    public static EsClientBuilder build(List<HttpHost> httpHosts) {
        return new EsClientBuilder(httpHosts);
    }


    public RestHighLevelClient create() {

        HttpHost[] httpHostArr = httpHosts.toArray(new HttpHost[0]);
        RestClientBuilder builder = RestClient.builder(httpHostArr);

        builder.setRequestConfigCallback(requestConfigBuilder -> {
            requestConfigBuilder.setConnectTimeout(connectTimeoutMillis);
            requestConfigBuilder.setSocketTimeout(socketTimeoutMillis);
            requestConfigBuilder.setConnectionRequestTimeout(connectionRequestTimeoutMillis);
            return requestConfigBuilder;
        });

        builder.setHttpClientConfigCallback(httpClientBuilder -> {
            httpClientBuilder.setMaxConnTotal(maxConnectTotal);
            httpClientBuilder.setMaxConnPerRoute(maxConnectPerRoute);
            return httpClientBuilder;
        });

        return new RestHighLevelClient(builder);
    }
}

The configuration class:

package cnki.tpi.config;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.util.Assert;
import org.springframework.util.StringUtils;

import java.util.ArrayList;
import java.util.List;

/**
 * ClassName: ElasticConfig
 * Description:
 *
 * @author 小黄
 * date: 2020/4/13 17:44
 */
@Configuration
public class ElasticConfig {
    @Value("${elasticsearch.nodes}")
    private List<String> nodes;

    @Value("${elasticsearch.schema}")
    private String schema;

    @Value("${elasticsearch.max-connect-total}")
    private Integer maxConnectTotal;

    @Value("${elasticsearch.max-connect-per-route}")
    private Integer maxConnectPerRoute;

    @Value("${elasticsearch.connection-request-timeout-millis}")
    private Integer connectionRequestTimeoutMillis;

    @Value("${elasticsearch.socket-timeout-millis}")
    private Integer socketTimeoutMillis;

    @Value("${elasticsearch.connect-timeout-millis}")
    private Integer connectTimeoutMillis;

    @Bean
    public RestHighLevelClient getRestHighLevelClient() {
        List<HttpHost> httpHosts = new ArrayList<>();
        for (String node : nodes) {
            try {
                String[] parts = StringUtils.split(node, ":");
                Assert.notNull(parts, "Must be defined as 'host:port'");
                Assert.state(parts.length == 2, "Must be defined as 'host:port'");
                httpHosts.add(new HttpHost(parts[0], Integer.parseInt(parts[1]), schema));
            } catch (RuntimeException ex) {
                throw new IllegalStateException(
                        "Invalid ES nodes " + "property '" + node + "'", ex);
            }
        }

        return EsClientBuilder.build(httpHosts)
                .setConnectionRequestTimeoutMillis(connectionRequestTimeoutMillis)
                .setConnectTimeoutMillis(connectTimeoutMillis)
                .setSocketTimeoutMillis(socketTimeoutMillis)
                .setMaxConnectTotal(maxConnectTotal)
                .setMaxConnectPerRoute(maxConnectPerRoute)
                .create();
    }
}
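
The host:port parsing done inside getRestHighLevelClient can be tried out in isolation. The sketch below is a standalone approximation in plain Java, without Spring or the Elasticsearch client; NodeAddressParser and NodeAddress are hypothetical names, and unlike the configuration class it splits the comma-separated property itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Parses a comma-separated "host:port" list such as the elasticsearch.nodes property. */
final class NodeAddressParser {

    /** Simple value holder for one parsed node. */
    static final class NodeAddress {
        final String host;
        final int port;

        NodeAddress(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    static List<NodeAddress> parse(String nodesProperty) {
        List<NodeAddress> result = new ArrayList<>();
        for (String raw : nodesProperty.split(",")) {
            String node = raw.trim(); // tolerate spaces after the commas
            int colon = node.lastIndexOf(':');
            if (colon <= 0 || colon == node.length() - 1) {
                throw new IllegalArgumentException("Must be defined as 'host:port': '" + node + "'");
            }
            // NumberFormatException (an IllegalArgumentException) is thrown for a non-numeric port
            int port = Integer.parseInt(node.substring(colon + 1));
            result.add(new NodeAddress(node.substring(0, colon), port));
        }
        return result;
    }

    public static void main(String[] args) {
        for (NodeAddress address : parse("10.254.24.55:9200, 10.254.24.54:9201")) {
            System.out.println(address.host + " on port " + address.port);
        }
    }
}
```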

5. Implementing completion suggest

5.1 Core operations interface: IBaseElasticService

package cnki.tpi.service;

import cnki.tpi.entity.dto.ElasticSuggestDto;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.search.builder.SearchSourceBuilder;

import java.util.Collection;
import java.util.List;

public interface IBaseElasticService {

    void createIndex(String idxName,String idxSQL);

    void insertOrUpdateOne(String idxName, ElasticSuggestDto entity);

    void insertBatch(String idxName, List<ElasticSuggestDto> list);

    <T> void deleteBatch(String idxName, Collection<T> idList);

     List<String> searchCompletionSuggest(String idxName, SearchSourceBuilder builder);

    void deleteIndex(String idxName);

    void deleteByQuery(String idxName, QueryBuilder builder);

}

5.2 Core operations implementation: BaseElasticServiceImpl

package cnki.tpi.service.impl;

import cnki.tpi.entity.dto.ElasticSuggestDto;
import cnki.tpi.service.IBaseElasticService;
import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.support.IndicesOptions;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.client.indices.CreateIndexResponse;
import org.elasticsearch.client.indices.GetIndexRequest;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.reindex.DeleteByQueryRequest;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.suggest.Suggest;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.util.Collection;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

/**
 * ClassName: BaseElasticServiceImpl
 * Description:
 *
 * @author 小黄
 * date: 2020/4/13 15:41
 */
@Service
@Slf4j
public class BaseElasticServiceImpl implements IBaseElasticService {

    @Autowired
    RestHighLevelClient restHighLevelClient;

    /**
     * Creates the index if it does not exist yet.
     * @param idxName index name
     * @param idxSQL mapping definition (JSON)
     */
    @Override
    public void createIndex(String idxName,String idxSQL){
        try {
            // Skip creation when the index is already there
            if (this.indexExist(idxName)) {
                log.info(" idxName={} already exists, idxSql={}",idxName,idxSQL);
                return;
            }
            CreateIndexRequest request = new CreateIndexRequest(idxName);
            buildSetting(request);
            request.mapping(idxSQL, XContentType.JSON);
            // Settings can also be specified by hand through request.settings(...)
            CreateIndexResponse res = restHighLevelClient.indices().create(request, RequestOptions.DEFAULT);
            if (!res.isAcknowledged()) {
                throw new RuntimeException("Index creation was not acknowledged");
            }
        } catch (Exception e) {
            // Propagate the failure instead of terminating the JVM
            throw new RuntimeException(e);
        }
    }

    /**
     * Checks whether an index exists.
     * @param idxName index name
     * @return true if the index exists
     * @throws Exception if the request fails
     */
    private boolean indexExist(String idxName) throws Exception {
        GetIndexRequest request = new GetIndexRequest(idxName);
        request.local(false);
        request.humanReadable(true);
        request.includeDefaults(false);
        request.indicesOptions(IndicesOptions.lenientExpandOpen());
        return restHighLevelClient.indices().exists(request, RequestOptions.DEFAULT);
    }

    /**
     * Sets the number of shards and replicas.
     * @param request the create-index request to configure
     */
    private void buildSetting(CreateIndexRequest request){
        request.settings(Settings.builder().put("index.number_of_shards",6)
                .put("index.number_of_replicas",1));
    }

    /**
     * Inserts a document, or overwrites it if the id already exists.
     * @param idxName index name
     * @param entity document object
     */
    @Override
    public void insertOrUpdateOne(String idxName, ElasticSuggestDto entity) {
        IndexRequest request = new IndexRequest(idxName);
        //log.info("Data : id={},entity={}",entity.getId(),JSON.toJSONString(entity.getData()));
        request.id(entity.getId());
        request.source(entity.getData(), XContentType.JSON);
        try {
            restHighLevelClient.index(request, RequestOptions.DEFAULT);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /**
     * Inserts documents in bulk.
     * @param idxName index name
     * @param list documents to insert
     */
    @Override
    public void insertBatch(String idxName, List<ElasticSuggestDto> list) {
        BulkRequest request = new BulkRequest();
        list.forEach(item -> request.add(new IndexRequest(idxName).id(item.getId())
                .source(item.getData(), XContentType.JSON)));
        try {
            restHighLevelClient.bulk(request, RequestOptions.DEFAULT);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /**
     * Deletes documents in bulk.
     * @param idxName index name
     * @param idList ids of the documents to delete
     * @param <T> id type
     */
    @Override
    public <T> void deleteBatch(String idxName, Collection<T> idList) {
        BulkRequest request = new BulkRequest();
        idList.forEach(item -> request.add(new DeleteRequest(idxName, item.toString())));
        try {
            restHighLevelClient.bulk(request, RequestOptions.DEFAULT);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

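    /**
     * Runs a completion suggest search.
     * @param idxName index name
     * @param builder search source containing the suggest request
     * @return the text of each suggestion option
     */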
    @Override
    public List<String> searchCompletionSuggest(String idxName, SearchSourceBuilder builder) {
        SearchRequest request = new SearchRequest(idxName);
        request.source(builder);
        try {
            SearchResponse response = restHighLevelClient.search(request, RequestOptions.DEFAULT);
            List<String> list = StreamSupport.stream(Spliterators.spliteratorUnknownSize(response.getSuggest().iterator(), Spliterator.ORDERED), false)
                    .flatMap(suggestion -> suggestion.getEntries().get(0).getOptions().stream())
                    .map((Suggest.Suggestion.Entry.Option option) -> option.getText().toString())
                    .collect(Collectors.toList());
            return list;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /**
     * Deletes an index.
     * @param idxName index name
     */
    @Override
    public void deleteIndex(String idxName) {
        try {
            if (!this.indexExist(idxName)) {
                log.info(" idxName={} does not exist",idxName);
                return;
            }
            restHighLevelClient.indices().delete(new DeleteIndexRequest(idxName), RequestOptions.DEFAULT);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /**
     * Deletes all documents that match a query.
     * @param idxName index name
     * @param builder query selecting the documents to delete
     */
    @Override
    public void deleteByQuery(String idxName, QueryBuilder builder) {
        DeleteByQueryRequest request = new DeleteByQueryRequest(idxName);
        request.setQuery(builder);
        // Batch size for the operation; the maximum is 10000
        request.setBatchSize(10000);
        request.setConflicts("proceed");
        try {
            restHighLevelClient.deleteByQuery(request, RequestOptions.DEFAULT);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

}
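
insertBatch sends the whole list in a single bulk request. For large imports it may be worth splitting the list into fixed-size chunks and calling insertBatch once per chunk. The helper below is a sketch of that chunking step only; the class name Batches and any particular batch size are assumptions, not part of the original project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Splits a list into consecutive batches of at most batchSize elements. */
final class Batches {

    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "c", "d", "e");
        // Each inner list would be passed to insertBatch(indexName, batch)
        System.out.println(partition(lines, 2));
    }
}
```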

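5.3 Creating the index

For convenience I created the index directly with Postman; the createIndex method above can be used instead. Note that createIndex passes its argument to request.mapping(...) and only sets the shard and replica counts, so the analysis settings shown below would have to be added to the request as well.
Index JSON: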

{
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 6,
    "analysis": {
      "analyzer": {
        "default": {
          "tokenizer": "ik_max_word"
        },
        "first_py_letter_analyzer": {
          "tokenizer": "first_py_letter"
        },
        "full_pinyin_letter_analyzer": {
          "tokenizer": "full_pinyin_letter"
        }
      },
      "tokenizer": {
        "first_py_letter": {
          "type": "pinyin",
          "keep_first_letter": true,
          "keep_full_pinyin": false,
          "keep_original": false,
          "limit_first_letter_length": 16,
          "lowercase": true,
          "trim_whitespace": true,
          "keep_none_chinese_in_first_letter": false,
          "none_chinese_pinyin_tokenize": false,
          "keep_none_chinese": true,
          "keep_none_chinese_in_joined_full_pinyin": true
        },
        "full_pinyin_letter": {
          "type": "pinyin",
          "keep_separate_first_letter": false,
          "keep_full_pinyin": false,
          "keep_original": false,
          "limit_first_letter_length": 16,
          "lowercase": true,
          "keep_first_letter": false,
          "keep_none_chinese_in_first_letter": false,
          "none_chinese_pinyin_tokenize": false,
          "keep_none_chinese": true,
          "keep_joined_full_pinyin": true,
          "keep_none_chinese_in_joined_full_pinyin": true
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "suggest": {
        "type": "completion",
        "analyzer": "default",
        "fields": {
          "keyword_pinyin": {
            "type": "completion",
            "analyzer": "full_pinyin_letter_analyzer"
          },
          "keyword_first_py": {
            "type": "completion",
            "analyzer": "first_py_letter_analyzer"
          }
        }

      }
    }
  }
}
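
To see what the two custom analyzers are meant to produce, here is a toy illustration in Java. It does not use the pinyin plugin; the four-character lookup table and the class name ToyPinyin are assumptions for demonstration only. The full-pinyin analyzer (keep_joined_full_pinyin) yields the joined pinyin of the text, and the first-letter analyzer (keep_first_letter) yields its initials.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for the pinyin analyzers: handles only the few characters in TABLE. */
final class ToyPinyin {
    private static final Map<Character, String> TABLE = new HashMap<>();

    static {
        TABLE.put('广', "guang");
        TABLE.put('州', "zhou");
        TABLE.put('深', "shen");
        TABLE.put('圳', "zhen");
    }

    /** Joined full pinyin, as indexed into suggest.keyword_pinyin. */
    static String joinedFullPinyin(String text) {
        StringBuilder joined = new StringBuilder();
        for (char c : text.toCharArray()) {
            String pinyin = TABLE.get(c);
            if (pinyin != null) {
                joined.append(pinyin);
            }
            // Characters outside the toy table are skipped; the real plugin
            // handles them according to its keep_none_chinese* options.
        }
        return joined.toString();
    }

    /** Pinyin initials, as indexed into suggest.keyword_first_py. */
    static String firstLetters(String text) {
        StringBuilder initials = new StringBuilder();
        for (char c : text.toCharArray()) {
            String pinyin = TABLE.get(c);
            if (pinyin != null) {
                initials.append(pinyin.charAt(0));
            }
        }
        return initials.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinedFullPinyin("广州")); // guangzhou
        System.out.println(firstLetters("广州"));     // gz
    }
}
```

With the suggest field indexed three ways (IK tokens, joined pinyin, initials), a prefix typed as 广, guang, or gz can each be matched against the corresponding sub-field.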

5.4 Inserting index data and running completion suggest searches

IGmdiSuggestService

package cnki.tpi.service.gmdi;

import java.util.List;

public interface IGmdiSuggestService {

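    /**
     * Inserts data from a txt file into the ES suggest index.
     * @param filePath path of the text file, one suggestion per line
     * @throws Exception if reading or indexing fails
     */
    void insertSuggest(String filePath) throws Exception;

    /**
     * Returns autocomplete suggestions for the user's input.
     * @param searchValue text typed by the user
     * @return matching suggestions
     */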
    List<String> searchCompletionSuggest(String searchValue);
}

GmdiSuggestServiceImpl

package cnki.tpi.service.gmdi.impl;

import cn.hutool.core.io.FileUtil;
import cnki.tpi.entity.dto.ElasticSuggestDto;
import cnki.tpi.service.IBaseElasticService;
import cnki.tpi.service.gmdi.IGmdiSuggestService;
import cnki.tpi.util.MD5;
import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.suggest.SuggestBuilder;
import org.elasticsearch.search.suggest.SuggestBuilders;
import org.elasticsearch.search.suggest.SuggestionBuilder;
import org.elasticsearch.search.suggest.completion.FuzzyOptions;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;

import java.io.BufferedReader;
import java.io.File;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.regex.Pattern;

/**
 * ClassName: GmdiSuggestServiceImpl
 * Description:
 *
 * @author 小黄
 * date: 2020/4/13 16:05
 */
@Slf4j
@Service
public class GmdiSuggestServiceImpl implements IGmdiSuggestService {

    @Autowired
    private IBaseElasticService elasticService;
    @Value("${elasticsearch.indexName}")
    private String indexName;

    @Override
    public void insertSuggest(String filePath) throws Exception {
        File file = FileUtil.touch(filePath);
        // try-with-resources closes the reader even if indexing fails part-way
        try (BufferedReader reader = FileUtil.getReader(file, "UTF-8")) {
            String suggest;
            while ((suggest = reader.readLine()) != null) {
                // The MD5 of the line is the document id, so re-importing the same line overwrites it
                String id = MD5.md5(suggest);
                HashMap<String, String> map = new HashMap<>();
                map.put("suggest", suggest);
                ElasticSuggestDto suggestDto = new ElasticSuggestDto(id, map);
                elasticService.insertOrUpdateOne(indexName, suggestDto);
            }
        }
    }

    @Override
    public List<String> searchCompletionSuggest(String searchValue) {
        String field = "suggest";
        if (checkLetter(searchValue)) {
            field = "suggest.keyword_pinyin";
        }
        // Build the request body with SearchSourceBuilder; try a plain prefix match first
        List<String> result = elasticService.searchCompletionSuggest(indexName, getSearchSourceBuilder(field, searchValue, false));
        if (result.size() == 0  && checkLetter(searchValue)) {
            // Fuzzy match on full pinyin
            result = elasticService.searchCompletionSuggest(indexName, getSearchSourceBuilder(field, searchValue, true));
            if (result.size() == 0) {
                // Match on pinyin initials
                result = elasticService.searchCompletionSuggest(indexName,
                        getSearchSourceBuilder("suggest.keyword_first_py", searchValue, false));
            }
        }
        return result;
    }

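    /**
     * Builds a completion suggestion for the given field and prefix.
     *
     * @param field completion field to query
     * @param value prefix typed by the user
     * @param isFuzzy whether to allow fuzzy matching
     * @return the suggestion builder
     */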
    private SuggestionBuilder getCompletionSuggestionBuilder(String field, String value, Boolean isFuzzy) {
        if (isFuzzy) {
            return SuggestBuilders.completionSuggestion(field).prefix(value,
                    FuzzyOptions.builder().setFuzziness(2).build()).skipDuplicates(true).size(10);
        } else {
            return SuggestBuilders.completionSuggestion(field).prefix(value).skipDuplicates(true).size(10);
        }
    }

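    /**
     * Wraps the completion suggestion in a SearchSourceBuilder.
     *
     * @param field completion field to query
     * @param value prefix typed by the user
     * @param isFuzzy whether to use a fuzzy query
     * @return the search source containing the suggest request
     */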
    private SearchSourceBuilder getSearchSourceBuilder(String field, String value, Boolean isFuzzy) {
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        SuggestionBuilder completionSuggestionBuilder = getCompletionSuggestionBuilder(field, value, isFuzzy);
        SuggestBuilder suggestBuilder = new SuggestBuilder();
        suggestBuilder.addSuggestion("search-suggest", completionSuggestionBuilder);
        sourceBuilder.suggest(suggestBuilder);
        return sourceBuilder;
    }

    /**
     * Checks whether the input consists only of ASCII letters.
     *
     * @return true if the check passes, false otherwise
     */
    private boolean checkLetter(String value) {
        String regex = "^[A-Za-z]+$";
        return Pattern.matches(regex, value);
    }

    /**
     * Checks whether the input consists only of Chinese characters.
     *
     * @param chinese text to check
     * @return true if the check passes, false otherwise
     */
    public static boolean checkChinese(String chinese) {
        String regex = "^[\u4E00-\u9FA5]+$";
        return Pattern.matches(regex, chinese);
    }

}
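
The fallback order used by searchCompletionSuggest (plain prefix match, then fuzzy pinyin, then pinyin initials) can be exercised without a running cluster by abstracting the lookup behind an interface. The sketch below mirrors that control flow; the Lookup interface and the class name SuggestFallback are hypothetical stand-ins for the real service call.

```java
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

/** Mirrors the field selection and fallback chain of searchCompletionSuggest. */
final class SuggestFallback {

    /** Hypothetical stand-in for a call to the suggest index. */
    interface Lookup {
        List<String> suggest(String field, String prefix, boolean fuzzy);
    }

    private static final Pattern LETTERS_ONLY = Pattern.compile("^[A-Za-z]+$");

    static List<String> suggest(String input, Lookup lookup) {
        boolean lettersOnly = LETTERS_ONLY.matcher(input).matches();
        // Letters-only input is treated as pinyin; anything else goes to the IK-analyzed field
        String field = lettersOnly ? "suggest.keyword_pinyin" : "suggest";
        List<String> result = lookup.suggest(field, input, false);
        if (result.isEmpty() && lettersOnly) {
            // Second attempt: fuzzy match on full pinyin
            result = lookup.suggest(field, input, true);
            if (result.isEmpty()) {
                // Last attempt: treat the input as pinyin initials
                result = lookup.suggest("suggest.keyword_first_py", input, false);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Fake lookup that only knows the initials "gz"
        Lookup fake = (field, prefix, fuzzy) ->
                field.equals("suggest.keyword_first_py") && prefix.equals("gz")
                        ? Collections.singletonList("广州")
                        : Collections.<String>emptyList();
        System.out.println(suggest("gz", fake));
    }
}
```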

[Figure: results returned when searching for 广州]

6. References

https://www.jianshu.com/p/12d791cd29c1
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters.html#completion-suggester
http://www.appblog.cn/2019/06/04/ElasticSearch%207.x%20%E9%9B%86%E6%88%90RestHighLevelClient/
https://blog.csdn.net/qq_28988969/article/details/79540620
https://blog.csdn.net/wwd0501/article/details/80885987
https://blog.csdn.net/qushaming/article/details/90479091
https://www.jianshu.com/p/9e2c6a8e1b54
https://blog.csdn.net/wwd0501/article/details/80595201
https://blog.csdn.net/lixiaohai_918/article/details/89569611
https://www.cnblogs.com/feiquan/p/11888812.html
https://blog.csdn.net/woshiaotian/article/details/41479099
https://www.jianshu.com/p/acc8e86cc772
https://www.cnblogs.com/yfb918/p/10784951.html
https://www.cnblogs.com/leeSmall/p/9220535.html
https://blog.csdn.net/qq_35981283/article/details/86627170
https://blog.csdn.net/wzygis/article/details/51698309
https://blog.csdn.net/csdn565973850/article/details/104772551/
https://www.cnblogs.com/lianliang/p/7953891.html
https://stackoverflow.com/questions/53823154/elasticsearch-java-high-level-rest-client-suggest-search
http://www.likecs.com/default/index/show?id=39948
https://blog.csdn.net/chengyuqiang/article/details/89841544
