Summary: pitfalls when using Elasticsearch with the IK analyzer: custom dictionaries and remote hot reloading of custom dictionaries

First pitfall: the local custom dictionary is not loaded into ES, or it is loaded but has no effect.

Check that the path to your .dic file is correct and that the file encoding is UTF-8-BOM.
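To see which UTF-8 variant a .dic file actually uses, you can inspect its first three bytes: a BOM is the byte sequence EF BB BF. The sketch below is only an illustration; the class name `DicFileCheck` and its methods are hypothetical and are not part of ES or the IK plugin.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for inspecting a dictionary file; not part of IK.
final class DicFileCheck {

    /** Returns true if the bytes start with the UTF-8 byte order mark EF BB BF. */
    static boolean hasUtf8Bom(byte[] bytes) {
        return bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF;
    }

    /** Decodes the bytes as UTF-8, drops a leading BOM, and returns the non-blank lines. */
    static List<String> readWords(byte[] bytes) {
        String text = new String(bytes, StandardCharsets.UTF_8);
        // Java's UTF-8 decoder keeps the BOM as the character U+FEFF
        if (text.startsWith("\uFEFF")) {
            text = text.substring(1);
        }
        List<String> words = new ArrayList<>();
        for (String line : text.split("\r?\n")) {
            String word = line.trim();
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a dictionary file saved as UTF-8-BOM
        byte[] bytes = ("\uFEFF" + "明天之后\n德玛西亚\n").getBytes(StandardCharsets.UTF_8);
        System.out.println("BOM present: " + hasUtf8Bom(bytes));
        System.out.println("Words: " + readWords(bytes));
    }
}
```

For a real file, pass the result of `Files.readAllBytes(...)` to the two methods instead of the in-memory bytes.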

Edit IKAnalyzer.cfg.xml and set the ext_dict entry to the dictionary path:

    <entry key="ext_dict">custom/mydic.dic</entry>

Restart ES:

[2020-08-12T10:20:51,226][INFO ][o.w.a.d.Monitor          ] [Dict Loading] D:\software\elasticsearch\elasticsearch-6.2.2\plugins\elasticsearch\config\custom\mydic.dic

This line in the log means the dictionary was loaded successfully.

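Remote loading: expose an HTTP endpoint for IK to call, and register its URL under the remote_ext_dict entry of IKAnalyzer.cfg.xml: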


    http://localhost:10013/es-config/load-dic

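Hot reloading depends mainly on two HTTP response headers:

Last-Modified and ETag. Both are plain strings, and as soon as either one changes, the plugin fetches the new words and updates its dictionary. The body of the HTTP response must contain one word per line, with \n as the line separator. Once these two requirements are met, the dictionary is updated without restarting the ES instance; ES requests the URL once per minute.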
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import javax.servlet.http.HttpServletResponse;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("es-config")
public class ESConfigController {

    @GetMapping("load-dic")
    public void loadDic(HttpServletResponse response) {
        // Dictionary file served to IK: UTF-8, one word per line
        String filePath = "D:\\chenzhen\\Tomcat\\apache-tomcat-9.0.6\\mydic.dic";
        File file = new File(filePath);
        try {
            // Read the whole file; serve an empty body if it does not exist
            byte[] content = file.exists() ? Files.readAllBytes(file.toPath()) : new byte[0];
            // Real-time update: the header values differ on every request,
            // so IK reloads the dictionary on every poll
            String version = String.valueOf(System.currentTimeMillis());
            response.setHeader("Last-Modified", version);
            response.setHeader("ETag", version);
            response.setContentType("text/plain; charset=utf-8");
            // try-with-resources closes the stream even if the write fails
            try (OutputStream out = response.getOutputStream()) {
                out.write(content);
                out.flush();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
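Because the controller stamps both headers with the current time, every poll looks like a change and triggers a full reload. One possible refinement, sketched here under the assumption that you only want a reload when the file really changes, is to derive the ETag from the file content. `DicVersion` is a hypothetical helper name, not something provided by Spring or IK.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: builds an ETag that changes only when the content changes.
final class DicVersion {

    /** Returns a quoted hex SHA-256 digest of the dictionary bytes. */
    static String etag(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder("\"");
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] before = "明天之后\n".getBytes(StandardCharsets.UTF_8);
        byte[] after = "明天之后\n德玛西亚\n".getBytes(StandardCharsets.UTF_8);
        System.out.println("same content, same tag: " + etag(before).equals(etag(before.clone())));
        System.out.println("new word, new tag: " + !etag(before).equals(etag(after)));
    }
}
```

In the controller, the two `System.currentTimeMillis()` values could then be replaced by `String.valueOf(file.lastModified())` for Last-Modified and `DicVersion.etag(content)` for ETag, so the values stay stable between edits of the file.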

Then start ES. You may see an error like this:

access denied ("java.net.SocketPermission" "127.0.0.1:10013" "connect,

The cause is that the connection is blocked by Java's security policy.

Edit the JDK's policy file, Java\jdk1.8.0_181\jre\lib\security\java.policy:

Inside the grant block, add:

permission java.net.SocketPermission "localhost:10013","connect,resolve";
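To get a feel for how such a grant entry is matched against a requested connection, the standard `java.net.SocketPermission` class can be exercised directly through its `implies` method. This is only a sketch: `PolicyCheck` is a hypothetical name, and the example uses IP literals so that no name resolution is involved (a host name such as localhost has to be resolved before it can be compared with 127.0.0.1).

```java
import java.net.SocketPermission;

// Hypothetical helper: checks whether a granted target covers a requested one.
final class PolicyCheck {

    /** Returns true if a "connect,resolve" grant for grantedTarget covers a connect to requestedTarget. */
    static boolean covers(String grantedTarget, String requestedTarget) {
        SocketPermission granted = new SocketPermission(grantedTarget, "connect,resolve");
        SocketPermission requested = new SocketPermission(requestedTarget, "connect,resolve");
        return granted.implies(requested);
    }

    public static void main(String[] args) {
        // Same host and port as in the error message
        System.out.println(covers("127.0.0.1:10013", "127.0.0.1:10013"));
        // A different port is not covered by the grant
        System.out.println(covers("127.0.0.1:10013", "127.0.0.1:9200"));
    }
}
```

The target string also accepts a port range, for example "127.0.0.1:10000-10100", if the dictionary service may move between ports.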

Start ES again:

[2020-08-12T10:33:01,770][INFO ][o.w.a.d.Monitor          ] 明天之后
[2020-08-12T10:33:01,771][INFO ][o.w.a.d.Monitor          ] 德玛西亚
[2020-08-12T10:33:01,774][INFO ][o.w.a.d.Monitor          ] 重新加载词典完毕...
[2020-08-12T10:34:01,627][INFO ][o.w.a.d.Monitor          ] 重新加载词典...
[2020-08-12T10:34:01,628][INFO ][o.w.a.d.Monitor          ] try load config from D:\software\elasticsearch\elasticsearch-6.2.2\config\analysis-ik\IKAnalyzer.cfg.xml
[2020-08-12T10:34:01,630][INFO ][o.w.a.d.Monitor          ] try load config from D:\software\elasticsearch\elasticsearch-6.2.2\plugins\elasticsearch\config\IKAnalyzer.cfg.xml
[2020-08-12T10:34:01,721][INFO ][o.w.a.d.Monitor          ] [Dict Loading] D:\software\elasticsearch\elasticsearch-6.2.2\plugins\elasticsearch\config\custom\mydic.dic
[2020-08-12T10:34:01,721][INFO ][o.w.a.d.Monitor          ] [Dict Loading] http://localhost:10013/es-config/load-dic
[2020-08-12T10:34:01,731][INFO ][o.w.a.d.Monitor          ] 
[2020-08-12T10:34:01,732][INFO ][o.w.a.d.Monitor          ] 明天之后
[2020-08-12T10:34:01,734][INFO ][o.w.a.d.Monitor          ] 德玛西亚
[2020-08-12T10:34:01,735][INFO ][o.w.a.d.Monitor          ] 重新加载词典完毕...
[2020-08-12T10:35:01,630][INFO ][o.w.a.d.Monitor          ] 重新加载词典...
[2020-08-12T10:35:01,631][INFO ][o.w.a.d.Monitor          ] try load config from D:\software\elasticsearch\elasticsearch-6.2.2\config\analysis-ik\IKAnalyzer.cfg.xml
[2020-08-12T10:35:01,636][INFO ][o.w.a.d.Monitor          ] try load config from D:\software\elasticsearch\elasticsearch-6.2.2\plugins\elasticsearch\config\IKAnalyzer.cfg.xml
[2020-08-12T10:35:01,771][INFO ][o.w.a.d.Monitor          ] [Dict Loading] D:\software\elasticsearch\elasticsearch-6.2.2\plugins\elasticsearch\config\custom\mydic.dic
[2020-08-12T10:35:01,771][INFO ][o.w.a.d.Monitor          ] [Dict Loading] http://localhost:10013/es-config/load-dic
[2020-08-12T10:35:01,777][INFO ][o.w.a.d.Monitor          ] 
[2020-08-12T10:35:01,778][INFO ][o.w.a.d.Monitor          ] 明天之后
[2020-08-12T10:35:01,780][INFO ][o.w.a.d.Monitor          ] 德玛西亚
[2020-08-12T10:35:01,781][INFO ][o.w.a.d.Monitor          ] 重新加载词典完毕...
[2020-08-12T10:36:01,629][INFO ][o.w.a.d.Monitor          ] 重新加载词典...
[2020-08-12T10:36:01,630][INFO ][o.w.a.d.Monitor          ] try load config from D:\software\elasticsearch\elasticsearch-6.2.2\config\analysis-ik\IKAnalyzer.cfg.xml
[2020-08-12T10:36:01,637][INFO ][o.w.a.d.Monitor          ] try load config from D:\software\elasticsearch\elasticsearch-6.2.2\plugins\elasticsearch\config\IKAnalyzer.cfg.xml
[2020-08-12T10:36:01,806][INFO ][o.w.a.d.Monitor          ] [Dict Loading] D:\software\elasticsearch\elasticsearch-6.2.2\plugins\elasticsearch\config\custom\mydic.dic
[2020-08-12T10:36:01,807][INFO ][o.w.a.d.Monitor          ] [Dict Loading] http://localhost:10013/es-config/load-dic
[2020-08-12T10:36:01,815][INFO ][o.w.a.d.Monitor          ] 

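ES now reloads the dictionary automatically, once a minute. In the log, 重新加载词典... means "reloading dictionary..." and 重新加载词典完毕... means "dictionary reload finished...".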
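The once-a-minute behaviour in the log can be modelled as a small piece of client-side state: remember the last Last-Modified and ETag values, and reload when either one differs. The sketch below is a simplified model under that assumption, not the plugin's actual code; `RemoteDicState` and its methods are hypothetical names, and the use of a HEAD request is an assumption about how the check is made.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.Objects;

// Hypothetical model of the change check; not taken from the IK source.
final class RemoteDicState {
    private String lastModified;
    private String etag;

    /** Stores the latest header values and reports whether either one changed. */
    boolean changed(String newLastModified, String newEtag) {
        boolean differs = !Objects.equals(lastModified, newLastModified)
                || !Objects.equals(etag, newEtag);
        lastModified = newLastModified;
        etag = newEtag;
        return differs;
    }

    /** Sends one HEAD request and reports whether the dictionary should be fetched again. */
    boolean poll(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("HEAD");
        try {
            if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
                return false;
            }
            return changed(conn.getHeaderField("Last-Modified"), conn.getHeaderField("ETag"));
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling `poll("http://localhost:10013/es-config/load-dic")` from a scheduled task every 60 seconds would reproduce the rhythm seen above. With the controller from this post, `changed` is true on every poll because both headers carry the current time.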

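Note: leave the first line of the extension dictionary empty and start the words on the second line; otherwise the first word is not loaded. With a file saved as UTF-8-BOM, the most likely reason is that the invisible byte order mark gets attached to whatever is on the first line.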
