Elasticsearch 7.4.2: Installing the IK Analysis Plugin

Download

Download the IK plugin version that matches your Elasticsearch version from https://github.com/medcl/elasticsearch-analysis-ik (grab the prebuilt zip from the Releases page so you can skip the Maven build).

If no prebuilt release matches your Elasticsearch version, unpack the source, run mvn package in the project directory, and the corresponding zip will be generated under target/releases.
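For example, to fetch the prebuilt release directly on the server (the URL below follows the project's release naming convention of a v<version> tag plus a matching zip asset; verify the exact asset name on the Releases page):

wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.4.2/elasticsearch-analysis-ik-7.4.2.zip

Or, if building from source instead (assuming the v7.4.2 tag exists in the repository):

git clone https://github.com/medcl/elasticsearch-analysis-ik.git
cd elasticsearch-analysis-ik
git checkout v7.4.2
mvn package
# the plugin zip is generated under target/releases/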

Installation

Upload the zip to the Elasticsearch plugins directory and unzip it into its own subdirectory:

[es@Centos-51 plugins]$ ls
elasticsearch-analysis-ik-7.4.2.zip
[es@Centos-51 plugins]$ mkdir elasticsearch-analysis-ik-7.4.2
[es@Centos-51 plugins]$ ls
elasticsearch-analysis-ik-7.4.2  elasticsearch-analysis-ik-7.4.2.zip
[es@Centos-51 plugins]$ unzip elasticsearch-analysis-ik-7.4.2.zip -d ./elasticsearch-analysis-ik-7.4.2
Archive:  elasticsearch-analysis-ik-7.4.2.zip
  inflating: ./elasticsearch-analysis-ik-7.4.2/elasticsearch-analysis-ik-7.4.2.jar
  inflating: ./elasticsearch-analysis-ik-7.4.2/httpclient-4.5.2.jar
  inflating: ./elasticsearch-analysis-ik-7.4.2/httpcore-4.4.4.jar
  inflating: ./elasticsearch-analysis-ik-7.4.2/commons-logging-1.2.jar
  inflating: ./elasticsearch-analysis-ik-7.4.2/commons-codec-1.9.jar
  inflating: ./elasticsearch-analysis-ik-7.4.2/plugin-descriptor.properties
  inflating: ./elasticsearch-analysis-ik-7.4.2/plugin-security.policy
   creating: ./elasticsearch-analysis-ik-7.4.2/config/
  inflating: ./elasticsearch-analysis-ik-7.4.2/config/surname.dic
  inflating: ./elasticsearch-analysis-ik-7.4.2/config/quantifier.dic
  inflating: ./elasticsearch-analysis-ik-7.4.2/config/extra_stopword.dic
  inflating: ./elasticsearch-analysis-ik-7.4.2/config/suffix.dic
  inflating: ./elasticsearch-analysis-ik-7.4.2/config/extra_single_word_full.dic
  inflating: ./elasticsearch-analysis-ik-7.4.2/config/extra_single_word.dic
  inflating: ./elasticsearch-analysis-ik-7.4.2/config/preposition.dic
  inflating: ./elasticsearch-analysis-ik-7.4.2/config/IKAnalyzer.cfg.xml
  inflating: ./elasticsearch-analysis-ik-7.4.2/config/main.dic
  inflating: ./elasticsearch-analysis-ik-7.4.2/config/stopword.dic
  inflating: ./elasticsearch-analysis-ik-7.4.2/config/extra_main.dic
  inflating: ./elasticsearch-analysis-ik-7.4.2/config/extra_single_word_low_freq.dic
[es@Centos-51 plugins]$ ls
elasticsearch-analysis-ik-7.4.2  elasticsearch-analysis-ik-7.4.2.zip
[es@Centos-51 plugins]$ cd elasticsearch-analysis-ik-7.4.2
[es@Centos-51 elasticsearch-analysis-ik-7.4.2]$ ls
commons-codec-1.9.jar  commons-logging-1.2.jar  config  elasticsearch-analysis-ik-7.4.2.jar  httpclient-4.5.2.jar  httpcore-4.4.4.jar  plugin-descriptor.properties  plugin-security.policy
[es@Centos-51 elasticsearch-analysis-ik-7.4.2]$ cd ..
[es@Centos-51 plugins]$ ls
elasticsearch-analysis-ik-7.4.2  elasticsearch-analysis-ik-7.4.2.zip
[es@Centos-51 plugins]$ rm -rf elasticsearch-analysis-ik-7.4.2.zip
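Elasticsearch loads plugins at startup, so restart the node after unpacking. As a quick sanity check (assuming the plugins directory sits directly under the Elasticsearch home, as in a standard install), the plugin directory should now be listed:

[es@Centos-51 plugins]$ ../bin/elasticsearch-plugin list
elasticsearch-analysis-ik-7.4.2

The startup log should also contain a line like "loaded plugin [analysis-ik]", matching the plugin name in plugin-descriptor.properties.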

Verification
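After the restart, exercise the analyzer through the standard _analyze API. A minimal request, assuming Elasticsearch listens on the default localhost:9200 (the sample text 我是中国人 is the sentence discussed at the end of this post):

curl -X POST 'http://localhost:9200/_analyze?pretty' \
  -H 'Content-Type: application/json' \
  -d '{"analyzer": "ik_smart", "text": "我是中国人"}'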

Tokenization result with ik_smart:

{
    "tokens": [
        {
            "token": "我",
            "start_offset": 0,
            "end_offset": 1,
            "type": "CN_CHAR",
            "position": 0
        },
        {
            "token": "是",
            "start_offset": 1,
            "end_offset": 2,
            "type": "CN_CHAR",
            "position": 1
        },
        {
            "token": "中国人",
            "start_offset": 2,
            "end_offset": 5,
            "type": "CN_WORD",
            "position": 2
        }
    ]
}
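The same request with "analyzer": "ik_max_word" instead:

curl -X POST 'http://localhost:9200/_analyze?pretty' \
  -H 'Content-Type: application/json' \
  -d '{"analyzer": "ik_max_word", "text": "我是中国人"}'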

Tokenization result with ik_max_word:

{
    "tokens": [
        {
            "token": "我",
            "start_offset": 0,
            "end_offset": 1,
            "type": "CN_CHAR",
            "position": 0
        },
        {
            "token": "是",
            "start_offset": 1,
            "end_offset": 2,
            "type": "CN_CHAR",
            "position": 1
        },
        {
            "token": "中国人",
            "start_offset": 2,
            "end_offset": 5,
            "type": "CN_WORD",
            "position": 2
        },
        {
            "token": "中国",
            "start_offset": 2,
            "end_offset": 4,
            "type": "CN_WORD",
            "position": 3
        },
        {
            "token": "国人",
            "start_offset": 3,
            "end_offset": 5,
            "type": "CN_WORD",
            "position": 4
        }
    ]
}

ik_max_word: performs the finest-grained segmentation. For example, it splits 我是中国人 into 我 / 是 / 中国人 / 中国 / 国人, exhausting every possible word combination.

ik_smart: performs the coarsest-grained segmentation. For example, it splits 我是中国人 into 我 / 是 / 中国人.
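Because of this difference, a common pattern is to index with ik_max_word (more tokens, better recall) and search with ik_smart (fewer, more precise query tokens). A minimal sketch of such a mapping (the index name my_index and the field name content are placeholders, and localhost:9200 is again assumed):

curl -X PUT 'http://localhost:9200/my_index' \
  -H 'Content-Type: application/json' \
  -d '{
    "mappings": {
      "properties": {
        "content": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_smart"
        }
      }
    }
  }'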
