Installing Chinese/Pinyin Analyzers in ES (IK + pinyin)

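ES is arguably the most powerful full-text search tool available, and Chinese and English word segmentation is close to a must-have feature. Below is a brief outline of how to install the analyzers (detailed walkthroughs are easy to find online; this post only covers the overall approach and steps):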

1. Download the Chinese/pinyin analyzers

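IK Chinese analyzer: https://github.com/medcl/elasticsearch-analysis-ik
Pinyin analyzer: https://github.com/medcl/elasticsearch-analysis-pinyin
(Both turn out to be the work of the same author, who also maintains mmseg and Simplified/Traditional Chinese conversion plugins. Still quietly watching those repos.)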

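2. Install
  • On the releases page, find the zip file that matches your ES version, or get the source and build it yourself with mvn package. You can also download the latest master code, but the resulting plugin still has to match the ES version you run.
  • Go to the plugins directory under the Elasticsearch installation directory; mkdir pinyin; cd pinyin (for IK, do the same with a separate ik directory).
  • cp the zip file from the previous step into the pinyin directory; unzip it.
  • After deploying, remember to restart the ES node.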
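
The manual steps above (create a directory under plugins, copy the zip there, unzip it) can be sketched in Python roughly as follows. This is only a minimal sketch, not the plugins' own installer: the function names version_from_zip_name and install_plugin_zip are made up for illustration, and the file-name pattern elasticsearch-analysis-<name>-<version>.zip is an assumption based on how the release files are named.

```python
import re
import zipfile
from pathlib import Path


def version_from_zip_name(zip_name):
    """Return the version embedded in a release zip name, or None.

    Assumes names such as elasticsearch-analysis-pinyin-5.5.1.zip.
    """
    match = re.search(r"-(\d+(?:\.\d+)+)\.zip$", zip_name)
    return match.group(1) if match else None


def install_plugin_zip(zip_path, es_home, plugin_name, es_version):
    """Unzip a plugin release into <es_home>/plugins/<plugin_name>."""
    zip_path = Path(zip_path)
    found = version_from_zip_name(zip_path.name)
    if found != es_version:
        # Refuse to unpack a plugin built for a different ES version.
        raise ValueError(f"plugin version {found} does not match ES {es_version}")
    target = Path(es_home) / "plugins" / plugin_name
    target.mkdir(parents=True, exist_ok=True)   # mkdir pinyin
    with zipfile.ZipFile(zip_path) as archive:  # unzip
        archive.extractall(target)
    return target
```

A call would look like install_plugin_zip("elasticsearch-analysis-pinyin-5.5.1.zip", "/opt/elasticsearch", "pinyin", "5.5.1"), where both the path and the version are placeholders. The node still has to be restarted afterwards.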
3. Configure

** settings configuration **

Supply these settings when creating the index (number_of_shards is fixed at creation time, and analyzers can only be changed on a closed index):

PUT my_index
{
  "settings" : {
    "index" : {
      "number_of_shards" : 3,
      "number_of_replicas" : 1,
      "analysis" : {
        "analyzer" : {
          "default" : {
            "tokenizer" : "ik_max_word"
          },
          "pinyin_analyzer" : {
            "tokenizer" : "my_pinyin"
          }
        },
        "tokenizer" : {
          "my_pinyin" : {
            "type" : "pinyin",
            "keep_separate_first_letter" : false,
            "keep_full_pinyin" : true,
            "keep_original" : true,
            "limit_first_letter_length" : 16,
            "lowercase" : true
          }
        }
      }
    }
  }
}
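
For anyone scripting this instead of using the console, the same settings can be assembled as a Python dict. This is a rough sketch under a few assumptions: the index does not exist yet, so the settings are sent with PUT my_index at creation time; build_settings and create_index_request are illustrative names, not part of any ES client; and the request is only prepared here, not sent.

```python
import json
import urllib.request


def build_settings(shards=3, replicas=1):
    """Build an index-creation body: IK as default analyzer plus a pinyin analyzer."""
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
                "analysis": {
                    "analyzer": {
                        "default": {"tokenizer": "ik_max_word"},
                        "pinyin_analyzer": {"tokenizer": "my_pinyin"},
                    },
                    "tokenizer": {
                        "my_pinyin": {
                            "type": "pinyin",
                            "keep_separate_first_letter": False,
                            "keep_full_pinyin": True,
                            "keep_original": True,
                            "limit_first_letter_length": 16,
                            "lowercase": True,
                        }
                    },
                },
            }
        }
    }


def create_index_request(base_url, index, body):
    """Prepare (but do not send) the PUT request that creates the index."""
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Passing the prepared request to urllib.request.urlopen would send it, for example with base_url set to http://localhost:9200 as in the curl commands later in this post.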

** mapping configuration **

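This mapping uses _all, include_in_all and a mapping type, which exist in ES 5.x but were deprecated and then removed in later major versions. The type name in the body has to match the one in the URL:

PUT my_index/index_type/_mapping
{
  "index_type" : {
    "_all" : {
      "analyzer" : "ik_max_word"
    },
    "properties" : {
      "name" : {
        "type" : "text",
        "analyzer" : "ik_max_word",
        "include_in_all" : true,
        "fields" : {
          "pinyin" : {
            "type" : "text",
            "term_vector" : "with_positions_offsets",
            "analyzer" : "pinyin_analyzer",
            "boost" : 10.0
          }
        }
      }
    }
  }
}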
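
Every analyzer name used in the mapping has to resolve either to an analyzer defined in the index settings or to one registered by a plugin. A quick way to catch a misspelled name before sending the mapping is sketched below. The helper names referenced_analyzers and undefined_analyzers are hypothetical, and the lists of known analyzers are typed in by hand here rather than read from a cluster.

```python
def referenced_analyzers(node):
    """Recursively collect every string value stored under an "analyzer" key."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "analyzer" and isinstance(value, str):
                found.add(value)
            else:
                found |= referenced_analyzers(value)
    elif isinstance(node, list):
        for item in node:
            found |= referenced_analyzers(item)
    return found


def undefined_analyzers(mapping, settings_analyzers, plugin_analyzers):
    """Return analyzer names the mapping uses but nothing provides."""
    return referenced_analyzers(mapping) - set(settings_analyzers) - set(plugin_analyzers)


mapping = {
    "index_type": {
        "_all": {"analyzer": "ik_max_word"},
        "properties": {
            "name": {
                "type": "text",
                "analyzer": "ik_max_word",
                "fields": {
                    "pinyin": {"type": "text", "analyzer": "pinyin_analyzer"}
                },
            }
        },
    }
}

missing = undefined_analyzers(
    mapping,
    settings_analyzers=["default", "pinyin_analyzer"],  # defined in the index settings
    plugin_analyzers=["ik_max_word", "ik_smart"],       # registered by the IK plugin
)
print(missing)  # an empty set when every name resolves
```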
4. Test

Use _analyze to check that the analyzer works:

GET my_index/_analyze
{
    "text":["刘德华"],
    "analyzer":"pinyin_analyzer"
}
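
The same check can be scripted. The sketch below prepares an _analyze request and pulls the token strings out of the JSON response, which carries them in a "tokens" array. The function names analyze_request and token_strings are made up for this example, and nothing is sent until the request is passed to urllib.request.urlopen.

```python
import json
import urllib.request


def analyze_request(base_url, index, analyzer, text):
    """Prepare (but do not send) an _analyze request for one piece of text."""
    body = {"analyzer": analyzer, "text": [text]}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_analyze",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def token_strings(response_body):
    """Pull the token texts out of an _analyze JSON response (str or bytes)."""
    return [entry["token"] for entry in json.loads(response_body)["tokens"]]
```

Against a running node, opening analyze_request("http://localhost:9200", "my_index", "pinyin_analyzer", "刘德华") with urlopen and passing the response bytes to token_strings should list the tokens the pinyin analyzer produced. An error response instead usually means the plugin is not loaded or the analyzer name is wrong.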

Index a document with Chinese text:

curl -XPOST 'http://localhost:9200/my_index/index_type' -H 'Content-Type: application/json' -d'
{
"name":"刘德华"
}
'

Chinese analyzer test (via query string; the URLs are quoted so the shell does not interpret the ?)
curl 'http://localhost:9200/my_index/index_type/_search?q=name:刘'
curl 'http://localhost:9200/my_index/index_type/_search?q=name:刘德'

Pinyin test (via query string)
curl 'http://localhost:9200/my_index/index_type/_search?q=name.pinyin:liu'
curl 'http://localhost:9200/my_index/index_type/_search?q=name.pinyin:ldh'
curl 'http://localhost:9200/my_index/index_type/_search?q=name.pinyin:de+hua'
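
When these query-string searches are issued from code rather than typed by hand, it is safer to percent-encode the q parameter, since it contains a colon, possibly spaces, and non-ASCII characters. A small sketch using only the standard library follows; search_url is an illustrative helper name, and the host, index and type are the ones used in the curl commands above.

```python
from urllib.parse import urlencode


def search_url(base_url, index, doc_type, field, term):
    """Build a URI-search URL with the q parameter percent-encoded."""
    query = urlencode({"q": f"{field}:{term}"})
    return f"{base_url}/{index}/{doc_type}/_search?{query}"


for field, term in [("name", "刘"), ("name.pinyin", "ldh"), ("name.pinyin", "de hua")]:
    print(search_url("http://localhost:9200", "my_index", "index_type", field, term))
```

urlencode turns the space in "de hua" into +, which is the same form as the de+hua query above, and encodes 刘 as its UTF-8 bytes.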
