"Any blog post that does not state which Elastic version it uses is just messing with its readers." (some programmer)
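The version is the one given in the title. The complete test procedure for combining pinyin with Chinese word segmentation is as follows: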
DELETE /index_name/
{
}
PUT /index_name/
{
  "index": {
    "analysis": {
      "analyzer": {
        "ik_pinyin_analyzer": {
          "type": "custom",
          "tokenizer": "ik_max_word",
          "filter": ["my_pinyin", "word_delimiter"]
        }
      },
      "filter": {
        "my_pinyin": {
          "type": "pinyin",
          "first_letter": "prefix",
          "padding_char": " "
        }
      }
    }
  }
}
PUT /index_name/app/_mapping
{
  "app": {
    "properties": {
      "ProductCName": {
        "type": "keyword",
        "fields": {
          "pinyin": {
            "type": "text",
            "store": false,
            "term_vector": "with_positions_offsets",
            "analyzer": "ik_pinyin_analyzer",
            "boost": 10
          }
        }
      },
      "ProductEName": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "Description": {
        "type": "text",
        "analyzer": "ik_max_word"
      }
    }
  }
}
PUT /index_name/app/1
{
  "ProductCName": "口红世家",
  "ProductEName": "Red History",
  "Description": "口红真是很棒的东西呢"
}
POST /index_name/_analyze?pretty
{
  "analyzer": "pinyin",
  "text": "王者荣耀"
}
The _analyze request returns the following tokens:
{
  "tokens": [
    {
      "token": "wang",
      "start_offset": 0,
      "end_offset": 1,
      "type": "word",
      "position": 0
    },
    {
      "token": "wzry",
      "start_offset": 0,
      "end_offset": 4,
      "type": "word",
      "position": 0
    },
    {
      "token": "zhe",
      "start_offset": 1,
      "end_offset": 2,
      "type": "word",
      "position": 1
    },
    {
      "token": "rong",
      "start_offset": 2,
      "end_offset": 3,
      "type": "word",
      "position": 2
    },
    {
      "token": "yao",
      "start_offset": 3,
      "end_offset": 4,
      "type": "word",
      "position": 3
    }
  ]
}
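To make the offsets in the _analyze output easier to read, the following is a minimal Python sketch that maps each token back to the slice of the input text it covers. It does not talk to Elasticsearch: the token list is copied by hand from the response above, and the helper name covered_text is made up for this illustration.

```python
# Tokens copied by hand from the _analyze response for "王者荣耀".
text = "王者荣耀"
tokens = [
    {"token": "wang", "start_offset": 0, "end_offset": 1, "position": 0},
    {"token": "wzry", "start_offset": 0, "end_offset": 4, "position": 0},
    {"token": "zhe", "start_offset": 1, "end_offset": 2, "position": 1},
    {"token": "rong", "start_offset": 2, "end_offset": 3, "position": 2},
    {"token": "yao", "start_offset": 3, "end_offset": 4, "position": 3},
]


def covered_text(source, token):
    """Return the part of the input that a token's offsets point at (hypothetical helper)."""
    return source[token["start_offset"]:token["end_offset"]]


for tok in tokens:
    # Each full-pinyin token covers one character; the first-letter token covers all four.
    print(tok["token"], "->", covered_text(text, tok))
```

The first-letter token wzry shares position 0 with wang but spans offsets 0 to 4, so it covers the whole input rather than a single character.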
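After the configuration was finished, the analyzer did not take effect. The Elasticsearch startup log showed the following error:
After the path and the XML structure were fixed, the configuration file was reloaded:
Next, check the search results for 口红. The search returned results that contain the character 红.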
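One plausible explanation for this result, offered as an assumption rather than something the logs here confirm: the pinyin filter emits one token per character, and a match query combines terms with OR by default, so a single shared syllable is enough for a hit. The toy Python model below only illustrates that overlap. The pinyin table is hand-written, the document text 红茶 is hypothetical, and neither function is part of the real plugin.

```python
# Toy model only: a tiny hand-written pinyin table, NOT the real pinyin plugin.
PINYIN = {
    "王": "wang", "者": "zhe", "荣": "rong", "耀": "yao",
    "口": "kou", "红": "hong", "茶": "cha",
}


def pinyin_tokens(text):
    """Per-character pinyin plus a first-letter token, mimicking the shape of the _analyze output."""
    syllables = [PINYIN[ch] for ch in text if ch in PINYIN]
    tokens = set(syllables)
    if syllables:
        tokens.add("".join(s[0] for s in syllables))
    return tokens


def or_match(query, doc_text):
    """OR semantics: the document matches if it shares at least one token with the query."""
    return bool(pinyin_tokens(query) & pinyin_tokens(doc_text))


# "红茶" is a hypothetical document value that contains 红 but not 口红.
shared = pinyin_tokens("口红") & pinyin_tokens("红茶")
print(sorted(shared))  # ['hong']
print(or_match("口红", "红茶"))  # True
```

If this is indeed the cause, stricter matching (for example requiring all query terms to match, or a phrase query) would be the direction to look into, but that would need to be verified against the actual index.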