IK Analyzer extension configuration
Extension dictionaries (ext_dict): custom/mydict.dic;custom/single_word_low_freq.dic
Extension stopword dictionaries (ext_stopwords): custom/ext_stopword.dic
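The two paths above read like the values of the dictionary entries in IK's configuration file. As a rough sketch, the snippet below parses a configuration of that shape and lists the dictionary files it names. The `<properties>`/`<entry>` layout and the keys `ext_dict` and `ext_stopwords` are assumed from the usual IKAnalyzer.cfg.xml format, and `list_dictionaries` is a hypothetical helper, not part of IK.

```python
import xml.etree.ElementTree as ET

# Assumed shape of IKAnalyzer.cfg.xml; only the path values come from the notes above.
SAMPLE_CONFIG = """<properties>
    <comment>IK Analyzer extension configuration</comment>
    <entry key="ext_dict">custom/mydict.dic;custom/single_word_low_freq.dic</entry>
    <entry key="ext_stopwords">custom/ext_stopword.dic</entry>
</properties>"""


def list_dictionaries(config_xml):
    """Return {entry key: [dictionary paths]} for every <entry> in the config."""
    root = ET.fromstring(config_xml)
    result = {}
    for entry in root.findall("entry"):
        value = (entry.text or "").strip()
        # Several dictionary files can be listed, separated by semicolons.
        result[entry.get("key")] = [p.strip() for p in value.split(";") if p.strip()]
    return result


dictionaries = list_dictionaries(SAMPLE_CONFIG)
print(dictionaries)
```

Each path is relative to the plugin's config directory, which is why the custom word list in these notes lives at ik\config\custom\mydict.dic.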
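(1) Building your own dictionary: new buzzwords appear every year, such as 网红, 蓝瘦香菇, 喊麦, and 鬼畜, and they are generally not in IK's built-in dictionary.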
Before adding “喊麦” to ik\config\custom\mydict.dic, run the analyzer on it:
GET /my_index/_analyze
{
  "text": "喊麦",
  "analyzer": "ik_max_word"
}
{
  "tokens": [
    {
      "token": "喊",
      "start_offset": 0,
      "end_offset": 1,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "麦",
      "start_offset": 1,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 1
    }
  ]
}
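Adding a word to the custom dictionary is just a matter of putting it on its own line in the file. A minimal sketch of scripting that step follows, assuming the dictionary is a UTF-8 text file with one word per line; `add_word` is a hypothetical helper, and the demo writes to a temporary directory rather than the real plugin config folder.

```python
import tempfile
from pathlib import Path


def add_word(dict_path, word):
    """Append `word` to a one-word-per-line dictionary file.

    Returns True if the word was added, False if it was already present.
    """
    path = Path(dict_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    if word in (line.strip() for line in text.splitlines()):
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        # Make sure the new entry starts on its own line.
        if text and not text.endswith("\n"):
            f.write("\n")
        f.write(word + "\n")
    return True


# Demo against a temporary directory, not the real ik\config\custom folder.
with tempfile.TemporaryDirectory() as tmp:
    demo_dic = Path(tmp) / "custom" / "mydict.dic"
    first = add_word(demo_dic, "喊麦")   # file is created and the word is written
    second = add_word(demo_dic, "喊麦")  # word already present, nothing is written
    contents = demo_dic.read_text(encoding="utf-8").splitlines()
print(first, second, contents)
```

Editing the file by hand works equally well. Either way, the change only takes effect after the restart described in these notes.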
After adding “喊麦” to mydict.dic, restart Elasticsearch and run the same test again:
GET /my_index/_analyze
{
  "text": "喊麦",
  "analyzer": "ik_max_word"
}
{
  "tokens": [
    {
      "token": "喊麦",
      "start_offset": 0,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "喊",
      "start_offset": 0,
      "end_offset": 1,
      "type": "CN_WORD",
      "position": 1
    },
    {
      "token": "麦",
      "start_offset": 1,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 2
    }
  ]
}
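To repeat this check from a script instead of the console, something like the following might do. It is only a sketch: the base URL `http://localhost:9200` is an assumption about where the cluster runs, and `analyze` and `token_strings` are hypothetical helper names. The demo at the bottom does not contact a cluster; it reuses the response shown above, trimmed to the fields it needs.

```python
import json
import urllib.request


def analyze(text, analyzer="ik_max_word",
            base_url="http://localhost:9200", index="my_index"):
    """Send an _analyze request and return the parsed JSON response.

    base_url is an assumed local address; adjust it for your cluster.
    """
    body = json.dumps({"text": text, "analyzer": analyzer}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def token_strings(response):
    """Return the token texts from an _analyze response, ordered by position."""
    tokens = sorted(response["tokens"], key=lambda t: t["position"])
    return [t["token"] for t in tokens]


# The response from the notes after the dictionary change, trimmed to two fields.
after_adding = {"tokens": [
    {"token": "喊麦", "position": 0},
    {"token": "喊", "position": 1},
    {"token": "麦", "position": 2},
]}
tokens = token_strings(after_adding)
print(tokens)
print("喊麦" in tokens)
```

Against a live cluster, `token_strings(analyze("喊麦"))` would give the same kind of list, so checking whether the whole word appears in it is a quick way to confirm that the custom dictionary was loaded.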