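Preface
- Recommended learning resource: Ruan Yiming's course "Elasticsearch 核心技术与实战" (Elasticsearch Core Technology and Practice)
- This article applies to Elasticsearch 7.x
- Synonyms can be applied at index time (index-time synonyms) or at search time (search-time synonyms); search time is the usual choice
- This article covers search-time synonyms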
Synonym file format
ipod, i-pod, i pod => ipod
马铃薯, 土豆, potato
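The two lines above use the two rule styles of the Solr synonym format: an explicit mapping (`=>`), where every term on the left is replaced by the right-hand side, and a comma-separated group of equivalent synonyms, where each term expands to the whole group. The following is a minimal Python sketch of those semantics, assuming the default expand behaviour; `parse_synonym_rules` is a hypothetical helper written for illustration, not part of Elasticsearch.

```python
from typing import Dict, List


def parse_synonym_rules(text: str) -> Dict[str, List[str]]:
    """Map each input term to the terms it is rewritten to (simplified semantics)."""
    mapping: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # skip blank lines and comments
            continue
        if "=>" in line:
            # Explicit mapping: every term on the left is replaced by the right side.
            left, right = line.split("=>", 1)
            targets = [t.strip() for t in right.split(",") if t.strip()]
            for src in (s.strip() for s in left.split(",")):
                if src:
                    mapping.setdefault(src, []).extend(targets)
        else:
            # Equivalent synonyms: each term expands to the whole group.
            terms = [t.strip() for t in line.split(",") if t.strip()]
            for term in terms:
                mapping.setdefault(term, []).extend(terms)
    return mapping


rules = parse_synonym_rules("ipod, i-pod, i pod => ipod\n马铃薯, 土豆, potato\n")
print(rules["i pod"])   # ['ipod']
print(rules["potato"])  # ['马铃薯', '土豆', 'potato']
```

The real token filter works on analyzed tokens rather than raw strings, so this only illustrates how the two rule styles differ.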
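Experiment steps
Add the synonym file
- In the Elasticsearch config directory, create an analysis directory, then add the synonym file synonym.txt inside analysis
- Because the synonyms are applied only when the query is analyzed, using them at search time requires neither restarting Elasticsearch nor rebuilding the index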
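The file placement described above can be scripted. Below is a small Python sketch that creates `config/analysis/synonym.txt` under an installation directory; `write_synonym_file` and the `es_home` argument are hypothetical names, and the demo uses a temporary directory in place of a real Elasticsearch installation. The file is written as UTF-8, which Elasticsearch expects for synonym files.

```python
from pathlib import Path
import tempfile

SYNONYM_RULES = "ipod, i-pod, i pod => ipod\n马铃薯, 土豆, potato\n"


def write_synonym_file(es_home: Path) -> Path:
    """Create config/analysis/synonym.txt under es_home and return its path."""
    analysis_dir = es_home / "config" / "analysis"
    analysis_dir.mkdir(parents=True, exist_ok=True)
    target = analysis_dir / "synonym.txt"
    target.write_text(SYNONYM_RULES, encoding="utf-8")
    return target


# Demo: a temporary directory stands in for the real installation directory.
with tempfile.TemporaryDirectory() as tmp:
    path = write_synonym_file(Path(tmp))
    # The value used for synonyms_path is relative to the config directory.
    relative = path.relative_to(Path(tmp) / "config").as_posix()
    content = path.read_text(encoding="utf-8")

print(relative)  # analysis/synonym.txt
```

On a multi-node cluster the same file would presumably need to be present on every node, since each node loads it from its own config directory.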
Create the index
PUT my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "word_syn": {
          "type": "synonym_graph",
          "synonyms_path": "analysis/synonym.txt"
        }
      },
      "analyzer": {
        "ik_smart_syn": {
          "filter": [
            "stemmer",
            "word_syn"
          ],
          "type": "custom",
          "tokenizer": "ik_smart"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart"
      },
      "author": {
        "type": "keyword"
      }
    }
  }
}
Test the analyzer directly
GET my_index/_analyze
{
  "analyzer": "ik_smart_syn",
  "text": "马铃薯"
}
{
  "tokens" : [
    {
      "token" : "马铃薯",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "CN_WORD",
      "position" : 0
    },
    {
      "token" : "土豆",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "SYNONYM",
      "position" : 0
    },
    {
      "token" : "potato",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "SYNONYM",
      "position" : 0
    }
  ]
}
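In the response above, the original token and both synonyms share position 0 and the same offsets, which is how the synonym filter marks them as alternatives for the same word. A short Python sketch can make that check explicit by grouping tokens by position; `tokens_by_position` is a hypothetical helper, and the response dictionary is a trimmed copy of the output above.

```python
from collections import defaultdict
from typing import Dict, List

# Trimmed copy of the _analyze response (offsets omitted for brevity).
response = {
    "tokens": [
        {"token": "马铃薯", "type": "CN_WORD", "position": 0},
        {"token": "土豆", "type": "SYNONYM", "position": 0},
        {"token": "potato", "type": "SYNONYM", "position": 0},
    ]
}


def tokens_by_position(analyze_response: dict) -> Dict[int, List[str]]:
    """Group token texts by their position in the token stream."""
    grouped: Dict[int, List[str]] = defaultdict(list)
    for tok in analyze_response["tokens"]:
        grouped[tok["position"]].append(tok["token"])
    return dict(grouped)


grouped = tokens_by_position(response)
print(grouped)  # {0: ['马铃薯', '土豆', 'potato']}
```

A position that holds more than one token indicates that synonym expansion took effect for that word.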
Add test data
POST my_index/_doc/1
{
  "title": "马铃薯",
  "author": "土豆"
}
GET my_index/_termvectors/1?fields=title
Search test
GET my_index/_search
{
  "query": {
    "query_string": {
      "analyzer": "ik_smart_syn",
      "query": "title:potato AND author:potato"
    }
  }
}
{
  "took" : 38,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.5753642,
    "hits" : [
      {
        "_index" : "my_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.5753642,
        "_source" : {
          "title" : "马铃薯",
          "author" : "土豆"
        }
      }
    ]
  }
}
Related documentation
This article is from qbit snap