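Analyzers: "standard", "ik_max_word", "ik_smart"
- The standard analyzer splits Chinese text into individual characters; ik_max_word emits every word it can find in the text, including overlapping ones; ik_smart emits only the single segmentation it judges most likely, at the coarsest granularity.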
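The difference between the three strategies can be sketched with a toy model. This is not how Elasticsearch or the IK plugin is implemented: `DICTIONARY` is a hypothetical three-word dictionary, and greedy forward longest match is only a stand-in for IK's real disambiguation logic.

```python
# Toy illustration only: real analyzers use their own dictionaries and rules.
DICTIONARY = {"练习", "分词", "词"}  # hypothetical mini dictionary


def per_character(text):
    """Idea behind standard on Chinese text: one token per character."""
    return list(text)


def all_words(text, dictionary):
    """Idea behind ik_max_word: emit every dictionary word found in the text."""
    found = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            if text[start:end] in dictionary:
                found.append(text[start:end])
    return found


def longest_match(text, dictionary):
    """Stand-in for ik_smart: greedy forward longest match, no overlaps."""
    tokens, start = [], 0
    while start < len(text):
        for end in range(len(text), start, -1):
            # Fall back to a single character when no dictionary word fits.
            if text[start:end] in dictionary or end == start + 1:
                tokens.append(text[start:end])
                start = end
                break
    return tokens


print(per_character("练习分词"))              # ['练', '习', '分', '词']
print(all_words("练习分词", DICTIONARY))      # ['练习', '分词', '词']
print(longest_match("练习分词", DICTIONARY))  # ['练习', '分词']
```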
# standard analyzer
GET /_analyze
{
"analyzer": "standard",
"text": "练习分词"
}
# Analysis result
{
"tokens": [
{
"token": "练",
"start_offset": 0,
"end_offset": 1,
"type": "<IDEOGRAPHIC>",
"position": 0
},
{
"token": "习",
"start_offset": 1,
"end_offset": 2,
"type": "<IDEOGRAPHIC>",
"position": 1
},
{
"token": "分",
"start_offset": 2,
"end_offset": 3,
"type": "<IDEOGRAPHIC>",
"position": 2
},
{
"token": "词",
"start_offset": 3,
"end_offset": 4,
"type": "<IDEOGRAPHIC>",
"position": 3
}
]
}
**************************************************************************
# ik_max_word analyzer
GET /_analyze
{
"analyzer": "ik_max_word",
"text": "练习分词"
}
# Analysis result
{
"tokens": [
{
"token": "练习",
"start_offset": 0,
"end_offset": 2,
"type": "CN_WORD",
"position": 0
},
{
"token": "分词",
"start_offset": 2,
"end_offset": 4,
"type": "CN_WORD",
"position": 1
},
{
"token": "词",
"start_offset": 3,
"end_offset": 4,
"type": "CN_WORD",
"position": 2
}
]
}
******************************************************************************************
# ik_smart analyzer
GET /_analyze
{
"analyzer": "ik_smart",
"text": "练习分词"
}
# Analysis result
{
"tokens": [
{
"token": "练习",
"start_offset": 0,
"end_offset": 2,
"type": "CN_WORD",
"position": 0
},
{
"token": "分词",
"start_offset": 2,
"end_offset": 4,
"type": "CN_WORD",
"position": 1
}
]
}
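- A term query matches only documents whose field produced exactly that token during analysis; the query term itself is not analyzed. If the field was analyzed into "练习" and "分词" (ik_smart analyzer) and you run the following query: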
GET adu/adu2/_search
{"query": {"bool": {"filter": {"term": {
"title": "练"
}}}}}
it returns no hits:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
Likewise, if the field was analyzed into "练", "习", "分", "词" (standard analyzer) and you run the following query:
GET adu/adu2/_search
{"query": {"bool": {"filter": {"term": {
"title": "练习"
}}}}}
it also returns no hits:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
# You can only query by a token that the analyzer actually produced
GET adu/adu2/_search
{"query": {"bool": {"filter": {"term": {
"title": "练"
}}}}}
# Only then is there a hit
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0,
"hits": [
{
"_index": "adu",
"_type": "adu2",
"_id": "1",
"_score": 0,
"_source": {
"title": "练习分词",
"city": "北京",
"time": 1
}
}
]
}
}
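The three term queries above behave like an exact lookup in the set of indexed tokens. Here is a minimal sketch of that idea. The token sets are copied from the analyzer outputs in these notes, and `term_matches` is a hypothetical helper, not an Elasticsearch API.

```python
def term_matches(indexed_tokens, term):
    """A term query compares the raw term with the indexed tokens; the term is not analyzed."""
    return term in indexed_tokens


# Tokens that each analyzer produced for the title "练习分词".
ik_smart_tokens = {"练习", "分词"}
standard_tokens = {"练", "习", "分", "词"}

print(term_matches(ik_smart_tokens, "练"))    # False: ik_smart never emitted "练"
print(term_matches(standard_tokens, "练习"))  # False: standard never emitted "练习"
print(term_matches(standard_tokens, "练"))    # True: "练" is an indexed token
```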
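- match_phrase, by contrast, analyzes the query text first and then matches the resulting tokens in order, which suits analyzed text fields better.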
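As a rough sketch of why match_phrase finds "练习" even when the field was indexed character by character, the toy model below analyzes both the document and the query and then looks for the query tokens at consecutive positions. It makes three assumptions: the same analyzer is used at index and search time, `analyze` is a per-character stand-in for the standard analyzer, and there is no slop. The real engine works from position data stored in the inverted index.

```python
def analyze(text):
    """Stand-in for the standard analyzer on Chinese text: one token per character."""
    return list(text)


def match_phrase(document, query):
    """Return True if the analyzed query tokens occur consecutively in the document."""
    doc_tokens = analyze(document)
    query_tokens = analyze(query)
    n = len(query_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


print(match_phrase("练习分词", "练习"))  # True: "练" and "习" are adjacent and in order
print(match_phrase("练习分词", "习练"))  # False: the tokens exist but in the wrong order
```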