Note: the IK plugin cannot be installed automatically with the default elasticsearch-plugin install xxx.zip command.
https://github.com/medcl/elasticsearch-analysis-ik/releases?after=v6.4.2 — pick the release matching your ES version.
docker exec -it elasticsearch /bin/bash
Enter the ES container; by default you land in the /usr/share/elasticsearch directory.
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v5.6.11/elasticsearch-analysis-ik-5.6.11.zip
unzip the downloaded file
rm -rf *.zip
mv elasticsearch/ /usr/share/elasticsearch/plugins/ik
To confirm the analyzer plugin is installed:
cd /usr/share/elasticsearch/bin
elasticsearch-plugin list
This lists the installed plugins.
Then restart Elasticsearch:
docker restart elasticsearch
If wget is slow, you can download the file on the host, copy it into the container, and unzip it there:
docker cp xxx.txt <container name or id>:/xxx/xxx/xxxx
(first argument: absolute path of the local file; second: destination path inside the container)
Index (create) a document:
PUT bank/external/1
{
"name": "John Doe"
}
Analyze with the default analyzer:
GET bank/_analyze
{
"text": "我是中国人"
}
Observe the result:
{
"tokens": [
{
"token": "我",
"start_offset": 0,
"end_offset": 1,
"type": "" ,
"position": 0
},
{
"token": "是",
"start_offset": 1,
"end_offset": 2,
"type": "" ,
"position": 1
},
{
"token": "中",
"start_offset": 2,
"end_offset": 3,
"type": "" ,
"position": 2
},
{
"token": "国",
"start_offset": 3,
"end_offset": 4,
"type": "" ,
"position": 3
},
{
"token": "人",
"start_offset": 4,
"end_offset": 5,
"type": "" ,
"position": 4
}
]
}
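Using the response above as sample data, a minimal Python sketch (the helper name `tokens_of` is ours, not part of Elasticsearch) shows how to pull just the token strings out of an _analyze response:

```python
import json

# Sample _analyze response for "我是中国人" with the default (standard) analyzer,
# abbreviated to the fields we need.
response = json.loads("""
{
  "tokens": [
    {"token": "我", "start_offset": 0, "end_offset": 1, "position": 0},
    {"token": "是", "start_offset": 1, "end_offset": 2, "position": 1},
    {"token": "中", "start_offset": 2, "end_offset": 3, "position": 2},
    {"token": "国", "start_offset": 3, "end_offset": 4, "position": 3},
    {"token": "人", "start_offset": 4, "end_offset": 5, "position": 4}
  ]
}
""")

def tokens_of(analyze_response):
    """Extract the token strings from an _analyze response body."""
    return [t["token"] for t in analyze_response["tokens"]]

# The default analyzer splits each CJK character into its own token.
print(tokens_of(response))  # ['我', '是', '中', '国', '人']
```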
Analyze with the ik_smart analyzer:
GET bank/_analyze
{ "analyzer": "ik_smart",
"text": "我是中国人"
}
Observe the result:
{
"tokens": [
{
"token": "我",
"start_offset": 0,
"end_offset": 1,
"type": "CN_CHAR",
"position": 0
},
{
"token": "是",
"start_offset": 1,
"end_offset": 2,
"type": "CN_CHAR",
"position": 1
},
{
"token": "中国人",
"start_offset": 2,
"end_offset": 5,
"type": "CN_WORD",
"position": 2
}
]
}
Another analyzer: ik_max_word
GET bank/_analyze
{ "analyzer": "ik_max_word",
"text": "我是中国人"
}
Observe the result:
{
"tokens": [
{
"token": "我",
"start_offset": 0,
"end_offset": 1,
"type": "CN_CHAR",
"position": 0
},
{
"token": "是",
"start_offset": 1,
"end_offset": 2,
"type": "CN_CHAR",
"position": 1
},
{
"token": "中国人",
"start_offset": 2,
"end_offset": 5,
"type": "CN_WORD",
"position": 2
},
{
"token": "中国",
"start_offset": 2,
"end_offset": 4,
"type": "CN_WORD",
"position": 3
},
{
"token": "国人",
"start_offset": 3,
"end_offset": 5,
"type": "CN_WORD",
"position": 4
}
]
}
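Putting the three results side by side, a quick offline check using exactly the token lists shown above:

```python
# Token lists copied from the three _analyze responses above.
standard = ["我", "是", "中", "国", "人"]            # default (standard) analyzer
ik_smart = ["我", "是", "中国人"]                    # coarsest split
ik_max_word = ["我", "是", "中国人", "中国", "国人"]  # exhaustive split with overlaps

# ik_smart yields the fewest tokens; ik_max_word additionally emits
# overlapping sub-words ("中国", "国人"), so it covers everything ik_smart found.
assert set(ik_smart) <= set(ik_max_word)
print(len(standard), len(ik_smart), len(ik_max_word))  # 5 3 5
```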
As you can see, different analyzers tokenize the same text very differently. So from now on, when defining a type, don't rely on the default mapping; build the mapping by hand so that you can choose the analyzer for each field.
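For example, a hand-built mapping that selects IK for a text field might look like this (a sketch only: the index, type, and field names are illustrative, and the exact mapping syntax varies across ES versions — pre-7.x versions take a type name as shown here):

```
PUT my_index
{
  "mappings": {
    "doc": {
      "properties": {
        "content": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_smart"
        }
      }
    }
  }
}
```

A common pattern is to index with ik_max_word (so every sub-word is searchable) while searching with ik_smart (so queries are split coarsely).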
See also: https://github.com/pipizhang/docker-elasticsearch-analysis-ik