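With the installation from the previous post done, ElasticSearch 6.2.4 is up and running. Next we install the IK analyzer.
Here is the version compatibility table (from the GitHub page):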
IK version | ES version |
---|---|
master | 6.x -> master |
6.2.4 | 6.2.4 |
6.1.3 | 6.1.3 |
5.6.8 | 5.6.8 |
5.5.3 | 5.5.3 |
5.4.3 | 5.4.3 |
5.3.3 | 5.3.3 |
5.2.2 | 5.2.2 |
5.1.2 | 5.1.2 |
1.10.6 | 2.4.6 |
1.9.5 | 2.3.5 |
1.8.1 | 2.2.1 |
1.7.0 | 2.1.1 |
1.5.0 | 2.0.0 |
1.2.6 | 1.0.0 |
1.2.5 | 0.90.x |
1.1.3 | 0.20.x |
1.0.0 | 0.16.2 -> 0.19.0 |
https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.4/elasticsearch-analysis-ik-6.2.4.zip
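The download link above follows a simple pattern, and for the 5.x and 6.x rows of the table the IK version is the same as the ES version. As a minimal sketch under that assumption, the helper below (`ik_download_url` is a name made up for this example) builds the release URL from the ES version. It does not apply to the older 0.x, 1.x and 2.x rows, where the two version numbers differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the IK release URL for a given ES version.
# Assumption: IK version == ES version (true for the 5.x/6.x rows of the table)
# and the URL layout matches the 6.2.4 link shown above.
ik_download_url() {
  local es_version="$1"
  local base="https://github.com/medcl/elasticsearch-analysis-ik/releases/download"
  printf '%s/v%s/elasticsearch-analysis-ik-%s.zip\n' "$base" "$es_version" "$es_version"
}

ik_download_url 6.2.4
```

Checking that the printed URL actually exists before installing is still a good idea, since not every ES release necessarily has a matching IK release.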
[payment@localhost elasticsearch-6.2.4]$ cd plugins/
[payment@localhost plugins]$ pwd
/home/payment/elasticSearch/elasticsearch-6.2.4/plugins
[payment@localhost plugins]$ unzip elasticsearch-analysis-ik-6.2.4.zip
[payment@gameServer elasticsearch-6.2.4]$ pwd
/home/payment/elasticSearch/elasticsearch-6.2.4
[payment@gameServer elasticsearch-6.2.4]$
[payment@gameServer elasticsearch-6.2.4]$ ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.4/elasticsearch-analysis-ik-6.2.4.zip
-> Downloading https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.2.4/elasticsearch-analysis-ik-6.2.4.zip
[=================================================] 100%
-> Installed analysis-ik
[payment@gameServer elasticsearch-6.2.4]$
[payment@gameServer elasticsearch-6.2.4]$ ps -ef|grep elasticsearch
payment 27352 1 0 10:50 pts/0 00:00:39 /usr/local/java/jdk1.8.0_161//bin/java -Xms1g -Xmx1g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.io.tmpdir=/tmp/elasticsearch.oFTj99LA -XX:+HeapDumpOnOutOfMemoryError -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:logs/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=32 -XX:GCLogFileSize=64m -Des.path.home=/home/payment/elasticSearch/elasticsearch-6.2.4 -Des.path.conf=/home/payment/elasticSearch/elasticsearch-6.2.4/config -cp /home/payment/elasticSearch/elasticsearch-6.2.4/lib/* org.elasticsearch.bootstrap.Elasticsearch -d
payment 29017 26594 0 13:10 pts/0 00:00:00 grep elasticsearch
[payment@gameServer elasticsearch-6.2.4]$
[payment@gameServer elasticsearch-6.2.4]$
[payment@gameServer elasticsearch-6.2.4]$ kill -9 27352
[payment@gameServer elasticsearch-6.2.4]$ pwd
/home/payment/elasticSearch/elasticsearch-6.2.4
[payment@gameServer elasticsearch-6.2.4]$ ./bin/elasticsearch -d && tail -f logs/elasticsearch.log
[2018-06-06T13:12:28,029][INFO ][o.e.d.DiscoveryModule ] [SdEluaQ] using discovery type [zen]
[2018-06-06T13:12:28,536][INFO ][o.e.n.Node ] initialized
[2018-06-06T13:12:28,536][INFO ][o.e.n.Node ] [SdEluaQ] starting ...
[2018-06-06T13:12:28,711][INFO ][o.e.t.TransportService ] [SdEluaQ] publish_address {172.17.63.15:9300}, bound_addresses {172.17.63.15:9300}
[2018-06-06T13:12:28,721][INFO ][o.e.b.BootstrapChecks ] [SdEluaQ] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2018-06-06T13:12:31,765][INFO ][o.e.c.s.MasterService ] [SdEluaQ] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{IYnq99tLTjKcjGSXoxTS5w}{172.17.63.15}{172.17.63.15:9300}
[2018-06-06T13:12:31,769][INFO ][o.e.c.s.ClusterApplierService] [SdEluaQ] new_master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{IYnq99tLTjKcjGSXoxTS5w}{172.17.63.15}{172.17.63.15:9300}, reason: apply cluster state (from master [master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{IYnq99tLTjKcjGSXoxTS5w}{172.17.63.15}{172.17.63.15:9300} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2018-06-06T13:12:31,782][INFO ][o.e.h.n.Netty4HttpServerTransport] [SdEluaQ] publish_address {172.17.63.15:9200}, bound_addresses {172.17.63.15:9200}
[2018-06-06T13:12:31,782][INFO ][o.e.n.Node ] [SdEluaQ] started
[2018-06-06T13:12:31,921][INFO ][o.e.g.GatewayService ] [SdEluaQ] recovered [0] indices into cluster_state
[2018-06-06T13:13:42,980][INFO ][o.e.n.Node ] [] initializing ...
[2018-06-06T13:13:43,141][INFO ][o.e.e.NodeEnvironment ] [SdEluaQ] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [402.8gb], net total_space [442.7gb], types [rootfs]
[2018-06-06T13:13:43,141][INFO ][o.e.e.NodeEnvironment ] [SdEluaQ] heap size [990.7mb], compressed ordinary object pointers [true]
[2018-06-06T13:13:43,143][INFO ][o.e.n.Node ] node name [SdEluaQ] derived from node ID [SdEluaQkTfi1p-yRtlxHSA]; set [node.name] to override
[2018-06-06T13:13:43,143][INFO ][o.e.n.Node ] version[6.2.4], pid[29196], build[ccec39f/2018-04-12T20:37:28.497551Z], OS[Linux/2.6.32-696.28.1.el6.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_161/25.161-b12]
[2018-06-06T13:13:43,143][INFO ][o.e.n.Node ] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch.vXQsyXAG, -XX:+HeapDumpOnOutOfMemoryError, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -Xloggc:logs/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=32, -XX:GCLogFileSize=64m, -Des.path.home=/home/payment/elasticSearch/elasticsearch-6.2.4, -Des.path.conf=/home/payment/elasticSearch/elasticsearch-6.2.4/config]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService ] [SdEluaQ] loaded module [aggs-matrix-stats]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService ] [SdEluaQ] loaded module [analysis-common]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService ] [SdEluaQ] loaded module [ingest-common]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService ] [SdEluaQ] loaded module [lang-expression]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService ] [SdEluaQ] loaded module [lang-mustache]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService ] [SdEluaQ] loaded module [lang-painless]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService ] [SdEluaQ] loaded module [mapper-extras]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService ] [SdEluaQ] loaded module [parent-join]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService ] [SdEluaQ] loaded module [percolator]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService ] [SdEluaQ] loaded module [rank-eval]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService ] [SdEluaQ] loaded module [reindex]
[2018-06-06T13:13:43,782][INFO ][o.e.p.PluginsService ] [SdEluaQ] loaded module [repository-url]
[2018-06-06T13:13:43,783][INFO ][o.e.p.PluginsService ] [SdEluaQ] loaded module [transport-netty4]
[2018-06-06T13:13:43,783][INFO ][o.e.p.PluginsService ] [SdEluaQ] loaded module [tribe]
[2018-06-06T13:13:43,783][INFO ][o.e.p.PluginsService ] [SdEluaQ] loaded plugin [analysis-ik]
[2018-06-06T13:13:46,137][INFO ][o.e.d.DiscoveryModule ] [SdEluaQ] using discovery type [zen]
[2018-06-06T13:13:46,605][INFO ][o.e.n.Node ] initialized
[2018-06-06T13:13:46,605][INFO ][o.e.n.Node ] [SdEluaQ] starting ...
[2018-06-06T13:13:46,770][INFO ][o.e.t.TransportService ] [SdEluaQ] publish_address {172.17.63.15:9300}, bound_addresses {172.17.63.15:9300}
[2018-06-06T13:13:46,778][INFO ][o.e.b.BootstrapChecks ] [SdEluaQ] bound or publishing to a non-loopback address, enforcing bootstrap checks
[2018-06-06T13:13:49,828][INFO ][o.e.c.s.MasterService ] [SdEluaQ] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{OJnGIoaBRDaK0mBJRTarMQ}{172.17.63.15}{172.17.63.15:9300}
[2018-06-06T13:13:49,835][INFO ][o.e.c.s.ClusterApplierService] [SdEluaQ] new_master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{OJnGIoaBRDaK0mBJRTarMQ}{172.17.63.15}{172.17.63.15:9300}, reason: apply cluster state (from master [master {SdEluaQ}{SdEluaQkTfi1p-yRtlxHSA}{OJnGIoaBRDaK0mBJRTarMQ}{172.17.63.15}{172.17.63.15:9300} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
[2018-06-06T13:13:49,853][INFO ][o.e.h.n.Netty4HttpServerTransport] [SdEluaQ] publish_address {172.17.63.15:9200}, bound_addresses {172.17.63.15:9200}
[2018-06-06T13:13:49,861][INFO ][o.e.n.Node ] [SdEluaQ] started
[2018-06-06T13:13:49,973][INFO ][o.e.g.GatewayService ] [SdEluaQ] recovered [0] indices into cluster_state
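The transcript above stops the old node with `kill -9`, which sends SIGKILL and gives the JVM no chance to shut down cleanly. A gentler option is to send SIGTERM first and wait for the process to exit. The sketch below shows one way to do that; `stop_gracefully` and its default 30-second timeout are assumptions for this example, not part of Elasticsearch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: ask a process to exit with SIGTERM, then wait for it.
# Returns 0 once the process is gone, 1 if it is still running after the timeout.
stop_gracefully() {
  local pid="$1"
  local timeout="${2:-30}"   # seconds to wait; 30 is an arbitrary default
  local waited=0

  # If the signal cannot be delivered, the process is already gone
  # (or belongs to another user); nothing more to do here.
  kill -TERM "$pid" 2>/dev/null || return 0

  # kill -0 sends no signal; it only checks whether the PID still exists.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Usage with the PID from the ps output above:
# stop_gracefully 27352 || echo "still running, consider kill -9 as a last resort"
```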
Start Elasticsearch and watch the startup log:
loaded plugin [analysis-ik]
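Instead of scanning the `tail -f` output by eye, the same check can be scripted. This is a small sketch, with `plugin_loaded` being a made-up helper name; it simply searches the log file for the line shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the log contains "loaded plugin [<name>]".
# -q suppresses output, -F treats the pattern as a fixed string so the
# square brackets are matched literally.
plugin_loaded() {
  local log_file="$1"
  local plugin="$2"
  grep -qF "loaded plugin [$plugin]" "$log_file"
}

# Usage, run from the Elasticsearch home directory used in the transcript:
# plugin_loaded logs/elasticsearch.log analysis-ik && echo "IK plugin loaded"
```

Note that the log file can also hold entries from earlier starts, so a match only shows that the plugin was loaded at some point recorded in that file.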
Check tokenization:
[root@gameServer ~]# curl -XGET 'http://172.17.63.15:9200/_analyze?pretty' -H 'Content-Type: application/json' -d'
{
"analyzer": "ik_smart",
"text": "听说看这篇博客的哥们最帅、姑娘最美"
}'
{
"tokens" : [
{
"token" : "听说",
"start_offset" : 0,
"end_offset" : 2,
"type" : "CN_WORD",
"position" : 0
},
{
"token" : "看",
"start_offset" : 2,
"end_offset" : 3,
"type" : "CN_CHAR",
"position" : 1
},
{
"token" : "这篇",
"start_offset" : 3,
"end_offset" : 5,
"type" : "CN_WORD",
"position" : 2
},
{
"token" : "博客",
"start_offset" : 5,
"end_offset" : 7,
"type" : "CN_WORD",
"position" : 3
},
{
"token" : "的",
"start_offset" : 7,
"end_offset" : 8,
"type" : "CN_CHAR",
"position" : 4
},
{
"token" : "哥们",
"start_offset" : 8,
"end_offset" : 10,
"type" : "CN_WORD",
"position" : 5
},
{
"token" : "最",
"start_offset" : 10,
"end_offset" : 11,
"type" : "CN_CHAR",
"position" : 6
},
{
"token" : "帅",
"start_offset" : 11,
"end_offset" : 12,
"type" : "CN_CHAR",
"position" : 7
},
{
"token" : "姑娘",
"start_offset" : 13,
"end_offset" : 15,
"type" : "CN_WORD",
"position" : 8
},
{
"token" : "最美",
"start_offset" : 15,
"end_offset" : 17,
"type" : "CN_WORD",
"position" : 9
}
]
}
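For a quick look at how the text was split, the token strings can be pulled out of the pretty-printed response. The sketch below assumes the `?pretty` layout shown above, with one `"token" : "..."` entry per line; `list_tokens` is a made-up helper name, and this is a line-based filter rather than a real JSON parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a pretty-printed _analyze response on stdin
# and print one token per line.
# Assumption: each token sits on its own line as  "token" : "value",
list_tokens() {
  sed -n 's/^[[:space:]]*"token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*$/\1/p'
}

# Usage: append "| list_tokens" to the curl command above.
```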
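Explanation (source: GitHub):
What is the difference between ik_max_word and ik_smart?
ik_max_word: splits the text at the finest granularity. For example, “中华人民共和国国歌” is split into “中华人民共和国, 中华人民, 中华, 华人, 人民共和国, 人民, 人, 民, 共和国, 共和, 和, 国国, 国歌”, exhausting every possible combination;
ik_smart: splits the text at the coarsest granularity. For example, “中华人民共和国国歌” is split into “中华人民共和国, 国歌”.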
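To see the difference on a live node, the same `_analyze` request can be sent once per analyzer. The sketch below only builds the request body; `analyze_body` is a made-up helper name, and it assumes the text contains no double quotes or backslashes, since it does no JSON escaping.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the JSON body for an _analyze request.
# Assumption: neither argument contains double quotes or backslashes.
analyze_body() {
  local analyzer="$1"
  local text="$2"
  printf '{"analyzer": "%s", "text": "%s"}\n' "$analyzer" "$text"
}

# Usage against the node from the transcript (the address is specific to that setup):
# for a in ik_smart ik_max_word; do
#   curl -s -XGET 'http://172.17.63.15:9200/_analyze?pretty' \
#        -H 'Content-Type: application/json' \
#        -d "$(analyze_body "$a" '中华人民共和国国歌')"
# done
```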