Common Elasticsearch Maintenance Issues (continuously updated)

1 An index has unassigned shards

1) Find the unassigned shards

curl 'xxx/_cat/shards?v' | grep UNASSIGNED

2) Check why the shard is unassigned

curl -XGET 'http://xxx/_cluster/allocation/explain?pretty' -d '{
    "index": "index_name",
    "shard": 0,
    "primary": true
}'

3) Handle it according to the cause. Three cases so far:

a. A disk on one of the nodes is full (move the shard by hand to a machine with a larger disk; prefer large-disk machines)
b. Allocation rules forbid the assignment (adjust the allocation rules)
c. The retry limit was reached and the shard still cannot be assigned (retry by hand: curl -XPOST 'xxx/_cluster/reroute?retry_failed&pretty')
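
For case a, the shard can be relocated by hand with the reroute API. A sketch; the index name, shard number, and the node names node_full and node_big are placeholders to fill in from _cat/shards and _cat/nodes:

```shell
# Move shard 0 of index_name off the full node onto one with free disk.
# node_full and node_big are hypothetical node names; look yours up first.
curl -XPOST 'http://xxx/_cluster/reroute?pretty' -H 'Content-Type: application/json' -d '{
  "commands": [
    { "move": {
        "index": "index_name",
        "shard": 0,
        "from_node": "node_full",
        "to_node": "node_big"
    } }
  ]
}'
```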

2 Decommissioning a machine

1) Wrong: an array is not accepted

curl -XPUT 'http://xxx/_cluster/settings?pretty' -d'{
    "transient": {
        "cluster.routing.allocation.exclude._ip": ["ip1","ip2","ip3"]
    }
}'

2) Wrong: no whitespace is allowed after the commas

curl -XPUT 'http://xxx/_cluster/settings?pretty' -d'{
    "transient": {
        "cluster.routing.allocation.exclude._ip": "ip1, ip2, ip3"
    }
}'

3) Correct:

curl -XPUT 'http://xxx/_cluster/settings?pretty' -d'{
    "transient": {
        "cluster.routing.allocation.exclude._ip": "ip1,ip2,ip3"
    }
}'
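
Once the correct form is accepted, ES drains shards off the excluded machine on its own; progress can be watched like this (ip1 as in the example above):

```shell
# Shards remaining on the excluded machine; the node can be stopped once this hits 0
curl -s 'http://xxx/_cat/shards' | grep ip1 | wc -l
```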

3 ES node fails to start after a sudden machine reboot

This is usually caused by an empty state file. Check the logs to find the path of the state file, then delete it.

Error injecting constructor, ElasticsearchException[java.io.IOException: failed to read [id:2, legacy:false, file:/data2/search/data/nodes/0/indices/QNpDowX_TwiIiqZlB9e92g/_state/state-2.st]]; nested: IOException[failed to read [id:2, legacy:false, file:/data2/search/data/nodes/0/indices/QNpDowX_TwiIiqZlB9e92g/_state/state-2.st]]; nested: IllegalStateException[class org.apache.lucene.store.BufferedChecksumIndexInput cannot seek backwards (pos=-16 getFilePointer()=0)];
  at org.elasticsearch.gateway.GatewayMetaState.<init>(Unknown Source)
  while locating org.elasticsearch.gateway.GatewayMetaState
Caused by: ElasticsearchException[java.io.IOException: failed to read [id:2, legacy:false, file:/data2/search/data/nodes/0/indices/QNpDowX_TwiIiqZlB9e92g/_state/state-2.st]]; nested: IOException[failed to read [id:2, legacy:false, file:/data2/search/data/nodes/0/indices/QNpDowX_TwiIiqZlB9e92g/_state/state-2.st]]; nested: IllegalStateException[class org.apache.lucene.store.BufferedChecksumIndexInput cannot seek backwards (pos=-16 getFilePointer()=0)];

Find the empty files and delete them:

find /data*/search/data/nodes/0/indices/ | grep state | grep "\.st" | xargs ls -l | awk '{if($5==0)print $0}' 
-rw-rw-r-- 1 search search    0 Oct  1 22:53 /data2/search/data/nodes/0/indices/QNpDowX_TwiIiqZlB9e92g/_state/state-2.st
-rw-rw-r-- 1 search search    0 Oct  1 22:53 /data3/search/data/nodes/0/indices/QNpDowX_TwiIiqZlB9e92g/_state/state-2.st
-rw-rw-r-- 1 search search    0 Oct  1 22:53 /data4/search/data/nodes/0/indices/QNpDowX_TwiIiqZlB9e92g/_state/state-1.st
-rw-rw-r-- 1 search search    0 Oct  1 22:53 /data4/search/data/nodes/0/indices/QNpDowX_TwiIiqZlB9e92g/_state/state-2.st
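
Instead of parsing ls output, find can test the file size itself. A sketch for listing and then removing the zero-byte state files (with the node stopped, and only after reviewing the list):

```shell
# List zero-byte .st state files under every data path
find /data*/search/data/nodes/0/indices/ -name '*.st' -size 0c -print
# Once the list looks right, delete them and restart the node
find /data*/search/data/nodes/0/indices/ -name '*.st' -size 0c -delete
```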

4 Query rejected: shard count over the limit

"reason" : "Trying to query 1344 shards, which is over the limit of 1000. This limit exists because querying 
many shards at the same time can make the job of the coordinating node very CPU and/or memory 
intensive. It is usually a better idea to have a smaller number of larger shards. Update 
[action.search.shard_count.limit] to a greater value if you really want to query that many shards at the same time."

Raise the limit in the cluster settings:

curl -u admin:admin -XPUT 'https://localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d' 
{
    "persistent" : {
        "action.search.shard_count.limit" : "1500"
    }
}
'
or
curl -u admin:admin -XPUT 'https://localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d' 
{
    "transient" : {
        "action.search.shard_count.limit" : "1500"
    }
}
'

persistent: remains in effect after a cluster restart
transient: lost after a restart
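
To check which level a setting currently lives at, read the settings back; the response separates the persistent and transient blocks:

```shell
curl -u admin:admin -XGET 'https://localhost:9200/_cluster/settings?pretty'
```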

5 The standard tokenizer does not split two alphanumeric strings joined by a comma

I developed a custom tokenizer, a. One day I noticed that queries using the standard tokenizer returned about half as many results as a did. To compare how different documents were tokenized:

curl -XGET 'xx/xx/xx/AV9zRnrjq2szramqtpAT/_termvector?fields=strdescription&pretty=true'

The standard tokenizer failed to split such strings into the expected terms. For example:

curl -XPOST 'xxx/_analyze?pretty' -H 'Content-Type: application/json' -d'
{
  "tokenizer": "standard",
  "text": "2d711b09bd0db0ad240cc83b30dd8014,2d711b09bd0db0ad240cc83b30dd8014,2d711b09bd0db0ad240cc83b30dd8014,2d711b09bd0db0ad240cc83b30dd8014"
}
'

For a string like this we expect a split on the commas, but with the standard tokenizer the whole string comes back as a single term. The likely cause: each comma sits between digits. The standard tokenizer follows the Unicode word-segmentation rules, which treat a comma flanked by digits as a numeric separator (as in 10,000) rather than a word boundary, so the string is never split.
Workaround: replace the commas with spaces.
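
As an alternative to rewriting the source data, the comma can be mapped to a space at analysis time with a mapping char_filter. A sketch against the _analyze API (host is a placeholder; the same char_filter would also have to be added to the index's analyzer for search and indexing to agree):

```shell
curl -XPOST 'http://xxx/_analyze?pretty' -H 'Content-Type: application/json' -d'
{
  "tokenizer": "standard",
  "char_filter": [
    { "type": "mapping", "mappings": [ ", => \\u0020" ] }
  ],
  "text": "2d711b09bd0db0ad240cc83b30dd8014,2d711b09bd0db0ad240cc83b30dd8014"
}
'
```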
