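1. Check the setting recover_after_data_nodes (full key: gateway.recover_after_data_nodes)
recover_after_data_nodes: 3 # number of data nodes
If this value exceeds the actual number of data nodes, active_shards_percent shows NaN%, requests to /_cat/shards fail with the error "blocked by: [SERVICE_UNAVAILABLE/1/state not recovered /", the head plugin cannot connect to the ES cluster, and the cluster status is red.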
curl 'http://metron01:9200/_cat/health?v'
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
1534901569 09:32:49 metron green 3 2 2 1 0 0 0 0 - NaN%
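The comparison described in step 1 can be scripted. The sketch below is a hypothetical helper (the name check_recover_threshold is not from these notes or from Elasticsearch). It reads the `_cat/health?v` output on stdin, finds the node.data column from the header row, and compares it with the configured value passed as the first argument. It assumes the two-row layout shown above.

```shell
# Hypothetical helper, not part of Elasticsearch.
# $1    : value configured for recover_after_data_nodes
# stdin : output of `curl -s 'http://metron01:9200/_cat/health?v'`
check_recover_threshold() {
  awk -v want="$1" '
    # Header row: remember which column holds node.data.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "node.data") col = i; next }
    # Data row: compare the configured threshold with the live data node count.
    col {
      if (want + 0 > $col + 0) {
        print "too high: configured " want ", data nodes " $col
        bad = 1
      } else {
        print "ok: configured " want ", data nodes " $col
      }
    }
    END { exit bad + 0 }
  '
}

# Example (not run here):
# curl -s 'http://metron01:9200/_cat/health?v' | check_recover_threshold 3
```

The function exits non-zero when the configured value is larger than the number of data nodes, so it can be used as a guard before restarting a node.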
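2. Delete the files under the index data directory
Find the data directory configured in elasticsearch.yml:
data: "/home/opt/lmm/es_data"
# Empty this directory on every node (this deletes all index data stored there)
rm -rf /home/opt/lmm/es_data/*
Restart the service.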
3. Some shards are UNASSIGNED
# Find the index names and shard numbers that are UNASSIGNED
curl -s "http://localhost:9200/_cat/shards" | grep UNASSIGNED
snort_index_2018.08.17.15 2 p UNASSIGNED
snort_index_2018.08.17.15 2 r UNASSIGNED
snort_index_2018.08.17.15 1 p UNASSIGNED
snort_index_2018.08.17.15 1 r UNASSIGNED
snort_index_2018.08.17.15 3 p UNASSIGNED
snort_index_2018.08.17.15 3 r UNASSIGNED
snort_index_2018.08.17.15 0 p UNASSIGNED
snort_index_2018.08.17.15 0 r UNASSIGNED
# Index name: snort_index_2018.08.17.15; shard numbers: 0, 1, 2, 3
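Reading the index name and shard numbers off the grep output by eye gets tedious with many indices. As a minimal sketch, the hypothetical helper below (list_unassigned is an illustrative name) reduces the `_cat/shards` output to one "index shard" pair per unassigned shard number. It assumes the state is in the fourth column, as in the output above.

```shell
# Hypothetical helper, not part of Elasticsearch.
# stdin : output of `curl -s "http://localhost:9200/_cat/shards"`
# Prints "<index> <shard>" once per unassigned shard number
# (primary and replica rows collapse into a single line).
list_unassigned() {
  awk '$4 == "UNASSIGNED" { print $1, $2 }' | sort -u
}

# Example (not run here):
# curl -s "http://localhost:9200/_cat/shards" | list_unassigned
```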
Look up the IDs of the data nodes:
curl 'localhost:9200/_nodes/process?pretty'
{
"cluster_name" : "metron",
"nodes" : {
"I3_T2xI1RGCocy7sZcdg9w" : {
"name" : "metron03",
"transport_address" : "172.16.16.59:9300",
"host" : "172.16.16.59",
"ip" : "172.16.16.59",
"version" : "2.3.3",
"build" : "218bdf1",
"http_address" : "172.16.16.59:9200",
"attributes" : {
"master" : "false"
},
"process" : {
"refresh_interval_in_millis" : 1000,
"id" : 16002,
"mlockall" : false
}
},
"xOk36eUVQhSgAHYv6VT77Q" : {
"name" : "metron01",
"transport_address" : "172.16.16.57:9300",
"host" : "172.16.16.57",
"ip" : "172.16.16.57",
"version" : "2.3.3",
"build" : "218bdf1",
"http_address" : "172.16.16.57:9200",
"attributes" : {
"data" : "false",
"master" : "true"
},
"process" : {
"refresh_interval_in_millis" : 1000,
"id" : 25044,
"mlockall" : false
}
},
"ncgAeS6nQp-L0d3U0roVMA" : {
"name" : "metron02",
"transport_address" : "172.16.16.58:9300",
"host" : "172.16.16.58",
"ip" : "172.16.16.58",
"version" : "2.3.3",
"build" : "218bdf1",
"http_address" : "172.16.16.58:9200",
"attributes" : {
"master" : "false"
},
"process" : {
"refresh_interval_in_millis" : 1000,
"id" : 12653,
"mlockall" : false
}
}
}
}
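In the output above, the node ID is the key of each entry under "nodes", and metron01 is not a data node because its attributes contain "data" : "false". The sketch below pulls out the IDs of the remaining nodes with awk. It is a hypothetical helper (list_data_nodes is an illustrative name) that assumes the pretty-printed, one-key-per-line layout shown above, with "attributes" and "process" as the only nested objects. It is not a general JSON parser.

```shell
# Hypothetical helper, not part of Elasticsearch.
# stdin : output of `curl -s 'localhost:9200/_nodes/process?pretty'`
# Prints "<node id> <node name>" for every node not marked "data" : "false".
list_data_nodes() {
  awk '
    function emit() { if (id != "" && is_data) print id, name }
    # A line of the form  "key" : {  opens an object.
    /^[ \t]*"[^"]+"[ \t]*:[ \t]*[{][ \t]*$/ {
      key = $0
      sub(/^[ \t]*"/, "", key)
      sub(/".*$/, "", key)
      # Any object that is not a known wrapper is treated as a node entry.
      if (key != "nodes" && key != "attributes" && key != "process") {
        emit()
        id = key; name = ""; is_data = 1
      }
      next
    }
    /"name"[ \t]*:/ {
      name = $0
      sub(/^.*:[ \t]*"/, "", name)
      sub(/".*$/, "", name)
    }
    /"data"[ \t]*:[ \t]*"false"/ { is_data = 0 }
    END { emit() }
  '
}

# Example (not run here):
# curl -s 'localhost:9200/_nodes/process?pretty' | list_data_nodes
```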
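Run reroute for each unassigned shard number of the index:
# "index" = index name, "shard" = shard ID, "node" = data node ID
# (the notes are kept outside the request body, since JSON has no comment syntax)
curl -XPOST 'metron01:9200/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "snort_index_2018.08.17.15",
"shard" : 0,
"node" : "I3_T2xI1RGCocy7sZcdg9w",
"allow_primary" : true
}
}
]
}'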
curl -XPOST 'metron01:9200/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "snort_index_2018.08.17.15",
"shard" : 1,
"node" : "I3_T2xI1RGCocy7sZcdg9w",
"allow_primary" : true
}
}
]
}'
curl -XPOST 'metron01:9200/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "snort_index_2018.08.17.15",
"shard" : 2,
"node" : "I3_T2xI1RGCocy7sZcdg9w",
"allow_primary" : true
}
}
]
}'
curl -XPOST 'metron01:9200/_cluster/reroute' -d '{
"commands" : [ {
"allocate" : {
"index" : "snort_index_2018.08.17.15",
"shard" : 3,
"node" : "I3_T2xI1RGCocy7sZcdg9w",
"allow_primary" : true
}
}
]
}'
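The four requests above differ only in the shard number, so they can be generated in a loop. The following is a sketch with hypothetical helper names (reroute_body and reroute_shards are not from the original notes). It sends the same "allocate" command as above, one request per shard number. Per the Elasticsearch 2.x reroute documentation, allow_primary forces an empty primary to be allocated, so any data the shard held before is lost.

```shell
# Hypothetical helpers; the function names are illustrative.

# Build the reroute request body for one shard.
# $1 = index name, $2 = shard number, $3 = data node ID
reroute_body() {
  printf '{"commands":[{"allocate":{"index":"%s","shard":%s,"node":"%s","allow_primary":true}}]}\n' \
    "$1" "$2" "$3"
}

# POST one reroute request per shard number.
# $1 = host:port, $2 = index name, $3 = data node ID, remaining args = shard numbers
reroute_shards() {
  host="$1"; index="$2"; node="$3"
  shift 3
  for shard in "$@"; do
    curl -s -XPOST "$host/_cluster/reroute" -d "$(reroute_body "$index" "$shard" "$node")"
    echo
  done
}

# Example (not run here):
# reroute_shards metron01:9200 snort_index_2018.08.17.15 I3_T2xI1RGCocy7sZcdg9w 0 1 2 3
```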
If the following error appears:
{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "[allocate] allocation of [.kibana][0] on node {metron03}{WPJIoZ6CRTetEAH_j8nPBw}{172.16.16.59}{172.16.16.59:9300}{master=false} is not allowed, reason: [YES(primary is already active)][YES(allocation disabling is ignored)][YES(target node version [2.3.3] is same or newer than source node version [2.3.3])][YES(node passes include/exclude/require filters)][YES(total shard limit disabled: [index: -1, cluster: -1] <= 0)][YES(no allocation awareness enabled)][YES(enough disk for shard on node, free: [187.9gb])][YES(allocation disabling is ignored)][YES(shard not primary or relocation disabled)][YES(below shard recovery limit of [4])][NO(shard cannot be allocated on same node [WPJIoZ6CRTetEAH_j8nPBw] it already exists on)]"
}
],
"type": "illegal_argument_exception",
"reason": "[allocate] allocation of [.kibana][0] on node {metron03}{WPJIoZ6CRTetEAH_j8nPBw}{172.16.16.59}{172.16.16.59:9300}{master=false} is not allowed, reason: [YES(primary is already active)][YES(allocation disabling is ignored)][YES(target node version [2.3.3] is same or newer than source node version [2.3.3])][YES(node passes include/exclude/require filters)][YES(total shard limit disabled: [index: -1, cluster: -1] <= 0)][YES(no allocation awareness enabled)][YES(enough disk for shard on node, free: [187.9gb])][YES(allocation disabling is ignored)][YES(shard not primary or relocation disabled)][YES(below shard recovery limit of [4])][NO(shard cannot be allocated on same node [WPJIoZ6CRTetEAH_j8nPBw] it already exists on)]"
},
"status": 400
}
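The "reason" string lists the verdict of every allocation check as YES(...) or NO(...). In this error the only NO is that a copy of the shard already exists on the target node. As a small sketch, the hypothetical helper below (failed_deciders is an illustrative name) filters such a response down to its NO verdicts, which is easier than scanning the long line by eye. It assumes the verdict text contains no nested parentheses, as in the error above.

```shell
# Hypothetical helper, not part of Elasticsearch.
# stdin : the JSON error returned by _cluster/reroute
# Prints each distinct NO(...) verdict once (the reason appears twice in the response).
failed_deciders() {
  grep -o 'NO([^)]*)' | sort -u
}

# Example (not run here):
# curl -s -XPOST 'metron01:9200/_cluster/reroute' -d @body.json | failed_deciders
```

Here body.json stands for a file holding the reroute request body; the file name is only an example.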
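This can be handled by setting the number of replicas to 0, so that ES removes the extra replicas automatically, and then setting it back to 2: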
curl -XPUT http://metron01:9200/.kibana/_settings -d'{
"index" : {
"number_of_replicas" : 0
}
}'
Response: {"acknowledged":true}
curl -XPUT http://metron01:9200/.kibana/_settings -d'{
"index" : {
"number_of_replicas" : 2
}
}'
Response: {"acknowledged":true}
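The two settings updates can be wrapped into one step. This is a minimal sketch with hypothetical helper names (replicas_body and reset_replicas are illustrative). It sends the same two PUT requests as above, first with 0 replicas and then with the count to restore, and does not wait for the cluster to settle in between.

```shell
# Hypothetical helpers; the function names are illustrative.

# Build the index settings body for a given replica count ($1).
replicas_body() {
  printf '{"index":{"number_of_replicas":%s}}\n' "$1"
}

# Drop the replica count to 0, then restore it.
# $1 = host:port, $2 = index name, $3 = replica count to restore
reset_replicas() {
  for n in 0 "$3"; do
    curl -s -XPUT "http://$1/$2/_settings" -d "$(replicas_body "$n")"
    echo
  done
}

# Example (not run here):
# reset_replicas metron01:9200 .kibana 2
```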