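1. I brought our ES cluster online in the early hours of this morning. By morning, one index, .monitoring-es-6, was reporting a yellow health status. The cluster has only two nodes, and the index settings showed one replica shard listed under unassigned shards. Following advice I had seen, I set replicas to zero and then set the value back, but the shard still stayed unassigned.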
curl -XPUT localhost:9201/.monitoring-es-6-2018.06.07/_settings -H "Content-Type: application/json" -d '{
"index":{
"number_of_replicas":0
}
}'
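The command above only shows the first half of the "drop the replica, then ask for it again" attempt. A minimal sketch of both halves follows, assuming the same host and port as above; set_replicas and replica_body are hypothetical helper names for illustration, not part of Elasticsearch.

```shell
# Assumed endpoint, taken from the command above; adjust to your cluster.
ES_URL="${ES_URL:-localhost:9201}"

# Build the index settings body for a given replica count.
replica_body() {
  printf '{"index":{"number_of_replicas":%d}}' "$1"
}

# Apply a replica count to one index (hypothetical helper).
set_replicas() {
  curl -s -XPUT "$ES_URL/$1/_settings" \
    -H "Content-Type: application/json" \
    -d "$(replica_body "$2")"
}

# Usage: remove the replica, then request it again.
# set_replicas .monitoring-es-6-2018.06.07 0
# set_replicas .monitoring-es-6-2018.06.07 1
```

As it turned out, this does not help when no node is allowed to take the replica, which is why the next step looks at the allocation explanation.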
2. I eventually found the solution at https://blog.csdn.net/laoyang360/article/details/78443006
First, the explanation given online:
GET /_cluster/allocation/explain
{
"index" : "test_idx",
"shard" : 0,
"primary" : true,
"current_state" : "unassigned",
"unassigned_info" : {
"reason" : "INDEX_CREATED",
"at" : "2017-01-16T18:12:39.401Z",
"last_allocation_status" : "no"
},
"can_allocate" : "no",
"allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisions" : [
{
"node_id" : "tn3qdPdnQWuumLxVVjJJYQ",
"node_name" : "A",
"transport_address" : "127.0.0.1:9300",
"node_decision" : "no",
"weight_ranking" : 1,
"deciders" : [
{
"decider" : "filter",
"decision" : "NO",
"explanation" : "node matches index setting [index.routing.allocation.exclude.] filters [_name:\"A OR B\"]"
}
]
},
{
"node_id" : "qNgMCvaCSPi3th0mTcyvKQ",
"node_name" : "B",
"transport_address" : "127.0.0.1:9301",
"node_decision" : "no",
"weight_ranking" : 2,
"deciders" : [
{
"decider" : "filter",
"decision" : "NO",
"explanation" : "node matches index setting [index.routing.allocation.exclude.] filters [_name:\"A OR B\"]"
}
]
}
]
}
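The explain API describes shard 0, the first primary shard of the index test_idx. Because the index has just been created (see unassigned_info), the shard is still unassigned (see current_state). And because no node is permitted to hold the shard (see allocate_explanation), it cannot be allocated (see can_allocate). Looking at the per-node decisions (node_allocation_decisions), nodes A and B were both filtered out when the index was created, so the filter decider (the decider field) refuses to place the shard on A, as shown by 'node_decision' and spelled out in the decider's explanation; node B is rejected for the same reason. The explanation also names the setting that has to change to resolve the situation.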
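For the test_idx example, one way to act on that hint is to clear the exclude filter. The sketch below assumes the filter was set as index.routing.allocation.exclude._name (which is what the decider's explanation suggests) and that the cluster answers on localhost:9201 like mine; clear_name_exclude and clear_exclude_body are hypothetical helper names.

```shell
# Assumed endpoint; adjust to your cluster.
ES_URL="${ES_URL:-localhost:9201}"

# Setting a value to null resets that index setting to its default,
# which here means "no nodes are excluded by name".
clear_exclude_body() {
  printf '{"index.routing.allocation.exclude._name":null}'
}

# Remove the name-based exclude filter from one index (hypothetical helper).
clear_name_exclude() {
  curl -s -XPUT "$ES_URL/$1/_settings" \
    -H "Content-Type: application/json" \
    -d "$(clear_exclude_body)"
}

# Usage:
# clear_name_exclude test_idx
```

Once the filter is gone, the filter decider should no longer reject nodes A and B, and the shard can be allocated.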
Now for my own problem:
GET /_cluster/allocation/explain
{
"index": ".monitoring-es-6-2018.06.04",
"shard": 0,
"primary": false,
"current_state": "unassigned",
"unassigned_info": {
"reason": "CLUSTER_RECOVERED",
"at": "2018-06-08T03:46:36.906Z",
"last_allocation_status": "no_attempt"
},
"can_allocate": "no",
"allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
"node_allocation_decisions": [{
"node_id": "MQqrz6NrSM2JWwd-NbmMkA",
"node_name": "es-2",
"transport_address": "10.6.4.61:9301",
"node_attributes": {
"ml.machine_memory": "4141973504",
"ml.max_open_jobs": "20",
"ml.enabled": "true"
},
"node_decision": "no",
"deciders": [{
"decider": "disk_threshold",
"decision": "NO",
"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [12.928963360212343%]"
}]
}, {
"node_id": "Y4PejqNoTyy6lliQpYnJGA",
"node_name": "es-1",
"transport_address": "10.6.4.60:9301",
"node_attributes": {
"ml.machine_memory": "4141973504",
"ml.max_open_jobs": "20",
"ml.enabled": "true"
},
"node_decision": "no",
"deciders": [{
"decider": "same_shard",
"decision": "NO",
"explanation": "the shard cannot be allocated to the same node on which a copy of the shard already exists [[.monitoring-es-6-2018.06.04][0], node[Y4PejqNoTyy6lliQpYnJGA], [P], s[STARTED], a[id=dJcjOxkCS9WbA2CHTO6HDg]]"
}]
}]
}
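The cause is straightforward. es-2 is rejected because its disk usage is above the 85% low watermark (only about 12.9% of the disk is free, so about 87.1% is used), which stops ES from allocating the replica shard there. Freeing or adding disk space fixes this, and ES then allocates the replica and recovers the data on its own.
es-1 needs little explanation: it already holds the primary copy of this shard, and a replica can never be placed on the same node as its primary.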
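The disk check can be reproduced by hand. The sketch below is only an illustration under a few assumptions: the endpoint is localhost:9201 as earlier in this post, and above_watermark, show_disk_usage and set_low_watermark are hypothetical helper names. Changing the watermark is a temporary workaround at best; freeing or adding disk space is the real fix.

```shell
# Assumed endpoint; adjust to your cluster.
ES_URL="${ES_URL:-localhost:9201}"

# Succeeds (exit 0) when used disk, computed as 100 - free, exceeds the watermark.
# Arguments: free percentage, watermark percentage.
above_watermark() {
  awk -v free="$1" -v mark="$2" 'BEGIN { exit !((100 - free) > mark) }'
}

# Show per-node disk usage and shard counts.
show_disk_usage() {
  curl -s "$ES_URL/_cat/allocation?v"
}

# Temporarily change the low watermark, e.g. set_low_watermark 88%
set_low_watermark() {
  curl -s -XPUT "$ES_URL/_cluster/settings" \
    -H "Content-Type: application/json" \
    -d "{\"transient\":{\"cluster.routing.allocation.disk.watermark.low\":\"$1\"}}"
}

# Usage with the value reported for es-2:
# above_watermark 12.928963360212343 85 && echo "es-2 is over the low watermark"
```

With the free-space figure from the explain output, 100 - 12.93 gives roughly 87.07% used, which is above 85%, so the disk_threshold decider says NO.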