ElasticSearch 6.2.4 (16): The Ultimate Fix for a Red Elasticsearch Cluster Health Status

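1. I brought the ES cluster online in the early hours of this morning, and a few hours later found that a .monitoring-es-6 index had a health status of yellow. I only have two nodes. Checking the settings showed that one replica shard was unassigned. Following advice I had seen, I set replicas to zero and then set it back, but that made no difference.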

curl -XPUT "localhost:9201/.monitoring-es-6-2018.06.07/_settings" -H "Content-Type: application/json" -d '{
"index":{
"number_of_replicas":0
}
}'
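
The "set replicas to zero, then set it back" toggle can be sketched in Python roughly as follows. This is only a sketch: it assumes the same host and port as the curl command above, and it assumes the index originally had 1 replica, which this post does not state. `build_replica_request` and `set_replicas` are hypothetical helper names.

```python
import json
import urllib.request

ES_URL = "http://localhost:9201"  # assumption: same node the curl command targets


def build_replica_request(index, replicas, base_url=ES_URL):
    """Build (without sending) a PUT <index>/_settings request."""
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def set_replicas(index, replicas, base_url=ES_URL):
    """Send the settings update and return the parsed JSON response."""
    with urllib.request.urlopen(build_replica_request(index, replicas, base_url)) as resp:
        return json.load(resp)


# Build both steps of the toggle; nothing is sent until set_replicas() is called.
index = ".monitoring-es-6-2018.06.07"
for count in (0, 1):  # assumption: the index originally had 1 replica
    req = build_replica_request(index, count)
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Against a live cluster you would call `set_replicas(index, 0)` and then `set_replicas(index, 1)`. As described above, this toggle alone did not clear the yellow status in my case.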

2. I eventually found the solution at https://blog.csdn.net/laoyang360/article/details/78443006

First, the explanation found online:

GET /_cluster/allocation/explain
{
  "index" : "test_idx",
  "shard" : 0,
  "primary" : true,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "INDEX_CREATED", 
    "at" : "2017-01-16T18:12:39.401Z",
    "last_allocation_status" : "no"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",   
  "node_allocation_decisions" : [ 
    {
      "node_id" : "tn3qdPdnQWuumLxVVjJJYQ",
      "node_name" : "A", 
      "transport_address" : "127.0.0.1:9300",
      "node_decision" : "no",
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "filter",  
          "decision" : "NO", 
          "explanation" : "node matches index setting [index.routing.allocation.exclude.] filters [_name:\"A OR B\"]" 
        }
      ]
    },
    {
      "node_id" : "qNgMCvaCSPi3th0mTcyvKQ",
      "node_name" : "B", 
      "transport_address" : "127.0.0.1:9301",
      "node_decision" : "no",
      "weight_ranking" : 2,
      "deciders" : [
        {
          "decider" : "filter",
          "decision" : "NO",
          "explanation" : "node matches index setting [index.routing.allocation.exclude.] filters [_name:\"A OR B\"]"
        }
      ]
    }
  ]
}

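Here the explain API describes primary shard 0 of the index test_idx. Because the index was just created (see unassigned_info), the shard is still unassigned (see current_state). And because no node is permitted to hold the shard (see allocate_explanation), it cannot be allocated (see can_allocate). The per-node decisions (node_allocation_decisions) show why: nodes A and B were both excluded by a filter when the index was created, so the filter decider (see decider) decides that the shard may not be allocated on A (see 'node_decision'; the decider's explanation says the same), and likewise for B. The explanation also names the setting that has to be changed to get out of the current state.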
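
Reading the response this way (walk node_allocation_decisions, then each node's deciders) can be sketched as a small Python helper. The field names come from the response above; `summarize_explain` is a hypothetical name, and `sample` is a trimmed copy of that response that keeps only the fields the helper reads.

```python
def summarize_explain(explain):
    """Return one (node_name, decider, explanation) tuple per NO decision."""
    rows = []
    for node in explain.get("node_allocation_decisions", []):
        for decider in node.get("deciders", []):
            if decider.get("decision") == "NO":
                rows.append(
                    (node.get("node_name"), decider.get("decider"), decider.get("explanation"))
                )
    return rows


FILTER_EXPLANATION = (
    'node matches index setting [index.routing.allocation.exclude.] filters [_name:"A OR B"]'
)

# Trimmed copy of the explain response shown above.
sample = {
    "index": "test_idx",
    "shard": 0,
    "primary": True,
    "current_state": "unassigned",
    "can_allocate": "no",
    "node_allocation_decisions": [
        {
            "node_name": "A",
            "node_decision": "no",
            "deciders": [
                {"decider": "filter", "decision": "NO", "explanation": FILTER_EXPLANATION}
            ],
        },
        {
            "node_name": "B",
            "node_decision": "no",
            "deciders": [
                {"decider": "filter", "decision": "NO", "explanation": FILTER_EXPLANATION}
            ],
        },
    ],
}

for name, decider, why in summarize_explain(sample):
    print(f"{name}: blocked by {decider}: {why}")
```

The same helper should work on any explain response with this shape, since it only depends on the decider and decision fields.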

Now for my own problem:

GET /_cluster/allocation/explain
{
	"index": ".monitoring-es-6-2018.06.04",
	"shard": 0,
	"primary": false,
	"current_state": "unassigned",
	"unassigned_info": {
		"reason": "CLUSTER_RECOVERED",
		"at": "2018-06-08T03:46:36.906Z",
		"last_allocation_status": "no_attempt"
	},
	"can_allocate": "no",
	"allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
	"node_allocation_decisions": [{
		"node_id": "MQqrz6NrSM2JWwd-NbmMkA",
		"node_name": "es-2",
		"transport_address": "10.6.4.61:9301",
		"node_attributes": {
			"ml.machine_memory": "4141973504",
			"ml.max_open_jobs": "20",
			"ml.enabled": "true"
		},
		"node_decision": "no",
		"deciders": [{
			"decider": "disk_threshold",
			"decision": "NO",
			"explanation": "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [12.928963360212343%]"
		}]
	}, {
		"node_id": "Y4PejqNoTyy6lliQpYnJGA",
		"node_name": "es-1",
		"transport_address": "10.6.4.60:9301",
		"node_attributes": {
			"ml.machine_memory": "4141973504",
			"ml.max_open_jobs": "20",
			"ml.enabled": "true"
		},
		"node_decision": "no",
		"deciders": [{
			"decider": "same_shard",
			"decision": "NO",
			"explanation": "the shard cannot be allocated to the same node on which a copy of the shard already exists [[.monitoring-es-6-2018.06.04][0], node[Y4PejqNoTyy6lliQpYnJGA], [P], s[STARTED], a[id=dJcjOxkCS9WbA2CHTO6HDg]]"
		}]
	}]
}

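The cause is simple. On es-2, disk usage has gone above the 85% low watermark (only about 12.9% is free), so ES refuses to allocate the replica shard there. Freeing up or adding disk space is enough: once usage drops below the watermark, ES allocates the replica and recovers the data automatically.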
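
To make the arithmetic explicit, here is a rough Python sketch that pulls the two numbers out of the disk_threshold explanation above and compares them. The regular expressions assume the exact wording of that message, which may differ in other ES versions; `parse_disk_explanation` is a hypothetical helper name.

```python
import re

# The disk_threshold explanation for es-2, copied from the response above.
EXPLANATION = (
    "the node is above the low watermark cluster setting "
    "[cluster.routing.allocation.disk.watermark.low=85%], using more disk space "
    "than the maximum allowed [85.0%], actual free: [12.928963360212343%]"
)


def parse_disk_explanation(text):
    """Extract (low_watermark_percent, used_percent) from the explanation text."""
    low = re.search(r"watermark\.low=([\d.]+)%", text)
    free = re.search(r"actual free: \[([\d.]+)%\]", text)
    if not (low and free):
        raise ValueError("unrecognised explanation format")
    # Used space is whatever is not free.
    return float(low.group(1)), 100.0 - float(free.group(1))


low, used = parse_disk_explanation(EXPLANATION)
print(f"low watermark {low:.1f}%, used {used:.2f}%, over by {used - low:.2f} points")
# prints: low watermark 85.0%, used 87.07%, over by 2.07 points
```

So es-2 only needs to get back roughly two percentage points of disk before the replica can be allocated there.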

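As for es-1, it hardly needs explaining: it already holds the primary copy of this shard, and the same_shard decider never places a replica on the same node as its primary.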


