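Background:

A data node raised a disk-usage alert. The plan was to turn that data node into a master node and to turn a master node with spare disk space into a data node.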

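Operation:

In an ES 5.6.2 cluster, change the master and data role settings (node.master and node.data) in each node's configuration file.

At the time, three disks were all close to triggering the alert, and the cluster had three master nodes.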

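Converting the three data nodes to master nodes at the same time caused a problem:

One shard of an index (both the primary and its replica) became unassigned at the same time, which made that shard unavailable.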

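The cause:

The unassigned primary and replica copies were both stored on data nodes that had just been converted to master nodes. Once those nodes stopped acting as data nodes, the cluster could no longer find any copy of the shard's data, so it could neither promote a new primary nor create a replica.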
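The failure mode above can be checked for before any role change. The following is a minimal sketch, not part of the original procedure: it assumes the text comes from `_cat/shards?h=index,shard,prirep,state,node` (no header row), and the function name `shards_at_risk` and the node names `node0-1` and `node0-4` are hypothetical. It lists shards whose every assigned copy sits on the nodes you plan to convert.

```python
from collections import defaultdict


def shards_at_risk(cat_shards_text, nodes_to_convert):
    """Return (index, shard) pairs whose every assigned copy is on nodes_to_convert."""
    nodes_to_convert = set(nodes_to_convert)
    holders = defaultdict(set)  # (index, shard) -> names of nodes holding a copy
    for line in cat_shards_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # blank line, or a copy with no node (already unassigned)
        index, shard, _prirep, _state, node = fields[:5]
        holders[(index, int(shard))].add(node)
    # A shard is at risk when the set of nodes holding it is a subset
    # of the set of nodes that will stop being data nodes.
    return sorted(key for key, nodes in holders.items() if nodes <= nodes_to_convert)


# Hypothetical sample: columns are index, shard, prirep, state, node.
sample = """\
seed_record_new 0 p STARTED node0-1
seed_record_new 0 r STARTED node0-3
seed_record_new 1 p STARTED node0-3
seed_record_new 1 r STARTED node0-4
"""
at_risk = shards_at_risk(sample, {"node0-3", "node0-4"})
print(at_risk)
```

In this sample only shard 1 is reported, because shard 0 still has a copy on a node that keeps its data role.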

How to fix it:

Find a node that still holds the shard's data, set it back to a data node, then call the following allocate API:

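POST /_cluster/reroute
{
  "commands" : [
    {
      "allocate_stale_primary" : {
        "index" : "seed_record_new",
        "shard" : 1,
        "node" : "node0-3",
        "accept_data_loss" : true
      }
    }
  ]
}

ES 5.x rejects allocate_stale_primary unless accept_data_loss is set to true, because a stale copy may be missing recent writes.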
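When the reroute body is written by hand it is easy to drop a brace or a field. As a minimal sketch (the helper name `allocate_stale_primary_body` is hypothetical, and nothing here talks to a cluster), the body can be built from a dictionary and serialized with the standard `json` module. The `accept_data_loss` flag is included on the assumption that the target is ES 5.x, where the reroute API documentation requires it for this command.

```python
import json


def allocate_stale_primary_body(index, shard, node):
    """Build the JSON body for POST /_cluster/reroute with one allocate_stale_primary command."""
    command = {
        "allocate_stale_primary": {
            "index": index,
            "shard": shard,
            "node": node,
            # Assumption: ES 5.x requires explicit confirmation that the
            # stale copy may be missing recent writes.
            "accept_data_loss": True,
        }
    }
    return json.dumps({"commands": [command]}, indent=2)


body = allocate_stale_primary_body("seed_record_new", 1, "node0-3")
print(body)
```

The printed text can be pasted as the request body of the reroute call.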

While shard 1 is being allocated, you can check its state at http://ip:9200/_cat/shards.
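Checking the state by hand can be replaced with a small polling loop. This is a sketch under stated assumptions: `fetch_cat_shards` is a hypothetical callable that returns the text of `_cat/shards?h=index,shard,prirep,state` (no header row), and the attempt count and delay are arbitrary defaults. The demo below uses canned responses instead of a real cluster.

```python
import time


def shard_is_started(cat_shards_text, index, shard):
    """True when at least one copy of the shard is listed and all listed copies are STARTED."""
    states = []
    for line in cat_shards_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0] == index and fields[1] == str(shard):
            states.append(fields[3])
    return bool(states) and all(state == "STARTED" for state in states)


def wait_for_shard(fetch_cat_shards, index, shard, attempts=30, delay_seconds=2.0):
    """Poll until the shard is fully started; return False if attempts run out."""
    for _ in range(attempts):
        if shard_is_started(fetch_cat_shards(), index, shard):
            return True
        time.sleep(delay_seconds)
    return False


# Demo with canned responses standing in for successive _cat/shards calls.
responses = iter([
    "seed_record_new 1 p INITIALIZING\nseed_record_new 1 r UNASSIGNED\n",
    "seed_record_new 1 p STARTED\nseed_record_new 1 r STARTED\n",
])
done = wait_for_shard(lambda: next(responses), "seed_record_new", 1,
                      attempts=5, delay_seconds=0)
print(done)
```

Against a real cluster, `fetch_cat_shards` could wrap an HTTP GET of the `_cat/shards` URL, for example with `urllib.request.urlopen`.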

Once shard 1 is no longer unassigned, move the shard off node0-3 to another node, and only then convert node0-3 to a master node:

POST /_cluster/reroute
{
  "commands" : [
    {
      "move" : {
        "index" : "seed_record_new",
        "shard" : 1,
        "from_node" : "node0-3",
        "to_node" : "node0-6"
      }
    }
  ]
}

The explain API also shows how the shard's reallocation and relocation are progressing:

GET /_cluster/allocation/explain
{
  "index": "seed_record_new",
  "shard": 1,
  "primary": false
}

Result:

{
  "index": "seed_record_new",
  "shard": 1,
  "primary": false,
  "current_state": "initializing",
  "unassigned_info": {
    "reason": "PRIMARY_FAILED",
    "at": "2019-01-09T03:04:55.454Z",
    "details": "primary failed while replica initializing",
    "last_allocation_status": "no_attempt"
  },
  "current_node": {
    "id": "_M7olZoWRW-9_6cPvkob_A",
    "name": "node1-6",
    "transport_address": "[$ip]:9306",
    "attributes": {
      "ml.max_open_jobs": "10",
      "box_type": "hot",
      "ml.enabled": "true"
    }
  },
  "explanation": "the shard is in the process of initializing on node [node1-6], wait until initialization has completed"
}

Allocation and move progress can also be followed in Kibana under Monitoring -> Shard Activity.

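Lessons learned:

When converting data nodes to master nodes, change one node at a time, and wait until every shard is in the STARTED state before changing the next node.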
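One way to turn this rule into a gate is to look at the cluster health summary before touching the next node. This is a sketch, not the author's method: it assumes `health` is the parsed JSON of `GET /_cluster/health` (an API the notes above do not use), and the function name `safe_to_change_next_node` is hypothetical.

```python
def safe_to_change_next_node(health):
    """True only when the health summary shows no shard that is not fully started."""
    return (
        health.get("status") == "green"
        # A missing counter is treated as "unknown", so the check fails closed.
        and health.get("unassigned_shards") == 0
        and health.get("initializing_shards") == 0
        and health.get("relocating_shards") == 0
    )


# Hypothetical health summaries, trimmed to the fields the check reads.
recovering = {"status": "yellow", "unassigned_shards": 0,
              "initializing_shards": 1, "relocating_shards": 0}
settled = {"status": "green", "unassigned_shards": 0,
           "initializing_shards": 0, "relocating_shards": 0}

print(safe_to_change_next_node(recovering))
print(safe_to_change_next_node(settled))
```

Requiring zero relocating shards is stricter than a green status alone. It also keeps a node from being changed while a move command is still in flight.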
