Index Shard Allocation Strategy

Total shards per node

The cluster-level shard allocator tries to spread the shards of a single index across as many nodes as possible. However, depending on how many shards and indices you have, and how big they are, it may not always be possible to spread shards evenly.

The following dynamic setting allows you to specify a hard limit on the total number of shards from a single index allowed per node:

index.routing.allocation.total_shards_per_node
The maximum number of shards (replicas and primaries) that will be allocated to a single node. Defaults to unbounded.
You can also limit the amount of shards a node can have regardless of the index:

cluster.routing.allocation.total_shards_per_node
The maximum number of shards (replicas and primaries) that will be allocated to a single node globally. Defaults to unbounded (-1).
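Both settings are dynamic, so they can be changed on a live cluster through the settings APIs. A minimal sketch of setting the cluster-wide limit (host and the value 400 are placeholders; pick a limit that fits your cluster):

```shell
# Cap the total number of shards (of any index) per node, cluster-wide.
# -1 restores the default (unbounded).
curl -X PUT "localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.total_shards_per_node": 400
  }
}'
```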

The index.routing.allocation.total_shards_per_node setting controls how many shards of a single index may be allocated to the same node. The default is unbounded, so when you scale out with new nodes, many shards of one index can easily land on the same node. The right value depends on the number of nodes in the cluster and the total number of shards in the index (primaries plus replicas).
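The per-index limit is set through the index settings API. A minimal sketch (the index name my-index, the host, and the value 2 are placeholders):

```shell
# Allow at most 2 shards of this index (primaries + replicas) per node.
curl -X PUT "localhost:9200/my-index/_settings" \
  -H 'Content-Type: application/json' -d'
{
  "index.routing.allocation.total_shards_per_node": 2
}'
```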

For example, take 10 nodes and an index configured with 5 primaries + 5 replicas (10 shard copies in total).
Setting index.routing.allocation.total_shards_per_node: 1 guarantees that the index places exactly one shard on each node. This gives the most even data distribution, but it has a downside: if one node goes down, one shard can no longer be allocated anywhere and stays in the UNASSIGNED state.
Setting index.routing.allocation.total_shards_per_node: 2 may distribute the data less evenly than a limit of 1, but it tolerates the loss of one node, because the displaced shard can be reallocated to another node. Combining this setting with the shard balancing heuristics for cluster-wide balancing should give a good overall result.
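The trade-off above can be illustrated with a toy simulation. This is not Elasticsearch code: it assumes a simplified greedy allocator (place each shard copy on the least-loaded node still under the cap) and ignores the primary/replica same-node constraint, but it reproduces the UNASSIGNED behavior:

```python
def allocate(num_nodes, num_copies, cap):
    """Greedily place shard copies of one index onto nodes.

    Returns (per-node shard counts, number of unassigned copies).
    A copy is unassigned when every node already holds `cap` copies.
    """
    nodes = [0] * num_nodes
    unassigned = 0
    for _ in range(num_copies):
        # Only nodes still under the per-node cap are candidates.
        candidates = [i for i in range(num_nodes) if nodes[i] < cap]
        if not candidates:
            unassigned += 1          # nowhere to go -> stays UNASSIGNED
            continue
        target = min(candidates, key=lambda i: nodes[i])
        nodes[target] += 1
    return nodes, unassigned

# 10 nodes, 5 primaries + 5 replicas = 10 copies, cap = 1: perfectly even.
print(allocate(10, 10, 1))  # ([1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 0)

# One node lost (9 left), cap = 1: one copy cannot be placed.
print(allocate(9, 10, 1))   # ([1, 1, 1, 1, 1, 1, 1, 1, 1], 1)

# cap = 2 tolerates the node loss: all 10 copies fit on 9 nodes.
print(allocate(9, 10, 2))   # ([2, 1, 1, 1, 1, 1, 1, 1, 1], 0)
```

With cap = 1 the distribution is perfectly flat but has no slack; raising the cap to 2 trades a slightly uneven layout for resilience to a single node failure.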
