Elasticsearch Split and Shrink APIs

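Background:

The problem to solve: a single shard holds too many documents and exceeds the Lucene limit (2,147,483,519 documents per shard).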
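To get a feel for the numbers, the limit can be turned into a rough shard count. This is a minimal sketch, assuming documents are spread evenly across shards; `min_primary_shards` and its 50% headroom default are illustrative choices, not something Elasticsearch provides.

```python
import math

# Lucene's hard cap per index (one Elasticsearch shard): Integer.MAX_VALUE - 128.
LUCENE_MAX_DOCS = 2**31 - 1 - 128


def min_primary_shards(total_docs: int, headroom: float = 0.5) -> int:
    """Smallest shard count that keeps each shard under headroom * LUCENE_MAX_DOCS.

    Hypothetical helper; assumes an even spread of documents over the shards.
    """
    if total_docs < 0 or not 0 < headroom <= 1:
        raise ValueError("total_docs must be >= 0 and headroom in (0, 1]")
    per_shard_budget = int(LUCENE_MAX_DOCS * headroom)
    return max(1, math.ceil(total_docs / per_shard_budget))


print(LUCENE_MAX_DOCS)                    # 2147483519
print(min_primary_shards(3_000_000_000))  # 3
```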

 

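Analysis

1. This is usually log or OLAP data, so the index can simply be deleted and rebuilt

2. Keep the existing index and create a new one

  - Write new data to the new index, and include both old_index and new_index in queries

3. Try split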
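For option 2, a single search can target both indices through a comma-separated index list. This is a minimal sketch that only builds the request path; `search_path` is a hypothetical helper, and `old_index` / `new_index` are the placeholder names used in the list.

```python
def search_path(*indices: str) -> str:
    """Build a _search path covering several indices (comma-separated)."""
    if not indices:
        raise ValueError("at least one index name is required")
    return "/" + ",".join(indices) + "/_search"


print(search_path("old_index", "new_index"))  # /old_index,new_index/_search
```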

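Split index API

Use the split index API when you need to increase the number of primary shards of an existing index.

It creates a new index and leaves the original index in place.

Steps:

Block writes on the source index (it must be read-only for the split)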

PUT source_index/_settings
{
  "settings": {
    "index.blocks.write": true 
  }
}

Change the number of primary shards with the split API

POST source_index/_split/new_index
{
  "settings": {
    "index.number_of_shards": 10
  }
}

Monitor progress

GET _cat/recovery/new_index
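The three steps can be collected into one ordered plan. This is a minimal sketch that only builds the requests as (method, path, body) tuples and sends nothing; `split_plan` is a hypothetical helper, and the index names and shard count are whatever you pass in.

```python
import json


def split_plan(source: str, target: str, target_shards: int):
    """Return the requests for a split, in the order they should be sent."""
    return [
        # 1. Block writes on the source index.
        ("PUT", f"/{source}/_settings",
         {"settings": {"index.blocks.write": True}}),
        # 2. Split into the target index with more primary shards.
        ("POST", f"/{source}/_split/{target}",
         {"settings": {"index.number_of_shards": target_shards}}),
        # 3. Watch shard recovery of the target index.
        ("GET", f"/_cat/recovery/{target}", None),
    ]


for method, path, body in split_plan("source_index", "new_index", 10):
    print(method, path, json.dumps(body) if body is not None else "")
```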

Test

Version 7.17.5

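# Create a test index with 3 primary shards (the count reported in the errors below)
PUT test_split
{
  "settings": {
    "index.number_of_shards": 3
  }
}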

# Block writes on the source index
PUT /test_split/_settings
{
  "settings": {
    "index.blocks.write": true 
  }
}

# Run the split API
POST /test_split/_split/test_split_new
{
  "settings": {
    "index.number_of_shards": 12
  }
}

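Errors encountered while running the split API, and how they were resolved:

1. The source index must be read-only (set index.blocks.write to true before splitting)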

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_state_exception",
        "reason": "index test_split must be read-only to resize index. use \"index.blocks.write=true\""
      }
    ],
    "type": "illegal_state_exception",
    "reason": "index test_split must be read-only to resize index. use \"index.blocks.write=true\""
  },
  "status": 500
}



2. The number of source shards (3) must be a factor of the number of target shards (so the target cannot be 11, but it can be 12)

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "the number of source shards [3] must be a factor of [11]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "the number of source shards [3] must be a factor of [11]"
  },
  "status": 400
}
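The rule behind this error is plain divisibility, so it can be checked before calling the API. This is a minimal sketch; both helpers are hypothetical, and they cover only this one rule (the routing-shard constraint is checked separately by Elasticsearch).

```python
def is_multiple_of_source(source_shards: int, target_shards: int) -> bool:
    """True if target_shards is a larger multiple of source_shards."""
    return target_shards > source_shards and target_shards % source_shards == 0


def next_valid_split_target(source_shards: int, wanted: int) -> int:
    """Round `wanted` up to the nearest larger multiple of source_shards."""
    factor = max(2, -(-wanted // source_shards))  # ceiling division, at least x2
    return source_shards * factor


print(is_multiple_of_source(3, 11))    # False
print(is_multiple_of_source(3, 12))    # True
print(next_valid_split_target(3, 11))  # 12
```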




Application

Cluster version 6.8.5

After setting "index.blocks.write": true on the source index, the split API call failed:

{
  "error": {
    "root_cause": [
      {
        "type": "remote_transport_exception",
        "reason": "[es-log-all-2][10.xx.x.xx:9300][indices:admin/resize]"
      }
    ],
    "type": "illegal_state_exception",
    "reason": "the number of routing shards [5] must be a multiple of the target shards [20]"
  },
  "status": 500
}

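In other words, the number of primary shards of the target index must be a factor of index.number_of_routing_shards.

Note: number_of_routing_shards is a static setting. It can only be set when the index is created and cannot be changed afterwards.

Conclusion: on ES 6.8 the split API cannot fix an index with too few shards unless the index was created with a suitable index.number_of_routing_shards, because in 6.x the setting defaults to the number of primary shards.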
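Taken together, the two constraints determine which split targets an index can reach. This is a minimal sketch based only on the two rules quoted in the error messages; `valid_split_targets` is a hypothetical helper, and the example assumes the 6.8.5 source index has 5 primary shards (the error only reports 5 routing shards).

```python
from typing import List


def valid_split_targets(source_shards: int, routing_shards: int) -> List[int]:
    """Shard counts reachable by _split: larger multiples of the source shard
    count that also divide number_of_routing_shards."""
    return [
        target
        for target in range(2 * source_shards, routing_shards + 1, source_shards)
        if routing_shards % target == 0
    ]


# Assumed 6.8.5 case: 5 primary shards, 5 routing shards.
print(valid_split_targets(5, 5))   # []
# The same index, had it been created with number_of_routing_shards = 40.
print(valid_split_targets(5, 40))  # [10, 20, 40]
```

An empty list means no split is possible for that index, which matches the failure seen on 6.8.5.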

Official docs: Split index API | Elasticsearch Guide [8.9] | Elastic

Shrink index API

Use the shrink index API when you need to reduce the number of primary shards of an existing index.

It creates a new index and leaves the original index in place.

(Shrinks an existing index into a new index with fewer primary shards.)

POST /my-index-000001/_shrink/shrunk-my-index-000001

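Steps

# Create an index with 3 primary shards (the count reported in the error further down)
PUT test_shrink
{
  "settings": {
    "index.number_of_shards": 3
  }
}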

# Check which nodes hold the index's shards
GET _cat/shards/test_shrink?v

# Relocate all primary shards to a single node (node-es-0 here), set replicas to 0, and block writes
PUT test_shrink/_settings
{
  "settings": {
    "index.number_of_replicas": 0,
    "index.routing.allocation.require._name": "node-es-0",
    "index.blocks.write": true
  }
}
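Relocation takes time, so it is worth confirming that every shard has arrived on the chosen node before shrinking. This is a minimal sketch that inspects rows shaped like the output of GET _cat/shards/test_shrink?format=json (keys `shard`, `state`, `node`); `all_shards_on_node` is a hypothetical helper, and the sample rows are made up for illustration.

```python
def all_shards_on_node(rows, node_name: str, shard_count: int) -> bool:
    """True if a STARTED copy of every shard 0..shard_count-1 is on node_name."""
    present = {
        int(row["shard"])
        for row in rows
        if row.get("node") == node_name and row.get("state") == "STARTED"
    }
    return present == set(range(shard_count))


# Illustrative rows only, not real cluster output.
sample = [
    {"shard": "0", "prirep": "p", "state": "STARTED", "node": "node-es-0"},
    {"shard": "1", "prirep": "p", "state": "STARTED", "node": "node-es-0"},
    {"shard": "2", "prirep": "p", "state": "RELOCATING", "node": "node-es-1"},
]
print(all_shards_on_node(sample, "node-es-0", 3))  # False: shard 2 is still moving
```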

# Run the shrink API
POST /test_shrink/_shrink/new_test_shrink
{
  "settings": {
    "index.number_of_replicas": 1,
    "index.number_of_shards": 1, 
    "index.codec": "best_compression" 
  },
  "aliases": {
    "my_search_indices": {}
  }
}

If the command above is changed to:

POST /test_shrink/_shrink/new_test_shrink
{
  "settings": {
    "index.number_of_replicas": 1,
    "index.number_of_shards": 2, 
    "index.codec": "best_compression" 
  },
  "aliases": {
    "my_search_indices": {}
  }
}

The new number_of_shards is not a factor of the source index's number_of_shards, so the following error is returned:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "the number of source shards [3] must be a multiple of [2]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "the number of source shards [3] must be a multiple of [2]"
  },
  "status": 400
}
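For shrink the divisibility runs the other way: the target must be a factor of the source shard count. This is a minimal sketch; `valid_shrink_targets` is a hypothetical helper.

```python
from typing import List


def valid_shrink_targets(source_shards: int) -> List[int]:
    """Shard counts reachable by _shrink: factors of the source shard count
    that are smaller than it."""
    return [t for t in range(1, source_shards) if source_shards % t == 0]


print(valid_shrink_targets(3))  # [1]
print(valid_shrink_targets(8))  # [1, 2, 4]
```

With a prime number of primary shards such as 3, the only shrink target is 1.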

Official docs: Shrink index API | Elasticsearch Guide [8.9] | Elastic
