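_reindex is a genuinely useful tool, especially for developers. On the small scale, stored data often turns out to be unsearchable or badly structured because of a wrong field type, inconsistent letter case in values, the wrong analyzer, or an incorrect number of shards or replicas. On the large scale, migrating data between clusters relies on the same _reindex API. Think of a relational database: sooner or later you have to change a table's structure or normalize the case of its values. The brute-force fix is to create a correct temporary table, write a script that reads the rows from the faulty table, transform them in code until they match what you expect, insert them into the new table, drop the old table, recreate it under the same name, and finally load the temporary table's data back in. By the end of all that you are exhausted. The idea in Elasticsearch is the same, but you do not have to write the migration script yourself: a single _reindex call does the job. Enough preamble.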
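If the Elasticsearch cluster has security and permission policies enabled, the user running the reindex needs at least read privileges on the source index and write privileges on the destination index. Reindexing from a remote cluster additionally requires the remote host to be listed in the reindex.remote.whitelist setting (a node setting in elasticsearch.yml on the destination cluster, not a user permission), for example:
reindex.remote.whitelist: "otherhost:9200"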
The simplest usage
curl --location 'http://localhost:9200/_reindex' \
--header 'Content-Type: application/json' \
--data '{
"source": {
"index": "old_index"
},
"dest": {
"index": "new_index"
}
}'
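For readers who would rather issue the same call from code, here is a minimal Python sketch that builds the request with the standard library only. The helper name build_reindex_request and the index names are illustrative assumptions, not part of Elasticsearch, and nothing is sent until urlopen is called.

```python
import json
import urllib.request


def build_reindex_request(base_url, source_index, dest_index):
    """Build (but do not send) a POST /_reindex request."""
    body = {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and index names, matching the curl example.
req = build_reindex_request("http://localhost:9200", "old_index", "new_index")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To actually run it against a live cluster:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Keeping the build step separate from the send step makes it easy to inspect or log the body before touching a real cluster.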
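# Limit the total number of documents copied with a top-level size. Since 7.3 this field is deprecated in favor of max_docs.
curl --location 'http://localhost:9200/_reindex' \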
--header 'Content-Type: application/json' \
--data '{
"size": 100,
"source": {
"index": "source_index"
},
"dest": {
"index": "dest_index"
}
}'
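# Copy from several source indices at once. The type filter only applies to versions that still have mapping types (deprecated in 7.0, removed in 8.0); drop it on newer clusters.
curl --location 'http://localhost:9200/_reindex' \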
--header 'Content-Type: application/json' \
--data '{
"source": {
"index": [
"source_index_1",
"source_index_2"
],
"type": [
"source_type_1",
"source_type_2"
]
},
"dest": {
"index": "dest_index"
}
}'
# Copy only selected fields by listing them in _source.
curl --location 'http://localhost:9200/_reindex' \
--header 'Content-Type: application/json' \
--data '{
"source": {
"index": "source_index_1",
"_source": [
"username",
"sex"
]
},
"dest": {
"index": "dest_index"
}
}'
curl --location 'http://192.168.5.235:9210/_reindex' \
--header 'Content-Type: application/json' \
--data '{
"script": {
"source": "String uppercaseId = ctx._id.toUpperCase(); ctx._source.remove(\"id\"); ctx._id = uppercaseId; ",
"lang": "painless"
},
"source": {
"index": "source_index"
},
"dest": {
"index": "dest_index"
}
}'
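# The script above uppercases the document ID and drops the id field from _source. To uppercase a value inside _source instead, assign it directly: ctx._source.ENTITY_UUID = ctx._source.ENTITY_UUID.toUpperCase(); No remove() call is needed, and remove(\"_source.ENTITY_UUID\") would not match any field because the key is just ENTITY_UUID.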
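To make the effect of the script easier to picture, here is a plain-Python model of what it does to one document. This is only an illustration: the function names and sample values are made up, and the real transformation runs inside Elasticsearch as Painless, not Python.

```python
def uppercase_id(doc_id, source):
    """Mimic the script: uppercase _id and drop the "id" field from _source."""
    new_source = {k: v for k, v in source.items() if k != "id"}
    return doc_id.upper(), new_source


def uppercase_field(source, field):
    """Mimic the _source variant: replace one field value with its uppercase form."""
    new_source = dict(source)
    new_source[field] = new_source[field].upper()
    return new_source


# Hypothetical sample document.
new_id, new_source = uppercase_id("abc-123", {"id": "abc-123", "username": "tom"})
print(new_id, new_source)
print(uppercase_field({"ENTITY_UUID": "9f1c"}, "ENTITY_UUID"))
```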
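# When reindexing across clusters, each batch from the remote is buffered on the heap (100mb by default). With the default batch of 1000 documents, an average document size above roughly 100 KB can therefore cause an error. Set size inside source to lower the number of documents per batch.
# Optional fields inside source: "query" (for example {"match": {"test": "data"}}) to copy only matching documents, "size" (for example 100) for the batch size, and "sort" (for example {"date": "desc"}), which is deprecated since 7.6. JSON does not allow // comments, so add these as real fields instead of commenting them out in the body.
curl --location 'http://localhost:9200/_reindex' \
--header 'Content-Type: application/json' \
--data '{
"source": {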
"remote": {
"host": "http://otherhost:9200",
"username": "username",
"password": "password"
},
"index": "source_index"
},
"dest": {
"index": "dest_index"
}
}'
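The batch-size arithmetic for remote reindexing can be sketched in a few lines of Python. The 100mb buffer and the default batch of 1000 come from Elasticsearch's defaults; the headroom factor and the function name are my own assumptions, added so the estimate leaves some slack.

```python
DEFAULT_BATCH_SIZE = 1000                 # Elasticsearch default for source.size
REMOTE_BUFFER_BYTES = 100 * 1024 * 1024   # default remote reindex buffer (100mb)


def suggest_batch_size(avg_doc_bytes, headroom=0.5):
    """Return a batch size whose estimated payload stays under the buffer.

    headroom is an assumed safety factor, not an Elasticsearch setting.
    """
    if avg_doc_bytes <= 0:
        raise ValueError("avg_doc_bytes must be positive")
    fits = int(REMOTE_BUFFER_BYTES * headroom // avg_doc_bytes)
    return max(1, min(DEFAULT_BATCH_SIZE, fits))


for size in (1024, 100 * 1024, 1024 * 1024):
    print(size, suggest_batch_size(size))
```

The result is only a starting point for the size field inside source; real documents vary in size, so it is worth testing with a small batch first.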
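# With version_type internal (or omitted), Elasticsearch blindly dumps documents into the destination, overwriting anything that has the same ID (and the same type, on versions that still have types).
# With version_type external, the version from the source is preserved: documents missing from the destination are created, and existing ones are updated only if their version in the destination is older than the one in the source.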
curl --location 'http://localhost:9200/_reindex' \
--header 'Content-Type: application/json' \
--data '{
"source": {
"index": "source_index"
},
"dest": {
"index": "dest_index",
"version_type": "internal"
}
}'
# With op_type create, only documents that do not yet exist in the dest index are added. Documents that already exist cause a version conflict error.
curl --location 'http://localhost:9200/_reindex' \
--header 'Content-Type: application/json' \
--data '{
"source": {
"index": "source_index"
},
"dest": {
"index": "dest_index",
"op_type": "create"
}
}'
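# By default a version conflict aborts the reindex. With conflicts set to proceed, conflicts are counted and the remaining documents are still copied.
curl --location 'http://localhost:9200/_reindex' \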
--header 'Content-Type: application/json' \
--data '{
"conflicts": "proceed",
"source": {
"index": "source_index"
},
"dest": {
"index": "dest_index",
"op_type": "create"
}
}'
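How op_type and conflicts interact is easier to see in a toy model. The sketch below simulates the behaviour on in-memory dictionaries keyed by document ID. It is a simplification under my own assumptions (no versions, no batching), and simulate_reindex is a made-up name, not an Elasticsearch API.

```python
class VersionConflict(Exception):
    """Raised when a document already exists and conflicts are not tolerated."""


def simulate_reindex(source_docs, dest_docs, op_type="index", conflicts="abort"):
    """Copy source_docs into dest_docs (dicts keyed by _id), modifying dest_docs.

    Returns counters loosely shaped like the reindex response.
    """
    stats = {"created": 0, "updated": 0, "version_conflicts": 0}
    for doc_id, body in source_docs.items():
        exists = doc_id in dest_docs
        if exists and op_type == "create":
            if conflicts != "proceed":
                raise VersionConflict(doc_id)
            stats["version_conflicts"] += 1
            continue  # keep the existing destination document
        dest_docs[doc_id] = body
        stats["updated" if exists else "created"] += 1
    return stats


source = {"1": {"v": "new"}, "2": {"v": "new"}}
dest = {"1": {"v": "old"}}
stats = simulate_reindex(source, dest, op_type="create", conflicts="proceed")
print(stats)
print(dest)
```

With op_type create and conflicts proceed, the existing document "1" is left untouched and only "2" is added; without conflicts proceed the same call raises on the first clash.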
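# List running reindex tasks to check their progress. The task list endpoint is read with GET, not POST.
curl --location --request GET 'http://localhost:9200/_tasks?detailed=true&actions=*reindex'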
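The task list response nests tasks under nodes, and each reindex task carries a status object with counters such as total, created, updated and deleted. A rough progress figure can be derived from those counters, as in this sketch. The sample response is trimmed and its IDs and numbers are invented for illustration; reindex_progress is a hypothetical helper.

```python
def reindex_progress(tasks_response):
    """Map task id -> fraction of documents processed (0.0 to 1.0)."""
    progress = {}
    for node in tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            status = task.get("status", {})
            total = status.get("total", 0)
            done = sum(status.get(k, 0) for k in ("created", "updated", "deleted"))
            progress[task_id] = done / total if total else 0.0
    return progress


# Illustrative, heavily trimmed response (values are made up).
sample = {
    "nodes": {
        "node1": {
            "tasks": {
                "node1:42": {
                    "action": "indices:data/write/reindex",
                    "status": {"total": 1000, "created": 250, "updated": 0, "deleted": 0},
                }
            }
        }
    }
}
print(reindex_progress(sample))
```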
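# slices=5 splits the reindex into 5 parallel sub-tasks. refresh takes true or false (default false); null is not a valid value.
curl --location 'http://localhost:9200/_reindex?slices=5&refresh=false' \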
--header 'Content-Type: application/json' \
--data '{
"source": {
"index": "source_index"
},
"dest": {
"index": "dest_index"
}
}'
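Conceptually, slicing partitions the source documents into disjoint groups so that each sub-task copies its own share. The toy sketch below shows that idea with a hash of the document ID. Note that crc32 is only a stand-in chosen for determinism; Elasticsearch uses its own internal hashing, and the function names here are assumptions.

```python
import zlib


def assign_slice(doc_id, slices):
    """Pick a slice for a document ID (crc32 is a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % slices


def partition(doc_ids, slices):
    """Group document IDs into `slices` disjoint buckets."""
    buckets = [[] for _ in range(slices)]
    for doc_id in doc_ids:
        buckets[assign_slice(doc_id, slices)].append(doc_id)
    return buckets


doc_ids = [f"doc-{i}" for i in range(20)]
buckets = partition(doc_ids, 5)
for i, bucket in enumerate(buckets):
    print(i, len(bucket))
```

Every document lands in exactly one bucket, which is why the sub-tasks can run in parallel without copying the same document twice.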