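Background
Elasticsearch's replica mechanism provides reliability: the cluster can lose individual nodes without interrupting service. Replicas do not protect against catastrophic failure, however, so a complete backup of the cluster's data is needed in order to recover quickly after a disaster. Elasticsearch provides Snapshot/Restore for this purpose, and the official repository plugins include the Azure Repository Plugin, S3 Repository Plugin, Hadoop HDFS Repository Plugin, and Google Cloud Storage Repository Plugin. Here I use the Hadoop HDFS Repository plugin to back up ES data to HDFS.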
-
Notes
This article is based on Elasticsearch-5.6.0 and hadoop-2.6.0-cdh5.7.0; the plugin used is repository-hdfs-5.6.0.zip. Official documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/5.6/modules-snapshots.html
https://www.elastic.co/guide/en/elasticsearch/plugins/5.6/repository-hdfs.html
Snapshots are subject to version compatibility constraints between ES clusters. Note:
A snapshot of an index created in 5.x can be restored to 6.x.
A snapshot of an index created in 2.x can be restored to 5.x.
A snapshot of an index created in 1.x can be restored to 2.x.
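In my case I backed up data from 5.6.0 and restored it to 6.3.2, which falls within the supported range, so compatibility was not a problem.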
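The rule quoted above can be checked mechanically before planning a migration. The following is only a sketch: major_of and can_restore are hypothetical helper names, the check looks at major versions only, and it assumes that a restore within the same major version is also allowed.

```shell
#!/bin/sh
# Sketch only: encodes the rule quoted above (a snapshot can be restored to
# the next major version). Assumption: restores within the same major
# version are treated as allowed; minor versions are ignored.

# Print the major component of a version string such as 5.6.0.
major_of() {
  printf '%s\n' "${1%%.*}"
}

# Succeed if a snapshot taken on version $1 may be restored on version $2.
can_restore() {
  src=$(major_of "$1")
  dst=$(major_of "$2")
  [ "$dst" -eq "$src" ] || [ "$dst" -eq $((src + 1)) ]
}

if can_restore 5.6.0 6.3.2; then
  echo "5.6.0 -> 6.3.2: restorable"
fi
if ! can_restore 2.4.6 6.3.2; then
  echo "2.4.6 -> 6.3.2: not directly restorable"
fi
```

A snapshot that fails this check would have to pass through an intermediate cluster, one major version at a time.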
-
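Procedure
1. Install the plugin
Install the repository-hdfs plugin on every node in the cluster, and restart each node afterwards so that the plugin is loaded.
Online installation: sudo bin/elasticsearch-plugin install repository-hdfs
Offline installation:
First download the plugin: wget https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-5.6.0.zip
Then install it from the local file: bin/elasticsearch-plugin install file:///data/elastic/repository-hdfs-5.6.0.zip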
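Since the plugin has to be present on every node, it is worth confirming that before registering a repository. The sketch below counts how many nodes report a given plugin, based on the output of the _cat/plugins API restricted to the name and component columns. nodes_with_plugin is a hypothetical helper, and it assumes node names contain no whitespace.

```shell
#!/bin/sh
# nodes_with_plugin (hypothetical helper) reads lines of the form
#   <node-name> <plugin-name>
# on stdin, as returned by GET _cat/plugins?h=name,component, and prints
# how many nodes report the plugin named in $1.
# Assumption: node names contain no whitespace.
nodes_with_plugin() {
  awk -v plugin="$1" '$2 == plugin { n++ } END { print n + 0 }'
}

# Against a live cluster this would be fed from curl, for example:
#   curl -s "172.16.221.105:9400/_cat/plugins?h=name,component" | nodes_with_plugin repository-hdfs

# Offline demonstration with made-up sample output for a two-node cluster:
printf 'node-1 repository-hdfs\nnode-2 repository-hdfs\nnode-2 analysis-ik\n' |
  nodes_with_plugin repository-hdfs
```

If the printed count is lower than the number of nodes in the cluster, at least one node is still missing the plugin or has not been restarted.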
2. Create the repository and register it with ES
curl -X PUT "172.16.221.105:9400/_snapshot/es_hdfs_repository" -H 'Content-Type: application/json' -d'
{
"type": "hdfs",
"settings": {
"uri": "hdfs://golive-master:8020/",
"path": "elasticsearch/respositories/es_hdfs_repository",
"conf.dfs.client.read.shortcircuit": "true",
"conf.dfs.domain.socket.path": "/var/lib/hadoop-hdfs/dn_socket"
}
}
'
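While creating the repository I ran into a Permission denied error. As a temporary workaround I disabled HDFS permission checking by adding the following property to hdfs-site.xml on every Hadoop node:
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
Then restart HDFS and run the repository creation command above again; this time it succeeds, and the repository directory is visible in HDFS.
The repository can be inspected with the following command: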
curl -X GET "172.16.221.104:9400/_snapshot/es_hdfs_repository"
The response is as follows:
{
"es_hdfs_repository": {
"type": "hdfs",
"settings": {
"path": "elasticsearch/respositories/es_hdfs_repository",
"uri": "hdfs://golive-master:8020/",
"conf": {
"dfs": {
"client": {
"read": {
"shortcircuit": "true"
}
},
"domain": {
"socket": {
"path": "/var/lib/hadoop-hdfs/dn_socket"
}
}
}
}
}
}
}
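Because the same registration has to be repeated on any cluster that needs access to the snapshots, it can help to wrap it in small functions. This is a minimal sketch: repo_body, register_repo and verify_repo are hypothetical helper names, and the body deliberately omits the two short-circuit read settings used above. The _verify endpoint asks the cluster to check that the repository is usable.

```shell
#!/bin/sh
# repo_body (hypothetical helper) prints a minimal JSON registration body.
# $1 = HDFS URI, $2 = path inside HDFS.
# Assumption: neither value contains characters that need JSON escaping.
repo_body() {
  printf '{"type":"hdfs","settings":{"uri":"%s","path":"%s"}}\n' "$1" "$2"
}

# Register the repository. $1 = host:port, $2 = repository name,
# $3 = HDFS URI, $4 = path.
register_repo() {
  curl -s -X PUT "$1/_snapshot/$2" \
    -H 'Content-Type: application/json' \
    -d "$(repo_body "$3" "$4")"
}

# Ask the cluster to verify the repository. $1 = host:port, $2 = repository name.
verify_repo() {
  curl -s -X POST "$1/_snapshot/$2/_verify"
}

# Print the body only; no request is sent here.
repo_body "hdfs://golive-master:8020/" "elasticsearch/respositories/es_hdfs_repository"
```

A call such as register_repo 172.16.221.105:9400 es_hdfs_repository hdfs://golive-master:8020/ elasticsearch/respositories/es_hdfs_repository would then send the request to the cluster used in this article.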
3. Create a snapshot
Create a snapshot of all indices:
curl -X PUT "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1?wait_for_completion=true" -H 'Content-Type: application/json' -d'
{
"indices": "*"
}
'
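You will usually want a snapshot to run as a background process, but sometimes a script needs to wait until it has finished. Adding the wait_for_completion flag (wait_for_completion=true) blocks the call until the snapshot completes. Note that a large snapshot can take a long time to return.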
https://www.elastic.co/guide/cn/elasticsearch/guide/current/backing-up-your-cluster.html
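For large snapshots, an alternative to holding one HTTP call open is to start the snapshot without wait_for_completion and poll its state. The sketch below shows one way to do that. snapshot_state and wait_for_snapshot are hypothetical helpers, the 10-second interval is arbitrary, and the JSON is parsed with grep and sed rather than a real JSON parser, so treat it as an illustration only.

```shell
#!/bin/sh
# snapshot_state (hypothetical helper) reads the JSON returned by
#   GET /_snapshot/<repository>/<snapshot>
# on stdin and prints the first "state" value, e.g. IN_PROGRESS or SUCCESS.
snapshot_state() {
  grep -o '"state" *: *"[^"]*"' | head -n 1 | sed 's/.*"\([^"]*\)"$/\1/'
}

# wait_for_snapshot polls until the state is no longer IN_PROGRESS and
# prints the final state. $1 = host:port, $2 = repository, $3 = snapshot.
# If the request fails, the state is empty and the loop stops as well.
wait_for_snapshot() {
  while :; do
    state=$(curl -s "$1/_snapshot/$2/$3" | snapshot_state)
    [ "$state" != "IN_PROGRESS" ] && break
    sleep 10
  done
  echo "$state"
}

# Offline demonstration on a trimmed sample response:
echo '{"snapshots":[{"snapshot":"snapshot_1","state":"SUCCESS"}]}' | snapshot_state
```

In a backup script, wait_for_snapshot 172.16.221.105:9400 es_hdfs_repository snapshot_1 would return once the snapshot has left the IN_PROGRESS state, and the printed value can be compared against SUCCESS.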
4. Restore the snapshot
curl -X POST "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1/_restore"
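As with snapshots, the restore command returns immediately and the restore proceeds in the background. If you would rather have the HTTP call block until the restore finishes, add the wait_for_completion flag: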
curl -X POST "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1/_restore?wait_for_completion=true"
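I restored into a new cluster running 6.3.2, which also needs the matching repository-hdfs plugin (6.3.2 in my case). Because the HDFS repository had not been registered with that cluster, the restore failed with an error saying the repository could not be found. After registering it with the same repository creation command, the restore succeeded. The official documentation puts it this way:
All that is required is registering the repository containing the snapshot in the new cluster and starting the restore process.
English documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html
Chinese documentation: https://www.elastic.co/guide/cn/elasticsearch/guide/current/_restoring_from_a_snapshot.html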
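A restore does not have to bring back every index. The restore API accepts a request body with an indices pattern, and include_global_state controls whether cluster-wide state is restored as well. The sketch below builds such a body. restore_body and restore_indices are hypothetical helpers, and setting include_global_state to false is my own choice for the example rather than something taken from the steps above. Note that an index which already exists in the target cluster has to be closed before it can be restored.

```shell
#!/bin/sh
# restore_body (hypothetical helper) prints a restore request body.
# $1 = comma-separated index names or patterns.
# Assumption: the patterns contain no characters that need JSON escaping.
restore_body() {
  printf '{"indices":"%s","include_global_state":false}\n' "$1"
}

# Restore selected indices and block until done.
# $1 = host:port, $2 = repository, $3 = snapshot, $4 = index patterns.
restore_indices() {
  curl -s -X POST "$1/_snapshot/$2/$3/_restore?wait_for_completion=true" \
    -H 'Content-Type: application/json' \
    -d "$(restore_body "$4")"
}

# Print the body only; no request is sent here.
restore_body "a*,l*,m*,u*,i*"
```

The patterns shown are the ones I used for my own indices; substitute whatever matches the indices you need.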
5. Get snapshot information and status
To list every snapshot in a repository, use the _all placeholder in place of a specific snapshot name:
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/_all"
Get the details of a single snapshot:
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2"
Get more detailed status information for a snapshot:
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2/_status"
Official documentation:
https://www.elastic.co/guide/cn/elasticsearch/guide/current/backing-up-your-cluster.html
https://www.elastic.co/guide/en/elasticsearch/reference/5.6/modules-snapshots.html
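The _all listing returns a fairly large JSON document. When only the snapshot names are needed, for example to pick one to restore, they can be pulled out with standard text tools. This is a rough sketch: snapshot_names is a hypothetical helper, and a proper JSON parser would be more robust if one is available.

```shell
#!/bin/sh
# snapshot_names (hypothetical helper) reads the JSON returned by
#   GET /_snapshot/<repository>/_all
# on stdin and prints one snapshot name per line.
snapshot_names() {
  grep -o '"snapshot" *: *"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}

# Against a live cluster:
#   curl -s "172.16.221.105:9400/_snapshot/es_hdfs_repository/_all" | snapshot_names

# Offline demonstration on a trimmed sample response:
echo '{"snapshots":[{"snapshot":"snapshot_1","state":"SUCCESS"},{"snapshot":"snapshot_2","state":"SUCCESS"}]}' |
  snapshot_names
```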
-
Appendix:
These are the commands I used at the time for the backup and restore:
wget https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-5.6.0.zip
elasticsearch-5.6.0/bin/elasticsearch-plugin install file:///data/elastic/repository-hdfs-5.6.0.zip
curl "172.16.221.104:9400/_cat/indices?v"
curl "172.16.221.104:9400/_cat/master?v"
curl "172.16.221.104:9400/_cat/master?help"
curl -X PUT "172.16.221.105:9400/_snapshot/es_hdfs_repository" -H 'Content-Type: application/json' -d'
{
"type": "hdfs",
"settings": {
"uri": "hdfs://golive-master:8020/",
"path": "elasticsearch/respositories/es_hdfs_repository",
"conf.dfs.client.read.shortcircuit": "true",
"conf.dfs.domain.socket.path": "/var/lib/hadoop-hdfs/dn_socket"
}
}
'
curl -X PUT "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1?wait_for_completion=true" -H 'Content-Type: application/json' -d'
{
"indices": "*"
}
'
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository"
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1/_status"
./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.2/elasticsearch-analysis-ik-6.3.2.zip
https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-hdfs/repository-hdfs-6.3.2.zip
bin/elasticsearch-plugin install file:///data/elastic/repository-hdfs-6.3.2.zip
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2/_status"
curl 172.16.221.105:9400/_cat/master
curl 172.16.221.12:9400/_cat/nodes
curl -X POST "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_1/_restore"
curl -X POST "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2/_restore" -H 'Content-Type: application/json' -d'
{
"indices": "a*,l*,m*,u*,i*"
}
'
https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.3.2.tar.gz
curl -X DELETE "172.16.221.105:9400/.kibana-6"
curl -X GET "172.16.221.105:9400/_cat/indices"
curl -X GET "172.16.221.105:9400/_snapshot/es_hdfs_repository/snapshot_2/_status"
curl -X POST "172.16.221.105:9400/a*,l*,m*,u*,i*/_close"
curl -X POST "172.16.221.105:9400/a*,l*,m*,u*,i*/_open"
curl -X GET "http://172.16.221.105:9400/ad_base?pretty"
curl -X GET "http://172.16.221.105:9400/_cluster/health?pretty"