PostgreSQL Database Monitoring Lab 03: InfluxDB Backup and Restore

Environment

Machine:
A China Mobile Cloud host:
10.176.140.72 plat-ecloud01-mgmt-monitor04 monitor04

Operating system:
CentOS Linux release 7.3.1611 (Core)

InfluxDB version:
influxdb-1.7.9

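Prerequisites

InfluxDB ports:

8086: the address clients use to write data to InfluxDB (the HTTP API); 8086 is the default port.

8088: the address used for backup and restore; 8088 is the default port.

Newer releases of the free (open-source) InfluxDB no longer include the web admin interface, so port 8083 is no longer needed. A third-party tool is recommended for graphical management,
for example InfluxDBStudio.

Download: https://github.com/CymaticLabs/InfluxDBStudio/releases/tag/v0.2.0-beta.1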

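Backup test

Notes:
Before InfluxDB 1.5, the files created by backup were in a format incompatible with the Enterprise edition. The newer, compatible backup method (the -portable option) is recommended.
backup and restore work between influxd instances that differ only by small version steps; for example, you can back up from 1.7.3 and restore on 1.7.7.

1. Create data:

Create a database named test for the experiment: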

[root@plat-ecloud01-mgmt-monitor04 ~]# influx -precision rfc3339
Connected to http://localhost:8086 version 1.7.9
InfluxDB shell version: 1.7.9
> create database test;
> show databases;
name: databases
name
----
_internal
test

Insert some test data. InfluxDB has no CREATE TABLE statement; an insert automatically creates the corresponding measurement:

> use test
Using database test
> show measurements
> insert test_measurement,host=server01,user=root value=1
> show measurements
name: measurements
name
----
test_measurement
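
The insert above uses InfluxDB line protocol: a measurement name, a comma-separated tag set, a space, then the field set. As a minimal sketch, the same point could also be written through the HTTP API on port 8086 with curl. The helper names build_line and write_point and the INFLUX_URL variable are made up for this illustration; only the /write endpoint and the line format come from InfluxDB itself.

```shell
#!/bin/sh
# Build one line-protocol point: <measurement>,<tag_set> <field_set>
# No timestamp is appended, so the server assigns the current time.
build_line() {
    # $1 = measurement, $2 = comma-separated tags, $3 = comma-separated fields
    printf '%s,%s %s\n' "$1" "$2" "$3"
}

# Send one point to the /write endpoint of the HTTP API.
# INFLUX_URL is a hypothetical variable; the default matches this article's host.
write_point() {
    # $1 = database, $2 = line-protocol text
    curl -sS -XPOST "${INFLUX_URL:-http://localhost:8086}/write?db=$1" --data-binary "$2"
}

# Example (needs a running InfluxDB, so it is left commented out):
#   write_point test "$(build_line test_measurement host=server01,user=root value=1)"
```

Keeping the string construction separate from the network call makes it easy to inspect the exact line before anything is written.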

Check the data:

> select * from test_measurement
name: test_measurement
time                           host     user value
----                           ----     ---- -----
2019-12-04T07:39:32.658929512Z server01 root 1

Generate some more initial data the same way:

[root@plat-ecloud01-mgmt-monitor04 ~]# influx -precision rfc3339 -database test -execute 'select * from test_measurement'
name: test_measurement
time                           host     user value
----                           ----     ---- -----
2019-12-04T07:39:32.658929512Z server01 root 1
2019-12-04T07:41:41.297177588Z server01 root 3
2019-12-04T07:41:41.716384909Z server01 root 3
2019-12-04T07:41:42.101365997Z server01 root 3
2019-12-04T07:41:42.461861224Z server01 root 3
2019-12-04T07:41:43.20379409Z  server01 root 3

2. Backup

Backup command syntax:

influxd backup
    [ -database <db_name> ]
    [ -portable ]
    [ -host <host:port> ]
    [ -retention <rp_name> ] | [ -shard <shard_ID> -retention <rp_name> ]
    [ -start <timestamp> [ -end <timestamp> ] | -since <timestamp> ]
    <path-to-backup>
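  • [ -database ]: The database to back up. If not specified, all databases are backed up.
  • [ -portable ]: Generates backup files in the newer InfluxDB Enterprise-compatible format.
  • [ -host host:port ]: Host and port of the InfluxDB OSS instance. The default is '127.0.0.1:8088'. Required for remote connections. Example: -host 127.0.0.1:8088
  • [ -retention ]: The retention policy to back up. If not specified, all retention policies are used. If specified, -database is required.
  • [ -shard ]: The shard ID of the shard to back up. If specified, -retention is required.
  • [ -start ]: Include all points starting from the specified timestamp (RFC3339 format). Not compatible with -since. Example: -start 2015-12-24T08:12:23Z
  • [ -end ]: Exclude all results after the specified timestamp (RFC3339 format). Not compatible with -since. If used without -start, all data from 1970-01-01 onward is backed up. Example: -end 2015-12-31T08:12:23Z
  • [ -since ]: Perform an incremental backup after the specified timestamp (RFC3339 format). Use -start instead, unless this is needed for legacy backup support.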
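
The time flags are the easiest part of this syntax to get wrong, since -since cannot be combined with -start or -end. The following is a small sketch of a wrapper that checks that rule and prints the resulting command instead of running it. The function name backup_cmd and the BACKUP_HOST variable are made up for this illustration; the influxd flags are the ones listed above.

```shell
#!/bin/sh
# Print (do not run) an "influxd backup -portable" command for one database.
# Rule enforced here: -since is incompatible with -start and -end.
backup_cmd() {
    # $1 = database, $2 = target directory
    # $3 = start, $4 = end, $5 = since (pass "" to leave one out)
    if [ -n "$5" ] && { [ -n "$3" ] || [ -n "$4" ]; }; then
        echo "error: -since is incompatible with -start/-end" >&2
        return 1
    fi
    # BACKUP_HOST is a hypothetical variable; the default is the local RPC port.
    cmd="influxd backup -portable -host ${BACKUP_HOST:-localhost:8088} -database $1"
    [ -n "$3" ] && cmd="$cmd -start $3"
    [ -n "$4" ] && cmd="$cmd -end $4"
    [ -n "$5" ] && cmd="$cmd -since $5"
    echo "$cmd $2"
}

# Example: print the command for a backup bounded by two timestamps.
#   backup_cmd test /opt/influxdb/influxdb_test_start_end 2019-12-04T07:00:00Z 2019-12-04T07:40:00Z
```

Printing the command first gives a chance to review it; piping the output to sh would run it.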

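I will back up the database in three ways: a full database backup, a backup bounded by start and end timestamps, and an incremental backup from a timestamp:

a. Full database backup

Back up the entire test database (full backup):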

[root@plat-ecloud01-mgmt-monitor04 influxdb]# influxd backup -portable -host localhost:8088 -database test /opt/influxdb/influxdb_test_all
2019/12/04 15:42:56 backing up metastore to /opt/influxdb/influxdb_test_all/meta.00
2019/12/04 15:42:56 backing up db=test
2019/12/04 15:42:56 backing up db=test rp=autogen shard=151 to /opt/influxdb/influxdb_test_all/test.autogen.00151.00 since 0001-01-01T00:00:00Z
2019/12/04 15:42:56 backup complete:
2019/12/04 15:42:56     /opt/influxdb/influxdb_test_all/20191204T074256Z.meta
2019/12/04 15:42:56     /opt/influxdb/influxdb_test_all/20191204T074256Z.s151.tar.gz
2019/12/04 15:42:56     /opt/influxdb/influxdb_test_all/20191204T074256Z.manifest

Files produced:

[root@plat-ecloud01-mgmt-monitor04 influxdb]# ll /opt/influxdb/influxdb_test_all/ -h
total 12K
-rw-------. 1 root root 295 Dec  4 15:42 20191204T074256Z.manifest
-rw-r-----. 1 root root 334 Dec  4 15:42 20191204T074256Z.meta
-rw-------. 1 root root 334 Dec  4 15:42 20191204T074256Z.s151.tar.gz

b. Backup bounded by timestamps

Back up only the data in a given time range (incremental):

Back up the data between 07:00:00 and 07:40:00:

[root@plat-ecloud01-mgmt-monitor04 influxdb]# influxd backup -portable -host localhost:8088 -database test -start 2019-12-04T07:00:00Z -end 2019-12-04T07:40:00Z /opt/influxdb/influxdb_test_start_end
2019/12/04 15:48:50 backing up metastore to /opt/influxdb/influxdb_test_start_end/meta.00
2019/12/04 15:48:50 backing up db=test
2019/12/04 15:48:50 backing up db=test rp=autogen shard=151 to /opt/influxdb/influxdb_test_start_end/test.autogen.00151.00 with boundaries start=2019-12-04T07:00:00Z, end=2019-12-04T07:40:00Z
2019/12/04 15:48:50 backup complete:
2019/12/04 15:48:50     /opt/influxdb/influxdb_test_start_end/20191204T074850Z.meta
2019/12/04 15:48:50     /opt/influxdb/influxdb_test_start_end/20191204T074850Z.s151.tar.gz
2019/12/04 15:48:50     /opt/influxdb/influxdb_test_start_end/20191204T074850Z.manifest

Files produced:

[root@plat-ecloud01-mgmt-monitor04 influxdb]# ll /opt/influxdb/influxdb_test_start_end/ -h
total 12K
-rw-------. 1 root root 295 Dec  4 15:48 20191204T074850Z.manifest
-rw-r-----. 1 root root 334 Dec  4 15:48 20191204T074850Z.meta
-rw-------. 1 root root 335 Dec  4 15:48 20191204T074850Z.s151.tar.gz

c. Incremental backup since a timestamp

Back up the data written after 07:40:00:

[root@plat-ecloud01-mgmt-monitor04 influxdb]# influxd backup -portable -host localhost:8088 -database test -since 2019-12-04T07:40:00Z /opt/influxdb/influxdb_test_since
2019/12/04 15:49:07 backing up metastore to /opt/influxdb/influxdb_test_since/meta.00
2019/12/04 15:49:07 backing up db=test
2019/12/04 15:49:07 backing up db=test rp=autogen shard=151 to /opt/influxdb/influxdb_test_since/test.autogen.00151.00 since 2019-12-04T07:40:00Z
2019/12/04 15:49:07 backup complete:
2019/12/04 15:49:07     /opt/influxdb/influxdb_test_since/20191204T074907Z.meta
2019/12/04 15:49:07     /opt/influxdb/influxdb_test_since/20191204T074907Z.s151.tar.gz
2019/12/04 15:49:07     /opt/influxdb/influxdb_test_since/20191204T074907Z.manifest

Files produced:

[root@plat-ecloud01-mgmt-monitor04 influxdb]# ll /opt/influxdb/influxdb_test_since/ -h
total 12K
-rw-------. 1 root root 295 Dec  4 15:49 20191204T074907Z.manifest
-rw-r-----. 1 root root 334 Dec  4 15:49 20191204T074907Z.meta
-rw-------. 1 root root 334 Dec  4 15:49 20191204T074907Z.s151.tar.gz

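Note that the three backups produce essentially the same files: the same three-file layout, with archive sizes that differ by at most one byte. This is because InfluxDB backs up data in units of whole blocks, and all the test data I created sits in a single block (shard 151 in the logs).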
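
One quick way to see which shards a backup directory holds is to read the shard ID out of the archive names, which follow the pattern <timestamp>.s<shardID>.tar.gz in the listings above. This is only a sketch based on that observed naming; the function name backup_shard_ids is made up for this illustration.

```shell
#!/bin/sh
# List the shard IDs found in a portable backup directory, one per line,
# by parsing file names of the form "<timestamp>.s<shardID>.tar.gz".
backup_shard_ids() {
    # $1 = backup directory
    for f in "$1"/*.s*.tar.gz; do
        [ -e "$f" ] || continue    # glob matched nothing: print nothing
        name=${f##*/}              # strip the directory part
        id=${name#*.s}             # drop "<timestamp>.s"
        echo "${id%.tar.gz}"       # drop the ".tar.gz" suffix
    done | sort -n | uniq
}

# Example:
#   backup_shard_ids /opt/influxdb/influxdb_test_since
```

Running it against the three directories created above should report the same single shard for each, which matches the explanation that all the test data lives in one block.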

3. Restore

Restore command syntax:

influxd restore [ -db <db_name> ]
    -portable | -online
    [ -host <host:port> ]
    [ -newdb <newdb_name> ]
    [ -rp <rp_name> ]
    [ -newrp <newrp_name> ]
    [ -shard <shard_ID> ]
    <path-to-backup-files>
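  • -portable: Use the new Enterprise-compatible backup format for InfluxDB OSS. Recommended instead of -online. A backup created on InfluxDB Enterprise can be restored to an InfluxDB OSS instance.
  • -online: Use the legacy backup format. Use it only when the newer -portable option cannot be used.
  • [ -host host:port ]: Host and port of the InfluxDB OSS instance. The default is '127.0.0.1:8088'. Required for remote connections. Example: -host 127.0.0.1:8088
  • [ -db | -database ]: Name of the database to restore from the backup. If not specified, all databases are restored.
  • [ -newdb ]: Name of the database on the target system into which the archived data is imported. If not specified, the value of -db is used. The new database name must be unique on the target system.
  • [ -rp ]: Name of the retention policy to restore from the backup. Requires -db to be set. If not specified, all retention policies are used.
  • [ -newrp ]: Name of the retention policy to create on the target system. Requires -rp to be set. If not specified, the -rp value is used.
  • [ -shard ]: Shard ID of the shard to restore. If specified, -db and -rp are required.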
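
Several of these flags depend on one another (-rp needs -db, -newrp needs -rp, -shard needs both -db and -rp). As a rough sketch, a wrapper could check those dependencies before building the command. The function name restore_cmd and its argument order are made up for this illustration; it only prints the command and does not run influxd.

```shell
#!/bin/sh
# Print (do not run) an "influxd restore -portable" command after checking
# the flag dependencies described in the list above.
restore_cmd() {
    # $1 = path to backup files
    # $2 = db, $3 = newdb, $4 = rp, $5 = newrp, $6 = shard (pass "" to omit)
    if [ -n "$4" ] && [ -z "$2" ]; then
        echo "error: -rp requires -db" >&2
        return 1
    fi
    if [ -n "$5" ] && [ -z "$4" ]; then
        echo "error: -newrp requires -rp" >&2
        return 1
    fi
    if [ -n "$6" ] && { [ -z "$2" ] || [ -z "$4" ]; }; then
        echo "error: -shard requires -db and -rp" >&2
        return 1
    fi
    cmd="influxd restore -portable"
    [ -n "$2" ] && cmd="$cmd -db $2"
    [ -n "$3" ] && cmd="$cmd -newdb $3"
    [ -n "$4" ] && cmd="$cmd -rp $4"
    [ -n "$5" ] && cmd="$cmd -newrp $5"
    [ -n "$6" ] && cmd="$cmd -shard $6"
    echo "$cmd $1"
}

# Example: restore the database test under the new name test_all.
#   restore_cmd /opt/influxdb/influxdb_test_all/ test test_all
```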

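Note:

A backup cannot be restored directly into a database that already exists. Running restore against an existing database produces the following message: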

[root@plat-ecloud01-mgmt-monitor04 influxdb]# influxd restore -portable -db test /opt/influxdb/influxdb_test_all/
2019/12/04 15:49:55 error updating meta: DB metadata not changed. database may already exist
restore: DB metadata not changed. database may already exist

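The restore fails because the database already exists.

The only option is to restore the backup into a temporary database first, and then copy the data into the original database.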
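
The workaround can be sketched as three commands: restore under a new name, copy every measurement across with SELECT ... INTO, then drop the temporary database. The script below is a hedged outline rather than a tested procedure. The names run, restore_into_existing and DRY_RUN are made up for this illustration, database names are assumed to be simple identifiers that need no quoting, and by default the commands are only printed.

```shell
#!/bin/sh
# Print or run a command, depending on DRY_RUN (1 = print only, the default).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Restore a portable backup into a temporary database, copy its contents
# into the existing database, then remove the temporary database.
restore_into_existing() {
    # $1 = existing database, $2 = backup directory, $3 = temporary database
    run influxd restore -portable -db "$1" -newdb "$3" "$2" &&
    run influx -database "$3" -execute "SELECT * INTO $1..:MEASUREMENT FROM /.*/ GROUP BY *" &&
    run influx -execute "DROP DATABASE $3"
}

# Example (dry run): show the three commands for the test database.
#   restore_into_existing test /opt/influxdb/influxdb_test_all test_tmp
```

The steps are chained with && so that the temporary database is dropped only if both the restore and the copy succeeded. For large databases a single SELECT ... INTO may be too heavy, so splitting the copy by time range is worth considering.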

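First, restore the three backups into three new databases; the database restored from the full backup will then serve as the temporary database in the tests:

a. Restoring the full database backup: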

[root@plat-ecloud01-mgmt-monitor04 influxdb]# influxd restore -portable -db test -newdb test_all /opt/influxdb/influxdb_test_all/
2019/12/04 15:50:20 Restoring shard 151 live from backup 20191204T074256Z.s151.tar.gz

Check the data:

[root@plat-ecloud01-mgmt-monitor04 influxdb]# influx -precision rfc3339 -database test_all -execute 'select * from test_measurement'
name: test_measurement
time                           host     user value
----                           ----     ---- -----
2019-12-04T07:39:32.658929512Z server01 root 1
2019-12-04T07:41:41.297177588Z server01 root 3
2019-12-04T07:41:41.716384909Z server01 root 3
2019-12-04T07:41:42.101365997Z server01 root 3
2019-12-04T07:41:42.461861224Z server01 root 3
2019-12-04T07:41:43.20379409Z  server01 root 3

b. Restoring the timestamp-bounded backup:

The -start/-end backup:

[root@plat-ecloud01-mgmt-monitor04 influxdb]# influxd restore -portable -db test -newdb test_start_end /opt/influxdb/influxdb_test_start_end/
2019/12/04 15:51:36 Restoring shard 151 live from backup 20191204T074850Z.s151.tar.gz

Check the data:

[root@plat-ecloud01-mgmt-monitor04 influxdb]# influx -precision rfc3339 -database test_start_end -execute 'select * from test_measurement'
name: test_measurement
time                           host     user value
----                           ----     ---- -----
2019-12-04T07:39:32.658929512Z server01 root 1
2019-12-04T07:41:41.297177588Z server01 root 3
2019-12-04T07:41:41.716384909Z server01 root 3
2019-12-04T07:41:42.101365997Z server01 root 3
2019-12-04T07:41:42.461861224Z server01 root 3
2019-12-04T07:41:43.20379409Z  server01 root 3

c. Restoring the incremental backup:

The -since backup:

[root@plat-ecloud01-mgmt-monitor04 influxdb]# influxd restore -portable -db test -newdb test_since /opt/influxdb/influxdb_test_since/
2019/12/04 15:53:13 Restoring shard 151 live from backup 20191204T074907Z.s151.tar.gz

Check the data:

[root@plat-ecloud01-mgmt-monitor04 influxdb]# influx -precision rfc3339 -database test_since -execute 'select * from test_measurement'
name: test_measurement
time                           host     user value
----                           ----     ---- -----
2019-12-04T07:39:32.658929512Z server01 root 1
2019-12-04T07:41:41.297177588Z server01 root 3
2019-12-04T07:41:41.716384909Z server01 root 3
2019-12-04T07:41:42.101365997Z server01 root 3
2019-12-04T07:41:42.461861224Z server01 root 3
2019-12-04T07:41:43.20379409Z  server01 root 3

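Note: when testing -start/-end and -since, I often found that the backup contained data outside the requested time range. This is because InfluxDB backs up whole data blocks, so every point stored in the same block is backed up. This test happens to be an extreme case in which all the data sits in one block (shard 151 in the logs above), so all of it was backed up.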

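Database damage experiments

1. All data in the database is lost

Drop the measurement in the test database and write new data into it, to simulate losing all the original data: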

[root@plat-ecloud01-mgmt-monitor04 influxdb]# influx -precision rfc3339
Connected to http://localhost:8086 version 1.7.9
InfluxDB shell version: 1.7.9
> use test
Using database test
> drop measurement test_measurement
> insert test_measurement,host=server01,user=root value=4
> insert test_measurement,host=server01,user=root value=5
> insert test_measurement,host=server01,user=root value=6
> select * from test_measurement
name: test_measurement
time                           host     user value
----                           ----     ---- -----
2019-12-04T08:06:10.408564036Z server01 root 4
2019-12-04T08:06:11.623521832Z server01 root 5
2019-12-04T08:06:12.790467551Z server01 root 6

Copy the data from the temporary database test_all into test:

> use test_all
Using database test_all
> SELECT * INTO test..:MEASUREMENT FROM /.*/ GROUP BY *
name: result
time                 written
----                 -------
1970-01-01T00:00:00Z 6

Check the data in the test database:

> use test
Using database test
> select * from test_measurement
name: test_measurement
time                           host     user value
----                           ----     ---- -----
2019-12-04T07:39:32.658929512Z server01 root 1
2019-12-04T07:41:41.297177588Z server01 root 3
2019-12-04T07:41:41.716384909Z server01 root 3
2019-12-04T07:41:42.101365997Z server01 root 3
2019-12-04T07:41:42.461861224Z server01 root 3
2019-12-04T07:41:43.20379409Z  server01 root 3
2019-12-04T08:06:10.408564036Z server01 root 4
2019-12-04T08:06:11.623521832Z server01 root 5
2019-12-04T08:06:12.790467551Z server01 root 6

The data has been copied in and the database is back to normal.
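
In the copy statement used above, GROUP BY * matters: it keeps the tags (host, user) as tags in the target database, whereas without it they would be written as fields. Since a restored backup can contain points outside the time range that was requested, it may also help to bound the copy with a WHERE time clause. The helper below is a sketch that only builds the InfluxQL text; the name copy_query is made up for this illustration.

```shell
#!/bin/sh
# Print an InfluxQL statement that copies every measurement of the current
# database into a target database, optionally limited to a time range.
copy_query() {
    # $1 = target database
    # $2 = start (RFC3339, optional), $3 = end (RFC3339, optional)
    where=""
    [ -n "$2" ] && where="time >= '$2'"
    if [ -n "$3" ]; then
        [ -n "$where" ] && where="$where AND "
        where="${where}time <= '$3'"
    fi
    [ -n "$where" ] && where=" WHERE $where"
    echo "SELECT * INTO $1..:MEASUREMENT FROM /.*/$where GROUP BY *"
}

# Example: copy only the 07:00 to 07:40 window from test_all into test.
#   influx -database test_all -execute "$(copy_query test 2019-12-04T07:00:00Z 2019-12-04T07:40:00Z)"
```

Called with only a target database, it prints the same statement that was typed by hand above.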

2. Part of the data in the database is lost

Original data:
test database:

> use test
Using database test
> select * from test_measurement
name: test_measurement
time                           host     user value
----                           ----     ---- -----
2019-12-04T07:39:32.658929512Z server01 root 1
2019-12-04T07:41:41.297177588Z server01 root 3
2019-12-04T07:41:41.716384909Z server01 root 3
2019-12-04T07:41:42.101365997Z server01 root 3
2019-12-04T07:41:42.461861224Z server01 root 3
2019-12-04T07:41:43.20379409Z  server01 root 3
2019-12-04T08:06:10.408564036Z server01 root 4
2019-12-04T08:06:11.623521832Z server01 root 5
2019-12-04T08:06:12.790467551Z server01 root 6

test_since database:

> use test_since
Using database test_since
> select * from test_measurement
name: test_measurement
time                           host     user value
----                           ----     ---- -----
2019-12-04T07:39:32.658929512Z server01 root 1
2019-12-04T07:41:41.297177588Z server01 root 3
2019-12-04T07:41:41.716384909Z server01 root 3
2019-12-04T07:41:42.101365997Z server01 root 3
2019-12-04T07:41:42.461861224Z server01 root 3
2019-12-04T07:41:43.20379409Z  server01 root 3

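Now assume test_since is the production database and test is the temporary database. test_since has lost part of its data (the 08:06 points with values 4, 5 and 6, which exist only in test). Insert some new data into test_since: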

> use test_since
Using database test_since
> insert test_measurement,host=server01,user=root value=8
> insert test_measurement,host=server01,user=root value=8
> insert test_measurement,host=server01,user=root value=8
> select * from test_measurement
name: test_measurement
time                           host     user value
----                           ----     ---- -----
2019-12-04T07:39:32.658929512Z server01 root 1
2019-12-04T07:41:41.297177588Z server01 root 3
2019-12-04T07:41:41.716384909Z server01 root 3
2019-12-04T07:41:42.101365997Z server01 root 3
2019-12-04T07:41:42.461861224Z server01 root 3
2019-12-04T07:41:43.20379409Z  server01 root 3
2019-12-04T08:39:48.494394169Z server01 root 8
2019-12-04T08:39:49.287698005Z server01 root 8
2019-12-04T08:39:51.346885323Z server01 root 8

Copy the data from test into test_since:

> use test
Using database test
> SELECT * INTO test_since..:MEASUREMENT FROM /.*/ GROUP BY *
name: result
time                 written
----                 -------
1970-01-01T00:00:00Z 9

Check the data:

> use test_since
Using database test_since
> select * from test_measurement
name: test_measurement
time                           host     user value
----                           ----     ---- -----
2019-12-04T07:39:32.658929512Z server01 root 1
2019-12-04T07:41:41.297177588Z server01 root 3
2019-12-04T07:41:41.716384909Z server01 root 3
2019-12-04T07:41:42.101365997Z server01 root 3
2019-12-04T07:41:42.461861224Z server01 root 3
2019-12-04T07:41:43.20379409Z  server01 root 3
2019-12-04T08:06:10.408564036Z server01 root 4
2019-12-04T08:06:11.623521832Z server01 root 5
2019-12-04T08:06:12.790467551Z server01 root 6
2019-12-04T08:39:48.494394169Z server01 root 8
2019-12-04T08:39:49.287698005Z server01 root 8
2019-12-04T08:39:51.346885323Z server01 root 8

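Duplicate points do not cause a conflict; a point that already exists in the target is simply overwritten by the copied one. The recovery succeeded.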
