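HP Vertica Database Backup and Restore

Backing Up and Restoring the Database

HP Vertica provides a comprehensive utility, the vbr.py Python script, which can back up, restore, and list backups, and copy a database to another cluster. It supports object-level backups of schemas and tables. For the whole database you can create full or incremental backups. Once a full backup exists, you can restore the entire database or one or more database objects. vbr.py can store backups in the following locations:

A.    A local directory on the nodes in the cluster;

B.    One or more hosts outside the cluster;

C.    A different HP Vertica cluster (effectively cloning the database).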

1 Compatibility Requirements

You cannot restore a version 6.x backup to a version 7.x database;

HP Vertica does support restores within the same major release.

For example, you can restore a version 7.0 backup to a version 7.1 database.

2 Automating Regular Backups

Put the vbr.py command and its arguments in a script file, then use Linux cron to schedule the backup.
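As a rough sketch of the cron scheduling described above, the following Python helper builds the vbr.py backup command and a matching crontab entry. The daily 02:00 schedule, the configuration file path, and the log file path are illustrative assumptions, not values taken from the HP Vertica documentation.

```python
import shlex

VBR = "/opt/vertica/bin/vbr.py"  # install path used elsewhere in this article


def backup_command(config_file):
    """Return the vbr.py backup command as an argument list."""
    return [VBR, "--task", "backup", "--config-file", config_file]


def crontab_line(config_file, minute=0, hour=2, log_file="/tmp/vbr_cron.log"):
    """Return a crontab entry that runs the backup once a day at hour:minute.

    The schedule and log_file defaults are hypothetical, chosen for illustration.
    """
    cmd = " ".join(shlex.quote(part) for part in backup_command(config_file))
    return f"{minute} {hour} * * * {cmd} >> {shlex.quote(log_file)} 2>&1"


if __name__ == "__main__":
    # Add the printed line to the dbadmin user's crontab (crontab -e).
    print(crontab_line("/home/dbadmin/fullbak.ini"))
```

For an unattended run, the configuration file needs to hold the saved password so that vbr.py does not stop at a password prompt.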

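3 Types of Backups

vbr.py supports three types of backups:

a.     Full backups

b.     Object-level backups

c.     Hard link local backups

You can give each full or object-level backup a descriptive, user-defined name, such as FullDBSnap, Schema1Bak, or Table1Bak.

Note:

Recovery mainly deals with data consistency, for example by applying archive logs and redo;

restore simply puts the files back.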

 

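3.1 Full Backups

A full database backup includes the database catalog, schemas, tables, and other objects. It is a consistent image, or snapshot, of the database at the time of the backup. In disaster recovery, you can use a full backup to restore an incomplete or corrupted database.

Once a full backup exists, vbr.py creates a series of successive snapshots after it, each recording the changes to the data (incremental backups).

An archive consists of a collection of backups that share the same name.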

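3.2 Object-Level Backups

An object-level backup contains one or more schemas or tables. Once an object-level backup exists, you can restore all of its contents, but you cannot restore individual objects from it.

Note: HP Vertica does not support object-level backups on Hadoop Distributed File System (HDFS) storage.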

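Object-level backups involve the following kinds of objects:

Selected objects

The objects you choose to be part of an object-level backup, such as tables T1 and T2.

Dependent objects

Objects that must be part of the backup because of dependencies. For example, when you back up a table that has a foreign key, vbr.py automatically includes the primary key table because of the table constraint. Projections are also dependent objects.

Principal objects

The objects on which both selected and dependent objects depend are called principal objects. For example, each table and projection has an owner, and each is a principal object.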

 

3.3 Hard Link Local Backups

A hard link local backup consists of a full copy of the database catalog plus a set of hard file links that correspond to the data files.

4 When to Back Up the Database

Before you upgrade HP Vertica to another release.

Before you drop a partition.

After you load a large volume of data.

Before and after you add, remove, or replace nodes in your database cluster.

After recovering a cluster from a crash.

Note: When you restore a full database backup, you must restore to a cluster that is identical to the one on which you created the backup. For this reason, always create a new full backup after adding, removing, or replacing nodes.

 

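5 Configuring Backup Hosts

The vbr.py configuration file specifies which backup host each node in the cluster backs up to. A backup host must meet these requirements:

A.    It has enough disk space for the backups.

B.    The cluster can reach the backup host over SSH.

C.    The database administrator account can log in to the backup host over SSH without a password.

D.    Its Python and rsync versions match those installed by the HP Vertica installer.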

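5.1 Creating Configuration Files for Backup Hosts

Note: For optimal network performance when creating a backup, HP Vertica recommends having each node in the cluster use a dedicated backup host.

Create separate configuration files, with different names, for full and object-level backups, but use the same nodes, backup hosts, and backup directories in both. A given backup directory should hold the backups of only one database.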

5.2 Estimating Backup Host Disk Requirements

First, account for the disk space that incremental backups need. If you keep multiple archives, the space required grows further.

HP Vertica recommends that each backup host has space for at least twice the database footprint size.

To check the disk space in use:

select sum(used_bytes) from storage_containers where node_name='v_mydb_node0001';

or

select node_name, sum(used_bytes) as size_in_bytes from v_monitor.storage_containers group by node_name;
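To make the "twice the footprint" guideline concrete, here is a minimal Python sketch that turns the per-node result of the query above into a minimum size for each backup host. The node names and byte counts in the example are made up for illustration.

```python
def recommended_backup_space(used_bytes_by_node, factor=2):
    """Map each node to the minimum space (in bytes) its backup host should have.

    factor=2 reflects the "at least twice the footprint" guideline quoted above.
    """
    return {node: used * factor for node, used in used_bytes_by_node.items()}


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 1024 ** 3


if __name__ == "__main__":
    # Hypothetical query output: node_name -> sum(used_bytes)
    footprint = {
        "v_mydb_node0001": 50 * 1024 ** 3,
        "v_mydb_node0002": 48 * 1024 ** 3,
    }
    for node, need in recommended_backup_space(footprint).items():
        print(f"{node}: at least {to_gib(need):.1f} GiB")
```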

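5.3 Estimating Log File Disk Requirements

HP Vertica recommends allocating 1 GB of space on each backup host for vbr.py log files. Log files are not deleted automatically, so you need to remove them manually.

When you run vbr.py --setupconfig to create the configuration file, the tempDir parameter specifies where vbr.py writes its logs on the backup hosts. The default location is /tmp/vbr on each host. The log files describe the progress, throughput, and any errors encountered for each node.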

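5.4 Making Backup Hosts Accessible

Any firewall between the database nodes and the backup hosts must allow connections for ssh and rsync on port 50000. The Python and rsync versions on the backup hosts must match those installed or supported by the HP Vertica installer.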

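5.5 Setting Up Passwordless SSH Access

【1】The account used to access the backup host must have write permission on the backup directory and the log directory (default /tmp/).

【2】Each backup host must allow passwordless SSH access from any node in the cluster.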

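5.6 Increasing the SSH Maximum Connection Settings for a Backup Host

When several database nodes back up to a single backup host (n:1), increase the number of concurrent SSH connections that the SSH daemon (sshd) accepts. The default is 10 connections per host.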

1. Log on as root to access the config file.

2. Open the SSH configuration file (/etc/ssh/sshd_config) in a text editor.

3. Locate the #MaxStartups parameter.

4. Remove the comment character (#) and increase the value from the default of 10 to a number greater than the number of hosts that will connect.

5. Save the file.

6. Reload the file using the following command:

sudo /etc/init.d/sshd reload

7. Exit from root.

 

6 Configuring Hard Link Local Backup Hosts

When specifying the backupHost parameter for your hard link local configuration files, use the database host names (or IP addresses) as known to Admintools, rather than the node names. Host names (or IP addresses) are what you used when setting up the cluster. Do not use localhost for the backupHost parameter.

6.1 Listing Host Names

select node_name, host_name from node_resources;

7 Creating vbr.py Configuration Files

The vbr.py utility uses a configuration file to back up and restore a full or object-level backup, or to copy a cluster.

Note: You must be logged on as dbadmin, not root, to create the vbr configuration file.

To create a configuration file:

# Run as the dbadmin user.

/opt/vertica/bin/vbr.py --setupconfig

For example:

[dbadmin@localhost ~]$ /opt/vertica/bin/vbr.py --setupconfig

Snapshot name (backup_snapshot): fullbak

Backup vertica configurations? (n) [y/n]: y

Number of restore points (1): 3

Specify objects (no default):

Vertica user name (dbadmin):

Save password to avoid runtime prompt? (n) [y/n]: y

Password to save in vbr config file (no default):

Node v_vmart_node0001

Backup host name (no default): 127.0.0.1

Backup directory (no default): /home/dbadmin/backups

Config file name (fullbak1.ini):

Change advanced settings? (n) [y/n]: n

Saved vbr configuration to fullbak1.ini.

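7.1 Specifying a Backup Name

Snapshot name (backup_snapshot):

The name cannot be blank. You do not need to add a timestamp to the backup name.

Full and object-level backups use different configuration files but the same backup directory.

For example, the configuration file for a full database backup, called fullbak.ini, would have these snapshotName and backupDir parameter values: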

snapshotName=fullbak

backupDir=/home/dbadmin/data/backups

 

The configuration file for the object-level backup, called objectbak.ini, would have these parameter values:

snapshotName=objectbak

backupDir=/home/dbadmin/data/backups

7.2 Backing Up the Vertica Configuration File

Always back up the database configuration file, vertica.conf.

Enter y to include a copy of the vertica.conf file in the backup. The file is not saved by default:

Backup vertica configurations? (n) [y/n]:

The vertica.conf file contains configuration parameters you have changed. The file is stored in the catalog directory. Press Enter to accept the default, or n to omit the file explicitly.

7.3 Saving Multiple Restore Points

Number of restore points (1):

Specifies the number of restore points to keep; the default is 1.

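7.4 Choosing Full or Object-Level Backups

Specify objects (no default):

Enter table names in the form schema.objectname.

To enter several table or schema names, separate them with commas.

The names you enter are listed in the Objects parameter of the configuration file.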

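7.5 Entering the User Name

Enter the name of the user who will invoke vbr.py:

Vertica user name (dbadmin):

7.6 Saving the Account Password

Choose whether vbr.py prompts for the password each time it runs:

Save password to avoid runtime prompt? (n) [y/n]: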

 

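7.7 Specifying the Backup Host and Directory

The utility lists each node in the cluster; enter a backup host and directory for each node:

Node v_vmart_node0001

Backup host name (no default):

Backup directory (no default):

The values you enter are saved in the [Mapping] section of the configuration file. For example: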

[Mapping]

v_vmart_node0001 = 127.0.0.1:/home/dbadmin/backups
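The [Mapping] entries follow ordinary ini syntax, so they can be read back with Python's standard configparser. The sketch below is only an illustration of the host:directory format shown above and is not part of vbr.py. The helper name read_mapping is hypothetical, and splitting on the first colon assumes the backup host is a host name or an IPv4 address.

```python
import configparser

SAMPLE = """
[Mapping]
v_vmart_node0001 = 127.0.0.1:/home/dbadmin/backups
"""


def read_mapping(text):
    """Return {node_name: (backup_host, backup_dir)} from vbr-style ini text."""
    # interpolation=None keeps literal '%' characters (e.g. in passwords) intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    mapping = {}
    for node, target in parser["Mapping"].items():
        # Split on the first colon only: host on the left, directory on the right.
        host, _, directory = target.partition(":")
        mapping[node] = (host, directory)
    return mapping


if __name__ == "__main__":
    for node, (host, directory) in read_mapping(SAMPLE).items():
        print(f"{node} backs up to {host} under {directory}")
```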

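7.8 Saving the Configuration File

Enter a name for the configuration file:

Config file name (fullbak.ini):

Note: the configuration file is saved on the database cluster by default, so copy it to the backup host as well to guard against losing it.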

7.9 Continuing to Advanced Settings

Change advanced settings? (n) [y/n]:

7.10 Sample Configuration File

[Misc]

; Section headings are enclosed by square brackets.

; Comments have leading semicolons.

; Option and values are separated by an equal sign.

snapshotName = exampleBackup

; For simplicity, use the same temp directory location on

; all backup hosts. The utility must be able to write to this

; directory.

tempDir = /tmp/vbr

; Vertica binary directory should be the location of

; vsql & bootstrap. By default it's /opt/vertica/bin

;verticaBinDir =

; include vertica configuration in the backup

verticaConfig = True

; how many times to retry operations if some error occurs.

retryCount = 5

retryDelay = 1

restorePointLimit = 5

[Database]

; db parameters

dbName = exampleDB

dbUser = dbadmin

dbPassword = password

; if this parameter is True, vbr will prompt user for db password every time

dbPromptForPassword = False

[Transmission]

encrypt = False

checksum = False

port_rsync = 50000

; total bandwidth limit for all backup connections in KBPS, 0 for unlimited

total_bwlimit_backup = 0

; total number of backup connections, -u for unlimited

concurrency_backup = 0

; total bandwidth limit for all restore connections in KBPS, 0 for unlimited

total_bwlimit_restore = 0

; total number of restore connections, -u for unlimited

concurrency_restore = 0

[Mapping]

; backupDir ignored for copy cluster task

v_exampledb_node0001 = backup01:/home/dbadmin/backups

v_exampledb_node0002 = backup02:/home/dbadmin/backups

v_exampledb_node0003 = backup03:/home/dbadmin/backups

 

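7.11 Changing the Overwrite Parameter Value

For object-level backups, if the configuration file already exists, you can add the overwrite parameter to it.

7.12 Configuring Required VBR Parameters

/opt/vertica/bin/vbr.py --setupconfig

Example: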

> /opt/vertica/bin/vbr.py --setupconfig

Snapshot name (snapshotName): ExampleBackup

Backup Vertica configurations? (n) [y/n] y

Number of restore points? (1): 5

Specify objects (no default): dim, dim2

Vertica user name (current_user): dbadmin

Save password to avoid runtime prompt? (n)[y/n]: y

Password to save in vbr config file (no default): mypw234

Node v_example_node0001

Backup host name (no default): backup01

Backup directory (no default): /home/dbadmin/backups

Node v_exampledb_node0002

Backup host name: backup02

Backup directory: /home/dbadmin/backups

Node v_exampledb_node0003

Backup host name: backup03

Backup directory: /home/dbadmin/backups

Config file name: exampleBackup.ini

Change advanced settings? (n)[y/n]: n


7.13 Configuring Advanced VBR Parameters

port_rsync: The port number for the rsync daemon. The default value is 50000.

retryCount: The number of times to retry if a connection attempt fails. The default value is 2.

tempDir: The directory on the backup hosts where vbr.py writes its log files.

total_bwlimit_backup: The total bandwidth limit in KBps for backup connections.

total_bwlimit_restore: The total bandwidth limit in KBps for restore connections.

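7.14 Configuring the Hard Link Local Parameter

Add the parameter manually:

[Transmission]

hardLinkLocal = True

If the configuration file already contains advanced parameters, the section looks like this: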

[Transmission]

encrypt = False

checksum = False

port_rsync = 50000

total_bwlimit_backup = 0

total_bwlimit_restore = 0

hardLinkLocal = True

 

8 Using Hard File Link Local Backups

Hard link local backups, full or object-level, are created locally on the database hosts. Compared with remote backups, they have these advantages:

A. Speed: in a hard link local backup, vbr.py does not copy files (as long as the backup directory exists on the same file system as the database catalog and data directories).

B. Reduced network activity.

C. Less disk space, since the backup includes a copy of the catalog and hard file links. However, since a hard link local backup saves a full copy of the catalog each time you run vbr.py, the disk size will increase with the catalog size over time.

A typical use case is the development and design phase: back up a schema or table, and restore it quickly if the new development work does not succeed.

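9 Creating Full and Incremental Backups

Prerequisites:

A.    The database is running. Nodes that are down are not backed up.

B.    All backup hosts are up and available.

Run the vbr.py script on the initiator node of the database cluster as the database administrator, not as root.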

9.1 Running vbr.py Without Optional Commands

Running vbr.py requires only the following options:

• --task backup

• --config-file config_file

For example:

> vbr.py --task backup --config-file myconfig.ini

Copying...

Enter vertica password for user dbadmin:

Preparing...

Found Database port: 5433

[==================================================] 100%

All child processes terminated successfully.

Committing changes on all backup sites...

backup done!

9.2 Best Practices for Creating Backups

a.     Create separate configuration files to create full- and object-level backups

b.     Use the same backup host directory location for both kinds of backups

c.     For best network performance, use one backup host per cluster node

d.     Use one directory on each backup-node to store successive backups

e.     For future reference, append the major Vertica version number to the configuration file name (mybackup76x)

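9.3 Object-Level Backups

9.4 Backup Locations and Storage

In the backup directory, vbr.py creates a subdirectory for each database node, and beneath it a subdirectory for each backup name.

The backup name is the value of the snapshotName parameter in the configuration file.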

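9.5 Saving Incremental Backups

When you run vbr.py again with the same configuration file, it automatically creates an incremental backup.

9.6 When vbr.py Deletes Older Backups

Running the vbr.py utility with the --task backup command deletes the oldest backup whenever the total number that exist exceeds the restorePointLimit value in the configuration file.

For example, with restorePointLimit = 5, once five backups already exist, the next backup causes the oldest one to be deleted.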

10 Backup Directory Structure and Contents

Multiple Restore Points

11 Creating Object-Level Backups

Note (limitation):

if you create an object-level backup containing two schemas, schema1 and

schema2, and later want to restore schema1, you cannot do so without also restoring schema2.

To restore a single object, you can create a single object backup.

 

11.1 Invoking vbr.py Backup

vbr.py --task backup --config-file objectbak.ini

11.2 Backup Locations and Naming

Full and object-level backups share one top-level directory.

Note: You must use unique names for full- and object-level backups stored at one location. Otherwise, creating successive backups will overwrite the previous version.

11.3 Best Practices for Object-Level Backups

• Create one configuration file for each object-level backup

• Create a different configuration file to create a full database backup

• For best network performance, use one backup host per cluster node

• Use one directory on each backup-node to store successive backups

• For future reference, append the major Vertica version number to the configuration file name (mybackup7x)

11.4 Naming Conventions

AIR1_daily_arrivals_snapshot

AIR2_hourly_arrivals_snapshot

AIR2_hourly_departures_snapshot

AIR3_daily_departures_snapshot

11.5 Creating Backups Concurrently

The vbr.py utility currently permits only one instance of the backup script per initiator node.

• Assign one initiator node to create backup for a given tenant.

• Give each object backup initiated on a different node a unique backup name.

• Start the backup script on different initiator nodes to create their specific tenant backups concurrently.

11.6 Determining Backup Frequency

Always take backups after any event that significantly modifies the database.

11.7 Understanding Object-Level Backup Contents

An object-level backup includes:

a.     Storage: Data files belonging to the specified object(s)

b.     Metadata: Including the cluster topology, timestamp, epoch, AHM, and so on

c.     Catalog snippet: persistent catalog objects serialized into the principal and dependent objects

 

11.8 Making Changes After an Object-Level Backup

After creating an object-level backup, dropping schemas and tables from the database means the objects will also be dropped from subsequent backups. If you do not save an archive of the object backup, such objects could be lost permanently.

Changing a table name after creating a table backup will not persist after restoring the backup.

If you drop a user after a backup, and the user is the owner of any selected or dependent objects, restoring the backup also restores the user.

To restore a dropped table from a backup:

1. Rename the newly created table from t1 to t2. (The OIDs differ, and now the names do too.)

2. Restore the backup containing t1.

3. Restore t1. Tables t1 and t2 now coexist.

 

 

 

11.9 Understanding the Overwrite Parameter

The overwrite parameter determines what happens when restoring an object-level backup encounters objects with the same OID.

Case 1:

1. Create an object backup of mytable.

2. After the backup, rename mytable to mytable2.

3. Restoring the backup causes mytable to overwrite mytable2 (Overwrite=true).

The table names differ, but the OIDs are the same.

Case 2: creating a new table of the same name (with a different OID) is handled differently. The restore fails with an error: the table names match but the OIDs differ, so the overwrite mechanism does not apply.

1. Create a table backup of mytable.

2. Drop mytable.

3. Create a new table, called mytable.

4. Restoring the backup does NOT overwrite the table, and causes an error, since there is an OID conflict.
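The two cases above can be captured in a small toy model. This is only a sketch of the behavior as described in this section with Overwrite=true, not vbr.py's actual implementation. The catalog is modeled as a plain dict from OID to table name, and restore_table and RestoreConflict are hypothetical names.

```python
class RestoreConflict(Exception):
    """Raised when a backed-up table clashes with a different live table."""


def restore_table(catalog, oid, name):
    """Apply one backed-up table (oid, name) to catalog, a dict of oid -> name.

    Toy model of the cases described above with Overwrite=true.
    """
    if oid in catalog:
        # Same object, possibly renamed since the backup: overwrite it.
        catalog[oid] = name
        return
    if name in catalog.values():
        # Same name but a different OID: the restore is refused.
        raise RestoreConflict(f"table {name!r} exists with a different OID")
    # The table no longer exists in the database: restore it.
    catalog[oid] = name


if __name__ == "__main__":
    renamed = {101: "mytable2"}          # case 1: renamed after the backup
    restore_table(renamed, 101, "mytable")
    print(renamed)

    recreated = {202: "mytable"}         # case 2: dropped and recreated
    try:
        restore_table(recreated, 101, "mytable")
    except RestoreConflict as err:
        print("restore failed:", err)
```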

11.10 Changing Principal and Dependent Objects

【1】

If you drop the owner of a table included in a backup, restoring the snapshot recreates the user, along with any permissions that existed when the backup was taken.

【2】

When a parent object is restored, its child objects are dropped and then restored together with it.

11.11 Considering Constraint References

All database objects involved in a related constraint must be backed up together.

For example, a schema with tables whose constraints reference only tables in the same schema can be backed up, but a schema containing a table with an FK/PK constraint on a table in another schema cannot, unless you include the other schema in the list of selected objects.

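11.12 Configuration Files for Object-Level Backups

By default, vbr.py configurations use different backup names but the same backup directory.

A common setup is one configuration file for the whole cluster and several object-level configuration files, all pointing to the same location.

The benefit of a shared location is that vbr.py takes extra steps to ensure that the object-level backups remain usable after a full database restore.

Note: Attempting to restore a full database using an object-level configuration file fails, resulting in this error: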

VMart=> /tmp/vbr.py --config-file=Table2.ini -t restore

Preparing...

Invalid metadata file. Cannot restore.

11.13 Backup Epochs

Each backup includes the epoch to which its contents can be restored. This epoch will be close to the latest epoch, though the epoch could change during the time the backup is being written. If an epoch change occurs while a backup is being written, the storage is split to indicate the different epochs.

The vbr.py utility attempts to create an object-level backup five times before an error occurs and the backup fails.

 

11.14 Maximum Number of Backups

Each backup location can hold a maximum of 500 backups.

This maximum is set by rsync, and does not include archives, so the total number of saved backups at the same location can exceed 500.

For example, if a database has 500 schemas, S1 – S499, the full database, including archives of earlier snapshots, can be backed up along with backups for each schema.

12 Creating Hard Link Local Backups

Make sure that:

The database is running.

The user, such as dbadmin, has read and write permission on the backup directory.

When you create a full- or object-level hard link local backup, these are the backup contents:

Backup | Catalog | Database files

Full backup | Full copy | Hard file links to all database files

Object-level backup | Full copy | Hard file links for all objects

For example:

/opt/vertica/bin/vbr.py --task backup --config fullbak.ini

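12.1 Specifying the Hard Link Local Backup Location

If hardLinkLocal=True but the backup directory is on a different node, vbr.py reports an error and stops the backup.

12.2 Creating Hard Link Local Backups for Tape Storage

【1】Create a configuration file:

/opt/vertica/bin/vbr.py --setupconfig

【2】Edit the configuration file:

Make sure the [Transmission] section contains hardLinkLocal=True.

【3】Run the backup:

/opt/vertica/bin/vbr.py --task backup --config-file localbak.ini

【4】Copy the hard link local backup directory to other external storage media.

【5】When you need to restore the database, copy the backup files back to their original backup directory.

【6】Restore the data using the same configuration file:

/opt/vertica/bin/vbr.py --task restore --config-file localbak.ini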

 

13 Interrupting the Backup Utility

To cancel a backup, use Ctrl+C or send a SIGINT to the Python process running the backup utility.

After an interruption:

【1】    The files already backed up remain in place.

【2】    The next backup process picks up where the interrupted process left off.

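14 Viewing Backups

【1】        Use vbr.py to list backups.

【2】        Monitor backups through the DATABASE_SNAPSHOTS system table.

【3】     View historical backup information in the DATABASE_BACKUPS system table.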

14.1 Listing Backups with vbr.py

[dbadmin@node01 temp]$ /opt/vertica/bin/vbr.py --task listbackup --config-file /home/dbadmin/table2bak.ini

The output includes:

the backup OID, the epoch, which object(s) (if applicable) were backed up, and the backup host (192.168.223.33) used for each node.

14.2 Monitoring database_snapshots

VMart=> select * from database_snapshots;

node_name | snapshot_name | is_durable_snapshot | total_size_bytes | storage_cost

14.3 Monitoring database_backups

Note: Do not use the backup_timestamp value when restoring an archive.

VMart=> select * from v_monitor.database_backups;

15 Restoring Full Database Backups

Requirements:

【1】The database is down.

【2】The node names and the IP addresses must also be identical.

【3】The database must already exist on the cluster.

【4】If the database name matches the name in the backup, and all of the node names match the names of the nodes in the configuration file, you can restore to it.

15.1 Restoring the Most Recent Backup

Run the restore as the database administrator, not as root.

The following example uses the db.ini configuration file, which includes the superuser's password:

> vbr.py --task restore --config-file db.ini

Copying...

1871652633 out of 1871652633, 100%

All child processes terminated successfully.

restore done!

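15.2 Restoring an Archive

When you have saved multiple backups, you can restore from a specific archive.

List all the archives and choose one to restore. Note that vbr.py --task listbackup also requires a configuration file.

Pass the --archive parameter with the date_timestamp suffix of the archive directory name, for example:

> vbr.py --task restore --config-file fullbak.ini --archive=20121111_205841

The --archive parameter identifies the archive subdirectory, and the OID identifies the backup within the archive.

An archive can comprise multiple snapshots, including both full and object-level backups.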
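When several archives exist, it can help to sort their date_timestamp suffixes before choosing one to pass to --archive. The sketch below assumes the suffix format is YYYYMMDD_HHMMSS, inferred from the example value 20121111_205841. The helper names and the extra sample values are hypothetical.

```python
from datetime import datetime

# Assumed format, inferred from the example value 20121111_205841.
ARCHIVE_FORMAT = "%Y%m%d_%H%M%S"


def parse_archive_id(archive_id):
    """Convert an archive suffix such as 20121111_205841 into a datetime."""
    return datetime.strptime(archive_id, ARCHIVE_FORMAT)


def newest_archive(archive_ids):
    """Return the most recent archive suffix from an iterable of suffixes."""
    return max(archive_ids, key=parse_archive_id)


if __name__ == "__main__":
    # Hypothetical suffixes, as they might be read from listbackup output.
    archives = ["20121111_205841", "20121030_235959", "20121112_010000"]
    print("restore with --archive=" + newest_archive(archives))
```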

 

15.3 Attempting to Restore a Node That Is UP

For a full database restore, the node must be DOWN.

If the node is UP, vbr.py reports the following:

doc:tests/doc/tools $ vbr.py --config-file=doc.ini -t restore --nodes=v_doc_node0001

Warning: trying to restore to an UP cluster

Warning: Node state of v_doc_node0001 is UP; node must be DOWN for restore; ignoring

restore on this node.

Nothing to do

restore done!

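15.4 Attempting to Restore to an Alternate Cluster

To restore a full database, the node names and IP addresses must be exactly the same as when the backup was taken.

The vbr.py utility does NOT support restoring a full database backup to an alternate cluster with different host names and IP addresses.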

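16 Restoring Object-Level Backups

The database must be running when you restore database objects.

You cannot restore an object-level backup into an empty database.

You can only restore an object-level backup in its entirety; you cannot restore part of it.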

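16.1 Backup Locations

Store full and object-level backups in the same location; if they are stored in different locations, restoring an object-level backup after a full database restore fails.

Note: Using different backup locations in which to create full- or object-level backups results in incompatible object-level backups. Attempting to restore an object-level backup after restoring a full database will fail.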

 

16.2 Cluster Requirements for Object-Level Restore

Store full and object-level backups in the same location. With the database down, restore the full backup; then start the database and restore the object-level backups.

Regardless of the node states when a backup was taken, you do not have to manage node states before restoring an object-level backup. Any node that is DOWN when you restore an object-level backup is updated when the node rejoins the cluster and comes UP after an object-level restore.

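16.3 Restoring Objects After Cluster Topology Changes

After nodes are added to the cluster, vbr.py still supports object-level restores, and the newly added nodes are updated as well.

However, vbr.py does not support restoring after nodes are removed, renamed, or given new IP addresses.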

 

16.4 Projection Epoch After Restore

All object-level backup and restore events are treated as DDL events.

If a table does not participate in an object-level backup, possibly due to a node being down, restoring the backup has this effect on the projection:

• Its epoch is reset to 0

• It must recover any data that it does not have (by comparing epochs and other recovery procedures)

16.5 Catalog Locks During Backup Restore

HP Vertica transactions follow strict locking protocols to maintain data integrity.

When restoring an object-level backup into a cluster that is UP, vbr.py begins by copying data and managing storage containers, potentially splitting the containers if necessary. While copying and managing data is the largest task in any restore process, it does not require any database locks.

After completing data copying tasks, vbr.py first requires a table object lock (O-lock), and then a global catalog lock (GCLX).

If a DML statement from another database process holds an O-lock on the table, vbr.py is blocked from progress until the DML statement completes and releases the lock.

The GCLX ensures the consistency and integrity of the catalog data.

Database system operations, such as the tuple mover (TM) transferring data from memory to disk, are canceled to permit object-level restore to complete.

 

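16.6 Catalog Restore Events

Each object-level backup contains a portion of the database catalog, called a snippet, which includes the selected, dependent, and principal objects. Because a catalog snippet has the same structure as the database catalog, restoring objects also updates the local and global catalogs with the snippet.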

17 Restoring Hard Link Local Backups

If you have created both full- and object-level backups and the database fails, first restore the full database backup. You can then restore from the object-level backups.

17.1 Avoiding OID and Epoch Conflicts

If you create full- and object-level backups in the same backup directory (recommended), then when you restore a full backup, vbr.py determines the latest OID and epoch of the object-level backups as well.

For example:

1. Create a full hard link local backup in backup directory /home/dbadmin/backups, with configuration file mybak.ini:

[dbadmin@node01 ~]$ /opt/vertica/bin/vbr.py --task backup --config-file mybak.ini

2. Create an object-level hard link local backup for Table1 using the same backup directory, with configuration file table1bak.ini:

[dbadmin@node01 ~]$ /opt/vertica/bin/vbr.py --task backup --config-file table1bak.ini

3. Create an object-level hard link local backup for Table2, using the same backup directory, with configuration file table2bak.ini.

[dbadmin@node01 ~]$ /opt/vertica/bin/vbr.py --task backup --config-file table2bak.ini

[Figure: backup directory layout after these three backups]

When the full database is restored, the following happens:

【1】

vbr.py detects the maximum object ID (OID) and epochs, and updates them in the restored database. This prevents OID and epoch conflicts from occurring when object-level backups are restored after the newly restored database.

【2】

If the database and object-level table backups are not in the same backup directory, vbr.py reverts the maximum OID and epoch for a full database restore back to the table OID and epoch. Further attempts to restore either table1 or table2 backups will then fail, preventing any potential conflicts.

 

17.2 Transferring Backups to and From Remote Storage

Hard link local backups can be transferred to another storage media, such as tape.

Complete the following steps to restore hard link local backups from external media:

1. If the original backup directory no longer exists on one or more local backup host nodes, recreate the directory. The directory structure into which you restore hard link backup files must be identical to what existed when the backup was created. For example, if you created hard link local backups at the following backup directory, then recreate that directory structure:

/home/dbadmin/backups/localbak

2. Copy the backup files to their original backup directory, as specified for each node in the configuration file with the backupHost and backupDir parameters. For example, this configuration file shows the backupDir parameter for v_vmart_node0001:

[Mapping0]

dbNode = v_vmart_node0001

backupHost = node03

backupDir = /home/dbadmin/backups/localbak

3. To restore the latest version of the backup, move the backup files to this directory:

/home/dbadmin/backups/localbak/node_name/snapshotname

4. To restore a different backup version, move the backup files to this directory:

/home/dbadmin/backups/localbak/node_name/snapshotname_archivedate_timestamp

5. When the backup files are returned to their original backup directory, use the original configuration file to invoke vbr.py as follows:

> /opt/vertica/bin/vbr.py --task restore --config-file localbak.ini

If the physical files are restored from tape into the correct directories, and you use the configuration file that specifies hardLinkLocal = true, restoring the backup succeeds.
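The directory names in steps 3 and 4 can be assembled programmatically when scripting a copy back from tape. The following sketch is a hypothetical helper, not part of vbr.py. It assumes the archive directory name is the snapshot name followed by "_archive" and the date_timestamp, which is how the pattern snapshotname_archivedate_timestamp above is read here.

```python
import posixpath


def restore_target_dir(backup_dir, node_name, snapshot_name, archive_id=None):
    """Return the directory into which backup files must be copied back.

    With archive_id=None the path is that of the latest backup (step 3);
    otherwise it is that of the named archive (step 4). The "_archive"
    infix is an assumption based on the pattern shown above.
    """
    if archive_id is None:
        leaf = snapshot_name
    else:
        leaf = f"{snapshot_name}_archive{archive_id}"
    return posixpath.join(backup_dir, node_name, leaf)


if __name__ == "__main__":
    base = "/home/dbadmin/backups/localbak"
    print(restore_target_dir(base, "v_vmart_node0001", "localbak"))
    print(restore_target_dir(base, "v_vmart_node0001", "localbak", "20121111_205841"))
```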

 

18 Restoring a Database to the Same Cluster

The general steps for restoring a full database backup to a database on the same cluster are:

1. Stopping the database you intend to restore.

Note: If you restore data to a single node, the node has already stopped. You do not need to stop the database.

2. Restoring the Database (vbr.py).

3. Starting the database.

Note: If you restored all the nodes using the backup, you are using a manual recovery method. For more information, see Failure Recovery. Administration Tools returns the message, "Database startup failed," after you attempt to restart the database and then offers to restart the database from an earlier epoch. Click Yes.

4. After the database starts, connect to it through the Administration Tools or MC and verify that it was successfully restored by running some queries.

19 Removing Backups

1.       Use vbr.py --task listbackup to identify the backup and archive names that reside on the local or remote backup host (requires a configuration file).

2.      Delete the backup directories.

19.1 Deleting Backup Directories

[dbadmin@node01 temp]$ /opt/vertica/bin/vbr.py --task listbackup --config-file /home/dbadmin/table2bak.ini

 

snapshotName

 backupHost

 backupDir

dbNode

 

3. Connect to the backup host.

4. Navigate to the backup directory (the backupDir parameter value in the configuration file).

5. To delete all backups, delete the top-level snapshot directory

-or-

6. To delete an archive, navigate to the subdirectory for each database node, located below the top-level snapshot directory, and delete the archive directory.

 

20 Copying the Database to Another Cluster

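For example, you can copy a database between a production environment and a development environment.

The storage paths of the HP Vertica catalog, data, and temp directories must be identical on the target and source databases. To view the directory layout of the source database:

VMart=> \x

Expanded display is on.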

VMart=> select node_name,storage_path, storage_usage from disk_storage;

-[ RECORD 1 ]-+-----------------------------------------------------

node_name | v_vmart_node0001

storage_path | /home/dbadmin/VMart/v_vmart_node0001_catalog/Catalog

storage_usage | CATALOG

-[ RECORD 2 ]-+-----------------------------------------------------

node_name | v_vmart_node0001

storage_path | /home/dbadmin/VMart/v_vmart_node0001_data

storage_usage | DATA,TEMP

-[ RECORD 3 ]-+-----------------------------------------------------

node_name | v_vmart_node0001

storage_path | /home/dbadmin/SSD/schemas

storage_usage | DATA

-[ RECORD 4 ]-+-----------------------------------------------------

node_name | v_vmart_node0001

storage_path | /home/dbadmin/SSD/tables

storage_usage | DATA

-[ RECORD 5 ]-+-----------------------------------------------------

node_name | v_vmart_node0001

storage_path | /home/dbadmin/SSD/schemas

storage_usage | DATA

-[ RECORD 6 ]-+-----------------------------------------------------

node_name | v_vmart_node0002

storage_path | /home/dbadmin/VMart/v_vmart_node0002_catalog/Catalog

storage_usage | CATALOG

-[ RECORD 7 ]-+-----------------------------------------------------

node_name | v_vmart_node0002

storage_path | /home/dbadmin/VMart/v_vmart_node0002_data

storage_usage | DATA,TEMP

-[ RECORD 8 ]-+-----------------------------------------------------

node_name | v_vmart_node0002

storage_path | /home/dbadmin/SSD/tables

storage_usage | DATA

20.1 Identifying Node Names for Target Cluster

select node_name from nodes;

You need to know the exact names that Admintools supplied to all nodes in the source database before configuring the target cluster.

To run Admintools from the command line, enter a command such as this for the VMart database:

$ /opt/vertica/bin/admintools -t node_map -d VMART

20.2 Configuring the Target Cluster

The target cluster must:

• Have the same number of nodes as the source cluster.

• Have a database with the same name as the source database. The target database can be completely empty.

• Have the same node names as the source cluster. The node names listed in the NODES system tables on both clusters must match.

• Be accessible from the source cluster.

• Have the same database administrator account, and all nodes must allow a database administrator of the source cluster to log in through SSH without a password.

Note: You need to configure each host in the target cluster to accept the SSH authentication of the source cluster.

• Have adequate disk space for the vbr.py --task copycluster command to complete.
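Some of the requirements above can be checked mechanically before running copycluster. The following sketch compares node names (for example, the output of select node_name from nodes on each cluster) and database names. The function name and the exact-match comparison of database names are assumptions made for illustration. SSH access and disk space still have to be verified separately.

```python
def copycluster_precheck(source_nodes, target_nodes, source_db, target_db):
    """Return a list of problems that would block copycluster (empty if none)."""
    problems = []
    if len(source_nodes) != len(target_nodes):
        problems.append(
            f"node count differs: {len(source_nodes)} vs {len(target_nodes)}"
        )
    # Every source node name must also exist on the target cluster.
    missing = sorted(set(source_nodes) - set(target_nodes))
    if missing:
        problems.append("node names missing on target: " + ", ".join(missing))
    if source_db != target_db:
        problems.append(f"database name differs: {source_db} vs {target_db}")
    return problems


if __name__ == "__main__":
    source = ["v_vmart_node0001", "v_vmart_node0002", "v_vmart_node0003"]
    target = ["v_vmart_node0001", "v_vmart_node0002"]
    for problem in copycluster_precheck(source, target, "vmart", "vmart"):
        print("BLOCKER:", problem)
```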

 

20.3 Creating a Configuration File for CopyCluster

You cannot use an object-level backup with the copycluster command. You must use a full database backup.

 

[Misc]

snapshotName = CopyVmart

verticaConfig = False

restorePointLimit = 5

tempDir = /tmp/vbr

retryCount = 5

retryDelay = 1

[Database]

dbName = vmart

dbUser = dbadmin

dbPassword = password

dbPromptForPassword = False

[Transmission]

encrypt = False

checksum = False

port_rsync = 50000

[Mapping]

; backupDir is not used for cluster copy

v_vmart_node0001= test-host01:/home/dbadmin/backups

v_vmart_node0002= test-host02:/home/dbadmin/backups

v_vmart_node0003= test-host03:/home/dbadmin/backups

20.4 Copying the Database

The target cluster must be stopped before you invoke copycluster.

To copy the cluster, run vbr.py from a node in the source database using the database administrator account, passing the --task copycluster --config-file CopyVmart.ini command.

> vbr.py --config-file CopyVmart.ini --task copycluster

Copying...

1871652633 out of 1871652633, 100%

All child processes terminated successfully.

copycluster done!

21 Backup and Restore Utility Reference

21.1 VBR Utility Reference

Syntax:

/opt/vertica/bin/vbr.py { command }

... [ --archive file ]

... [ --config-file file ]

... [ --debug level ]

... [ --nodes node1 [, noden, ...] ]

... [ --showconfig ]

【1】

--setupconfig

【2】

--task { backup | copycluster | listbackup | restore }

• backup creates a full-database, or object-level backup, depending on what you have specified in the configuration file.

• copycluster copies the database to another HP Vertica cluster.

• listbackup displays the existing backups associated with the configuration file you supply. Use the information in this display to get the name of a snapshot you want to restore.

• restore restores a full- or object-level database backup. Requires a configuration file.

【3】

--archive file  Used with --task backup or --task restore commands

【4】

--config-file file

【5】

--nodes node1[,...]

【6】

--debug level

【7】

--showconfig

21.2 VBR Configuration File Reference

21.2.1 [Misc] Miscellaneous Settings

【1】snapshotName

【2】tempDir /tmp

【3】verticaBinDir /opt/vertica/bin

【4】verticaConfig False

Indicates whether the HP Vertica configuration file is included in the backup, in addition to the database data.

【5】    restorePointLimit 1

【6】    overwrite True

【7】    retryCount 2

【8】    retryDelay 1

21.2.2 [Database] Database Access Settings

21.2.3 [Transmission] Data Transmission During Backup

port_rsync 50000

hardLinkLocal False

port_ssh_backup 22

21.2.4 [Mapping]

[Mapping]

v_vmart_node0001 = 127.0.0.1:/home/dbadmin/backups

 

 

Reposts are welcome; please credit the original: blog.csdn.net/clark_xu (徐长亮的专栏)

 
