Backing Up and Restoring the Database
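HP Vertica provides a comprehensive utility, the vbr.py Python script, that can back up and restore the database, list backups, and copy the database to another cluster. It also supports object-level backups of schemas and tables. For the whole database, you can create full or incremental backups. Once a full backup exists, you can restore the entire database, or restore one or more database objects. vbr.py can store backups in the following locations:
A. A local directory on the nodes in the cluster;
B. One or more hosts outside the cluster;
C. A different HP Vertica cluster (which effectively copies the database).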
1 Compatibility Requirements
You cannot restore a version 6.x backup to a version 7.x database.
HP Vertica does support restores within the same major release.
For example, you can restore a version 7.0 backup to a version 7.1 database.
2 Automating Regular Backups
Put the vbr.py command and its arguments in a script file, then use Linux cron to schedule the backup.
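A minimal sketch of such a script, written as a small Python wrapper that assembles the vbr.py command line. The configuration file path and the crontab schedule shown in the comments are illustrative assumptions; only the --task and --config-file options come from these notes.

```python
import subprocess

VBR = "/opt/vertica/bin/vbr.py"


def build_backup_command(config_file):
    """Return the vbr.py argument list for a backup run."""
    return [VBR, "--task", "backup", "--config-file", config_file]


def run_backup(config_file):
    """Run the backup and return the exit status of vbr.py."""
    return subprocess.call(build_backup_command(config_file))


# Hypothetical crontab entry for the dbadmin user (02:00 every night):
#   0 2 * * * /usr/bin/python /home/dbadmin/run_backup.py
# The configuration file path below is an assumed example.
print(" ".join(build_backup_command("/home/dbadmin/fullbak.ini")))
```

For an unattended run, the configuration file would need to store the password, since nobody is present to answer a runtime prompt. A real cron script would call run_backup() instead of printing the command.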
3 Backup Types
vbr.py supports three types of backups:
a. Full backups
b. Object-level backups
c. Hard link local backups
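You identify a full or object-level backup with a descriptive name of your own choosing, such as FullDBSnap, Schema1Bak, or Table1Bak.
Note: Recovery is mainly about restoring data consistency, for example by applying archive logs and redo. Restore simply puts the files back.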
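3.1 Full Backups
A full database backup includes the database catalog, schemas, tables, and other objects. It is a consistent image, or snapshot, of the database at the time of the backup. In disaster recovery, you can use a full backup to restore a database that is incomplete or corrupted.
Once a full backup exists, vbr.py creates a series of successive snapshots after it that record the changes to the data.
An archive consists of a series of backups that share the same name.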
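3.2 Object-Level Backups
An object-level backup contains one or more schemas or tables. Once an object-level backup exists, you can restore its entire contents, but you cannot restore a single object from within it.
Note: HP Vertica does not support object-level backups on Hadoop Distributed File System (HDFS) storage.
Object-level backups involve the following kinds of objects:
Selected objects: The objects you choose to include in an object-level backup, such as tables T1 and T2.
Dependent objects: Objects that must be part of the backup because of dependencies. For example, if you back up a table that has a foreign key, vbr.py automatically includes the table holding the referenced primary key because of the table constraint. Projections are also dependent objects.
Principal objects: The objects on which both selected and dependent objects depend. For example, each table and projection has an owner, and each is a principal object.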
3.3 Hard Link Local Backups
A hard link local backup consists of a copy of the database catalog plus a set of hard file links that correspond to the data files.
4 When to Back Up the Database
Before you upgrade HP Vertica to another release.
Before you drop a partition.
After you load a large volume of data.
Before and after you add, remove, or replace nodes in your database cluster.
After recovering a cluster from a crash.
Note: When you restore a full database backup, you must restore to a cluster that is identical to the one on which you created the backup. For this reason, always create a new full backup after adding, removing, or replacing nodes.
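5 Configuring Backup Hosts
The vbr.py configuration file specifies which backup host each node in the cluster backs up to. Each backup host must meet these requirements:
A. It has enough disk space for the backups.
B. The cluster can reach it over SSH.
C. The database administrator account has password-less SSH access to it.
D. Its Python and rsync versions match those installed by the HP Vertica installer.
5.1 Creating Configuration Files for Backup Hosts
Note: For optimal network performance when creating a backup, HP Vertica recommends having each node in the cluster use a dedicated backup host.
Create separate, differently named configuration files for full and object-level backups, and have them use the same nodes, backup hosts, and backup directories. Store the backups of only one database in any given backup directory.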
5.2 Estimating Backup Host Disk Requirements
First account for the disk space that incremental backups need. If you keep multiple archives, the required capacity increases further.
HP Vertica recommends that each backup host have space for at least twice the database footprint size.
To check the disk space in use:
select sum(used_bytes) from storage_containers where node_name='v_mydb_node0001';
or
select node_name, sum(used_bytes) as size_in_bytes from v_monitor.storage_containers group by node_name;
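5.3 Estimating Log File Disk Requirements
HP recommends allocating 1 GB on each backup host for vbr.py log files. Log files are not deleted automatically, so you need to remove them manually.
When you run vbr.py --setupconfig to create the configuration file and its parameters, the tempDir parameter specifies where vbr.py writes its log files on the backup hosts. The default location is the /tmp/vbr directory on each host. The log files describe the progress, throughput, and any errors encountered for each node.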
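The two sizing recommendations above (at least twice the database footprint, plus 1 GB for logs) can be combined into a rough per-host estimate. This is only a sketch: the node names and byte counts are made-up sample values standing in for the result of the storage_containers query, and adding the two figures together is my own simplification.

```python
GIB = 1024 ** 3


def required_backup_bytes(used_bytes, log_bytes=GIB, factor=2):
    """Suggested capacity for one backup host: factor x footprint plus log space."""
    return factor * used_bytes + log_bytes


# Sample per-node footprints (illustrative numbers, not real query output).
footprints = {"v_mydb_node0001": 300 * GIB, "v_mydb_node0002": 280 * GIB}
for node, used in sorted(footprints.items()):
    print(node, required_backup_bytes(used) // GIB, "GiB")
```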
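5.4 Making Backup Hosts Accessible
Any firewall between the database nodes and the backup hosts must allow SSH and rsync connections on port 50000. The Python and rsync versions on the backup hosts must match those installed or supported by the HP Vertica installer.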
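5.5 Setting Up Password-less SSH Access
[1] The account used on the backup host must have write permission on the backup directory and the log directory (by default under /tmp/).
[2] Each backup host must allow password-less SSH access from any node in the cluster.
5.6 Increasing the SSH Maximum Connection Setting for a Backup Host
When multiple database nodes back up to one backup host (n:1), increase the number of concurrent SSH connections that the SSH daemon (sshd) allows. The default is 10 connections per host.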
1. Log on as root to access the config file.
2. Open the SSH configuration file (/etc/ssh/sshd_config) in a text editor.
3. Locate the #MaxStartups parameter.
4. Remove the comment character (#) and increase the value from the default of 10. Set it higher than the total number of hosts that will connect.
5. Save the file.
6. Reload the file using the following command:
sudo /etc/init.d/sshd reload
7. Exit from root.
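To check the setting before a backup, a small helper can read the sshd_config text and compare MaxStartups with the number of nodes that will connect. This is a sketch under stated assumptions: it handles only the whitespace-separated "MaxStartups N" and "MaxStartups start:rate:full" forms, falls back to the default of 10 mentioned above, and the function names are hypothetical.

```python
def max_startups(sshd_config_text, default=10):
    """Return the MaxStartups start value found in sshd_config text."""
    for line in sshd_config_text.splitlines():
        parts = line.split()
        # A commented-out line (#MaxStartups ...) does not match, so the
        # default stays in effect.
        if len(parts) >= 2 and parts[0].lower() == "maxstartups":
            # The value may be "N" or "start:rate:full"; use the first number.
            return int(parts[1].split(":")[0])
    return default


def allows_all_nodes(sshd_config_text, node_count):
    """True if MaxStartups is greater than the number of connecting nodes."""
    return max_startups(sshd_config_text) > node_count


print(allows_all_nodes("#MaxStartups 10\n", 12))
print(allows_all_nodes("MaxStartups 30\n", 12))
```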
6 Configuring Hard Link Local Backup Hosts
When specifying the backupHost parameter for your hard link local configuration files, use the database host names (or IP addresses) as known to Admintools, rather than the node names. Host names (or IP addresses) are what you used when setting up the cluster. Do not use localhost for the backupHost parameter.
6.1 Listing Host Names
select node_name, host_name from node_resources;
7 Creating the vbr.py Configuration File
The vbr.py utility uses a configuration file to back up and restore a full or object-level backup, or to copy a cluster.
Note: You must be logged on as dbadmin, not root, to create the vbr configuration file.
To create the configuration file, run:
/opt/vertica/bin/vbr.py --setupconfig
For example:
[dbadmin@localhost ~]$ /opt/vertica/bin/vbr.py --setupconfig
Snapshot name (backup_snapshot): fullbak
Backup vertica configurations? (n) [y/n]: y
Number of restore points (1): 3
Specify objects (no default):
Vertica user name (dbadmin):
Save password to avoid runtime prompt? (n) [y/n]: y
Password to save in vbr config file (no default):
Node v_vmart_node0001
Backup host name (no default): 127.0.0.1
Backup directory (no default): /home/dbadmin/backups
Config file name (fullbak1.ini):
Change advanced settings? (n) [y/n]: n
Saved vbr configuration to fullbak1.ini.
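7.1 Specifying a Backup Name
Snapshot name (backup_snapshot):
The backup name cannot be blank, and you do not need to add a timestamp to it.
Full and object-level backups use different configuration files but the same backup directory.
For example, the configuration file for a full database backup, called fullbak.ini, would have these snapshotName and backupDir parameter values: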
snapshotName=fullbak
backupDir=/home/dbadmin/data/backups
The configuration file for the object-level backup, called objectbak.ini, would have these
parameter values:
snapshotName=objectbak
backupDir=/home/dbadmin/data/backups
7.2 Backing Up the Vertica Configuration File
Make sure to include the database configuration file, vertica.conf, in the backup.
Enter y to include a copy of the vertica.conf file in the backup. The file is not saved by default:
Backup vertica configurations? (n) [y/n]:
The vertica.conf file contains configuration parameters you have changed. The file is stored in
the catalog directory. Press Enter to accept the default, or n to omit the file explicitly.
7.3 Saving Multiple Restore Points
Number of restore points (1):
Specify the number of restore points to keep. The default is 1.
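7.4 Choosing Full or Object-Level Backups
Specify objects (no default):
Leave the prompt empty to configure a full database backup.
Enter table names in the form schema.objectname.
To enter multiple table or schema names, separate them with commas.
The names you enter are listed in the Objects parameter of the configuration file.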
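7.5 Entering the User Name
Enter the name of the user who will invoke vbr.py:
Vertica user name (dbadmin):
7.6 Saving the Account Password
Choose whether vbr.py prompts for the password at run time:
Save password to avoid runtime prompt? (n) [y/n]:
7.7 Specifying the Backup Host and Directory
The utility lists each node in the cluster. For each node, enter the backup host and directory:
Node v_vmart_node0001
Backup host name (no default):
Backup directory (no default):
The values you enter are saved in the [Mapping] section of the configuration file. For example: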
[Mapping]
v_vmart_node0001 = 127.0.0.1:/home/dbadmin/backups
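7.8 Saving the Configuration File
Enter a name for the configuration file:
Config file name (fullbak.ini):
Note: The configuration file is saved on the database cluster by default, so copy it to the backup host as well to guard against losing it.
7.9 Continuing to Advanced Settings
Change advanced settings? (n) [y/n]:
7.10 Sample Configuration File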
[Misc]
; Section headings are enclosed by square brackets.
; Comments have leading semicolons.
; Option and values are separated by an equal sign.
snapshotName = exampleBackup
; For simplicity, use the same temp directory location on
; all backup hosts. The utility must be able to write to this
; directory.
tempDir = /tmp/vbr
; Vertica binary directory should be the location of
; vsql & bootstrap. By default it's /opt/vertica/bin
;verticaBinDir =
; include vertica configuration in the backup
verticaConfig = True
; how many times to retry operations if some error occurs.
retryCount = 5
retryDelay = 1
restorePointLimit = 5
[Database]
; db parameters
dbName = exampleDB
dbUser = dbadmin
dbPassword = password
; if this parameter is True, vbr will prompt user for db password every time
dbPromptForPassword = False
[Transmission]
encrypt = False
checksum = False
port_rsync = 50000
; total bandwidth limit for all backup connections in KBPS, 0 for unlimited
total_bwlimit_backup = 0
; total number of backup connections, -u for unlimited
concurrency_backup = 0
; total bandwidth limit for all restore connections in KBPS, 0 for unlimited
total_bwlimit_restore = 0
; total number of restore connections, -u for unlimited
concurrency_restore = 0
[Mapping]
; backupDir ignored for copy cluster task
v_exampledb_node0001 = backup01:/home/dbadmin/backups
v_exampledb_node0002 = backup02:/home/dbadmin/backups
v_exampledb_node0003 = backup03:/home/dbadmin/backups
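Because the file uses an INI-style layout, its [Mapping] section can be inspected with the Python 3 standard library. The sketch below is not part of vbr.py: the helper name read_mapping is hypothetical, and the embedded sample is a shortened copy of the configuration above.

```python
import configparser

SAMPLE = """\
[Misc]
snapshotName = exampleBackup
restorePointLimit = 5
[Mapping]
; backupDir ignored for copy cluster task
v_exampledb_node0001 = backup01:/home/dbadmin/backups
v_exampledb_node0002 = backup02:/home/dbadmin/backups
"""


def read_mapping(text):
    """Return {node_name: (backup_host, backup_dir)} from vbr config text."""
    # Only "=" is a delimiter, so the ":" inside host:directory stays in the value.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(text)
    mapping = {}
    for node, target in parser.items("Mapping"):
        host, _, directory = target.partition(":")
        mapping[node] = (host, directory)
    return mapping


for node, (host, directory) in sorted(read_mapping(SAMPLE).items()):
    print(node, "->", host, directory)
```

A check like this can confirm, for instance, that no node maps to localhost and that every node has its own backup host.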
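7.11 Changing the Overwrite Parameter Value
For object-level backups, if the configuration file already exists, you can add the Overwrite parameter to it.
7.12 Configuring Required VBR Parameters
/opt/vertica/bin/vbr.py --setupconfig
Example:
/opt/vertica/bin/vbr.py --setupconfig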
Snapshot name (snapshotName): ExampleBackup
Backup Vertica configurations? (n) [y/n] y
Number of restore points? (1): 5
Specify objects (no default): dim, dim2
Vertica user name (current_user): dbadmin
Save password to avoid runtime prompt? (n)[y/n]: y
Password to save in vbr config file (no default): mypw234
Node v_exampledb_node0001
Backup host name (no default): backup01
Backup directory (no default): /home/dbadmin/backups
Node v_exampledb_node0002
Backup host name: backup02
Backup directory: /home/dbadmin/backups
Node v_exampledb_node0003
Backup host name: backup03
Backup directory: /home/dbadmin/backups
Config file name: exampleBackup.ini
Change advanced settings? (n)[y/n]: n
7.13 Configuring Advanced VBR Parameters
port_rsync: The port number for the rsync daemon. The default value is 50000.
retryCount: The number of times to retry if a connection attempt fails. The default value is 2.
tempDir: The directory on the backup hosts where vbr.py writes its log files. The default value is /tmp/vbr.
total_bwlimit_backup: The total bandwidth limit in KBps for backup connections.
total_bwlimit_restore: The total bandwidth limit in KBps for restore connections.
7.14 Configuring the Hard Link Local Parameter
Add the parameter to the configuration file manually:
[Transmission]
hardLinkLocal = True
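If the configuration file already contains advanced parameters, add the line to the existing [Transmission] section: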
[Transmission]
encrypt = False
checksum = False
port_rsync = 50000
total_bwlimit_backup = 0
total_bwlimit_restore = 0
hardLinkLocal = True
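The manual edit can also be scripted. The following sketch uses Python 3's configparser to add hardLinkLocal = True under [Transmission], creating the section when it is missing. The helper name is hypothetical, and note that configparser discards comment lines when it rewrites the file, so keep a copy of the original.

```python
import configparser
import io


def enable_hard_link_local(text):
    """Return vbr config text with hardLinkLocal = True under [Transmission]."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # preserve camelCase parameter names
    parser.read_string(text)
    if not parser.has_section("Transmission"):
        parser.add_section("Transmission")
    parser.set("Transmission", "hardLinkLocal", "True")
    out = io.StringIO()
    parser.write(out)  # comments from the input are not written back
    return out.getvalue()


print(enable_hard_link_local("[Misc]\nsnapshotName = localbak\n"))
```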
8 Using Hard File Link Local Backups
A hard link local backup, full or object-level, is created locally on the database hosts. Compared with a remote backup, it has these advantages:
A. Speed. In a hard link local backup, vbr.py does not copy files (as long as the backup directory exists on the same file system as the database catalog and data directories).
B. Reduced network activity.
C. Less disk space, since the backup includes a copy of the catalog and hard file links.
However, since a hard link local backup saves a full copy of the catalog each time you run vbr.py, the disk size will increase with the catalog size over time.
A typical use is during development and design: back up a schema or table, and restore it quickly if the new development work does not succeed.
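9 Creating Full and Incremental Backups
Before you create a backup, make sure that:
A. The database is running. Nodes that are down are not backed up.
B. All backup hosts are up and available.
Run the vbr.py script from an initiator node in the database cluster, using the database administrator account. Do not run it as root.
9.1 Running vbr.py Without Optional Parameters
vbr.py requires only the following options:
--task backup
--config-file config_file
For example:
vbr.py --task backup --config-file myconfig.ini
Copying…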
Enter vertica password for user dbadmin:
Preparing…
Found Database port: 5433
[==================================================] 100%
All child processes terminated successfully.
Committing changes on all backup sites…
backup done!
9.2 Best Practices for Creating Backups
a. Create separate configuration files to create full- and object-level backups
b. Use the same backup host directory location for both kinds of backups
c. For best network performance, use one backup host per cluster node
d. Use one directory on each backup-node to store successive backups
e. For future reference, append the major Vertica version number to the configuration file name(mybackup76x)
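9.3 Object-Level Backups
9.4 Backup Locations and Storage
In the backup directory, vbr.py creates a subdirectory for each database node, and within it a subdirectory for each backup name.
The backup name is the value of the snapshotName parameter in the configuration file.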
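The layout just described can be written as a one-line path rule. The sketch below only joins the three components named above (backup directory, node name, backup name); the sample values are taken from the examples in these notes, and the function name is hypothetical.

```python
import posixpath


def backup_path(backup_dir, node_name, snapshot_name):
    """Path of a node's backup: backupDir/node_name/snapshotName."""
    return posixpath.join(backup_dir, node_name, snapshot_name)


print(backup_path("/home/dbadmin/backups", "v_vmart_node0001", "fullbak"))
# prints /home/dbadmin/backups/v_vmart_node0001/fullbak
```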
9.5 Saving Incremental Backups
When you run vbr.py again with the same configuration file, it automatically creates an incremental backup.
9.6 When vbr.py Deletes Older Backups
Running the vbr.py utility with the --task backup command deletes the oldest backup whenever the total number that exist exceeds the restorePointLimit value in the configuration file.
For example, with restorePointLimit = 5, only five backups are kept.
10 Backup Directory Structure and Contents
Multiple Restore Points
11 Creating Object-Level Backups
Note: An object-level backup can only be restored as a whole. If you create an object-level backup containing two schemas, schema1 and schema2, and later want to restore schema1, you cannot do so without also restoring schema2. To restore a single object, you can create a single object backup.
11.1 Invoking vbr.py Backup
vbr.py --task backup --config-file objectbak.ini
11.2 Backup Locations and Naming
Full and object-level backups share one top-level directory.
Note: You must use unique names for full- and object-level backups stored at one location.
Otherwise, creating successive backups will overwrite the previous version.
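One way to catch a naming clash before it overwrites a backup is to compare the snapshotName and backupDir values across all configuration files. The helper below is a hypothetical sketch; it takes values you have already read from the files rather than parsing them itself.

```python
from collections import defaultdict


def find_name_clashes(configs):
    """Return {(backupDir, snapshotName): [config files]} for reused names.

    configs maps a configuration file name to its (snapshotName, backupDir) pair.
    """
    users = defaultdict(list)
    for config_file, (snapshot_name, backup_dir) in sorted(configs.items()):
        users[(backup_dir, snapshot_name)].append(config_file)
    return {key: files for key, files in users.items() if len(files) > 1}


configs = {
    "fullbak.ini": ("fullbak", "/home/dbadmin/data/backups"),
    "objectbak.ini": ("objectbak", "/home/dbadmin/data/backups"),
}
print(find_name_clashes(configs))
```

An empty result means every backup name is unique within its location.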
11.3 Best Practices for Object-Level Backups
a. Create one configuration file for each object-level backup
b. Create a different configuration file to create a full database backup
c. For best network performance, use one backup host per cluster node
d. Use one directory on each backup-node to store successive backups
e. For future reference, append the major Vertica version number to the configuration file name (mybackup7x)
11.4 Naming Conventions
AIR1_daily_arrivals_snapshot
AIR2_hourly_arrivals_snapshot
AIR2_hourly_departures_snapshot
AIR3_daily_departures_snapshot
11.5 Creating Concurrent Backups
The vbr.py utility currently permits only one instance of the backup script per initiator node. To create backups concurrently:
a. Assign one initiator node to create backup for a given tenant.
b. Give each object backup initiated on a different node a unique backup name.
c. Start the backup script on different initiator nodes to create their specific tenant backups concurrently.
11.6 Determining Backup Frequency
Always take backups after any event that significantly modifies the database.
11.7 Understanding Object-Level Backup Contents
An object-level backup includes:
a. Storage: Data files belonging to the specified object(s)
b. Metadata: Including the cluster topology, timestamp, epoch, AHM, and so on
c. Catalog snippets: Persistent catalog objects serialized into the principal and dependent objects
11.8 Making Changes After an Object-Level Backup
After creating an object-level backup, dropping schemas and tables from the database means the objects will also be dropped from subsequent backups. If you do not save an archive of the object backup, such objects could be lost permanently.
Changing a table name after creating a table backup will not persist after restoring the backup.
If you drop a user after a backup, and the user is the owner of any selected or dependent objects, restoring the backup also restores the user.
To restore a dropped table from a backup:
1. Rename the newly created table from t1 to t2 (the two tables have different OIDs, and now different names).
2. Restore the backup containing t1.
3. Restore t1. Tables t1 and t2 now coexist.
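11.9 Understanding the Overwrite Parameter
The Overwrite parameter determines what happens when restoring an object-level backup encounters an existing object with the same OID.
Scenario 1:
1. Create an object backup of mytable.
2. After the backup, rename mytable to mytable2.
3. Restoring the backup causes mytable to overwrite mytable2 (Overwrite=true).
Even though the table names differ, the OIDs are the same.
Scenario 2: Creating a new table of the same name (with a different OID) is handled differently.
The restore reports an error, because the table names match but the OIDs differ, so the overwrite mechanism does not apply.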
1. Create a table backup of mytable.
2. Drop mytable.
3. Create a new table, called mytable.
4. Restoring the backup does NOT overwrite the table, and causes an error, since there is an OID
conflict.
11.10 Changes to Principal and Dependent Objects
[1] If you drop the owner of a table included in a backup, restoring the snapshot recreates the user, along with any permissions that existed when the backup was taken.
[2] When you restore a parent object, its child objects are dropped and then restored together with it.
11.11 Considering Constraint References
You must back up all the database objects involved in a related constraint together.
For example, a schema with tables whose constraints reference only tables in the same schema can be backed up, but a schema containing a table with an FK/PK constraint on a table in another schema cannot, unless you include the other schema in the list of selected objects.
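The rule can be checked ahead of time if you know which tables reference which. The function below is an illustrative sketch, not a vbr.py feature: it assumes you supply the foreign-key pairs yourself as "schema.table" strings, and the schema and table names in the example are invented.

```python
def missing_schemas(selected_schemas, foreign_keys):
    """Return schemas referenced by FK constraints but absent from the backup.

    foreign_keys is a list of (referencing_table, referenced_table) pairs,
    each table written as "schema.table".
    """
    selected = set(selected_schemas)
    missing = set()
    for referencing, referenced in foreign_keys:
        src_schema = referencing.split(".", 1)[0]
        dst_schema = referenced.split(".", 1)[0]
        if src_schema in selected and dst_schema not in selected:
            missing.add(dst_schema)
    return sorted(missing)


# sales.orders references a table in another schema, so "ref" must be selected too.
print(missing_schemas(["sales"], [("sales.orders", "ref.products")]))
```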
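11.12 Configuration Files for Object-Level Backups
By default, vbr.py configurations use different backup names with the same backup directory.
A common setup is one configuration file for the cluster-wide full backup and several for object-level backups, all pointing to the same location.
Keeping them in the same location has a benefit: vbr.py makes extra provisions so that the object-level backups can still be used after a full database restore.
Note: Attempting to restore a full database using an object-level configuration file fails, resulting in this error:
VMart=> /tmp/vbr.py --config-file=Table2.ini -t restore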
Preparing…
Invalid metadata file. Cannot restore.
11.13 Backup Epochs
Each backup includes the epoch to which its contents can be restored. This epoch will be close to
the latest epoch, though the epoch could change during the time the backup is being written. If an
epoch change occurs while a backup is being written, the storage is split to indicate the different
epochs.
The vbr.py utility attempts to create an object-level backup five times before an error occurs and
the backup fails.
11.14 Maximum Number of Backups
Each backup location can hold at most 500 backups. This maximum is set by rsync, and does not include archives, so the total number of saved backups at the same location can exceed 500.
For example, if a database has 500 schemas, S1 – S499, the full database, including archives of
earlier snapshots, can be backed up along with backups for each schema.
12 Creating Hard Link Local Backups
Before you begin, make sure that:
The database is running.
The user running the backup, such as dbadmin, has read and write permission on the backup directory.
When you create a full- or object-level hard link local backup, these are the backup contents:
Backup               Catalog     Database files
Full backup          Full copy   Hard file links to all database files
Object-level backup  Full copy   Hard file links for all objects
For example:
/opt/vertica/bin/vbr.py --task backup --config-file fullbak.ini
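12.1 Specifying the Hard Link Local Backup Location
If hardLinkLocal=True but the backup directory is on a different node, vbr.py reports an error and stops the backup.
12.2 Creating Hard Link Local Backups for Tape Storage
[1] Create a configuration file:
/opt/vertica/bin/vbr.py --setupconfig
[2] Edit the configuration file so that the [Transmission] section contains hardLinkLocal = True.
[3] Run the backup:
/opt/vertica/bin/vbr.py --task backup --config-file localbak.ini
[4] Copy the hard link local backup directory to other external storage media.
[5] When you need to restore the database, copy the backup back to its original backup directory.
[6] Restore the data using the original configuration file:
/opt/vertica/bin/vbr.py --task restore --config-file localbak.ini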
13 Interrupting the Backup Process
To cancel a backup, use Ctrl+C or send a SIGINT to the Python process running the backup utility.
After an interruption:
[1] The files already backed up remain in place.
[2] The next backup process picks up where the interrupted process left off.
14 Viewing Backups
[1] Use vbr.py to list the backups.
[2] Query the DATABASE_SNAPSHOTS system table to monitor backup progress.
[3] Query the DATABASE_BACKUPS system table to view historical backup information.
14.1 Listing Backups with vbr.py
[dbadmin@node01 temp]$ /opt/vertica/bin/vbr.py --task listbackup --config-file /home/dbadmin/table2bak.ini
The output includes the backup OID, the epoch, which object(s) (if applicable) were backed up, and the backup host (192.168.223.33) used for each node.
14.2 Monitoring database_snapshots
VMart=> select * from database_snapshots;
node_name | snapshot_name | is_durable_snapshot | total_size_bytes | storage_cost
14.3 Monitoring database_backups
Note: Do not use the backup_timestamp value when restoring an archive.
VMart=> select * from v_monitor.database_backups;
15 Restoring Full Database Backups
The requirements are:
[1] The database is down.
[2] The node names and the IP addresses must be identical to those of the cluster that was backed up.
[3] The database must already exist on the cluster.
[4] If the database name matches the name in the backup, and all of the node names match the names of the nodes in the configuration file, you can restore to it.
15.1 Restoring the Most Recent Backup
Run the restore with the database administrator account, not as root.
The following example uses the db.ini configuration file, which includes the superuser's password:
vbr.py --task restore --config-file db.ini
Copying…
1871652633 out of 1871652633, 100%
All child processes terminated successfully.
restore done!
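15.2 Restoring an Archive
If you saved multiple backups, you can restore from a specific archive.
List all the archives and choose the one to restore. The vbr.py --task listbackup command also requires a configuration file.
Pass the --archive parameter with the date and time suffix of the archive directory name. For example:
vbr.py --task restore --config-file fullbak.ini --archive=20121111_205841
The --archive parameter identifies the archive subdirectory, and the OID identifies the backup within the archive.
An archive can comprise multiple snapshots, including both full- and object-level backups.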
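A small helper can build the restore command and reject an archive identifier that does not look like the date_time suffix shown above. This is a sketch: the eight-digit, underscore, six-digit pattern is inferred from the single example in these notes and is an assumption, not a documented format.

```python
import re

VBR = "/opt/vertica/bin/vbr.py"
# Assumed shape of an archive identifier, e.g. 20121111_205841.
ARCHIVE_ID = re.compile(r"^\d{8}_\d{6}$")


def build_restore_command(config_file, archive=None):
    """Return the vbr.py argument list for restoring the latest backup or an archive."""
    cmd = [VBR, "--task", "restore", "--config-file", config_file]
    if archive is not None:
        if not ARCHIVE_ID.match(archive):
            raise ValueError("unexpected archive identifier: %r" % archive)
        cmd.append("--archive=" + archive)
    return cmd


print(" ".join(build_restore_command("fullbak.ini", "20121111_205841")))
```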
15.3 Attempting to Restore a Node That Is UP
During a full database restore, the node must be DOWN.
If the node is UP, vbr.py reports the following:
doc:tests/doc/tools $ vbr.py --config-file=doc.ini -t restore --nodes=v_doc_node0001
Warning: trying to restore to an UP cluster
Warning: Node state of v_doc_node0001 is UP; node must be DOWN for restore; ignoring
restore on this node.
Nothing to do
restore done!
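15.4 Attempting to Restore to an Alternate Cluster
To restore a full database, the node names and IP addresses must be exactly the same as when the backup was taken.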
The vbr.py utility does NOT support restoring a full database backup to an
alternate cluster with different host names and IP addresses.
16 Restoring Object-Level Backups
To restore database objects, the database must be running.
You cannot restore an object-level backup into an empty database.
In addition, you can only restore a backup in its entirety, not part of it.
16.1 Backup Locations
Note: Using different backup locations in which to create full- or object-level backups results in incompatible object-level backups. Attempting to restore an object-level backup after restoring a full database will fail.
16.2 Cluster Requirements for Object-Level Restores
Keep full and object-level backups in the same location. With the database down, restore the full backup; then start the database and restore the object-level backups.
Regardless of the node states when a backup was taken, you do not have to manage node states
before restoring an object-level backup. Any node that is DOWN when you restore an object-level
backup is updated when the node rejoins the cluster and comes UP after an object-level restore.
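16.3 Restoring Objects After Changing the Cluster Topology
After you add nodes to the cluster, vbr.py still supports object-level restores, and the newly added nodes are updated as well.
However, vbr.py does not support restoring after you remove a node, rename a node, or change a node's IP address.
16.4 Projection Epoch After Restore
All object-level backup and restore events are treated as DDL events.
If a table does not participate in an object-level backup, possibly due to a node being down, restoring the backup has this effect on the projection:
a. Its epoch is reset to 0.
b. It must recover any data that it does not have (by comparing epochs and other recovery procedures).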
16.5 Catalog Locks During Backup Restore
HP Vertica transactions follow strict locking protocols to maintain data integrity.
When restoring an object-level backup into a cluster that is UP, vbr.py begins by copying data and
managing storage containers, potentially splitting the containers if necessary. While copying and
managing data is the largest task in any restore process, it does not require any database locks.
After completing data copying tasks, vbr.py first requires a table object lock (O-lock), and then a
global catalog lock (GCLX).
If a DML statement from another database process holds an O-lock on the table, vbr.py is blocked from progress until the DML statement completes and releases the lock.
The GCLX ensures the consistency and integrity of the catalog data.
Database system operations, such as the tuple mover (TM) transferring data from memory to disk, are canceled to permit object-level restore to complete.
16.6 Catalog Restore Events
Each object-level backup contains a portion of the database catalog, called a snippet, which holds the selected objects, dependent objects, principal objects, and so on. Because a catalog snippet has the same structure as the database catalog, restoring objects also updates the local and global catalogs from the snippet.
17 Restoring Hard Link Local Backups
If you have created both full- and object-level backups and the database fails, first restore the full
database backup. You can then restore from the object-level backups.
17.1 Avoiding OID and Epoch Conflicts
If you create full- and object-level backups in the same backup directory (recommended), then when you restore a full database backup, vbr.py determines the latest OID and epoch of the object-level backups as well.
For example:
1. Create a full hard link local backup in backup directory /home/dbadmin/backups, with configuration file mybak.ini:
[dbadmin@node01 ~]$ /opt/vertica/bin/vbr.py --task backup --config-file mybak.ini
2. Create an object-level hard link local backup for Table1 using the same backup directory, with configuration file table1bak.ini:
[dbadmin@node01 ~]$ /opt/vertica/bin/vbr.py --task backup --config-file table1bak.ini
3. Create an object-level hard link local backup for Table2, using the same backup directory, with configuration file table2bak.ini:
[dbadmin@node01 ~]$ /opt/vertica/bin/vbr.py --task backup --config-file table2bak.ini
When you restore, the following happens:
[1]
vbr.py detects the maximum object ID (OID) and epoch and updates them in the restored database. This prevents OID and epoch conflicts from occurring when object-level backups are restored after the newly restored database.
[2]
If the database and object-level table backups are not in the same backup directory, vbr.py
reverts the maximum OID and epoch for a full database restore back to the table OID and epoch.
Further attempts to restore either table1 or table2 backups will then fail, preventing any
potential conflicts.
17.2 Transferring Backups to and From Remote Storage
You can transfer hard link local backups to another storage media, such as tape.
Complete the following steps to restore hard link local backups from external media:
1. If the original backup directory no longer exists on one or more local backup host nodes,
recreate the directory. The directory structure into which you restore hard link backup files
must be identical to what existed when the backup was created. For example, if you created
hard link local backups at the following backup directory, then recreate that directory structure:
/home/dbadmin/backups/localbak
2. Copy the backup files to their original backup directory, as specified for each node in the
configuration file with the backupHost and backupDir parameters. For example, this
configuration file shows the backupDir parameter for v_vmart_node0001:
[Mapping0]
dbNode = v_vmart_node0001
backupHost = node03
backupDir = /home/dbadmin/backups/localbak
3. To restore the latest version of the backup, move the backup files to this directory:
/home/dbadmin/backups/localbak/node_name/snapshotname
4. To restore a different backup version, move the backup files to this directory:
/home/dbadmin/backups/localbak/node_name/snapshotname_archivedate_timestamp
5. When the backup files are returned to their original backup directory, use the original
configuration file to invoke vbr.py as follows:
/opt/vertica/bin/vbr.py --task restore --config-file localbak.ini
If the physical files are restored from tape into the correct directories, and you use the
configuration file that specifies hardLinkLocal = true, restoring the backup succeeds.
18 Restoring a Database to the Same Cluster
The general steps for restoring a full database backup to a database on the same cluster are:
1. Stop the database you intend to restore.
Note: If you restore data to a single node, the node has already stopped. You do not need to stop the database.
2. Restore the database (vbr.py).
3. Start the database.
Note: If you restored all the nodes using the backup, you are using a manual recovery method. For more information, see Failure Recovery. Administration Tools returns the message "Database startup failed" after you attempt to restart the database and then offers to restart the database from an earlier epoch. Click Yes.
4. After the database starts, connect to it through the Administration Tools or MC and verify that it was successfully restored by running some queries.
19 Removing Backups
snapshotName
backupHost
backupDir
dbNode
20 Copying the Database to Another Cluster
For example, you can copy a database between a production environment and a development environment.
The storage paths for the HP Vertica catalog, data, and temp directories must be identical on the target and source databases. To view the directory structure of the source database:
VMart=> \x
Expanded display is on.
VMart=> select node_name, storage_path, storage_usage from disk_storage;
-[ RECORD 1 ]-+-----------------------------------------------------
node_name     | v_vmart_node0001
storage_path  | /home/dbadmin/VMart/v_vmart_node0001_catalog/Catalog
storage_usage | CATALOG
-[ RECORD 2 ]-+-----------------------------------------------------
node_name     | v_vmart_node0001
storage_path  | /home/dbadmin/VMart/v_vmart_node0001_data
storage_usage | DATA,TEMP
-[ RECORD 3 ]-+-----------------------------------------------------
node_name     | v_vmart_node0001
storage_path  | /home/dbadmin/SSD/schemas
storage_usage | DATA
-[ RECORD 4 ]-+-----------------------------------------------------
node_name     | v_vmart_node0001
storage_path  | /home/dbadmin/SSD/tables
storage_usage | DATA
-[ RECORD 5 ]-+-----------------------------------------------------
node_name     | v_vmart_node0001
storage_path  | /home/dbadmin/SSD/schemas
storage_usage | DATA
-[ RECORD 6 ]-+-----------------------------------------------------
node_name     | v_vmart_node0002
storage_path  | /home/dbadmin/VMart/v_vmart_node0002_catalog/Catalog
storage_usage | CATALOG
-[ RECORD 7 ]-+-----------------------------------------------------
node_name     | v_vmart_node0002
storage_path  | /home/dbadmin/VMart/v_vmart_node0002_data
storage_usage | DATA,TEMP
-[ RECORD 8 ]-+-----------------------------------------------------
node_name     | v_vmart_node0002
storage_path  | /home/dbadmin/SSD/tables
storage_usage | DATA
20.1 Identifying Node Names for Target Cluster
select node_name from nodes;
You need to know the exact names that Admintools supplied to all nodes in the source database
before configuring the target cluster.
To run Admintools from the command line, enter a command such as this for the VMart database:
$ /opt/vertica/bin/admintools -t node_map -d VMART
20.2 Configuring the Target Cluster
The target cluster must:
a. Have the same number of nodes as the source cluster.
b. Have a database with the same name as the source database. The target database can be completely empty.
c. Have the same node names as the source cluster. The node names listed in the NODES system tables on both clusters must match.
d. Be accessible from the source cluster.
e. Have the same database administrator account, and all nodes must allow a database administrator of the source cluster to log in through SSH without a password.
Note: You need to configure each host in the target cluster to accept the SSH authentication of the source cluster.
f. Have adequate disk space for the vbr.py --task copycluster command to complete.
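Some of these requirements can be compared mechanically before running copycluster. The sketch below checks only the database name, node count, and node names; the dictionary layout and function name are hypothetical, and you would fill the values in from the NODES system table or admintools on each cluster.

```python
def copycluster_problems(source, target):
    """Compare two cluster descriptions and list the mismatches found.

    Each description is a dict with "db_name" and "node_names" keys.
    """
    problems = []
    if source["db_name"] != target["db_name"]:
        problems.append("database names differ")
    if len(source["node_names"]) != len(target["node_names"]):
        problems.append("node counts differ")
    if sorted(source["node_names"]) != sorted(target["node_names"]):
        problems.append("node names differ")
    return problems


source = {"db_name": "vmart", "node_names": ["v_vmart_node0001", "v_vmart_node0002"]}
target = {"db_name": "vmart", "node_names": ["v_vmart_node0001", "v_vmart_node0002"]}
print(copycluster_problems(source, target))
```

SSH access, disk space, and storage paths still need to be verified separately.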
20.3 Creating a Configuration File for CopyCluster
You cannot use an object-level backup with the copycluster command. You must use a full
database backup.
[Misc]
snapshotName = CopyVmart
verticaConfig = False
restorePointLimit = 5
tempDir = /tmp/vbr
retryCount = 5
retryDelay = 1
[Database]
dbName = vmart
dbUser = dbadmin
dbPassword = password
dbPromptForPassword = False
[Transmission]
encrypt = False
checksum = False
port_rsync = 50000
[Mapping]
; backupDir is not used for cluster copy
v_vmart_node0001= test-host01:/home/dbadmin/backups
v_vmart_node0002= test-host02:/home/dbadmin/backups
v_vmart_node0003= test-host03:/home/dbadmin/backups
20.4 Copying the Database
The target cluster must be stopped before you invoke copycluster.
To copy the cluster, run vbr.py from a node in the source database using the database administrator account, passing the --task copycluster --config-file CopyVmart.ini command.
vbr.py --config-file CopyVmart.ini --task copycluster
Copying…
1871652633 out of 1871652633, 100%
All child processes terminated successfully.
copycluster done!
21 Backup and Restore Utility Reference
21.1 VBR Utility Reference
Syntax:
/opt/vertica/bin/vbr.py { command }
... [ --archive file ]
... [ --config-file file ]
... [ --debug