Database hosts: two virtual machines created with VMware Workstation
Operating system: Red Hat Enterprise Linux 5.3
Database software: DB2 V9.7
The installation of the virtual machines and the operating system environment is omitted here.
2.2 Logical Topology
2.3 Network Configuration
2.3.1 Host IP Configuration
Linux1: 130.30.3.252 (netmask 255.255.255.0)
Linux2: 130.30.3.136 (netmask 255.255.255.0)
2.3.2 /etc/hosts File Configuration
/etc/hosts on Linux1:
130.30.3.252 linux1
130.30.3.136 linux2
/etc/hosts on Linux2:
130.30.3.252 linux1
130.30.3.136 linux2
2.3.3 Verify Network Connectivity
Verify that the hosts can reach each other and that name resolution is configured correctly:
ping linux1
ping linux2
ping 130.30.3.252
ping 130.30.3.136
3 DB2 Software Installation
3.1 Kernel Parameter Tuning
Log in as root and edit /etc/sysctl.conf to adjust the kernel parameters, adding the following entries:
kernel.sem=250 256000 32 1024
kernel.shmmax=268435456
kernel.shmall=8388608
kernel.msgmax=65535
kernel.msgmnb=65535
kernel.shmall = 2097152
kernel.shmmax = 2147483648
kernel.shmmni = 4096
# semaphores: semmsl, semmns, semopm, semmni
kernel.sem = 250 32000 100 128
fs.file-max = 65536
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default=262144
net.core.rmem_max=262144
net.core.wmem_default=262144
net.core.wmem_max=262144
Run sysctl -p to load the settings from /etc/sysctl.conf.
Run ipcs -l to display and verify the current kernel parameter settings:
# ipcs -l
------ Shared Memory Limits --------
max number of segments = 4096 // SHMMNI
max seg size (kbytes) = 32768 // SHMMAX
max total shared memory (kbytes) = 8388608 // SHMALL
min seg size (bytes) = 1
------ Semaphore Limits --------
max number of arrays = 1024 // SEMMNI
max semaphores per array = 250 // SEMMSL
max semaphores system wide = 256000 // SEMMNS
max ops per semop call = 32 // SEMOPM
semaphore max value = 32767
------ Messages: Limits --------
max queues system wide = 1024 // MSGMNI
max size of message (bytes) = 65536 // MSGMAX
default max size of queue (bytes) = 65536 // MSGMNB
3.2 User Resource Limit Tuning
Set the operating system hard limits for the DB2 instance user's data, nofiles, and fsize resources to unlimited.
This can be done by editing /etc/security/limits.conf:
* soft nproc 3000
* hard nproc 16384
* soft nofile 65536
* hard nofile 65536
The current hard limits can be queried with the ulimit -Hdnf command.
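As a quick check (a suggested verification, not part of the original transcript; db2inst1 is the instance user created later in section 3.4):
# run as root once the instance user exists; prints the hard limits for
# data segment size, open files, and file size
su - db2inst1 -c 'ulimit -Hd; ulimit -Hn; ulimit -Hf'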
3.3 Create File Systems
Create the following file systems on the primary node.
Create the physical volume and volume group:
[root@linux1 db2]# pvcreate /dev/sdb1
[root@linux1 db2]# vgcreate datavg /dev/sdb1
Create the logical volumes and file systems:
[root@linux1 ~]# lvcreate -n db2homelv -L 1G /dev/datavg
Logical volume "db2homelv" created
[root@linux1 ~]# lvcreate -n hafs01lv -L 2G /dev/datavg
Logical volume "hafs01lv" created
# mkfs.ext3 /dev/datavg/db2homelv
# mkfs.ext3 /dev/datavg/hafs01lv
# mkdir /db2home
# mkdir /db2bak
Mount the file systems:
# mount /dev/datavg/db2homelv /db2home
# mount /dev/datavg/hafs01lv /db2bak
Add the mount point entries to /etc/fstab:
# vi /etc/fstab
/dev/datavg/db2homelv /db2home ext3 defaults 1 2
/dev/datavg/hafs01lv /db2bak ext3 defaults 1 2
Create the following file system on the standby node:
[root@linux2 ~]# vgcreate datavg /dev/sdb1
Volume group "datavg" successfully created
[root@linux2 ~]# lvcreate -n db2homelv -L 1G /dev/datavg
Logical volume "db2homelv" created
[root@linux2 ~]# mkfs.ext3 /dev/datavg/db2homelv
[root@linux2 ~]# mkdir /db2home
[root@linux2 ~]# mount /dev/datavg/db2homelv /db2home
[root@linux2 ~]# vi /etc/fstab
/dev/datavg/db2homelv /db2home ext3 defaults 1 2
3.4 Create Users and Groups
Create the following groups and users on both the primary and standby nodes, making sure that both machines use exactly the same UIDs and GIDs.
Create the groups
Run the following commands to create a group for the instance owner (for example, db2iadm1), a group for users who will run UDFs or stored procedures (for example, db2fadm1), and a group for the administration server (for example, dasadm1):
groupadd -g 999 db2iadm1
groupadd -g 998 db2fadm1
groupadd -g 997 dasadm1
Create the users
Use the following commands to create one user for each group created in the previous step. Each user's home directory is placed under the DB2 home file system created earlier (/db2home).
useradd -u 999 -g db2iadm1 -m -d /db2home/db2inst1 db2inst1
useradd -u 998 -g db2fadm1 -m -d /db2home/db2fenc1 db2fenc1
useradd -u 997 -g dasadm1 -m -d /db2home/dasusr1 dasusr1
Set the initial passwords for the new users:
# passwd db2inst1
# passwd db2fenc1
# passwd dasusr1
Change the ownership of the file systems:
# chown db2inst1:db2iadm1 /db2home
# chown db2inst1:db2iadm1 /db2bak
3.5 Install the DB2 Software
Install the DB2 database software on the primary node and then on the standby node. Installing as root is generally recommended; a non-root installation has some limitations.
Extract the downloaded installation media:
[root@linux1 db2]# tar zxvf v9.7_linuxia32_server.tar.gz
Run the db2_install command to perform the installation:
[root@linux1 server]# ./db2_install
Choose the DB2 installation directory. The directory can be changed; on Linux the default installation directory is /opt/ibm/db2/V9.7.
Default directory for the installation of products - /opt/ibm/db2/V9.7
***********************************************************
Do you want to choose a different directory for the installation? [yes/no]
no
Select ESE as the DB2 product to install:
Specify one of the following keywords to install DB2 products.
ESE
CONSV
WSE
EXP
PE
CLIENT
RTCL
Enter "help" to redisplay product names.
Enter "quit" to exit.
***********************************************************
ESE
Wait for the installation to run; a success message is displayed when it completes:
DB2 installation is being initialized.
Total number of tasks to be performed: 46
Total estimated time for all tasks to be performed: 1876
Task #1 start
Description: Checking license agreement acceptance
Estimated time 1 second(s)
Task #1 end
…………………………………
…………………………………
The execution completed successfully.
For more information see the DB2 installation log at "/tmp/db2_install.log.18940".
3.6 Configure NTP Time Synchronization
Synchronizing the time and date on the cluster nodes is recommended (but not required). A minimal sketch follows.
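One common way to do this on RHEL 5 is shown below; the NTP server address is a placeholder, not part of the original setup:
echo "server ntp.example.com" >> /etc/ntp.conf   # ntp.example.com is a placeholder for your site's NTP server
service ntpd start                               # start the NTP daemon
chkconfig ntpd on                                # start it automatically at boot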
4 Configure the NFS File System
4.1 Configure the NFS Server
On the primary server, add the following entry to /etc/exports so that the file system is exported by the NFS service at startup:
[root@linux1 db2bak]# vi /etc/exports
/db2bak *(rw,no_root_squash)
Start the NFS services:
[root@linux1 db2bak]# service nfs start
[root@linux1 db2bak]# service portmap start
Run the exportfs command so that NFS clients can mount the configured /db2bak shared directory:
/usr/sbin/exportfs -a
The -a option exports all directories listed in /etc/exports.
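An optional check, not shown in the original transcript, to confirm the export is visible:
exportfs -v                   # on linux1: list the active exports
showmount -e 130.30.3.252     # on linux2: confirm /db2bak is exported by the primary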
4.2 Configure the NFS Client
Create the mount point on the standby machine with the following command:
[root@linux2 ~]# mkdir /db2bak
Add an entry to /etc/fstab so that the NFS file system is mounted automatically at startup:
130.30.3.252:/db2bak /db2bak nfs rw,timeo=300,retrans=5,hard,intr,bg,suid
Mount the exported file system on the standby database server with the following command:
[root@linux2 ~]# mount -t nfs 130.30.3.252:/db2bak /db2bak
or:
[root@linux2 server]# mount /db2bak
5 Create the Sample Database
5.1 Create the Database Instance
Create the database instance and adjust its parameters on both the primary and standby nodes:
[root@linux1 instance]# cd /opt/ibm/db2/V9.7/instance
[root@linux1 instance]# ./db2icrt -u db2fenc1 db2inst1
DBI1070I Program db2icrt completed successfully.
[root@linux1 instance]# ./db2ilist
db2inst1
[root@linux1 instance]# su - db2inst1
[db2inst1@linux1 ~]$ db2start
01/27/2011 17:37:46 0 0 SQL1063N DB2START processing was successful.
SQL1063N DB2START processing was successful.
[db2inst1@linux1 ~]$ db2set db2comm=tcpip
[db2inst1@linux1 ~]$ db2 update dbm cfg using svcename DB2_db2inst1
DB20000I The UPDATE DATABASE MANAGER CONFIGURATION command completed successfully.
[db2inst1@linux1 ~]$ cat /etc/services |grep DB2_
DB2_db2inst1 60000/tcp
DB2_db2inst1_1 60001/tcp
DB2_db2inst1_2 60002/tcp
DB2_db2inst1_END 60003/tcp
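Note that DB2COMM and SVCENAME only take effect after the instance is restarted. A suggested verification (not part of the original transcript):
db2stop
db2start
db2 get dbm cfg | grep -i svcename    # should show DB2_db2inst1
netstat -an | grep 60000              # the TCP/IP listener should be bound to port 60000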
5.2 Create the Administration Server
Create the DB2 administration server on both the primary and standby nodes:
[root@linux1 ~]# cd /opt/ibm/db2/V9.7/instance
[root@linux1 instance]# ./dascrt -u dasusr1
SQL4406W The DB2 Administration Server was started successfully.
DBI1070I Program dascrt completed successfully.
[root@linux1 instance]# su - dasusr1
[dasusr1@linux1 ~]$ db2admin start
SQL4409W The DB2 Administration Server is already active.
5.3 Create the Sample Database
Create the SAMPLE database on the primary node:
[db2inst1@linux1 ~]$ db2sampl -dbpath /db2home/db2inst1/sample
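A simple connection test can confirm the database was created (illustrative only; EMPLOYEE is one of the tables generated by db2sampl):
db2 connect to sample
db2 "select count(*) from employee"   # should return a row count without error
db2 connect reset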
6 HADR Database Configuration
6.1 Modify the Primary Database Parameters
Modify the following database configuration parameters:
db2 UPDATE DB CFG FOR sample USING INDEXREC RESTART LOGINDEXBUILD ON LOGARCHMETH1 DISK:/db2bak/logs LOGPRIMARY 20 LOGSECOND 20 AUTORESTART OFF TRACKMOD ON
[db2inst1@linux1 ~]$ db2 UPDATE DB CFG FOR sample USING INDEXREC RESTART LOGINDEXBUILD ON LOGARCHMETH1 DISK:/db2bak/logs LOGPRIMARY 20 LOGSECOND 20 AUTORESTART OFF
DB20000I The UPDATE DATABASE CONFIGURATION command completed successfully.
6.2 Back Up the Database on the Primary Server
Perform an offline backup of the database on the primary host:
[db2inst1@linux1 db2bak]$ db2 deactivate db sample
DB20000I The DEACTIVATE DATABASE command completed successfully.
[db2inst1@linux1 db2bak]$ db2 backup db sample to /db2bak/ with 2 buffers buffer 1024 parallelism 4 without prompting
Backup successful. The timestamp for this backup image is : 20110309104324
[db2inst1@linux1 db2bak]$ ls
logs lost+found SAMPLE.0.db2inst1.NODE0000.CATN0000.20110309104324.001
[db2inst1@linux1 db2bak]$ db2ckbkp SAMPLE.0.db2inst1.NODE0000.CATN0000.20110309104324.001
[1] Buffers processed: ###################################
Image Verification Complete - successful.
6.3 Restore the Database on the Standby Server
Create the standby database by performing a full restore from the primary database backup:
[root@linux2 instance]# su - db2inst1
[db2inst1@linux2 ~]$ cd /db2bak
[db2inst1@linux2 ~]$ mkdir /db2home/db2inst1/sample
[db2inst1@linux2 ~]$ ls
sample sqllib
[db2inst1@linux2 ~]$ cd /db2bak
[db2inst1@linux2 db2bak]$ ls
logs lost+found SAMPLE.0.db2inst1.NODE0000.CATN0000.20110309104324.001 SAMPLE.0.db2inst1.NODE0000.CATN0000.20110310105829.001
[db2inst1@linux2 db2bak]$ db2 restore db sample from /db2bak taken at 20110310105829 replace history file
DB20000I The RESTORE DATABASE command completed successfully.
6.4 Modify the /etc/services Configuration File
On both the primary and standby servers, modify the services configuration file to add the service names and ports used for HADR communication:
– Service name: DB2_HADR_1
– Port number: 55001
– Service name: DB2_HADR_2
– Port number: 55002
Add the following services on the primary server linux1:
[root@linux1 ~]# cat /etc/services|grep DB2_
DB2_db2inst1 60000/tcp
DB2_db2inst1_1 60001/tcp
DB2_db2inst1_2 60002/tcp
DB2_db2inst1_END 60003/tcp
DB2_HADR_1 55001/tcp
DB2_HADR_2 55002/tcp
Add the following services on the standby server linux2:
[root@linux2 ~]# cat /etc/services|grep DB2_
DB2_db2inst1 60000/tcp
DB2_db2inst1_1 60001/tcp
DB2_db2inst1_2 60002/tcp
DB2_db2inst1_END 60003/tcp
DB2_HADR_1 55001/tcp
DB2_HADR_2 55002/tcp
Note: this step is not strictly required, because when configuring the HADR_LOCAL_SVC and HADR_REMOTE_SVC database parameters below you can use the port numbers directly instead of the service names, as in the example below.
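For illustration only (this guide continues to use the service names), the same settings could be expressed with port numbers:
db2 update db cfg for sample using HADR_LOCAL_SVC 55001
db2 update db cfg for sample using HADR_REMOTE_SVC 55002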
6.5 Modify the HADR Parameters on the Primary Server
db2 update db cfg for sample using HADR_LOCAL_HOST 130.30.3.252
db2 update db cfg for sample using HADR_LOCAL_SVC DB2_HADR_1
db2 update db cfg for sample using HADR_REMOTE_HOST 130.30.3.136
db2 update db cfg for sample using HADR_REMOTE_SVC DB2_HADR_2
db2 update db cfg for sample using HADR_REMOTE_INST DB2INST1
db2 update db cfg for sample using HADR_SYNCMODE NEARSYNC
db2 update db cfg for sample using HADR_TIMEOUT 10
db2 update db cfg for sample using HADR_PEER_WINDOW 120
db2 connect to sample
db2 quiesce database immediate force connections
db2 unquiesce database
db2 connect reset
6.6 Modify the HADR Parameters on the Standby Server
db2 update db cfg for sample using HADR_LOCAL_HOST 130.30.3.136
db2 update db cfg for sample using HADR_LOCAL_SVC DB2_HADR_2
db2 update db cfg for sample using HADR_REMOTE_HOST 130.30.3.252
db2 update db cfg for sample using HADR_REMOTE_SVC DB2_HADR_1
db2 update db cfg for sample using HADR_REMOTE_INST db2inst1
db2 update db cfg for sample using HADR_SYNCMODE NEARSYNC
db2 update db cfg for sample using HADR_TIMEOUT 10
db2 update db cfg for sample using HADR_PEER_WINDOW 120
6.7 Start HADR
Start the HADR standby database on the standby server first:
db2 deactivate database sample
db2 start hadr on database sample as standby
[db2inst1@linux2 db2bak]$ db2 deactivate database sample
SQL1496W Deactivate database is successful, but the database was not activated.
[db2inst1@linux2 db2bak]$ db2 start hadr on database sample as standby
DB20000I The START HADR ON DATABASE command completed successfully.
Then start the HADR primary database on the primary server:
db2 deactivate database sample
db2 start hadr on database sample as primary
[db2inst1@linux1 db2bak]$ db2 deactivate database sample
SQL1496W Deactivate database is successful, but the database was not activated.
[db2inst1@linux1 db2bak]$ db2 start hadr on database sample as primary
DB20000I The START HADR ON DATABASE command completed successfully.
Note:
When starting HADR, always start it on the standby server first and then on the primary server. Conversely, when stopping HADR, stop it on the primary server first and then on the standby server.
If you start HADR on the primary database server first, you must start HADR on the standby database server within the time specified by the HADR_TIMEOUT parameter (in seconds); otherwise the start fails. The configured HADR values can be reviewed as shown below.
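A suggested check, not in the original transcript, to review the HADR configuration on either server before starting HADR:
db2 get db cfg for sample | grep -i hadr   # lists the HADR database configuration parameters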
6.8 Check HADR Status
Now that the HADR setup is complete, check that it is working correctly.
Run the following command on the primary server linux1:
db2 get snapshot for db on sample
The output is as follows:
HADR Status
Role = Primary
State = Peer
Synchronization mode = Nearsync
Connection status = Connected, 2011-03-09 11:01:55.144160
Peer window end = 2011-03-09 11:06:34.000000 (1299639994)
Peer window (seconds) = 120
Heartbeats missed = 0
Local host = 130.30.3.252
Local service = DB2_HADR_1
Remote host = 130.30.3.136
Remote service = DB2_HADR_2
Remote instance = DB2INST1
timeout(seconds) = 3
Primary log position(file, page, LSN) = S0000000.LOG, 0, 00000000045F81C1
Standby log position(file, page, LSN) = S0000000.LOG, 0, 00000000045F81C1
Log gap running average(bytes) = 0
Run the following command on the standby server linux2:
db2 get snapshot for db on sample
The output is as follows:
HADR Status
Role = Standby
State = Peer
Synchronization mode = Nearsync
Connection status = Connected, 2011-03-09 11:01:54.978888
Peer window end = 2011-03-09 11:08:24.000000 (1299640104)
Peer window (seconds) = 120
Heartbeats missed = 0
Local host = 130.30.3.136
Local service = DB2_HADR_2
Remote host = 130.30.3.252
Remote service = DB2_HADR_1
Remote instance = db2inst1
timeout(seconds) = 3
Primary log position(file, page, LSN) = S0000000.LOG, 0, 00000000045F81C1
Standby log position(file, page, LSN) = S0000000.LOG, 0, 00000000045F81C1
Log gap running average(bytes) = 0
You can also check the HADR status with db2pd, as shown below.
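The db2pd output uses a different layout from the snapshot above but reports the same role, state, and log positions:
db2pd -db sample -hadr    # run as the instance owner on either server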
7 HADR Takeover Testing
The final step of the setup process is to test HADR failover. You can manually run an ordinary takeover command on the standby server:
db2 takeover hadr on database sample
If an ordinary takeover does not work, specify the BY FORCE option to force DB2 to switch the HADR roles to the standby server (for example, when the primary database host has failed and the standby server must take over the database), as shown below.
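The forced takeover command, which is exercised in section 7.4, is:
db2 takeover hadr on database sample by force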
7.1 Configure Automatic Client Reroute
On the primary server linux1, run the following command to enable the automatic client reroute feature for HADR:
db2 update alternate server for database sample using hostname 130.30.3.136 port 60000
[db2inst1@linux1 db2bak]$ db2 update alternate server for database sample using hostname 130.30.3.136 port 60000
DB20000I The UPDATE ALTERNATE SERVER FOR DATABASE command completed
successfully.
DB21056W Directory changes may not be effective until the directory cache is
refreshed.
[db2inst1@linux1 db2bak]$ db2 list db directory
System Database Directory
Number of entries in the directory = 1
Database 1 entry:
Database alias = SAMPLE
Database name = SAMPLE
Local database directory = /db2home/db2inst1/sample
Database release level = d.00
Comment =
Directory entry type = Indirect
Catalog database partition number = 0
Alternate server hostname = 130.30.3.136
Alternate server port number = 60000
On the standby server linux2, run the following command to enable the automatic client reroute feature for HADR:
db2 update alternate server for database sample using hostname 130.30.3.252 port 60000
[db2inst1@linux2 db2bak]$ db2 update alternate server for database sample using hostname 130.30.3.252 port 60000
DB20000I The UPDATE ALTERNATE SERVER FOR DATABASE command completed
successfully.
DB21056W Directory changes may not be effective until the directory cache is
refreshed.
[db2inst1@linux2 db2bak]$ db2 list db directory
System Database Directory
Number of entries in the directory = 1
Database 1 entry:
Database alias = SAMPLE
Database name = SAMPLE
Local database directory = /db2home/db2inst1
Database release level = d.00
Comment =
Directory entry type = Indirect
Catalog database partition number = 0
Alternate server hostname = 130.30.3.252
Alternate server port number = 60000
7.2 Catalog the Database on the Client Machine
db2 catalog tcpip node testhadr remote 130.30.3.252 server 60000
db2 catalog db sample as testhadr at node testhadr
7.3 Normal Takeover by the Standby Server
1. Create a test table on the primary database
C:\Documents and Settings\fengsh>db2 connect to testhadr user db2inst1 using db2inst1
db2 => CREATE TABLE HADRTEST(ID INTEGER NOT NULL WITH DEFAULT,NAME VARCHAR(10),PRIMARY KEY (ID))
DB20000I The SQL command completed successfully.
db2 => INSERT INTO HADRTEST (ID,NAME) VALUES (1,'张三')
DB20000I The SQL command completed successfully.
db2 => INSERT INTO HADRTEST (ID,NAME) VALUES (2,'李四')
DB20000I The SQL command completed successfully.
db2 => select * from hadrtest
ID          NAME
----------- ----------
          1 张三
          2 李四

  2 record(s) selected.
db2 =>
2. Take over as the primary database on the standby server
[db2inst1@linux2 db2bak]$ db2 takeover hadr on database sample
DB20000I The TAKEOVER HADR ON DATABASE command completed successfully.
A snapshot shows that HADR on the standby server has taken over and it is now the primary database:
[db2inst1@linux2 db2bak]$ db2 get snapshot for db on sample|more
HADR Status
Role = Primary
State = Peer
Synchronization mode = Nearsync
Connection status = Connected, 2011-03-09 14:55:55.852957
Peer window end = 2011-03-09 15:01:08.000000 (1299654068)
Peer window (seconds) = 120
Heartbeats missed = 0
Local host = 130.30.3.136
Local service = DB2_HADR_2
Remote host = 130.30.3.252
Remote service = DB2_HADR_1
Remote instance = db2inst1
timeout(seconds) = 3
Primary log position(file, page, LSN) = S0000001.LOG, 0, 00000000049DB3A1
Standby log position(file, page, LSN) = S0000001.LOG, 0, 00000000049DB3A1
Log gap running average(bytes) = 0
Correspondingly, the database on the original primary server has become the standby database:
[db2inst1@linux1 ~]$ db2 get snapshot for database on sample
HADR Status
Role = Standby
State = Peer
Synchronization mode = Nearsync
Connection status = Connected, 2011-03-09 14:55:56.222980
Peer window end = 2011-03-09 15:03:16.000000 (1299654196)
Peer window (seconds) = 120
Heartbeats missed = 0
Local host = 130.30.3.252
Local service = DB2_HADR_1
Remote host = 130.30.3.136
Remote service = DB2_HADR_2
Remote instance = DB2INST1
timeout(seconds) = 3
Primary log position(file, page, LSN) = S0000001.LOG, 0, 00000000049DB3A1
Standby log position(file, page, LSN) = S0000001.LOG, 0, 00000000049DB3A1
Log gap running average(bytes) = 0
3. Client application verification
When the client can no longer connect to the original primary database, its connection is automatically rerouted to the new primary database; no reconfiguration is required on the client.
C:\Documents and Settings\fengsh>db2 "select * from hadrtest"
SQL30108N A connection failed but has been re-established. The host name or
IP address is "130.30.3.136" and the service name or port number is "60000".
Special registers may or may not be re-attempted (Reason code = "1"). SQLSTATE=08506
C:\Documents and Settings\fengsh>db2 "select * from hadrtest"
ID          NAME
----------- ----------
          1 张三
          2 李四

  2 record(s) selected.
Take over the database on the original primary server again to restore the initial primary/standby roles, and verify that the client connection is again rerouted automatically:
C:\Documents and Settings\fengsh>db2 "select * from hadrtest"
SQL30108N A connection failed but has been re-established. The host name or
IP address is "130.30.3.252" and the service name or port number is "60000".
Special registers may or may not be re-attempted (Reason code = "1"). SQLSTATE=08506
C:\Documents and Settings\fengsh>db2 "select * from hadrtest"
ID          NAME
----------- ----------
          1 张三
          2 李四

  2 record(s) selected.
7.4 Takeover When the Primary Database Fails
This tests takeover under abnormal conditions such as a primary database crash, a primary host failure, or a network failure.
1. Manually bring down the primary server with the db2_kill command
[db2inst1@linux1 db2dump]$ db2_kill
ipclean: Removing DB2 engine and client's IPC resources for db2inst1.
Run db2 get snapshot for db on sample on the standby server to get the snapshot information:
HADR Status
Role = Standby
State = Disconnected peer
Synchronization mode = Nearsync
Connection status = Disconnected, 2011-03-10 16:54:28.400277
Peer window end = 2011-03-10 17:00:29.000000 (1299747629)
Peer window (seconds) = 120
Heartbeats missed = 0
Local host = 130.30.3.136
Local service = DB2_HADR_2
Remote host = 130.30.3.252
Remote service = DB2_HADR_1
Remote instance = db2inst1
timeout(seconds) = 10
Primary log position(file, page, LSN) = S0000006.LOG, 59, 0000000005D860C4
Standby log position(file, page, LSN) = S0000006.LOG, 59, 0000000005D860C4
Log gap running average(bytes) = 0
2. Take over as the HADR primary database on the standby server
[db2inst1@linux2 sample]$ db2 takeover hadr on database sample
SQL1770N Takeover HADR cannot complete. Reason code = "1".
[db2inst1@linux2 sample]$ db2 takeover hadr on database sample by force
DB20000I The TAKEOVER HADR ON DATABASE command completed successfully.
HADR on the standby machine has taken over as the primary database:
HADR Status
Role = Primary
State = Disconnected
Synchronization mode = Nearsync
Connection status = Disconnected, 2011-03-10 16:54:28.400277
Peer window end = Null (0)
Peer window (seconds) = 120
Heartbeats missed = 0
Local host = 130.30.3.136
Local service = DB2_HADR_2
Remote host = 130.30.3.252
Remote service = DB2_HADR_1
Remote instance = db2inst1
timeout(seconds) = 10
Primary log position(file, page, LSN) = S0000006.LOG, 59, 0000000005D860C4
Standby log position(file, page, LSN) = S0000000.LOG, 0, 0000000000000000
Log gap running average(bytes) = 0
3. Client application verification
When the client can no longer connect to the original primary database, its connection is automatically rerouted to the new primary database; no reconfiguration is required on the client.
C:\Documents and Settings\fengsh>db2 select * from hadrtest
ID          NAME
----------- ----------
          1 张三
          2 李四

  2 record(s) selected.
C:\Documents and Settings\fengsh>db2 select * from hadrtest
SQL30108N A connection failed but has been re-established. The host name or
IP address is "130.30.3.136" and the service name or port number is "60000".
Special registers may or may not be re-attempted (Reason code = "1"). SQLSTATE=08506
C:\Documents and Settings\fengsh>db2 select * from hadrtest
ID          NAME
----------- ----------
          1 张三
          2 李四

  2 record(s) selected.
C:\Documents and Settings\fengsh>db2 select * from hadrtest
SQL30108N A connection failed but has been re-established. The host name or
IP address is "130.30.3.252" and the service name or port number is "60000".
Special registers may or may not be re-attempted (Reason code = "1"). SQLSTATE=08506
C:\Documents and Settings\fengsh>db2 select * from hadrtest
ID          NAME
----------- ----------
          1 张三
          2 李四

  2 record(s) selected.
7.5 Takeover by the Recovered Original Primary Server
After an HADR forced takeover caused by an abnormal condition such as a primary database crash, a primary host failure, or a network failure, once the original primary server has recovered, take over the HADR primary role again to restore the initial primary/standby layout.
1. Start the instance
[db2inst1@linux1 db2dump]$ db2start
03/10/2011 17:05:54 0 0 SQL1063N DB2START processing was successful.
SQL1063N DB2START processing was successful.
2. Start HADR as the standby database
[db2inst1@linux1 db2dump]$ db2 start hadr on database sample as standby
DB20000I The START HADR ON DATABASE command completed successfully.
3. Take over again to become the primary database
[db2inst1@linux1 db2dump]$ db2 takeover hadr on db sample
DB20000I The TAKEOVER HADR ON DATABASE command completed successfully.
4. Client application verification
The client connection is automatically rerouted back to the primary database on the original primary server.
8 Other Considerations
8.1 HADR Design Restrictions
1. HADR is supported only on DB2 UDB Enterprise Server Edition (ESE). However, HADR is not supported when ESE has multiple database partitions.
2. The operating system and the DB2 software version and level must be the same on the primary and standby hosts, except for short periods during a rolling upgrade.
3. The DB2 UDB release on the primary and standby databases must have the same bit size (32-bit or 64-bit).
4. Place the archived logs on a dedicated, high-performance disk or file system that both the primary and standby nodes can access.
5. Preferably use a dedicated network for HADR communication.
6. Consider using multiple network adapters.
7. Consider providing a virtual IP address for access to the database servers.
8. Consider using automatic client reroute; choose either the virtual IP or automatic client reroute, not both at the same time.
Compared with a virtual IP, automatic client reroute is built into the DB2 software and requires no additional hardware or software.
9. Choose an appropriate hadr_syncmode setting.
10. Versions before V9.7 do not support read operations on the standby database and clients cannot connect to it; V9.7 allows the standby to be opened for queries (see the sketch below).
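A hedged sketch of enabling reads on the standby in V9.7; the DB2_HADR_ROS registry variable and the restart sequence below are assumptions to illustrate the idea, so verify them against the documentation for your fix pack level:
db2 deactivate db sample                       # on the standby server, as the instance owner
db2stop
db2set DB2_HADR_ROS=ON                         # assumed registry variable for reads on standby
db2start
db2 start hadr on database sample as standby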
8.2 Basic Maintenance Operations
8.2.1 Start HADR
1. Start HADR on the standby database on the standby server:
db2 deactivate database sample
db2 start hadr on database sample as standby
2. Start HADR on the primary database on the primary server:
db2 deactivate database sample
db2 start hadr on database sample as primary
8.2.2 Stop HADR
1. Stop HADR on the primary database on the primary server:
db2 deactivate database sample
db2 stop hadr on database sample
2. Stop HADR on the standby database on the standby server:
db2 deactivate database sample
db2 stop hadr on database sample
8.2.3 HADR Takeover
Run an ordinary takeover command on the standby server:
db2 takeover hadr on database sample
If the ordinary takeover fails, or the primary database has failed, run a forced takeover command:
db2 takeover hadr on database sample by force
8.3 Operations Replicated by HADR
The following operations are replicated by HADR from the primary database to the standby database:
Data Definition Language (DDL):
– Includes create and drop table, index, and more
Data Manipulation Language (DML):
– Insert, update, or delete
Buffer pool:
– Create and drop
Table space:
– Create, drop, or alter.
– Container sizes, file types (raw device or file system), and paths must be identical on the primary and the standby.
– Table space types (DMS or SMS) must be identical on both the servers.
– If the database is enabled for automatic storage, then the storage paths must be identical.
Online REORG:
– Replicated. The standby’s Peer state with the primary is rarely impacted
by this, although it can impact the system because of the increase in the number of log records generated.
Offline REORG:
– Replicated.
Stored procedures and user defined functions (UDF):
– Creation statements are replicated.
– HADR does not replicate the external code for stored procedures or UDF object library files. In order to keep these in synchronization between the primary and the standby, it is the user’s responsibility to set these up on identical paths on both the primary and the standby servers. If the paths are not identical, invocation fails on the standby.
8.4 Operations Not Replicated by HADR
The following operations are not replicated by HADR to the standby database:
NOT LOGGED INITIALLY tables:
– Tables created or altered with the NOT LOGGED INITIALLY option
specified are not replicated. The tables are invalid on the standby and
after a takeover, attempts to access these tables result in an error.
BLOBs and CLOBs:
– Non-logged Large Objects (LOBs) are not replicated, but the space is
allocated and LOB data is set to binary zeroes on the standby.
Data links:
– NOT SUPPORTED in HADR.
Database configuration changes:
– UPDATE DB CFG is not replicated.
– Dynamic database configuration parameters can be updated on both the
primary and the standby without interruption. For database configuration
parameters that require the database to be recycled, we recommend
following the rolling upgrade procedure.
Recovery history file:
– NOT automatically shipped to standby.
– You can place an initial copy of the history file by using a primary backup
image to restore the history file to the standby using the RESTORE
command with the REPLACE HISTORY FILE option for example:
db2 restore database sample replace history file
– You can also update the history file on the standby from a primary backup
image by the RESTORE command for example:
db2 restore database sample history file
– If a takeover operation occurs and the standby database has an
up-to-date history file, backup and restore operations on the new primary
generate new records in the history file and blend seamlessly with the
records generated on the original primary. If the history file is out-of-date
or has missing entries, an automatic incremental restore might not be
possible; instead, a manual incremental restore operation is required.
We think that the recovery history file for the primary database is not
relevant to the standby database, because it contains DB2 recovery
information specific to that primary server; DB2 keeps records of which
backup file to use for recovery to prior points in time, among other things.
In most HADR implementations, the standby system is only used in the
primary role as a temporary measure while the old primary server is
being fixed, which makes the primary's version of the recovery history file
irrelevant; it is also potentially incorrect for use on the standby server
because log file numbers and locations may differ between the primary
and the standby.