DB2 pureScale 10.1 Installation Guide on Red Hat Enterprise Linux 6.1

DB2 pureScale is IBM's shared-disk, scalable database cluster software, similar to Oracle RAC. Rather than covering its features in detail, this guide goes straight into a hands-on installation of pureScale on RHEL.

 1      Installation Guide

Software and hardware versions

IB switch: Mellanox

PC server: IBM x3850 X5

Storage: DS5100

OS: RHEL 6.1

DB2: DB2 pureScale 10.1.0.2

 

It is strongly recommended to use pureScale-certified hardware and software versions; otherwise you may run into all sorts of puzzling problems.

1.1    Planning

 

Hostname   IP address      Management IP   IB address      CF role    Member

wtydb21    192.168.2.101   192.168.1.101   192.168.0.101   primary    YES

wtydb22    192.168.2.102   192.168.1.102   192.168.0.102   secondary  YES

wtydb23    192.168.2.103   192.168.1.103   192.168.0.103   -          YES

IB address plan

            Management IP   Floating IP     IB

Switch-1    192.168.1.104   192.168.1.106   192.168.0.201

Switch-2    192.168.1.105   -               192.168.0.202

 

1.2    Pre-installation Environment Preparation

1.2.1  NFS and yum Repository Configuration

1.2.1.1 Configure NFS

wtydb21 is the NFS server and the other hosts are clients. To make installing packages on all servers convenient, configuring a yum repository is recommended.

1. Verify the NFS packages are installed:

rpm -qa | grep nfs

Enable NFS at boot, on all servers:

[root@stydb1 ~]# chkconfig --list nfs

nfs             0:off   1:off   2:off   3:off   4:off   5:off   6:off

[root@stydb1 ~]# chkconfig nfs on

[root@stydb1 ~]# chkconfig --list nfs

nfs             0:off   1:off   2:on    3:on    4:on    5:on    6:off  

service rpcbind start

service nfs start

 

2. Make sure the firewall on the NFS server is disabled:

service iptables stop

3. NFS server configuration

A. Mount the disk to be shared; this example shares the /data directory:

mount /dev/mapper/vg_1-lv_data /data

B. Add the following line to /etc/fstab so the volume mounts automatically:

/dev/mapper/vg_1-lv_data /data                   ext4    defaults        1 2

C. Export the /data directory over NFS:

vi /etc/exports

/data   wtydb22(rw) wtydb23(rw)

/usr/sbin/exportfs -r

(A wildcard entry such as /data *(rw) also works, but restricting the export to the client hosts is safer.)

4. Client configuration

A. Check the exports on the NFS server:

[root@wtydb22 ~]# showmount -e wtydb21

Export list for wtydb21:

/data wtydb23,wtydb22

 

B. Add an entry to /etc/fstab:

wtydb21:/data /data nfs rw 0 0

C. Mount manually:

mkdir /data

mount wtydb21:/data/  /data/

1.2.1.2 Configure the yum repository

1. Create a directory: mkdir /data/rhel6.1

Copy the DVD contents into /data/rhel6.1; the DVD or an ISO image can also be used directly.

2. Configure yum (on all three servers):

vi /etc/yum.repos.d/rhel6.repo

[Server]

name=rhel6server

baseurl=file:///data/rhel6.1/Server/

enabled=1

gpgcheck=1

gpgkey=file:///data/rhel6.1/RPM-GPG-KEY-redhat-release

3. Test:

yum list

 

 

The repository can also be set up from an ISO file:

mount -o loop /software/rhel-server-6.1-x86_64-dvd.iso /mnt/cdrom

 

vi /etc/yum.repos.d/rhel6.repo

[Server]

name=rhel6server

baseurl=file:///mnt/cdrom/Server/

enabled=1

gpgcheck=1

gpgkey=file:///mnt/cdrom/RPM-GPG-KEY-redhat-release

 

 

1.2.2  Passwordless SSH Between Hosts

1. Generate an SSH key pair; two files are created under ~/.ssh/:

ssh-keygen -t dsa (press Enter through all prompts)

In the .ssh directory:

id_dsa (the private key) and id_dsa.pub (the public key) for DSA encryption

 

2. Append the contents of id_dsa.pub from all three hosts into a single authorized_keys file, then place that authorized_keys in the .ssh directory on all three hosts.
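As a sketch, the merge in step 2 is a simple cat. The key files below are stand-ins so the snippet runs anywhere; on the real hosts you would collect each host's ~/.ssh/id_dsa.pub (e.g. via scp) instead:

```shell
# Stand-in public keys for the three hosts (hypothetical contents).
mkdir -p demo-ssh
printf 'ssh-dss AAAAB...key1 root@wtydb21\n' > demo-ssh/wtydb21.pub
printf 'ssh-dss AAAAB...key2 root@wtydb22\n' > demo-ssh/wtydb22.pub
printf 'ssh-dss AAAAB...key3 root@wtydb23\n' > demo-ssh/wtydb23.pub
# Merge every host's public key into one authorized_keys file;
# this file is then distributed to ~/.ssh/ on all three hosts.
cat demo-ssh/*.pub > demo-ssh/authorized_keys
wc -l < demo-ssh/authorized_keys
```

Remember to chmod 600 authorized_keys and chmod 700 ~/.ssh, otherwise sshd will ignore the file.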

1.2.3  GPFS Installation and Configuration (optional)

Normally GPFS does not need to be installed separately; it is installed automatically as part of installing the DB2 pureScale software.

1.2.3.1 Manual installation

[root@wtydb22 data]# cd /data/esefp2/server/db2/linuxamd64/gpfs

[root@wtydb22 gpfs]# ls

base  db2ckgpfs  db2gutil  errMsg  fp  installGPFS  uninstallGPFS

[root@wtydb22 gpfs]# ./installGPFS -i -g

"GPFS" is not installed.

 

DBI1070I  Program installGPFS completed successfully.

 

 

 

# vi /tmp/gpfsprofile

wtydb21:quorum

wtydb22:quorum

wtydb23:quorum

 

[root@pure1, /]

mmcrcluster -n /tmp/gpfsprofile -p wtydb21 -s wtydb22 -C gpfs_cluster -r /usr/bin/ssh -R /usr/bin/scp

 

# mmcrcluster -n /home/gpfs/gpfs.allnodes -p node1 -s node2 -C gpfs_cluster -r /usr/bin/ssh -R /usr/bin/scp

# cd /opt/ibm/db2/V10.1/bin

# ./db2cluster -cfs -add -license

The license for the shared file system

 

 

./db2cluster -cfs -start -all

 

./db2cluster -cfs -create -filesystem db2fs -disk /dev/dm-3 -mount /db2fs

1.2.3.2 Installing and Configuring GPFS

 

1.   Install the GPFS packages (on all three hosts)

Here they are installed directly from the DB2 pureScale media; see the references for other installation methods.

Verify the packages installed correctly with rpm -qa | grep gpfs

2.   Build the portability layer: cd /usr/lpp/mmfs/src && make Autoconfig && make World && make InstallImages

3.   Update PATH (on all three hosts)

On all 3 hosts, append the following to the .bash_profile file in the HOME directory:

export PATH=$PATH:/usr/lpp/mmfs/bin

4.   Create the directory to be used for the GPFS file system (on all three hosts)

mkdir /db2fs

5.   Create the GPFS cluster node file

[root@wtydb21 tmp]# vi /tmp/gpfsprofile

wtydb21:quorum-manager

wtydb22:quorum-manager

wtydb23:quorum-manager

6. Create the cluster, making sure to specify ssh:

[root@wtydb21 pam.d]#

[root@wtydb21 gpfs]# mmcrcluster -N /tmp/gpfsprofile  -p wtydb21 -s wtydb22  -C gpfs_cluster -r /usr/bin/ssh -R /usr/bin/scp

Sat Apr  6 12:17:35 CST 2013: mmcrcluster: Processing node wtydb21

Sat Apr  6 12:17:35 CST 2013: mmcrcluster: Processing node wtydb22

Sat Apr  6 12:17:38 CST 2013: mmcrcluster: Processing node wtydb23

mmcrcluster: Command successfully completed

mmcrcluster: Warning: Not all nodes have proper GPFS license designations.

    Use the mmchlicense command to designate licenses as needed.

mmcrcluster: Propagating the cluster configuration data to all

  affected nodes.  This is an asynchronous process.

 

Meaning of the mmcrcluster parameters:

-C  sets the cluster name

-U  defines the UID domain

-N  specifies the node file

-p  specifies the primary cluster configuration server

-s  specifies the secondary cluster configuration server

7. Accept the license agreement

[root@wtydb21 pam.d]# mmchlicense server --accept -N wtydb21,wtydb22,wtydb23

8. Verify the cluster:

[root@wtydb21 gpfs]# mmlscluster

 

GPFS cluster information

========================

  GPFS cluster name:         gpfs_cluster.wtydb21

  GPFS cluster id:           12146727015547904479

  GPFS UID domain:           gpfs_cluster.wtydb21

  Remote shell command:      /usr/bin/ssh

  Remote file copy command:  /usr/bin/scp

 

GPFS cluster configuration servers:

-----------------------------------

  Primary server:    wtydb21

  Secondary server:  wtydb22

 

 Node  Daemon node name  IP address   Admin node name  Designation

-------------------------------------------------------------------

   1   wtydb21           192.168.2.101  wtydb21          quorum-manager

   2   wtydb22           192.168.2.102  wtydb22          quorum-manager

   3   wtydb23           192.168.2.103  wtydb23          quorum-manager

9. Create the NSD, using /dev/dm-3

 

[root@wtydb21 etc]# vi /tmp/nsdprofile

dm-3:::dataAndMetadata::

[root@wtydb21 gpfs]#  mmcrnsd -F /tmp/nsdprofile

mmcrnsd: Processing disk dm-3

mmcrnsd: Propagating the cluster configuration data to all

  affected nodes.  This is an asynchronous process.

 

The system automatically rewrites /tmp/nsdprofile as follows:

[root@wtydb21 ~]# cat /tmp/nsdprofile

# dm-4:::dataAndMetadata::

gpfs1nsd:::dataAndMetadata:-1::system

 

10. Start the cluster

[root@wtydb21 /]# mmstartup -a

.

[root@wtydb21 src]# mmgetstate -a -L

 

 Node number  Node name       Quorum  Nodes up  Total nodes  GPFS state  Remarks   

------------------------------------------------------------------------------------

       1      wtydb21            2        2          3       active      quorum node

       2      wtydb22            2        2          3       active      quorum node

       3      wtydb23            2        2          3       active     

11. Create the GPFS file system

[root@wtydb21 src]# mmcrfs /db2fs gpfs_lv -F /tmp/nsdprofile -A yes -n 30 -v no

 

The following disks of gpfs_lv will be formatted on node wtydb21:

    gpfs1nsd: size 100829184 KB

Formatting file system ...

Disks up to size 848 GB can be added to storage pool system.

Creating Inode File

Creating Allocation Maps

Creating Log Files

Clearing Inode Allocation Map

Clearing Block Allocation Map

Formatting Allocation Map for storage pool system

Completed creation of file system /dev/gpfs_lv.

mmcrfs: Propagating the cluster configuration data to all

  affected nodes.  This is an asynchronous process.

 

The parameters mean:

/db2fs   mount point of the file system

gpfs_lv  device name of the file system

-F  specifies the NSD stanza file

-A yes  mount automatically at startup

-B  block size (64 KB; not specified in the command above, so the default is used)

-n 30  estimated number of nodes that will mount the file system

-v no  do not verify whether the disks already contain a file system

12. Mount the file system

[root@wtydb21 /]# mount /db2fs

[root@wtydb21 /]# df

 

13. Configure GPFS to start automatically at boot

[root@wtydb21 /]# mmchconfig autoload=yes

 

14. Query the GPFS configuration

[root@wtydb21 share]# mmlsconfig

 

[root@wtydb21 share]# mmgetstate -a

1.2.3.3 Adjusting Parameters

DB2 pureScale requires the following parameter settings:

[root@wtydb21 src]# /opt/ibm/db2/V10.1/bin/db2cluster -cfs -verify -configuration

The shared file system cluster option 'adminMode' has not been set to the optimal value of 'allToAll'.  See the DB2 Information Center for more details.

The shared file system cluster option 'maxFilesToCache' has been set too low.  The value should be at least '10000'.  See the DB2 Information Center for more details.

The shared file system cluster option 'usePersistentReserve' has not been set to the optimal value of 'yes'.  See the DB2 Information Center for more details.

The shared file system cluster option 'verifyGpfsReady' has not been set to the optimal value of 'yes'.  See the DB2 Information Center for more details.

The shared file system cluster option 'failureDetectionTime' could not be verified.  Re-issue this command after the cluster manager peer domain has been created and is online.

A diagnostic log has been saved to '/tmp/ibm.db2.cluster.toBL0B'.

The settings are changed as follows:

mmchconfig adminMode=allToAll

mmchconfig maxFilesToCache=20000

mmchconfig pagepool=1024M

mmchconfig usePersistentReserve=yes

mmchconfig verifyGpfsReady=yes

 

mmshutdown -a

db2cluster -cfs -stop -all

mmchconfig failureDetectionTime=35

db2cluster -cfs -start -all

 

mmstartup -a

 

 

1.2.3.4 Verifying GPFS

db2cluster -cfs -verify -configuration

db2cluster -cfs -verify -filesystem mygpfs1

 

The instance owner db2sdin1 must have access rights to the file system.

1.2.3.5 Deleting the GPFS Cluster (not verified)

1. fuser -kcu /ods/gpfs

2. umount /ods/gpfs  # run on all nodes

3. mmdelfs gpfslv

4. mmlsfs gpfslv  # check the result

5. mmdelnsd -F /ods/gpfsnsdprofile

6. mmshutdown -a

7. mmdelnode -a

8. mmdelnode -f  # finally remove the cluster

 

 

Optional. For DB2 managed GPFS installations, verify the remote shell and remote file copy settings default to db2locssh and db2scp. For example:

/usr/lpp/mmfs/bin/mmlscluster

Remote shell command:      /var/db2/db2ssh/db2locssh

Remote file copy command:  /var/db2/db2ssh/db2scp

 

 

1. Unmount all GPFS file systems: mmumount all -a

2. Delete the file system: mmdelfs gpfs_lv

3. Delete the NSDs: mmdelnsd -F /tmp/nsdprofile

4. Stop GPFS: mmshutdown -a

5. Uninstall the GPFS packages; first list what is installed with rpm -qa | grep gpfs

rpm -e gpfs.gpl

rpm -e gpfs.msg.en_us

rpm -e gpfs.base

rpm -e gpfs.docs

6. Delete the directories /var/mmfs and /usr/lpp/mmfs

7. Delete the files beginning with mm under /var/adm/ras

8. Delete the /tmp/mmfs directory (if present) and its contents

 

 

The GPFS log file path is /var/adm/ras

 

1.2.3.6 Explanation

The verifyGpfsReady=yes configuration attribute is set, but the /var/mmfs/etc/gpfsready script could not be executed.

1.2.3.7 User response

Make sure /var/mmfs/etc/gpfsready exists and is executable, or disable the verifyGpfsReady option via mmchconfig verifyGpfsReady=no.
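If the script is simply missing, a no-op hook is enough to satisfy the check. The sketch below uses a scratch directory so it can run unprivileged; on a real host the directory is /var/mmfs/etc:

```shell
# Stand-in for /var/mmfs/etc while experimenting unprivileged.
ETC=./mmfs-etc-demo
mkdir -p "$ETC"
# A gpfsready hook that does nothing but exit successfully.
printf '#!/bin/sh\nexit 0\n' > "$ETC/gpfsready"
chmod +x "$ETC/gpfsready"
test -x "$ETC/gpfsready" && echo "gpfsready installed"
```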

 

ERROR: DBI20093E  The DB2 installer failed to set up the db2sshid because the GPFS file system is a user-managed file system.

 

1.2.4  Network Environment

1.2.4.1 Modify the /etc/hosts file

[root@wtydb21 ~]# cat /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.0.101 wtydb21-ib0

192.168.3.101 wtydb21-ib3

192.168.0.102 wtydb22-ib0

192.168.3.102 wtydb22-ib3

192.168.0.103 wtydb23-ib0

192.168.3.103 wtydb23-ib3

 

192.168.2.101 wtydb21

192.168.2.102 wtydb22

192.168.2.103 wtydb23

Note: separate the IP address and host name with spaces, never with a Tab, or errors will result. Do not map the host names to the loopback address.
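A quick way to catch the Tab problem is to grep the file for literal tab characters. This sketch runs against a demo copy of the file (the second line below deliberately contains a Tab); on a real host, run the grep against /etc/hosts:

```shell
# Build a demo hosts file: line 1 uses spaces (good), line 2 a Tab (bad).
printf '192.168.2.101 wtydb21\n192.168.2.102\twtydb22\n' > hosts.demo
# List any lines containing a Tab character, with line numbers.
grep -n "$(printf '\t')" hosts.demo || echo "no tabs found"
```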

1.2.4.2 Change the host name

Step 1:

# hostname wtydb21

Step 2:

Change HOSTNAME in /etc/sysconfig/network

Step 3:

Update the /etc/hosts file

1.2.4.3 Configure NIC bonding

Bond eth0 and eth4 in active-backup mode.

Use ethtool to confirm link status.

1. cd /etc/sysconfig/network-scripts/ and edit the configuration files

Create ifcfg-bond0:

DEVICE=bond0

BOOTPROTO=static

IPADDR=192.168.2.101

NETMASK=255.255.255.0

GATEWAY=192.168.2.254

ONBOOT=YES

TYPE=Ethernet

USERCTL=no

BONDING_OPTS="miimon=100 mode=1"

Modify ifcfg-eth0:

 

DEVICE=eth0

BOOTPROTO=none

ONBOOT=yes

USERCTL=no

MASTER=bond0

SLAVE=yes

   

Modify ifcfg-eth4:

 

DEVICE=eth4

BOOTPROTO=none

ONBOOT=yes

USERCTL=no

MASTER=bond0

SLAVE=yes

 

2. vi /etc/modprobe.d/bonding.conf (on older releases, /etc/modprobe.conf) and add:

alias bond0 bonding

3. Restart the network:

service network restart

4.  Disable NetworkManager (needed on RHEL 6; this step did not exist when configuring RHEL 5):

chkconfig NetworkManager off;

/etc/init.d/NetworkManager stop;

chkconfig network on;

/etc/init.d/network restart

ifconfig bond0 up

 

 

1.2.4.4 Disable the firewall and SELinux

1. chkconfig iptables off

service iptables stop

 

2. SELinux

vi /etc/selinux/config

 

#SELINUX=enforcing

SELINUX=disabled

Set SELINUX= in the file to disabled, then reboot.

To avoid rebooting the system, run setenforce 0 instead (the file change is still needed for persistence).

1.2.5  InfiniBand Environment Preparation

1.2.5.1 Switch configuration

With the Mellanox SM HA software, the system manager can enter and modify all IB subnet configurations for the subnet managers from a single location. Assign a virtual IP address (VIP) to the management port of the switch to manage the high availability domain. The system manager must configure all the switches in a Mellanox SM HA environment to join the same IB subnet, and assign the subnet a name. After joining the subnet, the subnet managers are synchronized and you must select one as the master subnet manager; the others become standby subnet managers.

 

1.2.5.2 Install the InfiniBand driver software

Install the required packages and reboot (RHEL 6.3 shown):

[root@serv]# yum groupinstall "Infiniband Support"

[root@serv]# yum install infiniband-diags perftest qperf opensm

[root@serv]# chkconfig rdma on

[root@serv]# chkconfig opensm on

[root@serv]# shutdown -r now

 

 

1.2.5.3 Configure IP addresses

Configured like an ordinary Ethernet interface, except that the DB2 database uses the RDMA protocol rather than TCP/IP.

[root@wtydb22 network-scripts]# cat ifcfg-ib0

DEVICE="ib0"

BOOTPROTO="static"

IPADDR=192.168.0.102

#NETMASK=255.255.255.0

PREFIX=24

NM_CONTROLLED="yes"

ONBOOT="yes"

TYPE="InfiniBand"

 

1.2.5.4 Parameter configuration

1. The log_mtts_per_seg parameter

If the host running a CF or member has more than 64 GB of memory, the Mellanox HCA driver (mlx4_core) module parameter log_mtts_per_seg must be raised from 3 (the default) to 7 to allow larger memory registration.

 

To increase the size, issue the following command as root:

On SUSE:

echo "options mlx4_core log_mtts_per_seg=7" >> /etc/modprobe.conf.local

On RHEL:

echo "options mlx4_core log_mtts_per_seg=7" >> /etc/modprobe.conf

 

On RHEL 6, run instead:

echo "options mlx4_core log_mtts_per_seg=7" >> /etc/modprobe.d/modprobe.conf

The change takes effect after a server reboot. To check whether it is active on the module, run:

cat /sys/module/mlx4_core/parameters/log_mtts_per_seg

 

2. The /etc/rdma/dat.conf configuration file

On RHEL 6.1, the DAT configuration file is located in /etc/rdma/dat.conf and is updated by the group installation of the "InfiniBand Support" package.

The default configuration is already correct:

 

[root@wtydb21 Packages]# cat /etc/rdma/dat.conf

ofa-v2-mlx4_0-1 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 1" ""

ofa-v2-mlx4_0-2 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 2" ""

ofa-v2-ib0 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib0 0" ""

……

1.2.5.5 Best practices

Network configuration: check the InfiniBand switch configuration.

a. The management software needs at least version image-PPC_M405EXEFM_1.1.2500.img

b. Enable the subnet manager (SM); configure a VIP on the management port and IP addresses on the inter-switch ports

c. Configure failover

d. Use at least two inter-switch links between the two switches; a common guideline is half of the total number of switch ports used by the CF and member connections

1.2.6  Dependent Packages

Install them on all three hosts with yum install:

 

   

yum install   libstdc++ 

yum install   pam-1.1.1

yum install   pam_krb5-2.3.11

yum install   pam-devel

yum install   pam_pkcs11

 

The full list of required packages is in the official installation documentation:

libstdc++-4.4.5-6.el6.i686,

pam-1.1.1-8.el6.i686,

pam_krb5-2.3.11-6.el6.i686,

pam-devel-1.1.1-8.el6.i686,

pam_pkcs11-0.6.2-11.1.el6.i686,

pam_ldap-185-8.el6.i686

The following packages are already installed by default on RHEL 6.1.

For InfiniBand network type (both 32-bit and 64-bit libraries unless specified):

libibcm

dapl (64-bit libraries only)

ibsim (64-bit libraries only)

ibutils (64-bit libraries only)

libibverbs

librdmacm

libcxgb3

libibmad

libibumad

libipathverbs (64-bit libraries only)

libmlx4

libmthca

libnes (64-bit libraries only)

libmlx4

rdma (no architecture)

 

 

ntp-4.2.4p8-2.el6.x86_64/ntpdate-4.2.4p8-2.el6.x86_64

libstdc++-4.4.5-6.el6.x86_64

libstdc++-4.4.5-6.el6.i686

glibc-2.12-1.25.el6.x86_64

glibc-2.12-1.25.el6.i686

gcc-c++-4.4.5-6.el6.x86_64

gcc-4.4.5-6.el6.x86_64

kernel-2.6.32-131.0.15.el6.x86_64

kernel-devel-2.6.32-131.0.15.el6.x86_64

kernel-headers-2.6.32-131.0.15.el6.x86_64

kernel-firmware-2.6.32-131.0.15.el6.noarch

ntp-4.2.4p8-2.el6.x86_64

ntpdate-4.2.4p8-2.el6.x86_64

sg3_utils-1.28-3.el6.x86_64

 

 

 

sg3_utils-libs-1.28-3.el6.x86_64

binutils-2.20.51.0.2-5.20.el6.x86_64

binutils-devel-2.20.51.0.2-5.20.el6.x86_64

openssh-5.3p1-52.el6.x86_64

cpp-4.4.5-6.el6.x86_64

ksh-20100621-16.el6.x86_64

 

 

Installation example:

 

[root@wtydb23 ~]#  yum install compat-libstdc++-33-3.2.3-69.el6.x86_64.rpm

 

 

1.2.6.1 GPFS dependency packages

In addition, GPFS needs the i686 (32-bit) packages below.

For multiple communication adapter ports on an InfiniBand network, and single or multiple communication adapter ports at CFs on a 10GE network, the minimum supported level is RHEL 6.1.

The i686 (32-bit) packages might not get installed by default when installing an x86_64 server. Make sure that all the 32-bit dependencies are explicitly installed. For example:

libstdc++-4.4.5-6.el6.i686,

pam-1.1.1-8.el6.i686,

pam_krb5-2.3.11-6.el6.i686,

pam-devel-1.1.1-8.el6.i686,

pam_pkcs11-0.6.2-11.1.el6.i686,

pam_ldap-185-8.el6.i686

Alternatively, run the yum command after creating a source from a local DVD or after registering to RHN:

yum install *.i686

 

 

Ensure the following 32-bit RSCT packages are installed:

– libibcm.i686

– libibverbs-rocee.i686

– librdmacm.i686

– libcxgb3.i686

– libibmad.i686

– libibumad.i686

– libmlx4-rocee.i686

– libmthca.i686

 

DB2 pureScale Feature requires libstdc++.so.6. Verify that the files exist with the following commands:

ls /usr/lib/libstdc++.so.6*

ls /usr/lib64/libstdc++.so.6*

 

1.2.6.2 Installing SAM

[root@wtydb21 tsamp]# cat /tmp/prereqSAM.1.log

prereqSAM: >>> Prerequisite on wtydb21 check - log started : Fri Apr 12 14:22:16 CST 2013

prereqSAM:  OPTIONS         = ''

prereqSAM:  OPT_SILENT      = 0

prereqSAM:  OPT_NOLICCHECK  = 0

prereqSAM: Detected operating system  Linux

prereqSAM: Detected architecture  i386x

prereqSAM: Detected distribution  RH

prereqSAM: Supported operating system versions  RH Linux i386x - 5.0 6.0

prereqSAM: Detected operating system version

Red Hat Enterprise Linux Server release 6.1 (Santiago)

 

prereqSAM: rpm package and version installed  'ksh'

 

prereqSAM: Using default prerequisite checking on the following rpm package  'perl'

prereqSAM: rpm package and version installed  'perl'

 

prereqSAM: Using default prerequisite checking on the following rpm package  'libstdc++' 'i686'

prereqSAM: rpm package and version installed  'libstdc++' 'i686'

 

prereqSAM: Using default prerequisite checking on the following rpm package  'libstdc++' 'x86_64'

prereqSAM: rpm package and version installed  'libstdc++' 'x86_64'

 

prereqSAM: Using default prerequisite checking on the following rpm package  'compat-libstdc++-33' 'x86_64'

prereqSAM: Error: The following rpm package is not installed  'compat-libstdc++-33' 'x86_64'

 

prereqSAM: Using default prerequisite checking on the following rpm package  'pam' 'i686'

prereqSAM: rpm package and version installed  'pam' 'i686'

 

1 missing package: compat-libstdc++-33 (x86_64)

 

prereqSAM: Error: Prerequisite checking for the ITSAMP installation failed  Linux i386x RH

Red Hat Enterprise Linux Server release 6.1 (Santiago)

prereqSAM: Most severe error code returned  21

prereqSAM: One or more prerequisite packages were not found installed

prereqSAM: <<< Prerequisite on wtydb21 check - log ended : Fri Apr 12 14:22:19 CST 2013

 

 

1.2.7  Kernel Parameter Tuning

DB2 10.1 can tune the kernel parameters to optimal values automatically, but setting them explicitly does no harm.

 

Table 1. Enforced minimum settings for Linux interprocess communication kernel parameters

IPC kernel parameter    Enforced minimum setting

kernel.shmmni (SHMMNI)  256 * <size of RAM in GB>

kernel.shmmax (SHMMAX)  <size of RAM in bytes>

kernel.shmall (SHMALL)  2 * <size of RAM in the default system page size>

 

Recommendations for the shared memory settings:

Beginning with the first section on Shared Memory Limits, the SHMMAX limit is the maximum size of a shared memory segment on a Linux system. The SHMALL limit is the maximum allocation of shared memory pages on a system.

It is recommended to set the SHMMAX value equal to the amount of physical memory on your system. However, the minimum required on x86 systems is 268435456 (256 MB), and for 64-bit systems it is 1073741824 (1 GB).

 

kernel.sem (SEMMNI)    256 * <size of RAM in GB>

kernel.sem (SEMMSL)    250

kernel.sem (SEMMNS)   256 000

kernel.sem (SEMOPM)   32

 

Semaphore-related kernel settings

kernel.msgmni (MSGMNI)      1 024 * <size of RAM in GB>

kernel.msgmax (MSGMAX)    65 536

kernel.msgmnb (MSGMNB)   65 536

Message-queue-related kernel settings

 

 

A small example:

# Example for a computer with 128 GB of RAM, of which 100 GB is allocated to the database

kernel.shmmni=4096

kernel.shmmax=107374182400

// maximum size of a single shared memory segment; recommended to equal physical memory. Here it is set to 100 GB: 100*1024*1024*1024 = 107374182400

kernel.shmall=26214400

// total number of shared memory pages the system can allocate; pages are 4 KB, so for a 100 GB total this is 100*1024*1024*1024/4096 = 26214400

#kernel.sem=<SEMMSL> <SEMMNS><SEMOPM> <SEMMNI>

kernel.sem=250 1024000 32 4096

kernel.msgmni=16384

kernel.msgmax=65536

kernel.msgmnb=65536

 

To apply the settings, edit the file:

vi /etc/sysctl.conf
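The arithmetic behind the example values can be checked with shell arithmetic. This sketch derives shmmax and shmall for the 100 GB case; the 4096-byte page size is an assumption matching the x86_64 default:

```shell
DB_MEM_GB=100          # memory allocated to the database
PAGE_SIZE=4096         # default page size on x86_64 (assumed)
# shmmax: largest single shared memory segment, in bytes.
SHMMAX=$((DB_MEM_GB * 1024 * 1024 * 1024))
# shmall: total shared memory allowance, counted in pages.
SHMALL=$((SHMMAX / PAGE_SIZE))
echo "kernel.shmmax=$SHMMAX"
echo "kernel.shmall=$SHMALL"
```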

 

 

2. Network-related kernel parameters

Comment out the rp_filter=1 line:

# net.ipv4.conf.all.rp_filter = 1

# changes for DB2 pureScale

then add:

net.ipv4.conf.all.rp_filter = 2

net.ipv4.conf.all.arp_ignore = 1

 


 

1.2.8  blacklist.conf

On all machines (pcf, scf, mb1, mb2), edit the file:

/etc/modprobe.d/blacklist.conf

and add the following:

# RSCT hatsd, add for pureScale

blacklist iTCO_wdt

blacklist iTCO_vendor_support

 

1.2.9  Shared Storage Preparation

1.2.9.1 Requirements

At least four physical LUNs are generally needed:

tiebreaker disk: 25 MB

instance shared file system: 10 GB

data file system

log file system

1.2.9.2 Setup

1. Install the device-mapper-multipath rpm.

When new devices are added to DM-Multipath, they appear in three different places under /dev: /dev/mapper/mpathn, /dev/mpath/mpathn, and /dev/dm-n. The devices in /dev/mapper are created during boot; use these to access the multipath devices, for example when creating logical volumes. The devices in /dev/mpath are provided for convenience, so that all multipath devices can be seen in one directory; they are created by the udev device manager and are not guaranteed to exist when the system needs them, so do not use them to create logical volumes or file systems. All devices of the /dev/dm-n form are meant for internal use only and should normally never be used.

Note: despite the above, DB2 pureScale cannot use the /dev/mapper/mpath* devices; only the dm-* device names work.

yum install device-mapper-multipath*

2. Enable multipathd

[root@wtydb21 ~]# chkconfig --list multipathd

multipathd      0:off   1:off   2:off   3:off   4:off   5:off   6:off

[root@wtydb21 ~]# chkconfig  multipathd on

[root@wtydb21 ~]# chkconfig --list multipathd

multipathd      0:off   1:off   2:on    3:on    4:on    5:on    6:off

 

3. Edit the configuration so that multipath device names stay consistent across hosts (this step can be skipped for a DB2 pureScale installation).

Usually /etc/multipath.conf is modified, but to stay consistent with an existing cluster (e.g. Oracle RAC) you can instead edit /var/lib/multipath/bindings and copy that file to the other cluster nodes.

4. Check the configuration

modprobe dm-multipath

service multipathd start

multipath -v2

multipath -v2 prints the paths of the multipath devices, showing which devices are multipathed. If there is no output, verify that all SAN connections are set up properly and that the system is multipathed.

 

Run the following to make sure the multipath daemon starts at boot:

 

mpathconf --enable --with_multipathd y

 

chkconfig multipathd on

 

chkconfig --list multipathd

 

$ service multipathd stop

$ service multipathd start

$ multipath -F

$ multipath -v2
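Because pureScale wants the dm-* names while administrators usually know the mpath aliases, it helps to print the alias-to-dm mapping. This sketch resolves symlinks the way /dev/mapper aliases often resolve; it is demonstrated with stand-in links in a scratch directory (point DIR at /dev/mapper on a real host). On some RHEL 6 setups the /dev/mapper entries are device nodes rather than symlinks, in which case `dmsetup ls` plus /sys/block must be consulted instead:

```shell
# Scratch stand-in for /dev/mapper with one hypothetical alias link.
DIR=./mapper-demo
mkdir -p "$DIR"
ln -sf ../dm-3 "$DIR/mpathb"
# Print alias -> dm-name for every mpath* entry that is a symlink.
for link in "$DIR"/mpath*; do
  [ -L "$link" ] || continue
  printf '%s -> %s\n' "$(basename "$link")" "$(basename "$(readlink "$link")")"
done
```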

 

References

https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/DM_Multipath/mpio_setup.html

 

1.2.10             Users and Groups

These can be created manually or during the DB2 pureScale software installation.

You should use two different users with two different groups. Each of the two users must have the same UID, GID, group name, and home directory on all the hosts.

Required user    User name   Group name

Instance owner   db2sdin1    db2iadm1

Fenced user      db2sdfe1    db2fadm1

groupadd -g 999 db2iadm1

groupadd -g 998 db2fadm1

useradd -g db2iadm1 -u 1004 -m db2sdin1

useradd -g db2fadm1 -u 1003 -m db2sdfe1

 

passwd db2sdin1

passwd db2sdfe1

 

1.2.11             NTP Configuration

vi /etc/ntp.conf

server  192.168.0.81       # local clock

fudge   192.168.0.81 stratum 10

restrict 192.168.0.81 mask 255.255.255.255 nomodify notrap noquery

 

service ntpd start

chkconfig ntpd on

chkconfig --list ntpd

 

1.2.12             Watchdog Configuration

DB2 uses its own watchdog, so the Intel watchdog that ships with the machine must be disabled.

In some installations, if Intel TCO WatchDog Timer Driver modules are loaded by default, they should be blacklisted so that they do not start automatically or conflict with RSCT. To blacklist the modules, edit the following files:

 

To verify whether the modules are loaded:

    lsmod | grep -i iTCO_wdt; lsmod | grep -i iTCO_vendor_support

 

Edit the configuration files. On RHEL 5.x and RHEL 6.1, edit /etc/modprobe.d/blacklist.conf:

 

       # RSCT hatsd

blacklist iTCO_wdt

blacklist iTCO_vendor_support

 

1.2.13             Pre-installation Check

Run the db2prereqcheck command before installing, and make sure it passes completely. Note that even then, installing Tivoli SA MP, GPFS, and related components needs further dependency packages, and installation errors can still occur.

 

 

$ ./db2prereqcheck -i

To check the requirements for a specific version, use -v:

 

$ ./db2prereqcheck -v 10.1.0.0

 

To generate a report file containing validation information and output from the db2prereqcheck command (including a summary section listing all failing tests), use the -o <filename> parameter. Specify the full report file name path to generate the report in another directory. Without the -o <filename> parameter, the prerequisites for installing the DB2 product are only displayed on the command screen.

 

$ ./db2prereqcheck -i -o report.rpt

 

Check whether the IB network meets the requirements:

$ ./db2prereqcheck -i -p -t <network configuration type>

db2prereqcheck -i -p -t MULT_IB_PORT_CLUSTER

1.2.14            Taking Over GPFS (optional)

pureScale has a bug: with a storage array that is not on the certified list, installation using the /dev/dm- disks can fail. The workaround is to install GPFS manually first, using the dm- disks, and then have db2cluster take over the GPFS cluster.

This resolves the problem of not being able to install directly on /dev/dm- devices.

 

[root@wtydb21 db2sdin1]# /opt/ibm/db2/V10.1/bin/db2cluster -cfs -verify -configuration

/opt/ibm/db2/V10.1/instance/db2cluster_prepare -cfs_takeover

 

To take over a user-managed GPFS cluster:

1.   Log on as root on any machine in your cluster.

2.   Run the db2cluster_prepare command with the following parameters:

   db2cluster_prepare -cfs_takeover

3.   Check the error return code using the echo $? command. If there are errors, resolve them and rerun the command as specified in Step 2.

4.   To verify that the record has been added properly, run the following command:

   [root@wtydb21 tmp]# db2greg -dump

S,GPFS,3.5.0.4,/usr/lpp/mmfs,-,-,0,0,-,1365173996,0

S,DB2,10.1.0.2,/opt/ibm/db2/V10.1,,,2,0,,1365174117,0

V,DB2GPRF,DB2SYSTEM,wtydb21,/opt/ibm/db2/V10.1,

V,GPFS_CLUSTER,NAME,gpfs_cluster.wtydb21,-,DB2_MANAGED

 

 

 

1.2.15              Enable SCSI-3 PR

[root@wtydb21 src]# /usr/lpp/mmfs/bin/tsprinquiry dm-3

IBM    :2810XIV         :0000

 

vi /var/mmfs/etc/prcapdevices

IBM:2810XIV:0000

 

/usr/lpp/mmfs/bin/mmchconfig usePersistentReserve=yes

1.2.16              IB Kernel Parameter

echo "options mlx4_core log_mtts_per_seg=7" >> /etc/modprobe.conf

1.3    Installing the pureScale Feature

Install only the pureScale Feature; do not create a DB2 instance yet.

1.3.1  Graphical installation with db2setup

This guide instead performs a manual installation with db2_install.

 

1.3.2  Manual installation with db2_install

1.3.2.1 Install the software

The installation must be run on each host separately:

DBI1324W  Support of the db2_install command is deprecated. For

      more information, see the DB2 Information Center.

 

Default directory for installation of products - /opt/ibm/db2/V10.1

 

***********************************************************

Install into default directory (/opt/ibm/db2/V10.1) ? [yes/no]

yes

 

 

Specify one of the following keywords to install DB2 products.

 

  AESE

  ESE

  CONSV

  WSE

  EXP

  CLIENT

  RTCL

 

Enter "help" to redisplay product names.

 

Enter "quit" to exit.

 

***********************************************************

ESE

***********************************************************

Do you want to install the DB2 pureScale Feature? [yes/no]

yes

DB2 installation is being initialized.

 

 Total number of tasks to be performed: 50

Total estimated time for all tasks to be performed: 1910 second(s)

 

Task #1 start

Description: Checking license agreement acceptance

Estimated time 1 second(s)

Task #1 end

 

Task #2 start

Description: Base Client Support for installation with root privileges

Estimated time 3 second(s)

Task #2 end

 

Task #54 start

Description: Updating global profile registry

Estimated time 3 second(s)

Task #54 end

 

The execution completed successfully.

 

For more information see the DB2 installation log at

"/data/esefp2/server/db2_install.log"


 

1.3.2.2 Create the instance

Pre-check; no output means success:

[root@wtydb21 ~]# preprpnode wtydb21 wtydb22 wtydb23

 

Start the instance creation:

/opt/ibm/db2/V10.1/instance/db2icrt -d -cf wtydb21 -cfnet wtydb21-ib0 -m wtydb22 -mnet wtydb22-ib0 -instance_shared_dev /dev/dm-15 -tbdev /dev/dm-2 -u db2sdfe1 db2sdin1

 

1.3.3  Adding Members and a Second CF

-- add the second CF

/opt/ibm/db2/V10.1/instance/db2iupdt -d -add -cf wtydb22 -cfnet wtydb22-ib0 db2sdin1

-- add members

/opt/ibm/db2/V10.1/instance/db2iupdt -d -add -m wtydb22 -mnet wtydb22-ib0 db2sdin1

/opt/ibm/db2/V10.1/instance/db2iupdt -d -add -m wtydb23 -mnet wtydb23-ib0 db2sdin1

 

1.3.4  Installing Fix Packs

This installation used the latest FP2 fix pack directly.

1.3.5  Adding an IB Adapter (pending)

Check the db2nodes.cfg file before updating:

[root@wtydb23 msdata]# cat /home/db2sdin1/sqllib/db2nodes.cfg

0 wtydb22 0 wtydb22-ib0 - MEMBER

1 wtydb23 0 wtydb23-ib0 - MEMBER

128 wtydb23 0 wtydb23-ib0 - CF

129 wtydb22 0 wtydb22-ib0 - CF

 

Update the IB network configuration:

 

db2iupdt -update -cf wtydb23 -cfnet wtydb23-ib0,wtydb23-ib3  db2sdin1

db2iupdt -update -m wtydb23 -mnet wtydb23-ib0,wtydb23-ib3  db2sdin1

1.4    Removing the Software (optional)

If problems occur during installation, the software can be removed as follows:

1. Back up the databases.

2. Stop all instances with db2stop.

3. Delete the database instance on all hosts with db2idrop -g db2sdin1. Note that db2idrop -g keeps the GPFS cluster on the last host, which must be cleaned up manually.

4. Remove the DB2 pureScale software with db2_deinstall -a.

2      Common Installation Problems

1.      The /dev/mapper/mpath* devices cannot be used

The /dev/mapper/mpath* devices are essentially link files that GPFS cannot use; use the dm-* device names produced by the multipath software instead.

2.      Errors when using the /dev/dm-* devices directly

This is a software bug, caused mainly by the XIV storage not being on the software support list. It can be worked around by creating the GPFS cluster manually and then having db2cluster take it over.

3      High Availability Testing

Initial state: the system is healthy.

[db2sdin1@wtydb22 ~]$ db2instance -list

ID        TYPE             STATE                HOME_HOST               CURRENT_HOST            ALERT   PARTITION_NUMBER        LOGICAL_PORT    NETNAME

--        ----             -----                ---------               ------------            -----   ----------------        ------------    -------

0       MEMBER           STARTED                  wtydb22                    wtydb22               NO                  0                   0    wtydb22-ib0

1       MEMBER           STARTED                  wtydb21                    wtydb21               NO                  0                   0    wtydb21-ib0

2       MEMBER           STARTED                  wtydb23                    wtydb23               NO                  0                   0    wtydb23-ib0

128     CF               PRIMARY                  wtydb21                    wtydb21               NO                  -                   0    wtydb21-ib0

129     CF                  PEER                  wtydb22                    wtydb22               NO                  -                   0    wtydb22-ib0

 

HOSTNAME                  STATE               INSTANCE_STOPPED        ALERT

--------                  -----               ----------------        -----

 wtydb23                  ACTIVE                              NO           NO

 wtydb21                  ACTIVE                              NO           NO

 wtydb22                  ACTIVE                              NO           NO

 

The database connects normally:

[db2sdin1@wtydb22 ~]$ db2 connect to msdb

 

   Database Connection Information

 

 Database server        = DB2/LINUXX8664 10.1.2

 SQL authorization ID   = DB2SDIN1

 Local database alias   = MSDB

 

 

3.1    Single-Server Failures

3.1.1  Public network down on one server

1. Take down the bonded NIC on wtydb21:

[root@wtydb21 ~]# ifdown bond0

2. At this point, observe the state from the other hosts:

 

[db2sdin1@wtydb22 ~]$ db2instance -list

ID        TYPE             STATE                HOME_HOST               CURRENT_HOST            ALERT   PARTITION_NUMBER        LOGICAL_PORT    NETNAME

--        ----             -----                ---------               ------------            -----   ----------------        ------------    -------

0       MEMBER           STARTED                  wtydb22                    wtydb22              YES                  0                   0    wtydb22-ib0

1       MEMBER  WAITING_FOR_FAILBACK              wtydb21                    wtydb23              YES                  0                   1    wtydb23-ib0

2       MEMBER           STARTED                  wtydb23                    wtydb23              YES                  0                   0    wtydb23-ib0

128     CF                 ERROR                  wtydb21                    wtydb21              YES                  -                   0    wtydb21-ib0

129     CF                  PEER                  wtydb22                    wtydb22               NO                  -                   0    wtydb22-ib0

 

HOSTNAME                   STATE                INSTANCE_STOPPED        ALERT
--------                   -----                ----------------        -----
 wtydb23                  ACTIVE                              NO           NO
 wtydb21                  ACTIVE                             YES          YES
 wtydb22                  ACTIVE                              NO           NO

There is currently an alert for a member, CF, or host in the data-sharing instance. For more information on the alert, its impact, and how to clear it, run the following command: 'db2cluster -cm -list -alert'.

 

[db2sdin1@wtydb22 ~]$ db2 connect to msdb

   Database Connection Information

 Database server        = DB2/LINUXX8664 10.1.2
 SQL authorization ID   = DB2SDIN1
 Local database alias   = MSDB

 

3. After the interface is brought back up, the system recovers automatically.

3.1.2  IB network down on one server

A. Bring down the bond0 interface on host 1:

[root@wtydb21 ~]# ifdown bond0

B. Host 2 can still connect to the database:

[root@wtydb22 ~]# db2 connect to msdb

   Database Connection Information

 Database server        = DB2/LINUXX8664 10.1.2
 SQL authorization ID   = ROOT
 Local database alias   = MSDB

 

C. Cluster state at this point: wtydb21's member is in WAITING_FOR_FAILBACK state:

 

[db2sdin1@wtydb23 ~]$ db2instance -list

ID        TYPE             STATE                HOME_HOST               CURRENT_HOST            ALERT   PARTITION_NUMBER        LOGICAL_PORT    NETNAME
--        ----             -----                ---------               ------------            -----   ----------------        ------------    -------
0       MEMBER           STARTED                  wtydb22                    wtydb22               NO                  0                   0    wtydb22-ib0
1       MEMBER           STARTED                  wtydb23                    wtydb23               NO                  0                   0    wtydb23-ib0
2       MEMBER  WAITING_FOR_FAILBACK              wtydb21                    wtydb22               NO                  0                   1    wtydb22-ib0
128     CF               PRIMARY                  wtydb23                    wtydb23               NO                  -                   0    wtydb23-ib0
129     CF                  PEER                  wtydb22                    wtydb22               NO                  -                   0    wtydb22-ib0

 

HOSTNAME                   STATE                INSTANCE_STOPPED        ALERT
--------                   -----                ----------------        -----
 wtydb21                  ACTIVE                              NO          YES
 wtydb23                  ACTIVE                              NO           NO
 wtydb22                  ACTIVE                              NO           NO

There is currently an alert for a member, CF, or host in the data-sharing instance. For more information on the alert, its impact, and how to clear it, run the following command: 'db2cluster -cm -list -alert'.
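Listings like the ones above are easy to scan by eye, but when scripting these tests it helps to parse them. A minimal sketch (a hypothetical helper, not part of DB2) that flags members waiting for failback and CFs in ERROR state from the text output of `db2instance -list`; the sample below condenses the column spacing from the listings in this section:

```python
def find_failover_issues(listing):
    """Return (type, id, state, home_host, current_host) tuples for
    members/CFs that are not in a healthy state."""
    issues = []
    for line in listing.splitlines():
        fields = line.split()
        # Data rows start with an ID followed by MEMBER or CF.
        if len(fields) < 5 or fields[1] not in ("MEMBER", "CF"):
            continue
        ident, typ, state = fields[0], fields[1], fields[2]
        if state in ("WAITING_FOR_FAILBACK", "ERROR"):
            issues.append((typ, ident, state, fields[3], fields[4]))
    return issues

# Condensed sample of the output captured above.
sample = """\
0       MEMBER  STARTED               wtydb22  wtydb22  YES  0  0  wtydb22-ib0
1       MEMBER  WAITING_FOR_FAILBACK  wtydb21  wtydb23  YES  0  1  wtydb23-ib0
128     CF      ERROR                 wtydb21  wtydb21  YES  -  0  wtydb21-ib0
129     CF      PEER                  wtydb22  wtydb22  NO   -  0  wtydb22-ib0
"""

for typ, ident, state, home, current in find_failover_issues(sample):
    print(f"{typ} {ident}: {state} (home {home}, now on {current})")
```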

 

 

3.2    Rebooting one server (wtydb23)

As seen from wtydb22, the CF fails over cleanly and the database remains available, so the test passes. After wtydb23 comes back up, the cluster returns to normal.

 

[db2sdin1@wtydb22 ~]$ db2instance -list

ID        TYPE             STATE                HOME_HOST               CURRENT_HOST            ALERT   PARTITION_NUMBER        LOGICAL_PORT    NETNAME
--        ----             -----                ---------               ------------            -----   ----------------        ------------    -------
0       MEMBER           STARTED                  wtydb22                    wtydb22               NO                  0                   0    wtydb22-ib0
1       MEMBER  WAITING_FOR_FAILBACK              wtydb23                    wtydb21               NO                  0                   1    wtydb21-ib0
2       MEMBER           STARTED                  wtydb21                    wtydb21               NO                  0                   0    wtydb21-ib0
128     CF                 ERROR                  wtydb23                    wtydb23              YES                  -                   0    wtydb23-ib0
129     CF               PRIMARY                  wtydb22                    wtydb22               NO                  -                   0    wtydb22-ib0

 

HOSTNAME                   STATE                INSTANCE_STOPPED        ALERT
--------                   -----                ----------------        -----
 wtydb21                  ACTIVE                              NO           NO
 wtydb23                INACTIVE                              NO          YES
 wtydb22                  ACTIVE                              NO           NO

There is currently an alert for a member, CF, or host in the data-sharing instance. For more information on the alert, its impact, and how to clear it, run the following command: 'db2cluster -cm -list -alert'.
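Once the rebooted host is healthy again, the remaining alert can be inspected and, if it does not clear on its own, cleared manually. A dry-run sketch: run() only echoes the commands, and the -clear option is an assumption based on the db2cluster command family, so verify it against the db2cluster help on your DB2 level before using it.

```shell
# Dry-run: inspect, then clear, the instance alert (as the instance owner).
run() { echo "+ $*"; }

# Details of the current alert, as suggested by the message above.
run db2cluster -cm -list -alert

# Clear it once the underlying problem is fixed (this option is an
# assumption here; confirm it exists on your DB2 level).
run db2cluster -cm -clear -alert
```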

 

 

3.3    Rebooting IB switch switch-1

wtydb22 becomes the host providing service:

[db2sdin1@wtydb22 ~]$ db2instance -list

ID        TYPE             STATE                HOME_HOST               CURRENT_HOST            ALERT   PARTITION_NUMBER        LOGICAL_PORT    NETNAME
--        ----             -----                ---------               ------------            -----   ----------------        ------------    -------
0       MEMBER           STARTED                  wtydb22                    wtydb22               NO                  0                   0    wtydb22-ib0
1       MEMBER  WAITING_FOR_FAILBACK              wtydb23                    wtydb22              YES                  0                   2    wtydb22-ib0
2       MEMBER  WAITING_FOR_FAILBACK              wtydb21                    wtydb22              YES                  0                   1    wtydb22-ib0
128     CF                 ERROR                  wtydb23                    wtydb23              YES                  -                   0    wtydb23-ib0
129     CF               PRIMARY                  wtydb22                    wtydb22               NO                  -                   0    wtydb22-ib0

 

HOSTNAME                   STATE                INSTANCE_STOPPED        ALERT
--------                   -----                ----------------        -----
 wtydb21                  ACTIVE                              NO          YES
 wtydb23                  ACTIVE                              NO          YES
 wtydb22                  ACTIVE                              NO           NO

After switch-1 comes back up, the database recovers automatically:

 

[db2sdin1@wtydb23 ~]$ db2instance -list

ID        TYPE             STATE                HOME_HOST               CURRENT_HOST            ALERT   PARTITION_NUMBER        LOGICAL_PORT    NETNAME
--        ----             -----                ---------               ------------            -----   ----------------        ------------    -------
0       MEMBER           STARTED                  wtydb22                    wtydb22               NO                  0                   0    wtydb22-ib0
1       MEMBER           STARTED                  wtydb23                    wtydb23               NO                  0                   0    wtydb23-ib0
128     CF                  PEER                  wtydb23                    wtydb23               NO                  -                   0    wtydb23-ib0
129     CF               PRIMARY                  wtydb22                    wtydb22               NO                  -                   0    wtydb22-ib0

 

HOSTNAME                   STATE                INSTANCE_STOPPED        ALERT
--------                   -----                ----------------        -----
 wtydb23                  ACTIVE                              NO           NO
 wtydb22                  ACTIVE                              NO           NO

 

4      Adding a new server

Prepare the environment on the new host as described in the earlier sections.

The host must pass the prerequisite check:

/software/ese10.1.2/server/db2prereqcheck -p -tMULTI_IB_PORT_CLUSTER -v 10.1.0.2

Then add it to the instance as a new member:

/opt/ibm/db2/V10.1.01/instance/db2iupdt -d -add -m wtydb23 -mnet wtydb23-ib0 db2sdin1
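The two commands above can be combined into one dry-run script. The host name wtydb24 below is a hypothetical example of a genuinely new host, and run() only echoes each step instead of executing it; remove the echo and adjust paths and names to your environment.

```shell
# Dry-run sketch of adding a new member (hypothetical host wtydb24).
run() { echo "+ $*"; }

# 1. The new host must pass the pureScale prerequisite check first.
run /software/ese10.1.2/server/db2prereqcheck -p -tMULTI_IB_PORT_CLUSTER -v 10.1.0.2

# 2. As root on an existing host: add the new member to the instance.
run /opt/ibm/db2/V10.1.01/instance/db2iupdt -d -add -m wtydb24 -mnet wtydb24-ib0 db2sdin1

# 3. As the instance owner: confirm the member joined.
run db2instance -list
```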


 

 

 

 
