Topology and address settings
Packages used in this lab:
drbd83-8.3.8-1.el5.centos.i386.rpm #DRBD management tools
kmod-drbd83-8.3.8-1.el5.centos.i686.rpm #DRBD kernel module
cluster-glue-1.0.6-1.6.el5.i386.rpm #adds support for more nodes in the cluster
cluster-glue-libs-1.0.6-1.6.el5.i386.rpm
corosync-1.2.7-1.1.el5.i386.rpm #corosync main package
corosynclib-1.2.7-1.1.el5.i386.rpm #corosync libraries
heartbeat-3.0.3-2.3.el5.i386.rpm #heartbeat is used here as a layer-4 resource agent
heartbeat-libs-3.0.3-2.3.el5.i386.rpm #heartbeat libraries
ldirectord-1.0.1-1.el5.i386.rpm #probes the back-end real servers in a high-availability cluster
libesmtp-1.0.4-5.el5.i386.rpm
openais-1.1.3-1.6.el5.i386.rpm
openaislib-1.1.3-1.6.el5.i386.rpm #openais libraries
pacemaker-1.1.5-1.1.el5.i386.rpm #pacemaker main package
pacemaker-libs-1.1.5-1.1.el5.i386.rpm #pacemaker libraries
pacemaker-cts-1.1.5-1.1.el5.i386.rpm
perl-TimeDate-1.16-5.el5.noarch.rpm
resource-agents-1.0.4-1.1.el5.i386.rpm #resource agent package
mysql-5.5.15-linux2.6-i686.tar.gz #MySQL binary tarball (no installer needed)
Note: the packages can be downloaded from
http://down.51cto.com/data/402802
The steps are as follows:
I. Set the network parameters on each cluster node
The address of node1 is 192.168.2.10:
[root@localhost ~]# vim /etc/sysconfig/network
[root@localhost ~]# hostname node1.a.com
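Reconnect to 192.168.2.10
[root@node1 ~]# setup
[root@node1 ~]# service network restart
Shutting down interface eth0: [ OK ]
Shutting down loopback interface: [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: [ OK ]
[root@node1 ~]# vim /etc/hosts
The address of node2 is 192.168.2.20: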
[root@cms ~]# vim /etc/sysconfig/network
[root@cms ~]# hostname node2.a.com
Reconnect to 192.168.2.20
[root@node2 ~]# setup
[root@node2 ~]# service network restart
Shutting down interface eth0: [ OK ]
Shutting down loopback interface: [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: [ OK ]
[root@node2 ~]# vim /etc/hosts
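The contents of /etc/hosts are not shown above. Later commands address the peers both by short name (node2) and by full name (node1.a.com), so each node presumably needs entries that resolve both forms. A minimal sketch, assuming the conventional "IP FQDN alias" layout; the helper name hosts_entry is hypothetical and not part of the original setup:

```shell
# hosts_entry IP FQDN: print one /etc/hosts line with the short host name as an alias.
# The "IP FQDN alias" layout is an assumption; the original does not show the file.
hosts_entry() {
    printf '%s %s %s\n' "$1" "$2" "${2%%.*}"
}

# Lines that could be appended to /etc/hosts on both nodes.
hosts_entry 192.168.2.10 node1.a.com
hosts_entry 192.168.2.20 node2.a.com
```

The same two lines would go into the file on both nodes so that each can resolve itself and its peer.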
Synchronize the time on all cluster nodes
[root@node1 ~]# hwclock -s
[root@node2 ~]# hwclock -s
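Note that hwclock -s only copies each machine's hardware clock into its system clock, so the two nodes agree only if their hardware clocks already do. A small sketch for checking the remaining difference; clock_skew is a hypothetical helper name, and the example assumes node2 is reachable over ssh:

```shell
# clock_skew A B: print the absolute difference in seconds between two epoch timestamps.
clock_skew() {
    d=$(( $1 - $2 ))
    [ "$d" -lt 0 ] && d=$(( -d ))
    echo "$d"
}

# Example (run on node1; ssh asks for a password until the keys from the next step exist):
#   clock_skew "$(date +%s)" "$(ssh node2 date +%s)"
```

A result of a second or two is mostly ssh latency; a much larger value suggests the clocks should be set from a common source before continuing.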
II. Generate keys on each node for passwordless communication
node1:
[root@node1 ~]# ssh-keygen -t rsa #generate an RSA key pair
[root@node1 ~]# ssh-copy-id -i .ssh/id_rsa.pub node2 #copy the public key to node2
node2:
[root@node2 ~]# ssh-keygen -t rsa #generate an RSA key pair
[root@node2 ~]# ssh-copy-id -i .ssh/id_rsa.pub node1 #copy the public key to node1
III. Configure the yum client on each node
[root@node1 ~]# vim /etc/yum.repos.d/server.repo
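[root@node1 ~]# scp /etc/yum.repos.d/server.repo node2:/etc/yum.repos.d/ #node2's yum client can be configured this way
IV. Upload the downloaded rpm packages to every Linux node and check them
On node1:
On node2:
V. Install the DRBD rpm packages on each node
[root@node1 ~]# yum localinstall -y --nogpgcheck drbd83-8.3.8-1.el5.centos.i386.rpm kmod-drbd83-8.3.8-1.el5.centos.i686.rpm
[root@node2 ~]# yum localinstall -y --nogpgcheck drbd83-8.3.8-1.el5.centos.i386.rpm kmod-drbd83-8.3.8-1.el5.centos.i686.rpm
VI. Add a DRBD backing device (sdb1) of the same size and type on each node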
node1:
[root@node1 ~]# fdisk /dev/sdb
(Commands entered: n/p/1/1/+1000M/p/w)
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-522, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-522, default 522):+1000M
Command (m for help): p
Disk /dev/sdb: 4294 MB, 4294967296 bytes
255 heads, 63 sectors/track, 522 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 123 987966 83 Linux
[root@node1 ~]# partprobe /dev/sdb #make the kernel re-read the partition table
[root@node1 ~]# cat /proc/partitions #check the result
major minor #blocks name
8 0 20971520 sda
8 1 104391 sda1
8 2 1052257 sda2
8 3 19808145 sda3
8 16 4194304 sdb
8 17 987966 sdb1
node2:
[root@node2 ~]# fdisk /dev/sdb
(Commands entered: n/p/1/1/+1000M/p/w)
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-522, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-522, default 522):+1000M
Command (m for help): p
Disk /dev/sdb: 4294 MB, 4294967296 bytes
255 heads, 63 sectors/track, 522 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 123 987966 83 Linux
[root@node2 ~]# partprobe /dev/sdb #make the kernel re-read the partition table
[root@node2 ~]# cat /proc/partitions #check the result
major minor #blocks name
8 0 20971520 sda
8 1 104391 sda1
8 2 1052257 sda2
8 3 19808145 sda3
8 16 4194304 sdb
8 17 987966 sdb1
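Because DRBD mirrors one partition onto the other, it is worth confirming that sdb1 really has the same size on both nodes rather than comparing the listings by eye. A minimal sketch; part_blocks is a hypothetical helper that reads text in the /proc/partitions layout shown above:

```shell
# part_blocks NAME: read /proc/partitions-style text on stdin and print
# the size (the "#blocks" column) of the partition called NAME.
part_blocks() {
    awk -v name="$1" '$4 == name { print $3 }'
}

# Example (run on node1, assuming passwordless ssh to node2 is already set up):
#   [ "$(part_blocks sdb1 < /proc/partitions)" = \
#     "$(ssh node2 cat /proc/partitions | part_blocks sdb1)" ] && echo "sdb1 sizes match"
```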
VII. Configure DRBD on each node
node1:
1. Copy the configuration file
[root@node1 ~]# cp /usr/share/doc/drbd83-8.3.8/drbd.conf /etc/
2. Back up global_common.conf
[root@node1 ~]# cd /etc/drbd.d/
[root@node1 drbd.d]# ll
-rwxr-xr-x 1 root root 1418 Jun 4 2010 global_common.conf
[root@node1 drbd.d]# cp global_common.conf global_common.conf.bak
3. Edit global_common.conf
[root@node1 drbd.d]# vim global_common.conf
global {
    usage-count no;
}
common {
    protocol C;
    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
    }
    startup {
        wfc-timeout 120; #timeout while waiting for the peer to connect
        degr-wfc-timeout 120; #timeout while waiting for the peer when the cluster was degraded
    }
    disk {
        on-io-error detach;
    }
    net {
        cram-hmac-alg "sha1"; #authenticate the peer with SHA-1
        shared-secret "mydrbdlab"; #shared secret used for authentication
    }
    syncer {
        rate 100M; #data synchronization rate
    }
}
4. Edit the resource file for mysql
[root@node1 drbd.d]# vim mysql.res #add the following content
resource mysql {
    on node1.a.com {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.2.10:7789;
        meta-disk internal;
    }
    on node2.a.com {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.2.20:7789;
        meta-disk internal;
    }
}
The same work is needed on node2; here we simply copy the files over from node1
[root@node1 drbd.d]# scp -r /etc/drbd.* node2:/etc/
5. On node1 and node2, initialize the mysql resource and start the service; there is no need to enable it at boot here
[root@node1 drbd.d]# drbdadm create-md mysql
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
[root@node1 drbd.d]# service drbd start
[root@node2 drbd.d]# drbdadm create-md mysql
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
[root@node2 drbd.d]# service drbd start
6. Check the startup status with drbd-overview
[root@node1 drbd.d]# drbd-overview
0:mysql Connected Secondary/Secondary Inconsistent/Inconsistent C r----
Make node1 the primary node by running the following command:
[root@node1 drbd.d]# drbdadm -- --overwrite-data-of-peer primary mysql
[root@node1 drbd.d]# drbd-overview #check the status again
0:mysql SyncSource Primary/Secondary UpToDate/Inconsistent C r----
[>...................] sync'ed: 8.7% (906616/987896)K delay_probe: 4
[root@node2 drbd.d]# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by [email protected], 2010-06-04 08:04:16
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r----
ns:0 nr:987896 dw:987896 dr:0 al:0 bm:61 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
7. Watch the synchronization progress
[root@node1 drbd.d]# watch -n 1 'cat /proc/drbd'
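Instead of watching the screen, a script can wait until both disks report UpToDate. The sketch below only parses text in the /proc/drbd layout shown above (a line starting with "0:" followed by key:value pairs); drbd_field is a hypothetical helper, and minor number 0 is hard-coded because this lab has a single DRBD device:

```shell
# drbd_field KEY: read /proc/drbd-style text on stdin and print the value
# of KEY (cs, ro or ds) from the status line of device minor 0.
drbd_field() {
    awk -v key="$1" '
        $1 == "0:" {
            for (i = 2; i <= NF; i++) {
                n = index($i, ":")
                if (n > 0 && substr($i, 1, n - 1) == key) {
                    print substr($i, n + 1)
                    exit
                }
            }
        }'
}

# Example: block until the initial synchronization has finished.
#   until [ "$(drbd_field ds < /proc/drbd)" = "UpToDate/UpToDate" ]; do sleep 5; done
```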
8. Create a file system on the primary node
[root@node1 drbd.d]# mkfs -t ext3 /dev/drbd0 #format the device
[root@node1 drbd.d]# mkdir /mysqldata #create the mount point
[root@node1 drbd.d]# mount /dev/drbd0 /mysqldata/ #mount the device
[root@node1 drbd.d]# cd /mysqldata
[root@node1 mysqldata]# touch f1 f2 #create two files
[root@node1 mysqldata]# ll /mysqldata/
-rw-r--r-- 1 root root 0 May 9 15:45 f1
-rw-r--r-- 1 root root 0 May 9 15:45 f2
drwx------ 2 root root 16384 May 9 15:41 lost+found
[root@node1 mysqldata]# cd
[root@node1 ~]# umount /mysqldata #unmount the DRBD device
[root@node1 ~]# drbdadm secondary mysql #demote node1 to secondary
[root@node1 ~]# drbd-overview #check the status again
0:mysql Connected Secondary/Secondary UpToDate/UpToDate C r----
9. Make node2 the primary node
[root@node2 ~]# drbdadm primary mysql
[root@node2 ~]# drbd-overview
0:mysql Connected Primary/Secondary UpToDate/UpToDate C r----
[root@node2 ~]# mkdir /mysqldata
[root@node2 ~]# mount /dev/drbd0 /mysqldata #mount the device
[root@node2 ~]# ll /mysqldata/
total 16
-rw-r--r-- 1 root root 0 May 9 15:45 f1
-rw-r--r-- 1 root root 0 May 9 15:45 f2
drwx------ 2 root root 16384 May 9 15:41 lost+found
[root@node2 ~]# umount /mysqldata/
VIII. Install and configure MySQL on each node
1. Configuration on node1: add the user and group
[root@node1 ~]# groupadd -r mysql
[root@node1 ~]# useradd -g mysql -r mysql
#Make node1 the primary and node2 the secondary
On node2:
[root@node2 ~]# drbdadm secondary mysql
On node1:
[root@node1 ~]# drbdadm primary mysql
[root@node1 ~]# drbd-overview
0:mysql Connected Primary/Secondary UpToDate/UpToDate C r----
[root@node1 ~]# mount /dev/drbd0 /mysqldata #mount the DRBD device
[root@node1 ~]# mkdir /mysqldata/data
[root@node1 ~]# chown -R mysql.mysql /mysqldata/data/
[root@node1 ~]# ls /mysqldata/ #check the result
data
lost+found
2. Install MySQL
On node1:
[root@node1 ~]# tar zxvf mysql-5.5.15-linux2.6-i686.tar.gz -C /usr/local
[root@node1 ~]# cd /usr/local/
[root@node1 local]# ln -sv mysql-5.5.15-linux2.6-i686 mysql
[root@node1 local]# cd mysql
[root@node1 mysql]# chown -R mysql:mysql .
[root@node1 mysql]# scripts/mysql_install_db --user=mysql --datadir=/mysqldata/data #initialize the MySQL database
[root@node1 mysql]# chown -R root .
[root@node1 mysql]# cp support-files/my-large.cnf /etc/my.cnf #provide the main MySQL configuration file
[root@node1 mysql]# vim /etc/my.cnf
thread_concurrency = 2
datadir = /mysqldata/data #add this line
[root@node1 mysql]# cp support-files/mysql.server /etc/rc.d/init.d/mysqld #provide the SysV init script for MySQL
Copy the configuration file and the init script to node2 (run on node1), then register and start the service on node1:
[root@node1 mysql]# scp /etc/my.cnf node2:/etc/
[root@node1 mysql]# scp /etc/rc.d/init.d/mysqld node2:/etc/rc.d/init.d
[root@node1 mysql]# chkconfig --add mysqld
[root@node1 mysql]# chkconfig mysqld off #do not start at boot
[root@node1 mysql]# service mysqld start #start the service
Check the data directory on node1:
[root@node1 mysql]# ls /mysqldata/data/
ib_logfile0 ibdata1 mysql-bin.000001 node1.a.com.err performance_schema
ib_logfile1 mysql mysql-bin.index node1.a.com.pid test
Stop the service after testing:
[root@node1 ~]# service mysqld stop
[root@node1 ~]# vim /etc/man.config
MANPATH /usr/local/mysql/man #add this line
[root@node1 ~]# ln -sv /usr/local/mysql/include /usr/include/mysql
#Add the MySQL libraries to the system library search path
[root@node1 ~]# echo '/usr/local/mysql/lib' > /etc/ld.so.conf.d/mysql.conf
[root@node1 ~]# ldconfig #rebuild the library cache
[root@node1 ~]# vim /etc/profile #modify the PATH environment variable
PATH=$PATH:/usr/local/mysql/bin #add this line
[root@node1 ~]# . /etc/profile #re-read the environment variables
[root@node1 ~]# echo $PATH
/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/usr/local/mysql/bin
[root@node1 ~]# umount /mysqldata #unmount the DRBD device
3. Configuration on node2:
[root@node2 ~]# groupadd -r mysql
[root@node2 ~]# useradd -g mysql -r mysql
On node1:
[root@node1 ~]# drbdadm secondary mysql
On node2:
[root@node2 ~]# drbdadm primary mysql
[root@node2 ~]# mount /dev/drbd0 /mysqldata
[root@node2 ~]# ls /mysqldata/
data lost+found
Install MySQL
[root@node2 ~]# tar zxfv mysql-5.5.15-linux2.6-i686.tar.gz -C /usr/local
[root@node2 ~]# cd /usr/local/
[root@node2 local]# ln -sv mysql-5.5.15-linux2.6-i686 mysql
[root@node2 local]# cd mysql
[root@node2 mysql]# chown -R root:mysql .
[root@node2 mysql]# chkconfig --add mysqld
[root@node2 mysql]# chkconfig mysqld off
[root@node2 mysql]# service mysqld start
[root@node2 mysql]# ls /mysqldata/data/ #check that the data files are present
ib_logfile0 ibdata1 mysql-bin.000001 mysql-bin.index node2.a.com.err performance_schema
ib_logfile1 mysql mysql-bin.000002 node1.a.com.err node2.a.com.pid test
[root@node2 mysql]# service mysqld stop
[root@node2 mysql]# umount /dev/drbd0 #unmount the device
IX. Install and configure corosync + pacemaker on each node
1. Installation
[root@node1 ~]# yum install -y cluster-glue-1.0.6-1.6.el5.i386.rpm \
cluster-glue-libs-1.0.6-1.6.el5.i386.rpm \
corosync-1.2.7-1.1.el5.i386.rpm \
corosynclib-1.2.7-1.1.el5.i386.rpm \
drbd83-8.3.8-1.el5.centos.i386.rpm \
heartbeat-3.0.3-2.3.el5.i386.rpm \
heartbeat-libs-3.0.3-2.3.el5.i386.rpm \
kmod-drbd83-8.3.8-1.el5.centos.i686.rpm \
libesmtp-1.0.4-5.el5.i386.rpm \
openais-1.1.3-1.6.el5.i386.rpm \
openaislib-1.1.3-1.6.el5.i386.rpm \
pacemaker-1.1.5-1.1.el5.i386.rpm \
pacemaker-cts-1.1.5-1.1.el5.i386.rpm \
pacemaker-libs-1.1.5-1.1.el5.i386.rpm \
perl-TimeDate-1.16-5.el5.noarch.rpm \
resource-agents-1.0.4-1.1.el5.i386.rpm \
--nogpgcheck
[root@node2 ~]# yum install -y cluster-glue-1.0.6-1.6.el5.i386.rpm \
cluster-glue-libs-1.0.6-1.6.el5.i386.rpm \
corosync-1.2.7-1.1.el5.i386.rpm \
corosynclib-1.2.7-1.1.el5.i386.rpm \
drbd83-8.3.8-1.el5.centos.i386.rpm \
heartbeat-3.0.3-2.3.el5.i386.rpm \
heartbeat-libs-3.0.3-2.3.el5.i386.rpm \
kmod-drbd83-8.3.8-1.el5.centos.i686.rpm \
libesmtp-1.0.4-5.el5.i386.rpm \
openais-1.1.3-1.6.el5.i386.rpm \
openaislib-1.1.3-1.6.el5.i386.rpm \
pacemaker-1.1.5-1.1.el5.i386.rpm \
pacemaker-cts-1.1.5-1.1.el5.i386.rpm \
pacemaker-libs-1.1.5-1.1.el5.i386.rpm \
perl-TimeDate-1.16-5.el5.noarch.rpm \
resource-agents-1.0.4-1.1.el5.i386.rpm \
--nogpgcheck
2. Configuration
On node1:
[root@node1 ~]# cd /etc/corosync/ #change to the configuration directory
[root@node1 corosync]# cp corosync.conf.example corosync.conf
[root@node1 corosync]# vim corosync.conf
compatibility: whitetank
totem {
    version: 2
    secauth: off
    threads: 0
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.2.0 #change this to your network address
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/cluster/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}
amf {
    mode: disabled
}
service {
    ver: 0
    name: pacemaker
    use_mgmtd: yes
}
aisexec {
    user: root
    group: root
}
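The bindnetaddr value is the network address of the interface corosync should bind to, not the host address. For node1 (192.168.2.10) with the netmask 255.255.255.0 that appears in the ifconfig output later in this article, that works out to 192.168.2.0. A small sketch of the calculation; network_addr is a hypothetical helper name:

```shell
# network_addr IP MASK: print the IPv4 network address obtained by AND-ing
# each octet of IP with the matching octet of MASK (both dotted quads).
network_addr() {
    ip="$1"
    mask="$2"
    (
        # Split both dotted quads on "." into eight positional parameters.
        IFS=.
        set -- $ip $mask
        echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
    )
}

network_addr 192.168.2.10 255.255.255.0
```

Both nodes sit on the same subnet, so the same value is used in the corosync.conf that is copied to node2.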
[root@node1 corosync]# mkdir /var/log/cluster #create the cluster log directory
[root@node1 corosync]# corosync-keygen #generate the key used for authentication
[root@node1 corosync]# ll
-rw-r--r-- 1 root root 5384 Jul 28 2010 amf.conf.example
-r-------- 1 root root 128 May 8 14:09 authkey
-rw-r--r-- 1 root root 538 May 8 14:08 corosync.conf
-rw-r--r-- 1 root root 436 Jul 28 2010 corosync.conf.example
drwxr-xr-x 2 root root 4096 Jul 28 2010 service.d
drwxr-xr-x 2 root root 4096 Jul 28 2010 uidgid.d
Copy the files from node1 to node2
[root@node1 corosync]# scp -p authkey corosync.conf node2:/etc/corosync/
[root@node1 corosync]# ssh node2 'mkdir /var/log/cluster'
3. Start the corosync service on node1 and node2, then verify
[root@node1 corosync]# service corosync start
[root@node2 corosync]# service corosync start
Verify that the corosync engine started correctly
[root@node1 corosync]# grep -i -e "corosync cluster engine" -e "configuration file" /var/log/messages
4. Check that the initial membership notifications were sent
[root@node1 corosync]# grep -i totem /var/log/messages
5. Check whether any errors occurred during startup
[root@node1 corosync]# grep -i error: /var/log/messages | grep -v unpack_resources #ignore the STONITH-related errors
6. Check that pacemaker has started
[root@node1 corosync]# grep -i pcmk_startup /var/log/messages
Repeat the same checks on node2; the steps are identical.
7. Check the cluster status on node1
[root@node1 ~]# crm status
Last updated: Wed May 9 18:28:57 2012
Stack: openais
Current DC: node1.a.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
0 Resources configured.
============
Online: [ node1.a.com node2.a.com ]
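X. Configure the cluster properties (the cluster configuration is replicated to every node, so each command below needs to be run only once, on any one node)
Pacemaker enables STONITH by default, but this cluster has no STONITH device, so the default configuration cannot be used as it stands. Disable STONITH first with the following command:
# crm configure property stonith-enabled=false
In a two-node cluster, quorum is lost as soon as one node fails, so set the following option to ignore quorum; the vote count then no longer matters and a single node can keep running:
# crm configure property no-quorum-policy=ignore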
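The reason a two-node cluster needs this setting follows from the majority rule: a partition has quorum only when it holds more than half of the expected votes. A minimal sketch of that arithmetic, assuming one vote per node; has_quorum is a hypothetical helper, not a cluster command:

```shell
# has_quorum PRESENT EXPECTED: succeed when strictly more than half of the
# expected votes are present (simple majority, one vote per node assumed).
has_quorum() {
    [ $(( $1 * 2 )) -gt "$2" ]
}

has_quorum 2 2 && echo "both nodes up: quorum"
has_quorum 1 2 || echo "one node up: no quorum"
```

With two expected votes, the surviving node holds exactly half, which is not a majority, so without no-quorum-policy=ignore it would stop its resources.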
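Define a resource stickiness value so that resources do not move between nodes arbitrarily, because such moves waste system resources.
Stickiness value ranges and their effects:
0: the default. The resource is placed at the most suitable location in the system, which means it is moved when a node with "better" or worse load capacity becomes available. This is roughly equivalent to automatic failback, except that the resource may move to a node other than the one on which it was previously active;
Greater than 0: the resource prefers to stay where it is, but will move if a more suitable node becomes available. The higher the value, the stronger the preference to stay;
Less than 0: the resource prefers to move away from its current location. The higher the absolute value, the stronger the preference to leave;
INFINITY: the resource always stays where it is unless it is forced off because the node can no longer run it (node shutdown, node standby, migration-threshold reached, or a configuration change). This is almost equivalent to disabling automatic failback completely;
-INFINITY: the resource always moves away from its current location;
Here we set the default stickiness for resources as follows: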
# crm configure rsc_defaults resource-stickiness=100
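XI. Define the cluster services and resources
1. View the current cluster configuration and make sure the global properties are suitable for a two-node cluster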
[root@node1 ~]# crm configure show
node node1.a.com
node node2.a.com
property $id="cib-bootstrap-options" \
dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
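2. Define the configured DRBD device /dev/drbd0 as a cluster service
[root@node1 ~]# service drbd stop #stop the service
[root@node1 ~]# chkconfig drbd off
[root@node1 ~]# ssh node2 "service drbd stop"
[root@node1 ~]# ssh node2 "chkconfig drbd off"
[root@node1 ~]# drbd-overview #check the status
drbd not loaded
3. Configure DRBD as a cluster resource
The resource agent (RA) for DRBD is currently classified under the OCF provider linbit, and its path is /usr/lib/ocf/resource.d/linbit/drbd. The following commands show this RA and its metadata: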
[root@node1 ~]# crm ra classes
heartbeat
lsb
ocf / heartbeat linbit pacemaker
stonith
[root@node1 ~]# crm ra list ocf linbit
drbd
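View the details of the DRBD resource agent:
[root@node1 ~]# crm ra info ocf:linbit:drbd
DRBD must run on both nodes at the same time, but only one node can be Master (primary/secondary model) while the other is Slave. It is therefore a special kind of cluster resource: a multi-state clone, in which nodes are either Master or Slave, and both nodes must be in the Slave state when the service first starts.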
[root@node1 ~]# crm
crm(live)# configure
crm(live)configure# primitive mysqldrbd ocf:heartbeat:drbd params drbd_resource="mysql" op monitor role="Master" interval="30s" op monitor role="Slave" interval="31s" op start timeout="240s" op stop timeout="100s"
crm(live)configure# ms MS_mysqldrbd mysqldrbd meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify="true"
crm(live)configure# show mysqldrbd
primitive mysqldrbd ocf:heartbeat:drbd \
params drbd_resource="mysql" \
op monitor interval="30s" role="Master" \
op monitor interval="31s" role="Slave" \
op start interval="0" timeout="240s" \
op stop interval="0" timeout="100s"
crm(live)configure# show MS_mysqldrbd
ms MS_mysqldrbd mysqldrbd \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
Once everything has been checked, commit:
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# exit
Check the current cluster status:
[root@node1 ~]# crm status
============
Last updated: Wed May 9 19:16:24 2012
Stack: openais
Current DC: node1.a.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node1.a.com node2.a.com ]
Master/Slave Set: MS_mysqldrbd [mysqldrbd]
Masters: [ node1.a.com ]
Slaves: [ node2.a.com ]
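The output above shows that the DRBD Primary node is now node1.a.com and the Secondary node is node2.a.com. You can also run the following command on node1 to verify that this host has become the Primary node for the mysql resource: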
[root@node1 ~]# drbdadm role mysql
Primary/Secondary
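The output has the form local-role/peer-role, so a script can split it at the slash instead of matching the whole string. A minimal sketch; local_role and peer_role are hypothetical helper names:

```shell
# local_role ROLES: given "Local/Peer" as printed by `drbdadm role`, print the local part.
local_role() {
    echo "${1%%/*}"
}

# peer_role ROLES: print the peer part of the same string.
peer_role() {
    echo "${1#*/}"
}

# Example (run on either node):
#   [ "$(local_role "$(drbdadm role mysql)")" = "Primary" ] && echo "this node is Primary"
```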
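Next, have the DRBD device mounted automatically on the /mysqldata directory. This mount resource must run on the Master node of the DRBD service, and it can start only after DRBD has promoted that node to Primary.
Make sure the device is unmounted on both nodes
[root@node1 ~]# umount /dev/drbd0
Continue on node1: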
[root@node1 ~]# crm
crm(live)# configure
crm(live)configure# primitive MysqlFS ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/mysqldata" fstype="ext3" op start timeout=60s op stop timeout=60s
crm(live)configure# commit
crm(live)configure# exit
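4. Define the MySQL resources
First create an IP address resource for the MySQL cluster. The cluster provides its service on this address, and it is the IP address that clients use to reach the MySQL server:
[root@node1 ~]# crm configure primitive myip ocf:heartbeat:IPaddr params ip=192.168.2.100
Configure the mysqld service as a highly available resource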
[root@node1 ~]# crm configure primitive mysqlserver lsb:mysqld
[root@node1 ~]# crm status
============
Last updated: Sat Apr 21 02:03:24 2012
Stack: openais
Current DC: node1.a.com - partition with quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Online: [ node1.a.com node2.a.com ]
Master/Slave Set: MS_mysqldrbd
Masters: [ node1.a.com ]
Slaves: [ node2.a.com ]
MysqlFS (ocf::heartbeat:Filesystem): Started node1.a.com
myip (ocf::heartbeat:IPaddr): Started node2.a.com
mysqlserver (lsb:mysqld): Started node1.a.com
5. Configure the resource constraints
Define the following constraints:
[root@node1 ~]# crm
crm(live)# configure
crm(live)configure# colocation MysqlFS_with_mysqldrbd inf: MysqlFS MS_mysqldrbd:Master myip mysqlserver
crm(live)configure# order MysqlFS_after_mysqldrbd inf: MS_mysqldrbd:promote MysqlFS:start
crm(live)configure# order myip_after_MysqlFS mandatory: MysqlFS myip
crm(live)configure# order mysqlserver_after_myip mandatory: myip mysqlserver
Verify that there are no errors:
crm(live)configure# verify
Commit:
crm(live)configure# commit
crm(live)configure# exit
View the configuration:
[root@node1 ~]# crm configure show
node node1.a.com \
attributes standby="off"
node node2.a.com \
attributes standby="off"
primitive MysqlFS ocf:heartbeat:Filesystem \
params device="/dev/drbd0" directory="/mysqldata" fstype="ext3" \
op start interval="0" timeout="60s" \
op stop interval="0" timeout="60s"
primitive myip ocf:heartbeat:IPaddr \
params ip="192.168.2.100" #virtual IP address of the cluster
primitive mysqldrbd ocf:heartbeat:drbd \
params drbd_resource="mysql" \
op monitor interval="30s" role="Master" \
op monitor interval="31s" role="Slave" \
op start interval="0" timeout="240s" \
op stop interval="0" timeout="100s"
primitive mysqlserver lsb:mysqld
ms MS_mysqldrbd mysqldrbd \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
colocation MysqlFS_with_mysqldrbd inf: MysqlFS MS_mysqldrbd:Master myip mysqlserver
order MysqlFS_after_mysqldrbd inf: MS_mysqldrbd:promote MysqlFS:start
order myip_after_MysqlFS inf: MysqlFS myip
order mysqlserver_after_myip inf: myip mysqlserver
property $id="cib-bootstrap-options" \
dc-version="1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
Check the running status
[root@node1 ~]# crm status
============
Last updated: Sat Apr 21 02:05:49 2012
Stack: openais
Current DC: node1.a.com - partition with quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Online: [ node1.a.com node2.a.com ]
Master/Slave Set: MS_mysqldrbd
Masters: [ node1.a.com ]
Slaves: [ node2.a.com ]
MysqlFS (ocf::heartbeat:Filesystem): Started node1.a.com
myip (ocf::heartbeat:IPaddr): Started node1.a.com
mysqlserver (lsb:mysqld): Started node1.a.com
As shown, all the services are now running normally on node1
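The colocation constraint is what keeps the file system, the virtual IP and mysqld on the DRBD Master node, so a quick scripted check is to confirm that the three primitives report the same node. A minimal sketch that parses status text with one "name (agent): Started node" line per primitive; resource_node is a hypothetical helper, and the example assumes the one-shot output of crm_mon -1 uses that layout:

```shell
# resource_node NAME: read cluster status text on stdin and print the node
# on which primitive NAME is reported as Started.
resource_node() {
    awk -v name="$1" 'NF >= 3 && $1 == name && $(NF - 1) == "Started" { print $NF }'
}

# Example:
#   status=$(crm_mon -1)
#   for r in MysqlFS myip mysqlserver; do
#       printf '%s runs on %s\n' "$r" "$(printf '%s\n' "$status" | resource_node "$r")"
#   done
```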
On node1, check the status of MySQL
[root@node1 ~]# service mysqld status
MySQL running (5345) [ OK ]
Check the directory:
[root@node1 ~]# ls /mysqldata/
data lost+found
Check the state of the VIP
[root@node1 corosync]# ifconfig
inet addr:192.168.2.100 Bcast:192.168.2.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:67 Base address:0x2000
Further testing:
On node1, put node1 into standby:
[root@node1 ~]# crm node standby
Check the cluster status:
[root@node1 ~]# crm status
============
Last updated: Sat Apr 21 02:07:40 2012
Stack: openais
Current DC: node1.a.com - partition with quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Node node1.a.com:standby
Online: [ node2.a.com ]
Master/Slave Set: MS_mysqldrbd
Masters: [ node2.a.com ]
Stopped: [ mysqldrbd:1 ]
MysqlFS (ocf::heartbeat:Filesystem): Started node2.a.com
myip (ocf::heartbeat:IPaddr): Started node2.a.com
mysqlserver (lsb:mysqld): Started node2.a.com
As shown, all our resources have switched over to node2
Check the status on node2
[root@node2 ~]# service mysqld status
MySQL running (7585) [ OK ]
Check the directory:
[root@node2 ~]# ls /mysqldata
data lost+found
Everything is working so far, so we can now verify that the MySQL service can be reached.
First, create a user named test with the password 123456 on node2.
Clients reach the MySQL service through the VIP 192.168.2.100, so on node2 create an account that hosts in the given network range may use (the change is replicated to node1 through the DRBD device). The statement below grants the test user access to MySQL:
mysql> grant all on *.* to test@'192.168.%.%' identified by '123456';
Query OK, 0 rows affected (0.08 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
Then connect from another host:
[root@node1 ~]# mysql -u test -p -h 192.168.2.100
mysql> show databases;
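To see how long the service is unreachable during a switchover, the same connection test can be repeated in a loop from the client while crm node standby is run on the active node. A minimal sketch, assuming the mysql client is installed on the client host and the test account created above; mysql_reachable is a hypothetical helper name:

```shell
# mysql_reachable HOST USER PASSWORD: succeed if a trivial query can be run
# through the given address. Putting the password on the command line is
# acceptable only for a throwaway lab account like this one.
mysql_reachable() {
    mysql -h "$1" -u "$2" -p"$3" -e 'SELECT 1' >/dev/null 2>&1
}

# Example: poll the VIP once per second and print the result.
#   while sleep 1; do
#       if mysql_reachable 192.168.2.100 test 123456; then echo up; else echo down; fi
#   done
```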