corosync and openais can each provide basic cluster functionality on their own, but only in a fairly limited form; to build a full-featured cluster the two are combined. Together they provide heartbeat/membership detection, but no resource management.
pacemaker provides the resource-management layer; it was split out of Heartbeat as a separate project starting with Heartbeat v3.
Requirements for a high-availability cluster:
consistent hardware
consistent software (operating system)
synchronized time
hostnames resolvable by every node
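Once the nodes have been configured as described in the steps below, the last two requirements can be verified quickly; the commands are illustrative and run from node1:
# ping -c 2 node2.a.com          confirm the peer's hostname resolves and is reachable
# date; ssh node2.a.com 'date'   compare the clocks of the two nodes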
Case 1: corosync + openais + pacemaker + web
1. Configure the parameters of the two nodes according to the topology diagram
Node 1:
ip: 192.168.2.10/24
Change the hostname:
# vim /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=node1.a.com
# hostname node1.a.com
Make the two nodes able to resolve each other's names:
# vim /etc/hosts
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.2.10 node1.a.com node1
192.168.2.20 node2.a.com node2
Node 2:
ip: 192.168.2.20/24
Change the hostname:
# vim /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=node2.a.com
# hostname node2.a.com
Make the two nodes able to resolve each other's names:
# vim /etc/hosts
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.2.10 node1.a.com node1
192.168.2.20 node2.a.com node2
2. On node 1 (node1), configure yum, then create a mount point and mount the installation CD
# vim /etc/yum.repos.d/rhel-debuginfo.repo
[rhel-server]
name=Red Hat Enterprise Linux server
baseurl=file:///mnt/cdrom/Server
enabled=1
gpgcheck=1
gpgkey=file:///mnt/cdrom/RPM-GPG-KEY-redhat-release
[rhel-cluster]
name=Red Hat Enterprise Linux cluster
baseurl=file:///mnt/cdrom/Cluster
enabled=1
gpgcheck=1
gpgkey=file:///mnt/cdrom/RPM-GPG-KEY-redhat-release
Mount the CD:
# mkdir /mnt/cdrom
# mount /dev/cdrom /mnt/cdrom/
3. On node 2, create the mount point and mount the CD
# mkdir /mnt/cdrom
# mount /dev/cdrom /mnt/cdrom/
4. Synchronize the clocks of the two nodes by running the following command on both of them:
# hwclock -s
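hwclock -s only copies the hardware clock into the system time. If the nodes can reach an NTP server, a network-based sync is an alternative; a minimal sketch (the server name is just an example):
# ntpdate pool.ntp.org           one-shot sync of the system clock from NTP
# hwclock -w                     write the corrected time back to the hardware clock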
5. Set up key-based SSH so the two nodes can communicate without passwords
Generate a key pair on node1:
# ssh-keygen -t rsa    generate an RSA key pair
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):    location where the key is saved
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):    passphrase protecting the private key
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.    private key location
Your public key has been saved in /root/.ssh/id_rsa.pub.    public key location
The key fingerprint is:
be:35:46:8f:72:a8:88:1e:62:44:c0:a1:c2:0d:07:da [email protected]
Generate a key pair on node2:
# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
5e:4a:1e:db:69:21:4c:79:fa:59:08:83:61:6d:2e:4c [email protected]
6. Change into /root/.ssh; the public and private key files are there:
# ll ~/.ssh/
-rw------- 1 root root 1675 10-20 10:37 id_rsa
-rw-r--r-- 1 root root 398 10-20 10:37 id_rsa.pub
7. Copy each node's public key to the other node; this step prompts for the peer's login password
# ssh-copy-id -i id_rsa.pub node2.a.com
# ssh-copy-id -i /root/.ssh/id_rsa.pub node1.a.com
8. Copy node1's yum configuration file to node2; this now works without entering a password
# scp /etc/yum.repos.d/rhel-debuginfo.repo node2.a.com:/etc/yum.repos.d/
rhel-debuginfo.repo 100% 317 0.3KB/s 00:00
9. Node 1 can now query node 2's IP configuration directly:
# ssh node2.a.com 'ifconfig'
10. Upload the required packages to node 1 and node 2 and install them on both nodes
cluster-glue-1.0.6-1.6.el5.i386.rpm
cluster-glue-libs-1.0.6-1.6.el5.i386.rpm
corosync-1.2.7-1.1.el5.i386.rpm
corosynclib-1.2.7-1.1.el5.i386.rpm
heartbeat-3.0.3-2.3.el5.i386.rpm
heartbeat-libs-3.0.3-2.3.el5.i386.rpm
libesmtp-1.0.4-5.el5.i386.rpm
openais-1.1.3-1.6.el5.i386.rpm
openaislib-1.1.3-1.6.el5.i386.rpm
pacemaker-1.1.5-1.1.el5.i386.rpm
pacemaker-cts-1.1.5-1.1.el5.i386.rpm
pacemaker-libs-1.1.5-1.1.el5.i386.rpm
perl-TimeDate-1.16-5.el5.noarch.rpm
resource-agents-1.0.4-1.1.el5.i386.rpm
# yum localinstall *.rpm -y --nogpgcheck    install
11. On node 1, go to the corosync configuration directory and turn the sample file into the real configuration file
# cd /etc/corosync/
# ll
-rw-r--r-- 1 root root 5384 2010-07-28 amf.conf.example    openais configuration example
-rw-r--r-- 1 root root 436 2010-07-28 corosync.conf.example    corosync configuration example
drwxr-xr-x 2 root root 4096 2010-07-28 service.d
drwxr-xr-x 2 root root 4096 2010-07-28 uidgid.d
# cp corosync.conf.example corosync.conf    create the main configuration file
12. Edit corosync.conf
# vim corosync.conf
compatibility: whitetank    stay backward compatible with the older whitetank (OpenAIS 0.80) release
totem {    heartbeat / membership settings
version: 2    protocol version
secauth: off    whether heartbeat messages are authenticated
threads: 0    number of threads used for authenticated message handling; 0 means no extra threads
interface {
ringnumber: 0
bindnetaddr: 192.168.2.10    address of the network interface used for the heartbeat
mcastaddr: 226.94.1.1    multicast address
mcastport: 5405    multicast port
}
}
logging {    logging options
fileline: off
to_stderr: no    whether to send log output to standard error (the screen)
to_logfile: yes    write the log to a log file
to_syslog: yes    also send the log to syslog
logfile: /var/log/cluster/corosync.log    log file path; the directory must be created manually
debug: off
timestamp: on    add timestamps to log entries
logger_subsys {
subsys: AMF
debug: off
}
}
amf {    openais AMF options
mode: disabled
}
In addition to the stanzas above, the following must be appended to the file:
service {
ver: 0
name: pacemaker    start pacemaker as a corosync service
}
aisexec {    options for the openais executive
user: root
group: root
}
13. Make the same changes on node 2; the only difference is that bindnetaddr in the totem { } section must be 192.168.2.20 instead of 192.168.2.10. Everything else is identical to node 1.
Simply copy node1's /etc/corosync/corosync.conf to node2:
# scp /etc/corosync/corosync.conf node2.a.com:/etc/corosync/
Then adjust the file on node2.
14. On both nodes, create the directory /var/log/cluster to hold the corosync log
# mkdir /var/log/cluster
15. On one of the nodes, go to /etc/corosync/ and generate the authentication file authkey
# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Press keys on your keyboard to generate entropy (bits = 936).
Press keys on your keyboard to generate entropy (bits = 1000).
Writing corosync key to /etc/corosync/authkey.
16. Copy the authentication file to the other node so that both nodes share the same key
# scp -p /etc/corosync/authkey node2.a.com:/etc/corosync/
17. Start the corosync service on node 1
# service corosync start
Starting Corosync Cluster Engine (corosync): [ OK ]
From node 1, start the corosync service on node 2:
# ssh node2.a.com 'service corosync start'
Starting Corosync Cluster Engine (corosync): [ OK ]
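Besides the log checks in the next step, the ring status can also be queried directly; exact output varies by environment:
# corosync-cfgtool -s            show the ring status; the local interface should be listed as active with no faults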
18. Run the following troubleshooting checks
Execute these commands on both nodes:
Check that corosync started correctly:
# grep -i -e "corosync cluster engine" -e "configuration file" /var/log/messages
Oct 20 14:01:58 localhost corosync[2069]: [MAIN ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service.
Oct 20 14:01:58 localhost corosync[2069]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Check that the heartbeat (totem) is working:
# grep -i totem /var/log/messages
Oct 20 14:01:58 localhost corosync[2069]: [TOTEM ] The network interface [192.168.2.10] is now up.
Check for other errors:
# grep -i error: /var/log/messages    node 1 shows a number of STONITH errors, node 2 shows none
Oct 20 14:03:02 localhost pengine: [2079]: ERROR: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
Oct 20 14:03:02 localhost pengine: [2079]: ERROR: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
Oct 20 14:03:02 localhost pengine: [2079]: ERROR: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Oct 20 14:04:37 localhost pengine: [2079]: ERROR: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
Oct 20 14:04:37 localhost pengine: [2079]: ERROR: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
Oct 20 14:04:37 localhost pengine: [2079]: ERROR: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Check that pacemaker started:
# grep -i pcmk_startup /var/log/messages
Oct 20 14:01:59 localhost corosync[2069]: [pcmk ] info: pcmk_startup: CRM: Initialized
Oct 20 14:01:59 localhost corosync[2069]: [pcmk ] Logging: Initialized pcmk_startup
18. Check the cluster status
# crm status
============
Last updated: Sat Oct 20 14:24:26 2012
Stack: openais
Current DC: node1.a.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
0 Resources configured.
============
Online: [ node1.a.com node2.a.com ]    both nodes are shown as online
19. Disable STONITH on node 1
# crm
crm(live)# configure
crm(live)configure# property stonith-enabled=false
crm(live)configure# commit
crm(live)configure# show
node node1.a.com
node node2.a.com
property $id="cib-bootstrap-options" \
dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false"
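The same property can also be set non-interactively from the shell, which is convenient in scripts (assuming the crm shell is in the PATH):
# crm configure property stonith-enabled=false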
20. Define resources on node 1
There are four resource types:
primitive    a basic resource that runs on only one node at a time
group    a group resource; resources placed in the same group are kept together on one node (for example an IP address and a service)
clone    a resource that must run on several nodes at the same time (for example ocfs or stonith; the instances have no master/slave relationship)
master    a resource with a master/slave relationship, such as drbd
RA (resource agent) classes:
crm(live)ra# classes
heartbeat
lsb
ocf / heartbeat pacemaker    there are two OCF providers: heartbeat and pacemaker
stonith
Resources:
Each RA class provides a different set of agents; "list <class>" shows the agents that class supports.
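For example, inside the crm shell the agents of a class and the parameters of a particular agent can be inspected as follows (the agent name is just an example):
crm(live)ra# list ocf heartbeat            list the OCF agents supplied by the heartbeat provider
crm(live)ra# meta ocf:heartbeat:IPaddr     show the parameters the IPaddr agent accepts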
The syntax for defining a resource is:
<resource-type> <resource-name> <ra-class>:[<provider>:]<agent> params <parameters>
crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=192.168.2.100
crm(live)configure# commit    commit
21. Check the cluster status on node 1
# crm
crm(live)# status
Stack: openais
Current DC: node1.a.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
Online: [ node1.a.com node2.a.com ]
webip (ocf::heartbeat:IPaddr): Started node1.a.com    the webip resource is running on node 1
Check the IP addresses on node 1:
[root@node1 ~]# ifconfig
eth0:0 inet addr:192.168.2.100    the virtual IP address is on node 1
On node 2:
crm(live)# status
Stack: openais
Current DC: node1.a.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
Online: [ node1.a.com node2.a.com ]
webip (ocf::heartbeat:IPaddr): Started node1.a.com
22. Define the service. Install httpd on both nodes, make sure the httpd service is stopped, and make sure it does not start automatically at boot
# yum install httpd -y
Because the httpd service may run on only one node at a time, the resource type is primitive:
crm(live)configure# primitive webserver lsb:httpd
crm(live)configure# show
node node1.a.com
node node2.a.com
primitive webip ocf:heartbeat:IPaddr \
params ip="192.168.2.100"    the IP resource
primitive webserver lsb:httpd    the httpd resource
property $id="cib-bootstrap-options" \
dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false"
Commit:
crm(live)configure# commit
23. Checking the cluster status now shows webip on node 1 and httpd on node 2
crm(live)# status
Stack: openais
Current DC: node1.a.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
2 Resources configured.    two resources are defined
Online: [ node1.a.com node2.a.com ]
webip (ocf::heartbeat:IPaddr): Started node1.a.com    webip is on node 1
webserver (lsb:httpd): Started node2.a.com    httpd is on node 2
24. At this point node1 holds the virtual IP address while node2 runs the httpd service. To keep them together, create a group resource and put both webip and webserver into it; resources in the same group are always placed on the same node. (An alternative using explicit constraints is sketched after the show output below.)
group <group-name> <resource1> <resource2>
crm(live)configure# group web webip webserver
crm(live)configure# commit    commit
crm(live)configure# show
node node1.a.com
node node2.a.com
primitive webip ocf:heartbeat:IPaddr \
params ip="192.168.2.100"
primitive webserver lsb:httpd
group web webip webserver
property $id="cib-bootstrap-options" \
dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false"
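Instead of a group, the same behaviour can be expressed with explicit constraints; a minimal sketch, with illustrative constraint names:
crm(live)configure# colocation webserver-with-webip inf: webserver webip    keep httpd on the node that holds the virtual IP
crm(live)configure# order webip-before-webserver inf: webip webserver    bring up the virtual IP before starting httpd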
25. Check the cluster status again; both resources are now on node 1
crm(live)# status
Last updated: Sat Oct 20 16:39:37 2012
Stack: openais
Current DC: node1.a.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
Online: [ node1.a.com node2.a.com ]
Resource Group: web
webip (ocf::heartbeat:IPaddr): Started node1.a.com
webserver (lsb:httpd): Started node1.a.com
26. The IP address and the httpd service are now both on node 1
[root@node1 ~]# service httpd status
httpd (pid 2800) is running...
[root@node1 ~]# ifconfig eth0:0
eth0:0 Link encap:Ethernet HWaddr 00:0C:29:37:3F:E6
inet addr:192.168.2.100 Bcast:192.168.2.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:67 Base address:0x2024
27. Create a web page on each of the two nodes
node1:
# echo "node1" > /var/www/html/index.html
Create node2's page directly from node1:
# ssh node2.a.com 'echo "node2" > /var/www/html/index.html'
28. Open http://192.168.2.100 in a browser
29. Node1's page is shown. Now simulate a failure of node1:
[root@node1 ~]# service corosync stop
Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]
Waiting for corosync services to unload:........ [ OK ]
Visiting the IP address again now fails; the page can no longer be reached.
30. Checking the cluster status on node 2 shows that webip and webserver are not running on any node
[root@node2 ~]# crm
crm(live)# status
Last updated: Sat Oct 20 16:55:16 2012
Stack: openais
Current DC: node2.a.com - partition WITHOUT quorum    node2 is now the DC, but its partition has no quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
Online: [ node2.a.com ]
OFFLINE: [ node1.a.com ]
31. The quorum requirement can be relaxed; here we choose ignore.
When the partition has less than half of the expected votes, the possible no-quorum-policy values are:
ignore    ignore the loss of quorum and keep managing resources
freeze    resources that are already running keep running, but no new resources are started
stop    stop all resources (the default)
suicide    fence all nodes in the affected partition
32. Start node1's corosync service again and change the quorum policy
# service corosync start
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# commit
33. Stop node1's corosync service again and check the status on node2
# service corosync stop    stop the service on node1
Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]
Waiting for corosync services to unload:....... [ OK ]
Cluster status on node2:
[root@node2 ~]# crm status
Stack: openais
Current DC: node2.a.com - partition WITHOUT quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
Online: [ node2.a.com ]
OFFLINE: [ node1.a.com ]
Resource Group: web
webip (ocf::heartbeat:IPaddr): Started node2.a.com
webserver (lsb:httpd): Started node2.a.com
34. Visiting 192.168.2.100 now shows node 2's page
35. If node1's corosync service is started again,
[root@node1 ~]# service corosync start
node1 does not take the resources back; they stay on node2 until node2 fails.
[root@node1 ~]# crm status
Last updated: Sat Oct 20 17:17:24 2012
Stack: openais
Current DC: node2.a.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
Online: [ node1.a.com node2.a.com ]
Resource Group: web
webip (ocf::heartbeat:IPaddr): Started node2.a.com
webserver (lsb:httpd): Started node2.a.com
DRBD configuration
36. Partition the disks of the two nodes; the partitions on the two nodes must be exactly the same size.
Perform the following on both nodes:
# fdisk /dev/sda
Command (m for help): p    show the current partition table
Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 83 Linux
/dev/sda2 14 1288 10241437+ 83 Linux
/dev/sda3 1289 1415 1020127+ 82 Linux swap / Solaris
Command (m for help): n    add a partition
Command action
e extended
p primary partition (1-4)
e    create an extended partition
Selected partition 4
First cylinder (1416-2610, default 1416):    starting cylinder
Using default value 1416
Last cylinder or +size or +sizeM or +sizeK (1416-2610, default 2610):    ending cylinder
Using default value 2610
Command (m for help): n    add another partition (it now defaults to a logical partition)
First cylinder (1416-2610, default 1416):    starting cylinder
Using default value 1416
Last cylinder or +size or +sizeM or +sizeK (1416-2610, default 2610): +1G    size of 1 GB
Command (m for help): p    show the partition table again
Disk /dev/sda: 21.4 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 83 Linux
/dev/sda2 14 1288 10241437+ 83 Linux
/dev/sda3 1289 1415 1020127+ 82 Linux swap / Solaris
/dev/sda4 1416 2610 9598837+ 5 Extended
/dev/sda5 1416 1538 987966 83 Linux
Command (m for help): w    save the partition table and exit
The partition table has been altered!
Calling ioctl() to re-read partition table.
WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.
37. Make the kernel re-read the partition table (do the same on both nodes)
# partprobe /dev/sda
# cat /proc/partitions
major minor #blocks name
8 0 20971520 sda
8 1 104391 sda1
8 2 10241437 sda2
8 3 1020127 sda3
8 4 0 sda4
8 5 987966 sda5
38. Upload the DRBD userland package and the matching kernel-module package. The running kernel here is 2.6.18, and DRBD was only merged into the mainline kernel starting with 2.6.33, so on this system DRBD has to be loaded as an external kernel module. Install both packages:
drbd83-8.3.8-1.el5.centos.i386.rpm    DRBD userland tools
kmod-drbd83-8.3.8-1.el5.centos.i686.rpm    kernel module
# yum localinstall drbd83-8.3.8-1.el5.centos.i386.rpm kmod-drbd83-8.3.8-1.el5.centos.i686.rpm -y --nogpgcheck
39. Run the following commands on both nodes
# modprobe drbd    load the kernel module
# lsmod | grep drbd    check that it loaded successfully
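Optionally, confirm that the loaded module matches the installed userland package; the versions reported by the two commands should agree:
# modinfo drbd | grep -i version    version of the kernel module
# rpm -q drbd83    version of the userland tools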
40. On both nodes, edit the DRBD configuration file /etc/drbd.conf
#
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
include "drbd.d/global_common.conf"; 包含全局通用配置文件
include "drbd.d/*.res"; 包含资源文件
# please have a a look at the example configuration file in
# /usr/share/doc/drbd83/drbd.conf
41. On both nodes, edit global_common.conf; make a backup before editing
# cd /etc/drbd.d/
# cp -p global_common.conf global_common.conf.bak
# vim global_common.conf
global {
usage-count no;    do not take part in DRBD's online usage counter
# minor-count dialog-refresh disable-ip-verification
}
common {
protocol C;    protocol C: a write is only complete once it has reached the peer's disk (synchronous replication)
handlers {
# fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
# split-brain "/usr/lib/drbd/notify-split-brain.sh root";
# out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
# before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
# after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
}
startup {    timeouts to wait for the peer at startup
wfc-timeout 120;
degr-wfc-timeout 120;
}
disk {
on-io-error detach;    detach from the backing device on an I/O error
fencing resource-only;
}
net {
cram-hmac-alg "sha1";通讯时使用sha1加密
shared-secret "abc"; 预共享密钥,双方应相同
}
syncer {
rate 100M;    resynchronization rate
}
}
42. On both nodes, create a resource file. The file name is arbitrary (here web.res) but must not contain spaces:
# vim /etc/drbd.d/web.res
resource web {    resource name
on node1.a.com {    section for node1.a.com
device /dev/drbd0;    logical DRBD device, created under /dev/
disk /dev/sda5;    backing device: the disk or partition replicated between the nodes
address 192.168.2.10:7789;    node1's IP address and port
meta-disk internal;    store the DRBD metadata on the backing device itself
}
on node2.a.com {    section for node2.a.com
device /dev/drbd0;
disk /dev/sda5;
address 192.168.2.20:7789;
meta-disk internal;
}
}
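Before creating the metadata, the resource file can be syntax-checked; if the configuration parses cleanly, drbdadm prints it back out:
# drbdadm dump web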
43. Initialize the resource web on both nodes
# drbdadm create-md web    create the metadata for resource web
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
44. Start the drbd service on both nodes
# service drbd start
Starting DRBD resources: [
web
Found valid meta data in the expected location, 1011671040 bytes into /dev/sda5.
d(web) n(web) ]...
45. Check which node currently has the active (Primary) device
# cat /proc/drbd
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by [email protected], 2010-06-04 08:04:16
0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:987896
ro shows the local role / the peer's role; both nodes are currently Secondary, so neither side may access the device.
Alternatively, use drbd-overview to check the current state:
# drbd-overview
0:web Connected Secondary/Secondary Inconsistent/Inconsistent C r----
46. On node 1, promote the local device to Primary (overwriting the peer's data)
# drbdadm -- --overwrite-data-of-peer primary web
# drbd-overview    node 1 is now Primary and the initial sync is 3.4% complete
0:web SyncSource Primary/Secondary UpToDate/Inconsistent C r----
[>....................] sync'ed: 3.4% (960376/987896)K delay_probe: 87263
On node 2:
# drbd-overview
0:web SyncTarget Secondary/Primary Inconsistent/UpToDate C r----
[=>..................] sync'ed: 10.0% (630552/692984)K queue_delay: 0.0 ms
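For a larger device the initial synchronization can take a while; its progress can be watched until both sides report UpToDate (the interval is arbitrary):
# watch -n 2 'cat /proc/drbd'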
47. On node 1, create an ext3 filesystem on the Primary device
# mkfs -t ext3 -L drbdweb /dev/drbd0
48. On node 1, create a mount point and mount /dev/drbd0 on it
# mkdir /mnt/web
# mount /dev/drbd0 /mnt/web
49. Make node1 the Secondary and node2 the Primary. On node1 run:
# drbdadm secondary web
0: State change failed: (-12) Device is held open by someone    the device is still in use
Command 'drbdsetup 0 secondary' terminated with exit code 11
Unmount the device first, then run the command again:
# umount /mnt/web/
# drbdadm secondary web
50. Check node1's state; both nodes are now Secondary
# drbd-overview
0:web Connected Secondary/Secondary UpToDate/UpToDate C r----
51. On node 2, promote the local device to Primary
# drbdadm primary web
# drbd-overview    the local device is now Primary
0:web Connected Primary/Secondary UpToDate/UpToDate C r----
52. On node 2, create a filesystem on /dev/drbd0
# mkfs -t ext3 -L drbdweb /dev/drbd0
53. On node 2, create the mount point and mount /dev/drbd0
# mkdir /mnt/web
# mount /dev/drbd0 /mnt/web    mounting fails if node 2 is not the Primary
54. On node 1, set a default resource stickiness so that resources prefer to stay on the node where they are currently running
crm(live)configure# rsc_defaults resource-stickiness=100
crm(live)configure# commit
55. On node 1, define the DRBD resource
crm(live)configure# primitive webdrbd ocf:heartbeat:drbd params drbd_resource=web op monitor role=Master interval=50s timeout=30s op monitor role=Slave interval=60s timeout=30s
56. Create a master/slave resource and put webdrbd under it
crm(live)configure# master MS_Webdrbd webdrbd meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
57. Create a Filesystem resource so the cluster automatically mounts the DRBD device on the Primary node
crm(live)configure# primitive WebFS ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/mnt/web" fstype="ext3"
58. Constrain WebFS to run on the node where MS_Webdrbd is Master, and start it only after the promotion has completed
crm(live)configure# colocation WebFS_on_MS_webdrbd inf: WebFS MS_Webdrbd:Master
crm(live)configure# order WebFS_after_MS_Webdrbd inf: MS_Webdrbd:promote WebFS:start
crm(live)configure# verify
crm(live)configure# commit
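After the commit, the cluster status should show MS_Webdrbd with a Master on one node and WebFS started (mounted) on that same node; check with:
# crm status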
59. Make node1 the Primary (drbdadm primary web), then mount /dev/drbd0 on /mnt/web, change into /mnt/web and create a directory named html.
60. Edit the Apache configuration on node1
# vim /etc/httpd/conf/httpd.conf
DocumentRoot "/mnt/web/html"
# echo "Node1.a.org" > /mnt/web/html/index.html
# crm configure primitive WebSite lsb:httpd    add httpd as a resource
# crm configure colocation website-with-ip INFINITY: WebSite webip    keep the IP and the web service on the same node
# crm configure order httpd-after-ip mandatory: webip WebSite    define the resource start order
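Since DocumentRoot lives on the DRBD-backed filesystem, it may also make sense to tie the web service to WebFS; a minimal sketch with illustrative constraint names:
# crm configure colocation website-with-webfs inf: WebSite WebFS    run httpd only where the DRBD filesystem is mounted
# crm configure order website-after-webfs inf: WebFS WebSite    start httpd only after the filesystem is mounted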