Building master/slave and dual-master (Active/Active) clusters on Fedora 15
The topology is as follows:
 __________       __________
|   HA1    |_____|   HA2    |
|__________|     |__________|
HA1:
IP: 192.168.1.78/24
HA2:
IP: 192.168.1.151/24
VIP: 192.168.1.110
I. Configure the network settings
HA1:
#ifconfig eth0 192.168.1.78/24
#route add default gw 192.168.1.1
#hostname node1.luowei.com
HA2:
#ifconfig eth0 192.168.1.151/24
#route add default gw 192.168.1.1
#hostname node2.luowei.com
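The hostname command only takes effect for the running system; to keep the names across reboots, also set them in the system configuration. A minimal sketch (the file location is assumed for Fedora 15, adjust if your release differs):
HA1:
#vim /etc/sysconfig/network    //set: HOSTNAME=node1.luowei.com
HA2:
#vim /etc/sysconfig/network    //set: HOSTNAME=node2.luowei.com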
II. Configure host names and passwordless SSH between the two nodes
#vim /etc/hosts    Add the following entries:
192.168.1.78 node1.luowei.com node1
192.168.1.151 node2.luowei.com node2
Add the same entries on HA2 as well.
If #ping node1 and #ping node2 resolve the names, you are all set.
Generate a key pair on each HA node, as shown below:
[root@node1 ~]# ssh-keygen -t rsa //generate the public/private key pair
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
59:71:5d:4d:4c:6d:71:b1:ec:04:17:26:49:cb:27:a1 root@node1.luowei.com
The key's randomart image is:
+--[ RSA 2048]----+
| . o*.@@|
| oo.X B|
| .E + * |
| o = |
| S . |
| |
| |
| |
| | //this pattern is the so-called key fingerprint visualization (not shown on Red Hat)
+-----------------+
[root@node1 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@node2 //copy the public key to the other node
The authenticity of host 'node2 (192.168.1.151)' can't be established.
RSA key fingerprint is 77:b6:c6:09:51:f9:f4:70:c1:35:81:47:a5:19:f4:d2.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node2,192.168.1.151' (RSA) to the list of known hosts.
root@node2's password: //enter the other node's root password
Now try logging into the machine, with "ssh 'root@node2'", and check in:
~/.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
Do the same on HA2; I will not show the steps again.
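A quick way to confirm that passwordless login works in both directions (using the host names defined in /etc/hosts above):
[root@node1 ~]# ssh node2 'date'    //should print node2's time without asking for a password
[root@node2 ~]# ssh node1 'date'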
III. Configure the yum repositories; I use the 163.com mirror
http://mirrors.163.com/
That site explains how to configure the Fedora yum repositories, so I will not repeat it here.
If you have no DNS server to resolve the mirror names, also add the entries to /etc/hosts by hand; mine are as follows:
66.35.62.166 mirrors.fedoraproject.org
213.129.242.84 mirrors.rpmfusion.org
123.58.173.106 mirrors.163.com
You can check these name-to-IP mappings simply by pinging the host names.
IV. Install the cluster software
To be done on both nodes:
#yum install corosync pacemaker -y //the packages come from a network mirror, so this can be slow; be patient
Once the installation finishes, move on to configuration. Make sure the port and multicast address you choose do not conflict with an existing cluster; I use the following simple settings:
#export ais_port=4000
#export ais_mcast=226.94.1.1
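These environment variables are only a convenient place to hold the values. After copying the example configuration below, you could substitute them with sed instead of editing by hand; a sketch, assuming the stock example file layout:
#sed -i.bak "s/.*mcastaddr:.*/mcastaddr: $ais_mcast/g" /etc/corosync/corosync.conf
#sed -i "s/.*mcastport:.*/mcastport: $ais_port/g" /etc/corosync/corosync.conf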
Next, configure corosync:
#cd /etc/corosync/
#cp corosync.conf.example corosync.conf
#vim !$    Change the configuration to the following:
# Please read the corosync.conf.5 manual page
compatibility: whitetank
totem {
version: 2
secauth: on
threads: 0
interface {
ringnumber: 0
bindnetaddr: 192.168.1.0 //network address of the subnet the cluster lives on
mcastaddr: 226.94.1.1 //multicast address
mcastport: 4000 //multicast port
ttl: 1
}
}
logging {
fileline: off
to_stderr: no
to_logfile: yes
to_syslog: no
logfile: /var/log/cluster/corosync.log
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
}
}
amf {
mode: disabled
}
####The following lines are added
service {
ver: 1 //pacemaker plugin version; use 1 on Fedora, whereas 0 can be used on Red Hat
name: pacemaker
}
aisexec {
user: root
group: root
}
The commented lines above are the ones that were modified.
Once the configuration is done, copy it over to the other node:
#scp -p /etc/corosync/corosync.conf node2:/etc/corosync/
If everything looks correct, corosync can be started on HA1; after starting it, a series of checks still has to be run.
#/etc/init.d/corosync start
Generate the authentication key:
#corosync-keygen //on a fresh machine this can take a while because it needs entropy, so be patient
#scp -p authkey corosync.conf node2:/etc/corosync/
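corosync-keygen writes the key to /etc/corosync/authkey; it is worth confirming on both nodes that the file stays readable by root only (path assumed from the default):
#ls -l /etc/corosync/authkey    //expect owner root and mode 0400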
With the configuration in place, start corosync on HA1:
#service corosync start
Starting corosync (via systemctl): [ OK ]
OK, the corosync service started successfully!
Next, verify that the cluster started correctly and that the node can form a cluster with its peer.
Check whether the corosync engine started properly:
[root@node1 ~]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/messages
Sep 18 23:09:44 node1 smartd[786]: Opened configuration file /etc/smartd.conf
Sep 19 13:41:03 node1 smartd[801]: Opened configuration file /etc/smartd.conf
Sep 19 20:44:55 node1 smartd[680]: Opened configuration file /etc/smartd.conf
[root@node1 ~]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
Sep 18 17:12:06 corosync [MAIN ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
Sep 18 17:12:06 corosync [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Sep 18 17:12:06 corosync [MAIN ] Corosync Cluster Engine exiting with status 8 at main.c:1702.
Sep 18 17:16:11 corosync [MAIN ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
Check whether the initial membership notifications were sent out properly:
[root@node1 ~]# grep TOTEM /var/log/cluster/corosync.log
Check whether any errors were produced during startup:
[root@node2 ~]# grep ERROR: /var/log/cluster/corosync.log | grep -v unpack_resources
Check whether pacemaker started properly:
[root@node1 ~]# grep pcmk_startup /var/log/cluster/corosync.log
Sep 19 13:48:48 corosync [pcmk ] info: pcmk_startup: CRM: Initialized
Sep 19 13:48:48 corosync [pcmk ] Logging: Initialized pcmk_startup
Sep 19 13:48:48 corosync [pcmk ] info: pcmk_startup: Maximum core file size is: 4294967295
Sep 19 13:48:48 corosync [pcmk ] info: pcmk_startup: Service: 9
Sep 19 13:48:48 corosync [pcmk ] info: pcmk_startup: Local hostname: node1.luowei.com
With these checks done, the other node can be started. It is best to start all the remaining cluster nodes from the same node:
[root@node1 ~]# ssh node2 -- '/etc/init.d/corosync start'
Starting corosync (via systemctl): [ OK ]
Started successfully!
Next, start pacemaker:
[root@node1 corosync]# /etc/init.d/pacemaker start
Starting pacemaker (via systemctl): [ OK ]
OK, it started successfully as well.
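Because the pacemaker plugin is loaded with ver: 1, pacemaker also has to be started separately on the second node; the same remote-start approach used for corosync above should work:
[root@node1 ~]# ssh node2 -- '/etc/init.d/pacemaker start'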
# ps axf //check the processes
1724 ? R 5:59 /usr/lib/heartbeat/stonithd
1725 ? R 5:59 /usr/lib/heartbeat/cib
1726 ? S 0:00 /usr/lib/heartbeat/lrmd
1727 ? R 5:59 /usr/lib/heartbeat/attrd
1728 ? S 0:00 /usr/lib/heartbeat/pengine
1729 ? R 5:59 /usr/lib/heartbeat/crmd
You can see that the Pacemaker daemons are all running now.
A critical step at this point is the firewall. If you do not disable it (or open the cluster ports), it will cause you a lot of trouble later on; I left it enabled at first and only found the problem by reading the logs. For this test you can simply turn the firewall off, but in a real deployment you should keep it enabled and open the required ports instead.
#setup //choose "Firewall configuration" and set it to Disabled
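If you prefer the command line to the setup tool, something like the following should have the same effect (a sketch; the exact firewall service name may differ on your install):
#service iptables stop
#chkconfig iptables off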
Next, check the cluster with crm's built-in commands:
#crm_mon    (or: crm status)
Online: [ node2.luowei.com node1.luowei.com ] //both cluster nodes are online
Everything is in place; next comes the configuration that leads to the dual-master cluster.
V. Install the Apache service and the cluster filesystem (GFS2)
For easy verification I install Apache as a test service:
#yum install httpd -y
Add a test page on HA1:
#echo "<h1>node1.luowei.com</h1>" >/var/www/html/index.html
Add a test page on HA2:
#echo "<h1>node2.luowei.com</h1>" >/var/www/html/index.html
Then make sure the following block is enabled in /etc/httpd/conf/httpd.conf on both nodes (remove the comment markers if it is commented out); the apache resource agent uses this status URL for its monitor operation:
<Location /server-status>
SetHandler server-status
Order deny,allow
Deny from all
Allow from 127.0.0.1
</Location>
Make sure the httpd service does not start automatically at boot, since the cluster will manage it:
#chkconfig httpd off
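Before handing httpd over to the cluster, you can quickly confirm that each node serves its own test page (a short check: start httpd by hand, test, then stop it again so that only the cluster manages it):
#service httpd start
#curl http://localhost    //should return the <h1>nodeX.luowei.com</h1> page for the local node
#service httpd stop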
#crm configure property stonith-enabled=false //disable STONITH, since there is no fencing device in this test setup
#crm configure property no-quorum-policy=ignore //ignore loss of quorum; a two-node cluster has no meaningful majority when one node is down
#crm configure rsc_defaults resource-stickiness=100 //set the default resource stickiness (this shows up as rsc_defaults in the configuration below)
Add the resources for the web service:
# crm configure primitive WebSite ocf:heartbeat:apache params configfile=/etc/httpd/conf/httpd.conf op monitor interval=1min
# crm configure primitive ClusterIP ocf:heartbeat:IPaddr2 params ip=192.168.1.110 cidr_netmask=32 op monitor interval=30s //add a virtual IP
[root@node1 ~]# crm status
============
Last updated: Mon Sep 19 23:44:05 2011
Stack: openais
Current DC: node2.luowei.com - partition with quorum
Version: 1.1.5-1.fc15-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
2 Resources configured.
============
Online: [ node2.luowei.com node1.luowei.com ]
ClusterIP (ocf::heartbeat:IPaddr2): Started node1.luowei.com
WebSite (ocf::heartbeat:apache): Started node2.luowei.com
You can see that the two resources are not on the same node, so the following constraint is needed:
#crm configure colocation website-with-ip INFINITY: WebSite ClusterIP //a colocation constraint that keeps WebSite together with ClusterIP
Running crm status again shows that the resources have now moved to the same node:
Online: [ node2.luowei.com node1.luowei.com ]
ClusterIP (ocf::heartbeat:IPaddr2): Started node1.luowei.com
WebSite (ocf::heartbeat:apache): Started node1.luowei.com
The start/stop order of the resources also needs to be controlled:
#crm configure order apache-after-ip mandatory: ClusterIP WebSite //the IP resource must be started before the Apache service
Specify a preferred location:
#crm configure location prefer-pcmk-l WebSite 50: node1.luowei.com
#crm configure show //review the resulting configuration:
[root@node1 ~]# crm configure show
node node1.luowei.com
node node2.luowei.com
primitive ClusterIP ocf:heartbeat:IPaddr2 \
params ip="192.168.1.110" cidr_netmask="32" \
op monitor interval="30s"
primitive WebSite ocf:heartbeat:apache \
params configfile="/etc/httpd/conf/httpd.conf" \
op monitor interval="1min"
location prefer-pcmk-l WebSite 50: node1.luowei.com
colocation website-with-ip inf: WebSite ClusterIP
order apache-after-ip inf: ClusterIP WebSite
property $id="cib-bootstrap-options" \
dc-version="1.1.5-1.fc15-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
As shown above, the resources are running, so we can carry on. Opening http://192.168.1.110 in a browser should now reach the web service.
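Before moving on you may also want to test failover; a sketch using the crm shell's node commands: putting node1 in standby should move ClusterIP and WebSite to node2, and because resource-stickiness=100 outweighs the location score of 50, they stay on node2 even after node1 comes back online.
#crm node standby node1.luowei.com
#crm status    //resources should now be running on node2.luowei.com
#crm node online node1.luowei.com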
VI. Install the DRBD packages
DRBD replicates data between the nodes and thus provides the mirrored backing storage.
1. # yum install drbd-pacemaker drbd-udev -y
2. After installing DRBD, first create a dedicated partition on each node to hold the data. Here I use a new disk (/dev/sdb) for the experiment and partition it as shown below:
#fdisk /dev/sdb
[root@node1 ~]# fdisk /dev/sda1
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0xcaf34d49.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
Command (m for help): p
Disk /dev/sda1: 524 MB, 524288000 bytes
255 heads, 63 sectors/track, 63 cylinders, total 1024000 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0xcaf34d49
Device Boot Start End Blocks Id System
Command (m for help): q
# partprobe /dev/sdb
# pvcreate /dev/sdb1
# vgcreate VolGroupb /dev/sdb1
# lvcreate -n drbd-demo -L 1G VolGroupb
[root@node1 ~]# lvs
LV VG Attr LSize Origin Snap% Move Log Copy% Convert
lv_root VolGroup -wi-ao 17.56g
lv_swap VolGroup -wi-ao 1.94g
drbd-demo VolGroupb -wi-a- 1.00g
OK, the logical volume on HA1 is ready; repeat the same steps on HA2 (not shown here).
3. With the storage prepared, the next step is to configure DRBD.
#vim /etc/drbd.conf
include "drbd.d/global_common.conf";
include "drbd.d/*.res";
global {
usage-count yes;
}
common {
protocol C;
}
resource wwwdata {
meta-disk internal;
device /dev/drbd1;
syncer {
verify-alg sha1;
}
net {
allow-two-primaries;
}
on node1.luowei.com {
disk /dev/mapper/VolGroupb-drbd--demo;
address 192.168.1.78:7789; //the HA1 node
}
on node2.luowei.com {
disk /dev/mapper/VolGroupb-drbd--demo;
address 192.168.1.151:7789; //the HA2 node
}
}
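Before initializing the resource you can have drbdadm parse the configuration, which catches most syntax mistakes (resource name as defined above):
#drbdadm dump wwwdata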
4. Next, initialize and bring up DRBD:
# drbdadm create-md wwwdata
New drbd meta data block successfully created.
Initialization succeeded.
5. Load the DRBD module into the kernel and check that everything is normal:
[root@node1 ~]# modprobe drbd
[root@node1 ~]# drbdadm up wwwdata
[root@node1 ~]# cat /proc/drbd
version: 8.3.9 (api:88/proto:86-95)
srcversion: CF228D42875CF3A43F2945A
1: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown C r----s
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1048508
You can see that the node is now in the Secondary role. Load the module and run the same checks on the second node in the same way; those steps are omitted here.
6. Then check from either node: both sides are now Secondary, so everything is fine.
[root@node1 ~]# drbd-overview
1:wwwdata Connected Secondary/Secondary Inconsistent/Inconsistent C r-----
7. Now make HA1 the primary node:
[root@node1 ~]# drbdadm -- --overwrite-data-of-peer primary wwwdata
The following command lets you watch in real time as the data is copied from the primary to the standby node:
[root@node1 ~]# watch -n 1 'drbd-overview'
1:wwwdata SyncSource Primary/Secondary UpToDate/Inconsistent C r-----
[==>.................] sync'ed: 0.8% (1042492/1048508)K
1:wwwdata Connected Primary/Secondary UpToDate/UpToDate C r-----
The data synchronization is complete. HA1 is now Primary, which means it accepts writes, so we can create a filesystem on it and put some data in.
8. Put some data on the DRBD device:
[root@node1 ~]# mkfs.ext4 /dev/drbd1 //create a filesystem on the DRBD device
[root@node1 ~]# mount /dev/drbd1 /mnt/ //mount it
[root@node1 ~]# echo "<h2>drbd test page</h2>" >/mnt/index.html
[root@node1 ~]# umount /mnt/ //unmount it
9. Configure DRBD in the cluster:
[root@node1 ~]# crm
crm(live)# cib new drbd
crm(drbd)# configure
crm(drbd)configure# primitive WebData ocf:linbit:drbd params drbd_resource=wwwdata op monitor interval=60s
crm(drbd)configure# ms WebDataClone WebData meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
crm(drbd)configure# commit
[root@node1 ~]# crm status
============
Last updated: Tue Sep 20 22:08:10 2011
Stack: openais
Current DC: node1.luowei.com - partition with quorum
Version: 1.1.5-1.fc15-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
3 Resources configured.
============
Online: [ node2.luowei.com node1.luowei.com ]
ClusterIP (ocf::heartbeat:IPaddr2): Started node1.luowei.com
WebSite (ocf::heartbeat:apache): Started node1.luowei.com
Master/Slave Set: WebDataClone [WebData]
Masters: [ node2.luowei.com ]
Slaves: [ node1.luowei.com ]
From the output above you can see that the resources started normally, but note that the DRBD master is on HA2. To bring everything onto the same node, further resource constraints are needed:
[root@node1 ~]# crm
crm(live)# configure
crm(live)configure# primitive WebFS ocf:heartbeat:Filesystem params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="ext4"
crm(live)configure# colocation fs_ondrbd inf: WebFS WebDataClone:Master
crm(live)configure# order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
crm(live)configure# colocation WebSite-with-WebFS inf: WebSite WebFS
crm(live)configure# order WebSite-after-WebFS inf: WebFS WebSite
crm(live)configure# commit
Check again; the output is now:
[root@node1 ~]# crm status
============
Last updated: Tue Sep 20 22:38:16 2011
Stack: openais
Current DC: node1.luowei.com - partition with quorum
Version: 1.1.5-1.fc15-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Online: [ node2.luowei.com node1.luowei.com ]
ClusterIP (ocf::heartbeat:IPaddr2): Started node2.luowei.com
Master/Slave Set: WebDataClone [WebData]
Masters: [ node2.luowei.com ]
Slaves: [ node1.luowei.com ]
WebFS (ocf::heartbeat:Filesystem): Started node2.luowei.com
As you can see, the resources are all on the same node now.
VII. Building the dual-master (Active/Active) cluster on top of the setup above
1. Install the cluster filesystem packages:
#yum install gfs2-utils gfs2-cluster gfs-pcmk //to be installed on both nodes
2. Add the DLM service:
[root@node1 ~]# crm
crm(live)# configure
crm(live)configure# primitive dlm ocf:pacemaker:controld op monitor interval=120s
crm(live)configure# clone dlm-clone dlm meta interleave=true
crm(live)configure# commit
3. Create the gfs-control cluster resource:
[root@node1 ~]# crm
crm(live)# configure
crm(live)configure# primitive gfs-control ocf:pacemaker:controld params daemon=gfs_controld.pcmk args="-g 0" op monitor interval=120s
crm(live)configure# clone gfs-clone gfs-control meta interleave=true
crm(live)configure# colocation gfs-with-dlm INFINITY: gfs-clone dlm-clone
crm(live)configure# order start-gfs-after-dlm mandatory: dlm-clone gfs-clone
crm(live)configure# commit
Then review the configuration, which now looks like this:
#crm configure show
node node1.luowei.com
node node2.luowei.com
primitive ClusterIP ocf:heartbeat:IPaddr2 \
params ip="192.168.1.110" cidr_netmask="32" \
op monitor interval="30s"
primitive WebData ocf:linbit:drbd \
params drbd_resource="wwwdata" \
op monitor interval="60s"
primitive WebFS ocf:heartbeat:Filesystem \
params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="ext4"
primitive WebSite ocf:heartbeat:apache \
params configfile="/etc/httpd/conf/httpd.conf" \
op monitor interval="1min"
primitive dlm ocf:pacemaker:controld \
op monitor interval="120s"
primitive gfs-control ocf:pacemaker:controld \
params daemon="gfs_controld.pcmk" args="-g 0" \
op monitor interval="120s"
ms WebDataClone WebData \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone dlm-clone dlm \
meta interleave="true"
clone gfs-clone gfs-control \
meta interleave="true"
location prefer-pcmk-l WebSite 50: node1.luowei.com
colocation WebSite-with-WebFS inf: WebSite WebFS
colocation fs_ondrbd inf: WebFS WebDataClone:Master
colocation gfs-with-dlm inf: gfs-clone dlm-clone
colocation website-with-ip inf: WebSite ClusterIP
order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
order WebSite-after-WebFS inf: WebFS WebSite
order apache-after-ip inf: ClusterIP WebSite
order start-gfs-after-dlm inf: dlm-clone gfs-clone
property $id="cib-bootstrap-options" \
dc-version="1.1.5-1.fc15-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
Check the cluster status output:
[root@node1 ~]# crm_mon
============
Last updated: Tue Sep 20 23:18:22 2011
Stack: openais
Current DC: node1.luowei.com - partition with quorum
Version: 1.1.5-1.fc15-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
6 Resources configured.
============
Online: [ node2.luowei.com node1.luowei.com ]
ClusterIP (ocf::heartbeat:IPaddr2): Started node1.luowei.com
Master/Slave Set: WebDataClone [WebData]
Masters: [ node2.luowei.com ]
Slaves: [ node1.luowei.com ]
WebSite (ocf::heartbeat:apache): Started node1.luowei.com
Clone Set: dlm-clone
Started: [node2.luowei.com node1.luowei.com]
Clone Set: gfs-clone
Started: [node2.luowei.com node1.luowei.com]
WebFS (ocf::heartbeat:Filesystem): Started node1.luowei.com
4. Create the GFS2 filesystem
[root@node1 ~]# crm_resource --resource WebFS --set-parameter target-role --meta --parameter-value Stopped
At this point crm status shows that both the apache (WebSite) and WebFS resources have been stopped.
5. Create the GFS2 partition and migrate data onto it
Run the following command on the node that currently holds the DRBD primary role (node2 here):
[root@node2 ~]# mkfs.gfs2 -p lock_dlm -j 2 -t pcmk:web /dev/drbd1
This will destroy any data on /dev/drbd1.
It appears to contain: Linux rev 1.0 ext4 filesystem data, UUID=19976683-c802-479c-854d-e786617be523 (extents) (large files) (huge files)
Are you sure you want to proceed? [y/n] y
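The actual data copy is not shown in the capture; a minimal sketch of putting a test page onto the new GFS2 filesystem (run on the current DRBD primary; mount point and page content chosen for this example) could be:
[root@node2 ~]# mount -t gfs2 /dev/drbd1 /mnt/
[root@node2 ~]# echo "<h2>drbd test page</h2>" >/mnt/index.html
[root@node2 ~]# umount /mnt/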
6. Then migrate the data to the new filesystem and reconfigure the cluster for GFS2:
[root@node1 ~]# crm
crm(live)# configure
crm(live)configure# primitive WebFS ocf:heartbeat:Filesystem params device="/dev/drbd/by-res/wwwdata" directory="/var/www/html" fstype="gfs2"
crm(live)configure# colocation WebSite-with-WebFS inf: WebSite WebFS
crm(live)configure# colocation fs_ondrbd inf: WebFS WebDataClone:Master
crm(live)configure# order WebFS-after-WebData inf: WebDataClone:promote WebFS:start
crm(live)configure# order WebSite-after-WebFS inf: WebFS WebSite
crm(live)configure# colocation WebFS-with-gfs-control INFINITY: WebFS gfs-clone
crm(live)configure# order start-WebFS-after-gfs-control mandatory: gfs-clone WebFS
crm(live)configure# commit
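WebFS was set to Stopped earlier with crm_resource; once the new definition is committed, allow it to run again by reversing that command:
[root@node1 ~]# crm_resource --resource WebFS --set-parameter target-role --meta --parameter-value Started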
7. Reconfigure Pacemaker for Active/Active:
[root@node1 ~]# crm
crm(live)# configure clone WebIP ClusterIP meta globally-unique="true" clone-max="2" clone-node-max="2"
crm(live)# configure primitive ClusterIP ocf:heartbeat:IPaddr2 params ip="192.168.1.110" cidr_netmask="32" clusterip_hash="sourceip" op monitor interval="30s" //set the ClusterIP parameters (clusterip_hash is what allows the IP to be cloned across both nodes)
crm(live)# configure clone WebFSClone WebFS
crm(live)# configure clone WebSiteClone WebSite
At the same time, change master-max to 2 in the CIB so that DRBD is promoted on both nodes.
The resource configuration is now complete.
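The capture does not show how master-max is changed; one way, assuming the crm shell's edit subcommand, is to open the WebDataClone definition, change master-max from 1 to 2, and then confirm with crm_mon that both nodes appear as Masters and that the WebIP, WebFSClone and WebSiteClone clones run on both nodes:
#crm configure edit WebDataClone    //change meta master-max="1" to master-max="2" and save
#crm_mon -1    //one-shot status; both nodes should now be listed as Masters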