1. Lab environment:

Node1: 192.168.1.17 (RHEL 5.8 32-bit, web server)

Node2: 192.168.1.18 (RHEL 5.8 32-bit, web server)

NFS: 192.168.1.19 (RHEL 5.8 32-bit, NFS server)

VIP: 192.168.1.20 (webip)


2. Preparation

<1> Configure the host names

Node names are resolved through /etc/hosts, and each node name must match the output of uname -n on that node.

Node1:

# hostname node1.ikki.com
# vim /etc/sysconfig/network
HOSTNAME=node1.ikki.com

Node2:

# hostname node2.ikki.com
# vim /etc/sysconfig/network
HOSTNAME=node2.ikki.com
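
The requirement that the node name matches uname -n can be checked with a few lines of shell. This is only a sketch: the function name check_nodename is a hypothetical helper, not part of corosync or pacemaker.

```shell
#!/bin/sh
# check_nodename EXPECTED
# Hypothetical helper: succeeds only if `uname -n` reports EXPECTED.
check_nodename() {
    actual=$(uname -n)
    if [ "$actual" = "$1" ]; then
        echo "OK: node name is $actual"
    else
        echo "MISMATCH: uname -n reports $actual, expected $1" >&2
        return 1
    fi
}

# Example call on Node1:
#   check_nodename node1.ikki.com
```

Running it on each node with the name used in /etc/hosts catches a mistyped hostname before the cluster is started.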

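<2> Configure key-based SSH authentication between the nodes (the short names node1 and node2 rely on the /etc/hosts entries from step <3>)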

Node1:

# ssh-keygen -t rsa
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node2

Node2:

# ssh-keygen -t rsa
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@node1

<3> Make the nodes reachable from each other by host name

Node1&Node2:

# vim /etc/hosts
192.168.1.17   node1.ikki.com node1
192.168.1.18   node2.ikki.com node2
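
To confirm that both the full and the short node names are present in a hosts file, a lookup along these lines may help. It is a minimal sketch; lookup_host is a hypothetical helper that only reads the file and does not query DNS.

```shell
#!/bin/sh
# lookup_host FILE NAME
# Hypothetical helper: print the address that FILE (hosts format) maps NAME to.
lookup_host() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }          # skip comment lines
        {
            # field 1 is the address, the remaining fields are names/aliases
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }
    ' "$1"
}

# Example call:
#   lookup_host /etc/hosts node2
```

An empty result means the name is missing from the file.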

<4> Synchronize the clocks on all nodes

Node1&Node2:

# crontab -e
*/5 * * * *     /sbin/ntpdate 202.120.2.101 > /dev/null 2>&1


3. Install corosync and pacemaker (on every node)

<1> Required dependency RPM packages:

libibverbs, librdmacm, lm_sensors, libtool-ltdl, openhpi-libs, openib, perl-TimeDate, libnes

<2> Download the packages into a dedicated local directory (for example /root/cluster):

# cd /root/cluster
# ls
cluster-glue-1.0.6-1.6.el5.i386.rpm
cluster-glue-libs-1.0.6-1.6.el5.i386.rpm
corosync-1.2.7-1.1.el5.i386.rpm
corosynclib-1.2.7-1.1.el5.i386.rpm
heartbeat-3.0.3-2.3.el5.i386.rpm
heartbeat-libs-3.0.3-2.3.el5.i386.rpm
libesmtp-1.0.4-5.el5.i386.rpm
pacemaker-1.1.5-1.1.el5.i386.rpm
pacemaker-libs-1.1.5-1.1.el5.i386.rpm
resource-agents-1.0.4-1.1.el5.i386.rpm

<3> Install the local packages together with their dependencies:

# cd /root/cluster
# yum -y --nogpgcheck localinstall *.rpm


4. Configure corosync

Node1:

# cd /etc/corosync
# cp corosync.conf.example corosync.conf
# vim corosync.conf
# Add the following:
service {
  ver:  0
  name: pacemaker
  # use_mgmtd: yes
}
aisexec {
  user: root
  group:  root
}
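# vim corosync.conf
# Change the following settings:
bindnetaddr: 192.168.1.0    # network address of the network the NIC is on
secauth: on                 # enable authentication
to_syslog: no               # do not log to syslog (a separate logfile is used)
threads: 2                  # number of threads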
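
The bindnetaddr value is the network address, that is, the node's IP address ANDed with its netmask. The sketch below computes it in plain shell; network_addr is a hypothetical helper name, and the 255.255.255.0 netmask is the one shown later in the ifconfig output.

```shell
#!/bin/sh
# network_addr IP NETMASK
# Hypothetical helper: print the network address (bitwise AND of each octet).
network_addr() {
    old_ifs=$IFS
    IFS=.
    set -- $1 $2          # split both dotted quads into 8 positional parameters
    IFS=$old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

network_addr 192.168.1.17 255.255.255.0   # prints 192.168.1.0
```

Both nodes sit on the same network, so the same value works in the corosync.conf that is copied to Node2.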

Generate the authentication key used for communication between the nodes:

# corosync-keygen

Copy corosync.conf and authkey to Node2:

# scp -p corosync.conf authkey  node2:/etc/corosync/

Create the directory for the corosync log files on both nodes:

# mkdir /var/log/cluster
# ssh node2  'mkdir /var/log/cluster'


5. Start the service and verify it

Node1:

# /etc/init.d/corosync start

Check that the corosync engine started correctly:

# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
Sep 16 18:59:29 corosync [MAIN  ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service.
Sep 16 18:59:29 corosync [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Sep 16 19:28:26 corosync [MAIN  ] Corosync Cluster Engine exiting with status 0 at main.c:170.
Sep 16 19:54:14 corosync [MAIN  ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service.
Sep 16 19:54:14 corosync [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.

Check that the initial membership notifications were sent:

# grep  TOTEM  /var/log/cluster/corosync.log
Sep 16 18:59:29 corosync [TOTEM ] Initializing transport (UDP/IP).
Sep 16 18:59:29 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Sep 16 18:59:29 corosync [TOTEM ] The network interface [192.168.1.17] is now up.
Sep 16 18:59:29 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.

Check whether any errors occurred during startup:

# grep ERROR: /var/log/cluster/corosync.log | grep -v unpack_resources

Check that pacemaker started correctly:

# grep pcmk_startup /var/log/cluster/corosync.log
Sep 16 18:59:29 corosync [pcmk  ] info: pcmk_startup: CRM: Initialized
Sep 16 18:59:29 corosync [pcmk  ] Logging: Initialized pcmk_startup
Sep 16 18:59:29 corosync [pcmk  ] info: pcmk_startup: Maximum core file size is: 4294967295
Sep 16 18:59:29 corosync [pcmk  ] info: pcmk_startup: Service: 9
Sep 16 18:59:29 corosync [pcmk  ] info: pcmk_startup: Local hostname: node1.ikki.com
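
The four log checks above can be bundled into one function. This is a sketch only: check_corosync_log is a hypothetical helper, and the patterns are taken from the log lines shown in this section, so they may need adjusting for other corosync versions.

```shell
#!/bin/sh
# check_corosync_log FILE
# Hypothetical helper: run the four greps from this section against FILE
# and return non-zero if any check fails.
check_corosync_log() {
    log=$1
    rc=0
    grep -q "Corosync Cluster Engine.*started and ready" "$log" \
        || { echo "engine start message not found"; rc=1; }
    grep -q "TOTEM.*new membership was formed" "$log" \
        || { echo "no membership notification found"; rc=1; }
    grep -q "pcmk_startup: CRM: Initialized" "$log" \
        || { echo "pacemaker startup message not found"; rc=1; }
    # Same filter as above: ERROR lines, ignoring unpack_resources
    if grep "ERROR:" "$log" | grep -qv unpack_resources; then
        echo "ERROR lines found"
        rc=1
    fi
    return $rc
}

# Example call:
#   check_corosync_log /var/log/cluster/corosync.log
```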

If all the checks above pass, corosync can be started on Node2. Start it remotely from Node1 rather than directly on Node2:

# ssh node2 -- /etc/init.d/corosync start

Check the status of the cluster nodes:

# crm status
============
Last updated: Tue Sep 17 23:39:11 2013
Stack: openais
Current DC: node1.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
0 Resources configured.
============
Online: [ node1.ikki.com node2.ikki.com ]
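
For scripted checks, the Online line can be extracted from the status text. A minimal sketch, assuming the output format shown above (pacemaker 1.1.5); online_nodes is a hypothetical helper that reads the status text on standard input.

```shell
#!/bin/sh
# online_nodes
# Hypothetical helper: read `crm status` text on stdin and print one
# online node name per line.
online_nodes() {
    sed -n 's/^Online: \[ \(.*\) \]$/\1/p' | tr ' ' '\n'
}

# Example call on a cluster node:
#   crm status | online_nodes
```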

List the processes started by corosync:

# ps auxf
root     13200  0.6  0.7  86880  3952 ?        Ssl  12:29   4:06 corosync
root     13208  0.0  0.4  11724  2104 ?        S    12:29   0:00  \_ /usr/lib/heartbeat/stonithd
101      13209  0.0  0.7  12872  3820 ?        S    12:29   0:01  \_ /usr/lib/heartbeat/cib
root     13210  0.0  0.4   6572  2156 ?        S    12:29   0:00  \_ /usr/lib/heartbeat/lrmd
101      13211  0.0  0.3  12060  2040 ?        S    12:29   0:00  \_ /usr/lib/heartbeat/attrd
101      13212  0.0  0.5   8836  2900 ?        S    12:29   0:00  \_ /usr/lib/heartbeat/pengine
101      13213  0.0  0.6  12280  3112 ?        S    12:29   0:02  \_ /usr/lib/heartbeat/crmd


6. Disable STONITH for the cluster

Pacemaker enables STONITH by default, but this lab environment has no STONITH device, so STONITH has to be disabled:

# crm configure property stonith-enabled=false

Show the current configuration:

# crm configure show
node node1.ikki.com
node node2.ikki.com
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false"


7. Add an IP address resource (webip) to the cluster:

Node1:

# crm configure primitive webip ocf:heartbeat:IPaddr params ip=192.168.1.20

Check that the resource has started:

# crm status
============
Last updated: Tue Sep 17 23:48:10 2013
Stack: openais
Current DC: node1.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node1.ikki.com node2.ikki.com ]
 webip  (ocf::heartbeat:IPaddr):        Started node1.ikki.com

Check that webip is active:

# ifconfig
eth0:0    Link encap:Ethernet  HWaddr 08:00:27:F1:60:13
          inet addr:192.168.1.20  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
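
The same check can be scripted. The sketch below assumes the older net-tools ifconfig output format shown above ("inet addr:"); has_ip is a hypothetical helper that reads ifconfig output on standard input.

```shell
#!/bin/sh
# has_ip ADDRESS
# Hypothetical helper: succeed if the ifconfig text on stdin contains ADDRESS.
# The trailing space in the pattern keeps 192.168.1.2 from matching 192.168.1.20.
has_ip() {
    grep -qF "inet addr:$1 "
}

# Example call on the node that should hold the VIP:
#   ifconfig | has_ip 192.168.1.20 && echo "webip is up"
```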


8. Configure the cluster to ignore loss of quorum

Node2:

Stop the corosync service on Node1:

# ssh node1 -- /etc/init.d/corosync stop

Check the cluster status:

# crm status
============
Last updated: Tue Sep 17 23:49:41 2013
Stack: openais
Current DC: node2.ikki.com - partition WITHOUT quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node2.ikki.com ]
OFFLINE: [ node1.ikki.com ]

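In a two-node cluster, quorum cannot work as intended: when Node1 goes offline, Node2 holds only 1 of the 2 expected votes, loses quorum, and the webip resource is not moved to it. The cluster therefore has to be told to ignore loss of quorum: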

# crm configure property no-quorum-policy=ignore
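
The arithmetic behind this is a simple majority test. The sketch below assumes that quorum requires strictly more than half of the expected votes; has_quorum is a hypothetical helper, not a pacemaker command.

```shell
#!/bin/sh
# has_quorum VOTES EXPECTED
# Hypothetical helper: succeed if VOTES is a strict majority of EXPECTED.
has_quorum() {
    [ $(( $1 * 2 )) -gt "$2" ]
}

# One surviving node out of two expected votes: 2 is not greater than 2.
if has_quorum 1 2; then echo "quorum"; else echo "no quorum"; fi   # prints "no quorum"
```

With three nodes, two survivors would still pass the test, which is why the problem is specific to two-node clusters.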

Check the cluster status again:

# crm status
============
Last updated: Tue Sep 17 23:51:27 2013
Stack: openais
Current DC: node2.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node1.ikki.com node2.ikki.com ]
 webip  (ocf::heartbeat:IPaddr):        Started node2.ikki.com

Start the corosync service on Node1 again:

# ssh node1 -- /etc/init.d/corosync start

Set a default stickiness value for resources:

# crm configure rsc_defaults resource-stickiness=100


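9. Configure a highly available web cluster in active/passive mode

<1> Install httpd on every node and provide a test page. Because the cluster will start and stop httpd, keep it from starting at boot (chkconfig httpd off).

<2> Add the web service resource (httpd) to the cluster:

# crm configure primitive httpd lsb:httpd

Check that the resource has started: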

# crm status
============
Last updated: Tue Sep 17 23:54:36 2013
Stack: openais
Current DC: node2.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
2 Resources configured.
============
Online: [ node1.ikki.com node2.ikki.com ]
 webip  (ocf::heartbeat:IPaddr):        Started node1.ikki.com
 httpd  (lsb:httpd):    Started node2.ikki.com
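
In the status above, webip runs on node1 while httpd runs on node2, which is why constraints are needed. A sketch for reading the placement out of the status text follows; resource_node is a hypothetical helper and assumes the line format shown above.

```shell
#!/bin/sh
# resource_node RESOURCE
# Hypothetical helper: read `crm status` text on stdin and print the node
# on which RESOURCE is reported as Started.
resource_node() {
    awk -v r="$1" '$1 == r && $(NF-1) == "Started" { print $NF }'
}

# Example call on a cluster node:
#   crm status | resource_node webip
```

Comparing the result for webip and httpd shows whether the two resources ended up on the same node.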

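<3> Configure the colocation constraint (httpd must run on the same node as webip):

# crm configure colocation httpd_with_webip inf: httpd webip

<4> Configure the order constraint (start order: webip, then httpd):

# crm configure order httpd_after_ip mandatory: webip httpd

<5> Configure the location constraint (httpd prefers node1):

# crm configure location prefer_node1 httpd rule 200: #uname eq node1.ikki.com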


10. Set up the NFS server

NFS:

# mkdir -p /web/htdocs
# vim /etc/exports
/web/htdocs     192.168.1.0/24(ro)
# exportfs -rav


11. Add the NFS-backed webstore resource to the cluster and configure its constraints

Node1:

<1> Add the webstore resource

# crm configure primitive webstore ocf:heartbeat:Filesystem params device=192.168.1.19:/web/htdocs directory=/var/www/html fstype=nfs op start timeout=60 op stop timeout=60

<2> Configure the colocation constraint

# crm configure colocation httpd_with_webstore inf: httpd webstore

<3> Configure the order constraint

# crm configure order webstore_before_httpd mandatory: webstore httpd

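<4> Replace the webip/httpd order constraint (using the interactive crm shell)

# crm configure
crm(live)configure# edit
(in the editor, delete the previously defined constraint "order httpd_after_ip inf: webip httpd")
crm(live)configure# order webstore_after_ip inf: webip webstore
crm(live)configure# verify
crm(live)configure# commit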
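
With webip ordered before webstore and webstore before httpd, the order constraints form a single chain. The sketch below derives the resulting start order from the order lines of a configuration listing using the standard tsort utility; start_order is a hypothetical helper and assumes the "order NAME SCORE: FIRST THEN" layout used in this article.

```shell
#!/bin/sh
# start_order
# Hypothetical helper: read `crm configure show` text on stdin, take the
# "first then" pair of every order constraint, and sort topologically.
start_order() {
    awk '$1 == "order" { print $4, $5 }' | tsort
}

printf '%s\n' \
    'order webstore_after_ip inf: webip webstore' \
    'order webstore_before_httpd inf: webstore httpd' | start_order
# prints webip, webstore, httpd (one per line)
```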


12. Overview of the cluster configuration and resource status

<1> Show the cluster configuration

# crm configure show
node node1.ikki.com \
        attributes standby="off"
node node2.ikki.com
primitive httpd lsb:httpd \
        meta target-role="Started"
primitive webip ocf:heartbeat:IPaddr \
        params ip="192.168.1.20" \
        meta target-role="Started"
primitive webstore ocf:heartbeat:Filesystem \
        params device="192.168.1.19:/web/htdocs" directory="/var/www/html" fstype="nfs" \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="60" \
        meta target-role="Started"
location prefer_node1 httpd \
        rule $id="prefer_node1-rule" 200: #uname eq node1.ikki.com
colocation httpd_with_webip inf: httpd webip
colocation httpd_with_webstore inf: httpd webstore
order webstore_after_ip inf: webip webstore
order webstore_before_httpd inf: webstore httpd
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1379355508"

<2> Show the resource status

# crm status
============
Last updated: Tue Sep 17 23:58:35 2013
Stack: openais
Current DC: node2.ikki.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
3 Resources configured.
============
Online: [ node1.ikki.com node2.ikki.com ]
 webip  (ocf::heartbeat:IPaddr):        Started node1.ikki.com
 httpd  (lsb:httpd):    Started node1.ikki.com
 webstore       (ocf::heartbeat:Filesystem):    Started node1.ikki.com