-------------Building a highly available WWW cluster with corosync + pacemaker--------------------
[High-availability cluster:
A high availability cluster (HA cluster) keeps the cluster's overall service available as much as possible, reducing the losses caused by hardware and software failures. If a node in the cluster fails, its standby node takes over its duties within a very short time, so from the users' point of view the cluster never goes down. A two-node HA cluster is also called dual-machine hot standby: the two servers back each other up. HA cluster systems can also support more than two nodes.]
The rough topology of this lab: two nodes, node1 (192.168.1.101) and node2 (192.168.1.102), sharing the VIP 192.168.1.100. The OS used is Red Hat Enterprise Linux 5.4.
(1) Modify the network parameters:
node1:
[root@web1 ~]# vim /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=node1.hanyu.com
[root@web1 ~]# vim /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.1.101 node1.hanyu.com node1
192.168.1.102 node2.hanyu.com node2
[root@web1 ~]# hostname node1.hanyu.com
[root@web1 ~]# hostname
node1.hanyu.com
node2:
[root@web2 ~]# vim /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=node2.hanyu.com
[root@web2 ~]# vim /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
192.168.1.101 node1.hanyu.com node1
192.168.1.102 node2.hanyu.com node2
[root@web2 ~]# hostname node2.hanyu.com
[root@web2 ~]# hostname
node2.hanyu.com
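A quick sanity check (not in the original transcript) to confirm that both names resolve before going on:
[root@web1 ~]# ping -c 1 node2.hanyu.com
[root@web2 ~]# ping -c 1 node1.hanyu.com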
(2) Synchronize the system time on both nodes:
[root@web1 ~]# hwclock -s
[root@web2 ~]# hwclock -s
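hwclock -s only copies each node's hardware clock into its system clock; if the hardware clocks drift apart, the nodes will disagree about time. A sketch of an NTP-based alternative (the server name here is just an example):
[root@web1 ~]# ntpdate pool.ntp.org    # sync the system clock from an NTP server
[root@web1 ~]# hwclock -w              # write it back to the hardware clock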
(3) Generate keys on each node for passwordless SSH between them:
node1:
[root@node1 ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
a5:e0:b7:03:7b:95:c1:00:af:00:2a:bc:8e:b9:c1:14 root@node1.hanyu.com
This produces an RSA key pair. Now copy the public key to node2:
[root@node1 ~]# ssh-copy-id -i .ssh/id_rsa.pub node2
The authenticity of host 'node2 (192.168.1.102)' can't be established.
RSA key fingerprint is d4:f1:06:3b:a0:81:fd:85:65:20:9e:a1:ee:46:a6:8b.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node2,192.168.1.102' (RSA) to the list of known hosts.
root@node2's password:
Now try logging into the machine, with "ssh 'node2'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
node2:
[root@node2 ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
47:e6:36:40:c8:72:ba:03:85:87:d2:c3:7a:04:29:c4 root@node2.hanyu.com
[root@node2 ~]# ssh-copy-id -i .ssh/id_rsa.pub node1
ssh: node1: Temporary failure in name resolution
Name resolution for node1 failed here, so the IP address is used instead:
[root@node2 ~]# ssh-copy-id -i .ssh/id_rsa.pub 192.168.1.101
The authenticity of host '192.168.1.101 (192.168.1.101)' can't be established.
RSA key fingerprint is 78:55:3c:c8:4d:c3:73:91:dc:ae:73:c7:ab:a0:c6:d4.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.1.101' (RSA) to the list of known hosts.
[email protected]'s password:
Now try logging into the machine, with "ssh '192.168.1.101'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
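To confirm passwordless SSH now works in both directions (a quick optional check):
[root@node1 ~]# ssh node2 'hostname'            # should print node2.hanyu.com with no password prompt
[root@node2 ~]# ssh 192.168.1.101 'hostname'    # should print node1.hanyu.com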
(4) Configure the yum repositories on each node:
Add the following content to the yum repo file on both node1 and node2:
[root@node1 ~]# vim /etc/yum.repos.d/rhel-debuginfo.repo
[root@node2 ~]# vim /etc/yum.repos.d/rhel-debuginfo.repo
[rhel-server]
name=Red Hat Enterprise Linux $releasever - $basearch - Server
baseurl=file:///mnt/cdrom/Server
enabled=1
gpgcheck=1
gpgkey=file:///mnt/cdrom/RPM-GPG-KEY-redhat-release
[rhel-Cluster]
name=Red Hat Enterprise Linux $releasever - $basearch - Cluster
baseurl=file:///mnt/cdrom/Cluster
enabled=1
gpgcheck=1
gpgkey=file:///mnt/cdrom/RPM-GPG-KEY-redhat-release
[rhel-ClusterStorage]
name=Red Hat Enterprise Linux $releasever - $basearch - ClusterStorage
baseurl=file:///mnt/cdrom/ClusterStorage
enabled=1
gpgcheck=1
gpgkey=file:///mnt/cdrom/RPM-GPG-KEY-redhat-release
[rhel-VT]
name=Red Hat Enterprise Linux $releasever - $basearch - VT
baseurl=file:///mnt/cdrom/VT
enabled=1
gpgcheck=1
gpgkey=file:///mnt/cdrom/RPM-GPG-KEY-redhat-release
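The baseurl entries above assume the RHEL installation DVD is mounted at /mnt/cdrom; if it is not, mount it first on both nodes:
[root@node1 ~]# mkdir -p /mnt/cdrom
[root@node1 ~]# mount /dev/cdrom /mnt/cdrom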
(5) This lab requires the following packages:
cluster-glue-1.0.6-1.6.el5.i386.rpm
cluster-glue-libs-1.0.6-1.6.el5.i386.rpm
corosync-1.2.7-1.1.el5.i386.rpm
corosynclib-1.2.7-1.1.el5.i386.rpm
drbd83-8.3.8-1.el5.centos.i386.rpm
heartbeat-3.0.3-2.3.el5.i386.rpm
heartbeat-libs-3.0.3-2.3.el5.i386.rpm
kmod-drbd83-8.3.8-1.el5.centos.i686.rpm
ldirectord-1.0.1-1.el5.i386.rpm (used for probing real servers; not needed in this lab)
libesmtp-1.0.4-5.el5.i386.rpm
mysql-5.5.15-linux2.6-i686.tar.gz
openais-1.1.3-1.6.el5.i386.rpm
openaislib-1.1.3-1.6.el5.i386.rpm
pacemaker-1.1.5-1.1.el5.i386.rpm
pacemaker-cts-1.1.5-1.1.el5.i386.rpm
pacemaker-libs-1.1.5-1.1.el5.i386.rpm
perl-TimeDate-1.16-5.el5.noarch.rpm
resource-agents-1.0.4-1.1.el5.i386.rpm
(6) Install all the RPM packages on each node:
[root@node1 ~]# yum localinstall *.rpm -y --nogpgcheck
[root@node2 ~]# yum localinstall *.rpm -y --nogpgcheck
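An optional check that the key packages landed on both nodes:
[root@node1 ~]# rpm -q corosync pacemaker heartbeat openais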
(7) Configure each node:
node1:
[root@node1 ~]# cd /etc/corosync/
[root@node1 corosync]# cp corosync.conf.example corosync.conf
[root@node1 corosync]# vim corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank
totem {
        version: 2
        secauth: off
        threads: 0
        interface {
                ringnumber: 0
                bindnetaddr: 192.168.1.0        # network address of the NICs corosync binds to
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}
logging {
        fileline: off
        to_stderr: no
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}
amf {
        mode: disabled
}
# The service and aisexec blocks below are added to the example file:
# they make corosync start pacemaker and run the AIS executive as root.
service {
        ver: 0
        name: pacemaker
}
aisexec {
        user: root
        group: root
}
[root@node1 corosync]# mkdir /var/log/cluster
[root@node1 corosync]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Writing corosync key to /etc/corosync/authkey.
[root@node1 corosync]# ll
total 52
-rw-r--r-- 1 root root 5384 2010-07-28 amf.conf.example
-r-------- 1 root root 128 06-04 10:41 authkey
-rw-r--r-- 1 root root 546 06-04 10:35 corosync.conf
-rw-r--r-- 1 root root 436 2010-07-28 corosync.conf.example
drwxr-xr-x 2 root root 4096 2010-07-28 service.d
drwxr-xr-x 2 root root 4096 2010-07-28 uidgid.d
Copy the authkey and config file from node1 to node2 (be sure to use -p to preserve the file permissions):
[root@node1 corosync]# scp -p authkey corosync.conf node2:/etc/corosync/
authkey 100% 128 0.1KB/s 00:00
corosync.conf 100% 546 0.5KB/s 00:00
Start the corosync service on node1:
[root@node1 corosync]# service corosync start
Starting Corosync Cluster Engine (corosync): [ OK ]
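Optionally make corosync start on boot (chkconfig is the RHEL 5 mechanism; whether cluster services should auto-start is a site policy decision):
[root@node1 corosync]# chkconfig corosync on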
Verify that the corosync engine started correctly:
[root@node1 corosync]# grep -i -e "corosync cluster engine" -e "configuration file" /var/log/messages
Mar 20 23:06:16 localhost smartd[3286]: Opened configuration file /etc/smartd.conf
Mar 20 23:06:16 localhost smartd[3286]: Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
Jun 4 10:42:30 localhost corosync[23702]: [MAIN ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service.
Jun 4 10:42:30 localhost corosync[23702]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Jun 4 10:46:36 localhost corosync[23702]: [MAIN ] Corosync Cluster Engine exiting with status 0 at main.c:170.
Jun 4 10:49:07 localhost corosync[23763]: [MAIN ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service.
Jun 4 10:49:07 localhost corosync[23763]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Check that the initial membership notifications went out:
[root@node1 corosync]# grep -i totem /var/log/messages
Jun 4 10:42:30 localhost corosync[23702]: [TOTEM ] Initializing transport (UDP/IP).
Jun 4 10:42:30 localhost corosync[23702]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Jun 4 10:42:31 localhost corosync[23702]: [TOTEM ] The network interface [192.168.1.101] is now up.
Jun 4 10:42:31 localhost corosync[23702]: [TOTEM ] Process pause detected for 600 ms, flushing membership messages.
Jun 4 10:42:31 localhost corosync[23702]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Jun 4 10:49:07 localhost corosync[23763]: [TOTEM ] Initializing transport (UDP/IP).
Jun 4 10:49:07 localhost corosync[23763]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Jun 4 10:49:08 localhost corosync[23763]: [TOTEM ] The network interface [192.168.1.101] is now up.
Jun 4 10:49:08 localhost corosync[23763]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Check whether any errors occurred during startup:
[root@node1 corosync]# grep -i error: /var/log/messages |grep -v unpack_resources
No output is the best possible result.
Check pacemaker's startup status:
[root@node1 corosync]# grep -i pcmk_startup /var/log/messages
Jun 4 10:42:31 localhost corosync[23702]: [pcmk ] info: pcmk_startup: CRM: Initialized
Jun 4 10:42:31 localhost corosync[23702]: [pcmk ] Logging: Initialized pcmk_startup
Jun 4 10:42:31 localhost corosync[23702]: [pcmk ] info: pcmk_startup: Maximum core file size is: 4294967295
Jun 4 10:42:31 localhost corosync[23702]: [pcmk ] info: pcmk_startup: Service: 9
Jun 4 10:42:31 localhost corosync[23702]: [pcmk ] info: pcmk_startup: Local hostname: node1.hanyu.com
Jun 4 10:49:08 localhost corosync[23763]: [pcmk ] info: pcmk_startup: CRM: Initialized
Jun 4 10:49:08 localhost corosync[23763]: [pcmk ] Logging: Initialized pcmk_startup
Jun 4 10:49:08 localhost corosync[23763]: [pcmk ] info: pcmk_startup: Maximum core file size is: 4294967295
Jun 4 10:49:08 localhost corosync[23763]: [pcmk ] info: pcmk_startup: Service: 9
Jun 4 10:49:08 localhost corosync[23763]: [pcmk ] info: pcmk_startup: Local hostname: node1.hanyu.com
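Besides grepping the logs, the cluster state can be watched live with crm_mon, which ships with pacemaker:
[root@node1 corosync]# crm_mon -1    # print one status snapshot and exit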
node2:
[root@node2 corosync]# mkdir /var/log/cluster
[root@node2 corosync]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Writing corosync key to /etc/corosync/authkey.
(Note: running corosync-keygen again here overwrites the authkey copied from node1; with secauth off this does not matter, but with secauth on the two nodes' keys must be identical, so the regeneration should be skipped.)
Start corosync on node2 and verify that the engine is working properly:
[root@node2 corosync]# service corosync start
Starting Corosync Cluster Engine (corosync): [ OK ]
[root@node2 corosync]# grep -i -e "corosync cluster engine" -e "configuration file" /var/log/messages
Jun 4 11:02:31 Eleven corosync[6203]: [MAIN ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service.
Jun 4 11:02:31 Eleven corosync[6203]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Check that the initial membership notifications went out:
[root@node2 corosync]# grep -i totem /var/log/messages
Jun 4 11:02:31 Eleven corosync[6203]: [TOTEM ] Initializing transport (UDP/IP).
Jun 4 11:02:31 Eleven corosync[6203]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Jun 4 11:02:31 Eleven corosync[6203]: [TOTEM ] The network interface [192.168.1.102] is now up.
Jun 4 11:02:34 Eleven corosync[6203]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Check whether any errors occurred:
[root@node2 corosync]# grep -i error: /var/log/messages |grep -v unpack_resources
Check whether pacemaker has started:
[root@node2 corosync]# grep -i pcmk_startup /var/log/messages
Jun 4 11:02:32 Eleven corosync[6203]: [pcmk ] info: pcmk_startup: CRM: Initialized
Jun 4 11:02:32 Eleven corosync[6203]: [pcmk ] Logging: Initialized pcmk_startup
Jun 4 11:02:33 Eleven corosync[6203]: [pcmk ] info: pcmk_startup: Maximum core file size is: 4294967295
Jun 4 11:02:33 Eleven corosync[6203]: [pcmk ] info: pcmk_startup: Service: 9
Jun 4 11:02:33 Eleven corosync[6203]: [pcmk ] info: pcmk_startup: Local hostname: node2.hanyu.com
(8) On node1, view the cluster status:
[root@node1 corosync]# crm status
============
Last updated: Mon Jun 4 11:20:17 2012
Stack: openais
Current DC: node1.hanyu.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
0 Resources configured.
============
Online: [ node1.hanyu.com node2.hanyu.com ]
Now provide high availability. First view the current configuration:
[root@node1 corosync]# crm configure show
node node1.hanyu.com
node node2.hanyu.com
property $id="cib-bootstrap-options" \
dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2"
Check the configuration for validity errors:
[root@node1 corosync]# crm_verify -L
crm_verify[23860]: 2012/06/04_11:25:46 ERROR: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
crm_verify[23860]: 2012/06/04_11:25:46 ERROR: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
crm_verify[23860]: 2012/06/04_11:25:46 ERROR: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
-V may provide more details
The errors say that resource start-up is disabled because no STONITH device is defined. Since this lab has no fencing hardware, disable stonith:
[root@node1 corosync]# crm
crm(live)# configure
crm(live)configure# property stonith-enabled=false
crm(live)configure# commit
crm(live)configure#
Check the configuration:
crm(live)configure# show
node node1.hanyu.com
node node2.hanyu.com
property $id="cib-bootstrap-options" \
dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false"
Verify again:
[root@node1 corosync]# crm_verify -L
The system also provides a dedicated stonith command.
[stonith -L lists the available stonith device types.
crm can be used in interactive mode; run help for usage.
The configuration is saved in the CIB in XML format.
There are four cluster resource types:
primitive - a basic resource (can run on only one node)
group - puts several resources into one group for easier management
clone - a resource that must run on several nodes at the same time (e.g. ocfs2, stonith; no master/slave roles)
master - has master/slave roles, e.g. drbd]
[root@node1 corosync]# crm
crm(live)# ra
crm(live)ra# classes
heartbeat
lsb
ocf / heartbeat linbit pacemaker
stonith
crm(live)ra# list lsb
NetworkManager acpid anacron
apmd atd auditd
autofs avahi-daemon avahi-dnsconfd
bluetooth capi conman
corosync cpuspeed crond
cups cups-config-daemon dc_client
dc_server dnsmasq drbd
dund firstboot functions
gpm haldaemon halt
heartbeat hidd hplip
httpd ip6tables ipmi
iptables irda irqbalance
isdn kdump killall
krb524 kudzu lm_sensors
logd lvm2-monitor mcstrans
mdmonitor mdmpd messagebus
microcode_ctl multipathd netconsole
netfs netplugd network
nfs nfslock nscd
ntpd openais openibd
pacemaker pand pcscd
portmap psacct rawdevices
rdisc readahead_early readahead_later
restorecond rhnsd rpcgssd
rpcidmapd rpcsvcgssd saslauthd
sendmail setroubleshoot single
smartd squid sshd
syslog tux vncserver
wdaemon winbind wpa_supplicant
xfs xinetd ypbind
yum-updatesd
crm(live)ra# list ocf heartbeat
AoEtarget AudibleAlarm CTDB
ClusterMon Delay Dummy
EvmsSCC Evmsd Filesystem
ICP IPaddr IPaddr2
IPsrcaddr IPv6addr LVM
LinuxSCSI MailTo ManageRAID
ManageVE Pure-FTPd Raid1
Route SAPDatabase SAPInstance
SendArp ServeRAID SphinxSearchDaemon
Squid Stateful SysInfo
VIPArip VirtualDomain WAS
WAS6 WinPopup Xen
Xinetd anything apache
conntrackd db2 drbd
eDir88 exportfs fio
iSCSILogicalUnit iSCSITarget ids
iscsi jboss ldirectord
mysql mysql-proxy nfsserver
nginx oracle oralsnr
pgsql pingd portblock
postfix proftpd rsyncd
scsi2reservation sfex syslog-ng
tomcat vmware
(9) Configure the cluster resources:
Configure the VIP (the cluster IP resource):
crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=192.168.1.100
View the configuration:
crm(live)configure# show
node node1.hanyu.com
node node2.hanyu.com
primitive webip ocf:heartbeat:IPaddr \
params ip="192.168.1.100"
property $id="cib-bootstrap-options" \
dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false"
Commit:
crm(live)configure# commit
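ocf:heartbeat:IPaddr accepts more parameters than ip alone; a variant of the same definition that also pins the netmask and the interface (optional; cidr_netmask and nic are parameters of the IPaddr agent):
crm(live)configure# primitive webip ocf:heartbeat:IPaddr \
        params ip=192.168.1.100 cidr_netmask=24 nic=eth0
Then check where the resource is running: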
[root@node1 ~]# crm
crm(live)# status
============
Last updated: Mon Jun 4 12:01:34 2012
Stack: openais
Current DC: node1.hanyu.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node1.hanyu.com node2.hanyu.com ]
webip (ocf::heartbeat:IPaddr): Started node1.hanyu.com
On node1, check with ifconfig (the VIP shows up as eth0:0):
[root@node1 ~]# ifconfig
eth0 Link encap:Ethernet HWaddr 00:0C:29:E9:57:8C
inet addr:192.168.1.101 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fee9:578c/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:159820 errors:0 dropped:0 overruns:0 frame:0
TX packets:61874 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:177495313 (169.2 MiB) TX bytes:7395778 (7.0 MiB)
Interrupt:67 Base address:0x2000
eth0:0 Link encap:Ethernet HWaddr 00:0C:29:E9:57:8C
inet addr:192.168.1.100 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:67 Base address:0x2000
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:20040 errors:0 dropped:0 overruns:0 frame:0
TX packets:20040 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:14177363 (13.5 MiB) TX bytes:14177363 (13.5 MiB)
View the httpd resource agent's metadata:
[root@node1 ~]# crm
crm(live)# ra
crm(live)ra# meta lsb:httpd
lsb:httpd
Apache is a World Wide Web server. It is used to serve \
HTML files and CGI.
Operations' defaults (advisory minimum):
start timeout=15
stop timeout=15
status timeout=15
restart timeout=15
force-reload timeout=15
monitor interval=15 timeout=15 start-delay=15
Define the httpd resource:
[root@node1 ~]# crm
crm(live)# configure
crm(live)configure# primitive webserver lsb:httpd
crm(live)configure# show
node node1.hanyu.com
node node2.hanyu.com
primitive webip ocf:heartbeat:IPaddr \
params ip="192.168.1.100"
primitive webserver lsb:httpd
property $id="cib-bootstrap-options" \
dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false"
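Note that the webserver primitive above defines no monitor operation, so the cluster will not detect a crashed httpd on its own. A sketch of how the primitive could have been defined with monitoring instead (optional; the interval and timeout values are illustrative):
crm(live)configure# primitive webserver lsb:httpd \
        op monitor interval=30s timeout=20s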
Check the status:
crm(live)# status
============
Last updated: Mon Jun 4 12:18:23 2012
Stack: openais
Current DC: node1.hanyu.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
2 Resources configured.
============
Online: [ node1.hanyu.com node2.hanyu.com ]
webip (ocf::heartbeat:IPaddr): Started node1.hanyu.com
webserver (lsb:httpd): Started node2.hanyu.com
httpd has started, but on node2 while the VIP is on node1: by default pacemaker spreads resources across the nodes. The two resources must be constrained to the same node, so define them as a group. Check the group syntax first:
[root@node1 ~]# crm
crm(live)# configure
crm(live)configure# help group
The `group` command creates a group of resources.
Usage:
...............
group <name> <rsc> [<rsc>...]
[meta attr_list]
[params attr_list]
attr_list :: [$id=<id>] <attr>=<val> [<attr>=<val>...] | $id-ref=<id>
...............
Example:
...............
group internal_www disk0 fs0 internal_ip apache \
meta target_role=stopped
...............
crm(live)configure#
crm(live)configure# group web webip webserver
crm(live)configure# show
node node1.hanyu.com
node node2.hanyu.com
primitive webip ocf:heartbeat:IPaddr \
params ip="192.168.1.100"
primitive webserver lsb:httpd
group web webip webserver
property $id="cib-bootstrap-options" \
dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false"
[root@node1 ~]# crm
crm(live)# status
============
Last updated: Mon Jun 4 12:32:03 2012
Stack: openais
Current DC: node1.hanyu.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node1.hanyu.com node2.hanyu.com ]
Resource Group: web
webip (ocf::heartbeat:IPaddr): Started node1.hanyu.com
webserver (lsb:httpd): Started node1.hanyu.com
[root@node1 ~]# ifconfig
eth0 Link encap:Ethernet HWaddr 00:0C:29:E9:57:8C
inet addr:192.168.1.101 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fee9:578c/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:170656 errors:0 dropped:0 overruns:0 frame:0
TX packets:77145 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:178709996 (170.4 MiB) TX bytes:9320783 (8.8 MiB)
Interrupt:67 Base address:0x2000
eth0:0 Link encap:Ethernet HWaddr 00:0C:29:E9:57:8C
inet addr:192.168.1.100 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:67 Base address:0x2000
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:20048 errors:0 dropped:0 overruns:0 frame:0
TX packets:20048 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:14177971 (13.5 MiB) TX bytes:14177971 (13.5 MiB)
Test from a PC by browsing to the VIP, http://192.168.1.100.
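Or test from the command line of any host on the 192.168.1.0/24 segment (this assumes each node's httpd already serves a page, e.g. a test index.html placed in /var/www/html):
curl http://192.168.1.100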
Now simulate a failure of node1 (for example by stopping its corosync service) and watch the cluster from node2:
[root@node2 ~]# cd /etc/corosync/
[root@node2 corosync]# crm status
============
Last updated: Mon Jun 4 12:40:01 2012
Stack: openais
Current DC: node2.hanyu.com - partition WITHOUT quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node2.hanyu.com ]
OFFLINE: [ node1.hanyu.com ]
node2 reports "partition WITHOUT quorum" and the resources are not running: in a two-node cluster (expected votes = 2) the surviving node cannot reach quorum, and the default no-quorum policy stops all resources. Tell the cluster to ignore loss of quorum:
[root@node1 corosync]# crm
crm(live)# configure
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# show
node node1.hanyu.com
node node2.hanyu.com
primitive webip ocf:heartbeat:IPaddr \
params ip="192.168.1.100"
primitive webserver lsb:httpd
group web webip webserver
property $id="cib-bootstrap-options" \
dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore"
Commit and exit; then check the status on node2 again:
[root@node2 corosync]# crm status
============
Last updated: Mon Jun 4 12:48:48 2012
Stack: openais
Current DC: node2.hanyu.com - partition WITHOUT quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node2.hanyu.com ]
OFFLINE: [ node1.hanyu.com ]
Resource Group: web
webip (ocf::heartbeat:IPaddr): Started node2.hanyu.com
webserver (lsb:httpd): Started node2.hanyu.com
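Failover can also be exercised without stopping corosync; a sketch using the crm shell's node-level commands (available in this crm version):
[root@node1 ~]# crm node standby    # push this node's resources over to node2
[root@node1 ~]# crm node online     # bring node1 back as an eligible node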
[root@node1 ~]# cd /etc/corosync/
[root@node1 corosync]# crm
crm crm_diff crm_master crm_node crm_resource crm_simulate crm_uuid crmadmin
crm_attribute crm_failcount crm_mon crm_report crm_shadow crm_standby crm_verify
The tab completion above lists the available crm_* tools. The common ones:
1. crm_attribute - modifies the cluster's global properties,
for example the stonith and quorum settings used above;
it actually modifies the CIB held on the DC.
2. crm_resource - modifies resources.
3. crm_node - manages nodes.
crm_node -e shows the node's epoch (how many times the configuration has been changed).
[root@node2 corosync]# crm_node -q     (shows the current node's vote count)
1
4. cibadmin - tool for editing the cluster configuration:
-u, --upgrade Upgrade the configuration to the latest syntax
-Q, --query Query the contents of the CIB
-E, --erase Erase the contents of the whole CIB
-B, --bump Increase the CIB's epoch value by 1
If a resource was defined incorrectly, this tool can also delete it (a read-only example follows this item):
-D, --delete Delete the first object matching the supplied criteria, Eg.
The same can be done from the crm shell:
crm(live)configure# delete
usage: delete <id> [<id>...]
You can also run edit in this mode; when finished, run commit to apply the changes.
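A safe, read-only cibadmin example (a sketch; the output path is arbitrary): dump the whole CIB as XML for inspection or backup:
[root@node1 ~]# cibadmin -Q > /tmp/cib-backup.xml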
5. Cluster resource types:
crm(live)configure# help
- `primitive` - a basic resource (can run on only one node)
- `monitor`
- `group` - puts several resources into one group for easier management
- `clone` - runs on several nodes at the same time (e.g. ocfs2, stonith; no master/slave roles)
- `ms`/`master` (master-slave) - has master/slave roles, e.g. drbd
6. Resource agent (RA) classes available to the cluster:
crm(live)configure ra# classes
heartbeat
lsb
ocf / heartbeat pacemaker
stonith
crm(live)configure ra# list heartbeat
AudibleAlarm Delay Filesystem ICP IPaddr IPaddr2 IPsrcaddr IPv6addr LVM
LinuxSCSI MailTo OCF Raid1 SendArp ServeRAID WAS WinPopup Xinetd
apache db2 hto-mapfuncs ids portblock
crm(live)configure ra# list lsb
NetworkManager acpid anacron apmd atd auditd
autofs avahi-daemon avahi-dnsconfd bluetooth capi conman
corosync cpuspeed crond cups cups-config-daemon dc_client
dc_server dnsmasq dund firstboot functions gpm
haldaemon halt heartbeat hidd hplip httpd
ip6tables ipmi iptables irda irqbalance isdn
kdump killall krb524 kudzu lm_sensors logd
lvm2-monitor mcstrans mdmonitor mdmpd messagebus microcode_ctl
multipathd mysqld netconsole netfs netplugd network
nfs nfslock nscd ntpd openais openibd
pacemaker pand pcscd portmap psacct rawdevices
rdisc readahead_early readahead_later restorecond rhnsd rpcgssd
rpcidmapd rpcsvcgssd saslauthd sendmail setroubleshoot single
smartd snmpd snmptrapd sshd syslog tgtd
vncserver vsftpd wdaemon winbind wpa_supplicant xfs
xinetd ypbind yum-updatesd
crm(live)configure ra# list ocf
AoEtarget AudibleAlarm CTDB ClusterMon Delay Dummy
EvmsSCC Evmsd Filesystem HealthCPU HealthSMART ICP
IPaddr IPaddr2 IPsrcaddr IPv6addr LVM LinuxSCSI
MailTo ManageRAID ManageVE Pure-FTPd Raid1 Route
SAPDatabase SAPInstance SendArp ServeRAID SphinxSearchDaemon Squid
Stateful SysInfo SystemHealth VIPArip VirtualDomain WAS
WAS6 WinPopup Xen Xinetd anything apache
conntrackd controld db2 drbd eDir88 exportfs
fio iSCSILogicalUnit iSCSITarget ids iscsi jboss
ldirectord mysql mysql-proxy nfsserver nginx o2cb
oracle oralsnr pgsql ping pingd portblock
postfix proftpd rsyncd scsi2reservation sfex syslog-ng
tomcat vmware
crm(live)configure ra# list stonith
apcmaster apcmastersnmp apcsmart baytech bladehpi
cyclades external/drac5 external/dracmc-telnet external/hmchttp external/ibmrsa
external/ibmrsa-telnet external/ipmi external/ippower9258 external/kdumpcheck external/rackpdu
external/riloe external/sbd external/vmware external/xen0 external/xen0-ha
fence_legacy ibmhmc ipmilan meatware nw_rpc100s
rcd_serial rps10 suicide wti_mpc