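This experiment uses Corosync and NFS to make a LAMP stack highly available: when one node goes down, the other takes over.

Much of the underlying theory is covered in the earlier post "HA Cluster: heartbeat v2 configured with crm", so it is not repeated here; we go straight to the configuration.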

I. Prepare the environment

Server             IP               Hostname
httpd+php+mysql    192.168.0.111    node1.soul.com
httpd+php+mysql    192.168.0.112    node2.soul.com
NFS                192.168.0.113    nfs.soul.com
VIP                192.168.0.222
  • Synchronize time

# For convenience a separate ansible machine is used here; it is not required for the experiment
[root@ansible ~]# ansible nodes -a "date"
node1.soul.com | success | rc=0 >>
Wed Apr 23 09:36:53 CST 2014
node2.soul.com | success | rc=0 >>
Wed Apr 23 09:36:53 CST 2014
nfs.soul.com | success | rc=0 >>
Wed Apr 23 09:36:53 CST 2014
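
Comparing the three timestamps by eye is fine for a lab. As a minimal sketch, the helper below (`max_skew` is a hypothetical name, not part of ansible) takes epoch seconds gathered from each host, for example with `date +%s`, and prints the largest difference so it can be compared against whatever tolerance you choose.

```shell
# max_skew T1 T2 ...: print the difference between the largest and
# smallest epoch timestamp passed as arguments (hypothetical helper).
max_skew() {
    local min=$1 max=$1 t
    for t in "$@"; do
        if [ "$t" -lt "$min" ]; then min=$t; fi
        if [ "$t" -gt "$max" ]; then max=$t; fi
    done
    echo $(( max - min ))
}

# Made-up values standing in for `date +%s` on the three hosts:
max_skew 1398217013 1398217013 1398217015   # prints 2
```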
  • Install the software on the corresponding machines

[root@node1 ~]# rpm -q httpd php
httpd-2.2.15-29.el6.centos.x86_64
php-5.3.3-26.el6.x86_64

[root@node2 ~]# rpm -q httpd php
httpd-2.2.15-29.el6.centos.x86_64
php-5.3.3-26.el6.x86_64

# Required on both machines
[root@node1 ~]# chkconfig httpd off
[root@node1 ~]# chkconfig --list httpd
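httpd           0:off   1:off   2:off   3:off   4:off   5:off   6:off

# Install mysql on both machines
# Note: if the database is initialized on node1, do not initialize it again on node2
# The database data directory must be on the NFS share, and NFS must be mounted before initializing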
[root@node1 ~]# mount
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
192.168.0.113:/webstore on /share type nfs (rw,vers=4,addr=192.168.0.113,clientaddr=192.168.0.111)
# As shown above; both nodes mount the share the same way
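
The "initialize only once" rule can be enforced with a small guard. This is a sketch under assumptions: `init_datadir_once` is a hypothetical helper, and `/share/data` is a made-up path, since the original does not name the data directory. `mysql_install_db` is the initialization tool shipped with MySQL 5.x.

```shell
# init_datadir_once DIR CMD...: run CMD only if DIR does not already
# contain the "mysql" system database (hypothetical helper).
init_datadir_once() {
    local datadir=$1
    shift
    if [ -d "$datadir/mysql" ]; then
        echo "already initialized: $datadir"
        return 0
    fi
    "$@"
}

# Assumed usage on node1, after /share is mounted (path is an assumption):
# init_datadir_once /share/data mysql_install_db --user=mysql --datadir=/share/data
```

Running the same line on node2 would then print the "already initialized" message instead of touching the shared data.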
  • Install NFS

[root@nfs ~]# service nfs start
Starting NFS services:                                     [  OK  ]
Starting NFS quotas:                                       [  OK  ]
Starting NFS mountd:                                       [  OK  ]
Starting NFS daemon:                                       [  OK  ]
Starting RPC idmapd:                                       [  OK  ]
[root@nfs ~]#
[root@nfs ~]# exportfs -v
/webstore       192.168.0.111(rw,wdelay,no_root_squash,no_subtree_check)
/webstore       192.168.0.112(rw,wdelay,no_root_squash,no_subtree_check)

# Check from the httpd servers
[root@node1 ~]# showmount -e 192.168.0.113
Export list for 192.168.0.113:
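/webstore 192.168.0.112,192.168.0.111

# Once every service has been set up and tested, stop them all and disable them at boot.
# The NFS service itself must start at boot; otherwise the share cannot be mounted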
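
The original does not show `/etc/exports` itself. The sketch below reconstructs a plausible entry from the `exportfs -v` output above; only `rw` and `no_root_squash` are written out, on the assumption that the other options shown are defaults on this system.

```shell
# Run on nfs.soul.com. Assumed /etc/exports entry, reconstructed from
# the `exportfs -v` output (not shown in the original):
#   /webstore 192.168.0.111(rw,no_root_squash) 192.168.0.112(rw,no_root_squash)

exportfs -ra        # re-read /etc/exports and re-export everything
exportfs -v         # confirm both clients are listed
chkconfig nfs on    # start NFS at boot, as the note above requires
```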

II. Install and configure corosync and pacemaker

# The steps here are performed on node1
[root@node1 ~]# rpm -q corosync pacemaker
corosync-1.4.1-17.el6.x86_64
pacemaker-1.1.10-14.el6.x86_64

# Configure corosync
[root@node1 ~]# cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
[root@node1 ~]# vim /etc/corosync/corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank
totem {
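        version: 2            # version number
        secauth: on           # enable authentication
        threads: 0            # number of parallel threads used for authentication
        interface {
                ringnumber: 0    # ring number
                bindnetaddr: 192.168.0.0    # network to bind to
                mcastaddr: 226.94.40.1      # multicast address
                mcastport: 5405             # multicast port
                ttl: 1                      # multicast TTL (hop limit)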
        }
}
logging {
        fileline: off
        to_stderr: no                # log to standard error
        to_logfile: yes              # log to a file
        to_syslog: no                # send to syslog
        logfile: /var/log/cluster/corosync.log # log file path
        debug: off                   # enable debug output
        timestamp: on                # timestamp log entries
        logger_subsys {
                subsys: AMF
                debug: off
        }
}
amf {
        mode: disabled
}
service {
        ver:    0                # version number
        name:   pacemaker        # start pacemaker automatically with corosync
}
aisexec {
        user:   root            # user to run as
        group:  root
}
  • Generate the authentication key

[root@node1 corosync]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Writing corosync key to /etc/corosync/authkey.
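# If the system runs short of entropy you may be asked to type on the keyboard; either type, or use a pseudo-random source instead
# Typing is recommended: it keeps the key secure and gives you a little exercise too

# Copy authkey and corosync.conf to node2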
[root@node1 corosync]# ls
authkey        corosync.conf.example       service.d
corosync.conf  corosync.conf.example.udpu  uidgid.d
#
[root@node1 corosync]# scp -p authkey corosync.conf node2:/etc/corosync/
authkey                                                     100%  128     0.1KB/s   00:00
corosync.conf                                               100%  520     0.5KB/s   00:00
[root@node1 corosync]#
# Mind the file permissions
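
On the permissions note: `corosync-keygen` normally leaves the key readable by root only (mode 400), and `scp -p` preserves the mode. A minimal check, assuming GNU `stat` as found on CentOS; `key_mode_ok` is a hypothetical helper name.

```shell
# key_mode_ok FILE: succeed only if FILE has permission bits 400.
# Relies on GNU stat's -c option.
key_mode_ok() {
    [ "$(stat -c '%a' "$1")" = "400" ]
}

# Assumed usage on node2 after the copy:
# key_mode_ok /etc/corosync/authkey || chmod 400 /etc/corosync/authkey
```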
  • Start and test

[root@node1 ~]# service corosync start
Starting Corosync Cluster Engine (corosync):               [  OK  ]
[root@node1 ~]# ssh node2 'service corosync start'
Starting Corosync Cluster Engine (corosync): [  OK  ]
[root@node1 ~]#

# Verify that corosync started correctly
[root@node1 ~]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
Apr 23 11:48:40 corosync [MAIN  ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
Apr 23 11:48:40 corosync [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.

# Verify that the initial membership notifications were sent
[root@node1 ~]# grep TOTEM /var/log/cluster/corosync.log
Apr 23 11:48:40 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
Apr 23 11:48:40 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Apr 23 11:48:40 corosync [TOTEM ] The network interface [192.168.0.111] is now up.
Apr 23 11:48:41 corosync [TOTEM ] Process pause detected for 879 ms, flushing membership messages.
Apr 23 11:48:41 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Apr 23 11:48:53 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.

# Check that pacemaker started correctly
[root@node1 ~]# grep pcmk_startup /var/log/cluster/corosync.log
Apr 23 11:48:40 corosync [pcmk  ] info: pcmk_startup: CRM: Initialized
Apr 23 11:48:40 corosync [pcmk  ] Logging: Initialized pcmk_startup
Apr 23 11:48:40 corosync [pcmk  ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
Apr 23 11:48:40 corosync [pcmk  ] info: pcmk_startup: Service: 9
Apr 23 11:48:40 corosync [pcmk  ] info: pcmk_startup: Local hostname: node1.soul.com
[root@node1 ~]#
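
The three greps above can be folded into one check. This is only a sketch: `corosync_log_ok` is a hypothetical helper, and finding the markers shows that startup messages were logged, not that the cluster is healthy.

```shell
# corosync_log_ok LOGFILE: succeed if the three markers grepped for
# above all appear in LOGFILE (hypothetical helper).
corosync_log_ok() {
    local log=$1
    grep -q "Corosync Cluster Engine" "$log" &&
    grep -q "TOTEM" "$log" &&
    grep -q "pcmk_startup" "$log"
}

# Assumed usage:
# corosync_log_ok /var/log/cluster/corosync.log && echo "startup markers found"
```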
  • Install the crmsh and pssh packages

[root@node1 ~]# scp -p pssh-2.3.1-2.el6.x86_64.rpm crmsh-1.2.6-4.el6.x86_64.rpm node2:/root
pssh-2.3.1-2.el6.x86_64.rpm                                 100%   49KB  48.8KB/s   00:00
crmsh-1.2.6-4.el6.x86_64.rpm                                100%  484KB 483.7KB/s   00:00
[root@node1 ~]#

[root@node1 ~]# yum -y install crmsh-1.2.6-4.el6.x86_64.rpm pssh-2.3.1-2.el6.x86_64.rpm

[root@node2 ~]# yum -y install crmsh-1.2.6-4.el6.x86_64.rpm pssh-2.3.1-2.el6.x86_64.rpm

# After installation the crm command can be used to check the cluster
[root@node1 ~]# crm status
Last updated: Wed Apr 23 11:57:29 2014
Last change: Wed Apr 23 11:49:04 2014 via crmd on node2.soul.com
Stack: classic openais (with plugin)
Current DC: node2.soul.com - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
0 Resources configured
Online: [ node1.soul.com node2.soul.com ]

# crm usage:
[root@node1 ~]# crm
crm(live)# help
# There it is; it cannot be explained in one sentence and takes time to explore
This is crm shell, a Pacemaker command line interface.
Available commands:
    cib              manage shadow CIBs
    resource         resources management
    configure        CRM cluster configuration
    node             nodes management
    options          user preferences
    history          CRM cluster history
    site             Geo-cluster support
    ra               resource agents information center
    status           show cluster status
    help,?           show help (help topics for list of topics)
    end,cd,up        go back one level
    quit,bye,exit    exit the program

With everything ready, the next step is configuring the resources, which is also the most tedious part.

III. Configure the high-availability cluster resources

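First, a simple plan: which resources to configure, and in what order, since the order matters later at startup:

1. Configure the VIP

2. Configure the NFS shared storage

3. Configure the httpd service

4. Configure the mysql service

5. Configure a resource group and add the resources above to it

The commands and resource agents needed are roughly as follows: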

# Resource classes:
crm(live)# ra
crm(live)ra# classes
lsb
ocf / heartbeat pacemaker
service
stonith
crm(live)ra#

# Resource agents; the other classes can be listed the same way
crm(live)ra# list lsb
NetworkManager    abrt-ccpp         abrt-oops         abrtd             acpid
atd               auditd            autofs            blk-availability  bluetooth
corosync          corosync-notifyd  cpuspeed          crond             cups
dnsmasq           firstboot         haldaemon         halt              htcacheclean
httpd             ip6tables         iptables          irqbalance        kdump
killall           libvirt-guests    lvm2-lvmetad      lvm2-monitor      mdmonitor
messagebus        mysqld            netconsole        netfs             network
nfs               nfslock           ntpd              ntpdate           pacemaker
php-fpm           portreserve       postfix           psacct            quota_nld
rdisc             restorecond       rngd              rpcbind           rpcgssd
rpcidmapd         rpcsvcgssd        rsyslog           sandbox           saslauthd
single            smartd            spice-vdagentd    sshd              svnserve
sysstat           udev-post         wdaemon           winbind           wpa_supplicant

# Detailed information
crm(live)ra# info lsb:nfs
lsb:nfs
NFS is a popular protocol for file sharing across networks.
               This service provides NFS server functionality, which is \
               configured via the /etc/exports file.
Operations' defaults (advisory minimum):
    start         timeout=15
    stop          timeout=15
    status        timeout=15
    restart       timeout=15
    force-reload  timeout=15
    monitor       timeout=15 interval=15

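With that overview, add the resources:

# First set a few global properties
# Disable stonith, since no stonith device is available here
crm(live)configure# property stonith-enabled=false
crm(live)configure# verify    # validate

# Ignore loss of quorum instead of acting on it
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# verify
crm(live)configure# commit     # commit once verified

# Show the configuration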
crm(live)configure# show
node node1.soul.com
node node2.soul.com
property $id="cib-bootstrap-options" \
    dc-version="1.1.10-14.el6-368c726" \
    cluster-infrastructure="classic openais (with plugin)" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    no-quorum-policy="ignore"
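
Why `no-quorum-policy=ignore` matters with two nodes: a partition has quorum only when it holds more than half of the expected votes, and Pacemaker's default policy is to stop resources when quorum is lost. The arithmetic can be sketched as follows (an illustration only, not Pacemaker code; `has_quorum` is a hypothetical name).

```shell
# has_quorum VOTES EXPECTED: succeed if VOTES is a strict majority
# of EXPECTED.
has_quorum() {
    [ $(( $1 * 2 )) -gt "$2" ]
}

has_quorum 2 2 && echo "both nodes up: quorum"
has_quorum 1 2 || echo "one node left: no quorum"
```

With 2 expected votes, the surviving node alone never has a majority, so without `ignore` it would not take over the resources.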
  • Configure the VIP

crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=192.168.0.222 op monitor interval=30s timeout=30s on-fail=restart
# The parameters can be looked up with help
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show
node node1.soul.com
node node2.soul.com
primitive webip ocf:heartbeat:IPaddr \
    params ip="192.168.0.222" \
    op monitor interval="30s" timeout="30s" on-fail="restart"
property $id="cib-bootstrap-options" \
    dc-version="1.1.10-14.el6-368c726" \
    cluster-infrastructure="classic openais (with plugin)" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    no-quorum-policy="ignore"

# After commit the status can be checked
crm(live)# status
Last updated: Wed Apr 23 12:26:20 2014
Last change: Wed Apr 23 12:25:22 2014 via cibadmin on node1.soul.com
Stack: classic openais (with plugin)
Current DC: node2.soul.com - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
1 Resources configured
Online: [ node1.soul.com node2.soul.com ]
 webip  (ocf::heartbeat:IPaddr):    Started node1.soul.com
  • Configure the NFS shared storage

crm(live)configure# primitive webstore ocf:heartbeat:Filesystem \
params device="192.168.0.113:/webstore" \
directory="/share" fstype="nfs" \
op monitor interval=40s timeout=40s \
op start timeout=60s op stop timeout=60s
crm(live)configure# verify
crm(live)configure# show
node node1.soul.com
node node2.soul.com
primitive webip ocf:heartbeat:IPaddr \
    params ip="192.168.0.222" \
    op monitor interval="30s" timeout="30s" on-fail="restart"
primitive webstore ocf:heartbeat:Filesystem \
    params device="192.168.0.113:/webstore" directory="/share" fstype="nfs" \
    op monitor interval="40s" timeout="40s" \
    op start timeout="60s" interval="0" \
    op stop timeout="60s" interval="0"
property $id="cib-bootstrap-options" \
    dc-version="1.1.10-14.el6-368c726" \
    cluster-infrastructure="classic openais (with plugin)" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    no-quorum-policy="ignore"
  • Configure the httpd service

crm(live)configure# primitive webserver lsb:httpd op monitor interval=30s timeout=30s on-fail=restart
crm(live)configure# verify
crm(live)configure# commit
crm(live)# status
Last updated: Wed Apr 23 13:20:54 2014
Last change: Wed Apr 23 13:20:46 2014 via crmd on node2.soul.com
Stack: classic openais (with plugin)
Current DC: node2.soul.com - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
3 Resources configured
Online: [ node1.soul.com node2.soul.com ]
 webip  (ocf::heartbeat:IPaddr):    Started node1.soul.com
 webstore   (ocf::heartbeat:Filesystem):    Started node2.soul.com
 webserver  (lsb:httpd):    Started node1.soul.com
  • Configure the mysql service

crm(live)configure# primitive webdb lsb:mysqld op monitor interval=30s timeout=30s on-fail=restart
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Wed Apr 23 13:25:38 2014
Last change: Wed Apr 23 13:25:17 2014 via cibadmin on node1.soul.com
Stack: classic openais (with plugin)
Current DC: node2.soul.com - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
4 Resources configured
Online: [ node1.soul.com node2.soul.com ]
 webip  (ocf::heartbeat:IPaddr):    Started node1.soul.com
 webstore   (ocf::heartbeat:Filesystem):    Started node2.soul.com
 webserver  (lsb:httpd):    Started node1.soul.com
 webdb  (lsb:mysqld):   Started node2.soul.com
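  • Configure a resource group and add the resources above to it

# The status above shows that the cluster spreads the resources across the two machines
# so they need to be placed together in one group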
crm(live)configure# group webcluster webip webstore webserver webdb
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Wed Apr 23 13:30:25 2014
Last change: Wed Apr 23 13:29:55 2014 via cibadmin on node1.soul.com
Stack: classic openais (with plugin)
Current DC: node2.soul.com - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
4 Resources configured
Online: [ node1.soul.com node2.soul.com ]
 Resource Group: webcluster
     webip  (ocf::heartbeat:IPaddr):    Started node1.soul.com
     webstore   (ocf::heartbeat:Filesystem):    Started node1.soul.com
     webserver  (lsb:httpd):    Started node1.soul.com
     webdb  (lsb:mysqld):   Started node1.soul.com
# Once the group is added, the resources move to the same node automatically
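
A Pacemaker group keeps its members on one node and also starts them in the listed order (stopping in reverse). As a rough illustration, the sketch below spells out what that means for just two members using explicit constraints; the constraint IDs are made up, and these lines are not meant to be added alongside the group.

```shell
# Illustration only: what the group expresses for webip and webstore.
# Do not run these in addition to the group; the IDs are hypothetical.
crm configure colocation webstore_with_webip inf: webstore webip
crm configure order webip_before_webstore Mandatory: webip webstore
```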
  • Now define an order constraint so the resources start and stop in the specified sequence

crm(live)configure# help order
Usage:
...............
        order <id> {kind|<score>}: <rsc>[:<action>] <rsc>[:<action>] ...
          [symmetrical=<bool>]
        kind :: Mandatory | Optional | Serialize

crm(live)configure# order ip_store_http_db Mandatory: webip webstore webserver webdb
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show xml
# (XML output omitted: its markup did not survive in the source)

IV. Install a forum to test

# First check which node the resources are currently running on
[root@node1 ~]# crm status
Last updated: Wed Apr 23 13:52:07 2014
Last change: Wed Apr 23 13:37:27 2014 via cibadmin on node1.soul.com
Stack: classic openais (with plugin)
Current DC: node2.soul.com - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
4 Resources configured
Online: [ node1.soul.com node2.soul.com ]
 Resource Group: webcluster
     webip  (ocf::heartbeat:IPaddr):    Started node1.soul.com
     webstore   (ocf::heartbeat:Filesystem):    Started node1.soul.com
     webserver  (lsb:httpd):    Started node1.soul.com
     webdb  (lsb:mysqld):   Started node1.soul.com
# Everything is running on node1, so edit node1's httpd configuration file
# and point the web root at /share/www on the NFS share
# Then create a file in /share/www for testing
[root@node1 www]# vim /share/www/index.php

Page!!!!

# Save and test
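
The httpd change described above is not shown in the original. A minimal sketch of one way to make it, assuming the stock CentOS 6 `httpd.conf` layout with `/var/www/html`; `set_docroot` is a hypothetical helper and uses GNU `sed -i`.

```shell
# set_docroot CONF NEWROOT: point DocumentRoot and the matching
# <Directory> block in CONF at NEWROOT (hypothetical helper).
# Assumes the default /var/www/html entries are still present.
set_docroot() {
    local conf=$1 root=$2
    sed -i \
        -e "s|^DocumentRoot .*|DocumentRoot \"$root\"|" \
        -e "s|^<Directory \"/var/www/html\">|<Directory \"$root\">|" \
        "$conf"
}

# Assumed usage:
# set_docroot /etc/httpd/conf/httpd.conf /share/www
```

The original only mentions node1, but node2 presumably needs the same change so that it serves the same content after a failover.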

[Screenshot 1 from the original post]

The test page loads correctly, so a forum can now be installed for testing.

[Screenshot 2 from the original post]

  • Test resource failover:

[root@node1 ~]# crm status
Last updated: Wed Apr 23 15:15:01 2014
Last change: Wed Apr 23 15:14:42 2014 via crm_attribute on node2.soul.com
Stack: classic openais (with plugin)
Current DC: node1.soul.com - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
4 Resources configured
Online: [ node1.soul.com node2.soul.com ]
 Resource Group: webcluster
     webip  (ocf::heartbeat:IPaddr):    Started node1.soul.com
     webstore   (ocf::heartbeat:Filesystem):    Started node1.soul.com
     webserver  (lsb:httpd):    Started node1.soul.com
     webdb  (lsb:mysqld):   Started node1.soul.com

# The resources are currently on node1; now put node1 into standby:
[root@node1 ~]# crm node standby node1.soul.com
[root@node1 ~]# crm status
Last updated: Wed Apr 23 15:20:51 2014
Last change: Wed Apr 23 15:20:41 2014 via crm_attribute on node1.soul.com
Stack: classic openais (with plugin)
Current DC: node1.soul.com - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
4 Resources configured
Node node1.soul.com: standby
Online: [ node2.soul.com ]
 Resource Group: webcluster
     webip  (ocf::heartbeat:IPaddr):    Started node2.soul.com
     webstore   (ocf::heartbeat:Filesystem):    Started node2.soul.com
     webserver  (lsb:httpd):    Started node2.soul.com
     webdb  (lsb:mysqld):   Started node2.soul.com

# Everything has moved to node2
# Refresh the web page to check
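
To confirm the move without reading the whole status output, the resource line can be picked out with awk. A sketch only: `resource_node` is a hypothetical helper that relies on the `crm status` line format shown above.

```shell
# resource_node NAME: read `crm status` text on stdin and print the
# node on which resource NAME is reported as Started.
resource_node() {
    awk -v r="$1" '$1 == r && $(NF-1) == "Started" { print $NF }'
}

# Assumed usage:
# crm status | resource_node webip
# crm node online node1.soul.com    # bring node1 back after the test
```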

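[Screenshot 3 from the original post]

Everything tests fine; the LAMP high-availability setup is now complete. If you have questions, leave a comment to discuss.

If you spot any mistakes, corrections are welcome.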