***********************************************
I. Prepare the lab environment
II. Install Corosync + Crmsh
III. Install and configure DRBD
IV. Install the MySQL server
V. Define resources with corosync + crm
VI. Test MySQL high availability
************************************************
I. Prepare the lab environment
1. Server IP address plan
Drbd1: 172.16.10.3
Drbd2: 172.16.10.4
2. Server operating system
Drbd1: CentOS 6.4 x86_64
Drbd2: CentOS 6.4 x86_64
3. Set the hostnames and the hosts file
####drbd1 server############
sed -i 's@\(HOSTNAME=\).*@\1drbd1@g' /etc/sysconfig/network
hostname drbd1
[root@drbd1 ~]# echo "172.16.10.3 drbd1" >> /etc/hosts
[root@drbd1 ~]# echo "172.16.10.4 drbd2" >> /etc/hosts
[root@drbd1 ~]# ssh-keygen -t rsa
[root@drbd1 ~]# ssh-copy-id -i .ssh/id_rsa.pub drbd2
[root@drbd1 ~]# scp /etc/hosts drbd2:/etc/
####drbd2 server############
sed -i 's@\(HOSTNAME=\).*@\1drbd2@g' /etc/sysconfig/network
hostname drbd2
[root@drbd2 ~]# ssh-keygen -t rsa
[root@drbd2 ~]# ssh-copy-id -i .ssh/id_rsa.pub drbd1
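Before continuing, it is worth confirming that key-based SSH works in both directions, since several later steps rely on it; this quick check is a suggested addition, not part of the original walkthrough:

[root@drbd1 ~]# ssh drbd2 'hostname'    # should print "drbd2" with no password prompt
[root@drbd2 ~]# ssh drbd1 'hostname'    # should print "drbd1" with no password prompt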
4. Plan the DRBD disk partition
Create a 5 GB partition (sda3) on both drbd1 and drbd2. This should be easy for everyone, so the detailed steps are omitted here; a minimal sketch follows below. Note that on CentOS 6 you may have to reboot before the kernel can read the new partition table.
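For completeness, here is a minimal sketch of the partitioning step, assuming /dev/sda has free space and that the new partition gets number 3; partx can sometimes spare you the reboot, but on CentOS 6 a reboot remains the reliable way to make the kernel see the new partition:

[root@drbd1 ~]# fdisk /dev/sda                # interactively: n -> p -> 3 -> +5G -> w
[root@drbd1 ~]# partx -a /dev/sda             # ask the kernel to re-read the partition table
[root@drbd1 ~]# grep sda3 /proc/partitions    # if sda3 does not show up, reboot

Repeat the same steps on drbd2.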
II. Install Corosync + Crmsh
1. Install Corosync
###########drbd1#############
[root@drbd1 ~]# yum install corosync -y
###########drbd2#############
[root@drbd2 ~]# yum install corosync -y
2. Configure Corosync
[root@drbd1 ~]# cd /etc/corosync/
[root@drbd1 corosync]# cp corosync.conf.example corosync.conf
[root@drbd1 corosync]# vim corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank

totem {
    version: 2
    secauth: on                    # enable authentication
    threads: 0
    interface {
        ringnumber: 0
        bindnetaddr: 172.16.0.0    # change to the network address of this host
        mcastaddr: 226.94.1.1
        mcastport: 5405
        ttl: 1
    }
}

logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    to_syslog: yes
    logfile: /var/log/cluster/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}

amf {
    mode: disabled
}

#### add the following lines
service {
    ver: 0
    name: pacemaker
    # use_mgmtd: yes
}

aisexec {
    user: root
    group: root
}
3. Generate the authentication key used for inter-node communication, then copy the key and the configuration file to drbd2
[root@drbd1 corosync]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Writing corosync key to /etc/corosync/authkey.
[root@drbd1 corosync]# scp corosync.conf authkey drbd2:/etc/corosync/
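One tip in case corosync-keygen appears to hang: it reads from /dev/random, which blocks until the kernel has collected enough entropy. Typing on the console or generating disk activity in a second terminal usually unblocks it; the loop below is only a rough lab workaround, not something for a production box:

# run in a second terminal until corosync-keygen finishes, then stop it with Ctrl+C
[root@drbd1 ~]# while true; do find / -type f -exec cat {} + > /dev/null 2>&1; done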
4. Create the directory that will hold corosync's logs on both nodes
[root@drbd1 corosync]# mkdir -pv /var/log/cluster
[root@drbd1 corosync]# ssh drbd2 "mkdir -pv /var/log/cluster"
5. Install Crmsh
[root@drbd1 ~]# wget ftp://195.220.108.108/linux/opensuse/factory/repo/oss/suse/x86_64/crmsh-1.2.6-0.rc2.1.1.x86_64.rpm
[root@drbd1 ~]# wget ftp://195.220.108.108/linux/opensuse/factory/repo/oss/suse/noarch/pssh-2.3.1-6.1.noarch.rpm
[root@drbd1 ~]# yum localinstall --nogpgcheck crmsh*.rpm pssh*.rpm -y
[root@drbd1 ~]# scp crmsh*.rpm pssh*.rpm drbd2:/root/
[root@drbd1 ~]# ssh drbd2 "yum localinstall --nogpgcheck crmsh*.rpm pssh*.rpm -y"
6. Start Corosync
[root@drbd1 ~]# /etc/init.d/corosync start    # if it starts without errors, start node 2
[root@drbd1 ~]# ssh drbd2 "/etc/init.d/corosync start"
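Before checking the cluster itself, you can verify that the totem ring is healthy and that the pacemaker plugin was loaded; these two checks are a suggested addition to the original steps:

[root@drbd1 ~]# corosync-cfgtool -s                                 # expect "no faults" on ring 0
[root@drbd1 ~]# grep -i pcmk /var/log/cluster/corosync.log | head   # pacemaker plugin start-up messages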
7. Check the startup state of the cluster nodes
[root@drbd1 ~]# crm status
Last updated: Thu Sep 19 11:06:54 2013
Last change: Thu Sep 19 11:03:21 2013 via crmd on drbd1
Stack: classic openais (with plugin)
Current DC: drbd2 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
0 Resources configured.

Online: [ drbd1 drbd2 ]    # both drbd1 and drbd2 are online
[root@drbd1 ~]#
DRBD consists of two parts: a kernel module and userspace management tools. The DRBD kernel module has been merged into the mainline Linux kernel since 2.6.33, so if your kernel is newer than that you only need to install the management tools; otherwise you must install both the kernel-module package and the management tools, and their version numbers must match.
For CentOS 5 there are three main DRBD versions, 8.0, 8.2 and 8.3, shipped as the rpm packages drbd, drbd82 and drbd83, with the matching kernel-module packages kmod-drbd, kmod-drbd82 and kmod-drbd83. For CentOS 6 the available version is 8.4, shipped as the packages drbd and drbd-kmdl. When choosing packages, keep two things in mind: the drbd and drbd-kmdl versions must match each other, and the drbd-kmdl version must match the running kernel version. Our platform is x86_64 running CentOS 6.4, so we need both the kernel module and the management tools. Here we use the latest 8.4 release (drbd-8.4.3-33.el6.x86_64.rpm and drbd-kmdl-2.6.32-358.el6-8.4.3-33.el6.x86_64.rpm), available at ftp://rpmfind.net/linux/atrpms/; download what you need.
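Before downloading, double-check the running kernel so that the drbd-kmdl package can be matched to it exactly; this check is a suggestion, not part of the original walkthrough:

[root@drbd1 ~]# uname -r               # drbd-kmdl must be built for exactly this kernel release
[root@drbd1 ~]# rpm -qa | grep drbd    # after installation, both drbd and drbd-kmdl should be listed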
III. Install and configure DRBD
1. Install the DRBD packages
[root@drbd1 ~]# wget ftp://195.220.108.108/linux/atrpms/el6-x86_64/atrpms/stable/drbd-8.4.3-33.el6.x86_64.rpm
[root@drbd1 ~]# wget ftp://195.220.108.108/linux/atrpms/el6-x86_64/atrpms/stable/drbd-kmdl-2.6.32-358.el6-8.4.3-33.el6.x86_64.rpm
[root@drbd1 ~]# rpm -ivh drbd-8.4.3-33.el6.x86_64.rpm drbd-kmdl-2.6.32-358.el6-8.4.3-33.el6.x86_64.rpm
warning: drbd-8.4.3-33.el6.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID 66534c2b: NOKEY
Preparing...                ########################################### [100%]
   1:drbd-kmdl-2.6.32-358.el########################################### [ 50%]
   2:drbd                   ########################################### [100%]
[root@drbd1 ~]# scp drbd-* drbd2:/root/
drbd-8.4.3-33.el6.x86_64.rpm                        100%  283KB 283.3KB/s   00:00
drbd-kmdl-2.6.32-358.el6-8.4.3-33.el6.x86_64.rpm    100%  145KB 145.2KB/s   00:00
[root@drbd1 ~]# ssh drbd2 "rpm -ivh drbd-8.4.3-33.el6.x86_64.rpm drbd-kmdl-2.6.32-358.el6-8.4.3-33.el6.x86_64.rpm"
warning: drbd-8.4.3-33.el6.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID 66534c2b: NOKEY
Preparing...                ##################################################
drbd-kmdl-2.6.32-358.el6    ##################################################
drbd                        ##################################################
[root@drbd1 ~]#
2. Edit the main DRBD configuration file
[root@drbd1 ~]# vim /etc/drbd.d/global_common.conf
global {
    usage-count no;
    # minor-count dialog-refresh disable-ip-verification
}

common {
    protocol C;

    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
        # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
        # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
        # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
    }

    startup {
        #wfc-timeout 120;
        #degr-wfc-timeout 120;
    }

    disk {
        on-io-error detach;
        #fencing resource-only;
    }

    net {
        cram-hmac-alg "sha1";
        shared-secret "mydrbdlab";
    }

    syncer {
        rate 1000M;
    }
}
3. Define a resource named drbd
[root@drbd1 ~]# vim /etc/drbd.d/web.res
resource drbd {
    on drbd1 {
        device    /dev/drbd0;
        disk      /dev/sda3;
        address   172.16.10.3:7789;
        meta-disk internal;
    }
    on drbd2 {
        device    /dev/drbd0;
        disk      /dev/sda3;
        address   172.16.10.4:7789;
        meta-disk internal;
    }
}
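Optionally, let drbdadm parse the new configuration before going further; drbdadm dump prints the parsed resource and fails loudly on syntax errors (a suggested check, not part of the original steps):

[root@drbd1 ~]# drbdadm dump drbd    # prints the parsed "drbd" resource; any error means a config problem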
4. Copy the configuration files to drbd2
[root@drbd1 ~]# cd /etc/drbd.d/
[root@drbd1 drbd.d]# scp global_common.conf web.res drbd2:/etc/drbd.d/
global_common.conf    100% 1401    1.4KB/s   00:00
web.res               100%  266    0.3KB/s   00:00
[root@drbd1 drbd.d]#
5. Initialize the defined resource on both nodes and start the service
[root@drbd1 ~]# drbdadm create-md drbd
NOT initializing bitmap
Writing meta data...
initializing activity log
New drbd meta data block successfully created.
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
[root@drbd1 ~]# ssh drbd2 "drbdadm create-md drbd"
NOT initializing bitmap
Writing meta data...
initializing activity log
New drbd meta data block successfully created.
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
[root@drbd1 ~]# /etc/init.d/drbd start
[root@drbd2 ~]# /etc/init.d/drbd start
6. Check DRBD's status
[root@drbd1 ~]# drbd-overview
  0:drbd/0  Connected Secondary/Secondary Inconsistent/Inconsistent C r-----
[root@drbd1 ~]#
The output shows that both nodes are currently in the Secondary role, so the next step is to promote one of them to Primary.
7. Promote one DRBD node to primary
[root@drbd1 ~]# drbdadm primary --force drbd
[root@drbd1 ~]# drbd-overview    # the initial sync has started
  0:drbd/0  SyncSource Primary/Secondary UpToDate/Inconsistent C r-----
    [>....................] sync'ed:  2.9% (4984/5128)M
[root@drbd1 ~]# drbd-overview    # check again once the sync has finished
  0:drbd/0  Connected Primary/Secondary UpToDate/UpToDate C r-----
[root@drbd1 ~]#
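The initial synchronization of a 5 GB device takes a little while. If you want to follow the progress continuously instead of re-running drbd-overview, something like this works:

[root@drbd1 ~]# watch -n1 'cat /proc/drbd'    # refreshes the sync progress display every second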
8. Create a mount point and make the file system
[root@drbd1 ~]# mkdir -pv /mydata
[root@drbd1 ~]# mkfs.ext4 /dev/drbd0
9. Mount the file system and check the status
[root@drbd1 ~]# mount /dev/drbd0 /mydata/
[root@drbd1 ~]# drbd-overview
  0:drbd/0  Connected Primary/Secondary UpToDate/UpToDate C r----- /mydata ext4 5.0G 139M 4.6G 3%
[root@drbd1 ~]#
IV. Install the MySQL server
1. Install on drbd1 (the DRBD primary node)
[root@drbd1 ~]# useradd -r -u 306 mysql
[root@drbd1 ~]# tar xf mysql-5.5.33-linux2.6-x86_64.tar.gz -C /usr/local/
[root@drbd1 ~]# cd /usr/local/
[root@drbd1 local]# ln -sv mysql-5.5.33-linux2.6-x86_64/ mysql
`mysql' -> `mysql-5.5.33-linux2.6-x86_64/'
[root@drbd1 local]# cd mysql
[root@drbd1 mysql]# chown root.mysql -R *
[root@drbd1 mysql]# ./scripts/mysql_install_db --user=mysql --datadir=/mydata/data/
[root@drbd1 mysql]# cp support-files/my-large.cnf /etc/my.cnf
[root@drbd1 mysql]# cp support-files/mysql.server /etc/init.d/mysqld
[root@drbd1 mysql]# vim /etc/my.cnf
thread_concurrency = 8
datadir=/mydata/data    # point the data directory at the DRBD-backed mount
[root@drbd1 mysql]# service mysqld start
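To make the failover test in section VI more convincing, it can help to create a marker database now, while drbd1 holds the DRBD primary role; the database name hatest is just an example chosen here:

[root@drbd1 mysql]# /usr/local/mysql/bin/mysql -e "CREATE DATABASE hatest;"    # example marker database
[root@drbd1 mysql]# /usr/local/mysql/bin/mysql -e "SHOW DATABASES;"            # confirm it exists

After the failover tests, the same database should be visible from whichever node is primary.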
Note: before installing MySQL on drbd2, stop mysqld, unmount the file system and demote the DRBD resource:
[root@drbd1 mysql]# service mysqld stop
[root@drbd1 mysql]# umount /mydata/
[root@drbd1 mysql]# drbdadm secondary drbd    # demote the resource
2. Install on drbd2
There is no mysql_install_db step here, because the data directory already lives on the DRBD device initialized from drbd1:
[root@drbd2 ~]# useradd -r -u 306 mysql
[root@drbd2 ~]# drbdadm primary drbd
[root@drbd2 ~]# mkdir -pv /mydata/
[root@drbd2 ~]# chown -R mysql.mysql /mydata/
[root@drbd2 ~]# mount /dev/drbd0 /mydata/
[root@drbd2 ~]# tar xf mysql-5.5.33-linux2.6-x86_64.tar.gz -C /usr/local/
[root@drbd2 ~]# cd /usr/local/
[root@drbd2 local]# ln -sv mysql-5.5.33-linux2.6-x86_64/ mysql
[root@drbd2 local]# cd mysql
[root@drbd2 mysql]# chown root:mysql * -R
[root@drbd2 mysql]# cp support-files/my-large.cnf /etc/my.cnf
[root@drbd2 mysql]# cp support-files/mysql.server /etc/init.d/mysqld
[root@drbd2 mysql]# vim /etc/my.cnf
thread_concurrency = 8
datadir=/mydata/data    # same data directory setting as on drbd1; without it mysqld cannot find the data
3. Be sure to test that MySQL starts normally
[root@drbd2 ~]# mount /dev/drbd0 /mydata/
[root@drbd2 ~]# service mysqld start
If it starts normally, continue with the following steps:
[root@drbd2 ~]# service mysqld stop
[root@drbd2 ~]# umount /mydata
[root@drbd2 ~]# chkconfig --add mysqld    # register the init script so chkconfig can manage it
[root@drbd2 ~]# chkconfig mysqld off      # the cluster, not init, must start mysqld
[root@drbd2 ~]# ssh drbd1 "chkconfig --add mysqld; chkconfig mysqld off"
V. Define resources with corosync + crm
[root@drbd1 ~]# crm configure
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# commit
crm(live)configure# primitive webdrbd ocf:linbit:drbd params drbd_resource=drbd op monitor role=Master interval=10 timeout=20 op monitor role=Slave interval=20 timeout=20 op start timeout=240 op stop timeout=100
crm(live)configure# verify
crm(live)configure# master ms_webdrbd webdrbd meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# exit
Notes:
The `primitive webdrbd` line defines the DRBD resource and sets up status monitoring for it.
The `master ms_webdrbd` line defines the master/slave resource on top of it, also with status monitoring.
1. Check the resource status
[root@drbd1 ~]# crm status
Last updated: Thu Sep 19 15:49:41 2013
Last change: Thu Sep 19 15:49:31 2013 via cibadmin on drbd1
Stack: classic openais (with plugin)
Current DC: drbd1 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
2 Resources configured.

Online: [ drbd1 drbd2 ]

 Master/Slave Set: ms_webdrbd [webdrbd]
     Masters: [ drbd1 ]
     Slaves: [ drbd2 ]
If you see output like the above, everything is working; if not, re-check the previous steps. At this point DRBD can already switch between master and slave automatically.
1.1 Test whether the nodes can switch roles properly
[root@drbd1 ~]# crm node standby    # make drbd1 a standby node
[root@drbd1 ~]# crm status
Last updated: Thu Sep 19 15:54:19 2013
Last change: Thu Sep 19 15:54:13 2013 via crm_attribute on drbd1
Stack: classic openais (with plugin)
Current DC: drbd1 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
2 Resources configured.

Node drbd1: standby
Online: [ drbd2 ]

 Master/Slave Set: ms_webdrbd [webdrbd]
     Masters: [ drbd2 ]    # the master role switched to drbd2
     Stopped: [ webdrbd:1 ]
[root@drbd1 ~]#

[root@drbd1 ~]# crm node online    # bring drbd1 back online
[root@drbd1 ~]# crm status
Last updated: Thu Sep 19 15:55:37 2013
Last change: Thu Sep 19 15:55:31 2013 via crm_attribute on drbd1
Stack: classic openais (with plugin)
Current DC: drbd1 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
2 Resources configured.

Online: [ drbd1 drbd2 ]

 Master/Slave Set: ms_webdrbd [webdrbd]
     Masters: [ drbd2 ]
     Slaves: [ drbd1 ]    # drbd1 came back online successfully
[root@drbd1 ~]#
2. Define the MySQL resources
[root@drbd1 ~]# crm configure
crm(live)configure# primitive mystore ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/mydata" fstype=ext4 op monitor interval=40 timeout=40 op start timeout=60 op stop timeout=60
crm(live)configure# verify
crm(live)configure# primitive myip ocf:heartbeat:IPaddr params ip="172.16.10.8" op monitor interval=20 timeout=20 on-fail=restart
crm(live)configure# verify
crm(live)configure# primitive myserver lsb:mysqld op monitor interval=20 timeout=20 on-fail=restart
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# exit
Notes:
The `mystore` primitive defines the file-system resource that mounts the DRBD device, with monitoring.
The `myip` primitive defines the virtual IP, with monitoring.
The `myserver` primitive defines the MySQL service resource, with monitoring.
3. Check the resource status
[root@drbd1 ~]# crm status
Last updated: Thu Sep 19 16:05:58 2013
Last change: Thu Sep 19 16:05:19 2013 via cibadmin on drbd1
Stack: classic openais (with plugin)
Current DC: drbd1 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
5 Resources configured.

Online: [ drbd1 drbd2 ]

 Master/Slave Set: ms_webdrbd [webdrbd]
     Masters: [ drbd2 ]
     Slaves: [ drbd1 ]
 myip    (ocf::heartbeat:IPaddr):        Started drbd1
 mystore (ocf::heartbeat:Filesystem):    Started drbd2

Failed actions:
    myserver_start_0 (node=drbd1, call=149, rc=1, status=complete): unknown error
    mystore_start_0 (node=drbd1, call=141, rc=1, status=complete): unknown error
    myserver_start_0 (node=drbd2, call=143, rc=1, status=complete): unknown error
A few small start failures showed up, so let's clear them and let the cluster retry:
[root@drbd1 ~]# crm resource cleanup mystore
[root@drbd1 ~]# crm resource cleanup myserver
4. Check the resource status again; everything is normal now
[root@drbd1 ~]# crm status
Last updated: Thu Sep 19 16:07:39 2013
Last change: Thu Sep 19 16:07:31 2013 via crmd on drbd2
Stack: classic openais (with plugin)
Current DC: drbd1 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
5 Resources configured.

Online: [ drbd1 drbd2 ]

 Master/Slave Set: ms_webdrbd [webdrbd]
     Masters: [ drbd2 ]
     Slaves: [ drbd1 ]
 myip     (ocf::heartbeat:IPaddr):        Started drbd1
 myserver (lsb:mysqld):                   Started drbd2
 mystore  (ocf::heartbeat:Filesystem):    Started drbd2
[root@drbd1 ~]#
The resources are not all running on the same node, which obviously does not match our needs, so we define colocation constraints to keep all of them together:
crm(live)configure# colocation mystore_with_ms_webdrbd inf: mystore ms_webdrbd:Master
crm(live)configure# verify
crm(live)configure# colocation myserver_with_mystore_with_myip inf: myserver mystore myip
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure#
5. Check again whether everything is running on the same node
[root@drbd1 ~]# crm status
Last updated: Thu Sep 19 16:18:20 2013
Last change: Thu Sep 19 16:17:52 2013 via cibadmin on drbd1
Stack: classic openais (with plugin)
Current DC: drbd1 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
5 Resources configured.

Online: [ drbd1 drbd2 ]

 Master/Slave Set: ms_webdrbd [webdrbd]
     Masters: [ drbd2 ]
     Slaves: [ drbd1 ]
 myip     (ocf::heartbeat:IPaddr):        Started drbd2
 myserver (lsb:mysqld):                   Started drbd2
 mystore  (ocf::heartbeat:Filesystem):    Started drbd2
[root@drbd1 ~]#
As shown above, all the resources now run on the same node.
6. The resources still have no defined start order, so we define ordering constraints
crm(live)configure# order ms_webdrbd_before_mystore inf: ms_webdrbd:promote mystore:start
crm(live)configure# show xml
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# show xml
crm(live)configure# order mystore_before_myserver inf: mystore:start myserver:start
crm(live)configure# verify
crm(live)configure# order myserver_before_myip inf: myserver:start myip:start
crm(live)configure# verify
crm(live)configure# commit
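At this point it is worth reviewing the whole configuration in one place; crm configure show prints all primitives, the master/slave set, and the colocation and ordering constraints defined so far:

[root@drbd1 ~]# crm configure show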
VI. Test MySQL high availability
1. Simulate a node failure and check the status
[root@drbd1 ~]# ssh drbd2 "crm node standby"    # put drbd2 into standby
[root@drbd1 ~]# crm status
Last updated: Thu Sep 19 16:29:10 2013
Last change: Thu Sep 19 16:28:50 2013 via crm_attribute on drbd2
Stack: classic openais (with plugin)
Current DC: drbd1 - partition with quorum
Version: 1.1.8-7.el6-394e906
2 Nodes configured, 2 expected votes
5 Resources configured.

Node drbd2: standby
Online: [ drbd1 ]

 Master/Slave Set: ms_webdrbd [webdrbd]
     Masters: [ drbd1 ]    # the resources have moved over
     Stopped: [ webdrbd:1 ]
 myip     (ocf::heartbeat:IPaddr):        Started drbd1
 myserver (lsb:mysqld):                   Started drbd1
 mystore  (ocf::heartbeat:Filesystem):    Started drbd1
[root@drbd1 ~]#
2. Simulate a MySQL service failure
[root@drbd1 ~]# service mysqld stop
#### use watch to monitor the MySQL port ####
[root@drbd1 ~]# watch "netstat -anpt | grep 3306"
Every 2.0s: netstat -anpt | grep 3306        Thu Sep 19 17:17:45 2013

tcp    0    0 0.0.0.0:3306    0.0.0.0:*    LISTEN    36400/mysqld
After the MySQL service is stopped, the cluster's monitor operation (configured with on-fail=restart) detects the failure and starts mysqld again automatically: MySQL is now highly available!
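As a final check from the client side, you can connect through the virtual IP. The account below is hypothetical; remote logins first need a user with a suitable grant, for example one created on the current primary with GRANT ALL ON *.* TO 'hauser'@'172.16.%.%' IDENTIFIED BY 'hapass';

$ mysql -h 172.16.10.8 -u hauser -phapass -e "SHOW DATABASES;"    # should list the hatest database created earlier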
Summary:
To make sure this setup works, remember to create the mysql user with the same UID on both servers; otherwise the MySQL service may fail to start. Also, define the resource constraints carefully, or resources may fail to start when they are moved to another node.
That's all for this post. If you run into problems, your feedback is very welcome. Feel free to join the discussion on high-availability topics!