I. DRBD Configuration
1. Environment
node1 192.168.110.141
node2 192.168.110.142
Node1:
# sed -i 's@\(HOSTNAME=\).*@\1node1.pancou.com@g' /etc/sysconfig/network
# hostname node1.pancou.com
# vim /etc/hosts
192.168.110.141 node1.pancou.com node1
192.168.110.142 node2.pancou.com node2
Node2:
# sed -i 's@\(HOSTNAME=\).*@\1node2.pancou.com@g' /etc/sysconfig/network
# hostname node2.pancou.com
# vim /etc/hosts
192.168.110.141 node1.pancou.com node1
192.168.110.142 node2.pancou.com node2
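Both nodes should resolve both names identically. A small helper sketch for checking a hosts-format file (the `check_hosts` function is illustrative, not part of the original setup):

```shell
# check_hosts FILE NAME IP -- succeed if FILE maps IP to NAME
check_hosts() {
    awk -v n="$2" -v ip="$3" '
        $1 == ip { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit found ? 0 : 1 }' "$1"
}

# On either node, e.g.:
#   check_hosts /etc/hosts node1.pancou.com 192.168.110.141 || echo "hosts entry missing"
```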
2. Set up key-based SSH communication between the two nodes, which can be done with commands like the following:
Node1:
[root@node1 ~]# yum install openssh-clients
[root@node1 ~]# ssh-keygen -t rsa
[root@node1 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
Node2:
[root@node2 ~]# yum install openssh-clients
[root@node2 ~]# ssh-keygen -t rsa
[root@node2 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub [email protected]
3. Install the packages
DRBD consists of two parts: a kernel module and user-space management tools. The DRBD kernel module was merged into the mainline Linux kernel as of 2.6.33, so if your kernel is newer than that you only need to install the management tools; otherwise you must install both the kernel module and the management tools, and the two version numbers must match.
The DRBD versions available for CentOS 5 are mainly 8.0, 8.2 and 8.3; the corresponding RPM packages are named drbd, drbd82 and drbd83, with kernel-module packages kmod-drbd, kmod-drbd82 and kmod-drbd83. For CentOS 6 the available version is 8.4, packaged as drbd and drbd-kmdl. When choosing packages keep two things in mind: the drbd and drbd-kmdl versions must match each other, and the drbd-kmdl version must match the running kernel. Features and configuration differ slightly between versions. Our test platform is x86_64 running CentOS 6.5, so we need both the kernel module and the management tools. Here we use the latest 8.4 release (drbd-8.4.3-33.el6.x86_64.rpm and drbd-kmdl-2.6.32-431.el6-8.4.3-33.el6.x86_64.rpm), downloadable from ftp://rpmfind.net/linux/atrpms/.
In practice, download the package versions that match your own platform; individual download links are not listed here.
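Since drbd-kmdl must match the running kernel exactly, it is worth checking before installing. A quick sketch (the filename parsing assumes the 8.4.x atrpms naming used above; adjust for other versions):

```shell
# kmdl_matches_kernel RPMFILE KERNELVER
# Succeeds if the kernel version embedded in a drbd-kmdl rpm file name
# matches the given `uname -r` string.
kmdl_matches_kernel() {
    kver=${1#drbd-kmdl-}     # drop the "drbd-kmdl-" prefix
    kver=${kver%%-8.4*}      # drop the "-8.4.x-..." drbd-version tail
    case "$2" in
        "$kver"|"$kver".*) return 0 ;;   # exact, or followed by an arch suffix
        *) return 1 ;;
    esac
}

# On the real node:
#   kmdl_matches_kernel drbd-kmdl-2.6.32-431.el6-8.4.3-33.el6.x86_64.rpm "$(uname -r)" \
#       || echo "drbd-kmdl does not match the running kernel"
```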
[root@node1 ~]# rpm -ivh drbd-8.4.3-33.el6.x86_64.rpm drbd-kmdl-2.6.32-431.el6-8.4.3-33.el6.x86_64.rpm
warning: drbd-8.4.3-33.el6.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID 66534c2b: NOKEY
Preparing... ########################################### [100%]
1:drbd-kmdl-2.6.32-431.el########################################### [ 50%]
2:drbd ########################################### [100%]
[root@node1 ~]# scp drbd-* node2:/root/
drbd-8.4.3-33.el6.x86_64.rpm 100% 283KB 283.3KB/s 00:00
drbd-kmdl-2.6.32-431.el6-8.4.3-33.el6.x86_64.rpm 100% 145KB 145.2KB/s 00:00
[root@node2 ~]# rpm -ivh drbd-8.4.3-33.el6.x86_64.rpm drbd-kmdl-2.6.32-431.el6-8.4.3-33.el6.x86_64.rpm
warning: drbd-8.4.3-33.el6.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID 66534c2b: NOKEY
Preparing... ########################################### [100%]
1:drbd-kmdl-2.6.32-431.el########################################### [ 50%]
2:drbd ########################################### [100%]
4. Configure DRBD
DRBD's main configuration file is /etc/drbd.conf. For ease of management this file is usually split into several pieces stored under /etc/drbd.d/, with the main file merely pulling them in via "include" directives. Typically /etc/drbd.d/ contains global_common.conf plus any number of files ending in .res: global_common.conf defines the global and common sections, and each .res file defines one resource.
The global section may appear only once in the configuration, and if everything is kept in a single file rather than split up, it must come first. Only minor-count, dialog-refresh, disable-ip-verification and usage-count can currently be set in the global section.
The common section defines parameters inherited by every resource by default; any parameter usable in a resource definition may be set there. A common section is not strictly required, but putting parameters shared by multiple resources into it reduces configuration complexity.
A resource section defines a DRBD resource; each resource usually lives in its own .res file under /etc/drbd.d. A resource must be named, and the name may consist of any non-whitespace ASCII characters. Each resource definition must contain at least two host (on) sub-sections specifying the nodes the resource is bound to; all other parameters can be inherited from the common section or from DRBD's defaults.
The following steps are performed on node1.pancou.com.
[root@node1 ~]# cat /etc/drbd.conf
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
include "drbd.d/global_common.conf";
include "drbd.d/*.res";
[root@node1 ~]# ls /etc/drbd.d/
global_common.conf
(1) Configure /etc/drbd.d/global_common.conf
[root@node1 ~]# cat /etc/drbd.d/global_common.conf
global {
usage-count no;
# minor-count dialog-refresh disable-ip-verification
}
common {
handlers {
# These are EXAMPLE handlers only.
# They may have severe implications,
# like hard resetting the node under certain circumstances.
# Be careful when chosing your poison.
pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
# fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
# split-brain "/usr/lib/drbd/notify-split-brain.sh root";
# out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
# before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
# after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
}
startup {
# wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb
}
options {
# cpu-mask on-no-data-accessible
}
disk {
# size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
# disk-drain md-flushes resync-rate resync-after al-extents
# c-plan-ahead c-delay-target c-fill-target c-max-rate
# c-min-rate disk-timeout
on-io-error detach;
}
net {
# protocol timeout max-epoch-size max-buffers unplug-watermark
# connect-int ping-int sndbuf-size rcvbuf-size ko-count
# allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
# after-sb-1pri after-sb-2pri always-asbp rr-conflict
# ping-timeout data-integrity-alg tcp-cork on-congestion
# congestion-fill congestion-extents csums-alg verify-alg
# use-rle
protocol C;
cram-hmac-alg "sha1";
shared-secret "mydrbdlab";
}
syncer {
rate 1000M;
}
}
(2) Define a resource /etc/drbd.d/mystorage.res with the following content:
[root@node1 ~]# cat /etc/drbd.d/mystorage.res
resource mystorage {
on node1.pancou.com {
device /dev/drbd0;
disk /dev/sdb;
address 192.168.110.141:7789;
meta-disk internal;
}
on node2.pancou.com {
device /dev/drbd0;
disk /dev/sdb;
address 192.168.110.142:7789;
meta-disk internal;
}
}
The files above must be identical on both nodes, so the configuration just created can be synchronized to the other node over ssh.
[root@node1 ~]# ls /etc/drbd.d/
global_common.conf mystorage.res
[root@node1 ~]# scp /etc/drbd.d/* node2:/etc/drbd.d/
global_common.conf 100% 1949 1.9KB/s 00:00
mystorage.res
(3) Initialize the defined resource and start the service on both nodes:
1) Initialize the resource; run on Node1 and Node2:
[root@node1 ~]# iptables -F
[root@node1 ~]# ssh node2 "iptables -F"
[root@node1 ~]# drbdadm create-md mystorage
Writing meta data...
initializing activity log
NOT initializing bitmap
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
New drbd meta data block successfully created.
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
[root@node1 ~]# ssh node2 "drbdadm create-md mystorage"
NOT initializing bitmap
Writing meta data...
initializing activity log
New drbd meta data block successfully created.
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
lk_bdev_save(/var/lib/drbd/drbd-minor-0.lkbd) failed: No such file or directory
2) Start the service; run on Node1 and Node2:
[root@node1 ~]# /etc/init.d/drbd start
Starting DRBD resources: [
create res: mystorage
prepare disk: mystorage
adjust disk: mystorage
adjust net: mystorage
]
..........
***************************************************************
DRBD's startup script waits for the peer node(s) to appear.
..........
[root@node2 ~]# /etc/init.d/drbd start
Starting DRBD resources: [
create res: mystorage
prepare disk: mystorage
adjust disk: mystorage
adjust net: mystorage
]
.
3) Check the status:
[root@node1 ~]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:10485404
ro: the node roles. When DRBD starts for the first time, both nodes default to the Secondary role.
ds: the disk states. "Inconsistent/Inconsistent" means the data on the two nodes is not yet in sync.
ns: network send statistics (data sent to the peer).
dw: disk write statistics.
dr: disk read statistics.
The same information can be shown with the drbd-overview command:
[root@node1 ~]# drbd-overview
0:mystorage/0 Connected Secondary/Secondary Inconsistent/Inconsistent C r-----
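For scripting or monitoring, the cs:/ro:/ds: fields shown above can be extracted from /proc/drbd with a little awk. A sketch (the sample line mirrors the status output above; `drbd_field` is an illustrative helper, not a standard tool):

```shell
# drbd_field FIELD -- read a /proc/drbd status line on stdin and print
# the value of FIELD (cs, ro or ds)
drbd_field() {
    awk -v f="$1:" '{
        for (i = 1; i <= NF; i++)
            if (index($i, f) == 1) print substr($i, length(f) + 1)
    }'
}

line=' 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----'
echo "$line" | drbd_field ro     # -> Secondary/Secondary

# On a live node:  grep '^ *0:' /proc/drbd | drbd_field ds
```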
The output above shows both nodes in the Secondary role, so next we need to promote one of them to Primary.
4) Set the primary node
On the node to be promoted to Primary, run:
[root@node1 ~]# drbdadm --force primary mystorage    # promote this node to Primary (first run only)
Note: the primary node can also be set with the following command on the node to be promoted:
# drbdadm -- --overwrite-data-of-peer primary mystorage
Then check the status again; the data synchronization has started:
[root@node1 ~]# drbd-overview
0:mystorage/0 SyncSource Primary/Secondary UpToDate/Inconsistent C r---n-
[>...................] sync'ed: 9.7% (9248/10236)M
The progress can also be watched live:
[root@node1 ~]# watch -n 10 "cat /proc/drbd"
Every 10.0s: cat /proc/drbd Fri Aug 12 00:28:07 2016
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
ns:5382940 nr:0 dw:0 dr:5389976 al:0 bm:328 lo:0 pe:2 ua:7 ap:0 ep:1 wo:f oos:5104284
[=========>..........] sync'ed: 51.4% (4984/10236)M
finish: 0:01:11 speed: 71,108 (61,848) K/sec
Every 10.0s: cat /proc/drbd Fri Aug 12 00:28:47 2016
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
ns:8115556 nr:0 dw:0 dr:8123032 al:0 bm:495 lo:0 pe:2 ua:7 ap:0 ep:1 wo:f oos:2371228
[==============>.....] sync'ed: 77.5% (2312/10236)M
finish: 0:00:37 speed: 63,216 (63,888) K/sec
Every 10.0s: cat /proc/drbd Fri Aug 12 00:29:17 2016
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
ns:10091464 nr:0 dw:0 dr:10099352 al:0 bm:615 lo:0 pe:2 ua:8 ap:0 ep:1 wo:f oos:395932
[==================>.] sync'ed: 96.3% (384/10236)M
finish: 0:00:05 speed: 67,204 (63,856) K/sec
Once synchronization completes, check the status again: both disks are now UpToDate and the nodes have taken on Primary and Secondary roles:
[root@node1 ~]# drbd-overview
0:mystorage/0 Connected Primary/Secondary UpToDate/UpToDate C r-----
[root@node1 ~]# watch -n 10 "cat /proc/drbd"
Every 10.0s: cat /proc/drbd Fri Aug 12 00:31:37 2016
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
ns:10485404 nr:0 dw:0 dr:10486068 al:0 bm:640 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
5. Create a filesystem
The filesystem can only be mounted on the Primary node, so the DRBD device can be formatted only after a primary has been designated, for example:
# mke2fs -j -L DRBD /dev/drbd0
# mkdir /mnt/drbd
# mount /dev/drbd0 /mnt/drbd
[root@node1 ~]# mkfs -t ext4 /dev/drbd0
[root@node1 ~]# mkdir /data
[root@node1 ~]# mount /dev/drbd0 /data/
[root@node1 ~]# ls /data/
lost+found
6. Test switching the Primary and Secondary roles
In the Primary/Secondary model only one node may be Primary at any moment. To swap the roles of the two nodes, the current Primary must first be demoted to Secondary before the former Secondary can be promoted to Primary:
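Since demotion fails while the filesystem is still mounted, a switchover script should unmount first. A minimal guard sketch (`is_mounted` and `demote_safely` are illustrative helpers, not standard tools; the resource and mount point are the ones used in this article):

```shell
# is_mounted MOUNTPOINT -- succeed if MOUNTPOINT appears in /proc/mounts
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit found ? 0 : 1 }' /proc/mounts
}

# demote_safely RESOURCE MOUNTPOINT -- unmount if needed, then demote
demote_safely() {
    if is_mounted "$2"; then
        umount "$2" || { echo "cannot unmount $2" >&2; return 1; }
    fi
    drbdadm secondary "$1"
}

# On the current Primary:
#   demote_safely mystorage /data
```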
[root@node1 ~]# cp /etc/inittab /data/
[root@node1 ~]# ls /data/
inittab lost+found
[root@node1 ~]# vim /data/inittab
#
id:3:initdefault:
hello!!!
1) Demote node1 from Primary to Secondary
[root@node1 ~]# umount /data
[root@node1 ~]# drbd-overview
0:mystorage/0 Connected Primary/Secondary UpToDate/UpToDate C r-----
[root@node1 ~]# drbdadm secondary mystorage
[root@node1 ~]# drbd-overview
0:mystorage/0 Connected Secondary/Secondary UpToDate/UpToDate C r-----
2) Promote node2 to Primary
[root@node2 ~]# drbd-overview
0:mystorage/0 Connected Secondary/Secondary UpToDate/UpToDate C r-----
[root@node2 ~]# drbdadm primary mystorage
[root@node2 ~]# drbd-overview
0:mystorage/0 Connected Primary/Secondary UpToDate/UpToDate C r-----
[root@node2 ~]# mkdir /data
[root@node2 ~]# mount /dev/drbd0 /data/
[root@node2 ~]# ls /data/
inittab lost+found
[root@node2 ~]# cat /data/inittab
#
id:3:initdefault:
hello!!!
II. Corosync + Pacemaker Configuration
For the high-availability configuration to proceed smoothly, neither node may hold the Primary role and the drbd service must not be running or set to start at boot (Pacemaker will manage it instead), so unmount, demote and disable it:
[root@node1 ~]# umount /data/
[root@node1 ~]# drbdadm secondary mystorage
[root@node1 ~]# drbd-overview
0:mystorage/0 Connected Secondary/Secondary UpToDate/UpToDate C r-----
[root@node1 ~]# service drbd stop
Stopping all DRBD resources: .
[root@node1 ~]# chkconfig --add drbd
[root@node1 ~]# chkconfig drbd off
[root@node1 ~]# chkconfig --list drbd
drbd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
[root@node2 ~]# service drbd stop
Stopping all DRBD resources: .
[root@node2 ~]# chkconfig --add drbd
[root@node2 ~]# chkconfig drbd off
[root@node2 ~]# chkconfig --list drbd
drbd 0:off 1:off 2:off 3:off 4:off 5:off 6:off
1. Install corosync and pacemaker; here we install directly with yum on both nodes
(1) Corosync
Corosync is part of a cluster-management suite; it passes cluster messages between nodes, with the transport and protocol defined in a simple configuration file. Although first released in 2008, it is not really new software: in 2002 the OpenAIS project, which had grown too large, was split into two sub-projects, and the part implementing HA heartbeat messaging became Corosync, which inherited roughly 60% of its code from OpenAIS. Corosync provides complete HA messaging on its own, but richer and more complex functionality still requires OpenAIS. Corosync is the direction things are heading, and newer projects generally adopt it. For management front-ends there are hb_gui for graphical HA administration, the RHCS luci+ricci suite, and the Java-based LCMC cluster-management tool.
Corosync provides only the messaging layer (heartbeat plus cluster membership, i.e. HeartBeat + CCM) and no CRM of its own, so Pacemaker is normally used for resource management.
(2) Pacemaker
Pacemaker is a cluster resource manager. It uses the messaging and membership capabilities provided by a cluster infrastructure layer (OpenAIS, Heartbeat or Corosync) to detect node- and resource-level failures and recover from them, maximizing the availability of cluster services (also called resources).
Pacemaker evolved from the CRM component of the Linux-HA project's Heartbeat. Starting with Heartbeat 3, Heartbeat was split into several sub-projects along module boundaries, and the CRM component was spun off into the independent Pacemaker project. Since becoming its own project, Pacemaker has aimed to be a scalable high-availability cluster resource manager and supports both Corosync and Heartbeat.
As of Pacemaker 1.1.8, the crm shell was split out into its own project, crmsh; Pacemaker no longer ships a command-line configuration tool by default, so crmsh must be installed separately.
RHEL stopped shipping the crmsh command-line configuration tool as of 6.4 in favor of pcs; if you are used to the crm command, you can download and install the packages yourself. crmsh depends on pssh, so download both.
crmsh and pssh are available from:
http://crmsh.github.io/download/
2. Install and configure corosync and pacemaker (the following commands run on node1.pancou.com)
[root@node1 ~]# yum install -y corosync pacemaker
[root@node2 ~]# yum install -y corosync pacemaker
3. Configure corosync
[root@node1 ~]# ls /etc/corosync/
corosync.conf.example corosync.conf.example.udpu service.d uidgid.d
[root@node1 ~]# cd /etc/corosync/
[root@node1 corosync]# mv corosync.conf.example /etc/corosync/corosync.conf
[root@node1 corosync]# ls
corosync.conf corosync.conf.example corosync.conf.example.udpu service.d uidgid.d
[root@node1 corosync]# vim corosync.conf
[root@node1 corosync]# cat corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank
totem {
version: 2
# secauth: Enable mutual node authentication. If you choose to
# enable this ("on"), then do remember to create a shared
# secret with "corosync-keygen".
secauth: on
threads: 0
# interface: define at least one interface to communicate
# over. If you define more than one interface stanza, you must
# also set rrp_mode.
interface {
# Rings must be consecutively numbered, starting at 0.
ringnumber: 0
# This is normally the *network* address of the
# interface to bind to. This ensures that you can use
# identical instances of this configuration file
# across all your cluster nodes, without having to
# modify this option.
bindnetaddr: 192.168.110.0
# However, if you have multiple physical network
# interfaces configured for the same subnet, then the
# network address alone is not sufficient to identify
# the interface Corosync should bind to. In that case,
# configure the *host* address of the interface
# instead:
# bindnetaddr: 192.168.1.1
# When selecting a multicast address, consider RFC
# 2365 (which, among other things, specifies that
# 239.255.x.x addresses are left to the discretion of
# the network administrator). Do not reuse multicast
# addresses across multiple Corosync clusters sharing
# the same network.
mcastaddr: 239.255.110.1
# Corosync uses the port you specify here for UDP
# messaging, and also the immediately preceding
# port. Thus if you set this to 5405, Corosync sends
# messages over UDP ports 5405 and 5404.
mcastport: 5405
# Time-to-live for cluster communication packets. The
# number of hops (routers) that this ring will allow
# itself to pass. Note that multicast routing must be
# specifically enabled on most network routers.
ttl: 1
}
}
logging {
# Log the source file and line where messages are being
# generated. When in doubt, leave off. Potentially useful for
# debugging.
fileline: off
# Log to standard error. When in doubt, set to no. Useful when
# running in the foreground (when invoking "corosync -f")
to_stderr: no
# Log to a log file. When set to "no", the "logfile" option
# must not be set.
to_logfile: yes
logfile: /var/log/cluster/corosync.log
# Log to the system log daemon. When in doubt, set to yes.
to_syslog: no
# Log debug messages (very verbose). When in doubt, leave off.
debug: off
# Log messages with time stamps. When in doubt, set to on
# (unless you are only logging to syslog, where double
# timestamps can be annoying).
timestamp: on
logger_subsys {
subsys: AMF
debug: off
}
}
service {
ver: 0
name: pacemaker
# use_mgmtd: yes
}
aisexec {
user: root
group: root
}
Generate the authentication key used for inter-node communication.
corosync-keygen reads random numbers from /dev/random, so on a freshly installed system that has seen little activity the entropy pool may be too small;
if key generation stalls for lack of entropy, just keep typing random keys on the keyboard until enough entropy has been gathered.
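You can check the kernel's entropy pool beforehand to know whether corosync-keygen is likely to block (the 1024-bit threshold matches the amount corosync-keygen gathers; this check is purely informational):

```shell
# corosync-keygen reads 1024 bits from /dev/random and blocks until the
# kernel has gathered that much entropy.
avail=$(cat /proc/sys/kernel/random/entropy_avail)
if [ "$avail" -lt 1024 ]; then
    echo "only $avail bits of entropy available; corosync-keygen may block" >&2
    echo "type on the console or generate disk I/O to feed the pool" >&2
else
    echo "entropy pool ok ($avail bits)"
fi
```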
[root@node1 corosync]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Press keys on your keyboard to generate entropy (bits = 976).
(random keyboard input while entropy is gathered)
.....................
[root@node1 corosync]# ls
authkey corosync.conf corosync.conf.example corosync.conf.example.udpu service.d uidgid.d
[root@node1 corosync]# scp authkey corosync.conf node2:/etc/corosync/
authkey 100% 128 0.1KB/s 00:00
corosync.conf 100% 2766 2.7KB/s 00:00
4. Start corosync (run the following on node1):
[root@node1 ~]# service corosync start
Starting Corosync Cluster Engine (corosync): [ OK ]
[root@node1 ~]# ssh node2 "service corosync start"
Starting Corosync Cluster Engine (corosync): [ OK ]
5. Verify startup
Check that the corosync engine started normally:
[root@node1 ~]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
Aug 12 02:04:11 corosync [MAIN ] Corosync Cluster Engine ('1.4.7'): started and ready to provide service.
Aug 12 02:04:11 corosync [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Aug 12 02:06:32 corosync [MAIN ] Corosync Cluster Engine exiting with status 0 at main.c:2055.
Check that the initial membership notifications were sent out normally:
[root@node1 ~]# grep TOTEM /var/log/cluster/corosync.log
Aug 12 02:04:11 corosync [TOTEM ] Initializing transport (UDP/IP Multicast).
Aug 12 02:04:11 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Aug 12 02:04:11 corosync [TOTEM ] The network interface [192.168.110.141] is now up.
Aug 12 02:04:11 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Aug 12 02:04:11 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Check whether any errors occurred during startup. The error messages below indicate that the Pacemaker plugin for Corosync will soon no longer be supported and that CMAN is recommended as the cluster infrastructure; they can safely be ignored here.
[root@node1 ~]# grep ERROR: /var/log/cluster/corosync.log | grep -v unpack_resources
Aug 12 03:25:34 corosync [pcmk ] ERROR: process_ais_conf: You have configured a cluster using the Pacemaker plugin for Corosync. The plugin is not supported in this environment and will be removed very soon.
Aug 12 03:25:34 corosync [pcmk ] ERROR: process_ais_conf: Please see Chapter 8 of 'Clusters from Scratch' (http://www.clusterlabs.org/doc) for details on using Pacemaker with CMAN
Aug 12 03:26:42 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child process crmd exited (pid=3370, rc=201)
Check that pacemaker started normally:
[root@node1 ~]# grep pcmk_startup /var/log/cluster/corosync.log
Aug 12 03:25:34 corosync [pcmk ] info: pcmk_startup: CRM: Initialized
Aug 12 03:25:34 corosync [pcmk ] Logging: Initialized pcmk_startup
Aug 12 03:25:34 corosync [pcmk ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
Aug 12 03:25:34 corosync [pcmk ] info: pcmk_startup: Service: 9
Aug 12 03:25:34 corosync [pcmk ] info: pcmk_startup: Local hostname: node1.pancou.com
Run ps auxf to inspect the processes started by corosync:
[root@node1 ~]# ps auxf |grep corosync
root 3401 0.0 0.1 103252 856 pts/2 S+ 03:36 0:00 | \_ grep corosync
root 3360 1.0 0.9 612116 4912 ? Ssl 03:25 0:06 corosync
[root@node1 ~]# ps auxf |grep pacemaker
root 3399 0.0 0.1 103252 864 pts/2 S+ 03:33 0:00 | \_ grep pacemaker
189 3365 0.1 2.1 94072 10840 ? S< 03:25 0:00 \_ /usr/libexec/pacemaker/cib
root 3366 0.0 0.9 95136 4624 ? S< 03:25 0:00 \_ /usr/libexec/pacemaker/stonithd
root 3367 0.0 0.6 62968 3352 ? S< 03:25 0:00 \_ /usr/libexec/pacemaker/lrmd
189 3368 0.0 0.7 86012 3736 ? S< 03:25 0:00 \_ /usr/libexec/pacemaker/attrd
189 3369 0.0 0.7 81552 3628 ? S< 03:25 0:00 \_ /usr/libexec/pacemaker/pengine
189 3385 0.0 0.7 94956 3840 ? S< 03:26 0:00 \_ /usr/libexec/pacemaker/crmd
[root@node1 ~]# netstat -anptul
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 2025/sshd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 1122/master
tcp 0 0 192.168.110.141:22 192.168.110.1:53648 ESTABLISHED 1963/sshd
tcp 0 52 192.168.110.141:22 192.168.110.1:61309 ESTABLISHED 2807/sshd
tcp 0 0 :::22 :::* LISTEN 2025/sshd
tcp 0 0 ::1:25 :::* LISTEN 1122/master
udp 0 0 192.168.110.141:5404 0.0.0.0:* 3360/corosync
udp 0 0 192.168.110.141:5405 0.0.0.0:* 3360/corosync
udp 0 0 239.255.110.1:5405 0.0.0.0:* 3360/corosync
Check the status on node2:
If all the commands above completed without problems, corosync on node2 can be started from node1 with:
# ssh node2 -- /etc/init.d/corosync start
Note: start node2 from node1 with the command above; do not start it directly on node2. Below are the relevant logs on node2.
[root@node2 corosync]# tail /var/log/cluster/corosync.log
Aug 12 03:27:15 [2529] node2.pancou.com pengine: info: determine_online_status_fencing: Node node2.pancou.com is active
Aug 12 03:27:15 [2529] node2.pancou.com pengine: info: determine_online_status: Node node2.pancou.com is online
Aug 12 03:27:15 [2529] node2.pancou.com pengine: notice: stage6: Delaying fencing operations until there are resources to manage
Aug 12 03:27:15 [2530] node2.pancou.com crmd: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Aug 12 03:27:15 [2530] node2.pancou.com crmd: info: do_te_invoke: Processing graph 0 (ref=pe_calc-dc-1470943635-19) derived from /var/lib/pacemaker/pengine/pe-input-0.bz2
Aug 12 03:27:15 [2530] node2.pancou.com crmd: notice: run_graph: Transition 0 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-0.bz2): Complete
Aug 12 03:27:15 [2530] node2.pancou.com crmd: info: do_log: FSA: Input I_TE_SUCCESS from notify_crmd() received in state S_TRANSITION_ENGINE
Aug 12 03:27:15 [2530] node2.pancou.com crmd: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Aug 12 03:27:15 [2529] node2.pancou.com pengine: notice: process_pe_message: Calculated Transition 0: /var/lib/pacemaker/pengine/pe-input-0.bz2
Aug 12 03:27:15 [2529] node2.pancou.com pengine: notice: process_pe_message: Configuration ERRORs found during PE processing. Please run "crm_verify -L" to identify issues.
[root@node1 ~]# crm
crm(live)# status
Last updated: Fri Aug 12 03:37:27 2016 Last change: Fri Aug 12 03:26:10 2016 by hacluster via crmd on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 0 resources configured, 2 expected votes
Online: [ node1.pancou.com node2.pancou.com ]
crm(live)# configure
ERROR: CIB not supported: validator 'pacemaker-2.4', release '3.0.10'
ERROR: You may try the upgrade command
crm(live)configure# cd
crm(live)# quit
bye
[root@node1 ~]# service corosync stop
Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]
Waiting for corosync services to unload:. [ OK ]
[root@node1 ~]# ssh node2 "service corosync stop"
Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]
Waiting for corosync services to unload:.[ OK ]
The following yum repo definition comes from http://crmsh.github.io/download/ :
[network_ha-clustering_Stable]
name=Stable High Availability/Clustering packages (CentOS_CentOS-6)
type=rpm-md
baseurl=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/
gpgcheck=1
gpgkey=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6//repodata/repomd.xml.key
enabled=1
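Alternatively, rather than pasting into CentOS-Base.repo by hand, the definition can be written to its own file under /etc/yum.repos.d (a sketch; the file name ha-clustering.repo is an arbitrary choice):

```shell
# write_ha_repo FILE -- write the HA-clustering repo definition to FILE
write_ha_repo() {
    cat > "$1" <<'EOF'
[network_ha-clustering_Stable]
name=Stable High Availability/Clustering packages (CentOS_CentOS-6)
type=rpm-md
baseurl=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/
gpgcheck=1
gpgkey=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6//repodata/repomd.xml.key
enabled=1
EOF
}

# On the real nodes:
#   write_ha_repo /etc/yum.repos.d/ha-clustering.repo
```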
[root@node1 ~]# vim /etc/yum.repos.d/CentOS-Base.repo
Paste the repo definition above at the bottom of CentOS-Base.repo, then install on both nodes:
[root@node1 ~]# yum install crmsh pssh    # run the same on node2
[root@node1 ~]# crm status
Last updated: Fri Aug 12 03:55:12 2016 Last change: Fri Aug 12 03:26:24 2016 by hacluster via crmd on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 0 resources configured, 2 expected votes
Online: [ node1.pancou.com node2.pancou.com ]
Full list of resources:
[root@node1 ~]# crm
crm(live)# configure show
node node1.pancou.com
node node2.pancou.com
property cib-bootstrap-options: \
dc-version=1.1.14-8.el6-70404b0 \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2
[root@node1 ~]# crm ra classes
lsb
ocf / .isolation heartbeat linbit pacemaker
service
stonith
[root@node1 ~]# crm ra list ocf linbit
drbd
[root@node1 ~]# crm ra info ocf:linbit:drbd
Manages a DRBD device as a Master/Slave resource (ocf:linbit:drbd)
This resource agent manages a DRBD resource as a master/slave resource.
DRBD is a shared-nothing replicated storage device.
Note that you should configure resource level fencing in DRBD,
this cannot be done from this resource agent.
See the DRBD User's Guide for more information.
http://www.drbd.org/docs/applications/
Parameters (*: required, []: default):
drbd_resource* (string): drbd resource name
The name of the drbd resource from the drbd.conf file.
drbdconf (string, [/etc/drbd.conf]): Path to drbd.conf
Full path to the drbd.conf file.
stop_outdates_secondary (boolean, [false]): outdate a secondary on stop
Recommended setting: until pacemaker is fixed, leave at default (disabled).
Note that this feature depends on the passed in information in
OCF_RESKEY_CRM_meta_notify_master_uname to be correct, which unfortunately is
not reliable for pacemaker versions up to at least 1.0.10 / 1.1.4.
If a Secondary is stopped (unconfigured), it may be marked as outdated in the
drbd meta data, if we know there is still a Primary running in the cluster.
Note that this does not affect fencing policies set in drbd config,
but is an additional safety feature of this resource agent only.
You can enable this behaviour by setting the parameter to true.
If this feature seems to not do what you expect, make sure you have defined
fencing policies in the drbd configuration as well.
Operations' defaults (advisory minimum):
start timeout=240
promote timeout=90
demote timeout=90
notify timeout=90
stop timeout=100
monitor_Slave timeout=20 interval=20
monitor_Master timeout=20 interval=10
Disable STONITH (there is no fencing device in this test environment), tell the two-node cluster to ignore loss of quorum, and set a default resource stickiness:
[root@node1 ~]# crm configure property stonith-enabled=false
[root@node1 ~]# crm configure property no-quorum-policy=ignore
[root@node1 ~]# crm configure rsc_defaults resource-stickiness=100
[root@node1 ~]# crm
crm(live)# configure show
node node1.pancou.com
node node2.pancou.com
property cib-bootstrap-options: \
dc-version=1.1.14-8.el6-70404b0 \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
--help \
--list \
stonith-enabled=false \
no-quorum-policy=ignore
rsc_defaults rsc-options: \
resource-stickiness=100
DRBD must run on both nodes at the same time, but only one node (in the primary/secondary model) may be Master while the other is Slave. It is therefore a special kind of cluster resource: a multi-state clone, in which nodes are divided into Master and Slave roles, and both nodes must be in the Slave state when the service first starts.
# Define the resource in configure: mysqlstore is the resource name, drbd_resource=mystorage names the DRBD resource, and the op clauses define monitoring and timeouts
crm(live)configure# primitive mysqlstore ocf:linbit:drbd params drbd_resource=mystorage op monitor role=Master interval=30s timeout=20s op monitor role=Slave interval=60s timeout=20s op start timeout=240s op stop timeout=100s
crm(live)configure# verify
ERROR: cib-bootstrap-options: attribute --help does not exist
ERROR: cib-bootstrap-options: attribute --list does not exist
# Define the master (multi-state) resource. (The ERROR lines about --help and --list appear to come from stray attributes accidentally committed into cib-bootstrap-options, visible in configure show; they do not affect the resources being defined and can be removed with crm configure edit.)
crm(live)configure# master ms_mysqlstore mysqlstore meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify="True"
crm(live)configure# verify
ERROR: cib-bootstrap-options: attribute --help does not exist
ERROR: cib-bootstrap-options: attribute --list does not exist
crm(live)configure# commit
crm(live)configure# show
node node1.pancou.com
node node2.pancou.com
primitive mysqlstore ocf:linbit:drbd \
params drbd_resource=mystorage \
op monitor role=Master interval=30s timeout=20s \
op monitor role=Slave interval=60s timeout=20s \
op start timeout=240s interval=0 \
op stop timeout=100s interval=0
ms ms_mysqlstore mysqlstore \
meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=True
property cib-bootstrap-options: \
dc-version=1.1.14-8.el6-70404b0 \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
--help \
--list \
stonith-enabled=false \
no-quorum-policy=ignore
rsc_defaults rsc-options: \
resource-stickiness=100
crm(live)configure# cd
crm(live)# status
Last updated: Fri Aug 12 04:29:00 2016 Last change: Fri Aug 12 04:28:57 2016 by root via cibadmin on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 2 resources configured, 2 expected votes
Online: [ node1.pancou.com node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node1.pancou.com ]
Slaves: [ node2.pancou.com ]
Define the shared-filesystem resource so that it is mounted automatically on whichever node takes over: device=/dev/drbd0 is the device to mount and directory=/data is the mount point,
which must exist with the same path on both nodes:
crm(live)configure# primitive mysqlfs ocf:heartbeat:Filesystem params device=/dev/drbd0 directory=/data fstype=ext4 op monitor interval=30s timeout=40s on-fail=restart op start timeout=60s op stop timeout=60s
crm(live)configure# verify
ERROR: cib-bootstrap-options: attribute --help does not exist
ERROR: cib-bootstrap-options: attribute --list does not exist
Do not commit yet; next define the colocation constraint:
crm(live)configure# collocation mysqlfs_with_ms_mysqlstore_master inf: mysqlfs ms_mysqlstore:Master
crm(live)configure# verify
ERROR: cib-bootstrap-options: attribute --help does not exist
ERROR: cib-bootstrap-options: attribute --list does not exist
Define the ordering constraint:
crm(live)configure# order mysqlfs_after_ms_mysqlstore_master mandatory: ms_mysqlstore:promote mysqlfs:start
crm(live)configure# verify
ERROR: cib-bootstrap-options: attribute --help does not exist
ERROR: cib-bootstrap-options: attribute --list does not exist
crm(live)# configure show
node node1.pancou.com
node node2.pancou.com
primitive mysqlfs Filesystem \
params device="/dev/drbd0" directory="/data" fstype=ext4 \
op monitor interval=30s timeout=40s on-fail=restart \
op start timeout=60s interval=0 \
op stop timeout=60s interval=0
primitive mysqlstore ocf:linbit:drbd \
params drbd_resource=mystorage \
op monitor role=Master interval=30s timeout=20s \
op monitor role=Slave interval=60s timeout=20s \
op start timeout=240s interval=0 \
op stop timeout=100s interval=0
ms ms_mysqlstore mysqlstore \
meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=True
order mysqlfs_after_ms_mysqlstore_master Mandatory: ms_mysqlstore:promote mysqlfs:start
colocation mysqlfs_with_ms_mysqlstore_master inf: mysqlfs ms_mysqlstore:Master
property cib-bootstrap-options: \
dc-version=1.1.14-8.el6-70404b0 \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
--help \
--list \
stonith-enabled=false \
no-quorum-policy=ignore
rsc_defaults rsc-options: \
resource-stickiness=100
crm(live)# status
Last updated: Fri Aug 12 05:06:55 2016 Last change: Fri Aug 12 05:04:20 2016 by root via cibadmin on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 3 resources configured, 2 expected votes
Online: [ node1.pancou.com node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node1.pancou.com ]
Slaves: [ node2.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node1.pancou.com
[root@node1 ~]# ls /data/
inittab lost+found
[root@node1 ~]# cat /data/inittab
..........
......
# 6 - reboot (Do NOT set initdefault to this)
#
id:3:initdefault:
Next we simulate a failure of node1 and check whether the resources fail over correctly to node2.
Run the following command on node1:
# crm node standby node1.pancou.com
[root@node1 ~]# crm node standby
[root@node1 ~]# crm status
Last updated: Fri Aug 12 05:13:59 2016 Last change: Fri Aug 12 05:14:01 2016 by root via crm_attribute on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 3 resources configured, 2 expected votes
Node node1.pancou.com: standby
Online: [ node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node2.pancou.com ]
Stopped: [ node1.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node2.pancou.com
Verify on node2 that the failover succeeded:
[root@node2 ~]# crm status
Last updated: Fri Aug 12 05:16:39 2016 Last change: Fri Aug 12 05:14:01 2016 by root via crm_attribute on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 3 resources configured, 2 expected votes
Node node1.pancou.com: standby
Online: [ node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node2.pancou.com ]
Stopped: [ node1.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node2.pancou.com
[root@node2 corosync]# cd
[root@node2 ~]# ls /data/
inittab lost+found
[root@node2 ~]# cat /data/inittab
..........
......
# 6 - reboot (Do NOT set initdefault to this)
#
id:3:initdefault:
From the output above we can see that node1 has entered standby mode and its DRBD service has stopped; failover has completed and all resources have moved to node2 as expected.
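To confirm the takeover at the DRBD layer as well, a few quick checks can be run on node2 (a sketch; `mystorage` is the DRBD resource name defined earlier):

```shell
drbdadm role mystorage            # expect Primary/Secondary, node2 now primary
mount | grep drbd0                # /dev/drbd0 should be mounted on /data
crm_mon -1 | grep -i mysqlfs      # filesystem resource started on node2
```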
Bring node1 back online:
# crm node online node1.pancou.com
[root@node1 ~]# crm node online
[root@node1 ~]# crm status
Last updated: Fri Aug 12 05:19:32 2016 Last change: Fri Aug 12 05:19:35 2016 by root via crm_attribute on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 3 resources configured, 2 expected votes
Online: [ node1.pancou.com node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node2.pancou.com ]
Slaves: [ node1.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node2.pancou.com
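Note that after node1 rejoins, the resources stay on node2 rather than failing back: the `resource-stickiness=100` default shown in the configuration makes the current location score higher than the original node. You can inspect it with (a sketch):

```shell
crm configure show rsc-options
# rsc_defaults rsc-options: \
#     resource-stickiness=100
```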
三、Configuring MySQL
[root@node2 mariadb-10.1.16]# crm status
Last updated: Fri Aug 12 08:44:44 2016 Last change: Fri Aug 12 08:44:36 2016 by root via crm_attribute on node2.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 3 resources configured, 2 expected votes
Node node1.pancou.com: standby
Online: [ node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node2.pancou.com ]
Stopped: [ node1.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node2.pancou.com
1、Installing and configuring MySQL on node2
1)、Resolve dependencies
[root@node2 ~]# yum install make cmake bison ncurses-devel autoconf automake curl curl-devel gcc gcc-c++ gtk+-devel zlib-devel openssl openssl-devel pcre-devel perl kernel-headers compat* cpp glibc libgomp libstdc++-devel keyutils-libs-devel libsepol-devel libselinux-devel krb5-devel libXpm* freetype freetype-devel freetype* fontconfig fontconfig-devel libpng* ncurses* libtool* libxml2-devel bison libaio-devel
2)、Extract, build, and install
[root@node2 ~]# tar -zxvf mariadb-10.1.16.tar.gz -C /usr/local/src/
[root@node2 ~]# cd /usr/local/src/mariadb-10.1.16/
[root@node2 mariadb-10.1.16]# cmake -DCMAKE_INSTALL_PREFIX=/usr/local/mysql -DSYSCONFDIR=/etc -DMYSQL_DATADIR=/data/mysql -DMYSQL_UNIX_ADDR=/var/lib/mysql/mysql.sock -DMYSQL_USER=mysql -DMYSQL_TCP_PORT=3306 -DDEFAULT_CHARSET=utf8 -DDEFAULT_COLLATION=utf8_general_ci -DWITH_EXTRA_CHARSETS=all -DWITH_MYISAM_STORAGE_ENGINE=1 -DWITH_INNOBASE_STORAGE_ENGINE=1 -DWITH_MEMORY_STORAGE_ENGINE=1 -DWITH_READLINE=1 -DENABLED_LOCAL_INFILE=1 -DWITH_SSL=system
[root@node2 mariadb-10.1.16]# make && make install
3)、Create the mysql user
[root@node2 mariadb-10.1.16]# useradd -r -s /sbin/nologin mysql
[root@node2 mariadb-10.1.16]# id mysql
uid=498(mysql) gid=498(mysql) groups=498(mysql)
4)、Set ownership of the data directory (the DRBD mount point)
[root@node2 mariadb-10.1.16]# mkdir -p /data/mysql
[root@node2 mariadb-10.1.16]# mkdir -p /data/logs ------- do not create this again on node1
[root@node2 mariadb-10.1.16]# chown -R mysql:mysql /data/
5)、Set up the configuration file and service script
[root@node2 mariadb-10.1.16]# cp support-files/my-large.cnf /etc/my.cnf
[root@node2 mariadb-10.1.16]# cp support-files/mysql.server /etc/init.d/mysqld
[root@node2 mariadb-10.1.16]# chmod +x /etc/init.d/mysqld
[root@node2 mariadb-10.1.16]# ll /etc/init.d/mysqld
-rwxr-xr-x 1 root root 12549 Aug 12 08:33 /etc/init.d/mysqld
[root@node2 mariadb-10.1.16]# chkconfig --add mysqld
[root@node2 mariadb-10.1.16]# chkconfig --list mysqld
mysqld 0:off 1:off 2:on 3:on 4:on 5:on 6:off
[root@node2 mariadb-10.1.16]# chkconfig mysqld off
[root@node2 mariadb-10.1.16]# chkconfig --list mysqld
mysqld 0:off 1:off 2:off 3:off 4:off 5:off 6:off
[root@node2 mariadb-10.1.16]# vim /etc/init.d/mysqld
basedir=/usr/local/mysql
datadir=/data/mysql
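The same two-line edit can be done non-interactively with `sed`. The sketch below works on a temporary copy so it is safe to try; point `INIT` at `/etc/init.d/mysqld` for the real edit:

```shell
# Demonstrate the edit on a temporary copy of the init script
INIT=$(mktemp)
printf 'basedir=\ndatadir=\n' > "$INIT"   # the two empty variables in mysql.server
sed -i 's|^basedir=.*|basedir=/usr/local/mysql|; s|^datadir=.*|datadir=/data/mysql|' "$INIT"
grep -E '^(basedir|datadir)=' "$INIT"
rm -f "$INIT"
```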
6)、Initialize MySQL
[root@node2 mariadb-10.1.16]# /usr/local/mysql/scripts/mysql_install_db \
--defaults-file=/etc/my.cnf \
--basedir=/usr/local/mysql \
--datadir=/data/mysql \
--user=mysql
--------------- Note: do NOT run the initialization on node1
7)、Start the mysqld service
[root@node2 mariadb-10.1.16]# service mysqld start
Starting MySQL.. SUCCESS!
8)、Set environment variables
[root@node2 mariadb-10.1.16]# vim /etc/profile.d/mysql.sh
export PATH=/usr/local/mysql/bin:$PATH
[root@node2 mariadb-10.1.16]# source /etc/profile.d/mysql.sh
[root@node2 mariadb-10.1.16]# echo $PATH
/usr/local/mysql/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
9)、Login test
[root@node2 mariadb-10.1.16]# mysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 10.1.16-MariaDB Source distribution
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| test |
+--------------------+
4 rows in set (0.00 sec)
[root@node2 mariadb-10.1.16]# service mysqld stop
Shutting down MySQL.. SUCCESS!
[root@node2 mariadb-10.1.16]# crm node standby node2.pancou.com
[root@node2 mariadb-10.1.16]# crm node online node2.pancou.com
[root@node2 mariadb-10.1.16]# crm status
Last updated: Fri Aug 12 09:05:33 2016 Last change: Fri Aug 12 09:05:29 2016 by root via crm_attribute on node2.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 3 resources configured, 2 expected votes
Online: [ node1.pancou.com node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node1.pancou.com ]
Slaves: [ node2.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node1.pancou.com
2、Installing and configuring MySQL on node1
[root@node1 mariadb-10.1.16]# ls /data/
inittab logs lost+found mysql
1)、The MySQL installation and setup are the same as on node2
[root@node1 mariadb-10.1.16]# useradd -r -s /sbin/nologin mysql
[root@node1 mariadb-10.1.16]# id mysql
uid=498(mysql) gid=498(mysql) groups=498(mysql)
[root@node1 mariadb-10.1.16]# ll /data/
total 28
-rw-r--r-- 1 mysql mysql 884 Aug 12 00:54 inittab
drwxr-xr-x 2 mysql mysql 4096 Aug 12 08:46 logs
drwx------ 2 mysql mysql 16384 Aug 12 00:39 lost+found
drwxr-xr-x 5 mysql mysql 4096 Aug 12 09:02 mysql
[root@node1 mariadb-10.1.16]# ll -d /data/
drwxr-xr-x 5 mysql mysql 4096 Aug 12 08:46 /data/
[root@node1 mariadb-10.1.16]# cp support-files/my-large.cnf /etc/my.cnf
[root@node2 mariadb-10.1.16]# scp /etc/init.d/mysqld node1:/etc/init.d/
mysqld 100% 12KB 12.3KB/s 00:00
[root@node1 mariadb-10.1.16]# ll /etc/init.d/mysqld
-rwxr-xr-x 1 root root 12576 Aug 12 09:13 /etc/init.d/mysqld
[root@node1 mariadb-10.1.16]# chkconfig --add mysqld
[root@node1 mariadb-10.1.16]# chkconfig mysqld off
[root@node1 mariadb-10.1.16]# chkconfig --list mysqld
mysqld 0:off 1:off 2:off 3:off 4:off 5:off 6:off
[root@node1 mariadb-10.1.16]# service mysqld start
Starting MySQL.... SUCCESS!
[root@node1 mariadb-10.1.16]# vim /etc/profile.d/mysql.sh
export PATH=/usr/local/mysql/bin:$PATH
[root@node1 mariadb-10.1.16]# source /etc/profile.d/mysql.sh
[root@node1 mariadb-10.1.16]# echo $PATH
/usr/local/mysql/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
[root@node1 mariadb-10.1.16]# mysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 10.1.16-MariaDB Source distribution
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| test |
+--------------------+
4 rows in set (0.07 sec)
[root@node1 mariadb-10.1.16]# service mysqld stop
Shutting down MySQL... SUCCESS!
3、Configuring Pacemaker + MySQL
1)、Add the mysql resource
[root@node2 ~]# crm ra info ocf:heartbeat:mysql
Manages a MySQL database instance (ocf:heartbeat:mysql)
Resource script for MySQL.
May manage a standalone MySQL database, a clone set with externally
managed replication, or a complete master/slave replication setup.
While managing replication, the default behavior is to use uname -n
values in the change master to command. Other IPs can be specified
manually by adding a node attribute _mysql_master_IP
giving the IP to use for replication. For example, if the mysql primitive
you are using is p_mysql, the attribute to set will be
p_mysql_mysql_master_IP.
Parameters (*: required, []: default):
binary (string, [/usr/local/mysql/bin/mysqld_safe]): MySQL server binary
Location of the MySQL server binary
client_binary (string, [mysql]): MySQL client binary
Location of the MySQL client binary
config (string, [/etc/my.cnf]): MySQL config
Configuration file
datadir (string, [/var/lib/mysql]): MySQL datadir
[root@node1 ~]# crm
crm(live)# configure
crm(live)configure# primitive mysqld ocf:heartbeat:mysql binary="/usr/local/mysql/bin/mysqld_safe" config="/etc/my.cnf" datadir="/data/mysql" pid="/data/mysql/mysql.pid" socket="/var/lib/mysql/mysql.sock" op start timeout=180s op stop timeout=180s op monitor interval=20s timeout=60s on-fail=restart
crm(live)configure# verify    -- a slight problem here
ERROR: cib-bootstrap-options: attribute --help does not exist
ERROR: cib-bootstrap-options: attribute --list does not exist
crm(live)configure# show
node node1.pancou.com \
attributes standby=off
node node2.pancou.com \
attributes standby=off
primitive mysqld mysql \
params binary="/usr/local/mysql/bin/mysqld_safe" config="/etc/my.cnf" datadir="/data/mysql" pid="/data/mysql/mysql.pid" socket="/var/lib/mysql/mysql.sock" \
op start timeout=180s interval=0 \
op stop timeout=180s interval=0 \
op monitor interval=20s timeout=60s on-fail=restart
primitive mysqlfs Filesystem \
params device="/dev/drbd0" directory="/data" fstype=ext4 \
op monitor interval=30s timeout=40s on-fail=restart \
op start timeout=60s interval=0 \
op stop timeout=60s interval=0
primitive mysqlstore ocf:linbit:drbd \
params drbd_resource=mystorage \
op monitor role=Master interval=30s timeout=20s \
op monitor role=Slave interval=60s timeout=20s \
op start timeout=240s interval=0 \
op stop timeout=100s interval=0
crm(live)configure# delete mysqld
INFO: hanging colocation:mysqld_with_mysqlfs deleted
crm(live)configure# commit
crm(live)configure# primitive mysqld lsb:mysqld
crm(live)configure# verify
ERROR: cib-bootstrap-options: attribute --help does not exist
ERROR: cib-bootstrap-options: attribute --list does not exist
crm(live)configure# colocation mysqld_with_mysqlfs inf: mysqld mysqlfs
crm(live)configure# verify
ERROR: cib-bootstrap-options: attribute --help does not exist
ERROR: cib-bootstrap-options: attribute --list does not exist
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Fri Aug 12 10:15:25 2016 Last change: Fri Aug 12 10:16:03 2016 by root via cibadmin on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 4 resources configured, 2 expected votes
Online: [ node1.pancou.com node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node1.pancou.com ]
Slaves: [ node2.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node1.pancou.com
mysqld (lsb:mysqld): Started node1.pancou.com
crm(live)configure# colocation mysqld_with_mysqlfs inf: mysqld mysqlfs
crm(live)configure# verify
ERROR: cib-bootstrap-options: attribute --help does not exist
ERROR: cib-bootstrap-options: attribute --list does not exist
crm(live)configure# order mysqld_after_mysqlfs mandatory: mysqlfs mysqld
crm(live)configure# verify
ERROR: cib-bootstrap-options: attribute --help does not exist
ERROR: cib-bootstrap-options: attribute --list does not exist
crm(live)configure# show
node node1.pancou.com \
attributes standby=off
node node2.pancou.com \
attributes standby=off
primitive mysqld lsb:mysqld
primitive mysqlfs Filesystem \
params device="/dev/drbd0" directory="/data" fstype=ext4 \
op monitor interval=30s timeout=40s on-fail=restart \
op start timeout=60s interval=0 \
op stop timeout=60s interval=0
primitive mysqlstore ocf:linbit:drbd \
params drbd_resource=mystorage \
op monitor role=Master interval=30s timeout=20s \
op monitor role=Slave interval=60s timeout=20s \
op start timeout=240s interval=0 \
op stop timeout=100s interval=0
ms ms_mysqlstore mysqlstore \
meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=True
order mysqld_after_mysqlfs Mandatory: mysqlfs mysqld
colocation mysqld_with_mysqlfs inf: mysqld mysqlfs
order mysqlfs_after_ms_mysqlstore_master Mandatory: ms_mysqlstore:promote mysqlfs:start
colocation mysqlfs_with_ms_mysqlstore_master inf: mysqlfs ms_mysqlstore:Master
property cib-bootstrap-options: \
dc-version=1.1.14-8.el6-70404b0 \
cluster-infrastructure="classic openais (with plugin)" \
crm(live)configure# commit
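Switching from `ocf:heartbeat:mysql` to `lsb:mysqld` works here because the init script was already installed on both nodes. An LSB resource has two prerequisites worth repeating (both done above): the script must exist on every node, and it must not auto-start at boot, since Pacemaker decides where it runs. As a sketch:

```shell
# On BOTH nodes: let Pacemaker, not init, control mysqld
chkconfig mysqld off
service mysqld status     # an LSB-compliant script reports status without side effects
```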
2)、Add the VIP resource
crm(live)configure# primitive VIP ocf:heartbeat:IPaddr2 ip="192.168.110.150" cidr_netmask=32 nic=eth0 op monitor interval=30s
crm(live)configure# verify
ERROR: cib-bootstrap-options: attribute --help does not exist
ERROR: cib-bootstrap-options: attribute --list does not exist
crm(live)configure# colocation VIP_with_mysqld inf: VIP mysqld
crm(live)configure# verify
ERROR: cib-bootstrap-options: attribute --help does not exist
ERROR: cib-bootstrap-options: attribute --list does not exist
crm(live)configure# show
node node1.pancou.com \
attributes standby=off
node node2.pancou.com \
attributes standby=off
primitive VIP IPaddr2 \
params ip=192.168.110.150 cidr_netmask=32 nic=eth0 \
op monitor interval=30s
primitive mysqld lsb:mysqld
primitive mysqlfs Filesystem \
params device="/dev/drbd0" directory="/data" fstype=ext4 \
op monitor interval=30s timeout=40s on-fail=restart \
op start timeout=60s interval=0 \
op stop timeout=60s interval=0
primitive mysqlstore ocf:linbit:drbd \
params drbd_resource=mystorage \
op monitor role=Master interval=30s timeout=20s \
op monitor role=Slave interval=60s timeout=20s \
op start timeout=240s interval=0 \
op stop timeout=100s interval=0
ms ms_mysqlstore mysqlstore \
meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=True
colocation VIP_with_mysqld inf: VIP mysqld
order mysqld_after_mysqlfs Mandatory: mysqlfs mysqld
colocation mysqld_with_mysqlfs inf: mysqld mysqlfs
order mysqlfs_after_ms_mysqlstore_master Mandatory: ms_mysqlstore:promote mysqlfs:start
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Sat Aug 13 02:04:15 2016 Last change: Sat Aug 13 02:04:53 2016 by root via cibadmin on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 5 resources configured, 2 expected votes
Online: [ node1.pancou.com node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node1.pancou.com ]
Slaves: [ node2.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node1.pancou.com
mysqld (lsb:mysqld): Started node1.pancou.com
VIP (ocf::heartbeat:IPaddr2): Started node1.pancou.com
crm(live)configure# order VIP_before_ms_mysqlstore_master mandatory: VIP ms_mysqlstore:promote mysqlfs:start
crm(live)configure# verify
crm(live)configure# show
node node1.pancou.com \
attributes standby=off
node node2.pancou.com \
attributes standby=off
primitive VIP IPaddr2 \
params ip=192.168.110.150 cidr_netmask=32 nic=eth0 \
op monitor interval=30s
primitive mysqld lsb:mysqld
primitive mysqlfs Filesystem \
params device="/dev/drbd0" directory="/data" fstype=ext4 \
op monitor interval=30s timeout=40s on-fail=restart \
op start timeout=60s interval=0 \
op stop timeout=60s interval=0
primitive mysqlstore ocf:linbit:drbd \
params drbd_resource=mystorage \
op monitor role=Master interval=30s timeout=20s \
op monitor role=Slave interval=60s timeout=20s \
op start timeout=240s interval=0 \
op stop timeout=100s interval=0
ms ms_mysqlstore mysqlstore \
meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=True
order VIP_before_ms_mysqlstore_master Mandatory: VIP ms_mysqlstore:promote mysqlfs:start
colocation VIP_with_mysqld inf: VIP mysqld
order mysqld_after_mysqlfs Mandatory: mysqlfs mysqld
colocation mysqld_with_mysqlfs inf: mysqld mysqlfs
order mysqlfs_after_ms_mysqlstore_master Mandatory: ms_mysqlstore:promote mysqlfs:start
colocation mysqlfs_with_ms_mysqlstore_master inf: mysqlfs ms_mysqlstore:Master
property cib-bootstrap-options: \
dc-version=1.1.14-8.el6-70404b0 \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes=2 \
--help \
--list \
stonith-enabled=false \
no-quorum-policy=ignore
rsc_defaults rsc-options: \
resource-stickiness=100
crm(live)configure# commit
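After the commit, the VIP can be verified on the active node; `IPaddr2` adds a secondary address on the configured NIC (a sketch; interface and address taken from the primitive above):

```shell
ip addr show dev eth0 | grep 192.168.110.150    # VIP bound on the active node
ping -c 2 192.168.110.150                        # reachable from the other node too
```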
3)、Simulate failover
Simulate node1 going offline:
[root@node1 ~]# crm node standby node1.pancou.com
[root@node1 ~]# crm status
Last updated: Sat Aug 13 02:15:26 2016 Last change: Sat Aug 13 02:15:59 2016 by root via crm_attribute on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 5 resources configured, 2 expected votes
Node node1.pancou.com: standby
Online: [ node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node2.pancou.com ]
Stopped: [ node1.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node2.pancou.com
mysqld (lsb:mysqld): Stopped
VIP (ocf::heartbeat:IPaddr2): Started node2.pancou.com
Failed Actions:
* mysqlstore_monitor_30000 on node1.pancou.com 'unknown error' (1): call=133, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:39:29 2016', queued=0ms, exec=0ms
* mysqlstore_monitor_60000 on node2.pancou.com 'unknown error' (1): call=111, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:30:40 2016', queued=0ms, exec=0ms
[root@node1 ~]# crm status
Last updated: Sat Aug 13 02:16:05 2016 Last change: Sat Aug 13 02:15:59 2016 by root via crm_attribute on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 5 resources configured, 2 expected votes
Node node1.pancou.com: standby
Online: [ node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node2.pancou.com ]
Stopped: [ node1.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node2.pancou.com
mysqld (lsb:mysqld): Started node2.pancou.com
VIP (ocf::heartbeat:IPaddr2): Started node2.pancou.com
Failed Actions:
* mysqlstore_monitor_30000 on node1.pancou.com 'unknown error' (1): call=133, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:39:29 2016', queued=0ms, exec=0ms
* mysqlstore_monitor_60000 on node2.pancou.com 'unknown error' (1): call=111, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:30:40 2016', queued=0ms, exec=0ms
Bring node1 back online:
[root@node1 ~]# crm node online node1.pancou.com
[root@node1 ~]# crm status
Last updated: Sat Aug 13 02:19:21 2016 Last change: Sat Aug 13 02:19:58 2016 by root via crm_attribute on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 5 resources configured, 2 expected votes
Online: [ node1.pancou.com node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node2.pancou.com ]
Slaves: [ node1.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node2.pancou.com
mysqld (lsb:mysqld): Started node2.pancou.com
VIP (ocf::heartbeat:IPaddr2): Started node2.pancou.com
Failed Actions:
* mysqlstore_monitor_30000 on node1.pancou.com 'unknown error' (1): call=133, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:39:29 2016', queued=0ms, exec=0ms
* mysqlstore_monitor_60000 on node2.pancou.com 'unknown error' (1): call=111, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:30:40 2016', queued=0ms, exec=0ms
Simulate node2 going offline:
[root@node1 ~]# crm node standby node2.pancou.com
[root@node1 ~]# crm status
Last updated: Sat Aug 13 02:20:37 2016 Last change: Sat Aug 13 02:21:11 2016 by root via crm_attribute on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 5 resources configured, 2 expected votes
Node node2.pancou.com: standby
Online: [ node1.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node1.pancou.com ]
Stopped: [ node2.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Stopped
mysqld (lsb:mysqld): Stopped
VIP (ocf::heartbeat:IPaddr2): Started node1.pancou.com
Failed Actions:
* mysqlstore_monitor_30000 on node1.pancou.com 'unknown error' (1): call=133, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:39:29 2016', queued=0ms, exec=0ms
* mysqlstore_monitor_60000 on node2.pancou.com 'unknown error' (1): call=111, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:30:40 2016', queued=0ms, exec=0ms
[root@node1 ~]# ls /data/
inittab logs lost+found mysql
[root@node1 ~]# crm status
Last updated: Sat Aug 13 02:20:53 2016 Last change: Sat Aug 13 02:21:11 2016 by root via crm_attribute on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 5 resources configured, 2 expected votes
Node node2.pancou.com: standby
Online: [ node1.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node1.pancou.com ]
Stopped: [ node2.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node1.pancou.com
mysqld (lsb:mysqld): Started node1.pancou.com
VIP (ocf::heartbeat:IPaddr2): Started node1.pancou.com
Failed Actions:
* mysqlstore_monitor_30000 on node1.pancou.com 'unknown error' (1): call=133, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:39:29 2016', queued=0ms, exec=0ms
* mysqlstore_monitor_60000 on node2.pancou.com 'unknown error' (1): call=111, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:30:40 2016', queued=0ms, exec=0ms
Bring node2 back online:
[root@node1 ~]# crm node online node2.pancou.com
[root@node1 ~]# crm status
Last updated: Sat Aug 13 02:22:11 2016 Last change: Sat Aug 13 02:22:46 2016 by root via crm_attribute on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 5 resources configured, 2 expected votes
Online: [ node1.pancou.com node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node1.pancou.com ]
Slaves: [ node2.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node1.pancou.com
mysqld (lsb:mysqld): Started node1.pancou.com
VIP (ocf::heartbeat:IPaddr2): Started node1.pancou.com
Failed Actions:
* mysqlstore_monitor_30000 on node1.pancou.com 'unknown error' (1): call=133, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:39:29 2016', queued=0ms, exec=0ms
* mysqlstore_monitor_60000 on node2.pancou.com 'unknown error' (1): call=111, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:30:40 2016', queued=0ms, exec=0ms
Good. At this point all of the resource configuration is complete; next, let's test it.
4)、Test MySQL high availability
[root@node1 ~]# mysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 10.1.16-MariaDB Source distribution
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> grant all on *.* to pancou@'%' identified by 'pancou';
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> show grants for pancou;
+----------------------------------------------------------------------------------------------------------------+
| Grants for pancou@% |
+----------------------------------------------------------------------------------------------------------------+
| GRANT ALL PRIVILEGES ON *.* TO 'pancou'@'%' IDENTIFIED BY PASSWORD '*48D10DDED8C5DDCE0BFD4C7BB6EF82DA15159F83' |
+----------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
Log in remotely:
[root@node2 ~]# mysql -h 192.168.110.150 -upancou -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 5
Server version: 10.1.16-MariaDB Source distribution
Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| test |
+--------------------+
4 rows in set (0.05 sec)
Simulate a failure:
[root@node1 ~]# crm status
Last updated: Sat Aug 13 02:35:51 2016 Last change: Sat Aug 13 02:22:46 2016 by root via crm_attribute on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 5 resources configured, 2 expected votes
Online: [ node1.pancou.com node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node1.pancou.com ]
Slaves: [ node2.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node1.pancou.com
mysqld (lsb:mysqld): Started node1.pancou.com
VIP (ocf::heartbeat:IPaddr2): Started node1.pancou.com
[root@node1 ~]# crm node standby node1.pancou.com
[root@node1 ~]# crm status
Last updated: Sat Aug 13 02:38:04 2016 Last change: Sat Aug 13 02:37:38 2016 by root via crm_attribute on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 5 resources configured, 2 expected votes
Node node1.pancou.com: standby
Online: [ node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node2.pancou.com ]
Stopped: [ node1.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node2.pancou.com
mysqld (lsb:mysqld): Started node2.pancou.com
VIP (ocf::heartbeat:IPaddr2): Started node2.pancou.com
Failed Actions:
* mysqlstore_monitor_30000 on node1.pancou.com 'unknown error' (1): call=133, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:39:29 2016', queued=0ms, exec=0ms
* mysqlstore_monitor_60000 on node2.pancou.com 'unknown error' (1): call=111, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:30:40 2016', queued=0ms, exec=0ms
Test MySQL:
MariaDB [(none)]> show databases;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 4
Current database: *** NONE ***
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| test |
+--------------------+
4 rows in set (15.07 sec)
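The interactive `mysql` client reconnects on its own (the 15-second `show databases` above includes the failover time). Non-interactive clients have to retry against the VIP themselves; a minimal sketch using the `pancou` account created earlier:

```shell
# Wait until MySQL answers again via the VIP after a failover (sketch)
until mysql -h 192.168.110.150 -upancou -ppancou -e 'SELECT 1' >/dev/null 2>&1; do
    sleep 2
done
echo "MySQL reachable via VIP"
```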
[root@node1 ~]# crm node standby node2.pancou.com
[root@node1 ~]# crm status
Last updated: Sat Aug 13 02:40:41 2016 Last change: Sat Aug 13 02:41:16 2016 by root via crm_attribute on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 5 resources configured, 2 expected votes
Node node2.pancou.com: standby
Online: [ node1.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Slaves: [ node1.pancou.com ]
Stopped: [ node2.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Stopped
mysqld (lsb:mysqld): Stopped
VIP (ocf::heartbeat:IPaddr2): Started node1.pancou.com
Failed Actions:
* mysqlstore_monitor_30000 on node1.pancou.com 'unknown error' (1): call=133, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:39:29 2016', queued=0ms, exec=0ms
* mysqlstore_monitor_60000 on node2.pancou.com 'unknown error' (1): call=111, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:30:40 2016', queued=0ms, exec=0ms
[root@node1 ~]# crm status
Last updated: Sat Aug 13 02:40:46 2016 Last change: Sat Aug 13 02:41:16 2016 by root via crm_attribute on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 5 resources configured, 2 expected votes
Node node2.pancou.com: standby
Online: [ node1.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node1.pancou.com ]
Stopped: [ node2.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node1.pancou.com
mysqld (lsb:mysqld): Started node1.pancou.com
VIP (ocf::heartbeat:IPaddr2): Started node1.pancou.com
Failed Actions:
* mysqlstore_monitor_30000 on node1.pancou.com 'unknown error' (1): call=133, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:39:29 2016', queued=0ms, exec=0ms
* mysqlstore_monitor_60000 on node2.pancou.com 'unknown error' (1): call=111, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:30:40 2016', queued=0ms, exec=0ms
MariaDB [(none)]> show databases;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 4
Current database: *** NONE ***
+--------------------+
| Database |
+--------------------+
| information_schema |
| mysql |
| performance_schema |
| test |
+--------------------+
4 rows in set (0.03 sec)
[root@node1 ~]# crm node online node2.pancou.com
[root@node1 ~]# crm status
Last updated: Sat Aug 13 02:44:01 2016 Last change: Sat Aug 13 02:44:37 2016 by root via crm_attribute on node1.pancou.com
Stack: classic openais (with plugin)
Current DC: node2.pancou.com (version 1.1.14-8.el6-70404b0) - partition with quorum
2 nodes and 5 resources configured, 2 expected votes
Online: [ node1.pancou.com node2.pancou.com ]
Full list of resources:
Master/Slave Set: ms_mysqlstore [mysqlstore]
Masters: [ node1.pancou.com ]
Slaves: [ node2.pancou.com ]
mysqlfs (ocf::heartbeat:Filesystem): Started node1.pancou.com
mysqld (lsb:mysqld): Started node1.pancou.com
VIP (ocf::heartbeat:IPaddr2): Started node1.pancou.com
Failed Actions:
* mysqlstore_monitor_30000 on node1.pancou.com 'unknown error' (1): call=133, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:39:29 2016', queued=0ms, exec=0ms
* mysqlstore_monitor_60000 on node2.pancou.com 'unknown error' (1): call=111, status=Timed Out, exitreason='none',
last-rc-change='Fri Aug 12 07:30:40 2016', queued=0ms, exec=0ms
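The stale `mysqlstore_monitor` timeout entries under `Failed Actions` date from Aug 12 and are not current failures; they can be cleared so the `crm status` output stays readable (a sketch):

```shell
crm resource cleanup mysqlstore    # clear the recorded monitor failures
crm status                         # Failed Actions section should now be empty
```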
MariaDB [(none)]> select host,user,password from mysql.user;
+------------------+--------+-------------------------------------------+
| host | user | password |
+------------------+--------+-------------------------------------------+
| localhost | root | |
| node2.pancou.com | root | |
| 127.0.0.1 | root | |
| ::1 | root | |
| localhost | | |
| node2.pancou.com | | |
| % | pancou | *48D10DDED8C5DDCE0BFD4C7BB6EF82DA15159F83 |
+------------------+--------+-------------------------------------------+
7 rows in set (0.01 sec)