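1. Lab Topology
2. Topology Description
MySQL is deployed on two nodes, with the database files stored on a back-end NFS host and mounted on whichever node is active. corosync and pacemaker are installed on both nodes to make MySQL highly available, and pacemaker is configured through crmsh. When one node fails, the VIP used for front-end access moves to the other node, which then mounts the NFS-backed database storage and starts the MySQL server, so that MySQL stays available across the two nodes.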
3. Environment
Servers: CentOS 6.6 x86_64;
Database IP address (the VIP): 172.16.9.100;
The two nodes are node-02 and node-03, with IP addresses 172.16.9.82 and 172.16.9.83 respectively;
NFS server: IP address 172.16.9.84, hostname node-04;
Gateway server: also acts as the time server; gateway address 172.16.0.1
MySQL version: mariadb-5.5.43-linux-x86_64.tar.gz
corosync version: corosync-1.4.7-1.el6.x86_64
pacemaker version: pacemaker-1.1.12-4.el6.x86_64
crmsh version: crmsh-2.1-1.6.x86_64.rpm
crmsh dependency: pssh-2.3.1-2.el6.x86_64.rpm
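4. Preparation
Four things need to be in place before building a high-availability cluster:
① The clocks on the nodes must be synchronized, here using NTP;
② The nodes communicate with each other by hostname, so every hostname must resolve to an IP address;
(a) Using the hosts file for name resolution is recommended;
(b) The name used for communication must match the node's own name, that is, the name printed by "uname -n" or "hostname";
③ Decide whether a quorum (arbitration) device is needed;
④ root on each node must be able to log in to the other nodes with key-based authentication;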
1) Configure time synchronization on the nodes
Time is synchronized with the ntpdate command, run from a cron job so that the clocks are resynchronized periodically.
[root@node-02 ~]# ntpdate 172.16.0.1
 3 Jun 08:56:53 ntpdate[1655]: step time server 172.16.0.1 offset 22520.390088 sec
[root@node-02 ~]# crontab -l
*/3 * * * * /usr/sbin/ntpdate 172.16.0.1 &>/dev/null
[root@node-03 ~]# /usr/sbin/ntpdate 172.16.0.1
 3 Jun 08:57:50 ntpdate[2094]: step time server 172.16.0.1 offset 23311.837688 sec
[root@node-03 ~]# crontab -l
*/3 * * * * /usr/sbin/ntpdate 172.16.0.1 &>/dev/null
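The cron job shown above can also be installed non-interactively. The following is a minimal sketch, assuming a hypothetical helper named add_ntp_job that is not part of the original setup. It filters an existing crontab on stdin, drops any earlier ntpdate line for the same server, and appends the job once, so running it repeatedly does not create duplicates. The redirection is written as ">/dev/null 2>&1", which behaves the same under any /bin/sh.

```shell
#!/bin/sh
# add_ntp_job is a hypothetical helper, not part of the original article.
# It reads a crontab on stdin and prints it with the ntpdate job present once.
add_ntp_job() {
    server="$1"
    # Drop any existing ntpdate line for this server (simple substring match,
    # so 172.16.0.1 would also match 172.16.0.10).
    grep -v -F "/usr/sbin/ntpdate $server" || true
    # Append the job: resync every 3 minutes, discarding all output.
    printf '%s\n' "*/3 * * * * /usr/sbin/ntpdate $server >/dev/null 2>&1"
}

# Usage on each node (commented out so the sketch has no side effects):
# crontab -l 2>/dev/null | add_ntp_job 172.16.0.1 | crontab -
```

Because the function is a pure filter, its output can be inspected before it is piped back into crontab.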
2) Nodes communicate by hostname, configured in /etc/hosts
[root@node-02 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.0.1 server.magelinux.com server
172.16.9.82 node-02 node2
172.16.9.83 node-03 node3
[root@node-03 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.0.1 server.magelinux.com server
172.16.9.82 node-02 node2
172.16.9.83 node-03 node3
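Since the cluster depends on every node name resolving identically on both nodes, it can help to check the hosts file mechanically. The sketch below is one way to do it. The helper names hosts_has_name and check_cluster_names are made up for illustration, and the check only looks at whole-line comments, not trailing ones.

```shell
#!/bin/sh
# hosts_has_name FILE NAME is a hypothetical helper, not part of the article.
# It succeeds if NAME appears as a hostname or alias on a non-comment line.
hosts_has_name() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }                       # skip comment lines
        { for (i = 2; i <= NF; i++)               # field 1 is the IP address
              if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# check_cluster_names FILE NAME... reports every name that FILE lacks.
check_cluster_names() {
    file="$1"; shift
    missing=0
    for n in "$@"; do
        hosts_has_name "$file" "$n" || { echo "missing: $n"; missing=1; }
    done
    return "$missing"
}

# Usage on a node: the local node name must be among the names that resolve.
# check_cluster_names /etc/hosts "$(uname -n)" node-02 node-03
```

Including "$(uname -n)" in the list covers the requirement that the name used for communication matches the node's own name.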
3) Key-based authentication for root between the nodes
[root@node-02 ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
93:c1:f9:42:63:cd:2c:98:b3:a9:ef:7e:02:24:db:f2 root@node-02
The key's randomart image is:
+--[ RSA 2048]----+
| |
| + = |
| + O + |
| .. * * |
| = o S . |
| oo. o |
| o.. |
| E.. . |
| o+o |
+-----------------+
[root@node-02 ~]# ssh-copy-id -i .ssh/id_rsa.pub node3
Warning: Permanently added 'node3' (RSA) to the list of known hosts.
root@node3's password:
Now try logging into the machine, with "ssh 'node3'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
[root@node-03 ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
b7:b1:3b:ca:78:9c:bc:72:0e:a0:18:a6:8d:ac:d0:99 root@node-03
The key's randomart image is:
+--[ RSA 2048]----+
| |
| |
| |
| |
|.. . S o |
|+* + . . + |
|=.E .o .o |
|o .+* .. |
|. .==o.. |
+-----------------+
[root@node-03 ~]#
[root@node-03 ~]# ssh-copy-id -i .ssh/id_rsa.pub node2
root@node2's password:
Now try logging into the machine, with "ssh 'node2'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
Test that the nodes can log in to each other with their keys
[root@node-02 ~]# ssh node3
Last login: Wed Jun 3 02:28:06 2015 from 172.16.9.9
[root@node-03 ~]# exit
logout
Connection to node3 closed.
[root@node-02 ~]#
[root@node-03 ~]# ssh node2
Last login: Wed Jun 3 02:41:17 2015 from 172.16.9.9
[root@node-02 ~]# exit
logout
Connection to node2 closed.
[root@node-03 ~]#
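5. Configure the NFS Shared Storage
5.1 Configure NFS
[root@node-04 ~]# mkdir /mydata/data -p
[root@node-04 ~]# cat /etc/exports
/web/htdoc 172.16.0.0/16(rw)
/mydata/data 172.16.0.0/16(rw,no_root_squash)
# Note: once the database has been installed, it is advisable to remove the no_root_squash option. Leaving it enabled is too dangerous, because root on any client is then treated as root on the export.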
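Removing no_root_squash later amounts to editing /etc/exports and re-exporting. The sketch below shows one way to do that, assuming a hypothetical filter named squash_root. It replaces the option with an explicit root_squash (the NFS default) rather than deleting it, which keeps the option list well-formed. After the change, root on the cluster nodes is mapped to the anonymous user on this export, so it is worth confirming that mysqld still starts from both nodes.

```shell
#!/bin/sh
# squash_root is a hypothetical helper, not part of the original article.
# It rewrites exports(5) text on stdin, turning no_root_squash into root_squash.
squash_root() {
    sed 's/no_root_squash/root_squash/g'
}

# Usage on the NFS server (commented out so the sketch has no side effects):
# cp /etc/exports /etc/exports.bak
# squash_root < /etc/exports.bak > /etc/exports
# exportfs -ra    # re-export everything listed in /etc/exports
```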
5.2 Set the owner and group of the shared directory /mydata/data
[root@node-04 ~]# userdel -r mysql
[root@node-04 ~]# useradd -r -u 336 mysql
[root@node-04 ~]# id mysql
uid=336(mysql) gid=336(mysql) groups=336(mysql)
[root@node-04 ~]# chown -R mysql.mysql /mydata/data/
5.3 Start the NFS service
[root@node-04 ~]# service rpcbind start
[root@node-04 ~]# service nfs start
Starting NFS services: [ OK ]
Starting NFS quotas: [ OK ]
Starting NFS mountd: [ OK ]
Starting NFS daemon: [ OK ]
Starting RPC idmapd: [ OK ]
5.4 List the NFS exports
[root@node-04 ~]# showmount -e 172.16.9.84
Export list for 172.16.9.84:
/mydata/data 172.16.0.0/16
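6. Install and Deploy the MySQL Database
The MySQL server used here is MariaDB. The database only needs to be initialized on one of the two nodes, because both nodes share the same database files. Here the installation and initialization are done on node2. On node3 every step is the same except that the database is not initialized, so those steps are not repeated here.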
6.1 Create the user that MariaDB runs as
[root@node-02 ~]# useradd -r -u 336 mysql
[root@node-02 ~]# id mysql
uid=336(mysql) gid=336(mysql) groups=336(mysql)
6.2 Mount the shared NFS database directory
[root@node-02 ~]# mkdir /data
[root@node-02 ~]# mount -t nfs 172.16.9.84:/mydata/data /data
[root@node-02 ~]# mount |tail -1
172.16.9.84:/mydata/data on /data type nfs (rw,vers=4,addr=172.16.9.84,clientaddr=172.16.9.82)
6.3 Unpack the MariaDB package into /usr/local
[root@node-2 tools]# tar xf mariadb-5.5.43-linux-x86_64.tar.gz -C /usr/local/
6.4 Create a symbolic link
[root@node-2 tools]# cd /usr/local/
[root@node-2 local]# ln -s mariadb-5.5.43-linux-x86_64/ mysql
6.5 Initialize the database
[root@node-2 local]# cd mysql
[root@node-2 mysql]# chown -R root.mysql ./*
[root@node-2 mysql]# scripts/mysql_install_db --datadir=/data --user=mysql
6.6 Provide the main MySQL configuration file
[root@node-2 mysql]# mkdir /etc/mysql
[root@node-2 mysql]# cp support-files/my-large.cnf /etc/mysql/my.cnf
6.7 Edit /etc/mysql/my.cnf
In /etc/mysql/my.cnf, add the data directory and the following options to the [mysqld] section:
datadir = /data
innodb_file_per_table = on
skip_name_resolve = on
6.8 Provide an init script for MySQL
[root@node-2 mysql]# cp support-files/mysql.server /etc/rc.d/init.d/mysqld
[root@node-2 mysql]# chmod +x /etc/rc.d/init.d/mysqld
[root@node-2 mysql]# chkconfig --add mysqld
[root@node-2 mysql]# chkconfig mysqld off
6.9 Start MariaDB and test it
[root@node-02 mysql]# service mysqld start
Starting MySQL.... [ OK ]
[root@node-02 mysql]# bin/mysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 2
Server version: 5.5.43-MariaDB-log MariaDB Server
Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> create database node2;
Query OK, 1 row affected (0.03 sec)
MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.00 sec)
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| node2              |
| performance_schema |
| test               |
+--------------------+
5 rows in set (0.01 sec)
MariaDB [(none)]> exit
Bye
6.10 Stop MySQL and unmount the NFS share from /data
[root@node-02 mysql]# service mysqld stop
Shutting down MySQL. [ OK ]
[root@node-02 mysql]# umount /data
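7. Install and Configure the HA Software
7.1 Install corosync and pacemaker
With the yum repositories configured, install corosync and pacemaker directly by running yum on each of the two nodes: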
yum install corosync pacemaker -y
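7.2 The default corosync configuration file explained
After installation, corosync's configuration files live in /etc/corosync, and its init script is /etc/init.d/corosync.
[root@node-02 ~]# cd /etc/corosync/
[root@node-02 corosync]# ls
corosync.conf.example corosync.conf.example.udpu service.d uidgid.d
[root@node-02 corosync]# cp corosync.conf.example corosync.conf
[root@node-02 corosync]# egrep -v "#|^$" corosync.conf
compatibility: whitetank    # whether to stay compatible with whitetank
totem {    # defines how the underlying messaging layer communicates
    version: 2    # configuration version
    secauth: off    # whether to enable authentication; if enabled, generate a key with corosync-keygen
    threads: 0    # threads used to encrypt and send multicast messages; 0 disables threaded sending
    interface {    # which network, multicast address and port are used for multicast communication
        ringnumber: 0    # ring number of this interface, used to tell rings apart when redundant rings are configured
        bindnetaddr: 192.168.1.0    # network address to bind to
        mcastaddr: 239.255.1.1    # multicast address
        mcastport: 5405    # multicast port
        ttl: 1    # TTL of the multicast packets
    }
}
logging {    # logging settings
    fileline: off    # whether to print the source file and line in log messages
    to_stderr: no    # whether to send log output to standard error, i.e. the screen
    to_logfile: yes    # write log messages to the log file
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes    # whether to also send log messages to syslog
    debug: off
    timestamp: on    # whether to timestamp log entries; the author suggests leaving this off
    logger_subsys {    # per-subsystem logging settings
        subsys: AMF
        debug: off
    }
}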
7.3 Configure pacemaker
pacemaker can run alongside corosync in two ways: as a corosync plugin, or as a standalone daemon. On CentOS 6 running it as a plugin is recommended. This may produce warnings in the log, which can be ignored. Append the following to corosync.conf:
service {
    ver: 0
    name: pacemaker
    use_mgmtd: yes
}
aisexec {
    user: root
    group: root
}
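7.4 Generate the key file for corosync; corosync-keygen reads 1024 bits of random data from /dev/random
[root@node-02 corosync]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Press keys on your keyboard to generate entropy (bits = 176).
# At this point the command stalls because the entropy pool does not hold enough random bits. You can open another terminal and keep typing, but that takes a while. Alternatively, download a large file over FTP, which generates a lot of I/O and refills the pool faster.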
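Whether corosync-keygen is likely to block can be checked beforehand, because the Linux kernel exposes its entropy estimate in /proc/sys/kernel/random/entropy_avail. The following is a minimal sketch with two hypothetical helpers, entropy_bits and has_entropy. The optional file argument exists only so the functions can be exercised without touching /proc.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original article.
# entropy_bits [FILE] prints the kernel's estimate of available entropy, in bits.
entropy_bits() {
    cat "${1:-/proc/sys/kernel/random/entropy_avail}"
}

# has_entropy NEEDED [FILE] succeeds if at least NEEDED bits are available.
has_entropy() {
    [ "$(entropy_bits "$2")" -ge "$1" ]
}

# Usage before running corosync-keygen:
# has_entropy 1024 || echo "only $(entropy_bits) bits available; corosync-keygen will probably block"
```

Running entropy_bits repeatedly in a second terminal also shows whether the extra I/O is actually refilling the pool.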
7.5 Final corosync + pacemaker configuration file
[root@node-02 corosync]# egrep -v "#|^$" corosync.conf
compatibility: whitetank
totem {
    version: 2
    secauth: on
    threads: 0
    interface {
        ringnumber: 0
        bindnetaddr: 172.16.0.0
        mcastaddr: 239.255.9.9
        mcastport: 5405
        ttl: 1
    }
}
logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: no
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}
service {
    ver: 0
    name: pacemaker
    use_mgmtd: yes
}
aisexec {
    user: root
    group: root
}
7.6 Copy the configuration file and key file to node3
[root@node-02 corosync]# scp authkey corosync.conf node3:/etc/corosync/
authkey 100% 128 0.1KB/s 00:00
corosync.conf 100% 2794 2.7KB/s 00:00
7.7 Start the corosync service
[root@node-02 corosync]# service corosync start; ssh node3 'service corosync start'
Starting Corosync Cluster Engine (corosync): [ OK ]
Starting Corosync Cluster Engine (corosync): [ OK ]
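7.8 Install crmsh
Install the prepared packages with yum so that dependencies are resolved automatically. In production crmsh only needs to be installed on one node. Here it is installed on both nodes to make testing easier.
[root@node-02 ~]# yum install crmsh-2.1-1.6.x86_64.rpm pssh-2.3.1-2.el6.x86_64.rpm -y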
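8. Configure the Highly Available MySQL Service
8.1 Initial configuration
[root@node-02 ~]# crm    # enter the crm shell
crm(live)# configure    # switch to configuration mode
crm(live)configure# property stonith-enabled=false    # disable STONITH, because this setup has no STONITH device
crm(live)configure# property no-quorum-policy=ignore    # keep resources running when the partition loses quorum (a lone surviving node in a two-node cluster never has quorum); the default is stop
crm(live)configure# verify    # check the configuration
crm(live)configure# commit    # commit; the change is saved and takes effect immediately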
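8.2 Configure the VIP resource
crm(live)configure# primitive mysqlip ocf:heartbeat:IPaddr \
params ip=172.16.9.100 nic=eth0 cidr_netmask=16 \
op monitor interval=10s timeout=20s
crm(live)configure# verify
# primitive : defines a primitive (basic) resource
# mysqlip : the resource name, here the VIP
# ocf:heartbeat:IPaddr : the IPaddr agent from the heartbeat provider of the ocf class, used to configure an IP address
# params : parameters passed to ocf:heartbeat:IPaddr
# ip=172.16.9.100 : sets the IP address to 172.16.9.100
# nic : the network interface that carries the VIP; optional
# cidr_netmask=16 : the netmask in CIDR notation
# op : operations defined for this resource
# monitor : the monitoring operation
# interval : how often the monitor runs
# timeout : how long each monitor run may take before it times out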
8.3 Configure the NFS mount resource
crm(live)configure# primitive mysqlnfs ocf:heartbeat:Filesystem \
params device="172.16.9.84:/mydata/data" directory="/data" fstype=nfs \
op monitor interval=20s timeout=40s op start timeout=60s op stop timeout=60s
crm(live)configure# verify
# ocf:heartbeat:Filesystem : the Filesystem agent from the heartbeat provider of the ocf class
# device="172.16.9.84:/mydata/data" : the device to mount
# directory="/data" : the mount point
# fstype=nfs : the filesystem type
8.4 Configure the MySQL service resource
crm(live)configure# primitive mysqlserver lsb:mysqld op monitor interval=20s timeout=40s
crm(live)configure# verify
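8.5 Define the start order of the resources
The VIP must be up before the NFS filesystem is mounted, and MySQL can start only after the mount succeeds. A group is used here to define this order: the members of a group run on the same node and start in the order in which they are listed.
crm(live)configure# group mysqlservice mysqlip mysqlnfs mysqlserver
crm(live)configure# verify
crm(live)configure# commit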
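For comparison, the same policy can be written out as explicit colocation and order constraints instead of a group. The sketch below only prints the statements. The helper name print_explicit_constraints and the constraint IDs are made up for illustration, the syntax is what crmsh 2.x is assumed to accept, and the constraints are an alternative to the group, not something to add on top of it.

```shell
#!/bin/sh
# print_explicit_constraints is a hypothetical helper; the constraint IDs are
# illustrative. It prints crm configure statements and changes nothing itself.
print_explicit_constraints() {
    cat <<'EOF'
colocation nfs_with_ip inf: mysqlnfs mysqlip
colocation server_with_nfs inf: mysqlserver mysqlnfs
order ip_before_nfs Mandatory: mysqlip mysqlnfs
order nfs_before_server Mandatory: mysqlnfs mysqlserver
EOF
}

# Usage (instead of the group; review the file before loading it):
# print_explicit_constraints > /tmp/mysql-constraints.crm
# crm configure load update /tmp/mysql-constraints.crm
```

The group is the shorter form and is usually preferable when the resources form a simple chain, as they do here.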
9. Test the Resources
9.1 Check the running state of the cluster resources
[root@node-02 ~]# crm status
Last updated: Wed Jun 3 11:18:21 2015
Last change: Wed Jun 3 11:15:35 2015
Stack: classic openais (with plugin)
Current DC: node-02 - partition with quorum    # the current DC, and whether the partition has quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes    # number of nodes and number of votes
3 Resources configured    # number of configured resources
Online: [ node-02 node-03 ]    # nodes that are online
Resource Group: mysqlservice
mysqlip (ocf::heartbeat:IPaddr): Started node-02
mysqlnfs (ocf::heartbeat:Filesystem): Started node-02
mysqlserver (lsb:mysqld): Started node-02
9.2 Test the MySQL service
[root@node-02 ~]# /usr/local/mysql/bin/mysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 2
Server version: 5.5.43-MariaDB-log MariaDB Server
Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> create database testnode2;    # create the test database testnode2
Query OK, 1 row affected (0.02 sec)
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| node2              |
| performance_schema |
| test               |
| testnode2          |
+--------------------+
6 rows in set (0.13 sec)
9.3 Put node2 into standby
Putting node2 into standby moves the resources to node3. When node2 is then brought back online, the resources do not move back to it, because neither a location preference for the resources nor any colocation constraint has been configured.
[root@node-02 ~]# crm node standby
[root@node-02 ~]# crm status
Last updated: Wed Jun 3 11:25:28 2015
Last change: Wed Jun 3 11:25:21 2015
Stack: classic openais (with plugin)
Current DC: node-02 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured
Node node-02: standby    # node2 is now in standby
Online: [ node-03 ]
Resource Group: mysqlservice    # all resources have moved to node3
mysqlip (ocf::heartbeat:IPaddr): Started node-03
mysqlnfs (ocf::heartbeat:Filesystem): Started node-03
mysqlserver (lsb:mysqld): Started node-03
Bring node2 back online. The status shows that the resources have not moved back to node2.
[root@node-02 ~]# crm node online
[root@node-02 ~]# crm status
Last updated: Wed Jun 3 11:27:13 2015
Last change: Wed Jun 3 11:27:10 2015
Stack: classic openais (with plugin)
Current DC: node-02 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured
Online: [ node-02 node-03 ]
Resource Group: mysqlservice
mysqlip (ocf::heartbeat:IPaddr): Started node-03
mysqlnfs (ocf::heartbeat:Filesystem): Started node-03
mysqlserver (lsb:mysqld): Started node-03
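If the resources should return to node-02 whenever it is available, a location constraint can express that preference. The sketch below only prints the statement. The helper name location_pref, the constraint ID scheme and the score of 100 are illustrative assumptions. Automatic failback also means one more service interruption each time the preferred node returns, so leaving it out, as this setup does, is a reasonable choice.

```shell
#!/bin/sh
# location_pref RESOURCE NODE [SCORE] prints a crm location constraint.
# Hypothetical helper; the ID scheme and the default score are illustrative.
location_pref() {
    rsc="$1"; node="$2"; score="${3:-100}"
    printf 'location %s_prefers_%s %s %s: %s\n' "$rsc" "$node" "$rsc" "$score" "$node"
}

# Usage (review the file before loading it):
# location_pref mysqlservice node-02 > /tmp/mysql-location.crm
# crm configure load update /tmp/mysql-location.crm
```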
9.4 Test the MySQL service on node3
[root@node-03 ~]# /usr/local/mysql/bin/mysql
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 2
Server version: 5.5.43-MariaDB-log MariaDB Server
Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| node2              |
| performance_schema |
| test               |
| testnode2          |
+--------------------+
6 rows in set (0.04 sec)
MariaDB [(none)]> create database testnode3;
Query OK, 1 row affected (0.07 sec)
MariaDB [(none)]> flush privileges;
Query OK, 0 rows affected (0.01 sec)
MariaDB [(none)]>
The tests above show that MySQL is highly available across node2 and node3: the databases created on node2 were not lost when the storage was mounted on node3.
9.5 Simulate the database service being shut down by accident
Here all mysqld processes are killed with killall to check whether the HA stack starts the MySQL service again.
[root@node-03 ~]# ss -tanp|grep ":3306"    # check the MySQL process
LISTEN 0 50 *:3306 *:* users:(("mysqld",5781,15))
[root@node-03 ~]# killall mysqld    # kill all MySQL processes
[root@node-03 ~]# killall mysqld
mysqld: no process killed
[root@node-03 ~]# ss -tanp|grep ":3306"
[root@node-03 ~]# ss -tanp|grep ":3306"
[root@node-03 ~]# ss -tanp|grep ":3306"
[root@node-03 ~]# crm status    # check the cluster status; the MySQL resource has stopped
Last updated: Wed Jun 3 11:41:45 2015
Last change: Wed Jun 3 11:41:24 2015
Stack: classic openais (with plugin)
Current DC: node-02 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured
Online: [ node-02 node-03 ]
Resource Group: mysqlservice
mysqlip (ocf::heartbeat:IPaddr): Started node-03
mysqlnfs (ocf::heartbeat:Filesystem): Started node-03
mysqlserver (lsb:mysqld): Stopped
[root@node-03 ~]# ss -tanp|grep ":3306"
[root@node-03 ~]# ss -tanp|grep ":3306"
[root@node-03 ~]# ss -tanp|grep ":3306"
[root@node-03 ~]# ss -tanp|grep ":3306"
[root@node-03 ~]# ss -tanp|grep ":3306"    # after a few seconds the cluster starts MySQL again automatically
LISTEN 0 50 *:3306 *:* users:(("mysqld",11361,15))
[root@node-03 ~]# crm status
Last updated: Wed Jun 3 11:37:36 2015
Last change: Wed Jun 3 11:27:10 2015
Stack: classic openais (with plugin)
Current DC: node-02 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
3 Resources configured
Online: [ node-02 node-03 ]
Resource Group: mysqlservice
mysqlip (ocf::heartbeat:IPaddr): Started node-03
mysqlnfs (ocf::heartbeat:Filesystem): Started node-03
mysqlserver (lsb:mysqld): Started node-03
[root@node-03 ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.5.43-MariaDB-log MariaDB Server
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| node2              |
| performance_schema |
| test               |
| testnode2          |
| testnode3          |
+--------------------+
7 rows in set (0.02 sec)
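The manual polling with ss in the transcript above can be scripted. The following is a minimal sketch with two hypothetical helpers, wait_until and mysql_listening. The retry budget of 60 in the usage line is an arbitrary choice. With the monitor interval of 20s configured earlier, the cluster would be expected to notice the failure within roughly one interval.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original article.
# wait_until TRIES CMD... runs CMD once per second until it succeeds,
# giving up after TRIES attempts.
wait_until() {
    tries="$1"; shift
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep 1
    done
    return 1
}

# Succeeds while something is listening on TCP port 3306.
mysql_listening() {
    ss -tln | grep -q ':3306 '
}

# Usage on the active node: kill mysqld, then wait for the cluster to restart it.
# killall mysqld; wait_until 60 mysql_listening && echo "mysqld is back"
```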
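Summary:
That completes a simple highly available MySQL setup. The NFS server remains a single point of failure for the cluster nodes, and the concurrent load NFS can handle is limited, so the architecture as a whole is not fully highly available and there is still a lot to improve.
Comments and suggestions from readers are very welcome; Xiaowu looks forward to hearing from you.
At an age for striving, never let yourself slack off!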