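MHA (Master HA) monitors the master node and automatically fails over to one of the slaves by promoting that slave to be the new master. It is built on master-slave replication and also needs cooperation from the client side. MHA currently targets a one-master, multi-slave topology. A replication cluster managed by MHA needs at least three database servers, one master and two slaves: one acts as the master, one as the standby master, and one as a plain slave. If the budget allows, a dedicated server can also be used as the MHA monitoring and management node.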

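How MHA works

  • 1 Save the binary log events (binlog events) from the crashed master
  • 2 Identify the slave with the most recent updates
  • 3 Apply the differential relay log to the other slaves
  • 4 Apply the binary log events saved from the master
  • 5 Promote one slave to be the new master
  • 6 Point the other slaves at the new master and resume replication
  • Note: MHA relies on SSH with key-based login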
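Steps 2 and 3 above can be illustrated with a small sketch. This is not MHA's own code; the `Slave` class, the helper names and the position 410 are assumptions made up for the example. It only shows the idea of comparing how far each slave has read the master's binlog.

```python
from dataclasses import dataclass


@dataclass
class Slave:
    name: str
    log_file: str  # master binlog file the slave has read up to
    log_pos: int   # position inside that file


def latest_slave(slaves):
    """Return the slave that has received the most events from the master."""
    # Binlog names end in a zero-padded sequence number, so comparing
    # (file, position) tuples orders slaves by how far they have read.
    # This assumes all names share the same prefix and padding width.
    return max(slaves, key=lambda s: (s.log_file, s.log_pos))


def is_behind(slave, latest):
    """True if the slave still needs the relay-log difference applied."""
    return (slave.log_file, slave.log_pos) < (latest.log_file, latest.log_pos)


slaves = [
    Slave("S1", "mariadb-bin.000001", 557),
    Slave("S2", "mariadb-bin.000001", 410),
]
latest = latest_slave(slaves)
behind = [s.name for s in slaves if is_behind(s, latest)]
print(latest.name, behind)
```

In this sample S1 is the most advanced slave, so S2 is the one that would receive the differential relay log before a new master is promoted.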

Packages

  • MHA monitoring server: mha4mysql-manager-0.55-1.el5.noarch, mha4mysql-node-0.54-1.el5.noarch
  • All other servers in the replication cluster: mha4mysql-node-0.54-1.el5.noarch

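Lab environment

  • Four CentOS 7 hosts: one runs the MHA management server, the other three form a one-master, two-slave cluster
  • To make the steps easier to follow, the servers are referred to by short names: MHA server: MHA, master server: M1, slave servers: S1, S2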

First, download the MHA packages from the project site. Note: this requires ×××

  • Project site: https://code.google.com/archive/p/mysql-master-ha/
  • mha4mysql-manager-0.55-1.el5.noarch
  • mha4mysql-node-0.54-1.el5.noarch

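I. Configure the master-slave replication cluster

(1) Configure the master server

1. Edit the configuration file

    vim /etc/my.cnf
    Add the following settings to the [mysqld] block
    [mysqld]
    log-bin                 # enable the binary log
    server_id=1             # ID that must be unique among all nodes
    innodb_file_per_table   # store each InnoDB table in its own tablespace file
    skip_name_resolve=1     # skip DNS lookups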

2. Install the MHA package downloaded earlier on M1: mha4mysql-node-0.54-1.el5.noarch

    yum install mha4mysql-node-0.54-1.el5.noarch.rpm

3. Start the mariadb service

    systemctl start mariadb
    systemctl enable mariadb

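4. Run the database security script

    "mysql_secure_installation"

        Prompt 1: enter the current root password; just press Enter, since none is set yet
        Prompt 2: set a root password?
        Prompt 3: remove anonymous users?
        Prompt 4: disallow remote root login?
        Prompt 5: remove the test database?
        Prompt 6: reload the privilege tables now?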

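5. Create a user with replication privileges, and a user that the MHA server uses to manage the databases; the latter must have all privileges. The passwords chosen here must match password and repl_password in the MHA configuration file.

    1. Create the user with replication privileges
        GRANT REPLICATION SLAVE  ON *.* TO 'repluser'@'HOST' IDENTIFIED BY 'replpass'; 

            Explanation:
                'repluser'@'HOST': the user name and the host IP or network it may log in from; use % as a wildcard for a network, e.g. 10.0.0.%
                IDENTIFIED BY: sets the password
                *.*: all databases, all tables
                GRANT REPLICATION SLAVE: allows this user to replicate data

            This statement authorizes repluser to replicate the contents of every database
    2. Create the MHA management user
        GRANT ALL ON *.* TO 'mha'@'HOST' IDENTIFIED BY 'replpass';  

            Explanation:
                'mha'@'HOST': the user name and the host IP or network it may log in from; use % as a wildcard for a network, e.g. 10.0.0.%
                IDENTIFIED BY: sets the password
                *.*: all databases, all tables
                GRANT ALL: gives this user all privileges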

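(2) Configure the S1 server

1. Edit the configuration file

    vim /etc/my.cnf
    server_id=2
    read_only                   # ordinary users can only read; users with SUPER are not restricted
    log-bin
    relay_log_purge=0           # do not purge relay logs automatically
    skip_name_resolve=1
    innodb_file_per_table
    "Note on log-bin: in ordinary MySQL master-slave replication a slave does not need the binary log.
    It is enabled here because, when the original master goes down,
    MHA promotes the slave whose data is closest to the master's to be the new master, and a master must have the binary log enabled"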
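Before starting the service, the slave settings above can be sanity-checked. The following is a minimal sketch, assuming the settings sit in a `[mysqld]` block; the function name and the list of required options are illustrative and only cover the options named in this section.

```python
import configparser

REQUIRED_FLAGS = ("log-bin", "read_only")    # options written without a value
REQUIRED_VALUES = {"relay_log_purge": "0"}   # options that need a specific value


def check_slave_cnf(text):
    """Return a list of problems found in the [mysqld] block of a my.cnf text."""
    parser = configparser.ConfigParser(
        allow_no_value=True,             # accept bare options such as log-bin
        inline_comment_prefixes=("#",),  # ignore trailing "# ..." annotations
    )
    parser.read_string(text)
    if not parser.has_section("mysqld"):
        return ["missing [mysqld] section"]
    section = parser["mysqld"]
    problems = []
    # Names are matched literally; the server itself also accepts log_bin.
    for flag in REQUIRED_FLAGS:
        if flag not in section:
            problems.append(f"{flag} is not set")
    for key, wanted in REQUIRED_VALUES.items():
        if section.get(key) != wanted:
            problems.append(f"{key} should be {wanted}")
    if not (section.get("server_id") or "").isdigit():
        problems.append("server_id must be a number")
    return problems


sample = """
[mysqld]
server_id=2
read_only
log-bin
relay_log_purge=0
skip_name_resolve=1
innodb_file_per_table
"""
print(check_slave_cnf(sample))
```

An empty list means every option the sketch looks for is present; it does not verify that `server_id` is unique across nodes, which still has to be checked by hand.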

2. Install the MHA package downloaded earlier on S1: mha4mysql-node-0.54-1.el5.noarch

    yum install mha4mysql-node-0.54-1.el5.noarch.rpm

3. Start the mariadb service

    systemctl start mariadb
    systemctl enable mariadb

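4. Run the database security script

    "mysql_secure_installation"

        Prompt 1: enter the current root password; just press Enter, since none is set yet
        Prompt 2: set a root password?
        Prompt 3: remove anonymous users?
        Prompt 4: disallow remote root login?
        Prompt 5: remove the test database?
        Prompt 6: reload the privilege tables now?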

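5. Connect to the master with the replication account and start the replication threads

    1. Connect to the master with the replication account
        CHANGE MASTER TO 
             MASTER_HOST='master_host',         # IP of the master host
             MASTER_USER='repluser',            # user authorized on the master
             MASTER_PASSWORD='replpass',        # password of that user
             MASTER_LOG_FILE='mysql-bin.xxxxx', # binary log file on the master to start replicating from
             MASTER_LOG_POS=N;                  # N is the position in that log; run show master logs; on the master to find it

    2. Start the replication threads IO_THREAD and SQL_THREAD
         START SLAVE; 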
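The log file and position can be read off the `show master logs` table by hand. As an illustration, the sketch below takes the last row of such a table and fills in the statement. The helper names are hypothetical and the sample table is made up for the example; it also assumes the slave should start from the end of the master's current binlog, which only suits a master with no earlier data to carry over.

```python
def last_binlog(table_text):
    """Return (log_name, file_size) from the last row of SHOW MASTER LOGS output."""
    rows = []
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Data rows have two cells and a numeric size; borders and the header do not.
        if len(cells) == 2 and cells[1].isdigit():
            rows.append((cells[0], int(cells[1])))
    if not rows:
        raise ValueError("no binlog rows found")
    return rows[-1]


def change_master_sql(host, user, password, log_file, log_pos):
    """Build the CHANGE MASTER TO statement for a slave."""
    return (
        "CHANGE MASTER TO\n"
        f"  MASTER_HOST='{host}',\n"
        f"  MASTER_USER='{user}',\n"
        f"  MASTER_PASSWORD='{password}',\n"
        f"  MASTER_LOG_FILE='{log_file}',\n"
        f"  MASTER_LOG_POS={log_pos};"
    )


sample = """
+--------------------+-----------+
| Log_name           | File_size |
+--------------------+-----------+
| mariadb-bin.000001 |       557 |
+--------------------+-----------+
"""
log_file, log_pos = last_binlog(sample)
sql = change_master_sql("192.168.68.17", "repluser", "replpass", log_file, log_pos)
print(sql)
```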

6. Check the replication thread status on the slave

    MariaDB [(none)]> show slave status\G
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: 192.168.68.17
                      Master_User: repluser
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: mariadb-bin.000001
              Read_Master_Log_Pos: 557
                   Relay_Log_File: mariadb-relay-bin.000002
                    Relay_Log_Pos: 843
            Relay_Master_Log_File: mariadb-bin.000001
                 Slave_IO_Running: Yes  "Key field: No means the thread is not running"
                Slave_SQL_Running: Yes "Key field: No means the thread is not running"
                  Replicate_Do_DB: 
              Replicate_Ignore_DB: 
               Replicate_Do_Table: 
           Replicate_Ignore_Table: 
          Replicate_Wild_Do_Table: 
      Replicate_Wild_Ignore_Table: 
                       Last_Errno: 0
                       Last_Error: 
                     Skip_Counter: 0
              Exec_Master_Log_Pos: 557
                  Relay_Log_Space: 1139
                  Until_Condition: None
                   Until_Log_File: 
                    Until_Log_Pos: 0
               Master_SSL_Allowed: No
               Master_SSL_CA_File: 
               Master_SSL_CA_Path: 
                  Master_SSL_Cert: 
                Master_SSL_Cipher: 
                   Master_SSL_Key: 
            Seconds_Behind_Master: 0 "Replication lag in seconds; 0 means the slave is caught up"
    Master_SSL_Verify_Server_Cert: No
                    Last_IO_Errno: 0
                    Last_IO_Error: 
                   Last_SQL_Errno: 0
                   Last_SQL_Error: 
      Replicate_Ignore_Server_Ids: 
                 Master_Server_Id: 1
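The three fields flagged in the output are the ones worth checking automatically. The following sketch parses the `\G` output format into a dictionary and reports thread state and lag; the function names are illustrative, and the sample text contains only the relevant fields.

```python
def parse_slave_status(text):
    """Turn 'Key: value' lines from SHOW SLAVE STATUS\\G into a dict."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skips the row separator and blank lines
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def replication_health(fields):
    """Return (both_threads_running, lag_in_seconds_or_None)."""
    running = (
        fields.get("Slave_IO_Running") == "Yes"
        and fields.get("Slave_SQL_Running") == "Yes"
    )
    lag = fields.get("Seconds_Behind_Master", "")
    # The server reports NULL instead of a number when replication is stopped.
    return running, int(lag) if lag.isdigit() else None


sample = """
*************************** 1. row ***************************
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
            Seconds_Behind_Master: 0
"""
print(replication_health(parse_slave_status(sample)))
```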

(3) Repeat the same steps from S1 on S2 to complete the S2 configuration

Replication test

    1. Create a database on M1
    MariaDB [(none)]> create database a1;
    Query OK, 1 row affected (0.00 sec)

    M1 [(none)]> show databases;
    +--------------------+
    | Database           |
    +--------------------+
    | information_schema |
    | a1                 |
    | mysql              |
    | performance_schema |
    | test               |
    +--------------------+
    5 rows in set (0.00 sec)

    2. Check on S1 and S2 that the database was replicated.
    S1 [(none)]> show databases;
    +--------------------+
    | Database           |
    +--------------------+
    | information_schema |
    | a1                 |
    | mysql              |
    | performance_schema |
    | test               |
    +--------------------+
    5 rows in set (0.01 sec)

    S2 [(none)]> show databases;
    +--------------------+
    | Database           |
    +--------------------+
    | information_schema |
    | a1                 |
    | mysql              |
    | performance_schema |
    | test               |
    +--------------------+
    5 rows in set (0.01 sec)

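(4) Configure the MHA server

1. Set up key-based SSH login between all hosts. Note: this method is only suitable on a LAN

①. Generate a key pair on any one host, for example on M1

    ssh-keygen
        This creates a .ssh/ directory on the host
            ls .ssh/
            id_rsa  id_rsa.pub 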

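②. Then authorize the public key on M1 itself

    ssh-copy-id  M1_IP
        The local .ssh directory now contains two more files
            ls .ssh/
            authorized_keys  id_rsa  id_rsa.pub  known_hosts

③. Copy the whole .ssh directory to all the other hosts

    scp -rp ~/.ssh MHA_IP:~/
    scp -rp ~/.ssh S1_IP:~/
    scp -rp ~/.ssh S2_IP:~/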

④. Test the SSH connection; logging in without a password prompt means it works

    [root@MHA ~]# ssh 192.168.68.7
    Last login: Fri Mar 30 13:21:52 2018 from 192.168.68.1
    [root@master ~]#

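Why this works: every host uses the same key pair, so to SSH the hosts all look like one and the same machine. The .ssh directory must therefore never leak; anyone who obtains it can log in to every host without a password, which is why this method should only be used on a LAN. To do the same over a public network, generate a key pair on each host and copy each public key to every other host, or copy them all to one host and then distribute that host's .ssh/authorized_keys file to all the others.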

2. Install two packages:

  • mha4mysql-manager-0.55-1.el5.noarch.rpm
  • mha4mysql-node-0.54-1.el5.noarch.rpm

Note: the EPEL repository must be configured; the Aliyun EPEL mirror works

   yum install mha4mysql-manager-0.55-1.el5.noarch.rpm mha4mysql-node-0.54-1.el5.noarch.rpm 

3. Create a directory for the MHA configuration

    mkdir -pv /etc/mha/

4. Create a management configuration file in that directory; the file name is arbitrary.

    vim /etc/mha/app1.cnf

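# Add the following entries (the trailing # annotations are explanations only; leave them out of the real file, since a trailing comment may be read as part of the value)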

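    [server default]                                          # defaults for all nodes
    user=mha                                                  # MHA user (the MySQL user created to manage the databases)
    password=123123                                           # its password
    manager_workdir=/data/mha/test/                           # working directory on the manager
    manager_log=/data/mha/test/manager.log                    # log file
    remote_workdir=/data/mha/test/                            # working directory on each node
    ssh_user=root                                             # SSH user
    repl_user=repluser                                        # replication user (the MySQL user created for replication)
    repl_password=123123                                      # its password
    ping_interval=1                                           # heartbeat interval in seconds

    [server1]                                                 # node name
    hostname=172.18.30.1                                      # node address
    candidate_master=1                                        # may be promoted to master
    [server2]                                                 # node name
    hostname=172.18.30.2                                      # node address
    candidate_master=1                                        # may be promoted to master
    [server3]                                                 # node name
    hostname=172.18.30.4                                      # node address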

5. Use the SSH check script shipped with MHA to verify that the SSH connections work

    [root@test ~]# masterha_check_ssh --conf=/etc/mha/app1.cnf
    Sat Mar 31 19:18:47 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Sat Mar 31 19:18:47 2018 - [info] Reading application default configuration from /etc/mha/app1.cnf..
    Sat Mar 31 19:18:47 2018 - [info] Reading server configuration from /etc/mha/app1.cnf..
    Sat Mar 31 19:18:47 2018 - [info] Starting SSH connection tests..
    Sat Mar 31 19:18:49 2018 - [debug] 
    Sat Mar 31 19:18:47 2018 - [debug]  Connecting via SSH from root@172.18.30.107(172.18.30.107:22) to root@172.18.30.108(172.18.30.108:22)..
    Sat Mar 31 19:18:48 2018 - [debug]   ok.
    Sat Mar 31 19:18:48 2018 - [debug]  Connecting via SSH from root@172.18.30.107(172.18.30.107:22) to root@172.18.30.109(172.18.30.109:22)..
    Sat Mar 31 19:18:49 2018 - [debug]   ok.
    Sat Mar 31 19:18:50 2018 - [debug] 
    Sat Mar 31 19:18:47 2018 - [debug]  Connecting via SSH from root@172.18.30.108(172.18.30.108:22) to root@172.18.30.107(172.18.30.107:22)..
    Sat Mar 31 19:18:49 2018 - [debug]   ok.
    Sat Mar 31 19:18:49 2018 - [debug]  Connecting via SSH from root@172.18.30.108(172.18.30.108:22) to root@172.18.30.109(172.18.30.109:22)..
    Sat Mar 31 19:18:49 2018 - [debug]   ok.
    Sat Mar 31 19:18:50 2018 - [debug] 
    Sat Mar 31 19:18:48 2018 - [debug]  Connecting via SSH from root@172.18.30.109(172.18.30.109:22) to root@172.18.30.107(172.18.30.107:22)..
    Sat Mar 31 19:18:49 2018 - [debug]   ok.
    Sat Mar 31 19:18:49 2018 - [debug]  Connecting via SSH from root@172.18.30.109(172.18.30.109:22) to root@172.18.30.108(172.18.30.108:22)..
    Sat Mar 31 19:18:50 2018 - [debug]   ok.
    Sat Mar 31 19:18:50 2018 - [info] All SSH connection tests passed successfully.
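The log shows every database node connecting to every other node, which is why key-based login has to work in both directions between all hosts rather than only from the manager outward. A short sketch that enumerates the same source and target pairs (the function name is illustrative; the addresses are the ones from the log):

```python
from itertools import permutations


def ssh_pairs(hosts):
    """Every ordered (source, target) pair of distinct hosts."""
    return list(permutations(hosts, 2))


hosts = ["172.18.30.107", "172.18.30.108", "172.18.30.109"]
for src, dst in ssh_pairs(hosts):
    print(f"root@{src} -> root@{dst}")
```

For three nodes this gives six connections, matching the six "Connecting via SSH" lines in the check output.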

6. Use the replication check script shipped with MHA to verify that replication is healthy

    [root@test ~]# masterha_check_repl --conf=/etc/mha/app1.cnf 
    Sat Mar 31 19:19:26 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Sat Mar 31 19:19:26 2018 - [info] Reading application default configuration from /etc/mha/app1.cnf..
    Sat Mar 31 19:19:26 2018 - [info] Reading server configuration from /etc/mha/app1.cnf..
    Sat Mar 31 19:19:26 2018 - [info] MHA::MasterMonitor version 0.56.
    Sat Mar 31 19:19:28 2018 - [info] GTID failover mode = 0
    Sat Mar 31 19:19:28 2018 - [info] Dead Servers:
    Sat Mar 31 19:19:28 2018 - [info] Alive Servers:
    Sat Mar 31 19:19:28 2018 - [info]   172.18.30.107(172.18.30.107:3306)
    Sat Mar 31 19:19:28 2018 - [info]   172.18.30.108(172.18.30.108:3306)
    Sat Mar 31 19:19:28 2018 - [info]   172.18.30.109(172.18.30.109:3306)
    Sat Mar 31 19:19:28 2018 - [info] Alive Slaves:
    Sat Mar 31 19:19:28 2018 - [info]   172.18.30.108(172.18.30.108:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled
    Sat Mar 31 19:19:28 2018 - [info]     Replicating from 172.18.30.107(172.18.30.107:3306)
    Sat Mar 31 19:19:28 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
    Sat Mar 31 19:19:28 2018 - [info]   172.18.30.109(172.18.30.109:3306)  Version=5.5.56-MariaDB (oldest major version between slaves) log-bin:enabled
    Sat Mar 31 19:19:28 2018 - [info]     Replicating from 172.18.30.107(172.18.30.107:3306)
    Sat Mar 31 19:19:28 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
    Sat Mar 31 19:19:28 2018 - [info] Current Alive Master: 172.18.30.107(172.18.30.107:3306)
    Sat Mar 31 19:19:28 2018 - [info] Checking slave configurations..
    Sat Mar 31 19:19:28 2018 - [info]  read_only=1 is not set on slave 172.18.30.108(172.18.30.108:3306).
    Sat Mar 31 19:19:28 2018 - [warning]  relay_log_purge=0 is not set on slave 172.18.30.108(172.18.30.108:3306).
    Sat Mar 31 19:19:28 2018 - [info]  read_only=1 is not set on slave 172.18.30.109(172.18.30.109:3306).
    Sat Mar 31 19:19:28 2018 - [warning]  relay_log_purge=0 is not set on slave 172.18.30.109(172.18.30.109:3306).
    Sat Mar 31 19:19:28 2018 - [info] Checking replication filtering settings..
    Sat Mar 31 19:19:28 2018 - [info]  binlog_do_db= , binlog_ignore_db= 
    Sat Mar 31 19:19:28 2018 - [info]  Replication filtering check ok.
    Sat Mar 31 19:19:28 2018 - [info] GTID (with auto-pos) is not supported
    Sat Mar 31 19:19:28 2018 - [info] Starting SSH connection tests..
    Sat Mar 31 19:19:31 2018 - [info] All SSH connection tests passed successfully.
    Sat Mar 31 19:19:31 2018 - [info] Checking MHA Node version..
    Sat Mar 31 19:19:32 2018 - [info]  Version check ok.
    Sat Mar 31 19:19:32 2018 - [info] Checking SSH publickey authentication settings on the current master..
    Sat Mar 31 19:19:33 2018 - [info] HealthCheck: SSH to 172.18.30.107 is reachable.
    Sat Mar 31 19:19:34 2018 - [info] Master MHA Node version is 0.56.
    Sat Mar 31 19:19:34 2018 - [info] Checking recovery script configurations on 172.18.30.107(172.18.30.107:3306)..
    Sat Mar 31 19:19:34 2018 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql,/var/log/mysql --output_file=/data/mastermha/app1//save_binary_logs_test --manager_version=0.56 --start_file=mariadb-bin.000001 
    Sat Mar 31 19:19:34 2018 - [info]   Connecting to root@172.18.30.107(172.18.30.107:22).. 
      Creating /data/mastermha/app1 if not exists..    ok.
      Checking output directory is accessible or not..
       ok.
      Binlog found at /var/lib/mysql, up to mariadb-bin.000001
    Sat Mar 31 19:19:34 2018 - [info] Binlog setting check done.
    Sat Mar 31 19:19:34 2018 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
    Sat Mar 31 19:19:34 2018 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=172.18.30.108 --slave_ip=172.18.30.108 --slave_port=3306 --workdir=/data/mastermha/app1/ --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
    Sat Mar 31 19:19:34 2018 - [info]   Connecting to root@172.18.30.108(172.18.30.108:22).. 
      Checking slave recovery environment settings..
        Opening /var/lib/mysql/relay-log.info ... ok.
        Relay log found at /var/lib/mysql, up to mariadb-relay-bin.000002
        Temporary relay log file is /var/lib/mysql/mariadb-relay-bin.000002
        Testing mysql connection and privileges.. done.
        Testing mysqlbinlog output.. done.
        Cleaning up test file(s).. done.
    Sat Mar 31 19:19:35 2018 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=172.18.30.109 --slave_ip=172.18.30.109 --slave_port=3306 --workdir=/data/mastermha/app1/ --target_version=5.5.56-MariaDB --manager_version=0.56 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
    Sat Mar 31 19:19:35 2018 - [info]   Connecting to root@172.18.30.109(172.18.30.109:22).. 
      Checking slave recovery environment settings..
        Opening /var/lib/mysql/relay-log.info ... ok.
        Relay log found at /var/lib/mysql, up to mariadb-relay-bin.000002
        Temporary relay log file is /var/lib/mysql/mariadb-relay-bin.000002
        Testing mysql connection and privileges.. done.
        Testing mysqlbinlog output.. done.
        Cleaning up test file(s).. done.
    Sat Mar 31 19:19:36 2018 - [info] Slaves settings check done.
    Sat Mar 31 19:19:36 2018 - [info] 
    172.18.30.107(172.18.30.107:3306) (current master)
     +--172.18.30.108(172.18.30.108:3306)
     +--172.18.30.109(172.18.30.109:3306)

    Sat Mar 31 19:19:36 2018 - [info] Checking replication health on 172.18.30.108..
    Sat Mar 31 19:19:36 2018 - [info]  ok.
    Sat Mar 31 19:19:36 2018 - [info] Checking replication health on 172.18.30.109..
    Sat Mar 31 19:19:36 2018 - [info]  ok.
    Sat Mar 31 19:19:36 2018 - [warning] master_ip_failover_script is not defined.
    Sat Mar 31 19:19:36 2018 - [warning] shutdown_script is not defined.
    Sat Mar 31 19:19:36 2018 - [info] Got exit code 0 (Not master dead).

    MySQL Replication Health is OK.

7. Start MHA. Note that this command runs in the foreground, so the terminal stays occupied

    [root@test ~]# masterha_manager --conf=/etc/mha/app1.cnf
    Sat Mar 31 19:23:26 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Sat Mar 31 19:23:26 2018 - [info] Reading application default configuration from /etc/mha/app1.cnf..
    Sat Mar 31 19:23:26 2018 - [info] Reading server configuration from /etc/mha/app1.cnf..

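8. Failover test

We take M1 down manually. In theory MHA should make S1 take over as master and drop M1 from the cluster, leaving S1 as master and S2 as its slave.

1. Stop the mariadb service on M1

    systemctl  stop mariadb

2. Log in to S1 and check its state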

    MariaDB [(none)]> show master logs;
    +--------------------+-----------+
    | Log_name           | File_size |
    +--------------------+-----------+
    | mariadb-bin.000001 |       715 |
    | mariadb-bin.000002 |       245 |
    +--------------------+-----------+
    2 rows in set (0.00 sec)

    MariaDB [(none)]> show slave status;
    Empty set (0.00 sec)

    MariaDB [(none)]> show variables like '%read_only%';
    +------------------+-------+
    | Variable_name    | Value |
    +------------------+-------+
    | innodb_read_only | OFF   |
    | read_only        | OFF   |
    | tx_read_only     | OFF   |
    +------------------+-------+
    3 rows in set (0.00 sec)
    "As shown, MHA has completed the switch, and the read_only option set earlier on S1 has been turned off."

3. Log in to S2 and check its state

    MariaDB [(none)]> show slave status\G
    *************************** 1. row ***************************
                   Slave_IO_State: Waiting for master to send event
                      Master_Host: 172.18.30.108
                      Master_User: repluser
                      Master_Port: 3306
                    Connect_Retry: 60
                  Master_Log_File: mariadb-bin.000002
              Read_Master_Log_Pos: 245
                   Relay_Log_File: mariadb-relay-bin.000002
                    Relay_Log_Pos: 531
            Relay_Master_Log_File: mariadb-bin.000002
                 Slave_IO_Running: Yes
                Slave_SQL_Running: Yes
                  Replicate_Do_DB: 
              Replicate_Ignore_DB: 
               Replicate_Do_Table: 
           Replicate_Ignore_Table: 
          Replicate_Wild_Do_Table: 
      Replicate_Wild_Ignore_Table: 
                       Last_Errno: 0
                       Last_Error: 
                     Skip_Counter: 0
              Exec_Master_Log_Pos: 245
                  Relay_Log_Space: 827
                  Until_Condition: None
                   Until_Log_File: 
                    Until_Log_Pos: 0
               Master_SSL_Allowed: No
               Master_SSL_CA_File: 
               Master_SSL_CA_Path: 
                  Master_SSL_Cert: 
                Master_SSL_Cipher: 
                   Master_SSL_Key: 
            Seconds_Behind_Master: 0
    Master_SSL_Verify_Server_Cert: No
                    Last_IO_Errno: 0
                    Last_IO_Error: 
                   Last_SQL_Errno: 0
                   Last_SQL_Error: 
      Replicate_Ignore_Server_Ids: 
                 Master_Server_Id: 2
    1 row in set (0.00 sec)

    "As shown, S2 now replicates from S1.

    Test complete."

Troubleshooting the MHA setup

Error 1

    [root@test ~]# masterha_check_ssh --conf=/etc/mha/app1.cnf
    Sat Mar 31 20:14:26 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Sat Mar 31 20:14:26 2018 - [info] Reading application default configuration from /etc/mha/app1.cnf..
    Sat Mar 31 20:14:26 2018 - [info] Reading server configuration from /etc/mha/app1.cnf..
    Sat Mar 31 20:14:26 2018 - [error][/usr/share/perl5/vendor_perl/MHA/Config.pm, ln383] Block name "server" is invalid. Block name must be "server default" or start from "server"(+ non-whitespace characters).
    Block name "server" is invalid. Block name must be "server default" or start from "server"(+ non-whitespace characters). at /usr/share/perl5/vendor_perl/MHA/SSHCheck.pm line 148.

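    "Note: this error occurs because the managed node sections in the MHA configuration file were not numbered; each one was named [server]. The faulty file is shown below"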
    [server default]
    user=mha
    password=centos
    manager_workdir=/data/mastermha/app1/
    manager_log=/data/mastermha/app1/manager.log
    remote_workdir=/data/mastermha/app1/
    ssh_user=root
    repl_user=repluser
    repl_password=centos
    ping_interval=1

    "[server]" <-
    hostname=172.18.30.107
    candidate_master=1

    "[server]" <-
    hostname=172.18.30.108
    candidate_master=1

    "[server]" <-
    hostname=172.18.30.109
    candidate_master=1

    "Correct configuration; note how the section names differ"

    [server default]
    user=mha
    password=centos
    manager_workdir=/data/mastermha/app1/
    manager_log=/data/mastermha/app1/manager.log
    remote_workdir=/data/mastermha/app1/
    ssh_user=root
    repl_user=repluser
    repl_password=centos
    ping_interval=1

    "[server1]"
    hostname=172.18.30.107
    candidate_master=1

    "[server2]"
    hostname=172.18.30.108
    candidate_master=1

    "[server3]"
    hostname=172.18.30.109
    candidate_master=1
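The rule quoted in the error message (a block must be named "server default", or "server" followed by non-whitespace characters) is easy to check before running the MHA tools. The sketch below applies that rule to a configuration text; it is an illustration written from the error message, not MHA's own parser, and the function name is hypothetical.

```python
import re


def invalid_blocks(config_text):
    """Return block names that the rule in the error message would reject."""
    names = re.findall(r"^\s*\[([^\]]+)\]", config_text, flags=re.MULTILINE)
    return [
        name
        for name in names
        if name != "server default" and not re.fullmatch(r"server\S+", name)
    ]


bad = "[server default]\nuser=mha\n[server]\nhostname=172.18.30.107\n[server]\nhostname=172.18.30.108\n"
good = "[server default]\nuser=mha\n[server1]\nhostname=172.18.30.107\n[server2]\nhostname=172.18.30.108\n"
print(invalid_blocks(bad))
print(invalid_blocks(good))
```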

Error 2

If the following error appears, a Perl library cannot be found; create a symbolic link on every node
    [root@centos-MHA ~]# masterha_check_ssh  --conf=/etc/mha/app1.cnf
    Can't locate MHA/SSHCheck.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at /usr/bin/masterha_check_ssh line 25.
    BEGIN failed--compilation aborted at /usr/bin/masterha_check_ssh line 25.

    [root@centos-MHA ~]# ln -s /usr/lib/perl5/vendor_perl/MHA/ /usr/lib64/perl5/vendor_perl/
    [root@centos-M1 ~]# ln -s /usr/lib/perl5/vendor_perl/MHA/ /usr/lib64/perl5/vendor_perl/
    [root@centos-S1 ~]# ln -s /usr/lib/perl5/vendor_perl/MHA/ /usr/lib64/perl5/vendor_perl/
    [root@centos-S2 ~]# ln -s /usr/lib/perl5/vendor_perl/MHA/ /usr/lib64/perl5/vendor_perl/
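To see on a given node whether the link is needed, it helps to check which directories actually contain the MHA modules. The sketch below does that for the search path printed in the error message and for the link source used above; the function name is illustrative, and the result depends entirely on the machine it runs on.

```python
import os


def dirs_with_mha(paths):
    """Return the directories from `paths` that contain an MHA/ subdirectory."""
    return [p for p in paths if os.path.isdir(os.path.join(p, "MHA"))]


# Search path copied from the error message.
INC = [
    "/usr/local/lib64/perl5",
    "/usr/local/share/perl5",
    "/usr/lib64/perl5/vendor_perl",
    "/usr/share/perl5/vendor_perl",
    "/usr/lib64/perl5",
    "/usr/share/perl5",
]
# Directory used as the link source in the commands above.
SOURCE = "/usr/lib/perl5/vendor_perl"

print("found in @INC:", dirs_with_mha(INC))
print("found at link source:", dirs_with_mha([SOURCE]))
```

If the first list is empty and the second is not, the modules are installed outside Perl's search path and the symbolic link closes the gap.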