MySQL:MHA测试

测试配置:
[server1]
hostname=db42
candidate_master=1
check_repl_delay=0
port=3307

[server2]
hostname=dbo_100_241 
candidate_master=1
check_repl_delay=0
port=3307

[server3]
hostname=db121
port=3307
#ignore_fail=1

no_master=1


情况1:

主、备主正常,从IO进程down,主down了之后
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
可以正常切换至
From:
db42(192.168.100.42:3307) (current master)
 +--dbo_100_241(192.168.100.241:3307)
 +--db121(192.168.100.121:3307)

To:
dbo_100_241(192.168.100.241:3307) (new master)
 +--db121(192.168.100.121:3307)		


情况2:

主、备主正常,从IO和SQL进程down,主down了之后
             Slave_IO_Running: No
            Slave_SQL_Running: No	
可以正常切换至
From:
dbo_100_241(192.168.100.241:3307) (current master)
 +--db42(192.168.100.42:3307)
 +--db121(192.168.100.121:3307)

To:
db42(192.168.100.42:3307) (new master)
 +--db121(192.168.100.121:3307)


情况3:

主、备主正常,从down了,主down了之后
Server db121(192.168.100.121:3307) is dead, but must be alive! Check server settings.	
MHA不进行切换

情况4:

对从增加参数ignore_fail=1
主、备主正常,从down了,主down了之后
From:
db42(192.168.100.42:3307) (current master)
 +--dbo_100_241(192.168.100.241:3307)

To:
dbo_100_241(192.168.100.241:3307) (new master)		
正常切换	


情况5:

主、备主、从正常,此时备主SQL进程有延迟(>100M),主down
[warning]  Slave dbo_100_241(192.168.100.241:3307) SQL Thread delays too much. Latest log file:mysql-bin.000027:268227005, Current log file:mysql-bin.000027:107. This server is not selected as a new master because recovery will take long time.(文档上说100M,因为备主延迟大于100M,从加了参数no_master=1,所以找不到可用的主,切换失败)

----- Failover Report -----
app1: MySQL Master failover db42(192.168.100.42:3307)
Master db42(192.168.100.42:3307) is down!
Check MHA Manager logs at dbo_100_241:/etc/mha/data/app1.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on db42(192.168.100.42:3307)
The latest slave db121(192.168.100.121:3307) has all relay logs for recovery.
None of existing slaves matches as a new master. Maybe preferred node is misconfigured or all slaves are too  far behind.
Got Error so couldn't continue failover from here.

情况6:

主、备主、从正常,此时备主SQL进程有延迟(<100M),主down

----- Failover Report -----
app1: MySQL Master failover db42(192.168.100.42:3307) to dbo_100_241(192.168.100.241:3307) succeeded
Master db42(192.168.100.42:3307) is down!
Check MHA Manager logs at dbo_100_241:/etc/mha/data/app1.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on db42(192.168.100.42:3307)
The latest slave dbo_100_241(192.168.100.241:3307) has all relay logs for recovery.
Selected dbo_100_241(192.168.100.241:3307) as a new master.
dbo_100_241(192.168.100.241:3307): OK: Applying all logs succeeded.
dbo_100_241(192.168.100.241:3307): OK: Activated master IP address.
db121(192.168.100.121:3307): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
db121(192.168.100.121:3307): OK: Applying all logs succeeded. Slave started, replicating from dbo_100_241(192.168.100.241:3307)
dbo_100_241(192.168.100.241:3307): Resetting slave info succeeded.
Master failover to dbo_100_241(192.168.100.241:3307) completed successfully.
Fri Feb 26 14:22:10 2016 - [info] Sending mail..
正常切换

情况7:

主、备主、从正常,主有压力情况下,手动维护切主
masterha_master_switch --conf=/etc/mha/app1.cnf --master_state=alive --new_master_host=db42 --new_master_port=3307 --orig_master_is_new_slave --running_updates_limit=10000 --interactive=0

情况8:

后台突然出现网络波动
Sat Feb 27 10:58:30 2016 - [warning] Got timeout on MySQL Ping(SELECT) child process and killed it! at /usr/local/share/perl5/MHA/HealthCheck.pm line 431.
Sat Feb 27 10:58:30 2016 - [info] Executing secondary network check script: /usr/local/bin/masterha_secondary_check -s dbo_100_241 -s db121 -s db42 --user=root --master_host=db42 --master_ip=192.168.100.42 --master_port=3307  --user=root  --master_host=db42  --master_ip=192.168.100.42  --master_port=3307 --master_user=root --master_password=Qe091cs74 --ping_type=SELECT
Sat Feb 27 10:58:30 2016 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/oracle/mysqllog/mysql3307 --output_file=/etc/mha/data/save_binary_logs_test --manager_version=0.57 --binlog_prefix=mysql-bin
Sat Feb 27 10:58:31 2016 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.100.42' (4))
Sat Feb 27 10:58:31 2016 - [warning] Connection failed 2 time(s)..
Sat Feb 27 10:58:32 2016 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.100.42' (4))
Sat Feb 27 10:58:32 2016 - [warning] Connection failed 3 time(s)..
Sat Feb 27 10:58:33 2016 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.100.42' (4))
Sat Feb 27 10:58:33 2016 - [warning] Connection failed 4 time(s)..
Sat Feb 27 10:58:35 2016 - [warning] HealthCheck: Got timeout on checking SSH connection to db42! at /usr/local/share/perl5/MHA/HealthCheck.pm line 342.
Monitoring server dbo_100_241 is reachable, Master is not reachable from dbo_100_241. OK.
ssh: connect to host db121 port 22: No route to host
Monitoring server db121 is NOT reachable!
Sat Feb 27 10:58:38 2016 - [warning] At least one of monitoring servers is not reachable from this script. This is likely a network problem. Failover should not happen.
Sat Feb 27 10:58:38 2016 - [warning] Secondary network check script returned errors. Failover should not start so checking server status again. Check network settings for details.
.
.
.
Sat Feb 27 10:59:06 2016 - [warning] Secondary network check script returned errors. Failover should not start so checking server status again. Check network settings for details.
Sat Feb 27 10:59:07 2016 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.100.42' (4))
Sat Feb 27 10:59:07 2016 - [warning] Connection failed 1 time(s)..
Sat Feb 27 10:59:07 2016 - [info] Executing secondary network check script: /usr/local/bin/masterha_secondary_check -s dbo_100_241 -s db121 -s db42 --user=root --master_host=db42 --master_ip=192.168.100.42 --master_port=3307  --user=root  --master_host=db42  --master_ip=192.168.100.42  --master_port=3307 --master_user=root --master_password=Qe091cs74 --ping_type=SELECT
Sat Feb 27 10:59:07 2016 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/oracle/mysqllog/mysql3307 --output_file=/etc/mha/data/save_binary_logs_test --manager_version=0.57 --binlog_prefix=mysql-bin
Sat Feb 27 10:59:08 2016 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.100.42' (4))
Sat Feb 27 10:59:08 2016 - [warning] Connection failed 2 time(s)..
Sat Feb 27 10:59:09 2016 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.100.42' (4))
Sat Feb 27 10:59:09 2016 - [warning] Connection failed 3 time(s)..
Sat Feb 27 10:59:10 2016 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 113)
Sat Feb 27 10:59:10 2016 - [warning] Connection failed 4 time(s)..
Sat Feb 27 10:59:12 2016 - [warning] HealthCheck: Got timeout on checking SSH connection to db42! at /usr/local/share/perl5/MHA/HealthCheck.pm line 342.
Master is reachable from dbo_100_241!
Sat Feb 27 10:59:12 2016 - [warning] Master is reachable from at least one of other monitoring servers. Failover should not happen.
Sat Feb 27 10:59:12 2016 - [warning] Secondary network check script returned errors. Failover should not start so checking server status again. Check network settings for details.
Sat Feb 27 10:59:12 2016 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..


问题总结:

Keepalived在主和备主只能起一个。
切换完注意检查,从的read_only,relay_log_purge

切换步骤总结:

Phase 1: Configuration Check Phase..
Phase 2: Dead Master Shutdown Phase..
         Forcing shutdown so that applications never connect to the current master..
         /etc/mha/scripts/master_ip_failover --orig_master_host=db42 --orig_master_ip=192.168.100.42 --orig_master_port=3307 --command=stopssh --ssh_user=root
         Disabling the VIP on old master: db42			
Phase 3: Master Recovery Phase..		
Phase 3.1: Getting Latest Slaves Phase..
           The latest binary log file/position on all slaves is mysql-bin.000038:940
		   Latest slaves (Slaves that received relay log files to the latest):
		   ...
		   The oldest binary log file/position on all slaves is mysql-bin.000038:848
		   Oldest slaves:
		   ...
Phase 3.2: Saving Dead Master's Binlog Phase..
           Executing command on the dead master save_binary_logs --command=save...
		   scp from dead master...
Phase 3.3: Determining New Master Phase..
           Finding the latest slave that has all relay logs for recovering other slaves..
		   Executing command: apply_diff_relay_logs --command=find --latest_mlf=mysql-bin.000038 --latest_rmlp=940
Phase 3.3: New Master Diff Log Generation Phase..
           Sending binlog..
		   scp from local:/etc/mha/data/...
Phase 3.4: Master Log Apply Phase..
           Applying differential binary/relay
		   Enabling the VIP -  on the new master -  
Phase 4: Slaves Recovery Phase..         
Phase 4.1: Starting Parallel Slave Diff Log Generation Phase..
           Generating relay diff files from the latest slave succeeded.
Phase 4.2: Starting Parallel Slave Log Apply Phase..
           All new slave servers recovered successfully.
Phase 5: New master cleanup phase..	
	     Resetting slave info on the new master.


命令杂:

masterha_manager -conf=/etc/mha/app1.cnf --ignore_last_failover > /tmp/app.log 2>&1 &

/usr/local/mysql55/bin/mysql -uroot -pQe091cs74 -h127.0.0.1 -P3307
 CHANGE MASTER TO MASTER_HOST='192.168.100.241', MASTER_PORT=3307, MASTER_LOG_FILE='mysql-bin.000012', MASTER_LOG_POS=107, MASTER_USER='repl', MASTER_PASSWORD='repl';
 start slave;
show slave status\G		

/usr/local/mysql55/bin/mysql -uroot -pQe091cs74 -h127.0.0.1 -P3307
 CHANGE MASTER TO MASTER_HOST='192.168.100.42', MASTER_PORT=3307, MASTER_LOG_FILE='mysql-bin.000025', MASTER_LOG_POS=107, MASTER_USER='repl', MASTER_PASSWORD='repl';
 start slave;
show slave status\G	

masterha_master_switch --conf=/etc/mha/app1.cnf --master_state=alive --new_master_host=db42 --new_master_port=3307 --orig_master_is_new_slave --running_updates_limit=10000 --interactive=0

time sysbench --test=oltp --mysql-user=root --mysql-host=192.168.100.199 --mysql-password=Qe091cs74 --mysql-port=3307 --mysql-db=test\
--oltp-test-mode=complex --mysql-table-engine=innodb --oltp-table-size=100000000 --mysql-db=test \
--oltp-table-name=INNODB_T7 --max-time=600 prepare &





你可能感兴趣的:(MySQL)