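At noon the monitoring system raised an alert: master-slave (AB) replication had stopped on one of the slaves. I logged in and ran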

show slave status\G

which showed:

Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.

The SQL thread cannot read the relay log, so the relay log is most likely corrupted.
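The error message itself suggests running 'mysqlbinlog' on the relay log to confirm the corruption. A minimal sketch of that check, assuming mysqlbinlog is on the PATH and exits non-zero when it cannot parse an event; the function name relay_log_parses is hypothetical, and the relay log path must be supplied by the caller:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if mysqlbinlog can read the whole file.
# $1 is the path to the relay log (or binary log) to check.
# Assumption: mysqlbinlog returns a non-zero exit status on a parse error.
relay_log_parses() {
    mysqlbinlog "$1" > /dev/null 2>&1
}
```

Usage would look like `relay_log_parses /path/to/relay-log || echo "relay log unreadable"`, with the real relay log path substituted in.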

First I tried restarting the slave:

stop slave;
start slave;

The status still showed the same error.

Next I checked the error log:

140220 22:26:49 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000173' at position 939494307, relay log '/var/www/logs/mysql/mysql-relay-bin.000535' position: 359274967
140220 22:26:49 [ERROR] Error in Log_event::read_log_event(): 'Event too small', data_len: 0, event_type: 0
140220 22:26:49 [ERROR] Error reading relay log event: slave SQL thread aborted because of I/O error
140220 22:26:49 [ERROR] Slave SQL: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave. Error_code: 1594
140220 22:26:49 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000173' position 939494307
140220 22:26:49 [Note] Slave I/O thread: connected to master '[email protected]:3306',replication started in log 'mysql-bin.000174' at position 623712958

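In short, the replication SQL thread has aborted with an error, and the log suggests fixing the problem and then restarting the slave SQL thread.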

The replication I/O thread, meanwhile, has already reached:

Slave I/O thread: connected to master '[email protected]:3306',replication started in log 'mysql-bin.000174' at position 623712958

Because the error is in the SQL thread, we can ignore this message and focus on the position where the SQL thread stopped:

slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'mysql-bin.000173' position 939494307
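The file name and position in that line can be pulled out mechanically instead of being copied by hand. A minimal shell sketch, assuming the error log uses the exact "We stopped at log '...' position N" wording shown above; the function names extract_stop_coords and build_change_master are hypothetical helpers, not part of MySQL:

```shell
#!/bin/sh
# Hypothetical helper: read error-log text on stdin and print
# "<binlog file> <position>" from the most recent
# "We stopped at log '...' position N" line.
extract_stop_coords() {
    sed -n "s/.*We stopped at log '\([^']*\)' position \([0-9][0-9]*\).*/\1 \2/p" | tail -n 1
}

# Hypothetical helper: read "<file> <position>" on stdin and print the
# statements to run on the slave. Fails if no coordinates were found.
build_change_master() {
    local file pos
    read -r file pos || return 1
    [ -n "$file" ] && [ -n "$pos" ] || return 1
    printf "STOP SLAVE;\nCHANGE MASTER TO MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\nSTART SLAVE;\n" "$file" "$pos"
}
```

For example, `extract_stop_coords < /path/to/mysql-error.log | build_change_master` (the log path here is a placeholder) prints the statements for review; nothing is executed against the server.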

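Restarting replication several times did not fix the problem, so I re-ran CHANGE MASTER, using the file and position from the error log above (the master binlog coordinates at which the SQL thread stopped). CHANGE MASTER with new log coordinates discards the existing relay logs, so the corrupted relay log is thrown away and the I/O thread fetches events from the master again starting at that position. This works only while the master still has mysql-bin.000173.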

stop slave;
change master to master_log_file='mysql-bin.000173',master_log_pos=939494307;
start slave;

Checking the slave status again, replication is back to normal.
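"Normal" here means that both Slave_IO_Running and Slave_SQL_Running report Yes in the SHOW SLAVE STATUS output. A small sketch of that check for use in a monitoring script, assuming the output is piped in as text; the function name slave_threads_running is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: read `SHOW SLAVE STATUS\G` output on stdin and
# succeed only if both replication threads report "Yes".
slave_threads_running() {
    awk -F': ' '
        $1 ~ /^ *Slave_IO_Running$/  { io  = $2 }
        $1 ~ /^ *Slave_SQL_Running$/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}
```

It could be called as `mysql -e 'SHOW SLAVE STATUS\G' | slave_threads_running && echo "replication OK"`, assuming the mysql client already has credentials configured.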