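At work I received an alert email: replication on one of our production slaves had failed with an error, which in turn caused the backup to fail. Checking the replication status: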
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Queueing master event to the relay log
Master_Host: 192.xx.xxx.xxx
Master_User: xxx
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: xxxx-bin.000554
Read_Master_Log_Pos: 184008708
Relay_Log_File: relay-bin.000577
Relay_Log_Pos: 164421592
Relay_Master_Log_File: xxx-bin.000538
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1594
Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
Skip_Counter: 0
Exec_Master_Log_Pos: 164421444
Relay_Log_Space: 17364819187
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1594
Last_SQL_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2018128
1 row in set (0.00 sec)
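According to the error message, the slave SQL thread failed while parsing the relay log. Following the hint in the message, I ran the mysqlbinlog tool against the corresponding relay log, and it also reported an error, so the relay log file is evidently corrupted.
The fix is to reconfigure replication. However, this slave was already some distance behind the master, and show slave status reports several log file names and positions. Which log file and position should replication be reconfigured from? First, look at what the relevant fields mean: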
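The mysqlbinlog check can be scripted. The following is a minimal sketch, not the procedure used in this post: it assumes that mysqlbinlog exits with a non-zero status when it cannot parse a file, and the helper name relay_log_parses and its cmd parameter are hypothetical.

```python
import subprocess


def relay_log_parses(path, cmd=("mysqlbinlog",)):
    """Run one mysqlbinlog pass over `path` and return (ok, stderr_text).

    `cmd` is the command prefix. It is a parameter only so that the helper
    can be exercised on a machine without MySQL installed (assumption).
    """
    result = subprocess.run(
        [*cmd, path],
        stdout=subprocess.DEVNULL,  # the decoded events are not needed here
        stderr=subprocess.PIPE,     # keep the error text for the operator
        text=True,
        errors="replace",           # a corrupted log may yield odd bytes
    )
    return result.returncode == 0, result.stderr.strip()
```

A call such as relay_log_parses("/path/to/relay-bin.000577") would then return False as the first element for a relay log like the one in this incident; the path shown here is a placeholder.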
Master_Log_File
The name of the master binary log file from which the I/O thread is currently reading.
Relay_Master_Log_File
The name of the master binary log file containing the most recent event executed by the SQL thread. (This file may lag behind the binary log file that the I/O thread is currently reading.)
Read_Master_Log_Pos
The position in the current master binary log file up to which the I/O thread has read.
Exec_Master_Log_Pos
The position in the current master binary log file to which the SQL thread has read and executed, marking the start of the next transaction or event to be processed. You can use this value with the CHANGE MASTER TO statement's MASTER_LOG_POS option when starting a new slave from an existing slave, so that the new slave reads from this point. The coordinates given by (Relay_Master_Log_File, Exec_Master_Log_Pos) in the master's binary log correspond to the coordinates given by (Relay_Log_File, Relay_Log_Pos) in the relay log.
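In principle, if the slave is not behind the master, Master_Log_File and Relay_Master_Log_File should be identical, and Read_Master_Log_Pos and Exec_Master_Log_Pos should be identical as well. In that case, reconfiguring replication with (Master_Log_File, Read_Master_Log_Pos) is sufficient.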
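To make the comparison concrete, here is a small sketch that parses the text output of show slave status\G and compares the I/O thread coordinates with the SQL thread coordinates. The function names parse_slave_status and io_sql_gap are hypothetical, and SAMPLE only repeats the four relevant values from the output above.

```python
SAMPLE = """
Master_Log_File: xxxx-bin.000554
Read_Master_Log_Pos: 184008708
Relay_Master_Log_File: xxx-bin.000538
Exec_Master_Log_Pos: 164421444
"""


def parse_slave_status(text):
    """Turn `show slave status\\G` text into a dict of field name -> value."""
    status = {}
    for line in text.splitlines():
        # Split on the first colon only, so values that contain colons
        # (such as Last_Error) stay intact.
        key, sep, value = line.partition(":")
        key = key.strip()
        # Skip blank lines, the row separator and the "1 row in set" footer.
        if not sep or not key or " " in key or key.startswith("*"):
            continue
        status[key] = value.strip()
    return status


def io_sql_gap(status):
    """Return (io_coords, sql_coords, caught_up) for a parsed status dict."""
    io = (status["Master_Log_File"], int(status["Read_Master_Log_Pos"]))
    sql = (status["Relay_Master_Log_File"], int(status["Exec_Master_Log_Pos"]))
    return io, sql, io == sql


io, sql, caught_up = io_sql_gap(parse_slave_status(SAMPLE))
print("I/O thread: ", io)
print("SQL thread: ", sql)
print("caught up:  ", caught_up)
```

For the values in this incident the two pairs differ (binary log 000554 versus 000538), so the slave is behind.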
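Here, however, the slave is already behind, so those two values clearly cannot be used. Instead, we can use the log file and position that the slave SQL thread has executed up to, that is (Relay_Master_Log_File, Exec_Master_Log_Pos), and pull the logs from the master again starting at that position. The relay logs that were transferred earlier are cleaned up automatically.
Replication was therefore reconfigured with the following statement: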
change master to master_host='192.xxx.xxx.xxx',master_port=3306, master_user='xxx',master_password='xxxx',master_log_file='xxx-bin.000538',master_log_pos=164421444;
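As a sketch of how this statement follows from the status fields, the helper below builds it from Relay_Master_Log_File and Exec_Master_Log_Pos. The function name rebuild_statements is hypothetical. Wrapping the statement in STOP SLAVE and START SLAVE is my assumption about the usual sequence; the post itself only mentions start slave.

```python
def rebuild_statements(status, host, port, user, password):
    """Build the statements that restart replication from the SQL thread's
    last executed position on the master.

    `status` is a dict holding at least Relay_Master_Log_File and
    Exec_Master_Log_Pos. Values are interpolated verbatim with no escaping,
    so this is only suitable for trusted input.
    """
    log_file = status["Relay_Master_Log_File"]
    log_pos = int(status["Exec_Master_Log_Pos"])
    change = (
        "CHANGE MASTER TO "
        f"MASTER_HOST='{host}', MASTER_PORT={int(port)}, "
        f"MASTER_USER='{user}', MASTER_PASSWORD='{password}', "
        f"MASTER_LOG_FILE='{log_file}', MASTER_LOG_POS={log_pos};"
    )
    # STOP SLAVE before the change is an assumption, see the text above.
    return ["STOP SLAVE;", change, "START SLAVE;"]


status = {
    "Relay_Master_Log_File": "xxx-bin.000538",
    "Exec_Master_Log_Pos": "164421444",
}
for stmt in rebuild_statements(status, "192.xxx.xxx.xxx", 3306, "xxx", "xxxx"):
    print(stmt)
```

The key point is that the file and position come from the SQL thread's pair, not from Master_Log_File and Read_Master_Log_Pos.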
After start slave, the slave was still far behind and simply has to catch up over time. With that, the fault was resolved.
From the ITPUB blog, link: http://blog.itpub.net/22418990/viewspace-1302120/. If you repost this article, please credit the source; otherwise legal action may be taken.