主从复制1062错误解决方法

当复制中断的时候,我们常用的方法是跳过错误,比如SET GLOBAL SQL_SLAVE_SKIP_COUNTER =1或者直接slave-skip-errors=1062这样确实解决了问题,恢复了复制。但是久而久之主从数据相差就很大了。对于复制正常以后,我们还需要使用pt-table-checksum数据效验,以及pt-table-sync同步数据,今天线上一个从库中断,查看了错误,发现是主键冲突,至于为何会发现这些情况,请阅读MySQL Replication需要注意的问题

Last_Error: Error 'Duplicate entry '192442' for key 'PRIMARY'' on query. Default database: 'xxxxxxxxxxx'. Query: 'INSERT INTO xxxxxxxxxxxxx(playerId, `type`, `count`)
  VALUES( NAME_CONST('pPlayerId',629014986),  NAME_CONST('pType',8), 0)'
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1499475
              Relay_Log_Space: 4829077512
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1062
               Last_SQL_Error: Error 'Duplicate entry '192442' for key 'PRIMARY'' on query. Default database: 'xxxxxxxxxxxxxx'. Query: 'INSERT INTO xxxxxxxxxxx(playerId, `type`, `count`)
  VALUES( NAME_CONST('pPlayerId',629014986),  NAME_CONST('pType',8), 0)'
1 row in set (0.00 sec)

于是根据提示,以及查看表结构,发现果然是主键冲突了,但是表的记录却不一致,所以往往我们跳过错误就会导致主从数据不一致的问题。

主库上的记录:

mysql> select * from xxxx.xxxx where id=192442;
+--------+-----------+------+-------+-------+---------------+
| id     | playerId  | type | count | total | lastResetTime |
+--------+-----------+------+-------+-------+---------------+
| 192442 | 629014986 |    8 |     0 |     0 |             0 |
+--------+-----------+------+-------+-------+---------------+
1 row in set (0.00 sec)

mysql>

从库上的记录:

mysql> select * from xxxx.xxxx where id=192442;
+--------+-----------+------+-------+-------+---------------+
| id     | playerId  | type | count | total | lastResetTime |
+--------+-----------+------+-------+-------+---------------+
| 192442 | 629015414 |    8 |     0 |     0 |             0 |
+--------+-----------+------+-------+-------+---------------+
1 row in set (0.00 sec)

mysql>

有时候我会手动删除从库上提示的相应记录,但是往往是比较麻烦的。因为可能不止一条记录重复,可能N条,所以我就简单的写了一个脚本,删除主键冲突的记录,这个脚本只适合主键冲突的情况。

#!/bin/bash
#Delete duplicate records primary key conflict
#Write by yayun 2014-05-17

mysql=/usr/local/mysql-5.1.66/bin/mysql
sock=/data/mysql-slave-3311/mysql.sock
passwd=123456

while true
do
    SQL_THREAD=`$mysql -uroot -p$passwd -S $sock -e 'show slave status\G' | egrep 'Slave_SQL_Running' | awk '{print $2}'`
    LAST_ERROR=`$mysql -uroot -p$passwd -S $sock -e 'show slave status\G' | egrep Last_Errno | awk '{print $2}'`
    duplicate=`$mysql -uroot -p$passwd -S $sock -e 'show slave status\G' | grep Last_Error | awk '/Duplicate entry/{print $5}' | awk -F "'" '{print $2}'`
    DATABASE=`$mysql -uroot -p$passwd -S $sock -e 'show slave status\G' | grep Last_Error | awk '{print $13}' | awk -F "'" '{print $2}'`
    TABLE=`$mysql -uroot -p$passwd -S $sock -e 'show slave status\G' | grep Last_Error | awk -F ":" '{print $4}' | awk -F "(" '{print $1}' | awk '{print $NF}'`

    $mysql -uroot -p$passwd -S $sock -e 'show slave status\G' | grep HA_ERR_FOUND_DUPP_KEY
    if [ $? -eq 1 ]
    then
        if [ "$SQL_THREAD" == No ] && [ "$LAST_ERROR" == 1062 ]
        then
            FILED=`$mysql -uroot -p$passwd -S $sock -Nse "desc $DATABASE.$TABLE" | grep PRI | awk '{print $1}'`
            $mysql -uroot -p$passwd -S $sock -e "delete from $DATABASE.$TABLE where $FILED=$duplicate"
            $mysql -uroot -p$passwd -S $sock -e "start slave sql_thread"
        else
            echo "====================== ok ========================"
            $mysql -uroot -p$passwd -S $sock -e 'show slave status\G' | egrep 'Slave_.*_Running'
            echo "====================== ok ========================"
            break
        fi
    fi
done

脚本随便写的。大家可以再自己相应的扩展一下。^_^

你可能感兴趣的:(方法)