mysql GTID模式跳过错误GTID事务的正确方法

    最近遇到一个Mysql由于create table时由于从库表表空间问题,从库sql重放进程执行sql失败,导致从库的sql应用进程挂掉。客户反馈,可以跳过失败的create table事务,本博文展示mysql 基于GITD模式跳过错误CTID事务的正确方法。

    1.0 明确错误原因

root@mysqldb 15:53:  [(none)]> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.56.2
                  Master_User: repl
                  Master_Port: 9506
                Connect_Retry: 60
              Master_Log_File: mybinlog.006091
          Read_Master_Log_Pos: 768927745
               Relay_Log_File: TEST-DB-2-relay-bin.009546
                Relay_Log_Pos: 543586669
        Relay_Master_Log_File: mybinlog.006090
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1813
                   Last_Error: Error 'Tablespace '`test`.`p_gov_......_temp`' exists.' on query. Default database: 'test'. Query: 'CREATE TABLE `p_gov_......_temp` (
 ......
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 543586536
              Relay_Log_Space: 10836318250
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 1813
               Last_SQL_Error: Error 'Tablespace '`test`.`p_gov_......_temp`' exists.' on query. Default database: 'test'. Query: 'CREATE TABLE `p_gov_......_temp` (
......
             Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 9506002
                  Master_UUID: 4efcbd9f-c1e2-11eb-b710-0050569e2112
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: 
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 230607 15:40:49
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 4efcbd9f-c1e2-11eb-b710-0050569e2112:1217191452-1442042781
            Executed_Gtid_Set: 24fce019-edde-11ec-8f9e-0050569e562f:1-26407,
4efcbd9f-c1e2-11eb-b710-0050569e2112:1-1441877106:1441877108-1441877109,
54c6f166-c1e4-11eb-b023-0050569e562f:1
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
       Master_public_key_path: 
        Get_master_public_key: 0
            Network_Namespace: 
1 row in set (0.00 sec)

root@mysqldb 15:53:  [(none)]> 

    通过观察show slave status可以看到从库192.168.56.2的Slave_Sql由于1813错误,具体原因是表空间问题导致create table失败,引起从库Slave_Sql进程挂起。

    2.0 问题处理方法

    经过沟通,允许跳过失败的GTID事务。至于如何跳过GITD事务,网上有很多资料,有的从show slave status输出结果中跳过Retrieved_Gtid_Set,有的是Executed_Gtid_Set拼凑的,可以说均是错误的。另外,有些说从binlog里边挖,其实没有那么麻烦,要跳过失败事务的GTID可以通过sql直接查询到。

root@mysqldb 16:20:  [(none)]> select * from performance_schema.replication_applier_status_by_worker\G
*************************** 1. row ***************************
                                           CHANNEL_NAME: 
                                              WORKER_ID: 0
                                              THREAD_ID: NULL
                                          SERVICE_STATE: OFF
                                      LAST_ERROR_NUMBER: 1813
                                     LAST_ERROR_MESSAGE: Error 'Tablespace '`test`.`p_gov_......_temp`' exists.' on query. Default database: 'test'. Query: 'CREATE TABLE `p_gov_......_temp` (
 ......
 															 LAST_ERROR_TIMESTAMP: 2023-06-07 16:11:59.433420
                               LAST_APPLIED_TRANSACTION: 
     LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
    LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 0000-00-00 00:00:00.000000
         LAST_APPLIED_TRANSACTION_START_APPLY_TIMESTAMP: 0000-00-00 00:00:00.000000
           LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP: 0000-00-00 00:00:00.000000
                                   APPLYING_TRANSACTION: 4efcbd9f-c1e2-11eb-b710-0050569e2112:1441877107 《《《《《《《《《《《要跳过的事务
         APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2023-06-07 14:42:22.913515
        APPLYING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2023-06-07 14:42:22.913515
             APPLYING_TRANSACTION_START_APPLY_TIMESTAMP: 2023-06-07 16:11:59.392908
                 LAST_APPLIED_TRANSACTION_RETRIES_COUNT: 0
   LAST_APPLIED_TRANSACTION_LAST_TRANSIENT_ERROR_NUMBER: 0
  LAST_APPLIED_TRANSACTION_LAST_TRANSIENT_ERROR_MESSAGE: 
LAST_APPLIED_TRANSACTION_LAST_TRANSIENT_ERROR_TIMESTAMP: 0000-00-00 00:00:00.000000
                     APPLYING_TRANSACTION_RETRIES_COUNT: 0
       APPLYING_TRANSACTION_LAST_TRANSIENT_ERROR_NUMBER: 0
      APPLYING_TRANSACTION_LAST_TRANSIENT_ERROR_MESSAGE: 
    APPLYING_TRANSACTION_LAST_TRANSIENT_ERROR_TIMESTAMP: 0000-00-00 00:00:00.000000
1 row in set (0.00 sec)

查询结果中 APPLYING_TRANSACTION: 4efcbd9f-c1e2-11eb-b710-0050569e2112:1441877107 就是要跳过的事务GTID号。

3.0 GTID跳过失败事务执行过程

root@mysqldb 16:31:  [(none)]>  SET GTID_NEXT='4efcbd9f-c1e2-11eb-b710-0050569e2112:1441877107';
Query OK, 0 rows affected (0.00 sec)

root@mysqldb 16:33:  [(none)]> select @@gtid_next;
+-------------------------------------------------+
| @@gtid_next                                     |
+-------------------------------------------------+
| 4efcbd9f-c1e2-11eb-b710-0050569e2112:1441877107 |
+-------------------------------------------------+
1 row in set (0.00 sec)

root@mysqldb 16:33:  [(none)]> BEGIN; COMMIT;
Query OK, 0 rows affected (0.01 sec)

Query OK, 0 rows affected (0.01 sec)

root@mysqldb 16:33:  [(none)]> select @@gtid_next;
+-------------------------------------------------+
| @@gtid_next                                     |
+-------------------------------------------------+
| 4efcbd9f-c1e2-11eb-b710-0050569e2112:1441877107 |
+-------------------------------------------------+
1 row in set (0.00 sec)

root@mysqldb 16:33:  [(none)]> SET SESSION GTID_NEXT = AUTOMATIC;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb 16:34:  [(none)]> select @@gtid_next;
+-------------+
| @@gtid_next |
+-------------+
| AUTOMATIC   |
+-------------+
1 row in set (0.00 sec)

root@mysqldb 16:34:  [(none)]> 
root@mysqldb 16:34:  [(none)]> 
root@mysqldb 16:34:  [(none)]> start slave;
Query OK, 0 rows affected (0.00 sec)

root@mysqldb 16:34:  [(none)]> 

root@mysqldb 16:34:  [(none)]> show slave statuS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.56.2
                  Master_User: repl
                  Master_Port: 9506
                Connect_Retry: 60
              Master_Log_File: mybinlog.006092
          Read_Master_Log_Pos: 726558126
               Relay_Log_File: TEST-DB-2-relay-bin.009546
                Relay_Log_Pos: 640919944
        Relay_Master_Log_File: mybinlog.006090
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 640919811
              Relay_Log_Space: 11867704299
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 6329
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 9506002
                  Master_UUID: 4efcbd9f-c1e2-11eb-b710-0050569e2112
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Reading event from the relay log
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 4efcbd9f-c1e2-11eb-b710-0050569e2112:1217191452-1442167357
            Executed_Gtid_Set: 24fce019-edde-11ec-8f9e-0050569e562f:1-26407,
4efcbd9f-c1e2-11eb-b710-0050569e2112:1-1441891442,
54c6f166-c1e4-11eb-b023-0050569e562f:1
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
       Master_public_key_path: 
        Get_master_public_key: 0
            Network_Namespace: 
1 row in set (0.00 sec)

root@mysqldb 16:34:  [(none)]> 

你可能感兴趣的:(mysql,数据库,GTID,1813,slave)