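MySQL has offered semisynchronous replication, as a plugin, since version 5.5.
MySQL replication is asynchronous by default. The master writes events to its binlog but does not check whether any slave has received or processed them.
With asynchronous replication, if the master crashes, transactions it has already committed may not have been replicated to the slave. If the application then fails over to the slave, that data is lost.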
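Semisynchronous replication works like this: after the master receives a commit request from a client, it writes the transaction's events to the binlog, sends them to the slave's I/O thread, and then waits.
Once the slave has received the events and written them to its relay log on disk, it sends an ACK back to the master. When the master receives the ACK it treats the events as received and persisted by the slave, and returns commit success to the client.
The master-side parameter rpl_semi_sync_master_wait_for_slave_count = N (default 1) sets how many slaves must return an ACK before the master treats the commit as successful and returns to the client.
According to the documentation, however, if the master times out while waiting for the ACK, semisynchronous replication automatically falls back to asynchronous replication. (Which rather defeats the purpose!)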
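The wait-then-fall-back behaviour described above can be sketched as a toy model. This is only an illustration, not MySQL code: the class `ToyMaster` and its methods are hypothetical names, and the fields merely mirror the parameters and status counters used later in this post.

```python
from dataclasses import dataclass


@dataclass
class ToyMaster:
    """Toy model of the master's commit path (hypothetical, not MySQL code)."""
    wait_for_slave_count: int = 1   # mirrors rpl_semi_sync_master_wait_for_slave_count
    timeout_ms: int = 1000          # mirrors rpl_semi_sync_master_timeout
    semi_sync_on: bool = True       # mirrors Rpl_semi_sync_master_status
    yes_tx: int = 0                 # commits acknowledged in time
    no_tx: int = 0                  # commits not acknowledged in time

    def commit(self, ack_delays_ms):
        """ack_delays_ms: per-slave ACK delay in ms, or None if no ACK ever arrives.

        Returns how long the client waited for its commit to return.
        """
        if not self.semi_sync_on:
            # Already degraded: behave like asynchronous replication.
            self.no_tx += 1
            return 0
        arrived = sorted(d for d in ack_delays_ms
                         if d is not None and d <= self.timeout_ms)
        if len(arrived) >= self.wait_for_slave_count:
            # The commit returns once the N-th fastest ACK has arrived.
            self.yes_tx += 1
            return arrived[self.wait_for_slave_count - 1]
        # Timed out: fall back to asynchronous replication; the commit still succeeds.
        self.semi_sync_on = False
        self.no_tx += 1
        return self.timeout_ms

    def slave_caught_up(self):
        """A semi-sync slave has acknowledged the latest binlog position."""
        self.semi_sync_on = True


master = ToyMaster()
print(master.commit([3]))      # one slave, ACK after 3 ms
print(master.commit([None]))   # slave I/O thread stopped, no ACK
print(master.semi_sync_on)     # False after the timeout
```

The point of the sketch is that a timeout never fails the commit; it only delays it by `timeout_ms` once and then stops waiting altogether until a slave catches up.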
In the figure below, the ACK circled in red is the signal returned by the slave's I/O thread:
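The configuration steps and the related experiments follow.
Install the semisync_master.dll plugin on the master (on Linux the library is named semisync_master.so):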
mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.dll';
Query OK, 0 rows affected (0.28 sec)
mysql>
Install the semisync_slave.dll plugin on the slave:
mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.dll';
Query OK, 0 rows affected (0.15 sec)
mysql>
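Then shut down the master and the slave, and add the following parameters on the master:
rpl_semi_sync_master_enabled = 1 #enable semisynchronous replication
rpl_semi_sync_master_timeout = 1000 #how long the master waits for a slave's ACK before timing out, in milliseconds
rpl_semi_sync_master_trace_level = 16 #debug trace level for semi-sync on the master; the default is 32, 16 is used here to log more detail
rpl_semi_sync_master_wait_for_slave_count = 1 #number of slave ACKs the master waits for; default 1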
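rpl_semi_sync_master_wait_no_slave = on #whether the master keeps waiting for the full timeout when the number of connected semi-sync slaves drops below rpl_semi_sync_master_wait_for_slave_count during the wait. ON (the default) keeps waiting until the timeout expires; OFF reverts to asynchronous replication as soon as the slave count drops below that number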
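rpl_semi_sync_master_wait_point = AFTER_SYNC #the master waits for the ACK first, and only then commits the change to the storage engine and returns to the client. The other setting, AFTER_COMMIT, commits to the storage engine first, then waits for the ACK before returning to the client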
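The practical difference between the two wait points is whether other sessions on the master can read a change before any slave has it. A minimal sketch of the step ordering, with hypothetical helper names `commit_steps` and `visible_before_ack` and deliberately simplified step labels:

```python
def commit_steps(wait_point):
    """Order of the master's commit steps for one transaction (simplified)."""
    steps = ["write and sync binlog"]
    if wait_point == "AFTER_SYNC":
        steps += ["wait for slave ACK", "commit in storage engine"]
    elif wait_point == "AFTER_COMMIT":
        steps += ["commit in storage engine", "wait for slave ACK"]
    else:
        raise ValueError(f"unknown wait point: {wait_point}")
    steps.append("return OK to client")
    return steps


def visible_before_ack(wait_point):
    """True if other sessions can read the change before any slave has acknowledged it."""
    steps = commit_steps(wait_point)
    return steps.index("commit in storage engine") < steps.index("wait for slave ACK")


for point in ("AFTER_SYNC", "AFTER_COMMIT"):
    print(point, commit_steps(point), visible_before_ack(point))
```

With AFTER_COMMIT, a reader on the master may see a row that disappears after a failover if the master crashes before the ACK arrives; with AFTER_SYNC the row only becomes visible once a slave has it.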
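Add the following parameters on the slave:
rpl_semi_sync_slave_enabled = 1 #as on the master, semisynchronous replication must also be enabled on the slave
rpl_semi_sync_slave_trace_level = 16 #same meaning as on the master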
My master and slave also have the following parameters, which enable GTID-based replication:
gtid_mode = ON #enable GTID replication
enforce_gtid_consistency = ON
Start the master and the slave, then set up replication. On the master:
mysql> flush tables with read lock;
Query OK, 0 rows affected (0.03 sec)
mysql> show master status\G
*************************** 1. row ***************************
File: mysql-bin.000027
Position: 194
Binlog_Do_DB:
Binlog_Ignore_DB: cms
Executed_Gtid_Set: e2e2f927-e75c-11e5-ac89-5c260a17ccde:1-7369
1 row in set (0.04 sec)
mysql> unlock tables;
Query OK, 0 rows affected (0.00 sec)
mysql>
mysql> show status like 'rpl_semi%';
+--------------------------------------------+-------+
| Variable_name | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients | 1 | #one semi-sync slave is connected
| Rpl_semi_sync_master_net_avg_wait_time | 0 |
| Rpl_semi_sync_master_net_wait_time | 0 |
| Rpl_semi_sync_master_net_waits | 0 |
| Rpl_semi_sync_master_no_times | 0 |
| Rpl_semi_sync_master_no_tx | 0 |
| Rpl_semi_sync_master_status | ON | #semisynchronous replication is currently active
| Rpl_semi_sync_master_timefunc_failures | 0 |
| Rpl_semi_sync_master_tx_avg_wait_time | 0 |
| Rpl_semi_sync_master_tx_wait_time | 0 |
| Rpl_semi_sync_master_tx_waits | 0 |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0 |
| Rpl_semi_sync_master_wait_sessions | 0 |
| Rpl_semi_sync_master_yes_tx | 0 |
+--------------------------------------------+-------+
14 rows in set (0.00 sec)
mysql> show variables like 'rpl_semi%';
+-------------------------------------------+------------+
| Variable_name | Value |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled | ON |
| rpl_semi_sync_master_timeout | 1000 |
| rpl_semi_sync_master_trace_level | 16 |
| rpl_semi_sync_master_wait_for_slave_count | 1 |
| rpl_semi_sync_master_wait_no_slave | ON |
| rpl_semi_sync_master_wait_point | AFTER_SYNC |
+-------------------------------------------+------------+
6 rows in set, 1 warning (0.00 sec)
mysql>
On the slave:
mysql> show slave status\G
Empty set (0.02 sec)
mysql> CHANGE MASTER TO MASTER_HOST='192.168.1.77',MASTER_PORT=3306,MASTER_USER='backup',MASTER_PASSWORD='backup',MASTER_AUTO_POSITION=1;
Query OK, 0 rows affected, 2 warnings (0.44 sec)
mysql> start slave;
Query OK, 0 rows affected (0.13 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.77
Master_User: backup
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000027
Read_Master_Log_Pos: 194
Relay_Log_File: host_name-relay-bin.000004
Relay_Log_Pos: 407
Relay_Master_Log_File: mysql-bin.000027
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 194
Relay_Log_Space: 871
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_UUID: e2e2f927-e75c-11e5-ac89-5c260a17ccde
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: e2e2f927-e75c-11e5-ac89-5c260a17ccde:7359-7369
Executed_Gtid_Set: 1cced57b-e75e-11e5-b742-5c260a17ccde:1-21,
e2e2f927-e75c-11e5-ac89-5c260a17ccde:1-7369
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
mysql>
mysql> show status like 'rpl_semi%';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| Rpl_semi_sync_slave_status | ON |
+----------------------------+-------+
1 row in set (0.06 sec)
mysql> show variables like 'rpl_semi%';
+---------------------------------+-------+
| Variable_name | Value |
+---------------------------------+-------+
| rpl_semi_sync_slave_enabled | ON |
| rpl_semi_sync_slave_trace_level | 16 |
+---------------------------------+-------+
2 rows in set, 1 warning (0.04 sec)
mysql>
GTID-based semisynchronous replication is now configured successfully.
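One way to double-check that the slave has everything the master has is to compare the two Executed_Gtid_Set values. Inside the server, the built-in GTID_SUBSET() function does this; the sketch below does the same comparison offline on the strings copied from the output above. The function names `parse_gtid_set` and `missing_on_slave` are made up for this example, and it only handles the plain `uuid:first-last` notation shown in this post.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:7,uuid2:3' into {uuid: set of transaction numbers}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid, set())
        for interval in intervals:
            first, _, last = interval.partition("-")
            # A single number such as '7' is an interval of length one.
            numbers.update(range(int(first), int(last or first) + 1))
    return result


def missing_on_slave(master_set, slave_set):
    """Transactions executed on the master but not yet on the slave."""
    master, slave = parse_gtid_set(master_set), parse_gtid_set(slave_set)
    missing = {}
    for uuid, numbers in master.items():
        gap = numbers - slave.get(uuid, set())
        if gap:
            missing[uuid] = gap
    return missing


master_executed = "e2e2f927-e75c-11e5-ac89-5c260a17ccde:1-7369"
slave_executed = ("1cced57b-e75e-11e5-b742-5c260a17ccde:1-21,\n"
                  "e2e2f927-e75c-11e5-ac89-5c260a17ccde:1-7369")
print(missing_on_slave(master_executed, slave_executed))  # empty dict: slave is caught up
```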
Next, test what happens when the ACK times out.
Stop the I/O thread on the slave:
mysql> stop slave io_thread;
Query OK, 0 rows affected (0.11 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 192.168.1.77
Master_User: backup
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000027
Read_Master_Log_Pos: 194
Relay_Log_File: host_name-relay-bin.000004
Relay_Log_Pos: 407
Relay_Master_Log_File: mysql-bin.000027
Slave_IO_Running: No #the I/O thread has been stopped
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 194
Relay_Log_Space: 871
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_UUID: e2e2f927-e75c-11e5-ac89-5c260a17ccde
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: e2e2f927-e75c-11e5-ac89-5c260a17ccde:7359-7369
Executed_Gtid_Set: 1cced57b-e75e-11e5-b742-5c260a17ccde:1-21,
e2e2f927-e75c-11e5-ac89-5c260a17ccde:1-7369
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
mysql>
With the slave's I/O thread stopped, run an INSERT on the master and see what happens:
mysql> INSERT INTO `test`.`t1` (`id`, `category`, `num`) VALUES ('12', '4', '12');
Query OK, 1 row affected (1.08 sec) #note the stall of about 1 second
mysql> show status like 'rpl_semi%';
+--------------------------------------------+-------+
| Variable_name | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients | 0 | #no semi-sync slaves are connected
| Rpl_semi_sync_master_net_avg_wait_time | 0 |
| Rpl_semi_sync_master_net_wait_time | 0 |
| Rpl_semi_sync_master_net_waits | 0 |
| Rpl_semi_sync_master_no_times | 1 |
| Rpl_semi_sync_master_no_tx | 1 |
| Rpl_semi_sync_master_status | OFF | #replication has fallen back to asynchronous
| Rpl_semi_sync_master_timefunc_failures | 0 |
| Rpl_semi_sync_master_tx_avg_wait_time | 0 |
| Rpl_semi_sync_master_tx_wait_time | 0 |
| Rpl_semi_sync_master_tx_waits | 0 |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0 |
| Rpl_semi_sync_master_wait_sessions | 0 |
| Rpl_semi_sync_master_yes_tx | 0 |
+--------------------------------------------+-------+
14 rows in set (0.00 sec)
mysql> show variables like 'rpl_semi%';
+-------------------------------------------+------------+
| Variable_name | Value |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled | ON |
| rpl_semi_sync_master_timeout | 1000 |
| rpl_semi_sync_master_trace_level | 16 |
| rpl_semi_sync_master_wait_for_slave_count | 1 |
| rpl_semi_sync_master_wait_no_slave | ON |
| rpl_semi_sync_master_wait_point | AFTER_SYNC |
+-------------------------------------------+------------+
6 rows in set, 1 warning (0.00 sec)
mysql>
Now look at the master's error log:
2016-03-24T02:32:07.561507Z 2 [Warning] Timeout waiting for reply of binlog (file: mysql-bin.000027, pos: 457), semi-sync up to file , position 4.
2016-03-24T02:32:07.562507Z 2 [Note] Semi-sync replication switched OFF.
Clearly, the master timed out and switched semisynchronous replication OFF on its own, i.e. fell back to asynchronous replication.
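Since the fallback is silent from the client's point of view, it is worth checking for it in monitoring. As a rough sketch, the table printed by `show status like 'rpl_semi%'` can be parsed and classified like this; `parse_status_table` and `replication_mode` are hypothetical helper names, and the sample is a shortened copy of the master's output above.

```python
def parse_status_table(text):
    """Turn the table printed by SHOW STATUS into a dict of name -> value."""
    status = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.split("|")]
        # Data rows look like: | name | value | (optional trailing note)
        if len(cells) >= 3 and cells[1] and cells[1] != "Variable_name":
            status[cells[1]] = cells[2]
    return status


def replication_mode(status):
    """Classify the master as 'semi-sync' or 'async' from its status values."""
    if status.get("Rpl_semi_sync_master_status") == "ON":
        return "semi-sync"
    return "async"


SAMPLE = """
+--------------------------------------------+-------+
| Variable_name | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients | 0 |
| Rpl_semi_sync_master_no_tx | 1 |
| Rpl_semi_sync_master_status | OFF |
+--------------------------------------------+-------+
"""

status = parse_status_table(SAMPLE)
print(replication_mode(status), "unacknowledged commits:", status["Rpl_semi_sync_master_no_tx"])
```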
Now see whether replication switches back to semisynchronous automatically once the slave's I/O thread is started again:
mysql> start slave io_thread;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.77
Master_User: backup
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000027
Read_Master_Log_Pos: 457
Relay_Log_File: host_name-relay-bin.000005
Relay_Log_Pos: 670
Relay_Master_Log_File: mysql-bin.000027
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 457
Relay_Log_Space: 1134
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_UUID: e2e2f927-e75c-11e5-ac89-5c260a17ccde
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set: e2e2f927-e75c-11e5-ac89-5c260a17ccde:7359-7370
Executed_Gtid_Set: 1cced57b-e75e-11e5-b742-5c260a17ccde:1-21,
e2e2f927-e75c-11e5-ac89-5c260a17ccde:1-7370
Auto_Position: 1
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
mysql>
Check the master's status:
mysql> show status like 'rpl_semi%';
+--------------------------------------------+-------+
| Variable_name | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients | 1 | #one semi-sync slave is connected
| Rpl_semi_sync_master_net_avg_wait_time | 0 |
| Rpl_semi_sync_master_net_wait_time | 0 |
| Rpl_semi_sync_master_net_waits | 1 |
| Rpl_semi_sync_master_no_times | 1 |
| Rpl_semi_sync_master_no_tx | 1 |
| Rpl_semi_sync_master_status | ON | #switched back to semisynchronous replication
| Rpl_semi_sync_master_timefunc_failures | 0 |
| Rpl_semi_sync_master_tx_avg_wait_time | 0 |
| Rpl_semi_sync_master_tx_wait_time | 0 |
| Rpl_semi_sync_master_tx_waits | 0 |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0 |
| Rpl_semi_sync_master_wait_sessions | 0 |
| Rpl_semi_sync_master_yes_tx | 0 |
+--------------------------------------------+-------+
14 rows in set (0.00 sec)
mysql>
At the same time, the master logs the following:
2016-03-24T02:35:25.716840Z 5 [Note] Start binlog_dump to master_thread_id(5) slave_server(1), pos(, 4)
2016-03-24T02:35:25.717840Z 5 [Note] ReplSemiSyncMaster::reportReplyBinlog: Got reply at (, 4)
2016-03-24T02:35:25.718841Z 5 [Note] Start semi-sync binlog_dump to slave (server_id: 1), pos(, 4)
2016-03-24T02:35:25.719841Z 5 [Note] ReplSemiSyncMaster::updateSyncHeader: server(1), (mysql-bin.000027, 194) sync(0), repl(0)
2016-03-24T02:35:25.720841Z 5 [Note] ReplSemiSyncMaster::updateSyncHeader: server(1), (mysql-bin.000027, 259) sync(0), repl(0)
2016-03-24T02:35:25.722841Z 5 [Note] ReplSemiSyncMaster::updateSyncHeader: server(1), (mysql-bin.000027, 331) sync(0), repl(0)
2016-03-24T02:35:25.723841Z 5 [Note] ReplSemiSyncMaster::updateSyncHeader: server(1), (mysql-bin.000027, 380) sync(0), repl(0)
2016-03-24T02:35:25.725841Z 5 [Note] ReplSemiSyncMaster::updateSyncHeader: server(1), (mysql-bin.000027, 426) sync(0), repl(0)
2016-03-24T02:35:25.726841Z 5 [Note] ReplSemiSyncMaster::updateSyncHeader: server(1), (mysql-bin.000027, 457) sync(1), repl(0)
2016-03-24T02:35:25.929853Z 0 [Note] ReplSemiSyncMaster::reportReplyPacket: Got reply(mysql-bin.000027, 457) from server 1
2016-03-24T02:35:25.930853Z 0 [Note] Semi-sync replication switched ON at (mysql-bin.000027, 457)
In production, remember to set rpl_semi_sync_master_trace_level to 32, otherwise the semi-sync log output grows too large.
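As far as I understand the documented trace levels (1 = general, 16 = detail, 32 = net wait, 64 = function), the value behaves like a set of flags. A small sketch that decodes a value into the categories it turns on; `TRACE_FLAGS` and `decode_trace_level` are names invented for this example.

```python
# Documented trace levels for rpl_semi_sync_master_trace_level,
# treated here as bit flags (an assumption of this sketch).
TRACE_FLAGS = {
    1: "general",
    16: "detail",
    32: "net wait",
    64: "function",
}


def decode_trace_level(level):
    """Return the names of the trace categories enabled by a trace_level value."""
    return [name for bit, name in TRACE_FLAGS.items() if level & bit]


print(decode_trace_level(16))  # the setting used in this experiment
print(decode_trace_level(32))  # the default
```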
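Summary of semisynchronous replication:
0. Semisynchronous replication cannot be combined with multi-source replication; that is, in a multi-source setup only replication on the default channel can be configured as semisynchronous;
1. The feature is useful. The master waits for the slave's I/O thread to return an ACK (although it still commits to the client on timeout), so the event has been transferred to the slave even though the SQL thread has not yet replayed it; hence the name "semisynchronous replication";
2. Because the master enforces a timeout and still returns success to the client after it expires, even though the event has not reached the slave, the mechanism undermines its own guarantee. Ideally, when no ACK arrives, the master would report the transaction as failed to the client and roll it back (although, given how the binlog and the storage engine are designed, that could be very hard to do);
3. If the I/O thread in turn waited for an acknowledgement from the SQL thread (that is, after the SQL thread has replayed the event and committed it to the storage engine), semisynchronous replication would effectively be synchronous replication;
4. I look forward to semisynchronous replication becoming synchronous, but from 5.5 through 5.7 it has stayed semisynchronous. Why has synchronous replication never been implemented? Is it that they do not want to, or that they cannot? I will let you guess.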