An experimental study of the rpl_semi_sync_master_wait_no_slave parameter

konggg 2020-01-17 14:08

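I have been studying MySQL recently and have just reached semi-synchronous replication.

Two parameters in the semi-sync configuration turned out to be hard to understand:

  rpl_semi_sync_master_wait_no_slave

  rpl_semi_sync_master_wait_for_slave_count

I asked several instructors and searched for material, and everyone gave me a different answer: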

  

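Below are some typical explanations found on the Chinese-language web. All of them are wrong!
rpl_semi_sync_master_wait_for_slave_count:
  Controls the number of slave acknowledgments. The default is 1. It is the number of slave acknowledgments the master must receive before it commits.
rpl_semi_sync_master_wait_no_slave:
  1. The number of slave nodes whose ACK must be waited for; otherwise the master waits forever.
  2. When a transaction is committed but no slave is connected to the master, the master cannot receive any acknowledgment, yet it keeps waiting within the time limit. If no slave is connected, it switches to asynchronous replication. Whether the master waits for a slave's receipt acknowledgment after every transaction commit. The default is ON: every transaction waits. If OFF, semi-sync replication is not re-enabled after the slaves catch up and must be enabled manually.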

 

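Curiosity drove me to test these claims by experiment and, unfortunately, all of them turned out to be wrong. The official documentation left me confused, so I had to explore on my own.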

 

What is explored

  Semi-sync parameters

  • rpl_semi_sync_master_wait_no_slave (the focus)
  • rpl_semi_sync_master_wait_for_slave_count

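Environment

+--------+-----------------+------+------------+--------------------+
| role   | ip              | port | hostname   | notes              |
+--------+-----------------+------+------------+--------------------+
| master | 192.168.188.101 | 4306 | mysqlvm1   | prompt: mysql>     |
| slave  | 192.168.188.201 | 4306 | mysqlvm1-1 | prompt: mysql4306> |
|        |                 | 5306 |            | prompt: mysql5306> |
|        |                 | 6306 |            | prompt: mysql6306> |
|        |                 | 7306 |            | prompt: mysql7306> |
+--------+-----------------+------+------------+--------------------+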

 

MySQL version

  5.7.26

 

Prerequisites

  Master-slave replication is already configured.

 

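Configuring enhanced semi-synchronous replication

  1. Load the plugin libraries. This must be done on every master and slave node.

    Master and slaves: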

install plugin rpl_semi_sync_master soname 'semisync_master.so';

install plugin rpl_semi_sync_slave soname 'semisync_slave.so';   

 

      

  2. Check that the plugins loaded successfully on every node.

    mysql> show plugins;

      | rpl_semi_sync_master       | ACTIVE   | REPLICATION        | semisync_master.so | GPL     |

      | rpl_semi_sync_slave        | ACTIVE   | REPLICATION        | semisync_slave.so  | GPL     |

 

 

  3. Enable semi-sync

    1. Enable the parameter on the slaves first, and on the master last.

    Slaves:

set global rpl_semi_sync_slave_enabled = 1;   # 1: enabled, 0: disabled

 

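    Master:

set global rpl_semi_sync_master_enabled = 1;   # 1: enabled, 0: disabled

set global rpl_semi_sync_master_timeout = 60000;       # 60 seconds (the value is in milliseconds); a long timeout makes the experiment easier to observe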

 

    2. Restart io_thread on the slaves

    stop slave io_thread;

    start slave io_thread;
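
The variable listing further down shows rpl_semi_sync_master_wait_for_slave_count = 3, although the post never shows the statement that set it. A minimal sketch of how it could be set, and how each slave could be checked, assuming the same sessions as above (the value 3 is inferred from that listing, not from a recorded command):

```sql
-- On the master: require ACKs from 3 slaves.
-- Assumption: inferred from the variable listing, not shown in the post.
SET GLOBAL rpl_semi_sync_master_wait_for_slave_count = 3;

-- On each slave, after restarting io_thread: this should report ON once
-- the slave has reconnected to the master as a semi-sync client.
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_slave_status';
```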

 

    Check the master's variables and status

      

mysql> show global variables like "%sync%"; show global status like "%sync%";

+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| binlog_group_commit_sync_delay            | 100        |
| binlog_group_commit_sync_no_delay_count   | 10         |
| innodb_flush_sync                         | ON         |
| innodb_sync_array_size                    | 1          |
| innodb_sync_spin_loops                    | 30         |
| rpl_semi_sync_master_enabled              | ON         |
| rpl_semi_sync_master_timeout              | 60000      |
| rpl_semi_sync_master_trace_level          | 32         |
| rpl_semi_sync_master_wait_for_slave_count | 3          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
| rpl_semi_sync_slave_enabled               | ON         |
| rpl_semi_sync_slave_trace_level           | 32         |
| sync_binlog                               | 1          |
| sync_frm                                  | ON         |
| sync_master_info                          | 10000      |
| sync_relay_log                            | 10000      |
| sync_relay_log_info                       | 10000      |
+-------------------------------------------+------------+
18 rows in set (0.00 sec)

 

+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Innodb_data_fsyncs                         | 664      |
| Innodb_data_pending_fsyncs                 | 0        |
| Innodb_os_log_fsyncs                       | 413      |
| Innodb_os_log_pending_fsyncs               | 0        |
| Rpl_semi_sync_master_clients               | 4        |
| Rpl_semi_sync_master_net_avg_wait_time     | 0        |
| Rpl_semi_sync_master_net_wait_time         | 0        |
| Rpl_semi_sync_master_net_waits             | 199      |
| Rpl_semi_sync_master_no_times              | 21       |
| Rpl_semi_sync_master_no_tx                 | 48       |
| Rpl_semi_sync_master_status                | ON       |
| Rpl_semi_sync_master_timefunc_failures     | 0        |
| Rpl_semi_sync_master_tx_avg_wait_time      | 3280008  |
| Rpl_semi_sync_master_tx_wait_time          | 72160195 |
| Rpl_semi_sync_master_tx_waits              | 22       |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0        |
| Rpl_semi_sync_master_wait_sessions         | 0        |
| Rpl_semi_sync_master_yes_tx                | 20       |
| Rpl_semi_sync_slave_status                 | OFF      |
+--------------------------------------------+----------+

19 rows in set (0.00 sec)
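
Most of these rows are unrelated to semi-sync. One way to see only the relevant ones is a narrower filter. This is a sketch rather than what the post ran; it relies on the WHERE clause of SHOW, which MySQL 5.7 supports:

```sql
-- Only the master-side semi-sync variables
SHOW GLOBAL VARIABLES LIKE 'rpl_semi_sync_master%';

-- Only the two status values the experiments focus on
SHOW GLOBAL STATUS
WHERE Variable_name IN ('Rpl_semi_sync_master_clients',
                        'Rpl_semi_sync_master_status');
```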

 

 

============================ divider =====================================

 

 

Experiment 1

Scenario:

  rpl_semi_sync_master_wait_no_slave=ON

  rpl_semi_sync_master_wait_for_slave_count=3

  One slave stays alive and the other slaves are stopped. What happens?

Steps:

  1. Keep only one slave running and stop the other 3 slaves

mysql5306> stop slave;
mysql6306> stop slave;
mysql7306> stop slave;

 

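  2. Query the master immediately

    To reduce noise, only the rows of interest are shown (the row counts in the footers come from the full output)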

mysql> show global variables like "%sync%"; show global status like "%sync%";

+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | ON         |
| rpl_semi_sync_master_timeout              | 60000      |
| rpl_semi_sync_master_wait_for_slave_count | 3          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
| rpl_semi_sync_slave_enabled               | ON         |
+-------------------------------------------+------------+
18 rows in set (0.00 sec)

 

+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients               | 4        |
| Rpl_semi_sync_master_status                | ON       |
| Rpl_semi_sync_slave_status                 | OFF      |
+--------------------------------------------+----------+
19 rows in set (0.00 sec)

 

 

  3. Wait one minute, then query the master again

mysql> show global variables like "%sync%"; show global status like "%sync%";

+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | ON         |
| rpl_semi_sync_master_timeout              | 60000      |
| rpl_semi_sync_master_wait_for_slave_count | 3          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
| rpl_semi_sync_slave_enabled               | ON         |
+-------------------------------------------+------------+
18 rows in set (0.00 sec)

 

+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients               | 1        |
| Rpl_semi_sync_master_status                | ON       |
| Rpl_semi_sync_slave_status                 | OFF      |
+--------------------------------------------+----------+
19 rows in set (0.00 sec)

 

 

  4. Open another session on the master, then start and commit a transaction

mysql> insert into new values(4);
[hangs...]

 

  5. Check the master

    Nothing has changed

mysql> show global variables like "%sync%"; show global status like "%sync%";

+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | ON         |
| rpl_semi_sync_master_timeout              | 60000      |
| rpl_semi_sync_master_wait_for_slave_count | 3          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
| rpl_semi_sync_slave_enabled               | ON         |
+-------------------------------------------+------------+
18 rows in set (0.00 sec)

 

+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients               | 1        |
| Rpl_semi_sync_master_status                | ON       |
| Rpl_semi_sync_slave_status                 | OFF      |
+--------------------------------------------+----------+
19 rows in set (0.00 sec)

 

 

  6. Check the surviving slave

    The transaction has already been applied

mysql4306> select * from new;

+------+
| name |
+------+
|    2 |
+------+
1 rows in set (0.00 sec)

 

 

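  7. Wait for the master's transaction to time out and complete.

     From commit to result the master took 1 min 0.01 sec. The transaction had to wait because too few slave ACKs arrived.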

mysql> insert into new values(4);
Query OK, 1 row affected (1 min 0.01 sec)

 

 

  8. Check the master

    After the transaction timed out, the master switched to asynchronous replication (Rpl_semi_sync_master_status=OFF)

 

mysql> show global variables like "%sync%"; show global status like "%sync%";

+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | ON         |
| rpl_semi_sync_master_timeout              | 60000      |
| rpl_semi_sync_master_wait_for_slave_count | 3          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
| rpl_semi_sync_slave_enabled               | ON         |
+-------------------------------------------+------------+
18 rows in set (0.00 sec)

 

+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients               | 1        |
| Rpl_semi_sync_master_status                | OFF      |
| Rpl_semi_sync_slave_status                 | OFF      |
+--------------------------------------------+----------+
19 rows in set (0.00 sec)
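
The full status listing also contains counters that record this kind of fallback. A sketch of how they could be read right after the timeout, assuming the counter meanings given in the MySQL 5.7 manual (no_times: how many times the master turned semi-sync off; no_tx: commits not acknowledged by a slave; yes_tx: commits that were acknowledged):

```sql
SHOW GLOBAL STATUS
WHERE Variable_name IN ('Rpl_semi_sync_master_no_times',
                        'Rpl_semi_sync_master_no_tx',
                        'Rpl_semi_sync_master_yes_tx');
```

Comparing the values taken before and after the hung insert would indicate whether that commit went through without the required acknowledgments.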

 

 

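  9. Start the stopped slaves one by one, checking the master's status right after each start

    The master notices each started slave immediately, and it switches back to semi-sync mode automatically (Rpl_semi_sync_master_status=ON) once the client count reaches 3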

 

mysql5306> start slave;

mysql> show global status like "%sync%";

+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients               | 2        |
| Rpl_semi_sync_master_status                | OFF      |
+--------------------------------------------+----------+
19 rows in set (0.00 sec)

 

mysql6306> start slave;

mysql> show global status like "%sync%";

+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients               | 3        |
| Rpl_semi_sync_master_status                | ON       |
+--------------------------------------------+----------+
19 rows in set (0.00 sec)

 

mysql7306> start slave;

mysql> show global status like "%sync%";

+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients               | 4        |
| Rpl_semi_sync_master_status                | ON       |
+--------------------------------------------+----------+
19 rows in set (0.00 sec)

 

 

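============================ divider =====================================

 

Experiment 2

Scenario:

  rpl_semi_sync_master_wait_no_slave=OFF

  rpl_semi_sync_master_wait_for_slave_count=3

  One slave stays alive and the other slaves are stopped. What happens?

Steps:

  1. Continuing in the environment from experiment 1, first set the parameter and check the master's status

    To reduce noise, only the rows of interest are shown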

mysql> set global rpl_semi_sync_master_wait_no_slave=OFF;
Query OK, 0 rows affected (0.00 sec)

mysql> show global variables like "%sync%"; show global status like "%sync%";

+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | ON         |
| rpl_semi_sync_master_timeout              | 60000      |
| rpl_semi_sync_master_wait_for_slave_count | 3          |
| rpl_semi_sync_master_wait_no_slave        | OFF        |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
| rpl_semi_sync_slave_enabled               | ON         |
+-------------------------------------------+------------+
18 rows in set (0.00 sec)

+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients               | 4        |
| Rpl_semi_sync_master_status                | ON       |
| Rpl_semi_sync_slave_status                 | OFF      |
+--------------------------------------------+----------+
19 rows in set (0.00 sec)

 

 

  2. Keep only one slave running and stop the other 3 slaves one by one

mysql5306> stop slave;
mysql6306> stop slave;
mysql7306> stop slave;

 

 

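  3. Wait one minute, then check the master's status. (While waiting, you can query the master's status repeatedly to watch the values change.)

    The master has already switched to asynchronous replication.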

mysql> show global status like "%sync%";

+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients               | 1        |
| Rpl_semi_sync_master_status                | OFF      |
| Rpl_semi_sync_slave_status                 | OFF      |
+--------------------------------------------+----------+
19 rows in set (0.00 sec)

 

 

  4. In another session on the master, start and commit a transaction. This time the transaction returned without waiting.

mysql> insert into new values(5);
Query OK, 1 row affected (0.02 sec)

 

 

  5. Check the surviving slave immediately

     The transaction has already been applied.

mysql4306> select * from new;

+------+
| name |
+------+
|    4 |
|    5 |
+------+
2 rows in set (0.00 sec)

 

 

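  6. Start the other slaves one by one, checking the master's status right after each start

    The master notices each started slave immediately, and it switches back to semi-sync mode automatically (Rpl_semi_sync_master_status=ON) once the client count reaches 3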

mysql5306> start slave;

mysql> show global status like "%sync%";

+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients               | 2        |
| Rpl_semi_sync_master_status                | OFF      |
| Rpl_semi_sync_slave_status                 | OFF      |
+--------------------------------------------+----------+
19 rows in set (0.00 sec)

 

mysql6306> start slave;

mysql> show global status like "%sync%";

+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients               | 3        |
| Rpl_semi_sync_master_status                | ON       |
| Rpl_semi_sync_slave_status                 | OFF      |
+--------------------------------------------+----------+
19 rows in set (0.00 sec)

 

mysql7306> start slave;

mysql> show global status like "%sync%";

+--------------------------------------------+----------+
| Variable_name                              | Value    |
+--------------------------------------------+----------+
| Rpl_semi_sync_master_clients               | 4        |
| Rpl_semi_sync_master_status                | ON       |
| Rpl_semi_sync_slave_status                 | OFF      |
+--------------------------------------------+----------+
19 rows in set (0.00 sec)
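
To leave the master in a known state after the experiments, the changed variables can be set back. A sketch, assuming the MySQL 5.7 defaults are what is wanted (ON for wait_no_slave, 10000 ms for the timeout, 1 for the ACK count); the values this environment actually started from are not recorded in the post:

```sql
-- Assumption: restoring the documented 5.7 defaults, not the author's original values
SET GLOBAL rpl_semi_sync_master_wait_no_slave = ON;
SET GLOBAL rpl_semi_sync_master_timeout = 10000;             -- milliseconds
SET GLOBAL rpl_semi_sync_master_wait_for_slave_count = 1;
```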

 

 

============================ divider =====================================

Conclusions:

In plain, short, easy-to-understand language:

 

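[rpl_semi_sync_master_wait_for_slave_count]

  1. The number of acknowledgments the master needs for a commit. If the number of connected semi-sync slave clients is greater than or equal to this value, the master proceeds without obstruction. If it is lower, the master may hit one timeout wait during a transaction commit; once the wait exceeds rpl_semi_sync_master_timeout, the master switches to asynchronous mode (the mechanism is described under the next parameter).

  2. The master uses this value as the benchmark against which it compares the status variable [Rpl_semi_sync_master_clients].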

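[rpl_semi_sync_master_wait_no_slave]

  1. When OFF: as soon as the master finds that [Rpl_semi_sync_master_clients] is less than [rpl_semi_sync_master_wait_for_slave_count], it switches to asynchronous mode immediately.

  2. When ON: during idle time (no transaction commits), the master makes no adjustment even if it finds that [Rpl_semi_sync_master_clients] is less than [rpl_semi_sync_master_wait_for_slave_count]. As long as it receives at least [rpl_semi_sync_master_wait_for_slave_count] ACKs before the transaction times out, the master stays in semi-sync mode. Only when a commit (the master waiting for ACKs) times out does the master switch to asynchronous mode.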

 

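Whether [rpl_semi_sync_master_wait_no_slave] is ON or OFF, once the number of online slaves reaches [rpl_semi_sync_master_wait_for_slave_count], the master automatically switches from asynchronous mode back to semi-sync mode.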
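
The comparison these conclusions rest on can be checked directly on the master. A minimal sketch, assuming MySQL 5.7 with the performance_schema.global_status and performance_schema.global_variables tables readable (show_compatibility_56=OFF); the column aliases are illustrative names, not anything MySQL defines:

```sql
-- Compare connected semi-sync clients with the number of ACKs required.
SELECT
  CAST(s.VARIABLE_VALUE AS UNSIGNED) AS semi_sync_clients,
  CAST(v.VARIABLE_VALUE AS UNSIGNED) AS required_acks,
  CAST(s.VARIABLE_VALUE AS UNSIGNED) >= CAST(v.VARIABLE_VALUE AS UNSIGNED)
    AS enough_clients          -- 1 if enough slaves are connected, else 0
FROM performance_schema.global_status AS s
JOIN performance_schema.global_variables AS v
  ON  s.VARIABLE_NAME = 'Rpl_semi_sync_master_clients'
  AND v.VARIABLE_NAME = 'rpl_semi_sync_master_wait_for_slave_count';
```

Going by the experiments, enough_clients = 0 with wait_no_slave=OFF would suggest the master is already asynchronous, while with wait_no_slave=ON it would suggest the next commit may block for up to rpl_semi_sync_master_timeout.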

 

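At first glance OFF looks like the better choice, because the master's transactions never wait when slaves go offline. So why is ON the default in 5.7? I do not know yet.

Also, this experiment was run in an idle (or extremely lightly loaded) environment. What this parameter leads to under high concurrency still needs further exploration.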

 

 

 ============================ divider =====================================

Official documentation for rpl_semi_sync_master_wait_no_slave:

 

Controls whether the master waits for the timeout period configured by rpl_semi_sync_master_timeout to expire, even if the slave count drops to less than the number of slaves configured by rpl_semi_sync_master_wait_for_slave_count during the timeout period.

 

When the value of rpl_semi_sync_master_wait_no_slave is ON (the default), it is permissible for the slave count to drop to less than rpl_semi_sync_master_wait_for_slave_count during the timeout period. As long as enough slaves acknowledge the transaction before the timeout period expires, semisynchronous replication continues.

 

When the value of rpl_semi_sync_master_wait_no_slave is OFF, if the slave count drops to less than the number configured in rpl_semi_sync_master_wait_for_slave_count at any time during the timeout period configured by rpl_semi_sync_master_timeout, the master reverts to normal replication.

 

This variable is available only if the master-side semisynchronous replication plugin is installed.

 

Official documentation for rpl_semi_sync_master_wait_for_slave_count:

 

The number of slave acknowledgments the master must receive per transaction before proceeding. By default rpl_semi_sync_master_wait_for_slave_count is 1, meaning that semisynchronous replication proceeds after receiving a single slave acknowledgment. Performance is best for small values of this variable.

 

For example, if rpl_semi_sync_master_wait_for_slave_count is 2, then 2 slaves must acknowledge receipt of the transaction before the timeout period configured by rpl_semi_sync_master_timeout for semisynchronous replication to proceed. If less slaves acknowledge receipt of the transaction during the timeout period, the master reverts to normal replication.

 

 

============================ the end =====================================

EOF
