mariadb - MyISAM 表删除后 MariaDB Galera Closter 崩溃
问题描述
我们有 5 个节点的 MariaDB Galera 集群。在我们的例子中,MyISAM 表被删除,之后整个集群崩溃。我已经阅读了其他博客/问题,但需要一些关于这种情况下内部工作的知识。
**我需要了解 - 1. 启动了删除表命令,从日志中很明显,在其他节点中找不到被删除的记录(因为它是 MyISAM),因此一致性受到损害,因此集群去了下。MyISAM 复制 DDL 语句,那么为什么它在所有节点上检查表的内部记录 - 它不应该直接删除表而不比较其他节点上的记录,因为它是 drop 而不是 delete 语句。(仅供参考 - 该表存在于所有节点上,但记录不同)
建议避免这种情况或应用任何补丁/错误修复?
我遗漏的任何其他点**
MariaDB 版本 - 10.1.33
日志 -
2018-08-16 15:55:26 140366938635008 [Note] InnoDB: Online DDL : End of reading clustered index of the table and create temporary files
**2018-08-16 16:24:27 140366938635008 [ERROR] Slave SQL: Error 'Unknown table 'DB_TOTAL4DEV_P.users_last_login_2'' on query. Default database: ''. Query: 'drop table `DB_TOTAL4DEV_P`.`users_last_login_2`', Internal MariaDB error code: 1051**
**2018-08-16 16:24:27 140366938635008 [Warning] WSREP: RBR event 1 Query apply warning: 1, 309050572
2018-08-16 16:24:27 140366938635008 [Warning] WSREP: Ignoring error for TO isolated action: source: 4533077b-9f57-11e8-94aa-6b070f22a4b5 version: 3 local: 0 state: APPLYING flags: 65 conn_id: 492382 trx_id: -1 seqnos (l: 805550, g: 309050572, s: 309050571, d: 309050571, ts: 223731321595519)**
2018-08-16 16:24:28 140366938635008 [ERROR] Slave SQL: Error 'Unknown table 'DB_TOTAL4DEV_P.users_last_login_2'' on query. Default database: ''. Query: 'drop table `DB_TOTAL4DEV_P`.`users_last_login_2`', Internal MariaDB error code: 1051
2018-08-16 16:24:28 140366938635008 [Warning] WSREP: RBR event 1 Query apply warning: 1, 309050575
2018-08-16 16:24:28 140366938635008 [Warning] WSREP: Ignoring error for TO isolated action: source: 56daeaa4-9e8e-11e8-b891-6f203bac329a version: 3 local: 0 state: APPLYING flags: 65 conn_id: 708893 trx_id: -1 seqnos (l: 805553, g: 309050575, s: 309050574, d: 309050574, ts: 310032366214700)
2018-08-16 16:28:45 140366938635008 [ERROR] mysqld: Can't find record in 'users_last_login'
2018-08-16 16:28:45 140366938635008 [ERROR] Slave SQL: Could not execute Delete_rows_v1 event on table DB_TOTAL4DEV_P.users_last_login; Can't find record in 'users_last_login', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 1862, Internal MariaDB error code: 1032
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: RBR event 3 Delete_rows_v1 apply warning: 120, 309051933
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: Failed to apply app buffer: seqno: 309051933, status: 1
2018-08-16 16:28:45 140366938635008 [ERROR] mysqld: Can't find record in 'users_last_login'
2018-08-16 16:28:45 140366938635008 [ERROR] Slave SQL: Could not execute Delete_rows_v1 event on table DB_TOTAL4DEV_P.users_last_login; Can't find record in 'users_last_login', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 1862, Internal MariaDB error code: 1032
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: RBR event 3 Delete_rows_v1 apply warning: 120, 309051933
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: Failed to apply app buffer: seqno: 309051933, status: 1
2018-08-16 16:28:45 140366938635008 [ERROR] mysqld: Can't find record in 'users_last_login'
2018-08-16 16:28:45 140366938635008 [ERROR] Slave SQL: Could not execute Delete_rows_v1 event on table DB_TOTAL4DEV_P.users_last_login; Can't find record in 'users_last_login', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 1862, Internal MariaDB error code: 1032
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: RBR event 3 Delete_rows_v1 apply warning: 120, 309051933
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: Failed to apply app buffer: seqno: 309051933, status: 1
2018-08-16 16:28:45 140366938635008 [ERROR] mysqld: Can't find record in 'users_last_login'
2018-08-16 16:28:45 140366938635008 [ERROR] Slave SQL: Could not execute Delete_rows_v1 event on table DB_TOTAL4DEV_P.users_last_login; Can't find record in 'users_last_login', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 1862, Internal MariaDB error code: 1032
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: RBR event 3 Delete_rows_v1 apply warning: 120, 309051933
2018-08-16 16:28:45 140366938635008 [ERROR] WSREP: Failed to apply trx: source: 808e3422-9dcd-11e8-bbc0-3ab028afd1b7 version: 3 local: 0 state: APPLYING flags: 1 conn_id: 893106 trx_id: 15068161908 seqnos (l: 806931, g: 309051933, s: 309051932, d: 309051868, ts: 393116532516870)
**2018-08-16 16:28:45 140366938635008 [ERROR] WSREP: Failed to apply trx 309051933 4 times
2018-08-16 16:28:45 140366938635008 [ERROR] WSREP: Node consistency compromised, aborting...**
2018-08-16 16:28:45 140366938635008 [Note] WSREP: Closing send monitor...
2018-08-16 16:28:45 140366938635008 [Note] WSREP: Closed send monitor.
2018-08-16 16:28:45 140366938635008 [Note] WSREP: gcomm: terminating thread
2018-08-16 16:28:45 140366938635008 [Note] WSREP: gcomm: joining thread
2018-08-16 16:28:45 140366938635008 [Note] WSREP: gcomm: closing backend
2018-08-16 16:28:45 140366938635008 [Note] WSREP: (590a3f66, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.134.18.25:4567 tcp://10.134.18.5:4567
2018-08-16 16:28:45 140366938635008 [Note] WSREP: declaring 7d266759 at tcp://10.134.19.4:4567 stable
2018-08-16 16:28:45 140366938635008 [Note] WSREP: declaring 808e3422 at tcp://10.134.18.4:4567 stable
2018-08-16 16:28:45 140366938635008 [Note] WSREP: forgetting 4533077b (tcp://10.134.18.5:4567)
2018-08-16 16:28:45 140366938635008 [Note] WSREP: forgetting 56daeaa4 (tcp://10.134.18.25:4567)
2018-08-16 16:28:45 140366938635008 [Note] WSREP: (590a3f66, 'tcp://0.0.0.0:4567') turning message relay requesting off
2018-08-16 16:28:45 140366938635008 [Note] WSREP: Node 590a3f66 state prim
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: user message in state LEAVING
2018-08-16 16:28:45 140366938635008 [Warning] WSREP: 590a3f66 sending install message failed: Transport endpoint is not connected
2018-08-16 16:28:48 140366938635008 [Note] WSREP: (590a3f66, 'tcp://0.0.0.0:4567') connection to peer 808e3422 with addr tcp://10.134.18.4:4567 timed out, no messages seen in PT3S
2018-08-16 16:28:48 140366938635008 [Note] WSREP: (590a3f66, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.134.18.4:4567
2018-08-16 16:28:49 140366938635008 [Note] WSREP: (590a3f66, 'tcp://0.0.0.0:4567') reconnecting to 808e3422 (tcp://10.134.18.4:4567), attempt 0
2018-08-16 16:28:50 140366938635008 [Note] WSREP: cleaning up 56daeaa4 (tcp://10.134.18.25:4567)
2018-08-16 16:28:50 140366938635008 [Note] WSREP: cleaning up 4533077b (tcp://10.134.18.5:4567)
2018-08-16 16:28:50 140366938635008 [Note] WSREP: evs::proto(590a3f66, LEAVING, view_id(REG,590a3f66,613)) suspecting node: 7d266759
2018-08-16 16:28:50 140366938635008 [Note] WSREP: evs::proto(590a3f66, LEAVING, view_id(REG,590a3f66,613)) suspecting node: 808e3422
2018-08-16 16:28:50 140366938635008 [Note] WSREP: evs::proto(590a3f66, LEAVING, view_id(REG,590a3f66,613)) suspected node without join message, declaring inactive
2018-08-16 16:28:50 140366938635008 [Note] WSREP: view(view_id(NON_PRIM,590a3f66,613) memb {
2018-08-16 16:28:50 140366938635008 [Note] WSREP: view((empty))
2018-08-16 16:28:50 140366938635008 [Note] WSREP: gcomm: closed
2018-08-16 16:28:50 140366636377856 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Flow-control interval: [16, 16]
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Trying to continue unpaused monitor
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Received NON-PRIMARY.
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 309051933)
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Received self-leave message.
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Flow-control interval: [0, 0]
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Trying to continue unpaused monitor
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Received SELF-LEAVE. Closing connection.
2018-08-16 16:28:50 140366636377856 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 309051933)
2018-08-16 16:28:50 140366636377856 [Note] WSREP: RECV thread exiting 0: Success
2018-08-16 16:28:50 140366938635008 [Note] WSREP: recv_thread() joined.
2018-08-16 16:28:50 140366938635008 [Note] WSREP: Closing replication queue.
2018-08-16 16:28:50 140366938635008 [Note] WSREP: Closing slave action queue.
2018-08-16 16:28:50 140366938635008 [Note] WSREP: /usr/sbin/mysqld: Terminated.
谢谢你的帮助。:) 请原谅我的基本 mariaDB/Galera 知识。
解决方案
我们成功地在测试中重现了该问题。问题是 MyISAM 表更改为 INNODB,然后表被删除。由于表有不同的数据,MySQL 在所有其他节点上停止,Galera 崩溃。
更改为 INNODB 后,将复制任何新数据,但跨节点不同的现有数据也将保留在表中。现在,如果我们从表中删除表或未复制的记录,则该记录在其他节点上不可用,因此集群崩溃。所以在更改表存储引擎之前需要确定这一点。
推荐阅读
- javascript - 如何保持状态不断出现在 JavaScript 函数到 HTML
- mysql - 我在尝试创建表时不断收到 mysql 中的 1064 错误
- reactjs - 如何在 React 中使用 react-date-range 发布请求日期?
- c# - Xamarin GUI 更新未发生异步
- powershell - *已解决* PowerShell 脚本来实现 ipsec 规则
- postgresql - PostgreSQL 中时间格式的常见做法
- perl - html2text 命令行 打破 html
- r - 如何根据条件删除一列的值,增加其他列的数量
- python - 如何在特定列中的单元格值低于 1000 亿的 CSV 中删除带有熊猫的行?
- python - 将源拆分为资源名称/区域/项目的正确方法是什么?