MariaDB backup tool Mariabackup fails with an error

Problem description

We recently upgraded from MariaDB 5.5 to 10.2 and switched from innobackupex to Mariabackup (a fork of xtrabackup). Every attempt at a full backup fails.

I am running the backup with:

sudo mariabackup --backup --target-dir /mnt/database_backups/test --user backups --password REDACTED

The output of the command is as follows:

180501 11:53:30 Connecting to MySQL server host: localhost, user: backups, password: set, port: 3306, socket: /var/run/mysqld/mysqld.sock
Using server version 10.2.14-MariaDB-10.2.14+maria~trusty-log
mariabackup based on MariaDB server 10.2.14-MariaDB debian-linux-gnu (x86_64)
mariabackup: uses posix_fadvise().
mariabackup: cd to /var/lib/mysql/
mariabackup: open files limit requested 0, set to 1024
mariabackup: using the following InnoDB configuration:
mariabackup:   innodb_data_home_dir = .
mariabackup:   innodb_data_file_path = ibdata1:12M:autoextend
mariabackup:   innodb_log_group_home_dir = ./
mariabackup: using O_DIRECT
2018-05-01 11:53:30 140057835345792 [Note] InnoDB: Number of pools: 1
mariabackup: Generating a list of tablespaces
2018-05-01 11:53:30 140057835345792 [Warning] InnoDB: Allocated tablespace ID 2997 for warehouse/warehouses, old maximum was 0
180501 11:53:34 >> log scanned up to (2154583932391)
180501 11:53:34 [01] Copying ./ibdata1 to /mnt/database_backups/test/ibdata1
180501 11:53:35 >> log scanned up to (2154583953963)
180501 11:53:35 [01]        ...done
180501 11:53:35 [01] Copying ./warehouse/warehouses.ibd to /mnt/database_backups/test/warehouse/warehouses.ibd
180501 11:53:35 [01]        ...done

-- MORE Copying... ...done lines

180501 12:09:59 [01] Copying ./vioadmin/amazon__product_blacklist.ibd to /mnt/database_backups/test/vioadmin/amazon__product_blacklist.ibd
180501 12:09:59 [01]        ...done
180501 12:09:59 Executing FLUSH NO_WRITE_TO_BINLOG TABLES...
180501 12:09:59 Executing FLUSH TABLES WITH READ LOCK...
180501 12:09:59 Starting to backup non-InnoDB tables and files
180501 12:09:59 [01] Copying ./warehouse/warehouse_actions_archive.frm to /mnt/database_backups/test/warehouse/warehouse_actions_archive.frm
180501 12:09:59 [01]        ...done

-- MORE Copying... ...done lines

180501 12:10:01 Finished backing up non-InnoDB tables and files
180501 12:10:01 [01] Copying aria_log_control to /mnt/database_backups/test/aria_log_control
180501 12:10:01 [01]        ...done
180501 12:10:01 [01] Copying aria_log.00000001 to /mnt/database_backups/test/aria_log.00000001
180501 12:10:01 [01]        ...done
180501 12:10:01 [00] Writing xtrabackup_binlog_info
180501 12:10:01 [00]        ...done
180501 12:10:01 Executing FLUSH NO_WRITE_TO_BINLOG ENGINE LOGS...
mariabackup: The latest check point (for incremental): '2154617240383'
mariabackup: Stopping log copying thread.

2018-05-01 12:10:01 140057835345792 [Note] InnoDB: Read redo log up to LSN=2154583953920
180501 12:10:01 >> log scanned up to (2154583953920)
180501 12:10:01 Executing UNLOCK TABLES
180501 12:10:01 All tables unlocked
180501 12:10:01 [00] Copying ib_buffer_pool to /mnt/database_backups/test/ib_buffer_pool
180501 12:10:01 [00]        ...done
180501 12:10:01 Backup created in directory '/mnt/database_backups/test/'
MySQL binlog position: filename 'mariadb-bin.005485', position '28883329', GTID of the last change '0-1-241386'
180501 12:10:01 [00] Writing backup-my.cnf
180501 12:10:01 [00]        ...done
180501 12:10:01 [00] Writing xtrabackup_info
180501 12:10:01 [00]        ...done
mariabackup: Redo log (from LSN 2154583538384 to 2154583953920) was copied.
mariabackup: error: failed to copy enough redo log (LSN=2154583953920; checkpoint LSN=2154617240383).

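Can anyone shed light on what the problem might be? Our data directory is about 94G, so the database is fairly large, and you can see that the run takes about 17 minutes. That is similar to what we saw with innobackupex before.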

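As you can see from the log above, midway through the backup there are lines starting with Executing FLUSH NO_WRITE_TO_BINLOG TABLES. I am not sure whether these are normal or a problem, but it seems a bit odd that they are scattered among hundreds of Copying lines. The tables listed after them are actually all InnoDB, even though the log says Starting to backup non-InnoDB tables and files.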

Thanks for your help.

Tags: mysql, database, backup, mariadb, innobackupex

Solution


I created Mariabackup 10.2. Its redo log parsing code differs from that of Percona xtrabackup and Mariabackup 10.1.

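Could you share the complete log so that we can work out exactly why it fails? It would be most convenient for us if you filed a new MDEV bug at https://jira.mariadb.org/, shared the details there, and posted a link to the issue here.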

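I have two hypotheses. Either the Copy_online phase of redo log copying stopped prematurely because of an error, or heavy InnoDB background activity caused redo log to be written after FLUSH TABLES WITH READ LOCK was issued near the end of the backup.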
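
One rough way to look at the first hypothesis is to check how far the `log scanned up to` progress lines advanced during the run. The sketch below is only an illustration: `scanned_lsns` and `SAMPLE` are hypothetical names, and the sample holds just the three progress lines visible in the excerpt posted in the question. That excerpt elides many lines, so the result is a hint and not a diagnosis.

```python
import re
from datetime import datetime

# Matches mariabackup progress lines such as:
#   180501 11:53:34 >> log scanned up to (2154583932391)
SCAN_RE = re.compile(r"^(\d{6} \d{2}:\d{2}:\d{2}) >> log scanned up to \((\d+)\)")


def scanned_lsns(log_text):
    """Return (timestamp, lsn) pairs for every progress line in the log text."""
    points = []
    for line in log_text.splitlines():
        m = SCAN_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group(1), "%y%m%d %H:%M:%S")
            points.append((ts, int(m.group(2))))
    return points


# The three progress lines visible in the posted excerpt (assumption: the
# elided parts of the log may contain more of them).
SAMPLE = """\
180501 11:53:34 >> log scanned up to (2154583932391)
180501 11:53:35 >> log scanned up to (2154583953963)
180501 12:10:01 >> log scanned up to (2154583953920)
"""

points = scanned_lsns(SAMPLE)
first_ts, first_lsn = points[0]
last_ts, last_lsn = points[-1]
elapsed = (last_ts - first_ts).total_seconds()
print(f"LSN advanced by {last_lsn - first_lsn} bytes over {elapsed:.0f} s")
```

Run against the full log file instead of `SAMPLE`, a scanned LSN that barely moves while the server keeps writing would be consistent with the copying thread having stalled.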

Either way, the Copy_last phase apparently cannot copy the remaining log, because the circular redo log files were overwritten in the meantime.
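
To put a number on how much log was missing, the two LSNs in the final error line can be subtracted, since InnoDB LSNs count bytes of redo log. This is a minimal sketch: `redo_gap` is a hypothetical helper, and the regular expression is written only against the error message quoted in the question.

```python
import re

# Pattern for the error line quoted in the question.
ERR_RE = re.compile(
    r"failed to copy enough redo log "
    r"\(LSN=(\d+); checkpoint LSN=(\d+)\)"
)


def redo_gap(error_line):
    """Return checkpoint LSN minus copied LSN, or None if the line does not match."""
    m = ERR_RE.search(error_line)
    if m is None:
        return None
    copied, checkpoint = int(m.group(1)), int(m.group(2))
    return checkpoint - copied


line = ("mariabackup: error: failed to copy enough redo log "
        "(LSN=2154583953920; checkpoint LSN=2154617240383).")
gap = redo_gap(line)
print(f"checkpoint is {gap} bytes ({gap / 2**20:.1f} MiB) ahead of the copied log")
```

For the log in the question this works out to roughly 31.7 MiB of redo log between the last copied LSN and the latest checkpoint. Comparing that figure with the server's total redo log size would indicate whether the files could have wrapped around during the backup.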

Edit: Someone else has filed https://jira.mariadb.org/browse/MDEV-16367 for this issue. If you have more information, please submit it there.

