MySQL 5.7, alongside other many improvements, brought bulk load for creating an index (WL#7277 to be specific), which made ADD INDEX operations much faster by disabling redo logging and making the changes directly to tablespace files. This change requires extra care for backup tools. To block DDL statements on an instance, Percona Server for MySQL implemented LOCK TABLES FOR BACKUP. Percona XtraBackup (PXB) uses this lock for the duration of the backup. This lock does not affect DML statements.
MySQL 5.7 doesn’t have an option to block an instance against DDL and to allow all MDL’s. Thus, Percona XtraBackup has also implemented –lock-ddl-per-table. Before we go into other details, let’s understand how –lock-ddl-per-table works up to now:
This approach works on the promise that if an MLOG_INDEX_LOAD event (Redo log event generated by bulk load operations to notify backup tools that changes to data files have been omitted from redo log) is encountered by the redo follow thread, it’s safe to continue as tablespaces that we have already copied are protected by MDL and the MLOG_INDEX_LOAD event is for a tablespace that is yet to be copied.
This promise is not always correct and can lead to inconsistent backup; here are a few examples:
Full-Text Index has its own tablespace:
|
1 |
mysql> SELECT table_id, name, space from INFORMATION_SCHEMA.INNODB_SYS_TABLES WHERE name LIKE '%FTS%';<br>+----------+----------------------------------------------------+-------+<br>| table_id | name | space |<br>+----------+----------------------------------------------------+-------+<br>| 1169 | test/FTS_000000000000002e_0000000000000508_INDEX_1 | 1157 |<br>| 1170 | test/FTS_000000000000002e_0000000000000508_INDEX_2 | 1158 |<br>| 1171 | test/FTS_000000000000002e_0000000000000508_INDEX_3 | 1159 |<br>| 1172 | test/FTS_000000000000002e_0000000000000508_INDEX_4 | 1160 |<br>| 1173 | test/FTS_000000000000002e_0000000000000508_INDEX_5 | 1161 |<br>| 1174 | test/FTS_000000000000002e_0000000000000508_INDEX_6 | 1162 |<br>| 1175 | test/FTS_000000000000002e_BEING_DELETED | 1163 |<br>| 1176 | test/FTS_000000000000002e_BEING_DELETED_CACHE | 1164 |<br>| 1177 | test/FTS_000000000000002e_CONFIG | 1165 |<br>| 1178 | test/FTS_000000000000002e_DELETED | 1166 |<br>| 1179 | test/FTS_000000000000002e_DELETED_CACHE | 1167 |<br>+----------+----------------------------------------------------+-------+<br>11 rows in set (0.01 sec) |
With the current approach, PXB will try to run a SELECT on FTS_000000000000002e_0000000000000508_INDEX_1 for example, which is not something we can do. Here the underlying table that this FTS belongs to may or may not have been protected by MDL, which can cause the FTS index to be copied without protection.
Full-Text Index has a defined name schema, on the above situation we can easily extract the table ID by translating the first 16 characters after FTS_ from hex to decimal, which would give us the underlying table that the FTS belongs to:
|
1 |
session 1> CREATE FULLTEXT INDEX full_index on joinit2 (s);<br>session 2> SELECT table_id, name, space from INFORMATION_SCHEMA.INNODB_SYS_TABLES WHERE table_id >= 1319;<br>+----------+----------------------------------------------------+-------+<br>| table_id | name | space |<br>+----------+----------------------------------------------------+-------+<br>| 1321 | test/#sql-ib1320-2000853746 | 1309 |<br>| 1322 | test/FTS_0000000000000529_00000000000005b6_INDEX_1 | 1310 |<br>| 1323 | test/FTS_0000000000000529_00000000000005b6_INDEX_2 | 1311 |<br>| 1324 | test/FTS_0000000000000529_00000000000005b6_INDEX_3 | 1312 |<br>| 1325 | test/FTS_0000000000000529_00000000000005b6_INDEX_4 | 1313 |<br>| 1326 | test/FTS_0000000000000529_00000000000005b6_INDEX_5 | 1314 |<br>| 1327 | test/FTS_0000000000000529_00000000000005b6_INDEX_6 | 1315 |<br>| 1328 | test/FTS_0000000000000529_BEING_DELETED | 1316 |<br>| 1329 | test/FTS_0000000000000529_BEING_DELETED_CACHE | 1317 |<br>| 1330 | test/FTS_0000000000000529_CONFIG | 1318 |<br>| 1331 | test/FTS_0000000000000529_DELETED | 1319 |<br>| 1332 | test/FTS_0000000000000529_DELETED_CACHE | 1320 |<br>| 1320 | test/joinit2 | 1308 |<br>+----------+----------------------------------------------------+-------+ |
FTS_0000000000000529 translates to table_id 1321, however, as you can see above, the FTS in question belongs to a temporary table. That is because when FTS is created for the first time, it has to rebuild the table to create the FTS_DOC_ID column. Again, it is not possible to copy the files under MDL protection.
Because of the per-table MDL acquisition, if a table has been created after PXB has gathered the list of tablespaces, it’s .ibd will not be copied. Instead, it will be recreated as part of the –prepare phase based on the data added to redo logs. As you can imagine, if the changes are redo-skipped after recreating the table based on redo information, the table will be incomplete.
Once a shared tablespace is parsed by the lock-ddl-per-table function, it will get a list of all tables created on that tablespace and acquire the MDL on those tables, however, there is no tablespace level MDL, which means there is nothing blocking a new table to be created on this tablespace. If the tablespace has already been copied, this will follow the previous point and be recreated at –prepare phase by parsing redo logs.
The outcome of such inconsistent backups can be unknown. In the best-case scenario, you will get a crash either in the backup/prepare phase or when using the server. Yes, a crash is a best-case scenario because you will notice the issue right away.
In the worst-case scenario, data will be missed without you noticing it. Here instead of explaining let’s reproduce it.
Get a brand-new instance of MySQL/Percona Server for MySQL 5.7 and download Percona XtraBackup 2.4 prior to 2.4.21 (same can be reproducible with MySQL/PSMySQL 8 and PXB8).
In order to reproduce this scenario, we will be using gdb to pause PXB execution of the backup at a certain time, however, one can do the same by having a big table, which will take some time to copy.
|
1 |
gdb xtrabackup ex 'set args --backup --lock-ddl-per-table --target-dir=/tmp/backup' -ex 'b mdl_lock_table' -ex 'r' |
This will start the backup. On a separate session, connect to MySQL and execute:
|
1 |
CREATE DATABASE a;<br>USE a;<br>CREATE TABLE tb1 (ID INT PRIMARY KEY, name CHAR(1));<br>INSERT INTO tb1 VALUES (3,'c'), (4, 'd'), (5, 'e');<br>CREATE INDEX n_index ON tb1(name); |
Back to gdb session, execute:
|
1 |
disa 1<br>c<br>quit |
Your backup will complete. You can now prepare it and restore it on MySQL. Now try to query a.tb1 using the below queries:
|
1 |
USE a;<br>SELECT * FROM tb1;<br>SELECT * FROM tb1 FORCE INDEX(PRIMARY);<br>SELECT * FROM tb1 FORCE INDEX(n_index);<br>SELECT * FROM tb1 WHERE name = 'd'; |
Here is an example:
|
1 |
mysql> SELECT * FROM tb1;<br>Empty set (0.00 sec)<br><br>mysql> SELECT * FROM tb1 FORCE INDEX(PRIMARY);<br>+----+------+<br>| ID | name |<br>+----+------+<br>| 3 | c |<br>| 4 | d |<br>| 5 | e |<br>+----+------+<br>3 rows in set (0.00 sec)<br><br>mysql> SELECT * FROM tb1 FORCE INDEX(n_index);<br>Empty set (0.00 sec)<br><br>mysql> SELECT * FROM tb1 WHERE name = 'd';<br>Empty set (0.00 sec) |
As you can see, querying using the PK shows that data is indeed present on the data. The index n_index exists but the corresponding index entries are not present, leading to inconsistent results, which may cause a lot of harm before you notice something is wrong.
With the above points in mind, –lock-ddl-per-table has been reworked in order to guarantee a consistent backup. Percona XtraBackup 2.4.21 and Percona XtraBackup 8.0.22 have the below main changes:
Other improvements have also been performed:
Summary
As described throughout this post, WL#7277 brought some real challenges that can affect backups in different ways, some of them not easy to spot. Percona XtraBackup will always favor consistency and has been enhanced to work around those limitations when possible (taking MDL earlier in the process) and aborting the backup when an inconsistency is inevitable.
Percona Server for MySQL users and MySQL 8 should always use –lock-ddl which is a more robust and safe lock for those types of situations.
Resources
RELATED POSTS