pt-online-schema-change is an amazing tool for assisting in table modifications in cases where ONLINE ALTER is not an option. But if you have foreign keys, this could be an interesting and important read for you.
Tables with foreign keys are always complicated, and they have to be handled with care. The use case I am going to discuss here is one such example that became a source of trouble. We have to be sure of using and understanding the options we use for specific changes and table design in order to better handle the changes.
Setting up the lab with similar tables (removing extra fields)
1 2 3 4 5 |
mysql> CREATE TABLE `parent` ( -> `recordId` bigint(20) NOT NULL AUTO_INCREMENT, -> PRIMARY KEY (`recordId`) -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci ROW_FORMAT=COMPRESSED; Query OK, 0 rows affected (0.01 sec) |
1 2 3 4 5 6 7 8 |
mysql> CREATE TABLE `child_2` ( -> `recordId` bigint(20) NOT NULL AUTO_INCREMENT, -> `parentIdFK` bigint(20), -> PRIMARY KEY (`recordId`), -> KEY `idx_fk` (`parentIdFK`), -> CONSTRAINT `__FK_8yv6wpmlradek8p72sg6020w` FOREIGN KEY (`parentIdFK`) REFERENCES `parent` (`recordId`) -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AVG_ROW_LENGTH=377 ROW_FORMAT=COMPRESSED; Query OK, 0 rows affected (0.02 sec) |
1 2 3 4 5 6 7 8 |
mysql> CREATE TABLE `child1` ( -> `record_id` bigint(20) NOT NULL AUTO_INCREMENT, -> `parent_id_fk` bigint(20) , -> PRIMARY KEY (`record_id`), -> KEY `FK_parent_to_child1s` (`parent_id_fk`), -> CONSTRAINT `__FK_parent_to_child1s` FOREIGN KEY (`parent_id_fk`) REFERENCES `parent` (`recordId`) -> ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; Query OK, 0 rows affected (0.01 sec) |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 |
mysql> insert into parent values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10) ; Query OK, 10 rows affected (0.01 sec) Records: 10 Duplicates: 0 Warnings: 0 mysql> insert into child1 select NULL, recordId from parent; Query OK, 10 rows affected (0.01 sec) Records: 10 Duplicates: 0 Warnings: 0 mysql> insert into child_2 select NULL, recordId from parent; Query OK, 10 rows affected (0.01 sec) Records: 10 Duplicates: 0 Warnings: 0 mysql> select * from parent; +----------+ | recordId | +----------+ | 1 | | 2 | | 3 | | 4 | | 5 | | 6 | | 7 | | 8 | | 9 | | 10 | +----------+ 10 rows in set (0.00 sec) mysql> select * from child_2; +----------+---------+ | recordId | parentIdFK | +----------+---------+ | 1 | 1 | | 2 | 2 | | 3 | 3 | | 4 | 4 | | 5 | 5 | | 6 | 6 | | 7 | 7 | | 8 | 8 | | 9 | 9 | | 10 | 10 | +----------+---------+ 10 rows in set (0.00 sec) mysql> select * from child1; +-----------+-----------+ | record_id | parent_id_fk | +-----------+-----------+ | 1 | 1 | | 2 | 2 | | 3 | 3 | | 4 | 4 | | 5 | 5 | | 6 | 6 | | 7 | 7 | | 8 | 8 | | 9 | 9 | | 10 | 10 | +-----------+-----------+ 10 rows in set (0.00 sec) |
Now, we execute the pt-osc to make the change. We are adding a new column to the ‘parent’ table:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 |
[root@centos_3 ~]# pt-online-schema-change > h=localhost,D=test,t=parent > --alter="ADD COLUMN extra_parent_info LONGTEXT COLLATE utf8_unicode_ci DEFAULT NULL" > --pause-file=/tmp/pt-osc_dbkv_parent.pause > --new-table-name=_%T_kv_new > --print --progress time,10 > --no-swap-tables --no-drop-old-table --no-drop-new-table --no-drop-triggers > --tries create_triggers:10000:1,drop_triggers:10000:1,copy_rows:10000:1 > --set-vars foreign_key_checks=0,tx_isolation='READ-COMMITTED',lock_wait_timeout=5 > --critical-load threads_running=99999 --max-load Threads_running=50 > --chunk-time=1 > --alter-foreign-keys-method=rebuild_constraints > --execute No slaves found. See --recursion-method if host centos_3.localdomain has slaves. ... # A software update is available: Operation, tries, wait: analyze_table, 10, 1 copy_rows, 10000, 1 create_triggers, 10000, 1 drop_triggers, 10000, 1 swap_tables, 10, 1 update_foreign_keys, 10, 1 Child tables: `test`.`child_2` (approx. 10 rows) `test`.`child1` (approx. 10 rows) Will use the rebuild_constraints method to update foreign keys. Altering `test`.`parent`... Creating new table... CREATE TABLE `test`.`_parent_kv_new` ( `recordId` bigint(20) NOT NULL AUTO_INCREMENT, PRIMARY KEY (`recordId`) ) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci ROW_FORMAT=COMPRESSED Created new table test._parent_kv_new OK. Altering new table... ALTER TABLE `test`.`_parent_kv_new` ADD COLUMN extra_parent_info LONGTEXT COLLATE utf8_unicode_ci DEFAULT NULL Altered `test`.`_parent_kv_new` OK. 2024-05-24T19:23:44 Creating triggers... ----------------------------------------------------------- Event : DELETE Name : pt_osc_test_parent_del SQL : CREATE TRIGGER `pt_osc_test_parent_del` AFTER DELETE ON `test`.`parent` FOR EACH ROW BEGIN DECLARE CONTINUE HANDLER FOR 1146 begin end; DELETE IGNORE FROM `test`.`_parent_kv_new` WHERE `test`.`_parent_kv_new`.`recordid` <=> OLD.`recordid`; END Suffix: del Time : AFTER ----------------------------------------------------------- ----------------------------------------------------------- Event : UPDATE Name : pt_osc_test_parent_upd SQL : CREATE TRIGGER `pt_osc_test_parent_upd` AFTER UPDATE ON `test`.`parent` FOR EACH ROW BEGIN DECLARE CONTINUE HANDLER FOR 1146 begin end; DELETE IGNORE FROM `test`.`_parent_kv_new` WHERE !(OLD.`recordid` <=> NEW.`recordid`) AND `test`.`_parent_kv_new`.`recordid` <=> OLD.`recordid`; REPLACE INTO `test`.`_parent_kv_new` (`recordid`) VALUES (NEW.`recordid`); END Suffix: upd Time : AFTER ----------------------------------------------------------- ----------------------------------------------------------- Event : INSERT Name : pt_osc_test_parent_ins SQL : CREATE TRIGGER `pt_osc_test_parent_ins` AFTER INSERT ON `test`.`parent` FOR EACH ROW BEGIN DECLARE CONTINUE HANDLER FOR 1146 begin end; REPLACE INTO `test`.`_parent_kv_new` (`recordid`) VALUES (NEW.`recordid`);END Suffix: ins Time : AFTER ----------------------------------------------------------- 2024-05-24T19:23:44 Created triggers OK. 2024-05-24T19:23:44 Copying approximately 10 rows... INSERT LOW_PRIORITY IGNORE INTO `test`.`_parent_kv_new` (`recordid`) SELECT `recordid` FROM `test`.`parent` LOCK IN SHARE MODE /*pt-online-schema-change 1294 copy table*/ 2024-05-24T19:23:44 Copied rows OK. 2024-05-24T19:23:44 Rebuilding foreign key constraints... ALTER TABLE `test`.`child_2` DROP FOREIGN KEY `__FK_8yv6wpmlradek8p72sg6020w`, ADD CONSTRAINT `FK_8yv6wpmlradek8p72sg6020w` FOREIGN KEY (`parentIdFK`) REFERENCES `test`.`_parent_kv_new` (`recordId`) ALTER TABLE `test`.`child1` DROP FOREIGN KEY `__FK_parent_to_child1s`, ADD CONSTRAINT `FK_parent_to_child1s` FOREIGN KEY (`parent_id_fk`) REFERENCES `test`.`_parent_kv_new` (`recordId`) 2024-05-24T19:23:44 Rebuilt foreign key constraints OK. Not dropping old table because --no-drop-triggers was specified. Not dropping triggers because --no-drop-triggers was specified. To drop the triggers, execute: DROP TRIGGER IF EXISTS `test`.`pt_osc_test_parent_del` DROP TRIGGER IF EXISTS `test`.`pt_osc_test_parent_upd` DROP TRIGGER IF EXISTS `test`.`pt_osc_test_parent_ins` Not dropping the new table `test`.`_parent_kv_new` because --no-drop-new-table was specified. To drop the new table, execute: DROP TABLE IF EXISTS `test`.`_parent_kv_new`; Successfully altered `test`.`parent`. [root@centos_3 ~]# |
Let us verify that the new table is created, data moved, and child tables are pointing to the new table:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
mysql> show create table _parent_kv_newG *************************** 1. row *************************** Table: _parent_kv_new Create Table: CREATE TABLE `_parent_kv_new` ( `recordId` bigint(20) NOT NULL AUTO_INCREMENT, `extra_parent_info` longtext COLLATE utf8_unicode_ci, PRIMARY KEY (`recordId`) ) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci ROW_FORMAT=COMPRESSED 1 row in set (0.00 sec) mysql> show create table child_2G *************************** 1. row *************************** Table: child_2 Create Table: CREATE TABLE `child_2` ( `recordId` bigint(20) NOT NULL AUTO_INCREMENT, `parentIdFK` bigint(20) DEFAULT NULL, PRIMARY KEY (`recordId`), KEY `idx_fk` (`parentIdFK`), CONSTRAINT `FK_8yv6wpmlradek8p72sg6020w` FOREIGN KEY (`parentIdFK`) REFERENCES `_parent_kv_new` (`recordId`) ) ENGINE=InnoDB AUTO_INCREMENT=16 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci AVG_ROW_LENGTH=377 ROW_FORMAT=COMPRESSED 1 row in set (0.00 sec) mysql> show create table child1G *************************** 1. row *************************** Table: child1 Create Table: CREATE TABLE `child1` ( `record_id` bigint(20) NOT NULL AUTO_INCREMENT, `parent_id_fk` bigint(20) DEFAULT NULL, PRIMARY KEY (`record_id`), KEY `FK_parent_to_child1s` (`parent_id_fk`), CONSTRAINT `FK_parent_to_child1s` FOREIGN KEY (`parent_id_fk`) REFERENCES `_parent_kv_new` (`recordId`) ) ENGINE=InnoDB AUTO_INCREMENT=16 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci 1 row in set (0.00 sec) |
Note that the following triggers are still active since we used --no-drop-triggers
above:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 |
mysql> show triggersG *************************** 1. row *************************** Trigger: pt_osc_test_parent_ins Event: INSERT Table: parent Statement: BEGIN DECLARE CONTINUE HANDLER FOR 1146 begin end; REPLACE INTO `test`.`_parent_kv_new` (`recordid`) VALUES (NEW.`recordid`);END Timing: AFTER Created: 2024-05-24 19:23:44.32 sql_mode: ONLY_FULL_GROUP_BY,NO_AUTO_VALUE_ON_ZERO,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION Definer: root@localhost character_set_client: latin1 collation_connection: latin1_swedish_ci Database Collation: latin1_swedish_ci *************************** 2. row *************************** Trigger: pt_osc_test_parent_upd Event: UPDATE Table: parent Statement: BEGIN DECLARE CONTINUE HANDLER FOR 1146 begin end; DELETE IGNORE FROM `test`.`_parent_kv_new` WHERE !(OLD.`recordid` <=> NEW.`recordid`) AND `test`.`_parent_kv_new`.`recordid` <=> OLD.`recordid`; REPLACE INTO `test`.`_parent_kv_new` (`recordid`) VALUES (NEW.`recordid`); END Timing: AFTER Created: 2024-05-24 19:23:44.32 sql_mode: ONLY_FULL_GROUP_BY,NO_AUTO_VALUE_ON_ZERO,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION Definer: root@localhost character_set_client: latin1 collation_connection: latin1_swedish_ci Database Collation: latin1_swedish_ci *************************** 3. row *************************** Trigger: pt_osc_test_parent_del Event: DELETE Table: parent Statement: BEGIN DECLARE CONTINUE HANDLER FOR 1146 begin end; DELETE IGNORE FROM `test`.`_parent_kv_new` WHERE `test`.`_parent_kv_new`.`recordid` <=> OLD.`recordid`; END Timing: AFTER Created: 2024-05-24 19:23:44.31 sql_mode: ONLY_FULL_GROUP_BY,NO_AUTO_VALUE_ON_ZERO,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION Definer: root@localhost character_set_client: latin1 collation_connection: latin1_swedish_ci Database Collation: latin1_swedish_ci |
Now that we have completed setting up the table, executing the online alter, let us work on generating the issue at hand.
Let’s issue a DELETE:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
mysql> select * from parent; +----------+ | recordId | +----------+ | 1 | | 2 | | 3 | | 4 | | 5 | | 6 | | 7 | | 8 | | 9 | | 10 | +----------+ 10 rows in set (0.00 sec) mysql> delete from parent where recordId=1; Query OK, 1 row affected (0.00 sec) mysql> select * from parent where recordId=1; Empty set (0.00 sec) |
Executes successfully, but look at ENGINE INNODB STATUS:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
mysql> show engine innodb statusG ... ------------------------ <strong>LATEST FOREIGN KEY ERROR </strong>------------------------ 2024-05-24 19:27:37 0x7f597424b700 Transaction: TRANSACTION 4393, ACTIVE 0 sec updating or deleting mysql tables in use 2, locked 2 6 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 2 MySQL thread id 9, OS thread handle 140022177380096, query id 1846840 localhost root updating DELETE IGNORE FROM `test`.`_parent_kv_new` WHERE `test`.`_parent_kv_new`.`recordid` <=> OLD.`recordid` Foreign key constraint fails for table `test`.`child_2`: , CONSTRAINT `FK_8yv6wpmlradek8p72sg6020w` FOREIGN KEY (`parentIdFK`) REFERENCES `_parent_kv_new` (`recordId`) <strong>Trying to delete or update in parent table</strong>, in index PRIMARY tuple: DATA TUPLE: 6 fields; 0: len 8; hex 8000000000000001; asc ;; 1: len 6; hex 000000001129; asc );; 2: len 7; hex 330000019502d4; asc 3 ;; 3: SQL NULL; 4: SQL NULL; 5: SQL NULL; But in child table `test`.`child_2`, in index idx_fk, there is a record: PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 8; hex 8000000000000001; asc ;; 1: len 8; hex 8000000000000001; asc ;; |
This Foreign Key error is reported because of the foreign key dependency the record from _new table is not deleted:
1 2 3 4 5 6 7 |
mysql> select * from _parent_kv_new where recordId=1; +----------+-------------------+ | recordId | extra_parent_info | +----------+-------------------+ | 1 | NULL | +----------+-------------------+ 1 row in set (0.00 sec) |
Now, let’s try to UPDATE a record:
1 2 3 4 5 6 7 8 9 10 11 |
mysql> UPDATE parent set recordID=22 where recordId=2; Query OK, 1 row affected (0.01 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> select * from parent where recordId=22; +----------+ | recordId | +----------+ | 22 | +----------+ 1 row in set (0.00 sec) |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
mysql> show engine innodb statusG ... ------------------------ LATEST FOREIGN KEY ERROR ------------------------ 2024-05-24 19:31:24 0x7f597424b700 Transaction: TRANSACTION 4395, ACTIVE 0 sec updating or deleting mysql tables in use 2, locked 2 6 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 3 MySQL thread id 9, OS thread handle 140022177380096, query id 1846845 localhost root updating DELETE IGNORE FROM `test`.`_parent_kv_new` WHERE !(OLD.`recordid` <=> NEW.`recordid`) AND `test`.`_parent_kv_new`.`recordid` <=> OLD.`recordid` Foreign key constraint fails for table `test`.`child_2`: , CONSTRAINT `FK_8yv6wpmlradek8p72sg6020w` FOREIGN KEY (`parentIdFK`) REFERENCES `_parent_kv_new` (`recordId`) Trying to delete or update in parent table, in index PRIMARY tuple: DATA TUPLE: 6 fields; 0: len 8; hex 8000000000000002; asc ;; 1: len 6; hex 00000000112b; asc +;; 2: len 7; hex 340000018914e1; asc 4 ;; 3: SQL NULL; 4: SQL NULL; 5: SQL NULL; But in child table `test`.`child_2`, in index idx_fk, there is a record: PHYSICAL RECORD: n_fields 2; compact format; info bits 0 0: len 8; hex 8000000000000002; asc ;; 1: len 8; hex 8000000000000002; asc ;; |
So, what happened to the new table in this case?
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
mysql> select * from _parent_kv_new where recordId=2; +----------+-------------------+ | recordId | extra_parent_info | +----------+-------------------+ | 2 | NULL | +----------+-------------------+ 1 row in set (0.00 sec) mysql> select * from _parent_kv_new where recordId=22; +----------+-------------------+ | recordId | extra_parent_info | +----------+-------------------+ | 22 | NULL | +----------+-------------------+ 1 row in set (0.00 sec) |
We saw the UPDATEd value in the “recordID” column propagated, meaning record ID 22 is present in the _parent_kv_new table, but we also have the old recordId value of “2”!
Reasoning
When we execute the DELETE or UPDATE on the parent table, the triggers are invoked for the _parent_kv_new table. Both delete and update triggers have “DELETE IGNORE” query in them:
Update trigger:
1 |
DELETE IGNORE FROM `test`.`_parent_kv_new` WHERE !(OLD.`recordid` <=> NEW.`recordid`) AND `test`.`_parent_kv_new`.`recordid` <=> OLD.`recordid`; |
Delete trigger:
1 |
DELETE IGNORE FROM `test`.`_parent_kv_new` WHERE `test`.`_parent_kv_new`.`recordid` <=> OLD.`recordid`; END |
The _parent_kv_new has child tables child_2 and child1, which refer to the record being deleted. Thus we generate the error “Foreign key constraint fails” because the child table has the data being deleted from the parent.
Now, triggers are executed within the same transaction, and upon their failure, the invoking transaction should also be rolled back.
“With transactional engines, triggers are executed in the same transaction as the statement that invoked them.”
We do not see this happening because the triggers are using “DELETE IGNORE,” and the errors are being ignored here. This causes the transactions to commit on the parent table, though they do not get exactly replicated on the child table.
That’s why you see that in UPDATE event, the recordID 22 has been created but recordID 2 was not deleted, instead it was silently ignored.
Analysis and conclusion
The options used in pt-online-schema-change tool “–no-swap-tables –no-drop-new-table –no-drop-triggers” played a crucial part in this issue which caused the triggers to move away but tables were not swapped yet! The errors were specific to the changes being made on parent-table getting transferred via triggers to the _parent_kv_new causing constraints to fail.
This exercise exposes the nature of online schema change tools working with tables having foreign key constraints. It is better not to use pt-online-schema-change (or any other online schema change tool) for the tables with foreign keys, specifically with additional constraints.
One may argue about letting the tool swap the table along with the changes instead of doing the swap manually, and that would work. But remember, if the table is too busy, the rename table operation can also fail by timing out, and we’ll be at the same stage.
Even though foreign keys are theoretically attractive, their real-world implementation is cumbersome to handle. If possible, push the FK constraint logic to the application.
The best approach is to perform schema changes by doing direct ALTER on the tables with foreign keys. If you choose pt-online-schema-change, consider the correct options and choose a low-traffic time so that the tool can handle the table swap.
Improvements
The issue here is that the pt-osc is executed with options where the source table was not swapped along with the foreign key change.
1 |
--no-swap-tables --no-drop-old-table --no-drop-new-table --no-drop-triggers |
Note that in the case of alter-foreign-keys-method=drop_swap, we’re restricting the inclusion of –no-drop-new-table with an error:
1 |
" * --alter-foreign-keys-method=drop_swap does not work with --no-drop-new-table." |
1 2 3 |
if ( $alter_fk_method eq 'drop_swap' && !$o->get('drop-new-table') ) { $o->save_error("--alter-foreign-keys-method=drop_swap does not work with --no-drop-new-table."); } |
We should ensure the same restrictions for “rebuild_constraints.” It should not progress if we have—-no-swap-tables—-no-drop-new-table—- no-drop-triggers. This has already been reported as a bug and is being tracked by Percona Devs.
Percona Distribution for MySQL is a complete, stable, scalable, and secure, open source MySQL solution, delivering enterprise-grade database environments for your most critical business applications. Deploy anywhere and implement easily with one-to-one compatibility with MySQL Community Edition.
Try Percona Distribution for MySQL today!