pt-online-schema-change is an amazing tool for assisting in table modifications in cases where ONLINE ALTER is not an option. But if you have foreign keys, this could be an interesting and important read for you.

Tables with foreign keys are always complicated, and they have to be handled with care. The use case I am going to discuss here is one such example that became a source of trouble. We have to be sure of using and understanding the options we use for specific changes and table design in order to better handle the changes.

Setting up the lab with similar tables (removing extra fields)

Now, we execute the pt-osc to make the change. We are adding a new column to the ‘parent’ table:

Let us verify that the new table is created, data moved, and child tables are pointing to the new table:

Note that the following triggers are still active since we used --no-drop-triggers above:

Now that we have completed setting up the table, executing the online alter, let us work on generating the issue at hand.

Let’s issue a DELETE:

Executes successfully, but look at ENGINE INNODB STATUS:

This Foreign Key error is reported because of the foreign key dependency the record from _new table is not deleted:

Now, let’s try to UPDATE a record:

 

So, what happened to the new table in this case?

We saw the UPDATEd value in the “recordID” column propagated, meaning record ID 22 is present in the _parent_kv_new table, but we also have the old recordId value of “2”!

Reasoning

When we execute the DELETE or UPDATE on the parent table, the triggers are invoked for the _parent_kv_new table. Both delete and update triggers have “DELETE IGNORE” query in them:

Update trigger:

Delete trigger:

The _parent_kv_new has child tables child_2 and child1, which refer to the record being deleted. Thus we generate the error “Foreign key constraint fails” because the child table has the data being deleted from the parent.

Now, triggers are executed within the same transaction, and upon their failure, the invoking transaction should also be rolled back.

“With transactional engines, triggers are executed in the same transaction as the statement that invoked them.”

We do not see this happening because the triggers are using “DELETE IGNORE,” and the errors are being ignored here. This causes the transactions to commit on the parent table, though they do not get exactly replicated on the child table.
That’s why you see that in UPDATE event, the recordID 22 has been created but recordID 2 was not deleted, instead it was silently ignored.

Analysis and conclusion

The options used in pt-online-schema-change tool “–no-swap-tables –no-drop-new-table –no-drop-triggers” played a crucial part in this issue which caused the triggers to move away but tables were not swapped yet! The errors were specific to the changes being made on parent-table getting transferred via triggers to the _parent_kv_new causing constraints to fail.

This exercise exposes the nature of online schema change tools working with tables having foreign key constraints. It is better not to use pt-online-schema-change (or any other online schema change tool) for the tables with foreign keys, specifically with additional constraints.

One may argue about letting the tool swap the table along with the changes instead of doing the swap manually, and that would work. But remember, if the table is too busy, the rename table operation can also fail by timing out, and we’ll be at the same stage.

Even though foreign keys are theoretically attractive, their real-world implementation is cumbersome to handle. If possible, push the FK constraint logic to the application.

The best approach is to perform schema changes by doing direct ALTER on the tables with foreign keys. If you choose pt-online-schema-change, consider the correct options and choose a low-traffic time so that the tool can handle the table swap.

Improvements

The issue here is that the pt-osc is executed with options where the source table was not swapped along with the foreign key change.

Note that in the case of alter-foreign-keys-method=drop_swap, we’re restricting the inclusion of –no-drop-new-table with an error:

https://github.com/percona/percona-toolkit/blob/aa1ac0e6889168fddf73c3a72d447e9ea0c0c63b/bin/pt-online-schema-change#L8750C1-L8752C8

We should ensure the same restrictions for “rebuild_constraints.” It should not progress if we have—-no-swap-tables—-no-drop-new-table—- no-drop-triggers. This has already been reported as a bug and is being tracked by Percona Devs.


Percona Distribution for MySQL is a complete, stable, scalable, and secure, open source MySQL solution, delivering enterprise-grade database environments for your most critical business applications. Deploy anywhere and implement easily with one-to-one compatibility with MySQL Community Edition.

 

Try Percona Distribution for MySQL today!

Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments