The pt-online-schema-change tool runs DDL queries (ALTER) online so that the table is not locked for reads and writes. It is a commonly used tool by community users and customers. Using it on Amazon RDS requires knowing about some specific details. First, a high-level explanation of how the tool works.
This is an example from the documentation:pt-online-schema-change --alter "ADD COLUMN c1 INT" D=sakila,t=actor
The tool runs an ALTER on the table “actor” from the database “sakila.” The alter adds a column named “c1” of type “integer.” In the background, the tool creates a new empty table similar to “actor” but with the new column already added. It then creates triggers on the original table to update the corresponding rows in the new table. After, it starts copying rows to the new table (this is the phase that takes the longest amount of time). When the copy is done, the tables are swapped, triggers removed and the old table dropped.
As we can see, it is a tool that uses the basic features of MySQL. You can run it on MySQL, Percona Server, MariaDB, Amazon RDS and so on. But when using Amazon, there is a hidden issue: you don’t have SUPER privileges. This means that if you try to run the tool on an RDS with binary logs enabled, you could get the following error:DBD::mysql::db do failed: You do not have the SUPER privilege and binary logging is enabled (you might want to use the less safe log_bin_trust_function_creators variable) [for Statement "CREATE TRIGGER `pt_osc_db_table_del` AFTER DELETE ON `db`.`table` FOR EACH ROW DELETE IGNORE FROM `db`.`_table_new` WHERE `db`.`_table_new`.`table_id` <=> OLD.`table_id` AND `db`.`_table_new`.`account_id` <=> OLD.`account_id`"] at /usr/bin/pt-online-schema-change line 10583.
The following documentation page explains the reason for this message:
The bottom line is creating triggers on a server with binary logs enabled requires a user with SUPER privileges (which is impossible in Amazon RDS). The error message specifies the workaround. We need to enable the variable log_bin_trust_function_creators. Enabling it is like saying to the server:
“I trust regular users’ triggers and functions, and that they won’t cause problems, so allow my users to create them.”
Since the database functionality won’t change, it becomes a matter of trusting your users. log_bin_trust_function_creators is a global variable that can be changed dynamically:mysql> SET GLOBAL log_bin_trust_function_creators = 1;
Run the tool again. This time, it will work. After the ALTER is done, you can change the variable to 0 again.
After you’re done with the ALTER process, you can change the variable to “0” again.
This article discusses how to salvage a crashed pt-online-schema-change by leveraging pt-archiver and executing queries to ensure that the data gets accurately migrated. I will show you how to continue the data copy process, and how to safely close out the pt-online-schema-change via manual operations such as RENAME TABLE and DROP TRIGGER commands. The normal process to recover from a crashed pt-online-schema-change is to drop the triggers on your original table and drop the new table created by the script. Then you would restart pt-online-schema-change. In this case, this wasn’t possible.
A customer recently needed to add a primary key column to a very busy table (with around 200 million rows). The table only had a unique key on one column (called our_id below). The customer had concerns about slave lag, and wanted to ensure there was little or no lag. This, as well as the fact that you can’t add a primary key as an online DDL in MySQL and Percona Server 5.6, meant the obvious answer was using pt-online-schema-change.
Due to the sensitivity of their environment, they could only afford one short window for the initial metadata locks, and needed to manually do the drop swap that pt-online-schema-change normally does automatically. This is where no-drop-triggers and no-swap-tables come in. The triggers will theoretically run indefinitely to keep the new and old tables in sync once pt-online-schema-change is complete. We crafted the following command:pt-online-schema-change --execute --alter-foreign-keys-method=auto --max-load Threads-running=30 --critical-load Threads_running=55 --check-slave-lag mysql-slave1,mysql-slave2,mysql-slave3 --max−lag=10 --chunk-time=0.5 --set-vars=lock_timeout=1 --tries="create_triggers:10:2,drop_triggers:10:2" --no-drop-new-table --no-drop-triggers --no-swap-tables --chunk-index "our_id" --alter "ADD newcol BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST" D=website,t=largetable --nocheck-plan
You can see some of the specifics of other flags and why we used them in the Percona Toolkit Manual.
Once we ran the command the customer got concerned, as their monitoring tools weren’t showing any work done (which is by design, pt-online-schema-change doesn’t want to hurt your running environment). The customer ran strace -p to verify it was working. This wasn’t a great choice as it crashed pt-online-schema-change.
At this point, we knew that the application (and management) would not allow us to take new metadata locks to create triggers on the table, as we had passed our metadata lock window.
So how do we recover?
First, let’s start with a clean slate. We issued the following commands to create a new table, where __largetable_new is the table created by pt-online-schema-change:CREATE TABLE mynewlargetable LIKE __largetable_new; RENAME TABLE __largetable_new TO __largetable_old, mynewlargetable TO __largetable_new; DROP TABLE __largetable_old;
Now the triggers on the original table, largetable are updating the new empty table that has our new schema.
Now let’s address the issue of actually moving the data that’s already in largetable to __largetable_new. This is where pt-archiver comes in. We crafted the following command:pt-archiver --execute --max-lag=10 --source D=website,t=largetable,i=our_id --dest D=website,t=__largetable_new --where "1=1" --no-check-charset --no-delete --no-check-columns --txn-size=500 --limit=500 --ignore --statistics
We use pt-archiver to slowly copy records non-destructively to the new table based on our_id and WHERE 1=1 (all records). At this point, we periodically checked the MySQL data directory over the course of a day with ls -l to compare table sizes.
Once the table files were close to the same size, we ran counts on the tables. We noticed something interesting: the new table had thousands more records than the original table.
This concerned us. We wondered if our “hack” was a mistake. At this point we ran some verification queries:select min(our_id) from __largetable_new; select max(our_id) from __largetable_new; select min(our_id) from largetable; select max(our_id) from largetable;
We learned that there were older records that didn’t exist in the live table. This means that pt-archiver and the DELETE trigger may have missed each other (i.e., pt-archiver was already in a transaction but hadn’t written records to the new table until after the DELETE trigger already fired).
We verified with more queries:SELECT COUNT(*) FROM largetable l WHERE NOT EXISTS (SELECT our_id FROM __largetable_new n WHERE n.our_id=l.our_id);
They returned nothing.SELECT COUNT(*) FROM __largetable_new n WHERE NOT EXISTS (SELECT our_id FROM largetable l WHERE n.our_id=l.our_id);
Our result showed 4000 extra records in the new table. This shows that we ended up with extra records that were deleted from the original table. We ran other queries based on their data to verify as well.
This wasn’t a huge issue for our application, and it could have been easily dealt with using a simple DELETE query based on the unique index (i.e., if it doesn’t exist in the original table, delete it from the new one).
Now to complete the pt-online-schema-change actions. All we need to do is the atomic rename or drop swap. This should be done as soon as possible to avoid running in a degraded state, where all writes to the old table are duplicated on the new one.RENAME TABLE largetable TO __largetable_old , __largetable_new TO largetable;
Then drop the triggers for safety:DROP TRIGGER pt_osc_website_largetable_ins; DROP TRIGGER pt_osc_website_largetable_upd; DROP TRIGGER pt_osc_website_largetable_del;
At this point it is safer to wait for the old table to clear out of the buffer pool before dropping it, just to ensure there is no impact on the server (maybe a week to be safe). You can check information_schema for a more accurate reading on this:SELECT COUNT(*) FROM INFORMATION_SCHEMA.INNODB_BUFFER_PAGE WHERE TABLE_NAME = '`website`.`__largetable_old`'; +----------+ | count(*) | +----------+ | 279175 | +----------+ 1 row in set (8.94 sec)
Once this goes to 0 you can issue:DROP TABLE __largetable_old;
For general inquiries, please send us your question and someone will contact you.