Resume Your Failed pt-online-schema-change Job

June 14, 2024
Author
Sveta Smirnova
Share this Post:

Starting from Percona Toolkit 3.6.0, you can resume pt-online-schema-change if it was interrupted. This blog describes the prerequisites and usage of the new –resume option.

To restart the job, you need to know where it failed. This is why the first option you must use is –history. It instructs pt-online-schema-change to store its progress in the history table. The history table name is percona.pt_osc_history by default. You can overwrite this name with the option –history-table. The history table structure is as follows:

Where

  • job_id is the unique identifier of the pt-online-schema-change run
  • db is the name of the database
  • tbl is the name of the altered table
  • new_table_name is the name of the table copy that pt-online-schema-change creates to perform its job
  • altr is the ALTER TABLE command that you specify as the –alter argument
  • args are the other arguments passed to the pt-online-schema-change command
  • lower_boundary is the lower boundary of the latest proceed chunk
  • upper_boundary is the upper boundary of the latest proceed chunk
  • done is the indicator of whether the job is finished or not
  • ts is the timestamp recorded when the job entry is updated. Practically, this is the timestamp when the latest chunk has been processed.

If your table index includes binary columns, such as BLOB or BINARY, create a history table with the option –binary-index. In this case, lower_boundary and upper_boundary will be of the BLOB datatype instead of TEXT. The same idea is used for the –replicate table in pt-table-checksum.

Once you run pt-online-schema-change with the –history option, it will create a record with the job details. If the job fails, you can easily find its unique identifier either in the tool output or in the history table and restart it.

Here is an example of the percona.pt_osc_history table content after a few pt-online-schema-change runs:

In the output above,

  •  Job 1 finished successfully:
  •  Jobs 2 and 3 did not even start:
  •  Job 4 was interrupted in the middle:

To restart the job, run pt-online-schema-change with option –resume=ID_OF_YOUR_PREVIOUSLY_FAILED_JOB. The tool will check if the previously created table copy and its triggers are still in place and will start the job from the chunk where it failed before.

You can use option –history together with –resume or you can skip it. Note, that if you do not use option –history, you won’t be able to restart the job if it fails one more time.

Another prerequisite for option –resume is –no-drop-new-table and –no-drop-triggers. If a new table and triggers are dropped, pt-online-schema-change cannot restart the job because no previously copied data is present.

Summary

If you want to resume pt-online-schema-change after the failure, use it with options –history –no-drop-new-table –no-drop-triggers. If the job fails, you can resume it by providing the job ID as an argument for the option –resume. You will find the ID of the failed job in the pt-online-schema-change output and the history table.

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Far
Enough.

Said no pioneer ever.
MySQL, PostgreSQL, InnoDB, MariaDB, MongoDB and Kubernetes are trademarks for their respective owners.
© 2026 Percona All Rights Reserved