October 26, 2014

How to monitor ALTER TABLE progress in MySQL

While working on a recent support issue as a Percona Support Engineer,  I got one question from a customer asking how to monitor ALTER TABLE progress. Actually, for MySQL 5.5 and prior versions, it’s quite difficult to ALTER the table in a running production environment especially for large tables (with millions records). Because it will rebuild and lock the table affecting the performance as well as our users. Therefore even if we start ALTER it’s really important to know when it will finish. Even while creating the index, ALTER TABLE will not rebuild the table if fast_index_creation is ON but still it might lock the table.

fast_index_creation feature was introduced in MySQL 5.5 and higher versions. Also available in MySQL 5.1 with the InnoDB Plugin

From MySQL 5.6 and later, “Online DDL” feature has been introduced, which is enhancing many other types of ALTER TABLE operations to avoid “copying the table” and “locking.” It  also allows SELECT, INSERT, UPDATE, and DELETE statements to run while the table is being altered. So in latest version, we can control the copy of file and locking by using ALGORITHM and LOCK options. But even in MySQL 5.6, there are some ALTER operations which still required table rebuild i.e ADD/DROP column, change data type, ADD/DROP primary key etc. You can check table here for more details.

Summary of Online Status for DDL Operations”  http://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html

So in any case, if required, we can check the ALTER TABLE progress with below solutions. 

One of the solutions is to use the Percona Toolkit utility, pt-online-schema-change which will ALTER the table without locking it for long time and show the progress. i.e

While using pt-online-schema-change, many times I get questions such as: “What will happen to those data changes (INSERT/UPDATE/DELETE) which are performing during the ALTER TABLE?”

Here, I would like to explain briefly about it. While running pt-online-schema-change, if we will check data dir,

We can see that it will create triggers (separate trigger for INSERT, UPDATE and DELETE) on the original table to update corresponding rows to the new table( _nil_test_new). So any modifications happened to the data in original tables during the copy, will be reflected in the new table.

NOTE: This tool will not work if any triggers are already defined on the table.

But what if we don’t want to use pt-online-schema-change and run regular ALTER TABLE on mysql prompt? After some research I found many ways to calculate the progress of ALTER TABLE specially with innodb_file_per_table is ON. (innodb_file_per_table is ON by default in MySQL 5.6.)

  • Calculate the progress by checking the size of temporary tablespace. 

With innodb_file_per_table=1,when we are running ALTER TABLE, innodb creates a temporary tablespace within the same data directory with random name starting from #sql ending with .ibd like #sql-1c80_27.ibd. i.e

While altering table, innodb reads original ibd file like nil_test.ibd and writes new pages to #sql-1c80_27.ibd. So with file size of nil_test.ibd and temporary #sql-1c80_27.ibd , we can check the ALTER TABLE progress with something like, 

When we run ALTER on mysql, we can simply run this script in data dir and check the progress like,

There are couple of things which we need to consider while using this script.

  1. We have to change script with $TABLENAME as per your requirement. It will work for only one ALTER TABLE.
  2. Script should run from database dir (i.e /var/lib/mysql/nil)
  3. This calculations are an approximation, because new table can be a bit different in size. For example, deleted rows or fragmentation can cause accuracy problems. 

I would like to mention here that we cannot check the progress of every ALTER TABLE with the above formula. For example, with fast_index_creation, It will create the table without any secondary indexes, then adding the secondary indexes after the data is loaded so it will not create temp tablespace (.ibd file) while creating secondary index. This process is very interesting, let me explain here.

When we add secondary index with ‘fast_index_creation‘, it will only create #sql***.frm file to update table structure but not temporary .ibd file. It will do some file sorts and then add directly index to original table.ibd file. so when you monitors those files, initially you didn’t see that file size increasing, but after some time (after files sorts) file size starts increasing till the end of ALTER TABLE. i.e

Initially, you’ll see

Then suddenly .ibd file size starts increasing…till end.

This will happen if there is only one secondary index. If there are multiple secondary indexes then for each index, process will pause, do file sorts and then add index so the number of pauses in file size increases, will be equal to number of secondary indexes.

  • Calculate the progress by checking the records in information_schema.GLOBAL_TEMPORARY_TABLES 

When the ALTER TABLE is running, we can also check GLOBAL_TEMPORARY_TABLES in information_schema and it will give you records count which are inserted in temporary table. i.e

  •  Calculate the progress by checking the Handler_read_rnd_next status variable (global counters).  

While running ALTER TABLE, we can also check the handler_read_rnd_next status variable by “SHOW GLOBAL STATUS LIKE ‘Handler_read_rnd%’ OR mysqladmin extended. Check this 2008 post titled “How to estimate query completion time in MySQL” by Baron Schwartz. He describes it very well.

NOTE: This won’t need innodb_file_per_table = ON  but we can use this option only when there are no other parallel transactions running.

So normally in any case, it will become really difficult to find the ALTER TABLE progress unless MySQL itself will provide some feature like MariaDB is providing progress in “SHOW PROCESSLIST”  https://mariadb.com/kb/en/show-processlist/

I’ve installed MariaDB 5.5 locally and tried to check, along with “SHOW PROCESSLIST”, you can also check progress by running query against information_schema.processlist table. But I observed that progress time is different in both the output.  It looks like information_schema.processlist table gives accurate progress for ALTER TABLE.

Conclusion: As we know, some ALTER TABLE can cause problems like server hang or crash. So whenever we are planning to run it on productions servers, it has to be well planned. As MySQL doesn’t provide a “progress bar,” we can use the above explained methods to check progress. If anyone knows of any other method aside from the above then please share in the comments and I would glad to add it in this post.

About Nilnandan Joshi

Nilnandan officially started with Percona as a Support Engineer. Before joining Percona, he has worked as a MySQL Database administrator with different types of service based companies managing high-traffic websites and web applications. Nilnandan has extensive experience in database design and development, database management, client management, security/documentations/training, implementing DRM solutions, automating backups and high availability.

Nilnandan is based at Pune (India). In his spare time, he likes to listen Indian classical/semi-classical music, watching tv, playing cricket/badminton and hang out with his family.

Comments

  1. Kevin says:

    I’ve been working on this now for a little while, but is there any solution like the shell script for partitioned tables?

    I’ve been working on one in python but the tmp table ends up being far larger than the table I’m altering when using partitions.

  2. Nilnandan Joshi says:

    Hi Kevin,

    I think, same solution you can use for partitioned tables as it will create temp tablespace for each partition. But yes, as I said in the post “This calculations are an approximation, because new table can be a bit different in size.” so it can end up with large temp table size but it should not become very large. Have you checked if there are any bugs related to this? Please let us know which MySQL/PS version, you are using.

  3. Kevin says:

    Nilnandan,

    It seems I spoke too soon. The shell script doesn’t work for partitioned tables, but the python script I wrote does work… it just seems I had an alter die that left behind the temp table partitions of a different table. Sorry for the confusion.

  4. Amit Jain says:

    Any view on myisam table’s ALTER progress bar ?

  5. martin says:

    Is’t possible calculate required tmp (tmpdir) disk space for online DDL queries?
    As well in the case of online DDL with inplace operation (5.6 and higher) is used tmpdir too. But imho not within same size as whole table.

  6. Aaron Ross says:

    This is great. I used the temp table size trick to help me figure out if I need to start freaking out or not. And I didn’t! So thanks for sharing!

Speak Your Mind

*