EmergencyEMERGENCY? Get 24/7 Help Now!

Testing MySQL partitioning with pt-online-schema-change

 | September 9, 2015 |  Posted In: Benchmarks, MySQL, Percona Toolkit

PREVIOUS POST
NEXT POST

There are cases where we need to implement MySQL partitioning, or at least test if it is an option to solve an issue. However, how are we able to test it when the table on our production environment has hundreds of millions of rows, several composite indexes and/or the size on disk is hundreds of gigabytes?

Testing environments usually don’t have all the data that production has and if they have, probably you are not testing all the use-cases at a database level. Therefore, is it possible to test MySQL Partitioning on production impacting as less as possible?

When we execute pt-online-schema-change, it creates a table, triggers, and then copies the data. As we are going to test partitioning we are going to need both tables – with and without partitioning – and we are going to use triggers to keep both tables consistent. A good thing about changing a table to use partitioning is that, usually, you won’t need to change the structure of the row, which means that you are able to use practically the same statement to insert, update or delete on both tables.

Let’s suppose that we have this sysbench table:

If we want to partition it, we execute:

We will execute pt-online-schema-change like this:

But as we are going to test partitioning, we want to:

  • keep with the original table
  • do not swap the tables
  • do not drop the triggers

That is why we are going to execute pt-online-schema-change like this:

At the end we are going to have 2 tables, sbtest, which is not partitioned and _sbtest_new which is partitioned:

first

The next step that pt-osc was going to do was to swap the tables, but we used –no-swap-tables, so we are going to do it manually. But first, we are going to add the triggers to _sbtest_new, so that it can load the data to sbtest, which will be renamed to _sbtest_old. However, we need to create the trigger now, which are going to be very similar to the ones that already exists, but with the table name _sbtest_old, and that will end up in an error as _sbtest_old doesn’t exist yet. That is why we create the triggers handling the error:

The schema is now:

second
We are going to create a table _sbtest_diff which will be the table that is going to be renamed to _sbtest_new. It doesn’t need to have indexes or be partitioned, so that it is simple:

At this point we are able to swap the tables, the command to execute will be:

The rename table will do this:

third

Now you can test performance on the table.  If we want to return to the previous stage, we just execute:

With this two “RENAME TABLE” commands, we are able to back and forth to partition and non-partition table. Once you are satisfied with your testing, the remaining task is to clean up the triggers and the tables. At the end, there are 2 possible outcomes:

  • The partitioned table is working as expected. The cleanup commands will be:

  • We decided to keep the original table, which implies execute:

Conclusion:

With this procedure, you will have both tables – with and without partitioning – synchronized and you will be able to swap between them until you decide to keep one of them.

PREVIOUS POST
NEXT POST
David Ducos

David studied Computer Science in National University of La Plata and has worked as a DBA consultant since 2008. For the past 3 years he worked with a worldwide platform of free classifieds up until he joined Percona's consulting team in November 2014. David lives near Buenos Aires, Argentina and in his free time loves to spend time with his family.

3 Comments

  • very helpful!
    How does ‘delete’ trigger work if we were to truncate one of the partitions e.g.
    alter table _sbtest_new truncate partition p0;

  • Sounds like a good plan for doing ‘side-by-side’ performance testing. Please write up your findings together with the schema and the SELECTs that show that things got faster (or slower).

    I would be especially interested to see a use case where BY HASH is useful for performance. (I have not found one yet.)

  • Nice article.

    One correction in blog:

    AFTER “We decided to keep the original table, which implies execute:”

    There will be no _sbtest_old table to drop, change it to _sbtest1_diff.

    Thanks!

Leave a Reply