EmergencyEMERGENCY? Get 24/7 Help Now!

The small improvements of MySQL 5.6: Duplicate Index Detection

 | May 31, 2013 |  Posted In: Insight for DBAs, MySQL, Percona Toolkit


MySQL 5.6Here at the MySQL Performance Blog, we’ve been discussing the several new features that MySQL 5.6 brought: GTID-based replication, InnoDB Fulltext, Memcached integration, a more complete performance schema, online DDL and several other InnoDB and query optimizer improvements. However, I plan to focus on a series of posts on the small but handy improvements – changes and bug corrections – in MySQL 5.6 that can make our lives easier and have passed almost unnoticed by most (not all) DBAs.

Duplicate Index Detection

I commented about this on my last webinar, but did not have time to analyze it in-depth.  If you try to do something like this in MySQL 5.5, you will succeed without errors or warnings:

In previous versions of MySQL, you can create two indexes with the same columns (in the same order) and the server will not complain.

If we execute the same sentences in MySQL 5.6, the second ALTER will also succeed -and the index will be created-, but we will get a warning (note severity, to be exact):

As the message points correctly, this is a human mistake, as it is a waste of resources that could potentially impact our performance, and should be avoided. By the way, a good practice to avoid this is always naming your keys with a consistent pattern. This new behavior was introduced in 5.6.7 with the closing of this bug (although this was initially accepted as a bug as early as 2005!).

The report explains more in detail what the “will be disallowed in a future release” means. In MySQL 5.7 the checks will be stricter: in the default SQL mode, a duplicate index will throw a warning instead of a note. In strict mode, it will throw an error and the second ALTER will fail, preventing the creation of the duplicate index.

Does it mean that tools like pt-duplicate-key-checker will not be necessary for MySQL 5.6? Let’s have a look at the code implementing this feature. The warning will only be thrown if the index has not been created automatically, it is not a foreign key, and it has the exact column definition in the same order. In other words, it checks for duplicate keys, but not redundant ones. What is the difference? Let’s see an example. If we execute:

We get no warnings and no errors in 5.6, as the indexes are different. But as you may know, we can use the second index not only for filtering on both columns, but also for filtering by just the first one. Why can’t MySQL enforce this kind of constraints? For many reasons: the first one because it would break 99% of all applications out there that use MySQL, for which we at Percona tend to find redundant indexes. And second, because in some cases, we may need to have what at first seems redundant indexes but effectively they are not -for example, if one of the two indexes was unique or a foreign key.

This is the output of our tool when run on the same table, correctly identifying the redundancy:

Additionally, pt-duplicate-key-checker will detect subtle redundancies that are engine-dependent, like redundant suffixes for secondary keys in InnoDB. As some of this redundancies could be necessary, depending on the query optimizer and the MySQL version, we always recommend to check manually the optimizations proposed by Percona Toolkit. The MySQL server, of course, cannot risk to block directly all cases.

A set of MySQL utilities were introduced by Oracle recently, which includes mysqlindexcheck, similar to pt-duplicate-key-checker, but it does not detect all cases. For example:

By the way, if you want to get more familiar with this and other particularities of the latest MySQL GA release, have a look at our upcoming sessions for the “Moving to 5.6” training course in America (Austin, San Jose) and Europe (Manchester, Utrecht).

Jaime Crespo

Jaime is a former Percona employee. He was a Senior MySQL Instructor for Percona, providing training throughout Europe. He is a certified MySQL DBA, DEV and cluster DBA and his favourite areas are query and schema optimisation and solutions for HA.

Leave a Reply