EmergencyEMERGENCY? Get 24/7 Help Now!

How (not) to find unused indexes

 | October 16, 2009 |  Posted In: Insight for DBAs, MySQL


I’ve seen a few people link to an INFORMATION_SCHEMA query to be able to find any indexes that have low cardinality, in an effort to find out what indexes should be removed.  This method is flawed – here’s the first reason why:

The cardinality of status index is woeful, but provided that the application is always only sending query 1 to MySQL it’s actually a pretty good index!  It’s not always like this, but there are a lot of cases where applications have good selectivity with some queries despite what cardinality shows.

Not convinced?  Here’s reason number two:

The index on query 3 had high cardinality but should not be used since too many countries have a population greater than 1000.  An automated search for low cardinality indexes wouldn’t have revealed it’s uselessness.  For range scans, it’s very easy to lead yourself into a trap where your index can not filter out enough rows to be effective.  I see this a lot in consulting issues where customers have queries that use a BETWEEN on a date, but the window of time it is searching in is too wide.

Side Note: In some texts you’ll see people quote the numbers “20-30%” as the minimum amount of rows you have to filter down to for an index to be useful (that is, eliminate 70-80% of rows).  It’s not quite correct to quote this as an exact percentage, since this value is not fixed in MySQL and can be a much wider window (15-60%) depending on the circumstances.  In this case, MySQL flipped from tablescan to index at about 34%.

How am I supposed to find unused indexes then?
You really have to run queries against your server – there is no other way.  From there, there’s a helpful patch in 5.0-percona called INDEX_STATISTICS that can then show you which indexes were touched and which were not.

If you are not running a patched server, then the alternative is to either use a proxy that checks EXPLAIN information (like QUAN) or set your slow query log to zero microseconds (5.1 feature) and then find someway to parse and EXPLAIN all results, then subtract the indexes that were mentioned from all indexes known.  There’s an old tool called mysqlidxchx which should be able to do this.

Morgan Tocker

Morgan is a former Percona employee. He was the Director of Training at Percona. He was formerly a Technical Instructor for MySQL and Sun Microsystems. He has also previously worked in the MySQL Support Team, and provided DRBD support.


Leave a Reply