A couple of weeks ago, Baron Schwartz wrote an interesting post describing a rule of thumb he sometimes uses to choose the order of columns in an index. In a nutshell, he recommends putting highly selective columns first. This is a very good rule of thumb.
I would like to add another rule of thumb: columns that are used for equality comparisons are better than columns that are used for range queries.
If one has a query “select count(*) from foo where a BETWEEN 10 and 20 and b=5”, an index of (b,a) will always be as good, and likely better, than an index of (a,b), regardless of the selectivity of “a” or “b”.
This is because of how MySQL performs range queries. For an index of (a,b), a range query will be done from (10,5) to (20,5). This will require many rows to be processed where 10 < a < 20, but b != 5. On the other hand, for an index of (b,a), a range query will be done from (5,10) to (5,20). In this case, only rows that are going to be counted are processed. No rows that do not fall in the bounds of the query are processed.
There’s a general rule of thumb that more selective fields should go first. And intuitively, an equality comparison is likely to yield a more selective result. But my point is that it doesn’t matter in this case which is more selective. It is always better to put an equality field before a range field, or at the very least, no worse.
Percona’s widely read Percona Data Performance blog highlights our expertise in enterprise-class software, support, consulting and managed services solutions for both MySQL® and MongoDB® across traditional and cloud-based platforms. The decades of experience represented by our consultants is found daily in numerous and relevant blog posts.
Besides specific database help, the blog also provides notices on upcoming events and webinars.
Want to get weekly updates listing the latest blog posts? Subscribe to our blog now! Submit your email address below.