Multi-Column IN clause – Unexpected MySQL Issue

April 5, 2008
Author
Peter Zaitsev
Share this Post:

We have an application which stores massive amount of urls. To save on indexes instead of using URL we index CRC32 of the URL which allows to find matching urls quickly. There is a bit of chance there would be some false positives but these are filtered out after reading the data so it works all pretty well.

If we just process urls one by one it works great:


Handling URLs one by one is however not efficient if you’re processing millions of them so we tried to do bulk fetches:

As you can see just using multiple column IN makes MySQL to pick doing full table scan in this case, even though the cardinality on the first column is almost perfect. I did some more testing and it looks like a bug or missing optimizer feature.

I should not be surprised though as multi-column in is not the most used MySQL feature out there.

For given application case we could simply rewrite query using more standard single column IN clause:

Theoretically speaking this query is not equivalent to the first one – because row having url_crc=2752937066 and url=’http://www.coxandforkum.com/’ would match it, while it should not. It however does not happen in our case as url_crc is functionally dependent on url so both queries are equivalent.

So we’ve got our work around and can forget about the issue and MySQL team gets yet another bug to deal with.
What worries me again is – this is very simple case which seems to to be generally broken which raises a question how good coverage MySQL tests have.

Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Far
Enough.

Said no pioneer ever.
MySQL, PostgreSQL, InnoDB, MariaDB, MongoDB and Kubernetes are trademarks for their respective owners.
© 2026 Percona All Rights Reserved