I had a fun case today.
There is a set of MyISAM cache tables which store certain content, and the queries against these tables look like this:
select data from cache0003 where `key`=2342526263 and real_key='cp_140797_6460aad5d2e50d3e859e8649007686ac';
The “key” is the CRC32 of the real key. It is used to keep the index as small as possible, so if we have a cache miss we can in most cases learn it from the index without going to the disk.
So far so good.
The problem I discovered, however, is that some of these queries would take an enormous amount of time, even though CRC32 conflicts are really rare.
Looking deeper into the problem, I found out PHP and MySQL are both to blame. PHP is to blame because in the 32bit PHP build the result of the crc32() function was returned as a signed integer, while in the 64bit build of the same PHP version it became unsigned.
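To make the signed/unsigned difference concrete, here is a minimal Python sketch (not PHP, and not code from the system in question) that reinterprets the same 32 bits both ways. The helper names `to_signed32` and `to_unsigned32` are made up for illustration; the key value is the one from the query above.

```python
def to_signed32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def to_unsigned32(value: int) -> int:
    """Reinterpret the low 32 bits of value as an unsigned 32-bit integer."""
    return value & 0xFFFFFFFF


key = 2342526263                # key from the example query (unsigned view)
signed_key = to_signed32(key)   # the same bits seen as a signed int

print(signed_key)                 # -1952441033
print(to_unsigned32(signed_key))  # 2342526263
```

The bits of the checksum are identical in both cases; only the interpretation of the top bit changes, which is why values at or above 2^31 are the ones that cause trouble.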
The system initially ran on a 32bit platform, so the “key” column was defined as “int”.
When it was migrated to a 64bit platform we started getting unsigned 32bit values, which did not fit in this column any more, so MySQL was silently converting them to 2^31-1 (the largest value a signed int can hold) in just about 50% of the cases. This one is kind of expected.
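The clamping and the "about 50%" figure can be checked with a little arithmetic. This is a rough Python model of the behavior described here, not MySQL's actual code; `clamp_to_signed_int` is a hypothetical name, and it assumes out-of-range values are silently clipped to the column's limits.

```python
INT_MIN = -2**31       # smallest value a signed 32-bit INT column can hold
INT_MAX = 2**31 - 1    # largest value: 2147483647


def clamp_to_signed_int(value: int) -> int:
    """Model of silent truncation: clip value into the signed INT range."""
    return max(INT_MIN, min(INT_MAX, value))


# Unsigned CRC32 values run from 0 to 2^32-1; those from 2^31 upward do not fit.
out_of_range = 2**32 - 2**31
fraction = out_of_range / 2**32

print(clamp_to_signed_int(2342526263))  # 2147483647
print(fraction)                         # 0.5
```

Assuming CRC32 values are spread roughly evenly, half of all keys land on the single value 2147483647, which is what makes that one index entry so heavily populated.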
What was unexpected, however, is how MySQL executed select queries when the key value was out of signed int range.
Instead of simply reporting “impossible where noticed”, since the value is outside of the range of values which can possibly be in the column, MySQL truncates this value to 2^31-1, then performs an index ref lookup (traversing about half of the rows in the table, as cardinality for this constant is low) and discards all of them because none matched the supplied key value.
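One way to avoid the expensive lookup is to do the range check on the client before sending the query. The following is only a sketch of that idea in Python, under the assumption that the column stays a signed “int”; `can_match_signed_int_column` is a hypothetical helper, not something from the original system.

```python
def can_match_signed_int_column(key: int) -> bool:
    """Return True if key could be stored in a signed 32-bit INT column."""
    return -2**31 <= key <= 2**31 - 1


key = 2342526263  # key from the example query

if can_match_signed_int_column(key):
    outcome = "run the select"
else:
    # No stored value can equal this key, so it is a guaranteed cache miss.
    outcome = "skip the query"

print(outcome)  # skip the query
```

This mirrors what one might have expected the server to do on its own. It does not repair rows that were already stored with a truncated key.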
So beware, data truncation can backfire in ways you might never expect.