August 31, 2014

When is MIN(DATE) != MIN(DATE) ?

Inspiration for this post is courtesy of a friend and former colleague of mine, Greg Youngblood, who pinged me last week with an interesting MySQL puzzle. He was running Percona Server 5.5.21 with a table structure that looks something like this:

When he ran this query:

The result came back as 2012-06-22 10:28:16. However, when he ran a slightly different query:

The answer which came back was also completely different: 2011-08-22 11:27:27. This is the correct response, not the date from 2012. We considered a few ideas to try to diagnose the issue, one of which being to try to force MySQL to use a different index. Imagine our surprise when we tried a FORCE INDEX on (bar) or an IGNORE INDEX(uid) and we got a completely different answer: 2012-06-21 20:36:35.

So, we then tried the “obvious” solution:

And the result?

Aha! Bad data! We hypothesized that the MIN() function was doing some sort of internal conversion to a unix_timestamp/integer or a ctime; indeed, running SELECT MIN(UNIX_TIMESTAMP(update_time)) …. does give a result of zero, whereas doing a CAST() or a CONVERT() on the update_time field first (so that it would be treated explicitly as a string) produces the 0024-06-22 result.

The documentation for MySQL 5.5 states that the supported range for the DATETIME type is 1000-01-01 00:00:00 through 9999-12-31 23:59:59. It goes on to say that ‘For the DATE and DATETIME range descriptions, “supported” means that although earlier values might work, there is no guarantee.’ It turned out that an application had submitted the odd datetime, and because the value passed the date-format check, it was allowed into the database. Oooops.

What, then, can we learn from this?

  • Unfortunately, setting sql_mode does not help you here. Although 0024-06-21 is technically outside the supported range for DATE/DATETIME, neither TRADITIONAL, STRICT_ALL_TABLES, or STRICT_TRANS_TABLES produces so much as a warning. Is this a bug or just documented behavior? I suppose that depends on your perspective.
  • When the documentation says that there is “no guarantee that something will work”, just don’t risk it.
  • Even if a piece of data matches the expected format, that does not mean that it’s a valid value. So, perhaps the most important lesson of all: Always, always range-check your input! Format-checking is not enough.

One final aside – I was able to duplicate this same issue with Percona Server 5.5.25a, but in my case, the behavior was even stranger. I created a table with a similar structure:

and then inserted a few bogus rows with the 0024-06-21 date. I then proceeded to dump a few thousand random datetimes into this table via a simple Perl script. Originally, it looked like I wasn’t going to be able to duplicate the situation – running “SELECT MIN(update_time) FROM foo” on my table produced what might arguably be considered the correct answer (1058-11-06 00:00:00), which was the minimum datetime value greater than the minimum supported datetime of 1000-01-01 00:00:00. But then I decided to move my bogus rows around by changing their PK, and this is what happened:

Very bizarre. On one hand, it seems like MySQL is using date windowing for two-digit dates (this would explain 2024-06-21), but on the other hand, it hasn’t actually changed any data in those DATETIME values, and now it’s definitely returning the wrong answer. Always range check your input!

About Ernie Souhrada

Ernie joined Percona in April 2012 as a Senior Consultant. In his previous lives, he has been everything from a Perl/Java developer to a Linux sysadmin, a MySQL DBA to a Cisco network engineer, and a security auditor to an IT engineering manager, many of these things all at the same time. When not working on MySQL, he might be found on the ski slope, at a psytrance festival, or at the nearest sushi bar.

Comments

  1. I’ve found that bizarre values tend to be on the extremes, e.g. all-zero dates or dates like you mentioned. Probably not too hard to find them with a LIMIT 1 query, ordering by DESC too to find 9999-99-99. Maybe a Percona Toolkit tool could iterate through all tables, look for indexed date/time columns, and check the first and last to see if there’s garbage?

  2. Sergei Golubchik says:

    Out of curiosity I’ve tried your test case on MariaDB-5.5.
    And I’ve got none of this weird behavior, it always consistently gave me “0024-06-21 10:35:55″ as the minimum datetime value.

  3. Wow. It looks like there are still skeletons in the closets in the edge cases.
    If MariaDB is cleaning these up it is great !

  4. The weird behavior is not seen the latest MySQL 5.6 Development Milestone Releases either.

  5. Sveta Smirnova says:

Speak Your Mind

*