Edge-case behavior of INSERT…ODKU

October 27, 2012
Author
Ernie Souhrada
Share this Post:

A few weeks back, I was working on a customer issue wherein they were observing database performance that dropped through the floor (to the point of an outage) roughly every 4 weeks or so. Nothing special about the environment, the hardware, or the queries; really, the majority of the database was a single table with an auto-incrementing integer PK and a secondary UNIQUE KEY.

The queries being run against this table were almost exclusively INSERT … ON DUPLICATE KEY UPDATE (INSERT ODKU), with the columns from the INSERT part of the statement corresponding to the columns in the secondary index, and they were coming in at a rate of approximately 1500 to 2000 per second, sustained, 24h per day. The mathematically-astute among you may already be able to see where this is going.

For purposes of discussion, we can use the following table to illustrate the situation:

Now consider the following sequence of events.

Nothing crazy here, right? We’ve inserted one row into an empty table, and if we were to do a SHOW CREATE TABLE we’d see that the AUTO_INCREMENT counter is currently set to 2. If we do an INSERT … ODKU on this table, we see the following:

And now, even though we didn’t insert a new row, our auto-increment counter has increased to 3. This is actually expected behavior; InnoDB checks constraints in the order in which they were defined, and the PRIMARY KEY is always going to be considered as being defined first. So, MySQL checks our INSERT, sees that the next auto-inc value is available, and claims it, but then it checks the UNIQUE KEY and finds a violation, so instead it does an UPDATE. If we look at the handler status counters, we can see that there was 1 request to insert a row (which failed) and 1 request to update a row, which succeeded (this explains why, when there’s a row-update, that we have 2 rows affected and not 1.

At this point, you might be thinking, “Ok, so what?” Let’s go back to our customer. 1500 INSERT ODKUs per second, sustained for 24 hours per day. The PK on their table is the same as what I’ve used in my demonstration table – INT UNSIGNED. Do the math. The maximum value for an auto-increment INT UNSIGNED is 4294967295. Divide that by 1500 qps and then again by 86400, which is the number of seconds in a day, and we get 33.1 days, or a little over 4 weeks. Coincidence? I think not.

So what actually happens when we run out of auto-increment space? Some of the behavior might surprise you. Let’s go back to our demonstration table and insert a row at the end of the auto-increment range, and then try to insert another one:

Ok, so we tried to insert a row, and it failed because the counter for auto-increment was already at its maximum value, and the statement was rolled back. But what happens if we try an INSERT … ODKU? First, recall what’s in our table:

Looks fine, right? 2 rows affected, so obviously, the row that we wanted, i.e., the one that has the username = ‘foo’, was updated with the proper host_id and the last_modified time, and we can happily go about our day. Unfortunately, this isn’t the case.

Yep, it’s actually THE LAST ROW, the one where the id is equal to the auto-increment max value, which is the one that got updated. The secondary UNIQUE on username is, for all intents and purposes, ignored.

For the customer whose database served as the inspiration for this post, we can see fairly easily what the problem turned out to be. 1500 queries per second all trying to lock and update the same row is not going to end well; deadlocks, rollbacks, contention, and all sorts of related unpleasantness. There is, of course, a trivial solution to this: just change the AUTO-INCREMENT column to use a BIGINT, and problem solved.

As it turns out, this is documented behavior; the manual states that our INSERT … ODKU on a table with multiple unique indexes such as this one would be equivalent to “UPDATE update_test SET host_id=7, last_modified=NOW() WHERE id=4294967295 OR username=’foo’ LIMIT 1”, and of course, a PK lookup is going to be more likely to be chosen by the optimizer than a UNIQUE on the secondary index.

So what do we learn here?

  • It’s easier to burn through AUTO_INCREMENT values than you might think; the original customer table in question had less than 500K rows in it.
  • Using signed types for AUTO_INCREMENT columns is almost never a good idea; it wastes half of the column’s available range.
  • Intuition, like the laws of physics, often breaks down in edge case scenarios.
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Far
Enough.

Said no pioneer ever.
MySQL, PostgreSQL, InnoDB, MariaDB, MongoDB and Kubernetes are trademarks for their respective owners.
© 2026 Percona All Rights Reserved