September 23, 2014

Edge-case behavior of INSERT…ODKU

A few weeks back, I was working on a customer issue wherein they were observing database performance that dropped through the floor (to the point of an outage) roughly every 4 weeks. Nothing special about the environment, the hardware, or the queries; really, the majority of the database was a single table with an auto-incrementing integer PK and a secondary UNIQUE KEY.

The queries being run against this table were almost exclusively INSERT … ON DUPLICATE KEY UPDATE (INSERT ODKU), with the columns from the INSERT part of the statement corresponding to the columns in the secondary index, and they were coming in at a rate of approximately 1500 to 2000 per second, sustained, 24h per day. The mathematically astute among you may already be able to see where this is going.

For purposes of discussion, we can use the following table to illustrate the situation:
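
(Something along these lines; the exact types of the non-key columns don't matter much, as long as id is an unsigned INT with AUTO_INCREMENT and there's a secondary UNIQUE KEY on username.)

CREATE TABLE update_test (
  id            INT UNSIGNED NOT NULL AUTO_INCREMENT,
  username      VARCHAR(20) NOT NULL,
  host_id       INT UNSIGNED NOT NULL,
  last_modified TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
                ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (id),
  UNIQUE KEY (username)
) ENGINE=InnoDB;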

Now consider the following sequence of events.
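
We start by putting a single row into the empty table (the values themselves are arbitrary):

mysql> INSERT INTO update_test (username, host_id) VALUES ('foo', 1);
Query OK, 1 row affected (0.00 sec)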

Nothing crazy here, right? We’ve inserted one row into an empty table, and if we were to do a SHOW CREATE TABLE we’d see that the AUTO_INCREMENT counter is currently set to 2. If we do an INSERT … ODKU on this table, we see the following:
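
(The new host_id value here is arbitrary; the interesting parts are the row count and the AUTO_INCREMENT value afterward.)

mysql> INSERT INTO update_test (username, host_id) VALUES ('foo', 2)
    -> ON DUPLICATE KEY UPDATE host_id = 2, last_modified = NOW();
Query OK, 2 rows affected (0.00 sec)

mysql> SHOW CREATE TABLE update_test\G
       [...]
       ) ENGINE=InnoDB AUTO_INCREMENT=3 ...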

And now, even though we didn't insert a new row, our auto-increment counter has increased to 3. This is actually expected behavior; InnoDB checks constraints in the order in which they were defined, and the PRIMARY KEY is always going to be considered as being defined first. So, MySQL checks our INSERT, sees that the next auto-inc value is available, and claims it, but then it checks the UNIQUE KEY and finds a violation, so instead it does an UPDATE. If we look at the handler status counters, we can see that there was 1 request to insert a row (which failed) and 1 request to update a row, which succeeded (this explains why, when there's a row update, we see 2 rows affected and not 1).
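
For example, if we reset the session counters with FLUSH STATUS, repeat the INSERT … ODKU above, and then look at the relevant handler counters (the unrelated ones are trimmed here):

mysql> SHOW SESSION STATUS LIKE 'Handler_%';
+----------------+-------+
| Variable_name  | Value |
+----------------+-------+
| Handler_update | 1     |
| Handler_write  | 1     |
+----------------+-------+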

At this point, you might be thinking, “Ok, so what?” Let’s go back to our customer. 1500 INSERT ODKUs per second, sustained for 24 hours per day. The PK on their table is the same as what I’ve used in my demonstration table – INT UNSIGNED. Do the math. The maximum value for an auto-increment INT UNSIGNED is 4294967295. Divide that by 1500 qps and then again by 86400, which is the number of seconds in a day, and we get 33.1 days, or a little over 4 weeks. Coincidence? I think not.
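
Or, letting MySQL do the arithmetic for us:

mysql> SELECT 4294967295 / (1500 * 86400) AS days_to_exhaustion;
+--------------------+
| days_to_exhaustion |
+--------------------+
|            33.1402 |
+--------------------+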

So what actually happens when we run out of auto-increment space? Some of the behavior might surprise you. Let’s go back to our demonstration table and insert a row at the end of the auto-increment range, and then try to insert another one:
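
(The usernames here are arbitrary.)

mysql> INSERT INTO update_test (id, username, host_id) VALUES (4294967295, 'bar', 1);
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO update_test (username, host_id) VALUES ('baz', 1);
ERROR 1062 (23000): Duplicate entry '4294967295' for key 'PRIMARY'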

Ok, so we tried to insert a row, and it failed because the counter for auto-increment was already at its maximum value, and the statement was rolled back. But what happens if we try an INSERT … ODKU? First, recall what’s in our table:
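
(The timestamps and host_id values shown are illustrative.)

mysql> SELECT * FROM update_test;
+------------+----------+---------+---------------------+
| id         | username | host_id | last_modified       |
+------------+----------+---------+---------------------+
|          1 | foo      |       2 | 2014-09-23 10:00:00 |
| 4294967295 | bar      |       1 | 2014-09-23 10:01:00 |
+------------+----------+---------+---------------------+
2 rows in set (0.00 sec)

Now the INSERT … ODKU itself, setting host_id to 7 so the change is easy to spot:

mysql> INSERT INTO update_test (username, host_id) VALUES ('foo', 7)
    -> ON DUPLICATE KEY UPDATE host_id = 7, last_modified = NOW();
Query OK, 2 rows affected (0.00 sec)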

Looks fine, right? 2 rows affected, so obviously, the row that we wanted, i.e., the one that has the username = ‘foo’, was updated with the proper host_id and the last_modified time, and we can happily go about our day. Unfortunately, this isn’t the case.
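
Look again (same illustrative data as above):

mysql> SELECT * FROM update_test;
+------------+----------+---------+---------------------+
| id         | username | host_id | last_modified       |
+------------+----------+---------+---------------------+
|          1 | foo      |       2 | 2014-09-23 10:00:00 |
| 4294967295 | bar      |       7 | 2014-09-23 10:02:00 |
+------------+----------+---------+---------------------+
2 rows in set (0.00 sec)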

Yep, it’s actually THE LAST ROW, the one where the id is equal to the auto-increment max value, which is the one that got updated. The secondary UNIQUE on username is, for all intents and purposes, ignored.

For the customer whose database served as the inspiration for this post, we can see fairly easily what the problem turned out to be. 1500 queries per second all trying to lock and update the same row is not going to end well; deadlocks, rollbacks, contention, and all sorts of related unpleasantness. There is, of course, a trivial solution to this: just change the AUTO_INCREMENT column to use a BIGINT, and problem solved.
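
On our demonstration table, that would be something like the following (keeping in mind that changing the column type rebuilds the table, which is not a quick operation on a large one):

mysql> ALTER TABLE update_test MODIFY id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT;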

As it turns out, this is documented behavior; the manual states that our INSERT … ODKU on a table with multiple unique indexes such as this one would be equivalent to "UPDATE update_test SET host_id=7, last_modified=NOW() WHERE id=4294967295 OR username='foo' LIMIT 1", and of course, a PK lookup is going to be more likely to be chosen by the optimizer than a lookup via the secondary UNIQUE index.

So what do we learn here?

  • It’s easier to burn through AUTO_INCREMENT values than you might think; the original customer table in question had less than 500K rows in it.
  • Using signed types for AUTO_INCREMENT columns is almost never a good idea; it wastes half of the column’s available range.
  • Intuition, like the laws of physics, often breaks down in edge case scenarios.
About Ernie Souhrada

Ernie joined Percona in April 2012 as a Senior Consultant. In his previous lives, he has been everything from a Perl/Java developer to a Linux sysadmin, a MySQL DBA to a Cisco network engineer, and a security auditor to an IT engineering manager, many of these things all at the same time. When not working on MySQL, he might be found on the ski slope, at a psytrance festival, or at the nearest sushi bar.

Comments

  1. Daniel says:

    When I was still a support DBA at a datacenter I'd seen this all too often. The worst case was when I saw a statistics-tracking process attempting to insert into a relatively massive MyISAM table, which was actively read by the marketing team's management tool.
    Randomly changing data AND massive random lockups; thankfully non-critical data, but still nothing that a little prior planning couldn't have resolved when the database was originally set up.
    I think my default kneejerk anymore is to make all of my id fields BIGINT and nearly all of my tables InnoDB, based almost exclusively on that one event.

  2. Steve Jackson says:

    You should never use InsODKUs for large numbers of inserts. You are gonna leak auto-increment IDs like there's no tomorrow due to MySQL's gap locking.

    The method I use to insert into these types of tables involves batching up a number of rows to be inserted / updated, then selecting the current values out of the table into a local array to establish what needs inserting and what needs updating.

    So.

    Pass array of usernames to a WHERE IN (usernames)

    Put id, username into a local array (with username as the key).

    Loop through the first array and check for the presence of the username in the second array.

    If username exists, add to an “update” array.

    If not, add to an insert array.

    When there are no users left, we then do the insert / update.

    For the insert, implode the insert_array and use a multi-insert, i.e. INSERT INTO users (username, host_id, last_modified) VALUES ('foo', 3, NOW()), ('foo2', 3, NOW()), ('foo3', 3, NOW())

    For the update, loop through the update array and create an "update case statement", e.g.:

    SET last_modified = CASE id
    WHEN 1 THEN NOW()
    WHEN 2 THEN NOW()
    WHEN 3 THEN NOW()
    END,
    host_id = CASE id
    WHEN 1 THEN 8
    WHEN 2 THEN 2
    WHEN 3 THEN 4
    END

    Of course you MUST be careful using NOW() if you are doing async parsing of some intermediate table/log. Write the timestamp to THAT log and use that in the statement instead.

    If you need the IDs of the usernames which you have inserted, then you repeat the first select which got out all the usernames requested from the original array. You can then reference them by the array you have created.

    Also, safest is to use select for update on the user table to avoid race conditions — but then be careful of the batch insert/update sizes, because you may run into contention issues. If however, the update doesn’t depend on a previous row value you can skip locking and just overwrite…
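
    Putting the pieces together, the whole pattern looks roughly like this (table and column names follow the examples above; the client-side array handling is shown as comments):

    -- 1. Lock and read whatever already exists for the incoming batch
    SELECT id, username
      FROM users
     WHERE username IN ('foo', 'foo2', 'foo3')
       FOR UPDATE;

    -- 2. On the client: usernames that came back go into the update array
    --    (keyed on username), the rest go into the insert array.

    -- 3. Multi-insert for the new usernames
    INSERT INTO users (username, host_id, last_modified)
    VALUES ('foo2', 3, NOW()),
           ('foo3', 3, NOW());

    -- 4. One UPDATE with CASE expressions for the existing ids
    UPDATE users
       SET host_id       = CASE id WHEN 1 THEN 8 END,
           last_modified = CASE id WHEN 1 THEN NOW() END
     WHERE id IN (1);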

    I hope I am making some kind of sense here.

    Good luck!

    //Steve

  3. I ran into another incarnation of the same problem. I had one server in a cluster that was running a 32-bit OS and thus 32-bit PHP. I had suitably large INT fields for autoinc in MySQL, but poor little PHP was putting them into signed 32-bit INTs; so after autoinc IDs rolled past 2^31, that server started doing some very odd things…

    Another good way of burning additional autoinc IDs is if you’re using master-master replication with auto_increment_increment > 1.
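
    For example (set at the session level here just for illustration; in a real master-master setup these would be server-wide settings):

    SET SESSION auto_increment_increment = 2;  -- every generated value skips one
    SET SESSION auto_increment_offset    = 1;  -- this server hands out 1, 3, 5, ...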

  4. Steve – Have you ever tried benchmarking your approach versus just going with a traditional INSERT ODKU? I'd be curious as to whether or not the extra overhead required to process the data offsets any gains achieved by being able to do batched operations. I'm also curious about your "update case statement" … seems to me that past a certain point, the amount of resources devoted to the SQL parser would noticeably impede performance; we know from HandlerSocket that being able to bypass the parser completely is a good way to improve it.

    Marcus – Signed vs. unsigned is always good for some amusement. Here’s another one of my favorites. For a given language/platform combination, sometimes INET_ATON() [or its equivalent] will happily return a negative number. *Most* platforms have no issues converting these negative integers back into dotted-quad IP addresses with their version of INET_NTOA(). Except…

    php > echo long2ip(-1062706175) , "\n";
    192.168.100.1

    (root@localhost) [(none)]> select INET_NTOA(-1062706175);
    +------------------------+
    | INET_NTOA(-1062706175) |
    +------------------------+
    | NULL                   |
    +------------------------+
    1 row in set (0.00 sec)

  5. Ernie, I haven't benchmarked it, but when I get time I will. I am basing this on recent work I have been doing importing large amounts of data from a legacy schema to a completely re-structured one on a project I am working on. InsODKU just didn't cut it. It was slow as a dog (even when batching in transactions), and was leaking auto-increment keys when hitting duplicates in the unique index. I was forced to find another method, and my batch insert / update method was it.

    Now I don't remember exactly what the figures were, but I think it was managing about 400 inserts per second with InsODKU. With my method, it was managing around 10,000 – 15,000 a second, I believe.

    And yes, you are correct when you say that past a certain point, the SQL parser may have trouble – but this is all a question of how many rows to insert / update at once. Furthermore, this does not solve the issue of the auto-increment leak. You will leak auto-increments if you have duplicates; that's just the way it is. The only way to avoid it is to select out first and determine whether it will be an insert or an update (and you must do those operations in bulk anyway to offset the overhead incurred).

    I know people will think I am crazy… pulling things down to the client… but believe me, it really does work.

    It also highlights what would be a VERY good feature in MySQL.

    InsODKU would be much better implemented on the server side in the same fashion I have done in my client. The structures (indexes) are already in place to do it… there just needs to be a test beforehand to see what can be inserted and what can be updated, and gap-lock the appropriate number of auto-increments.

    It would also mean that last_insert_id should return an array of inserted ids, instead of only the last one, like it does now.
    That array should give back the unique key => primary key mapping.

    Comments anyone? I think this could be an interesting discussion.

  6. Steve — I’m not surprised that straight inserts are faster; if you think about the operations involved – an insert of a new row which doesn’t violate any other unique keys is just two operations – a single handler_write and a single handler_commit. However, an insertODKU which violates a secondary unique key is at least 4 – a handler_read_key, a write, a commit, and an update. I am a little surprised that you’re showing a full 2 orders of magnitude difference, but on large tables I’m sure it’s possible, because you’re completely removing any need to examine secondary indexes.

    FWIW, I don’t think you’re crazy pulling things down to the client. It’s much easier to scale a [perl|php|ruby|bash] script and run it on N different machines than it is to scale a database server. That said, I think I am a lot less concerned about auto-increment ID leakage than you seem to be. I can certainly see cases where it can make life easier to have tightly-packed row IDs, but I’m not aware of any *performance* issues directly related to auto-increment density. That would be an interesting test to try out, though. [**adds item #1054372 to my to-do list]

    Not sure how I feel about your alternative implementation of I-ODKU. Have to think about that one…

  7. Steve Jackson says:

    Ernie, I have seen many a blog post on Percona's site about customer cases where they have run out of auto-increment IDs. When asked why they didn't just use a much bigger int, I am sure their reply would naturally be "we didn't want a huge primary key, especially with the number of secondary indexes we have". This is certainly the motivation behind my concern when it comes to not leaking auto-increments. I know there is no advantage to the density of them, but I want to use small, economic PK columns and not run out.
