The title is SEO bait – you can’t do it. We’ve seen a few recurring patterns trying to achieve similar – and I thought I would share with you my favorite two:
Option #1: Use a table to insert into, and grab the insert_id:
CREATE TABLE option1 (id int not null primary key auto_increment) engine=innodb;
# each insert does one operations to get the value:
INSERT INTO option1 VALUES (NULL);
Option #2: Use a table with one just row:
CREATE TABLE option2 (id int not null primary key) engine=innodb;
INSERT INTO option2 VALUES (1); # start from 1
# each insert does two operations to get the value:
UPDATE option2 SET id=@id:=id+1;
So which is better? I don’t think it’s that easy to tell at a first glance, since option 2 does look more elegant – but if the next value is fetched as part of a transaction – I can see a potential for many other transactions to back up waiting on a lock (more on that in a second).
To start with a naive test, I booted two EC2 small instances in the same availability zone. Ping times are ~0.5ms between nodes. –skip-name resolve is enabled on the server. There is some skew from the machine being virtualized. My simulation is:
- Create a new sequence value
- insert into another table
The testing options are:
- Option 1 – auto_commit each statement
- Option 2 – auto_commit each statement
- Option 1 – nest each loop in a transaction
- Option 2 – nest each loop in a transaction
The raw results are:
option1 usetransactions =19 seconds for x10000 iterations.
option1 ignoretransactions = 13 seconds for x10000 iterations.
option2 usetransactions = 27 seconds for x10000 iterations.
option2 ignoretransactions =22 seconds for x10000 iterations.
Alright – option1 seems quicker. The problem is that to be like most applications, we can’t really tell until a little concurrency is applied. Using only the “transactional” test in a few more concurrency options:
Option #1 starts to flatten out after 4 threads – and this is probably just because the machine I am running it against has 1 CPU. Option #2 stays flat the whole time.. and while we are running it, most of the threads are perpetually in a state of ‘Searching rows for update’ – which is what I suspect is better described as waiting on a lock.
Option #2 will likely scale better in auto_commit, since locks are held for the duration of a transaction, but this is not always possible to do if you have already started modifying data before you need an auto_increment number but you do not want to commit yet.