INSERT ON DUPLICATE KEY UPDATE and REPLACE INTOPeter Zaitsev
Jonathan Haddad writes about REPLACE INTO and INSERT ON DUPLICATE KEY UPDATE. Really, Why MySQL has both of these, especially both are non ANSI SQL extensions ?
The story here seems to be the following – REPLACE INTO existed forever, at least since MySQL 3.22 and was a way to do replace faster and what is also important atomically, remember at that time MySQL only had ISAM tables with table locks and no transactions support. Of course you could use LOCK TABLES but it is not efficient.
The reason REPLACE could be efficient for ISAM and MyISAM, especially for fixed length rows is – it could perform row replacement without reading old data first, and of course because you could set it to replace multiple values at the same time just as you have multiple value INSERT.
As a side note: the fact REPLACE does not have to do read before write is a bit overrated from efficiency standpoint. As most rows are less than 4K-8K in size and are not aligned to OS cache page OS still would need to perform read from hard drive before it can perform an update, if data is not in OS cache, and if it is read would not be large overhead ether.
The problem with REPLACE was – many people tried to use it like update accessing previous column value, for example doing something like REPLACE INTO tbl (col1) VALUES (col1+10);. Instead of acting as update this really will insert NULL value, as col1 is undefined at this stage.
INSERT ON DUPLICATE KEY UPDATE is newer feature which came in MySQL 4.1 by advice one of MySQL big users. The question was efficiently maintaining counters in MySQL. There are also number of similar cases when you want ether to insert the new row or update stats for existing row.
I think both features are really great and I use them both. They are implemented really in MySQL style of being simple powerful and easy to use.