
Some little-known facts about the Innodb Insert Buffer

 | January 13, 2009 |  Posted In: Insight for DBAs


Despite being a standard Innodb feature forever, the insert buffer remains something of a mystery to a lot of people, so let me try to explain it a little bit.

Innodb uses the insert buffer to “cheat”: rather than updating index leaf pages at once, it “buffers” such updates so several updates to the same page can be performed in a single sweep. The insert buffer can only work for non-unique keys, because until the merge is performed it is impossible to check whether a value is unique.
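As an illustration, consider the following hypothetical table (the table and its names are made up for this example): only inserts into the plain secondary index are candidates for buffering, while the primary key and the unique key must be checked against the actual index pages immediately.

```sql
CREATE TABLE users (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  email VARCHAR(255) NOT NULL,
  city VARCHAR(64) NOT NULL,
  PRIMARY KEY (id),          -- never buffered: this is the clustered index
  UNIQUE KEY email (email),  -- not buffered: uniqueness check needs the leaf pages
  KEY idx_city (city)        -- non-unique secondary index: eligible for buffering
) ENGINE=InnoDB;
```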

The insert buffer is allocated in the Innodb system tablespace. Even though it is called a “buffer”, similar to the “doublewrite buffer”, it is really space in the tablespace, though its pages can be cached in the buffer pool the same as other pages. This property allows the insert buffer to survive transaction commits and even MySQL restarts. In fact it may take weeks before a given index page is merged, though usually it happens much sooner than that.

There are two ways an insert buffer merge happens. The first is an on-demand merge: if an accessed page contains unmerged records in the insert buffer, the merge is performed before the page is made available. This means the insert buffer can slow down read operations.

The other way the insert buffer is merged is by a background thread. Very few merges happen while the system is loaded, and the merge process becomes more active when the system is idle. This behavior can cause interesting results: the system may be lightly used with very little IO activity, but when you remove the load completely you see high IO load which goes on for hours, even after all dirty buffer pool pages have been flushed. This can be very surprising.

Stats about the Innodb insert buffer are available in the SHOW INNODB STATUS output:
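The status snippet originally quoted here is missing from this copy of the post. The following is a mock-up of the relevant INSERT BUFFER section of SHOW INNODB STATUS, with illustrative values chosen to be consistent with the figures discussed below; it is not output from a real server:

```
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 9733, free list len 1758, seg size 11492,
7273251 inserts, 6738912 merged recs, 2246304 merges
```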

The “seg size” is the full allocated size of the segment in pages, so in this case it is about 180MB.
The “free list” is the number of pages which are free, containing no unmerged records. The “size” is the size (in pages) of the insert buffer portion which is not yet merged.

The fact that size is reported in pages is not really helpful, because depending on row size there can be a different number of rows per page in the insert buffer – and it is rows we see in performance stats, for example when estimating when an insert buffer merge will be completed.

The “inserts” value is the number of inserts into the insert buffer since start, and the number of merged records is the number of records which were merged into their appropriate page locations since start. So we know in this case the insert buffer has grown by 534339 records since start. There is a temptation to use this number as the count of unmerged rows in the insert buffer, but this would not be correct – the insert buffer may not have been empty at the start. So you can only tell the insert buffer has at least this number of records. For the same reason, do not get scared if you see more merged records than inserted.

The value of 2246304 merges shows us that about 3 records were merged per merge operation, meaning the insert buffer could in theory reduce the IO needed to update leaf pages by a factor of 3.

As I mentioned, an insert buffer merge can take quite a while – at 100 records merged per second we’re looking at at least 5343 seconds, or about 1.5 hours, on this server… and there are insert buffers which are 10x and 100x larger than this.
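To make the arithmetic explicit, here is a small sketch of the calculations above. The counter values are illustrative placeholders consistent with the figures quoted in the post, not real server output; the 100 records/second merge rate is the example figure from the text.

```python
# Back-of-the-envelope math on the insert buffer counters from
# SHOW INNODB STATUS. All values below are illustrative placeholders.
inserts = 7273251      # records inserted into the insert buffer since start
merged_recs = 6738912  # records merged back into their index pages since start
merges = 2246304       # merge operations since start

# Growth since start: a lower bound on unmerged records, because the
# buffer may not have been empty when the server started.
backlog = inserts - merged_recs

# Records merged per merge operation: rough IO-savings factor for leaf pages.
recs_per_merge = merged_recs / merges

# At ~100 records merged per second, estimated time to drain the backlog.
merge_rate = 100  # records per second
eta_hours = backlog / merge_rate / 3600

print(backlog)              # 534339
print(recs_per_merge)       # 3.0
print(round(eta_hours, 1))  # 1.5
```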

Innodb unfortunately offers no control over the insert buffer, while such control would surely benefit different workloads and hardware configurations. For example, there is a very good question whether the insert buffer really makes sense for SSDs, because saving random IO is not so badly needed on these devices.

Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master’s Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.


  • I wonder how many of the “tricks” that modern databases use to work around poor random access IO patterns on winchester drives are going to end up being counterproductive once solid state becomes the norm.

    Does the extra complexity of something like the double write buffer or the insert buffer really provide a worthwhile benefit if you assume equal access time to any point in the io subsystem?

  • Pat,

    You’re right. The storage industry is changing, and I expect the best performance will be reached with systems which have not been written yet. Storage engines, as well as MySQL, often assume the disk is a rotating disk.

  • Pavel,

    The 100k max writes refers to the cheaper MLC (multi-level cell) flash. SLC (aimed more at the server end) can sustain a higher number of writes.

  • Wagner,

    It is not pages but records which are being inserted. Merged records are those which are in place on the pages where they should be. Unmerged records are ones stored in the insert buffer and still waiting to be merged.

  • Innodb locking problem with unique keys when inserting

    CREATE TABLE male_id (
    entry_id int(10) unsigned NOT NULL AUTO_INCREMENT,
    jy_id int(10) unsigned NOT NULL,
    PRIMARY KEY (entry_id),
    UNIQUE KEY jy_id (jy_id)
    ) ENGINE=InnoDB;

    When 10 processes concurrently insert rows into the above table, only the rows from one process are actually recorded in the table. I tested with autocommit=1 and even with the transaction isolation level set to SERIALIZABLE. It seems Innodb is not doing locking properly. If I remove the primary key (entry_id), the problem remains. But the problem is gone if I change the unique key (jy_id) to a plain index (jy_id).

  • Hi Peter,
    For MyISAM, bulk_insert_buffer_size has a similar effect. So is the following still true for MyISAM: “Insert buffer only can work for non-unique keys because until the merge is performed it is impossible to check if the value is unique.”?


  • I could not find any reference on how MySQL buffers inserts (not the secondary index ones). Do inserts get buffered in the buffer pool like updates and synced later with the ibdata file? For updates this buffering improves performance and reduces IO. Are inserts also buffered the same way? If not, then what is the reason? Thanks.


  • We have a “work horse” data server performing massive amounts of data purification (string matching with huge secondary indexes). The server is designed to clean and parse close on a TB of data using fuzzy matching.

    I find that with a very large insert buffer I am unable to perform a consistent backup with innobackupex.
    The apply-log step always fails.
    It appears that the xtrabackup_log grows too big (it can be multi-GB). Even just applying the log drives the restore server’s resource usage very high (close to 100%).
    Right now I am forced to either use a slave and stop it until writes complete, then back up, or stop heavy writes to the master and wait for the insert buffer to clear.
