October 22, 2014

Some little known facts about Innodb Insert Buffer

Despite being standard Innodb feature forever Insert Buffers remains some kind of mysterious thing for a lot of people, so let me try to explain thing a little bit.

Innodb uses insert buffer to “cheat” and not to update index leaf pages when at once but “buffer” such updates so several updates to the same page can be performed with single sweep. Insert buffer only can work for non-unique keys because until the merge is performed it is impossible to check if the value is unique.

Insert buffer is allocated in the Innodb system table space. Even though it is called “buffer” similar to “doublewrite buffer” it is really the space in the tablepace. Though it can be cached in the buffer pool same as other pages. This property allows insert buffer to survive transaction commits and even MySQL restarts. Really it may take weeks before the given index page is merged, though usually it is much sooner than that.

There are two ways of insert buffer merge is happening. First is on demand merge – if accessed page contains unmerged records in insert buffer the merge is performed before page is made available. This means insert buffer can slow down read operations.

The other way insert buffer is merged is by background thread. There are very little merges happening if system is loaded and merge process becomes more active if system is idle. This behavior can cause interesting results, like you had system lightly used and have very little IO activity, but when you remove the load from the system completely you see high IO load which goes for hours even after all buffer pool dirty pages are completed. This can be very surprising.

Stats about Innodb Insert Merge Buffer are available in SHOW INNODB STATUS output:

The “seg size” is a full allocated size of segment in pages. So in this case it is about 180MB
The “free list” is number of pages which are free – containing no unmerged records. The “size” is size (in pages) of insert buffer which is not merged.

The fact size is in pages is not really helpful because depending on the row size there can be different number of rows in the insert buffer – and it is rows we see in performance stats, for example to understand when insert buffer merge will be completed.

The “inserts” is number of inserts to insert buffer since start and number of merged records is number of records which were merged to their appropriate page locations since start. So we know in this case insert buffer has grown 534339 records since start. There is a temptation to use this number as count of unmerged rows in insert buffer but this would not be correct – insert buffer may not be empty at the start. So you can only tell insert buffer has at least this number of records. For the same reason do not get scared if you see more merged records than inserted.

The value of 2246304 merges shows us there was about 3 records merged for each merge operation, meaning insert buffer could in theory reduce IO needed to update leaf pages 3 times.

As I mentioned Insert buffer merge can take quite a while – with 100 records merged per second we’re looking at least 5343 seconds or 1.5 hours on this server… and there are insert buffers which are 10x and 100x larger than this.

Innodb unfortunately offers no control for insert buffer while it surely would be benefiting for different workloads and hardware configuration. For example there is very good question if insert buffer really makes sense for SSD because saving random IO is not so much needed for these devices.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. pat says:

    I wonder how many of the “tricks” that modern databases use to work around poor random access IO patterns on winchester drives are going to end up being counterproductive once solid state becomes the norm.

    Does the extra complexity of something like the double write buffer or the insert buffer really provide a worthwhile benefit if you assume equal access time to any point in the io subsystem?

  2. peter says:

    Pat,

    You’re right. The storage industry is changed and I expect the best performance will be reached with systems which have not been written yet. Storage engines as well as MySQL often has assumptions about disk being rotating disk.

  3. You didn’t mention that the Percona patches provide control over the insert buffer.

  4. peter says:

    Baron,

    I planned another post about it. As I remember this patch is still in development, is not it ?

  5. Pavel says:

    Pat, will solid state disks survive database workload with their 100k possible writes only?

  6. Morgan Tocker says:

    Pavel,

    The 100k max writes refers to the cheaper MLC (Multi layer) Flash. SLC (aimed more at the server end) can sustain a higher number of writes.

  7. Solid state drives are not the big revolution. The big revolution is nano-wire memory: 1000 faster than Flash memory, terabyte capacity, and lower power consumption.

    This will render databases obsolete, along with silly tricks involving hard disk rotation cycles.

  8. Wagner Bianchi says:

    Can you give me an explaination about merged and unmerged pages?

  9. peter says:

    Wagner,

    It is not pages but records which are being inserted. Merged records are those which are in place on the pages where should be. Unmerged records are ones stored in insert buffer and so still to be merged.

  10. Wagner Bianchi says:

    OK, Peter…thank you…I was missunderstood.

  11. mz says:

    Innodb locking problem with unique keys when inserting

    CREATE TABLE male_id (
    entry_id int(10) unsigned NOT NULL AUTO_INCREMENT,
    jy_id int(10) unsigned NOT NULL,
    PRIMARY KEY (entry_id),
    UNIQUE KEY jy_id (jy_id)
    ) ENGINE=InnoDB AUTO_INCREMENT=1

    When 10 processes concurrently inserting rows to the above table, only the rows from one process are actually recorded in the table. I tested with autocommit=1 and even when transaction_isolation_level=seriazable. It seems Innodb is not doing locking properly. If I remove the primary key (entry_id), the problem remains. But the problem is gone if I change the unique key (jy_id) to be index (jy_id).

  12. justin.ren says:

    There are two ways of insert buffer merge is happening
    how about the third way: slow shutdown

  13. mark wang says:

    Hi Peter,
    For MyISAM, the bulk_insert_buff_size have similar effect. So it is still true for MyISAM for the following “Insert buffer only can work for non-unique keys because until the merge is performed it is impossible to check if the value is unique.”

    Regards
    Mark

Speak Your Mind

*