October 31, 2014

What exactly is read_rnd_buffer_size

Looking for documentation for read_rnd_buffer_size you would find descriptions such as “The read_rnd_buffer_size is used after a sort, when reading rows in sorted order. If you use many queries with ORDER BY, upping this can improve performance” which is cool but it does not really tell you how exactly read_rnd_buffer_size works as well as which layer it corresponds to – SQL or storage engine.

Honestly as it had name very similar to read_buffer_size which is currently only used by MyISAM tables I thought read_rnd_buffer_size is also MyISAM only. But talking to Monty today I learned it is not the case.

read_rnd_buffer can be used for All storage engines not only by MyISAM. It is used for some sorts to optimally read rows after the sort. Here is how it works:

As sort is performed it can be performed having only row pointers together with key value – which are offsets for MyISAM and primary key values for Innodb or storing full data which is being retrieved (good for small data lengths).

In case sort with row pointer storage is used and the fields which are being length can be converted to fixed size (basically everything but BLOB/TEXT) MySQL can use read_rnd_buffer to optimize data retrieval – As data is sorted by the key value it needs to be accessed in pretty much random row pointer (typically physical) order. MySQL takes bunch of pointers from sort_buffer (just enough so all rows fit in read_rnd_buffer as they are read) and sorts them by row pointer, when performs reading into read_rnd_buffer in the sorted order – it can be pretty much sequential if you’re lucky.

The read_rnd_buffer_size is important (optimization works in following conditions):

  • Row pointers are stored in the sort_buffer, not the whole data selected
  • Blob/Text columns are not selected
  • A lot of rows are retrieved after sort – if you have LIMIT 10 it is unlikely to help as MySQL will stop fetching rows by
    pointers quickly

For me this means since MySQL 4.1 this option is used in narrow range of cases – if you retrieve few fields (less than max_length_for_sort_data) data should be stored in sort buffer and sort file so there would be no need for read_rnd_buffer, if the selected columns are long so they are longer than max_length_for_sort_data it would frequently mean there are some TEXT/BLOB columns among them. It would be used however if there is large number of columns or there are long VARCHAR columns used – it takes only couple of UTF8 VARCHAR(255) to create a row which is longer than max_length_for_sort_data in its static presentation.

We should do benchmarks sometime to see how it really impacts performance both for MyISAM and Innodb.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Tomas says:

    I would love to see/do some tests related to 3variables changing in some predefined way.
    sort_buffer_size
    max_length_for_sort_data
    read_rnd_buffer_size
    it looks to me that by changing those 3 variables one would be able to tune it in way that
    – short data to be sorted will use just sort_buffer_size
    – while longer rows will also use read_rnd_… which might be set higher
    and the way to change when the data/row is short or long is by max_length_for_sort_data

Speak Your Mind

*