EmergencyEMERGENCY? Get 24/7 Help Now!

Impact of the sort buffer size in MySQL

 | October 25, 2010 |  Posted In: Benchmarks, MySQL


The parameter sort_buffer_size is one the MySQL parameters that is far from obvious to adjust. It is a per session buffer that is allocated every time it is needed. The problem with the sort buffer comes from the way Linux allocates memory. Monty Taylor (here) have described the underlying issue in detail, but basically above 256kB the behavior changes and becomes slower. After reading a post from Ronald Bradford (here), I decide to verify and benchmark performance while varying the size of the sort_buffer. It is my understanding that the sort_buffer is used when no index are available to help the sorting so I created a MyISAM table with one char column without an index:

and I inserted 100k rows with this simple script:

I know, I could have used the uuid() function of MySQL. For the benchmark, I used an old PII 350 MHz computer, I think for such CPU bound benchmarks, an old computer is better, if there are small differences, they’ll be easier to observe. I varied the sort_buffer_size by steps of 32 KB and I recorded the time required to perform 12 queries like ‘select * from sorttest order by data limit 78000,1’ with, of course, the query cache disabled. I also verified that during the whole process, the computer never swapped and I pre-warmed the file cache before the benchmark by doing “alter table sorttest engine=myisam;”. The script used for the benchmark is the following:

which in addition to output the total time, output the number of Sort_merge_passes, which will be useful to interpret the results. The figure below shows a graphical representation of the results.

The first we can notice by looking at the graph is that the expected correspondence between the time for the queries and the number of sort merge passes. For the small values of the sort buffer size, below 440KB, there are many sort merge passes and the time for the queries hover around 18s. Above 440K, as the sort merge passes drops to 1, there is a large drop of the time for the queries below 14s. Then, as the sort buffer size is further risen, the performance gain is negative up to the point, around 6.4MB where no more sort merge pass are required and then, the time for the queries loses all dependency over the sort buffer size. I am still trying to figure out why the number of sort merge passes felt to zero at 6.4MB since the total size of the table is less than 3MB. That seems to be a pretty high overhead per row (~37 bytes) but if the structure has a few pointers, that can add up to such amount of bytes pretty quickly.

The important point here, at least for the Linux, glibc and MySQL versions I used and for the test I did, there doesn’t seem to be an observable negative impact of the glibc memory allocation threshold at 256KB. I’ll try to find ways to repeat this little experiment with the other per session buffers just to confirm the findings.

OS: Ubuntu 10.04 LTS, Linux test1 2.6.32-21-generic-pae #32-Ubuntu SMP Fri Apr 16 09:39:35 UTC 2010 i686 GNU/Linux
MySQL: 5.1.41-3ubuntu12.6-log

P.S.: Gnumerics for graphs is so much better than OpenOffice Calc.

Yves Trudeau

Yves is a Principal Consultant at Percona, specializing in distributed technologies such as MySQL Cluster, Pacemaker and XtraDB cluster. He was previously a senior consultant for MySQL and Sun Microsystems. He holds a Ph.D. in Experimental Physics.


  • I think it would be interesting to run a similar test on data that is at different levels of presorted nature. i.e.

    100% unsorted – (data is stored in reverse order)
    75% unsorted
    50% unsorted
    25% unsorted
    0% unsorted – (everything is already in the right order)

    A CPU should be able to sort millions of rows/second if they are more or less sorted. My understanding of most data is that it arrives in approximately sorted order (order by time, order by check number).

  • Yves,

    Thanks. I wanted to do detailed test for long time. I looked at it years ago and had similar result:

    My guess though is it is CPU cache related not memory allocation as if we’re sorting 100000 rows mmap() memory allocation should be taking relatively small portion. It is very interesting to compare the graph to one on modern CPU and see if there is any difference.

    I’d also really like to see 2 more things – whenever sorting by short columns (say INT) is same and whenever the sorting with storing the data (instead of data pointer) is same.

  • I think a more common scenario I see is something like this (which I saw last week): sort_buffer_size = 128M, queries are all sorting 10 to 20 rows. What is the performance difference then, between 128M sort_buffer_size and, say, 128k? And what happens when there is high concurrency, so all those malloc() calls are happening at the same time?

    I like gnuplot even better than gnumeric, but gnumeric is definitely better than openoffice 😉

  • I will appreciate to see the benchmark test proposed by Baron. I have some situations in which one I used to tune sort_buffer_size with 128M value to achieve better values on its status variables. 😉

  • Question – what factors (i.e. sort_buffer_size) affect the performance of Repair By Sorting when using alter table for large inserts?

    Related question, assuming large tables, how would you determine the trade-off between using alter tables enable/disable keys vs. just taking the hit for slower inserts? e.g. If you are inserting 1M rows into a table with 300M rows, would you disable keys or not?

    In my testing it was originally faster to alter table, but now it seems to have fallen off a cliff as the repair table is taking very long, and I´m not sure what variable to change to fix that (I know it’s probably memory of some kind, but which?).

    Thanks for any response.

  • regarding: ” I am still trying to figure out why the number of sort merge passes felt to zero at 6.4MB since the total size of the table is less than 3MB. ”
    At https://dev.mysql.com/doc/refman/5.0/en/temporary-files.html it states

    (length of what is sorted + sizeof(row pointer)) * number of matched rows * 2


Leave a Reply