How much overhead is caused by on disk temporary tables

August 16, 2007
Author
Peter Zaitsev
Share this Post:

As you might know while running GROUP BY and some other kinds of queries MySQL needs to create temporary tables, which can be created in memory, using MEMORY storage engine or can be created on disk as MYISAM tables. Which one will be used depends on the allowed tmp_table_size and also by the data which needs to be put to temporary tables – BLOB/TEXT columns are not supported with MEMORY storage engine so must use on disk MyISAM temporary table.

What i decided to do is to see how much overhead do on disk temporary tables cause compared to MEMORY tables. To have things comparable I used medium size table and types which can be presented both in MEMORY and MyISAM storage engine.

The benchmarks are done with MySQL 5.0.45 on 32bit Fedora Core 6 on Pentium 4 box – something I had available for testing.

The table contains 1.000.000 rows with mostly unique “c” column.

As you can see I’m using ORDER BY NULL clause as otherwise MySQL will sort the data after performing group by which is overhead we’re not looking to measure. This is actually good trick to use for cases when you do not need GROUP BY results sorted – it can significantly improve performance in some cases.

First I run the query with default key_buffer_size which is not large enough to fit all key blocks from temporary table index, so we get a lot of key writes which kill performance.

The problem with writes is OS can delay writes only for fixed amount of time and when it has to perform them anyway which can cause IO bound load even with small data which can fully fit in OS cache as in this case. In perfect world it should not happen and we should have seen 50MB sequentially written once per certain amount of time which takes fraction of second to do, but it is not how it happens in practice.

So how long does it take:

I repeated the run a few times with very similar results.

OK so what if we set key buffer to 128M ?

As you can see we have no writes because data fits in key buffer. Note however there is still significant amount of reads (the stats were flushed between query runs) which does not make sense as there no data to be read from the disk as nothing was written. So I filled bug on this.

So what was performance

Wow this is over 15 times better so you surely need to make sure your key_buffer_size is set to accommodate needs of your temporary table. Do not forget about it especially if running only Innodb tables so you may think there is no use for it.

Looking at this number I found it to be too good to be true as I’ve seen significantly worse performance for large data sets. So I decided to see what happens if I create a larger table – 4M rows. I sized key_buffer to 256M so it would fit everything needed and there was still enough memory for OS cache to keep temporary table. Results:

VMSTAT:

In this case as you can see there is a lot of dirty blocks flushing going on with pretty random IO. Clearly OS write caching is not effective for relatively large area (which still fits in memory) which is getting a lot of random writes.

OK. Now lets move to testing MEMORY tables for the same queries:

Ahh… I alway forget GLOBAL only sets global value but it does not affect current session. A little gotcha which still catches me.

For 1M rows we have

Wow. This is great difference even from our best MyISAM results.

For 4M of rows

So MEMORY table scales pretty well with query execution time being close to linear dependence of table size, and it can get 100 times faster than MyISAM on disk table even in case temporary table is small enough to fit in OS cache and key_buffer.

My next intension was to test placing table on tmpfs as this should avoid write overhead we’ve observing (and which is indeed very good production practice, if your temporary tables are moderately sized).

Unfortunately I ran into another bug which means I will have to postpone this part of test to another time.

Summary:
If this was too many details for you to read through here is the summary:

 

    • key_buffer_size is important for MyISAM temporary tables performance to avoid OS writes

 

    • OS Write cache is not as efficient as I would hope, at least on Linux

 

    • Performance of Disk MyISAM tables may not scale proportionally with table size, even when all data fits in memory

 

    • MEMORY temporary tables can be 10-100 times faster than disk based MyISAM tables

 

Hopefully MySQL will implement support of dynamic rows for MEMORY tables some time in the future and so we will be able to use MEMORY table for all cases when data set fits in memory as it is really worth it.

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Far
Enough.

Said no pioneer ever.
MySQL, PostgreSQL, InnoDB, MariaDB, MongoDB and Kubernetes are trademarks for their respective owners.
© 2026 Percona All Rights Reserved