October 21, 2014

Filling the tmp partition with persistent connections

The use of tmpfs/ramfs as /tmp partition is a common trick to improve the performance of on-disk temporary tables. Servers usually have less RAM than disk space so those kind of partitions are very limited in size and there are some cases were we can run out of space.

Let’s see one example. We’re running a typical master-slave replication with ROW based binary log file. The application connects to the database using persistent connection technique to avoid creating new connections for every request so it just recycle old connections.

The application starts to load data into the database with “LOAD DATA INFILE”. After some minutes the server runs out of disk space on /tmp/partition. The first step is clear, check which files are on that partition:

Nothing. No files on the partition but it is not really empty:

Lets see which files has MySQL opened:

There are lot of files in deleted status that are filling the partition. The fourth column is the size of those files.

What are those ML* files?

When you run a big UPDATE, INSERT or LOAD DATA INFILE MySQL send the rows to the Binary Log Cache before writing it to the Binary Log Files. If that cache is not big enough to store the temporary data it will create a on-disk cache on the tmp partition with the name ML*.

Here is the problem. Until the connection is closed those files won't disappear, even after committing the transaction and with the rows on the binary log. There are different solution to this problem:

  • Increase the size of the /tmp partition.
  • Increase the size of the binary log cache. binlog_cache_size and binlog_stmt_cache_size. Before 5.5.9 binlog_cache_size was used for both transactional and no transactional statements. After 5.5.9 the cache was splitted in two, binlog_cache_size to hold transactions data and binlog_stmt_cache_size to hold non-transactional statements.
  • Change the session value of binlog_format to STATEMENT just before the execution of those large queries.

It's also worth to mention that there are two methods to mount tmp on RAM, tmpfs and ramfs.

  • Tmpfs has a maximum size, so it returns an error if you fill it too much, and also can go to swap, which would defeat the performance improvement as it uses disk I/O anyway.
  • Ramfs will not swap, and also ramfs will grow dynamically so that it won't return an error that it has run out of space. This can be a bad practice because if you LOAD DATA for a file bigger than available memory, you could accidentally overcommit the physical memory, most likely killing system in the end.

Conclusion

MySQL can creates temporary files even thought those temporary files are not visible on the EXPLAIN or SHOW STATUS output. Therefore, the best solution is to monitor your system.

If your tmp partition is limited you should check the free space on it with a monitoring system like Nagios. Trending graphs from Cacti are also a very good option to know how and when the size of your temporary tables increases. We have our own plugins for both system that can help you avoid these situations.

I've opened a bug report to follow this issue:

http://bugs.mysql.com/bug.php?id=66237

About Miguel Angel Nieto

Miguel joined Percona in October 2011. He has worked as a System Administrator for a Free Software consultant and in the supporting area of the biggest hosting company in Spain. His current focus is improving MySQL and helping the community of Free Software to grow. Miguel's roles inside Percona are Senior Support Engineer and Manager of EMEA Support Team.

Comments

  1. Justin Swanhart says:

    On 5.5.9+ you might have to increase binlog_cache_size if you are using InnoDB, or binlog_cache_stmt_size if you are using MyISAM.

    binlog_cache_size is used for transactional tables, binlog_cache_stmt_size for non-transactional tables

  2. It seems like good idea if we can put temp directory on memory or swap memory for highly loaded MySQL server,
    where running stupid-coded scripts.

    what I am interested in is what is the possibilities to limit the disk space used, or redirect the “disk space” to another partition – let say SSD or striped RAID0

  3. Akshay Suryavanshi says:

    Hi Miguel,

    Thanks for the post. I would like to know methods to find the size of temporary table created while, doing an Alter table on InnoDB table using “expand_fast_index_creation” option turned ON.

    The tmpdir, utilizes the space however it doesnt show up the files consuming the space.

    My understanding is, it uses this space for sorting the clustered index (during Restoring keys phase) and then starts applying the newly sorted secondary indexes.

    Do we have a plugin or method to find the tmp tables created and to find out their sizes.

    Akshay

  4. @Akshay

    Temporary tables created by an alter table are created in your data directory not in the /tmp folder. You can check the opened temporary tables using the INFORMATION SCHEMA:

    INFORMATION_SCHEMA.GLOBAL_TEMPORARY_TABLES
    INFORMATION_SCHEMA.TEMPORARY_TABLES

    http://www.percona.com/doc/percona-server/5.5/diagnostics/misc_info_schema_tables.html

  5. Andy says:

    How does using connection pooling instead of persistent connections solve the problem? With connection pooling there is still a pool of connections that are never closed, so the problem would still exist, no?

  6. @Andy

    If your connection pool doesn’t have any option to enable a periodical recycle of the connections, then you can also have this problem.

  7. @Nikolay

    tmpfs can be limited by size. I don’t know how to redirect the disk space :)

  8. Andy says:

    Miguel,

    What connection pool software would you recommend?

  9. Andy,

    Sorry, I’m not experienced enough on connection pooling to answer that question. Maybe someone else who visits this blog can give you a good advice.

  10. limit of ramfs / memfs is not implemented very good, however probably is good idea to make a file placed on memfs and to mount it like filesystem.

  11. So, I did some checking on this (for another issue).

    The prefix ML is reserved for binlog files, MY for temporary files and ib* (like /tmp/ib38kFa7) for InnoDB’s files. sql/unireg.h has details on this.

    The ‘(deleted)’ is not a bug per se, it is done intentionally so that there is no need to cleanup files if there is a crash etc.

    However, what can be construed as a bug is that these files are not closed till thread disconnects or killed since the cache is per THD and is destroyed only when destructor is called for it.

    It would be better if it is called after every statement rather than after thread exits.

  12. For anyone interested here is the backtrace of it http://sprunge.us/ahdX

    Following patch makes it close the file after every transaction commit (depending on autocommit it can be per statement or without)

    =================
    diff -u sql/log.cc /tmp/log.cc
    — sql/log.cc 2012-08-16 02:33:54.572978622 +0530
    +++ /tmp/log.cc 2012-08-16 02:33:15.489940519 +0530
    @@ -202,7 +202,6 @@
    ~binlog_cache_data()
    {
    DBUG_ASSERT(empty());
    – close_cached_file(&cache_log);
    }

    bool empty() const
    @@ -257,6 +256,7 @@
    cache_log.disk_writes= 0;
    using_xa= FALSE;
    commit_bin_log_file_pos= 0;
    + close_cached_file(&cache_log);
    DBUG_ASSERT(empty());
    }

  13. The previous patch won’t work because it is bound to binlog_cache_mngr. But since it causes performance loses either way, it is not worth it.

  14. Cédric says:

    Hi, I wrote a post about temporary file behavior and I think that can help to merge our two posts : http://www.mysqlplus.net/2012/05/07/temporary-file-behavior-and-lsof-save-life/

Speak Your Mind

*