In my last post, “A closer look at the MySQL ibdata1 disk space issue and big tables,” I looked at the growing ibdata1 problem under the perspective of having big tables residing inside the so-called shared tablespace. In the particular case that motivated that post, we had a customer running out of disk space in his server who was looking for a way to make the ibdata1 file shrink. As you may know, that file (or, as explained there, the set of ibdata files composing the shared tablespace) stores all InnoDB tables created when innodb_file_per_table is disabled, but also other InnoDB structures, such as undo logs and data dictionary.
For example, when you run a transaction involving InnoDB tables, MySQL will first write all the changes it triggers in an undo log, for the case you later decide to “roll them back”. Long standing, uncommited transactions are one of the causes for a growing ibdata file. Of course, if you have innodb_file_per_table disabled then all your InnoDB tables live inside it. That was what happened in that case.
So, how do you move a table outside the shared tablespace and change the storage engine it relies on? As importantly, how does that affects disk space use? I’ve explored some of the options presented in the previous post and now share my findings with you below.
The experiment
I created a very simple InnoDB table inside the shared tablespace of a fresh installed Percona Server 5.5.37-rel35.1 with support for TokuDB and configured it with a 1GB buffer pool. I’ve used a 50G partition to host the ‘datadir’ and another one for ‘tmpdir’:
1 2 3 |
Filesystem Size Used Avail Use% Mounted on /dev/mapper/vg0-lvFernando1 50G 110M 47G 1% /media/lvFernando1 # datadir /dev/mapper/vg0-lvFernando2 50G 52M 47G 1% /media/lvFernando2 # tmpdir |
Here’s the table structure:
1 2 3 4 5 6 7 |
CREATE TABLE `joinit` ( `i` int(11) NOT NULL AUTO_INCREMENT, `s` varchar(64) DEFAULT NULL, `t` time NOT NULL, `g` int(11) NOT NULL, PRIMARY KEY (`i`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1; |
I populated it with 134 million rows using the following routine:
1 2 |
INSERT INTO joinit VALUES (NULL, uuid(), time(now()), (FLOOR( 1 + RAND( ) *60 ))); INSERT INTO joinit SELECT NULL, uuid(), time(now()), (FLOOR( 1 + RAND( ) *60 )) FROM joinit; # repeat until you reach your target number of rows |
which resulted in the table below:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
mysql> show table status from test like 'joinit'G *************************** 1. row *************************** Name: joinit Engine: InnoDB Version: 10 Row_format: Compact Rows: 134217909 Avg_row_length: 72 Data_length: 9783214080 Max_data_length: 0 Index_length: 0 Data_free: 1013972992 Auto_increment: 134872552 Create_time: 2014-07-30 20:42:42 Update_time: NULL Check_time: NULL Collation: latin1_swedish_ci Checksum: NULL Create_options: Comment: 1 row in set (0.00 sec) |
The resulting ibdata1 file was showing to have 11G, which accounted in practice for 100% of the datadir partition use then. What follows next is a few experiences I did by converting that table to use a different storage engine, moving it outside the shared tablespace, compressing it, and dumping and restoring the database back to see the effects in disk space use. I haven’t timed how long running each command took and focused mostly on the generated files size. As a bonus, I’ve also looked at how to extend the shared table space by adding an extra ibdata file.
#1) Converting to MyISAM
Technical characteristics and features apart, MyISAM tables are know to occupy less disk space than InnoDB’s ones. How much less depends on the actual table structure. Here I made the conversion in the most simplest way:
1 |
mysql> ALTER TABLE test.joinit ENGINE=MYISAM; |
which created the following files (the .frm file already existed):
1 2 3 4 5 |
$ ls -lh /media/lvFernando1/data/test/ total 8.3G -rw-rw---- 1 fernando.laudares admin 8.5K Jul 31 16:21 joinit.frm -rw-rw---- 1 fernando.laudares admin 7.0G Jul 31 16:27 joinit.MYD -rw-rw---- 1 fernando.laudares admin 1.3G Jul 31 16:27 joinit.MYI |
The resulting MyISAM files amounted for an additional 8.3G of disk space use:
1 |
/dev/mapper/vg0-lvFernando1 50G 19G 29G 40% /media/lvFernando1 |
I was expecting smaller files but, of course, the result depends largely on the data types of the columns composing the table. The problem (or the consequence) is that we end up with close to the double of the initial disk space being used:
As it happens with the other solutions presented in this section that migrate the target table outside the shared tablespace, the common/safest way to reclaim the freed (unused) space inside the ibdata1 file back to the operating system is by doing a dump & restore of the full database.
There’s an alternative approach with MyISAM though, which doesn’t involve dump & restore and only requires a MySQL restart. However, you need to convert all InnoDB tables to MyISAM, stop MySQL, delete all ib* files (there should be no remaining .ibd files after you’ve converted all InnoDB tables to MyISAM), and then restart MySQL again. Upon MySQL restart, ibdata1 will be re-created with it’s default initial size (more on this below). You can then convert the MyISAM tables back to InnoDB and if you have innodb_file_per_table enabled this time then the tables will be created with their own private tablespace file.
#2) Exporting the table to a private tablespace
Once you have innodb_file_per_table enabled you can move a table residing inside ibdata1 to it’s private tablespace (it’s own .ibd file) by either running ALTER TABLE or OPTIMIZE TABLE. Both commands create a “temporary” (though InnoDB, not MyISAM) table with it’s own tablespace file inside the database directory (and not in the tmpdir, as I believed it would happen), the rows from the target table being copied over there.
Here’s showing the temporary table (#sql-4f10_1) that was created while the process was still ongoing:
1 2 3 4 5 |
$ ls -lh /media/lvFernando1/data/test total 2.2G -rw-rw---- 1 fernando.laudares admin 8.5K Jul 30 20:42 joinit.frm -rw-rw---- 1 fernando.laudares admin 8.5K Jul 30 23:05 #sql-4f10_1.frm -rw-rw---- 1 fernando.laudares admin 2.2G Jul 30 23:12 #sql-4f10_1.ibd |
and the resulting .ibd file from when the process completed:
1 2 3 4 |
$ ls -lh /media/lvFernando1/data/test total 9.3G -rw-rw---- 1 fernando.laudares admin 8.5K Jul 30 23:05 joinit.frm -rw-rw---- 1 fernando.laudares admin 9.3G Jul 30 23:35 joinit.ibd |
Note that the new joinit.ibd file is about 9.3G. Again, the process resulted in the use of extra disk space:
1 |
/dev/mapper/vg0-lvFernando1 50G 20G 28G 42% /media/lvFernando1 |
#3) Dump and restore: looking at the disk space use
As pointed, the way to reclaim unused disk space inside ibdata1 back to the file system (and consequently making it shrink) is by dumping the full database to a text file and then restoring it back. I’ve started by doing a simple full mysqldump encompassing all databases:
1 |
$ mysqldump -S /tmp/mysql_sandbox5537.sock --user=msandbox --password=msandbox --all-databases > dump.sql |
which created the following file:
1 |
-rw-r--r-- 1 fernando.laudares admin 8.1G Jul 31 00:02 dump.sql |
I’ve then stopped MySQL, wiped out the full datadir and used the script mysql_install_db to (re-)create the system tables (though that’s not needed, I did it to level comparisons; once you have all InnoDB tables out of the system tablespace you can simply delete all ib* files along any .ibd and .frm files for related InnoDB tables), started MySQL again, and finally restored the backup: