
How to reclaim space in InnoDB when innodb_file_per_table is ON

September 25, 2013 | Posted In: Insight for DBAs, MySQL, Percona Toolkit


When innodb_file_per_table is OFF, all data is stored in the shared ibdata files. If you drop some tables or delete some data, there is no way to reclaim that unused disk space other than the dump-and-reload method.

When innodb_file_per_table is ON, each table stores its data and indexes in its own tablespace file. However, the shared tablespace (ibdata1) can still grow; you can find more information here about why it grows and what the solutions are:

https://www.percona.com/blog/2013/08/20/why-is-the-ibdata1-file-continuously-growing-in-mysql/

Following the recent blog post by Miguel Angel Nieto titled “Why is the ibdata1 file continuously growing in MySQL?”, and since this is a very common question for Percona Support, this post covers how to reclaim the space when we are using innodb_file_per_table. I will also show you how to do it without causing performance or availability problems, with the help of our Percona Toolkit.

When you remove rows, they are only marked as deleted on disk; the space remains allocated to the InnoDB files. It can be reused later when you insert or update more rows, but the files themselves will never shrink. This is a very old MySQL bug: http://bugs.mysql.com/bug.php?id=1341

However, if you are using innodb_file_per_table, you can reclaim the space by running OPTIMIZE TABLE on that table. OPTIMIZE TABLE creates a new, identical, empty table and then copies the data row by row from the old table to the new one. In this process a new .ibd tablespace file is created and the unused space is reclaimed.
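As a minimal sketch of the whole cycle (the table name and row range here are hypothetical; sizes will vary with your data):

```sql
USE test;

-- Delete a large number of rows; the test.ibd file does NOT shrink,
-- the freed pages are only marked reusable inside the file.
DELETE FROM test WHERE id < 2000000;

-- Rebuild the table; a new .ibd file is created and the space is reclaimed.
-- For InnoDB, OPTIMIZE TABLE is internally mapped to a table rebuild
-- ("Table does not support optimize, doing recreate + analyze instead").
OPTIMIZE TABLE test;
```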

You can see that after deleting 2M records, the test.ibd file size was still 168M.

After OPTIMIZE, the space is reclaimed: the test.ibd file size decreased from 168M to 68M.

I would like to mention that during this process the table is locked (for writes only), which can affect performance when you have a large table. If you don’t want to lock the table, you can use one of the best Percona utilities, pt-online-schema-change, which can run an ALTER without locking the table. Running ALTER TABLE with ENGINE=InnoDB will rebuild the table and reclaim the space.

(It is always recommended to use the latest version of the pt-* utilities.)
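A minimal sketch of the no-op ALTER with pt-online-schema-change (the database and table names are hypothetical; always do a dry run first):

```shell
# Dry run: creates and drops the new table without copying any data,
# so you can check for problems before touching production.
pt-online-schema-change --alter "ENGINE=InnoDB" D=test,t=test --dry-run

# Rebuild the table online; triggers keep the copy in sync while rows
# are copied, and the tables are swapped atomically at the end.
pt-online-schema-change --alter "ENGINE=InnoDB" D=test,t=test --execute
```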

Same here: you can see that the test.ibd file size decreased from 157M to 56M.

NOTE: Please make sure you have ample free disk space before you run pt-online-schema-change, because it creates a temporary table that is roughly the size of the original table. Also note that by running this on the primary node, the change will be replicated to your slaves.

Nilnandan Joshi

Nilnandan officially started with Percona as a Support Engineer. Before joining Percona, he worked as a MySQL database administrator at several service-based companies, managing high-traffic websites and web applications. Nilnandan has extensive experience in database design and development, database management, client management, security, documentation, training, implementing DRM solutions, automating backups, and high availability. Nilnandan is based in Pune, India. In his spare time, he likes listening to Indian classical and semi-classical music, watching TV, playing cricket and badminton, and hanging out with his family.

11 Comments

  • Nice use of pt-online-schema-change. Thanks for highlighting one of the obvious but very often forgotten uses: a plain old ALTER to clean up space!

  • Unfortunately, OPTIMIZE TABLE is not a good solution in general. Try it on a table with 50M rows and it will take you several hours, if not days. InnoDB should really fix this at the engine level; I have tables at triple the size they should be, wasting multiple GB of space, because of this inefficiency of InnoDB.

  • Agreed, Recardo, but until they fix this at the engine level, we have to rely on alternative solutions. For bigger tables, you can also use pt-online-schema-change, which runs without locking tables.

  • Hi Joshi,
    Question: how does pt-online-schema-change handle the fact that the source table might change while it is working (since the table is not locked)? Does it read the binlogs for changed data in the table?

  • Hi Gerrit,

    As you know, pt-online-schema-change creates an empty copy of the table to alter, modifies it as desired, and then copies rows from the original table into the new one. While copying, pt-online-schema-change maintains triggers on the original table that update the corresponding rows in the new table. So any modifications to data in the original table during the copy are reflected in the new table.

    Please check more information here.
    https://www.percona.com/doc/percona-toolkit/2.2/pt-online-schema-change.html
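    To illustrate the mechanism, the triggers look roughly like this (a simplified, hypothetical sketch with made-up table and trigger names; the real tool generates its own names and handles many more cases):

```sql
-- Simplified illustration of what pt-online-schema-change sets up.
-- REPLACE INTO makes inserts and updates idempotent against the copy;
-- DELETE IGNORE removes rows that disappear during the copy.
CREATE TRIGGER pt_osc_ins AFTER INSERT ON orig_table
  FOR EACH ROW REPLACE INTO new_table VALUES (NEW.id, NEW.col1);
CREATE TRIGGER pt_osc_upd AFTER UPDATE ON orig_table
  FOR EACH ROW REPLACE INTO new_table VALUES (NEW.id, NEW.col1);
CREATE TRIGGER pt_osc_del AFTER DELETE ON orig_table
  FOR EACH ROW DELETE IGNORE FROM new_table WHERE new_table.id = OLD.id;
```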

  • Nifty use of the tool.

    Is it possible to force pt-online-schema-change to use an alternate partition or some other disk location to store the temporary table? I have a slightly degenerate case where we’re writing and deleting a massive amount of binary data in a single table. We use file_per_table, but it is getting to the point where the size of just that one table exceeds the disk space remaining on the high-speed SSD this DB runs on.

    Is it possible to force the non-locking schema change to write the temp table to a slower disk of my choosing and then copy it back across once it has been rewritten? Any other thoughts on how we could handle this?

    Thanks for a nice and useful post!

  • Looks good!

    We need to reclaim disk space on one of our replication slaves a.s.a.p., which is why I am wondering if I can run pt-online-schema-change ONLY on the slave lacking disk space, or if that will mess up the replication altogether (we are using row-based replication)?

    If not, would it be possible to run pt-online-schema-change on the master and rely on the replication to reflect the changes (i.e. the file shrinkage) accordingly to the slave?

    Thanks!

  • Hi Andreas,

    With row-based replication, you can’t use pt-osc (pt-online-schema-change) directly on a slave, because pt-osc uses triggers to capture new data during the copy. Triggers are not executed for ROW events, so pt-osc will not work when the slave receives row events from the master, and you might end up with data inconsistency between the master and slave.

    So in this case, you can run a plain ALTER TABLE directly on the slave. It will lock the table while the ALTER runs and replication may be held up, but you will reclaim the space.
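    In other words, the direct rebuild on the slave is just (hypothetical table name; writes to this table, including replicated ones, block until it finishes):

```sql
-- Run directly on the slave; this rebuilds the table in place and
-- shrinks the .ibd file, at the cost of blocking the table while it runs.
ALTER TABLE test ENGINE=InnoDB;
```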

  • Hello Nilnandan,

    I have a problem. The production table I want to alter is around 650 GB, and I want to remove a FK from it. The MySQL version is 5.1.61-log.

    I used this tool on staging and it worked fine. But the problem is that I am not in innodb_file_per_table mode, and on the production server I am about 200 GB short on space, since the table is 650 GB and the available space is 450 GB. There is no room for another drive.

    Please advise. Is there any way to skip copy?

  • For a practical, automatic reclamation of table space, one solution is simply to run pt-online-schema-change with the ‘no-op’ ALTER TABLE statement mentioned in the post, for all tables.

    However, this is not very efficient, because it requires all data from all tables to be copied, with possibly very little space reclaimed. As an alternative, it would be nice to estimate beforehand how much space can be reclaimed for a particular table relative to its total size. Then the automatic reclamation task could run only when, say, 10% or more of the table space can be reclaimed.

    To implement something like this it is necessary to get estimates for the total table size and the size of unused or used data in the table. Are there easy methods to do this in MySQL?
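    One rough way to estimate this (hedged: the DATA_FREE column is only an approximation based on table statistics, and it behaves differently for tables living in the shared tablespace) is to query information_schema.TABLES:

```sql
-- Rough per-table estimate of reclaimable space; DATA_FREE is approximate.
SELECT table_schema, table_name,
       ROUND((data_length + index_length) / 1024 / 1024) AS total_mb,
       ROUND(data_free / 1024 / 1024) AS free_mb,
       ROUND(100 * data_free / (data_length + index_length + data_free), 1)
         AS free_pct
FROM information_schema.TABLES
WHERE engine = 'InnoDB'
ORDER BY free_mb DESC;
```

    A reclamation task could then rebuild only tables where free_pct exceeds the chosen threshold.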
