pt-online-schema-change and innodb_stats_persistent on MySQL 5.6

Percona ToolkitRecently we released an updated version of our Percona Toolkit with an important update for pt-online-schema-change if you use MySQL 5.6. A new parameter has been added, analyze-before-swap. What does it do? Let me explain it in more detail.

A bug not a bug

A customer contacted us because pt-online-schema-change caused hundred of queries to get stuck after the ALTER was done. All those queries were trying read from the altered table but for some reason the queries never finished. Of course, it caused downtime. The reason behind this is this “bug not a bug”:

As a summary, if you are running 5.6 with persistent stats enabled (which it is by default), the optimizer in some cases could choose a bad execution plan because it has incorrect statistics to make a good decision. Even simple queries with a WHERE condition trying to find a value on the PK could switch to a full table scan, because the optimizer has no idea there is a PK.

There are two ways to force index calculations:

  • Wait until the background thread recalculates the statistics of the new table. This could take longer than expected if the load on the server is really high.
  • Run analyze table.

So, –analyze-before-swap actually does that. It runs ANALYZE on the new table before the table swap is done. That means that ANALYZE on the new table (before rename) does NOT affect queries of the table we are altering, but could affect the triggers’ operations for the new table.

Analyze table is not the best solution either. Check out this blog post from Peter:

So in case you want to disable this feature, you would need to add no-analyze-before-swap. If pt-online-schema change finds that the version running is 5.6 and that persistent stats are enabled, then the configuration option will be enabled by default.

Share this post

Comment (1)

  • Øystein Grøvlen

    I do not think Peter’s warning about ANALYZE TABLE is relevant in this scenario. Since applications will not access the new table before the swap, there should be no lock conflicts between ANALYZE TABLE and other connections.

    Note also that it is not sufficient to wait for the recalculation of statistics. Ongoing connections will, AFAIU, never see the updated statistics. The recalculated statistics are only read when opening new table objects. Hence, either ANALYZE TABLE or FLUSH TABLES, which closes all open table objects, are necessary in order for all connections to see the recalculated statistics.

    I do not think “because the optimizer has no idea there is a PK” can be correct. What is the primary key is determined from the data dictionary, not based on InnoDB statistics.

    November 16, 2015 at 7:19 am

Comments are closed.

Use Percona's Technical Forum to ask any follow-up questions on this blog topic.