Learning MySQL 5.7: Q & A

August 11, 2017

Author

Jervin Real

MySQL

Share this Post:

In this post I’ll answer questions I received in my Wednesday, July 19, 2017, webinar Learning MySQL 5.7!

First, thank you all who attended the webinar. The link to the slides and the webinar recording can be found here.

I received a number of interesting questions in the webinar that I’ve followed up with below.

Would there be a big difference on passing from 5.1 to 5.6 before going to 5.7 or, at this point, would it be roughly the same?

The biggest risk of jumping between versions, in this case 5.1 to 5.6, is reverting in case of problems. Rollbacks don’t happen often, but they do happen and you have to make sure you have the infrastructure in place whenever you decide to execute. These upgrade steps are not officially supported by Oracle nor even recommended here at Percona. Having said that, as long as your tests (checksums, pt-upgrade) and rollback plan works, this shouldn’t be a problem.

One unforgettable issue I have personally encountered is an upgrade from 5.1 via dump and reload to 5.6. The 5.6 version ran with ROW binlog format preventing replication back to 5.1 because of the limitation with the TIMESTAMP columns. Similarly, downgrading without replication means you have to deal with changes to the MySQL system schema, which obviously require some form of downtime.

Additionally, replication from 5.7 to 5.5 will not work because of the additional metadata information that 5.7 creates (i.e., GTID even when GTID is disabled).

After in-place upgrade a Percona XtraDB Cluster from 5.5 to 5.7 (through 5.6), innodb_file_per_table is enabled by default and the database is now almost twice the size. It was a 40 GB DB now it’s 80 GB due to every table has its own file but ibdata1 is still 40 GB. Is there any solution for this (that doesn’t involve mysqldump and drop tables) and how can this be avoided in future upgrades?

The reason this might be the case is that after upgrading, a number (or possibly all) of tables were [re]created. This would obviously create separate tablespaces for each. One way I can think of reclaiming that disk space is through a familiar upgrade path:

Detach one of the nodes and make is an async replica of the remaining nodes in the cluster

Dump and reload data from this node, then resume replication

Join the other nodes from the cluster as additional nodes of a new cluster using the async replica

Once there is only one node remaining in the original cluster, you can switch to the new cluster for production

Rejoin the last node from the original cluster into the new cluster to complete the process

Depending on the semantics of your switch, it may or may not involve a downtime. For example, if you use ProxySQL this should be a transparent operation.

One way to avoid this problem is by testing. Testing the upgrade process in a lab will expose this kind of information even before deploying the new version into production, allowing you to adjust your process accordingly.

What is a possible impact on upgrades going from the old table format to Barracuda?

So far I am not aware of any negative impact – except if you upgrade and need to downgrade but have since created indexes with prefixes larger than what was supported on the previous version (see large_index_prefix and Barracuda documentation).

Upgrading to Barracuda and one of the supported row formats specifically allows memory constrained systems to save a little more. With BLOB/TEXT column stored off the page, they will not fill the buffer pool unless they are needed.

How do you run mysql_upgrade in parallel?

Good question, I actually wrote about it here.

Can you elaborate on ALTER progress features, and is it also applicable to “Optimization ” query?

I was not able to get more details on the “Optimization” part of this question. I can only assume this too was meant to be table rebuild via OPTIMIZE TABLE. First I would like to point out that OPTIMIZE has been an online DDL operation from 5.6 (with few limitations). As such, there is almost no point in monitoring. Also, for the cases where the online DDL does not apply to OPTIMIZE, under the hood, this is ALTER TABLE .. FORCE – a full table rebuild.

Now, for the actual ALTER process doing a table copy/rebuild, MySQL 5.7 provides some form of progress indication as to how much work has been done. However, it does not necessarily provide an estimate of the actual time it would take to complete. Each ALTER process has different phases which can vary under different conditions. Alternatively, you can also employ other ways of monitoring progress as described in the post.

We are migrated from 5.7.11 to 5.7.17 Percona Server and facing “ Column 1 of table 'x.x' cannot be converted from type 'varchar(100)' to type 'varchar(100)'”.

This is interesting – what we have seen so far are errors with different datatypes or sizes, which most likely means inconsistency from the table structures if the error is coming from replication. We will need more information on what steps were taken during the upgrade to tell what happened here. Our forums would be the best place to continue this conversation. To begin with, perhaps slave_type_conversions might help if the table structures in replication are the same.

Is the Boost Geometry almost on par with Postgres GIS functions?

I cannot answer this with authority or certainty. I’ve used GIS functions in MySQL, but not developed code for it. Although Boost::Geometry was chosen because of its well-designed API, rapid development and license compatibility, it does not necessarily mean it is more mature than PostGIS (which is widely adopted).

What is the best bulk insert method for MySQL 5.7?

The best option can be different in many situations, so we have to put context here. For this reason, let me give some example scenarios and what might work best:

On an upgrade process where you are doing a full dump and reload, parallelizing the process by using mydumper/myloader or mysqlpump will save a lot of time depending the hardware resource available.

Bulk INSERT from your application that happens at regular intervals – multi-row inserts are always ideal to reduce disk writes per insert. LOAD DATA INFILE is also a popular option if you can.

Again, thank you for attending the webinar – if you have additional questions head on out to the Percona Forums!