Upgrading MySQL Server is a very interesting task as you can approach it with so much different “depth”. For some this is 15 minutes job for others it is many month projects. Why is that ?
Performing MySQL upgrade two things should normally worry you. It is Regressions – functionality regressions when what you’ve been using before does not work any more or works with different result and performance regressions when performance (in a broad sense) is negatively affected.
In general even minor MySQL version upgrade can have both of these issues. It gets larger as you have a larger leap in minor version – Upgrade of MySQL 5.0.30 to 5.0.32 is generally likely to expose less issues than 5.0.30 to 5.0.86. The largest amount of potential issues happens when you upgrade to different major version, especially if you skip over one. MySQL 5.0 to 5.1 upgrade is a lot safer upgrade path than 4.1 to 5.1 just because so much more people have followed that road.
So how do people approach the upgrade process ? I will describe couple of processes which I’ll call “reckless” and “safe” with no negative meaning implied 🙂
Reckless approach is typically used by small simple systems or by hosting providers. This simply means stopping old MySQL version and starting new one. MySQL always was very good maintaining on disk binary compatibility between version and so you can run MySQL 5.0 database with MySQL 5.1 with no problems in most cases. When storage type changes, such as DECIMAL field got whole another meaning in MySQL 5.0 MySQL generally supports legacy storage format for a while, even if it is not clearly visible. mysqlcheck -A –check-upgrade is a good quick way to check for known incompatibilities and fix them.
The data storage is however only small part of the problem – much bigger is potential behavior changes which may impact your application only subtly – some features can stop working or work differently after change. For example you may get queries returning different result or getting different execution plan, taking minutes or hours even though they were previously sub second. The changes which got a lot of publicity over years is change of TIMESTAMP output format in MySQL 4.1 or changing JOIN evaluation in 5.0 (which could both cause different result set or make query not runnable in MySQL 5.0)
In case you’re using Reckless approach you just hope you’re lucky – you hope you’re using simple enough functionality not to be affected or if you are you are capable of implementing fixes quickly, and what is most important your users will forgive you bugs or potential data loss you may run into 🙂
Reckless approach is safest if you’re running application which uses simple MySQL functionality which is unlikely to be affected. If you’re something like SELECT name FROM users WHERE id=5 you can probably go to the earliest MySQL version and still get same result. The more exotic or complicated functionality is the more likely it to be changed. It is also a lot safer to use this approach if you’re using some widely used application rather than your site which exists in single installation – WordPress, vBulletin etc. With such software, if you’re not upgrading to the most bleeding edge software, chances are someone else has done it before you, and so if there are some problems they have been worked around. The more widely used and well maintained software is the more likely it is to run on wide set of MySQL versions.
Safe Approach So what if you actually care about your user experience or at least about your data ? In this case you may need to take it much slower checking your progress along the way. There is no limit as Safe you can get so I would just outline some reasonable safety for example
Upgrade QA Upgrade needs to start with development/QA boxes. If you’re using safe approach you probably do not do development against your production. Note as soon as you upgrade QA envinronment the code which runs on it may not run well on production until it is upgraded. In case fixing code to run on different MySQL version is taking a lot of time you may wish to create second QA/dev environment just for this project.
Query Validation It is often hard to ensure queries work same way with different MySQL version, unless you have very detailed automated QA. So what you can do is get the backup of the database and get query log (you can use mk-query-digest with tcpdump to sniff out queries if you can’t enable log for any reason). As you’ve gotten database backup and set of queries which is representative enough you should set it up on 2 same boxes and use mk-upgrade tool to run comparison. The tool will run SELECT queries on both new and old MySQL installations and check result set, explain plan and execution time, reporting you on all the differences. When you can analyze the differences and see if there is anything you need to deal with.
Stress Testing Running single stream of queries with good speed is not enough. You also need to perform stress testing to ensure both MySQL does not crash under your load as well s it scales well. There have been number of scalability bugs in MySQL history when issues would only happen at high load. You can do this in test/stress test envinronment or you can do it later when you setup some slaves with this version (or selecting to use version only for new shard and measuring its capacity)
Setting up the Slave(s) If you want to minimize downtime it is best to use MySQL Replication for upgrade. Once you have setup slave with new version and made it to caught up you can use mk-table-checksum to ensure data is the same. It is possible some update statements worked differently in the new version or you run into some replication bug – in any case you need to take care about these if you’re to be safe. Adding the slave with new version for testing is also a safe way to test things out – you can move read traffic to such slave to ensure new version stability and performance while you still have the slave running the old version in case you need to fail back quickly. This especially makes sense for cloud environments as Amazon EC2 where it is easy and inexpensive to temporary get extra boxes for time of upgrade.
Swapping master. Finally as you have your slave running on new MySQL version you want to promote it to the master and upgrade Master too. I’d take the final backup before upgrading and make sure to keep binary logs for some time, in case you run into some serious issues with Master. In many cases you can leave the ex-Master as a slave so you can fail back to the old version easily if you need to. In other cases however it is not possible as MySQL only fully support slaves newer than the master. Older slaves may have issues unable to interpret new replication stream correctly. If this is the problem for you can try Tungsten Replicator which claims to work in both directions (though I have not tried it myself)
Note in case you’re having Sharded envinronment you often approach things a bit differently – upgrading one shard with full validation and when doing reckless upgrades for other shards after you’re sure your application works well with them. You can often stage upgrades in time a bit so if something happens (like MySQL crashes) you have only one/few shard to deal with.
P.S Happy New Year to all of you 🙂