Troubleshooting MySQL Upgrade Performance Regressions

So lets say you upgraded from MySQL 5.1 to Percona Server 5.5 and instead of expected performance improvement you see your performance being worse. What should you do ?
First if you followed MySQL upgrade best practices such as testing your workload with pt-upgrade the chances of this happening are rather slim. But lets assume you have not followed these recommendations to the book or some things just slipped through.

First lets talk about what you should not do. You should not panic and go change all kind of configuration options in case you get lucky. Sometimes you do but more often you just waste your time. Instead try to understand what has changed and what exactly changed for worse.

I am speaking about MySQL upgrade – version change but in a lot of cases this will happen when settings are changed at the same time, operating system is upgraded or system is moved to “better” hardware. It is best if you can isolate the problem to one of these – make sure changing only MySQL version on the same hardware with same settings causes regression. You can always improve them later after you got well performing baseline. Do not change storage engine at the same time eather.

Make sure you’re looking at fully warmed up system. Rather often people panic and what thought to be slow down with version upgrade is rather system warming up. You need to compare apples to apples such as both systems after restart or even better both systems fully warmed up.

To spot what exactly is causing the problem it is good to use pt-query-digest to compare workloads on servers. Best if you can analyze full query logs but if this is not possible at least look at network traffic from tcpdump. This should help you to spot queries which are a lot slower on new MySQL version.

Once you have the queries you will most likely able to repeat the problem by running query on old and new MySQL Server version and observing the difference. Sometimes though it might not show performance difference ran alone on idle system – in this case you might be dealing with concurrency problem and you might need to create a more complicated test which runs this query concurrently many times or even workload consisting of several query types. Though it is rather rare for this to be needed so lets not go there.

It is ideal if you can repeat the problem on “test server” where you can analyze it without side load. Setting up 2 MySQL Servers side by side (for example with MySQL Sandbox) can especially be helpful.

Once you have spotted the query which performs differently between MySQL Server versions you should:

Check Query Plan Run EXPLAIN to see if plans for the query are the same. Changing Query Execution plans is the most common regression problem. If plan has changed check
if you can get the plan manually to the old one by using hints such as STRAIGHT_JOIN, FORCE INDEX, BIG_RESULT/SMALL_RESULT. Check whatever stats are the same (run SHOW INDEXES FROM

for tables involved and check cardinality) Different stats can often cause different plans. Run ANALYZE TABLE on both MySQL Versions to see if you can get statistics to be same or close. I should warn you though. In some cases the faster plan may be wrong plan from optimizer standpoint and updating stats may cause old MySQL Version to get slower plan instead of new one getting faster. Check stats sampling settings such as innodb-stats-method and myisam-stats-method. For Innodb storage engines it is possible for stats to be different for the same data because it uses random sampling to update the stats. You may consider disabling innodb_stats_on_metadata and increasing innodb-stats-sample-pages to get more accurate results. In recent Percona Server versions and MySQL 5.6 you can also store innodb stats in table so you have more control over them.

In the end the tricky thing about Query Plan is you might not be able to get new version to pick the old plan no matter how you try and you might be forced to change your application by changing how query is written and provide the hints.

Check Query Execution If query execution plan looks the same it is good to check whenever it seems to be executing same on low level. FLUSH STATUS; SHOW STATUS can show whenever internal operation is different, you can also check SHOW PROFILE to see where differences are. Running query in the loop and getting o_profile sample can also be a good idea. If you found query is executing differently even if the plan is the same it could be because changes to the options (might be even defaults) or some more subtle plan changes which are not seen in explain. It also could be some additional internal overhead being added. If you have query with same plan taking a lot longer to execute you might be up for deep investigation. You may also wish to create a repeatable test case which does not use any sensitive data and file the bug at this point.

Check Binaries Make sure you’re not using debug binary or differently optimized binaries.

When you’re about to file a bug or report the problem through MySQL support channel if often helps to know whenever it is the problem in stick MySQL changes between versions on Percona enhancements. Before reporting bug to Percona Server bug tracking system check whenever the bug is in Percona Server indeed by running the same test on MySQL server of the same version. If you’re Support customer you do not need to do it as Support Engineer can do it for you provided we have complete test case provided.

Share this post