November 23, 2014

Load Management Techniques for MySQL

One of the most frequent performance problems with MySQL is that problems happen only every so often, or at certain times. Investigating them, we often find the cause is batch jobs, reports, and other non-response-time-critical activities overloading the system and causing user experience to degrade.

The first thing you need to know is that this is not a MySQL problem; it might not even be a problem with your MySQL configuration, queries, or hardware, even though fixing these does help in many cases. However powerful and well-tuned your system is, if you put too heavy a concurrent load on it, response times will increase and user experience will suffer.

So what can you do to prevent this problem from happening? The answer is easy: throttle the side load so it does not consume too many system resources. Here are some specific techniques to use.

Don't push concurrency too high. Many developers will test a script at multiple levels of concurrency and find that doing the work from 32 processes is faster than having just one. This is true if you have the system completely at your disposal. If, however, the system also needs to serve other users, you typically need to reduce concurrency to the point where it does not overload the system. Unless it is a really time-critical process, I would not use more than 4 parallel processes heavily writing to the database.
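A capped worker pool is a simple way to enforce such a limit. Here is a minimal sketch; `process_item` and the batch contents are hypothetical stand-ins for whatever writes your job actually performs:

```python
from multiprocessing import Pool

def process_item(item):
    # Placeholder for the real work: in practice each worker would open
    # its own connection and perform a chunk of writes against MySQL.
    return item * 2

if __name__ == "__main__":
    batch = range(100)
    # Cap concurrency at 4 workers, so the batch job cannot saturate
    # the server no matter how large the batch grows.
    with Pool(processes=4) as pool:
        results = pool.map(process_item, batch)
    print(sum(results))  # prints 9900
```

The point is that the batch size and the concurrency level are decoupled: you tune the worker count to what the server tolerates, not to how much work there is.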

Introduce throttling. Sometimes even a single process overloads the system too much. In this case throttling – keeping individual queries relatively short and introducing "sleeps" between them – can be a good idea. It also often helps to avoid monopolizing the replication thread. For example, if I need to delete old data, instead of DELETE FROM TBL WHERE ts<"2010-01-01" I'll run DELETE FROM TBL WHERE ts<"2010-01-01" LIMIT 1000 in a loop until no more rows need to be deleted. Then I may inject a "sleep" between iterations that lasts as long as the query execution itself – so the longer the queries run (and the more loaded the system is), the more "rest" it gets. Alternatively, you can look at the "Threads_running" status variable, which is a very good simple indicator of current load, and sleep based on its value – for example, you may choose to pause the script entirely if the load is too high and wait for Threads_running to go below a certain value.
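The chunked-delete loop with adaptive sleep can be sketched as follows. This is a hedged illustration: `delete_chunk` is a stand-in you would implement with your MySQL driver of choice, and the "sleep as long as the query took" policy is exactly the one described above:

```python
import time

def throttled_purge(delete_chunk, sleep=time.sleep):
    """Run delete_chunk() repeatedly until it reports no more rows,
    sleeping after each iteration for as long as the query itself took,
    so a more loaded server automatically gets longer pauses."""
    total = 0
    while True:
        start = time.monotonic()
        deleted = delete_chunk()   # e.g. DELETE ... LIMIT 1000; returns rowcount
        elapsed = time.monotonic() - start
        total += deleted
        if deleted == 0:
            return total
        sleep(elapsed)             # rest for as long as the chunk took

# In a real script, delete_chunk would look something like this
# (hypothetical connection object `conn` from your MySQL driver):
#
#   def delete_chunk():
#       with conn.cursor() as cur:
#           cur.execute("DELETE FROM TBL WHERE ts < '2010-01-01' LIMIT 1000")
#           conn.commit()
#           return cur.rowcount
```

A Threads_running-based variant would replace the fixed sleep policy with one that polls SHOW GLOBAL STATUS and waits while the value is above your chosen ceiling.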

Tuning cron. It also often helps to look into cron or whatever other scheduling system you're using. Frequently, way too many scripts are started at once, or so close to each other that they start to overlap, producing the overload. Solutions include spacing them out and introducing some "job control" to ensure scripts do not run in parallel when they should not (and, especially, that you never get many copies of the same script running at once). One simple solution: instead of having a bunch of scripts scheduled to start at midnight, 1AM, and 2AM, I can put them into nightly.sh one after another and schedule that to run at midnight – this way the scripts run one after another, each at its own pace.
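The "no second copy of the same script" guard can be as small as a non-blocking file lock taken at startup. A minimal sketch (the lock path is an arbitrary choice, and this uses Unix flock semantics):

```python
import fcntl
import sys

def acquire_single_instance_lock(path="/tmp/nightly.lock"):
    """Return a locked file handle, or None if another copy is running.
    The handle must be kept open for the lifetime of the script; the
    kernel releases the lock automatically when the process exits."""
    handle = open(path, "w")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle

if __name__ == "__main__":
    lock = acquire_single_instance_lock()
    if lock is None:
        sys.exit("another copy is already running; exiting")
    # ... run the batch work here ...
```

Because the lock dies with the process, a crashed job never leaves a stale guard behind, which is the usual failure mode of PID-file approaches.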

Dedicated slave. I remember listening to a talk by Cary Millsap once, where he recommended moving load in time and space as an optimization technique. We spoke about moving load in time above, but we can also move it in space – putting it on a different system, which in the MySQL world is most commonly a dedicated slave. In a lot of environments, especially those without the level of operational/development discipline needed to enforce the previous solutions, it can be a life saver. Of course, it only works for read jobs, which is an important limitation. Getting slave(s) for batch jobs can also help in other ways – for example, competition for the buffer pool between different kinds of workloads is reduced.

innodb_old_blocks_time. Surprisingly simple but effective: setting innodb_old_blocks_time=1000 can often be very helpful in preventing batch jobs from washing away buffer pool contents and so making normal user queries a lot more disk-bound and slower. I wrote about it in more detail a few months ago.
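The variable is dynamic, so it can be applied at runtime without a restart. The value is in milliseconds – 1000 means a page read into the old sublist must survive a full second before it can be promoted toward the hot end of the buffer pool:

```sql
-- Apply at runtime:
SET GLOBAL innodb_old_blocks_time = 1000;

-- Or persist it in my.cnf:
-- [mysqld]
-- innodb_old_blocks_time = 1000
```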

Finally, let's touch on the discovery question. To deal with load management you need to understand whether the problem is happening in your environment (we want to catch it before users complain, right?) and, if it is, which jobs exactly cause the overload. In complex environments this may be a harder question than it looks. pt-stalk is a great tool for this purpose. Keeping it running can help you collect the state of your system when it was overloaded by side load (as well as when it was performing poorly for other reasons). The wealth of data it generates will most likely contain the answers you're looking for.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Kevin says:

    Another option for using a dedicated slave is to write all batch jobs to use Gearman MySQL UDF interface to apply update transactions to the master (but having the batch job only use the slave connection). You only need to implement a generic Gearman Worker to take update transactions to apply and pace MySQL transactions to the real master db. You can configure multiple workers for quicker batch job execution if the load on the DB is light and these workers can easily throttle themselves when the DB load increases.

    And better yet, you could use Percona MySQL Cluster to have a dedicated master for all batch job processing and use Gearman Workers to apply updates to this dedicated master, using pacing in the Gearman Workers based on load statistics of the other dbs in the cluster.

    Maybe Percona could implement built-in update pacing using Gearman and Percona provided generic worker. You might be able to hide all the implementation details from clients and allow the client to enable Update Pacing using a MySQL session variable.

  2. Angelo says:

So, I don't know if this is the right place to suggest something, but I will. I have a question about selecting sorted results from a big MySQL table (MyISAM, with about 2 million records): what can I do to get random results from this table with the best performance? Would you consider posting about it? Thanks.

  3. Alexey Polyakov says:

For Linux-based systems, one of the easiest ways to check if the concurrency level needs tuning is the user/system CPU usage ratio. A server running at its optimal load should have roughly 80-90% userspace CPU usage and less than 10% system.
