One common question I guess is how much should I see performance improved in case I increase memory say from 16GB to 32GB. The benefit indeed can be very application dependent – if you have working set of say 30GB with uniform data access raising memory from 16GB to 32GB can improve performance order of magnitude by converting very IO bound load to CPU bound, it is well possible to see limited gains – if your working set already fits in 16GB you may not see any significant gains upgrading memory to 32GB. Interesting enough similar can happen for very large working set – for example if your main queries do full table scan of 100GB table it does not matter if you have 16GB or 32GB the load is going to be way too much IO bound anyway.
Interesting enough because of MySQL scaling issues it is also possible to see performance to go down as you increase buffer pool size. Some threads which would be safely sleeping waiting on IO completion are now finding their data in buffer pool so they start to compete on hot latches and performance go down.
Now back to original question – how do we predict the benefit from increasing the memory and so cache sizes ? I typically start by looking at the type of load we’re dealing with. If it is CPU bound and there is little IO wait we typically do not expect to gain much by increasing the memory.
This however have to be watched carefully. Performance does not always stays the same and the goal may not be optimizing average performance. It may be heavy data processing batch job which is IO bound and which runs too slow (and affects other transaction) and may be increasing memory is helpful to solve this problem.
If it is IO bound (high IO system utilization, low CPU usage) one should think about how much CPU capacity is available. If your CPU is 25% busy you will unlikely get more than 4x even if you eliminate all IO completely (unlikely because there is increased CPU overhead going IO path as well), so account for that.
Besides pure CPU based computation you should account for locking. Consider for example bunch of transactions updating single row in the table. Having such workload you would likely see no IO and a lot of CPU idle and not because of internal Innodb limits but because your application logical serialization problems.
So what if we have very IO bound application without serialization issues (say reporting slave) which is very IO bound showing 100% IO subsystem utilization on 5% of CPU usage ? This is the true challenge becomes because MySQL has no tools to analyze working set (we have per query working set in our patches but it is not enough). We have couple of ideas how to do global working set profiling but it should wait for now.
At this point I typically use my intuition to try to guess how much data application to get some ballpark figure and often it is enough.
If you would like to be more scientific there are couple of other things you can do. First – you can test by scaling down the data. If you have data for say 500.000 users on the 16GB server get it down to the half of that and you will often be close to seeing performance 32GB server would have. You however have to be careful and understand how data is used in your application. If you say have data for 10 years and load data for 5 years only to compare performance you may get misleading results if reports are typically done for last few months. Basically in such exercise your goal is load data so the working set would be half of original so it would have cache fit similar to one on the larger system you’re trying to compare to. Using this approach you also should be careful with your estimations and take IO subsystem into account – even with same cache hit ratio more data and more load means there are higher demands for IO subsystem performance.
By far the best method is trying, if you can afford it – just get memory upgrade and see how it affects performance. With many vendors you can get the memory upgrade or the whole system to try and return it back if it does not fit your needs. This approach especially works well if you have many slaves (or many shards) in which case you can see performance or capacity improvements quite easily.