We recently ran benchmarks comparing the CPU efficiency of different MySQL versions on a TPC-C-like workload. We ran them a couple of weeks ago, so MySQL 5.0.67, MySQL 5.1.29 and InnoDB Plugin 1.0.1 were used. These are not the most recent releases, though we do not think the results would differ much with today’s versions.
Results are as follows:
The system was a 2x quad-core Xeon E5310 running CentOS 5, with data stored on ramfs. We controlled the number of cores in use via /sys/devices/system/cpu/cpuX/online. For each core count we took the maximum performance, which was reached with the number of sessions matching the number of cores. Just 1 “Data warehouse” was used to keep the data small.
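For readers who want to reproduce the core-count control, here is a minimal sketch of toggling cores through the sysfs files mentioned above. The function name `set_online_cores` and its `sysfs_root` parameter are made up for illustration; this is not the script we used. It assumes a Linux system, root privileges, and that cores without an `online` file (commonly cpu0) simply stay online.

```python
import os
import re


def set_online_cores(wanted, sysfs_root="/sys/devices/system/cpu"):
    """Keep cores with index < wanted online and take the others offline.

    Returns a dict mapping core index to the state written ("1" or "0").
    Hypothetical helper: writing to the real sysfs tree requires root.
    """
    changed = {}
    for name in sorted(os.listdir(sysfs_root)):
        match = re.fullmatch(r"cpu(\d+)", name)
        if not match:
            continue  # skip entries such as cpufreq or cpuidle
        index = int(match.group(1))
        control = os.path.join(sysfs_root, name, "online")
        if not os.path.exists(control):
            continue  # no control file, so this core cannot be toggled
        state = "1" if index < wanted else "0"
        with open(control, "w") as handle:
            handle.write(state)
        changed[index] = state
    return changed
```

Run as root, `set_online_cores(2)` would leave two cores online for the 2-core test, and `set_online_cores(8)` would bring all eight back.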
As you can see, MySQL gains somewhat from the read-write lock split patch (found in Percona builds), though the gain is not very significant for this workload. To isolate the effect of this patch, we applied only this patch in testing, not the full patch set.
MySQL 5.1 is 4% slower than MySQL 5.0 with 2 cores and just 2% slower with 8 cores, thus showing slightly better scalability.
MySQL 5.1 with the InnoDB Plugin (compiled in) is a further 3% slower than MySQL 5.1 with 2 cores and about 6% slower with 8 cores, meaning the regression from the plugin grows with the number of cores.
If you not only run the InnoDB Plugin but also use the new “Barracuda” InnoDB format, you see just a 1% slowdown with 2 cores and about half a percent with 8 cores, which can be attributed to measurement error.
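The percentages in these comparisons are relative throughput differences. As a small sketch of the arithmetic, the helper below computes how much slower one build is than another. The function name and the transaction rates in the usage note are hypothetical and are not the measured results.

```python
def slowdown_pct(baseline_tps, candidate_tps):
    """Percent by which the candidate is slower than the baseline.

    A negative result means the candidate is faster.
    """
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return 100.0 * (baseline_tps - candidate_tps) / baseline_tps
```

With made-up rates of 1000 and 960 transactions per unit of time, `slowdown_pct(1000, 960)` gives a 4% slowdown. Comparing that figure at 2 cores and at 8 cores shows whether a regression shrinks or grows as cores are added.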
This tells us there are some workloads where MySQL 5.1 is slower than 5.0, and the same applies to the new InnoDB code. Newer does not mean more efficient. On the contrary, new features often come with more code and longer execution paths.
Another thing to note: if you’re using the InnoDB Plugin, consider using the new Barracuda format, though only after careful testing, as this format will not be recognized by older InnoDB versions.
Note: These are completely CPU-bound test conditions. The data fits in the buffer pool, and both data and logs are on ramfs, so no IO is ever needed.
UPDATE: Some people asked about CPU usage under these conditions. Here is the graph:
The CPU usage is normalized to the number of CPU cores in use, so 100% in the 2-core case means both cores are fully busy, 100% in the 8-core case means all 8 cores are fully busy, and so on.
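A minimal sketch of that normalization, assuming you have a busy percentage for each online core (for example from a per-core monitoring tool). The function name is hypothetical and the sample values are invented.

```python
def normalized_cpu_pct(per_core_busy_pct):
    """Average per-core busy percentages so 100 means every core is busy."""
    if not per_core_busy_pct:
        raise ValueError("need at least one core")
    return sum(per_core_busy_pct) / len(per_core_busy_pct)
```

For instance, two cores at 100% each normalize to 100%, and eight cores at 75% each normalize to 75%, so figures for different core counts share one scale.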
As you can see, in general the more cores we add, the more idle CPU we get.
It is also very interesting to see the correlation between CPU usage and performance. The Plugin uses less CPU with 8 cores and delivers less performance, which usually indicates that synchronization is the issue. The Barracuda format uses more CPU while delivering better performance, so it probably does better with latching too, though it is hard to say anything about its CPU efficiency.
The RW-Lock patch is best in this case: it shows increased performance with decreased CPU usage.
Note that when I say a drop in CPU usage points to concurrency issues, I do not mean that an increase in CPU usage rules them out. There are 2 cases when waiting on mutexes and other synchronization objects: either threads sleep while they wait, so there are not enough runnable threads to use the CPU fully, or threads wait by spinning on a spinlock, wasting CPU cycles. Detailed profiling tells you which one it is.
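To illustrate the two waiting styles, here is a toy sketch using Python threads. It is not InnoDB's mutex implementation, and the function name and `hold_seconds` parameter are invented for this example. One thread holds a lock while another waits for it, either blocked in the OS or polling in a loop, and we measure the CPU time the process burns meanwhile.

```python
import threading
import time


def cpu_seconds_while_waiting(spin, hold_seconds=0.2):
    """Return CPU seconds the process uses while one thread waits for a lock."""
    lock = threading.Lock()
    lock.acquire()  # the calling thread plays the lock holder

    def waiter():
        if spin:
            # Spin: poll the lock repeatedly without ever sleeping.
            while not lock.acquire(blocking=False):
                pass
        else:
            # Block: the thread sleeps until the lock is released.
            lock.acquire()
        lock.release()

    thread = threading.Thread(target=waiter)
    started = time.process_time()
    thread.start()
    time.sleep(hold_seconds)  # hold the lock; sleeping itself costs no CPU
    lock.release()
    thread.join()
    return time.process_time() - started
```

Comparing `cpu_seconds_while_waiting(spin=False)` with `cpu_seconds_while_waiting(spin=True)` should show the blocked waiter using very little CPU and the spinning waiter using far more, even though both wait the same wall-clock time. That is why CPU usage alone cannot tell you whether contention is present.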