Impact of memory allocators on MySQL performance
Alexey Stroganov
MySQL server uses dynamic memory allocation intensively, so a good choice of memory allocator is important for proper utilization of CPU and RAM resources. An efficient memory allocator should help improve scalability, increase throughput and keep the memory footprint under control. In this post I check the impact of several memory allocators on the performance and scalability of MySQL server in read-only workloads.
For my testing I chose the following allocators: lockless, jemalloc-2.2.5, jemalloc-3.0, tcmalloc (gperftools-2.0), glibc-2.12.1 (new malloc) (CentOS 6.2), glibc-2.13 (old malloc), glibc-2.13 (new malloc), glibc-2.15 (new malloc).
Let me clarify a bit about malloc in glibc. Starting with glibc-2.10, it had two malloc implementations that one could choose between with the configure option --enable-experimental-malloc. (You can find details about the new malloc here.) Many distros switched to this new malloc in 2009. In my experience this new malloc did not always behave efficiently with MySQL, so I decided to include the old one in the comparison as well. I used glibc-2.13 for that purpose because the --enable-experimental-malloc option was later removed from the glibc sources.
I built all allocators from source (except the system glibc 2.12.1) with the stock CentOS gcc (version 4.4.6 20110731). All of them were built with -O3. I used LD_PRELOAD for lockless, jemalloc-2.2.5, jemalloc-3.0 and tcmalloc, and for glibc I prefixed mysqld with:
/[path]/glibc-[X].root/lib/ld-[X].so --library-path /[path]/glibc-[X].root/lib:/lib64:/usr/lib64
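The two launch methods described above can be sketched as a small helper that builds the environment and argument vector for mysqld. This is only an illustration, not the script used for the tests: the function name `launch_spec` and all of its parameters are hypothetical, and `glibc_root` / `glibc_version` stand in for the elided `[path]` and `[X]` parts of the prefix.

```python
import os


def launch_spec(mysqld_argv, preload_lib=None, glibc_root=None,
                glibc_version=None, base_env=None):
    """Return (env, argv) for starting mysqld under a chosen allocator.

    Hypothetical helper. preload_lib is the allocator .so for LD_PRELOAD;
    glibc_root and glibc_version stand for the glibc-[X].root directory
    and the [X] version from the loader prefix shown above.
    """
    env = dict(os.environ if base_env is None else base_env)
    argv = list(mysqld_argv)
    if preload_lib is not None:
        # lockless, jemalloc and tcmalloc are injected via LD_PRELOAD.
        env["LD_PRELOAD"] = preload_lib
    if glibc_root is not None:
        # A custom glibc is used by running mysqld through its own loader.
        lib_dir = f"{glibc_root}/lib"
        loader = f"{lib_dir}/ld-{glibc_version}.so"
        argv = [loader, "--library-path",
                f"{lib_dir}:/lib64:/usr/lib64"] + argv
    return env, argv
```

The returned pair could then be handed to something like `subprocess.Popen(argv, env=env)`.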
Testing details:
- Cisco UCS C250 box
- Percona Server 5.5.24
- 2 read-only scenarios: OLTP_RO and POINT_SELECT from the latest sysbench-0.5
- dataset consists of 4 sysbench tables (50M rows each), ~50G of data / CPU-bound case
- For every malloc allocator perform the following steps:
  - start Percona Server either with LD_PRELOAD=[allocator_lib.so] or with the glibc prefix (see above), then get the RSS/VSZ size of mysqld
  - warm up with ‘select avg(id) from sbtest$i FORCE KEY (PRIMARY)’ and then OLTP_RO for 600 sec
  - run the OLTP_RO/POINT_SELECT test cases with a duration of 300 sec, varying the number of threads: 8/64/128/256/512/1024/1536
  - stop the server, then get the RSS/VSZ size of mysqld
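To get a feel for the size of this test matrix, the steps above can be written out as a run plan. This is a rough sketch under stated assumptions: the names `run_plan` and `timed_seconds` are hypothetical, and the 300 sec duration is assumed to apply to every (test, thread count) combination. The time spent on the ‘select avg(id)’ scan and on server restarts is not counted.

```python
from itertools import product

ALLOCATORS = [
    "lockless", "jemalloc-2.2.5", "jemalloc-3.0", "tcmalloc",
    "glibc-2.12.1 (new malloc)", "glibc-2.13 (old malloc)",
    "glibc-2.13 (new malloc)", "glibc-2.15 (new malloc)",
]
TESTS = ["OLTP_RO", "POINT_SELECT"]
THREADS = [8, 64, 128, 256, 512, 1024, 1536]
WARMUP_SEC = 600  # OLTP_RO warmup, once per allocator
RUN_SEC = 300     # assumed to apply to each (test, threads) run


def run_plan(allocator):
    """Ordered list of (allocator, test, threads, seconds) measured runs."""
    return [(allocator, test, threads, RUN_SEC)
            for test, threads in product(TESTS, THREADS)]


def timed_seconds(allocators=ALLOCATORS):
    """Lower bound on wall-clock time: warmup plus measured runs only."""
    return sum(WARMUP_SEC + sum(run[3] for run in run_plan(a))
               for a in allocators)
```

Under these assumptions each allocator needs 14 measured runs, or 80 minutes including the OLTP_RO warmup.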
The best throughput and scalability come from lockless, jemalloc-3.0 and tcmalloc. jemalloc-2.2.5 drops slightly at higher thread counts. On the response time graph (see below) it shows spikes that may be caused by some contention in the library. All glibc variants based on the new malloc show notable drops as concurrency increases, with throughput falling by almost a factor of two at high thread counts. At the same time, glibc-2.13 built with the old malloc looks good; its results are very similar to those of lockless, jemalloc-3.0 and tcmalloc.
For the POINT_SELECT test, two allocators handle the load very well as concurrency increases: tcmalloc and, only slightly behind, glibc-2.13 with the old malloc. Then come jemalloc-3.0, lockless and jemalloc-2.2.5, and last are the glibc allocators based on the new malloc. Along with the best throughput and scalability, runs with tcmalloc also demonstrate the best response time (30-50 ms at high thread counts).
Besides throughput and latency there is one more factor that should be taken into account – memory footprint.
| memory allocator | mysqld RSS size growth (kbytes) | mysqld VSZ size growth (kbytes) |
Only two allocators, lockless and glibc-2.15 with the new malloc, notably increased the RSS memory footprint of the mysqld server, by more than 5G. Memory usage for the other allocators looks more or less acceptable.
Taking into account all three factors (throughput, latency and memory usage), the most suitable allocators for the above POINT_SELECT/OLTP_RO type of workloads are tcmalloc, jemalloc-3.0 and glibc-2.13 with the old malloc.
An important point to take away is that a new glibc with the new malloc implementation may NOT be suitable and may show worse results than on older platforms.
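The growth figures are simply the difference between the mysqld RSS/VSZ sizes taken after the server start and before the server stop. A minimal sketch of that arithmetic, assuming the sizes come from `ps` output with `pid,rss,vsz` as the first three columns (which `ps` reports in kilobytes); the function names are hypothetical and any numbers fed to it here would be made up, not measured:

```python
def parse_ps(text):
    """Parse `ps -eo pid,rss,vsz,...` style output into {pid: (rss_kb, vsz_kb)}."""
    sizes = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # skip the header row and short lines
        pid, rss_kb, vsz_kb = (int(f) for f in fields[:3])
        sizes[pid] = (rss_kb, vsz_kb)
    return sizes


def growth_kb(before, after):
    """Return (rss_growth_kb, vsz_growth_kb) between two (rss, vsz) samples."""
    return after[0] - before[0], after[1] - before[1]


def kb_to_gib(kb):
    """Convert kilobytes as reported by ps into GiB."""
    return kb / (1024 * 1024)
```

With this, an RSS growth of 5242880 kbytes corresponds to `kb_to_gib(5242880)`, that is 5 GiB.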
To cover some questions raised in the comments, I reran the OLTP_RO/POINT_SELECT tests with jemalloc-2.2.5, jemalloc-3.0 and tcmalloc, varied /sys/kernel/mm/transparent_hugepage/enabled (always|never), and gathered the mysqld size with ‘ps --sort=-rss -eopid,rss,vsz,pcpu’ during the test run. As a reminder, the whole test run cycle looks like this: start server, warmup, OLTP_RO test, POINT_SELECT test. So the charts below show how the mysqld footprint changes during the test cycle and what the impact of disabling hugepages is.
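When comparing runs like these it helps to record which transparent hugepage mode was actually active. The kernel lists the available values in that sysfs file and puts the current one in square brackets, for example `[always] madvise never`. A small sketch of reading it back, with hypothetical function names:

```python
THP_PATH = "/sys/kernel/mm/transparent_hugepage/enabled"


def active_thp_mode(contents):
    """Return the bracketed (active) mode from the sysfs file contents."""
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {contents!r}")


def read_thp_mode(path=THP_PATH):
    """Read the sysfs file and return the currently active mode."""
    with open(path) as f:
        return active_thp_mode(f.read())
```

Changing the mode means writing `always` or `never` into the same file, which normally requires root.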
You can read Part 2 of this topic here.