MySQL and Percona Server in LinkBench benchmarkAlexey Stroganov
Around month ago Facebook has announced the Linkbench benchmark that models the social graph OLTP workload. Sources, along with a very nice description of how to setup and run this benchmark, can be found here. We decided to run this benchmark for MySQL Server 5.5.30, 5.6.11 and Percona Server 5.5.30 and check how these servers will handle such OLTP workloads in the CPU and IO-bound cases. For this test we used a PowerEdge R720 box with a fast PCI-e flash card as storage.
By default linkbench dataset has 10M ids(after load of data size of datadir ~10GB). We used this dataset to check server behavior when data fully fits buffer pool(size of buffer pool is 30GB). So basically this is a CPU-bound case.
As you can see there is a very slight difference between servers at 64 threads but a much more notable drop for 5.6.11 at 128 threads.
Then we loaded 10x dataset – 100M ids (size of datadir ~100GB), size of the buffer pool is the same – 30GB. So now we explore the IO-bound scenario.
Percona Server 5.5 outperforms MySQL in about 2x times.
Both MySQL 5.5.30 and MySQL 5.6.11 demonstrate notable drops in performance. What can be the reason for that?
Below is a chart with top mutexes for each server at 64 threads:
For MySQL 5.5.30 top mutex is &doublewrite->mutex: trx0sys.c:196. And most likely this symptom is related to BUG#67808.
For MySQL 5.6.11 top mutexes is &buf_pool->mutex,&new_index->lock. I profiled 5.6.11 in this IO bound scenario with the perf – see profile below:
# Overhead Samples Command Shared Object
# ........ .......... ....... ................... ..................................................................................................................................
35.85% 17738833 mysqld mysqld [.] buf_LRU_free_block(buf_page_t*, unsigned long)
--- buf_LRU_free_block(buf_page_t*, unsigned long)
|--99.94%-- buf_LRU_scan_and_free_block(buf_pool_t*, unsigned long)
| |--94.84%-- buf_page_init_for_read(dberr_t*, unsigned long, unsigned long, unsigned long, unsigned long, long, unsigned long)
31.41% 15534570 mysqld mysqld [.] rw_lock_x_lock_func(rw_lock_t*, unsigned long, char const*, unsigned long)
--- rw_lock_x_lock_func(rw_lock_t*, unsigned long, char const*, unsigned long)
|--99.14%-- buf_LRU_free_block(buf_page_t*, unsigned long)
| |--100.00%-- buf_LRU_scan_and_free_block(buf_pool_t*, unsigned long)
| | buf_LRU_get_free_block(buf_pool_t*)
2.53% 1338484 mysqld mysqld [.] ut_delay(unsigned long)
So basically most of the time 5.6.11 spent in LRU_scan. I tried to increase innodb_lru_scan_depth variable to 8k,16k,32k but that had no notably impact on result. In the best possible combination I got ~15k operations per second for MySQL 5.6.11.
In CPU-bounds case MySQL performs quite well, though we can see small performance drop in MySQL 5.6.
In IO-bound cases MySQL still has performance issues around mutexes and Percona Server shows much better results
Configurations and how to run benchmark:
innodb_buffer_pool_size = 30G
innodb_flush_method = O_DIRECT
innodb_log_file_size = 2000M
innodb_log_files_in_group = 2
innodb_flush_log_at_trx_commit = 1
innodb_read_io_threads = 8
innodb_write_io_threads = 8
innodb_io_capacity = 5000
innodb_buffer_pool_instances=16 # MySQL 5.5 and 5.6
loose-innodb_io_capacity_max = 15000
command line to load 10x dataset:
./bin/linkbench -D dbid=linkdb -D host=127.0.0.1 -D user=root -D port=3306 -D password= -D maxid1=100000001 -c config/MyConfig.properties -l
command line to run test for 10x dataset:
./bin/linkbench -D requesters=64 -D dbid=linkdb -D host=127.0.0.1 -D user=root -D port=3306 -D password= -D maxid1=100000001 -c config/MyConfig.properties -csvstats final-stats.csv -csvstream streaming-stats.csv -D requests=5000000 -D maxtime=1800 -r