MySQL 5.6 vs MySQL 5.5 and the Star Schema Benchmark

MySQL 5.6 vs MySQL 5.5 and the Star Schema Benchmark

MySQL 5.6 vs MySQL 5.5 & the Star Schema Benchmark

So far most of the benchmarks posted about MySQL 5.6 use the sysbench OLTP workload.  I wanted to test a set of queries which, unlike sysbench, utilize joins.  I also wanted an easily reproducible set of data which is more rich than the simple sysbench table.  The Star Schema Benchmark (SSB) seems ideal for this.

I wasn’t going to focus on the performance of individual queries in this post, but instead intended to focus only on the overall response time for answering all of the queries in the benchmark. I got some strange results, however, which showed MySQL 5.6.10 to be much slower than MySQL 5.5.30 even with only a single connection. I felt these results warranted deeper investigation, so I did some research and detailed my findings here.

Just a few notes:
I tested two scenarios: a buffer pool much smaller than the data set (default size of 128MB, which is 1/8th of the data) and I also testing a 4G buffer pool, which is larger than the data. Very little tuning was done. The goal was to see how MySQL 5.6 performs out-of-the-box as compared to 5.5.30 with default settings. The non-default settings were tried to dig deeper into performance differences and are documented in the post.

This blog post is not a definitive conclusion about innodb_old_blocks_pct or innodb_old_blocks_time. It does highlight how a data set much larger than the buffer pool may perform worse with innodb_old_blocks_time=1000, but as I said this needs further investigation. One particular point of investigation which needs to be followed up on, including testing innodb_old_blocks_time=1000 on MySQL 5.5.30 and testing multiple buffer pools on MySQL 5.5.30. Finally, MySQL 5.6.10 has many additional tuning options which must be investigated (MRR, BKA, ICP, etc) before coming to further conclusions. These will be the topic of further blog posts.

Benchmark Details:

The SSB employs a data generator which produces data for a star schema.  Star schema are commonly used for analytics because it is extremely easy to construct queries against.  It is also very easy to define an OLAP cube over a star schema, so they are popular for use with tools like Mondrian and also for data mining.  I wrote an earlier blog post which describes the differences between major schema types.

  • I used the SSB data set at scale factor 1.  Scale factor 1 results in 587MB of raw data, mostly in one  table (lineorder).
  • Each of the 13 queries were executed serially in a single connection
  • I modified the queries to use ANSI JOIN syntax.  No other changes to the queries were made.

Test Environment

  • The MySQL versions used at the time of this post are 5.5.30 and 5.6.10, each of which are GA when this was written.
    • I compiled both servers from source (cmake -gui .; make; make install)
    • Only changes from defaults was that both servers are compiled without the PERFORMANCE_SCHEMA, and paths are unique for basedir and datadir
  • I tested three configurations:
    • Config 1: Default config for MySQL 5.5 and MySQL 5.6, no tuning at all
    • Config 2: MySQL 5.6 with all default settings except innodb_old_blocks_time=0
    • Config 3: MySQL 5.5 and 5.6 with a 4G buffer pool instead of the default 128M


  • Since O_DIRECT is not used by default, the file system cache will give better read performance after first run (but not as good as warm buffer pool)
  • Thus, the results marked COLD are the results after the server reboot, when the FS cache is cold
  • The remaining results are runs without a server restart.  For the default size BP, this means the FS cache is warm.  For the 4G BP, the BP is completely warm.
    • The idea here is to test the situation when the buffer pool is smaller than data and the IO is slow (when the  FS cache is cold, IO to slow IO subsystem happens)
    • Repeated runs test a buffer pool which is smaller than the data but underlying IO is fast (a warm FS cache reduces IO cost significantly)
    • And finally, testing with a 4G buffer pool shows how the system performs when the data fits completely into the buffer pool (no IO on repeat runs)

Test Server:

    • Intel core i970-3.20GHz.  12 logical cores (six physical cores).
    • 12GB memory
    • 4 disk 7200RPM RAID 10 array with 512MB write-back cache

 Star Schema Benchmark – Scale Factor 1 – Mysql 5.5 vs 5.6
response times are in seconds (lower is better)

Version Buffer Cold Run1 Run2 Run3
5.5.30 128M 361.49 189.29 189.34 189.40
5.6.10 128M 362.31 324.25 320.74 318.84
5.6.10 (innodb_old_blocks_time=0) 128M 349.24 178.80 178.55 179.07
5.5.30 4G 200.87 20.53 20.36 20.35
5.6.10 4G 195.33 14.41 14.45 14.61

I started by running the benchmark against MySQL 5.5.30.  It took 361.49 seconds to complete all 13 queries.  I then repeated the run three more times.  The speed is very consistent, just a few tenths of a second off per run.  I then rebooted the machine and fired up 5.6.10.   I ran the test, and to my surprise MySQL 5.6.10 did not get much faster during the repeat runs, compared to the initial cold run.  I stopped the MySQL 5.6 server, rebooted and verified again.  Same issue.  This was very different from MySQL 5.5.30, which performs significantly better on the repeat warm runs.

Just to be sure it wasn’t a disk problem, I pointed the MySQL 5.6.10 at the MySQL 5.5.30 data directory.  Tthe speed was essentially the same.   I did some further investigation and I determined that there was a lower buffer pool hit ratio during the MySQL 5.6 runs and MySQL 5.6.10 was doing more IO as a consequence.  To confirm that this was indeed the problem I decided to compare performance with a buffer pool much larger than the data size, so I configured the server with a 4GB buffer pool.  I tested both versions, and as you can see above, MySQL 5.6 outperformed MySQL 5.5.30 with the big buffer pool.

Why is the MySQL 5.6.10 with default settings test significantly slower than MySQL 5.5.30 in repeat runs?

I thought about the differences in the defaults between MySQL 5.5 and MySQL 5.6 and innodb_old_blocks_time immediately came to mind.  The InnoDB plugin introduced innodb_old_blocks_time to help control the behavior of the new split LRU mechanism which was implemented in the plugin.  In the original InnoDB, the LRU was implemented as a classic LRU which is subject to “pollution” by full table scans.  In the classic LRU, a full table scan pushes out important hot pages from the buffer pool often for an infrequent scan, like a backup or report.  In an OLTP system this can have very negative performance consequences.

The plugin attempts to fix this problem by splitting the LRU into hot and cold sections.  When a page is first read into the buffer pool it is first placed onto the head of the cold section of the LRU, where it begins to age of naturally.  If the page is touched again while on the cold portion, it is moved to the head of the hot portion.

This sounds good in theory, but in practice it is problematic.  What usually happens is that the full table scans access the table by primary key.  This forces the storage engine to touch the same page numerous times in rapid succession.  This invariably moves the page onto the hot area, defeating the split.  In order to prevent this from happening, another variable innodb_old_blocks_time was introduced.

Innodb_old_blocks_time controls how long a page must be on the cold portion of the LRU before it is eligible to be moved to the hot portion.  In MySQL 5.5 and earlier, innodb_old_blocks_time defaults to a value of 0(zero), which means that pages move rapidly from the cold portion to the hot portion because they must stay on the cold LRU for zero milliseconds before being able to move to the hot list.  In MySQL 5.6 the default value of innodb_old_blocks_time is changed to 1000.   The location at which a page is initially placed into the LRU is defined by innodb_old_blocks_pct.  The default value on both versions is 38, which happens to be 3/8 of the buffer pool.

For this workload with a small buffer pool (the buffer pool is smaller than the working set) having innodb_old_blocks_time=1000 appears to cause a major performance regression.  The new setting  changes which pages end up staying in the buffer pool, and which are aged out.

Digging into why innodb_old_blocks_time change the performance?

Each “flight” of queries represents a set of drill-down queries to find an anomaly.  I am going to focus on the first query, which uses only one join. Since it is practical for a query with only one join, I’ve tested performance of the query with the join in both directions.
Explain for query Q1.1:

After running the query, see how many pages were read from disk versus how many page requests their were: