EmergencyEMERGENCY? Get 24/7 Help Now!

Maximal write througput in MySQL

 | February 28, 2010 |  Posted In: Benchmarks, MySQL

PREVIOUS POST
NEXT POST

I recently was asked what maximal amount transactions per second we can get using MySQL and XtraDB / InnoDB storage engine if we have high-end server. Good questions, though not easy to answer, as it depends on:

– durability setting ( innodb_flush_log_at_trx_commit = 0 or 1 ) ?
– do we use binary logs ( I used ROW based replication for 5.1)
– do we have sync_binlog options.

So why would not take these as variable parameters and run simple benchmark.
I took sysbench update_key scenario ( update indexed field on simple table)
and used Dell PowerEdge R900 with 16 cores, FusionIO as storage for table and RAID 10 with BBU as storage for innodb log files, innodb system table space and binary logs. And I used Percon-XtraDB-5.1.43-9.1 for benchmarks. All used partitions are formatted in XFS and mounted with nobarrier option.

I run update key for various threads and with next parameters

  • trx_commit=0 : innodb_flush_log_at_trx_commit = 0 and no binary logs
  • trx_commit=1 : innodb_flush_log_at_trx_commit = 1 and no binary logs
  • trx_commit=0 & binlog : innodb_flush_log_at_trx_commit = 0 and binary logs
  • trx_commit=1 & binlog : innodb_flush_log_at_trx_commit = 1 and binary logs
  • trx_commit=1 & binlog & sync_bin : innodb_flush_log_at_trx_commit = 1 and binary logs and sync_binlog=1
  • There are results I get:

    I found results being quite interesting.
    with innodb_flush_log_at_trx_commit = 0 maximal tps is 36332.02 tps, which drops to 23115.04 tps as
    we switch to innodb_flush_log_at_trx_commit = 1. As we use RAID10 with BBU, I did not expect the drops is going to be significant. In second case InnoDB spends

    With enabling binary logs, the results drops to 17451.01 tps with innodb_flush_log_at_trx_commit = 0 and to 12097.39 tps with innodb_flush_log_at_trx_commit = 1. So with binary logs serialization is getting even worse.

    Enabling sync_binlog makes things really bad, and maximal results I have is
    3086.7 tps. So this is good decision if binary log protection is worth such drop.

    UPDATE ( 3/4/2010 )

    Results with innodb_flush_log_at_trx_commit = 2

    Results with innodb_flush_log_at_trx_commit = 2 and binlogs

    PREVIOUS POST
    NEXT POST
    Vadim Tkachenko

    Vadim Tkachenko co-founded Percona in 2006 and serves as its Chief Technology Officer. Vadim leads Percona Labs, which focuses on technology research and performance evaluations of Percona’s and third-party products. Percona Labs designs no-gimmick tests of hardware, filesystems, storage engines, and databases that surpass the standard performance and functionality scenario benchmarks. Vadim’s expertise in LAMP performance and multi-threaded programming help optimize MySQL and InnoDB internals to take full advantage of modern hardware. Oracle Corporation and its predecessors have incorporated Vadim’s source code patches into the mainstream MySQL and InnoDB products. He also co-authored the book High Performance MySQL: Optimization, Backups, and Replication 3rd Edition.

    11 Comments

    • What about innodb_flush_log_at_trx_commit=2 ? That’s a common optimization technique used for less risk (although it’s still technically not 100% ACID compliant, it’s consistent/durable to within a second, give or take depending on the filesystem’s journaling, etc.)

    • Hi,

      Few questions :
      – should innodb_flush_log_at_trx_commit = 2 change anything on this ?
      – if you use replication, by default tutorials put you on the green curve. Would you advise moving to “trx_commit=1 & binlog” or even “trx_commit=0 & binlog” ?

      Thanks,
      Nicolas

    • Hi Vadim! Really cool article!

      BTW, I was wondering how does XtraDB and the like behaves over different RAID-10 stripe sizes and the fact that actually a lot of Linux distros don’t have their default partition boundaries aligned on raid stripe sizes (due to 63 sector offset from a msdos partition table). Have you ever considered (or have any plans) in making those tests? It would be great to see how XtraDB behaves on unaligned vs aligned systems.

      I’ve found some interesting links:
      http://www.linux.com/archive/articles/140734
      http://www.linuxfoundation.org/collaborate/workgroups/linux-raid/raid_setup
      http://kbase.redhat.com/faq/docs/DOC-2893 { unfortunately didn’t find anything like this for XFS 🙁 }

      Best regards,
      Raine

    • 3086.7 tps. looks pretty reasonable. Clearly, synchronous IO is bootlenecked on the raid controller/disk. I am getting similar performance with a raid-10 with a simple dd command:

      time dd if=/dev/zero of=a.dat oflag=sync count=100000
      100000+0 records in
      100000+0 records out
      51200000 bytes (51 MB) copied, 27.9191 s, 1.8 MB/s

      real 0m27.931s
      user 0m0.047s
      sys 0m3.906s

      About 3.5K writes per sec.

    • Interesting results.

      1) The huge drop in performance (sometimes over 50%) due to binlog is a surprise. According to the MySQL Manual “Running a server with binary logging enabled makes performance about 1% slower.” (http://dev.mysql.com/doc/refman/5.1/en/binary-log.html)

      What caused the huge discrepancy between the 50% you measured and the 1% according to MySQL manual?

      2) If the binlog IO is the bottleneck, would putting binlog on FusionIO (instead of RAID 10 HDD) speed things up?

      3) What about innodb_support_xa? Will enabling that cause a further significant drop in performance?

    • Vadim,

      What filesystem did you use here ? I remember EXT3 performs surprisingly bad with sync_binlog=1

      http://www.mysqlperformanceblog.com/2009/01/21/beware-ext3-and-sync-binlog-do-not-play-well-together/

      In any case it looks like sync_binlog has very bad overhead. in innodb_flush_log_at_trx comit=1 you should have 2 fsync per commit in single client (because of XA) with binlog it should be 3 as I understand – the fact performance gap is much larger is different.

      Does group commit works on stage of writiting to binary log or is it only on transactional log commit ?

    • I have been wondering about Power7 / IBM architecture and so-called “high-end” server testing for MySQL.

      Obviously, the Power7 architecture is pound-for-pound more powerful than any Intel platform — Sparc isn’t even a consideration.

      I’d love to see the same tests run on the new 750 series Power7 CPU’s and hardware from IBM.

      My guess is that we’d see another full CLASS of hardware / performance benchmarks unlike any yet seen….

    • Important thing here I’m not sure of.

      CPU doesn’t really matter here much actually.

      Do you have your write cache on your raid controller set to Write through or Write backed?

      Seems too slow for some of the tests you did for the hardware you have this is why I’m asking.

      With write backed you don’t usually see that much difference if raid controller is properly configured and working, it’s nearly like writing to memory rather than disk.

      Was your BBU in re-learn mode at the time?
      If so how much write back cache does your raid card have

    Leave a Reply

 
 

Percona’s widely read Percona Data Performance blog highlights our expertise in enterprise-class software, support, consulting and managed services solutions for both MySQL® and MongoDB® across traditional and cloud-based platforms. The decades of experience represented by our consultants is found daily in numerous and relevant blog posts.

Besides specific database help, the blog also provides notices on upcoming events and webinars.
Want to get weekly updates listing the latest blog posts? Subscribe to our blog now! Submit your email address below and we’ll send you an update every Friday at 1pm ET.

No, thank you. Please do not ask me again.