Evaluating Performance Improvements in MariaDB 10.5.5

Performance Improvements in MariaDB 10.5.5Recently, I published a series of posts on MySQL and MariaDB, where MariaDB 10.5.4 does not necessarily perform well in some scenarios:

In comments left on How MySQL and MariaDB Perform on NVMe Storage, I was told that MariaDB 10.5.4 is not the best version, as there are already known performance bugs that will be fixed in MariaDB 10.5.5, and it would be better to test MariaDB 10.5.5. And now, MariaDB 10.5.5 comes with fixes:

I made a promise that I will test the new version, so this is me fulfilling my promise.

Instead of posting multiple posts, let’s review all scenarios at once. Just to refresh the material, I’ve tested sysbench-tpcc 1000W, which is about 100GB in size in the following setups:

  • Data is located on SATA SSD, innodb_buffer_pool_size=140GB (CPU-bound scenario)
  • Data is located on NVMe, innodb_buffer_pool_size=140GB (CPU-bound scenario)
  • Data is located on SATA SSD, innodb_buffer_pool_size=25GB (IO-bound scenario)
  • Data is located on NVMe, innodb_buffer_pool_size=25GB (IO-bound scenario)

Data is located on SATA SSD, innodb_buffer_pool_size=140GB (CPU-bound scenario)

(The original results are here – Checkpointing in MySQL and MariaDB)

For this re-test round, I compare only MariaDB 10.5.4 vs MariaDB 10.5.5:

MariaDB 10.5.4 vs MariaDB 10.5.5

After the difference in the warm-up behavior, the result is about the same, and we can confirm this with boxplots for the last 2500 sec:

MariaDB 10.5.4 vs MariaDB 10.5.5 2500 sec

Data is located on SATA SSD, innodb_buffer_pool_size=25GB (IO-bound scenario)

This one is interesting, and actually this is where we can see improvements in MariaDB 10.5.5.

improvements in MariaDB 10.5.5

To see individual variations in details, let’s separate charts:

And then again compare the throughput for the last 2500 sec:

compare the throughput for the last 2500 sec

What’s interesting is not only has MariaDB 10.5.5 improved the results, but it also shows a better average throughput than MySQL 8.0.21, though with a higher variation.

Data is located on NVMe, innodb_buffer_pool_size=140GB (CPU-bound scenario)

There we still see a weird initial drop followed by recovery, and the throughput is similar to MariaDB 10.5.4, but something a little worse.

Data is located on NVMe, innodb_buffer_pool_size=25GB (IO-bound scenario)

throughput is similar to MariaDB 10.5.4

This case is an absolute disaster for MariaDB 10.5.5; there is bug https://jira.mariadb.org/browse/MDEV-23399. Most of the time MariaDB 10.5.5 shows the throughput under 100 tps (while MySQL averages above 5000 tps).

I personally think this is a catastrophic bug and MariaDB 10.5.5 with this performance should never have been released to the public, but instead, stay in the internal testing stage.

Share this post

Comments (8)

  • Federico Razzoli Reply

    Apparently they solved most of the problems you pointed out and they slightly beat MySQL in those areas. The last test is still a disaster.

    There is clearly a problem here. MariaDB 10.5.4 was the first 10.4 stable release, but it included important changes to InnoDB. They should have waited to declare is stable.

    This is comparable to MySQL that constantly introduces changes in GA versions, which at some point broke compatibility with Xtrabackup – fortunately Percona fixed the problem on Xtrabackup side relatively soon (congrats!), but I’d like to know how many environments had broken backups because of this…

    August 15, 2020 at 10:41 am
  • Marko Mäkelä Reply

    I suspect that what you call “weird initial drop followed by recovery” is the purge of history for INSERT.

    Since MDEV-12288 in MariaDB Server 10.3, the hidden DB_TRX_ID column will be reset to 0 when the row is visible to all readers. The motivation for this is to avoid unnecessary lookups of old transactions, speeding up non-locking reads and checks for implicit locks.

    Maybe you did not wait for the purge to run into completion before you started the benchmark?

    MDEV-515 in a future MariaDB release should improve this by allowing inserts into empty tables or partitions skip undo logging altogether. The initial data would be inserted with DB_TRX_ID=0 straight from the beginning.

    MDEV-23399 has been filed for the performance regression in write-heavy I/O bound workloads.

    August 15, 2020 at 3:42 pm
    • Mark Callaghan Reply

      Marko – that is an interesting optimization. Two questions:
      1) Is this the first time InnoDB purge is doing work after inserts?
      2) If purge falls behind and the pages with inserts are no longer in cache will purge threads then do reads from storage to bring them back? So there can be extra storage reads?
      3) Similar to #2, if purge is slow does this risk making a page dirty twice — thus being written back twice (more storage writes) where the first write follows the insert and the second write follows setting DB_TRX_ID to 0?

      August 15, 2020 at 8:23 pm
      • Mark Callaghan Reply

        And is there another way to set DB_TRX_ID=0 with less risk of IO? For example when a page is made dirty, then consider setting DB_TRX_ID=0 at that point (or immediately prior to writeback) because it is in cache and it is already dirty.

        August 15, 2020 at 10:17 pm
      • Marko Mäkelä Reply

        Mark,
        The answer to your first 3 questions is ‘yes’ and to the 4th ‘currently no’. The old idea of attempting to purge history on every visit to the page could be worth pursuing, but I do not think that it would help in this particular case. With per-record transaction identifier in secondary indexes (MDEV-17598), I think that such ‘pre-purge’ should be entirely feasible.

        Currently, to ensure that the database is in a stable and clean state after loading data, SET GLOBAL innodb_fast_shutdown=0 should be executed before initiating shutdown. In MariaDB Server 10.5, the slow shutdown should complete faster than in earlier releases.

        August 16, 2020 at 10:27 am
    • vadimtk Reply

      Marco,

      My procedure is following:
      1. Load data (I do not wait for any internal processes to finish, just the application to finish its queries)
      2. Stop database
      3. Make backup

      The benchmark procedure is:
      1. Restore from backup
      2. Run benchmark

      August 16, 2020 at 6:38 am
      • Hussein Nasser Reply

        It would be interesting to test Marko’s theory and wait for all internal processes to finish before backing up the database. I think the database might have been backed up in a state with all those historical transactions that are no longer necessary since all the rows already visible to all readers anyway.

        I wonder if you will start seeing better performance on both mySQL & MariaDB since there won’t be additional lookups for the old transactions and cost of locking checks. (This is according to Marko’s comment)

        Thanks Vadim for these great articles I learned a lot! I report on your work on my YouTube channel and will be interesting to see the new findings.

        August 18, 2020 at 12:37 pm
  • Pavel Katiushyn Reply

    Hm, I was thinking that each release should pass numerous tests before going live. I am sure it will be quickly fixed, but still thank you Vadim for pointing this issues out.

    August 18, 2020 at 8:27 am

Leave a Reply