Faster Node Rejoins with Improved IST performance

Faster Node Rejoins with Improved IST performance

PREVIOUS POST
NEXT POST

In this blog, we’ll look at how improvements to Percona XtraDB Cluster improved IST performance.

Introduction

Starting in version 5.7.17-29.20 of Percona XtraDB Cluster significantly improved performance. Depending on the workload, the increase in throughput is in the range of 3-10x. (More details here). These optimization fixes also helped improve IST (Incremental State Transfer) performance. This blog is aimed at studying the IST impact.

IST

IST stands for incremental state transfer. When a node of the cluster leaves the cluster for a short period of time and then rejoins the cluster it needs to catch-up with cluster state. As part of this sync process existing node of the cluster (aka DONOR) donates missing write-sets to rejoining node (aka JOINER). In short, flow involves, applying missing write-sets on JOINER as it does during active workload replication.

Percona XtraDB Cluster / Galera already can apply write-sets in parallel using multiple applier threads. Unfortunately, due to commit contention, the commit action was serialized. This was fixed in the above Percona XtraDB Cluster release, allowing commits to proceed in parallel.

IST uses the same path for applying write-sets, except that it is more like a batch operation.

IST Performance

Let’s look at IST performance before and now.

Setup

  1. Two node cluster (node-1 and node-2) and gcache is configured large enough to avoid purging as we need IST
  2. Start workload against node-1 for 30 seconds
  3. Shutdown node-2
  4. Start workload that performs 4M requests against node-1. Workload produces ~3.5M write-sets that are cached in gcache and used later for IST
  5. Start node-2 with N-applier threads
  6. Wait until IST is done
  7. ….. repeat steps 3-6 with different values of N.

Observations:

  • IST is 4x faster with PXC-5.7.17 (compared to previous releases)
  • Improved performance means a quicker node rejoin, and an overall increase in cluster productivity as joiner node is available to process the workload more quickly

Conclusion

Percona XtraDB Cluster 5.7.17 significantly improved IST performance. A faster re-join of the node effectively means better cluster productivity and flexibility in planning maintenance window. So what are you waiting for? Upgrade to Percona XtraDB Cluster 5.7.17 or latest Percona XtraDB Cluster 5.7 release and experience the power!

PREVIOUS POST
NEXT POST

Share this post

Comments (3)

  • Fernando Reply

    Can you show galera.conf variables used by this test? SST section in particular. Thanks

    July 26, 2017 at 3:54 pm
  • Rick James Reply

    Is this fix yet available through Codership or MariaDB?

    September 21, 2017 at 11:53 pm
    • Krunal Bauskar Reply

      Currently it is available only as part of PXC.

      September 24, 2017 at 11:39 pm

Leave a Reply