
Galera Cache (gcache) is finally recoverable on restart

November 30, 2016 | Posted In: MySQL, Percona XtraDB Cluster, XtraDB Cluster


This post describes how to recover Galera Cache (or gcache) on restart.

Codership recently introduced (with Galera 3.19) an important and long-awaited feature: users can now recover the Galera cache on restart.

Need

If you gracefully shut down cluster nodes one after another, with some lag time between nodes, then the last node to shut down holds the latest data. The next time you restart the cluster, the last node shut down will be the first one to boot. Any follow-up nodes that join the cluster after the first node will demand an SST.

Why SST, when these nodes already have data and only a few write-sets are missing? The DONOR node caches missing write-sets in the Galera cache, but on restart this cache is wiped clean and restarted fresh. So the DONOR node doesn’t have a Galera cache from which to donate the missing write-sets.

This painful setup made it necessary for users to think and plan before gracefully taking down the cluster. With the introduction of this new feature, the user can retain the Galera cache.

How does this help?

On restart, the node revives the Galera cache. This means the node can act as a DONOR and serve the missing write-sets (facilitating IST instead of SST). Whether the Galera cache is retained is controlled by the option gcache.recover=yes/no. The default is no (the Galera cache is not retained). The user can set this option on all nodes or only on selected nodes, based on disk usage.
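In configuration terms, gcache.recover is one of the Galera provider options, which are passed to the server as a semicolon-separated list in wsrep_provider_options. As a minimal sketch, the Python helper below renders such a line for a my.cnf file. The function name and the gcache.size value are illustrative assumptions, not something taken from this post.

```python
def render_provider_options(options):
    """Render a my.cnf line for wsrep_provider_options from a dict of options."""
    pairs = "; ".join(f"{key}={value}" for key, value in options.items())
    return f'wsrep_provider_options="{pairs}"'


# gcache.size=1G is only an example value (an assumption);
# gcache.recover=yes is the option discussed in this post.
line = render_provider_options({"gcache.size": "1G", "gcache.recover": "yes"})
print(line)
```

Since the option matters at restart, the line has to be in the node's configuration before the node is started again.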

gcache.recover in action

The example below demonstrates how to use this option:

  • Let’s say the user has a three-node cluster (n1, n2, n3), with all nodes in sync.
  • The user gracefully shuts down n2 and n3.
  • n1 is still up and running and processes some workload, so n1 now has the latest data.
  • n1 is eventually shut down.
  • Now the user decides to restart the cluster. The user needs to start n1 first, followed by n2/n3.
  • n1 boots up, forming a new cluster.
  • n2 boots up, joins the cluster, finds there are missing write-sets and demands IST. But given that n1 doesn’t have a gcache, it falls back to SST.

n2 (JOINER node log):

n1 (DONOR node log), gcache.recover=no:

Now let’s re-execute this scenario with gcache.recover=yes.

n2 (JOINER node log):

n1 (DONOR node log):

You can also validate this by checking the lowest write-set available in gcache on the DONOR node.
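One way to perform this check is to read the wsrep_local_cached_downto status variable, which reports the lowest seqno held in gcache. The sketch below assumes a DB-API style cursor from whichever MySQL driver you use. The helper names, the stub cursor, and the simplified comparison rule are illustrative assumptions, not Galera's actual code. The joiner's position would typically come from the seqno recorded in its grastate.dat after a graceful shutdown.

```python
def lowest_cached_seqno(cursor):
    """Return the lowest seqno in the donor's gcache, or None if not reported."""
    cursor.execute("SHOW GLOBAL STATUS LIKE 'wsrep_local_cached_downto'")
    row = cursor.fetchone()  # expected shape: (Variable_name, Value)
    return int(row[1]) if row else None


def ist_looks_possible(joiner_seqno, donor_lowest_seqno):
    """Simplified rule (an assumption): the donor must still cache the
    write-set that directly follows the joiner's last applied seqno."""
    if donor_lowest_seqno is None:
        return False
    return donor_lowest_seqno <= joiner_seqno + 1


class StubCursor:
    """Stand-in for a real driver cursor so the sketch runs without a server."""

    def __init__(self, value):
        self._row = ("wsrep_local_cached_downto", str(value))

    def execute(self, query):
        self.query = query

    def fetchone(self):
        return self._row


donor_lowest = lowest_cached_seqno(StubCursor(101))
print(ist_looks_possible(100, donor_lowest))  # joiner stopped at seqno 100
```

How an empty cache is reported may differ between versions, so treat this as a rough check rather than a guarantee that IST will be chosen.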

With gcache.recover=yes, the cache is restored on restart and the DONOR can serve IST instead of SST. This is a major resource saver for most graceful shutdowns.

gcache revive doesn’t work if ...

... gcache pages are involved. Gcache pages are still removed on shutdown, and the write-sets cached in gcache up to that point are cleared as well.

Again, let’s look at an example:

  • Let’s assume the same configuration and workflow as mentioned above. We will just change the workload pattern.
  • n1, n2, n3 are in sync and an average-size workload is executed, such that the write-sets fit in the gcache. (seqno=1-x)
  • n2 and n3 are shut down.
  • n1 continues to operate and executes some average-size workload followed by a huge transaction that results in the creation of a gcache page. (1-x-a-b-c-h) [h represents the seqno of the huge transaction]
  • Now n1 is shut down. During shutdown, gcache pages are purged (irrespective of the gcache.keep_pages_size setting).
  • The purge ensures that all write-sets with a seqno smaller than that of the write-set residing in the gcache page are purged, too. This effectively means everything in (1-h) is removed, including (a,b,c).
  • On restart, even though n1 is able to revive the gcache, there is nothing to revive, as all the write-sets were purged.
  • When n2 boots up, it requests IST, but n1 can’t serve the missing write-sets (a,b,c,h). This causes SST to take place.
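
The difference between the plain restart and the gcache-page case can be illustrated with a toy model. The Python sketch below is only an approximation of the behaviour described in this post: the class, its methods, and the concrete seqno values are hypothetical and do not reflect Galera's real data structures.

```python
class ToyGcache:
    """Toy model of the donor's cache; not Galera's real implementation."""

    def __init__(self, recover):
        self.recover = recover  # models gcache.recover=yes/no
        self.ring = []          # seqnos held in the main gcache file
        self.paged = []         # seqnos that overflowed into gcache pages

    def write(self, seqno, fits_in_ring=True):
        (self.ring if fits_in_ring else self.paged).append(seqno)

    def shutdown(self):
        if self.paged:
            # Pages are removed, together with every write-set whose
            # seqno is smaller than the one stored in the page.
            cutoff = max(self.paged)
            self.ring = [s for s in self.ring if s > cutoff]
            self.paged = []

    def restart(self):
        if not self.recover:
            self.ring = []  # without recovery the cache starts empty

    def transfer_for(self, joiner_seqno):
        needed = joiner_seqno + 1  # first write-set the joiner is missing
        can_ist = bool(self.ring) and min(self.ring) <= needed
        return "IST" if can_ist else "SST"


def run(recover, huge_txn):
    n1 = ToyGcache(recover)
    for seqno in range(1, 11):    # workload while n1, n2, n3 are in sync
        n1.write(seqno)
    joiner_seqno = 10             # n2 is shut down at this point
    for seqno in range(11, 14):   # a, b, c executed on n1 alone
        n1.write(seqno)
    if huge_txn:
        n1.write(14, fits_in_ring=False)  # h lands in a gcache page
    n1.shutdown()
    n1.restart()
    return n1.transfer_for(joiner_seqno)


print(run(recover=False, huge_txn=False))
print(run(recover=True, huge_txn=False))
print(run(recover=True, huge_txn=True))
```

In this model, only the run with recovery enabled and no huge transaction ends in IST. The other two runs fall back to SST, matching the two failure cases discussed in this post.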

Summing it up

Needless to say, gcache.recover is a much-needed feature, given that it saves SST pain. (Thanks, Codership.) It would be good to see whether the feature can be optimized to work with gcache pages.

And yes, Percona XtraDB Cluster inherits this feature in its upcoming release.

Krunal Bauskar

Krunal is PXC lead at Percona. He is responsible for day-to-day PXC development, what goes into PXC, bug fixes, releases, etc. Before joining Percona, he used to work as part of the InnoDB team at MySQL/Oracle. He authored most of the temporary table revamp work, undo log truncate, atomic truncate and a lot of other features. In the past, he was associated with Yahoo! Labs, researching big data problems, and with a database startup that is now part of Teradata. His interests mainly include data management at any scale, which he has been practicing for more than a decade.

2 Comments

  • Hi Krunal,

    As per the documentation, a majority of nodes should be up for the cluster to function. So how is node 1 able to serve requests when the other two nodes are down?

  • You can have a cluster with a single node, too. When you first boot a node, you have a cluster with a single node.
    If you have a two-node cluster and node-2 leaves the cluster gracefully (user shutdown), that is not treated as split-brain, because before going off, node-2 communicates its graceful shutdown status to node-1.
