Synchronous Replication in MySQL: Galera vs. Group Replication

Group ReplicationUPDATE: Some of the language in the original post was considered overly-critical of Oracle by some community members. This was not my intent, and I’ve modified the language to be less so. I’ve also changed the term “synchronous” (which the use of is inaccurate and misleading) to “virtually synchronous.” This term is more accurate and already used by both technologies’ founders, and should be less misleading.

I also wanted to thank Jean-François Gagné for pointing out the incorrect sentence about multi-threaded slaves in Group Replication, which I also corrected accordingly.

In today’s blog post, I will briefly compare two major virtually synchronous replication technologies available today for MySQL.

More Than Asynchronous Replication

Thanks to the Galera plugin, founded by the Codership team, we’ve had the choice between asynchronous and virtually synchronous replication in the MySQL ecosystem for quite a few years already. Moreover, we can choose between at least three software providers: Codership, MariaDB and Percona, each with its own Galera implementation.

The situation recently became much more interesting when MySQL Group Replication went into GA (stable) stage in December 2016.

Oracle, the upstream MySQL provider, introduced its own replication implementation that is very similar in concept. Unlike the others mentioned above, it isn’t based on Galera. Group Replication was built from the ground up as a new solution. MySQL Group Replication shares many very similar concepts to Galera. This post doesn’t cover MySQL Cluster, another and fully-synchronous solution, that existed much earlier then Galera — it is a much different solution for different use cases.

In this post, I will point out a couple of interesting differences between Group Replication and Galera, which hopefully will be helpful to those considering switching from one to another (or if they are planning to test them).

This is certainly not a full list of all the differences, but rather things I found interesting during my explorations.

It is also important to know that Group Replication has evolved a lot before it went GA (its whole cluster layer was replaced). I won’t mention how things looked before the GA stage, and will just concentrate on the latest available 5.7.17 version. I will not spend too much time on how Galera implementations looked in the past and will use Percona XtraDB Cluster 5.7 as a reference.

Multi-Master vs. Master-Slave

Galera has always been multi-master by default, so it does not matter to which node you write. Many users use a single writer due to workload specifics and multi-master limitations, but Galera has no single master mode per se.

Group Replication, on the other hand, promotes just one member as primary (master) by default, and other members are put into read-only mode automatically. This is what happens if we try to change data on the non-master node:

To change from single primary mode to multi-primary (multi-master), you have to start group replication with the  group_replication_single_primary_mode variable disabled.
Another interesting fact is you do not have any influence on which cluster member will be the master in single primary mode: the cluster auto-elects it. You can only check it with a query:

Or just:

To show the hostname instead of UUID, here:

Replication: Majority vs. All

Galera delivers write transactions synchronously to ALL nodes in the cluster. (Later, applying happens asynchronously in both technologies.) However, Group Replication needs just a majority of the nodes confirming the transaction. This means a transaction commit on the writer succeeds and returns to the client even if a minority of nodes still have not received it.

In the example of a three-node cluster, if one node crashes or loses the network connection, the two others continue to accept writes (or just the primary node in Single-Primary mode) even before a faulty node is removed from the cluster.

If the separated node is the primary one, it denies writes due to the lack of a quorum (it will report the error ERROR 3101 (HY000): Plugin instructed the server to rollback the current transaction.). If one of the nodes receives a quorum, it will be elected to primary after the faulty node is removed from the cluster, and will then accept writes.

With that said, the “majority” rule in Group Replication means that there isn’t a guarantee that you won’t lose any data if the majority nodes are lost. There is a chance these could apply some transactions that aren’t delivered to the minority at the moment they crash.

In Galera, a single node network interruption makes the others wait for it, and pending writes can be committed once either the connection is restored or the faulty node removed from cluster after the timeout. So the chance of losing data in a similar scenario is lower, as transactions always reach all nodes. Data can be lost in Percona XtraDB Cluster only in a really bad luck scenario: a network split happens, the remaining majority of nodes form a quorum, the cluster reconfigures and allows new writes, and then shortly after the majority part is damaged.

Schema Requirements

For both technologies, one of the requirements is that all tables must be InnoDB and have a primary key. This requirement is now enforced by default in both Group Replication and Percona XtraDB Cluster 5.7. Let’s look at the differences.

Percona XtraDB Cluster:

Before Percona XtraDB Cluster 5.7 (or in other Galera implementations), there were no such enforced restrictions. Users unaware of these requirements often ended up with problems.

Group Replication:

I am not aware of any way to disable these restrictions in Group Replication.

GTID

Galera has it’s own Global Transaction ID, which has existed since MySQL 5.5 and is independent of MySQL’s GTID feature introduced in MySQL 5.6. If MySQL’s GTID is enabled on a Galera-based cluster, both numerations exist with their own sequences and UUIDs.

Group Replication is based on a native MySQL GTID feature and relies on it. Interestingly, a separate sequence block range (initially 1M) is pre-assigned for each cluster member.

WAN Support

The MySQL Group Replication documentation isn’t very optimistic on WAN support, claiming that both “Low latency, high bandwidth network connections are a requirement” and “Group Replication is designed to be deployed in a cluster environment where server instances are very close to each other, and is impacted by both network latency as well as network bandwidth.” These statements are found here and here. However, there is network traffic optimization: Message Compression.

I don’t see group communication level tunings available yet, as we find in the Galera evs.* series of wsrep_provider_options.

Galera founders actually encourage trying it in geo-distributed environments, and some WAN-dedicated settings are available (the most important being WAN segments).

But both technologies need a reliable network for good performance.

State Transfers

Galera has two types of state transfers that allow syncing data to nodes when needed: incremental (IST) and full (SST). Incremental is used when a node has been out of a cluster for some time, and once it rejoins the other nodes has the missing write sets still in Galera cache. Full SST is helpful if incremental is not possible, especially when a new node is added to the cluster. SST automatically provisions the node with fresh data taken as a snapshot from one of the running nodes (donor). The most common SST method is using Percona XtraBackup, which takes a fast and non-blocking binary data snapshot (hot backup).

In Group Replication, state transfers are fully based on binary logs with GTID positions. If there is no donor with all of the binary logs (included the ones for new nodes), a DBA has to first provision the new node with initial data snapshot. Otherwise, the joiner will fail with a very familiar error:

The official documentation mentions that provisioning the node before adding it to the cluster may speed up joining (the recovery stage). Another difference is that in the case of state transfer failure, a Galera joiner will abort after the first try, and will shut down its mysqld instance. The Group Replication joiner will then fall-back to another donor in an attempt to succeed. Here I found something slightly annoying: if no donor can satisfy joiner demands, it will still keep trying the same donors over and over, for a fixed number of attempts: