MySQL InnoDB Cluster in a nutshell Part 1: Group Replication

July 9, 2018

Author

Francisco Bordenave

Insight for DBAs

MySQL

Share this Post:

Since MySQL 5.7 we have a new player in the field, MySQL InnoDB Cluster. This is an Oracle High Availability solution that can be easily installed over MySQL to get High Availability with multi-master capabilities and automatic failover.

This solution consists in 3 components: InnoDB Group Replication, MySQL Router and MySQL Shell, you can see how these components interact in this graphic:

In this three blog post series, we will cover each of these components to get a sense of what this tool provides and how it can help with architecture decisions.

Group Replication

This is the actual High Availability solution, and a while ago I wrote a short review of it when it still was in its labs stage. It has improved a lot since then.

This solution is based on a plugin that has to be installed (not installed by default) and works on the top of built-in replication. So it relies on binary logs and relay logs to apply writes to members of the cluster.

The main concept about this new type of replication is that all members of a cluster (i.e. each node) are considered equals. This means there is no master-slave (where slaves follow master) but members that apply transactions based on a consensus algorithm. This algorithm forces all members of a cluster to commit or reject a given transaction following decisions made by each individual member.

In practical terms, this means each member of the cluster has to decide if a transaction can be committed (i.e. no conflicts) or should be rolled back but all other members follow this decision. In other words, the transaction is either committed or rolled back according to the majority of members in a consistent state.

To achieve this, there is a service that exposes a view of cluster status indicating what members form the cluster and the current status of each of them. Additionally Group Replication requires GTID and Row Based Replication (or binlog_format=ROW ) to distribute each writeset with row changes between members. This is done via binary logs and relay logs but before each transaction is pushed to binary/relay logs it has to be acknowledged by a majority of members of the clusters, in other words through consensus. This process is synchronous, unlike legacy replication. After a transaction is replicated we have a certification process to commit the transaction and thus making it durable.

Here appears a new concept, the certification process, which is the process that confirms if a writeset can be applied/committed (i.e. a row change can be done without conflicts) after replication of the transaction is complete.

Basically, this process consists of inspecting writesets to check if there are conflicts (i.e. same row updated by concurrent transactions). Based on an order set in the writeset, the conflict is resolved by ‘first-commiter wins’ while the second is rolled back in the originator. Finally, the transaction is pushed to binary/relay logs and committed.

Solution features

Some other features provided by this solution are:

Single-primary or multi-primary modes meaning that the cluster can operate with a single writer and multiple readers (recommended and default setup); or with multiple writers where all nodes are capable to accept write transactions. The latter is at the cost of a performance penalty due to conflict resolution.

Automatic failure detection, where an internal mechanism is able to detect a failed node (i.e. a crash, network problems, etc) and decide to exclude it from the cluster automatically. Also if a member can’t communicate with the cluster and gets isolated, it can’t accept transactions. This ensures that cluster data is not impacted by this situation.

Fault tolerance. This is the strategy that the cluster uses to support failing members. As described above, this is based on the majority. A cluster needs at least three members to support one node failure because the other two members will keep the majority. The bigger the number of nodes, the bigger the number of failing nodes the cluster supports. The maximum number of members (nodes) in a cluster is currently limited to 7. If it has seven members, then the majority is kept by four or more active members. In other words, a cluster of seven would support up to three failing nodes.

We will not cover the installation and configuration aspects now. This will probably come with a new series of blogs where we can cover not only deployment but also use cases and so on.

In the next post we will talk about the next cluster component: MySQL Router, so stay tuned.

0 0 votes

Article Rating

Subscribe

3 Comments

Oldest

Newest Most Voted

SacTiw

7 years ago

“In other words, the transaction is either committed or rolled back according to the majority of members in a consistent state.”

Can you elaborate bit more on this? What happens to the ones in inconsistent state?

0

Reply

Author

Francisco

7 years ago

SacTiw, since replication is considered synchronous (actually the right term is virtually synchronous) it assumes that all members of cluster has the same data, but (there is always a but) it may fail and at the end this is RBR based replication (same than asynchronous) so if a transaction fails to be commited on a single member (i.e. a duplicate key error) this member stops group replication and reports an error so when you check cluster status you will see the offender offline and not being part of the group until problem is fixed. As pointed, so far there is no autoprovision of members in the same way than Galera does.

0

Reply