Inconsistent Voting in Percona XtraDB Cluster

AKA Cluster Error Voting…

Cluster Error VotingWhat is Cluster Error Voting (CEV)?

“Cluster Error Voting is a new feature implemented by Alexey Yurchenko, and it is a protocol for nodes to decide how the cluster will react to problems in replication. When one or several nodes have an issue applying an incoming transaction(s) (e.g., suspected inconsistency), this new feature helps. In a 5-node cluster, if 2-nodes fail to apply the transaction, they get removed, and a DBA can go in to fix what went wrong so that the nodes can rejoin the cluster. (Seppo Jaakola)”

This feature was ported to Percona XtraDB Cluster (PXC) in version 8.0.21. As indicated above, it is about increasing the resilience of the cluster, especially when TWO nodes fail to operate and may drop from the cluster abruptly. The protocol is activated in a cluster with any number of nodes.

Before CEV, if a node has a problem/error during a transaction, the node having the issue would report the error in his own log and exit the cluster:

While the other nodes will “just” report the node as out of the view:

With CEV, we have a different process. Let us review it with images first.

Let us start with a cluster…

3 Nodes, where only one works as Primary.

Primary writes and, as expected, writesets are distributed on all nodes.

Some inexperienced DBA does a manual operation on a secondary using the very unsafe feature wsrep_on…

And then, by mistake or because he did not understand what he is doing…

At the end of the operation of the Secondary node, he will have: