In this blog, we’ll discuss how Percona XtraDB Cluster certification works. Percona XtraDB Cluster replicates actions executed on one node to all other nodes in the cluster, and makes this fast enough to appear as if it is synchronous (aka virtually synchronous).
/* Common situation -
 * increment and assign act_id only for totally ordered actions
 * and only in PRIM (skip messages while in state exchange) */
rcvd->id = ++group->act_id_;

[This is an amazing way to solve the problem of ID coordination in a multi-master system. Otherwise, a node would first have to get an ID from a central system or through a separately agreed protocol, and then use it for the packet, thereby doubling the round-trip time.]
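To see why a local counter is enough, here is a minimal Python sketch of the idea, not the Galera implementation. It assumes only what the snippet above implies: every node receives the same actions in the same total order and bumps its own counter on delivery. The names `ReplicaNode`, `deliver`, and `channel` are hypothetical and exist only for this illustration.

```python
class ReplicaNode:
    """Toy node that numbers totally ordered actions with a local counter."""

    def __init__(self, name):
        self.name = name
        self.act_id = 0       # plays the role of group->act_id_
        self.assigned = []    # (action, id) pairs in delivery order

    def deliver(self, action):
        # Same idea as: rcvd->id = ++group->act_id_;
        self.act_id += 1
        self.assigned.append((action, self.act_id))
        return self.act_id


# Assumption: the group channel delivers this sequence identically to every node.
channel = ["n2x", "n1x", "n1y"]
nodes = [ReplicaNode("node-1"), ReplicaNode("node-2"), ReplicaNode("node-3")]

for action in channel:
    for node in nodes:
        node.deliver(action)

for node in nodes:
    print(node.name, node.assigned)
```

Because the delivery order is identical everywhere, each node arrives at the same ID for the same action without asking anyone else for it.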
create -> insert (1,2,3,4) ... nodes are in sync till this point.

node-1: update i = i + 10;
node-2: update i = i + 100;

Let's associate a transaction-id (trx-id) with each update transaction that is executed on node-1 and node-2 in parallel.
(The real algorithm is a bit more involved (with uuid + seqno), but it is conceptually the same, so for ease I am using trx-id.)

node-1: update action: trx-id=n1x
node-2: update action: trx-id=n2x

Both node packets are added to the channel, but the transactions are conflicting. Let's see which one succeeds.
The protocol says: FIRST WRITE WINS.
So in this case, whoever is first to write to the channel will get certified.
Let's say node-2 is first to write its packet, and node-1 writes its packet immediately after it.

NOTE: each node subscribes to all packets, including its own. See below for details.

Node-2:
- Will see its own packet and will process it.
- Then it will see the node-1 packet, which it tries to certify but fails. (We will talk about the certification protocol in a little while.)

Node-1:
- Will see the node-2 packet and will process it. (Note: InnoDB allows isolation, so node-1 can process node-2 packets independently of node-1 transaction changes.)
- Then it will see the node-1 packet, which it tries to certify but fails. (Note: even though the packet originated from node-1, it will undergo certification to catch cases like these. This is the beauty of listening to your own events: the processing path stays consistent even if events are locally generated.)
Node-2:
- node-2 sees its own packet for certification, adds it to its local CCV, and performs certification checks. Once these checks pass, it updates the reference transaction by setting it to "n2x".
- node-2 then gets the node-1 packet for certification. The key in question is already present in the CCV with its reference transaction set to "n2x", whereas the write-set proposes setting it to "n1x". This causes a conflict, which in turn causes the node-1 originated transaction to fail the certification test. The node-1 packet is rejected.

Node-1:
- node-1 sees the node-2 packet for certification. The packet is processed, the local CCV is updated, and the reference transaction is set to "n2x".
- By the same reasoning as above, node-1 certification also rejects the node-1 packet.

This suggests that a node doesn't need to wait for certification to complete; it just needs to ensure that the packet is written to the channel. The applier transaction will always win, and the local conflicting transaction will be rolled back.
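The walkthrough above can be replayed with a small Python sketch. It is a simplification under stated assumptions, not the actual certification code. It assumes each write-set carries the sequence number of the last write-set its originating node had seen (`last_seen`), and that a key conflicts when its reference transaction in the CCV is newer than that. The names `WriteSet`, `Certifier`, `certify`, and the key string `"t.i"` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteSet:
    trx_id: str      # e.g. "n1x" or "n2x"
    keys: tuple      # keys modified by the transaction
    last_seen: int   # assumption: seqno of the last write-set seen at origin


class Certifier:
    """Toy per-node certifier; every node runs the same logic."""

    def __init__(self):
        self.seqno = 0
        self.ccv = {}      # key -> seqno of the reference transaction
        self.ref_trx = {}  # key -> trx_id of the reference transaction

    def certify(self, ws):
        self.seqno += 1    # position of this write-set in the total order
        # Conflict if any key was modified after the snapshot ws ran against.
        if any(self.ccv.get(k, 0) > ws.last_seen for k in ws.keys):
            return False
        for k in ws.keys:
            self.ccv[k] = self.seqno
            self.ref_trx[k] = ws.trx_id
        return True


# Both updates ran in parallel against the same state, so both saw seqno 0.
n1x = WriteSet("n1x", ("t.i",), last_seen=0)
n2x = WriteSet("n2x", ("t.i",), last_seen=0)
channel = [n2x, n1x]   # node-2 wrote to the channel first

results = {}
for name in ("node-1", "node-2"):
    certifier = Certifier()
    results[name] = [(ws.trx_id, certifier.certify(ws)) for ws in channel]

print(results)
```

Both nodes process the same sequence with the same deterministic check, so they reach the same verdict independently: the node-2 write-set is certified and the node-1 write-set is rejected.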
create (id primary key) -> insert (1), (2), (3), (4);

node-1: wsrep_on=0; insert (5); wsrep_on=1
node-2: insert (5).

insert (5) on node-2 will generate a write-set that will then be replicated to node-1.
node-1 will try to apply it but will fail with a duplicate-key error, as 5 already exists.
XtraDB will flag this as an error, which would eventually cause node-1 to shut down.
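The scenario above can be imitated with a toy model in Python. This is only an illustrative sketch: the `ToyNode` class, its `wsrep_on` attribute, and the `outbox` list are hypothetical stand-ins for the real server behaviour, and the table is reduced to a set of primary keys.

```python
class DuplicateKeyError(Exception):
    pass


class ToyNode:
    """Toy node: a set of primary keys plus a replication switch."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = set(rows)
        self.wsrep_on = True
        self.outbox = []      # write-sets this node would replicate
        self.running = True

    def insert(self, pk):
        if pk in self.rows:
            raise DuplicateKeyError(f"{self.name}: duplicate key {pk}")
        self.rows.add(pk)
        if self.wsrep_on:     # with replication off, nothing leaves the node
            self.outbox.append(pk)

    def apply(self, pk):
        # Applying a replicated write-set must succeed on a consistent node.
        if pk in self.rows:
            raise DuplicateKeyError(f"{self.name}: duplicate key {pk}")
        self.rows.add(pk)


node1 = ToyNode("node-1", [1, 2, 3, 4])
node2 = ToyNode("node-2", [1, 2, 3, 4])

node1.wsrep_on = False
node1.insert(5)               # local-only change, never replicated
node1.wsrep_on = True

node2.insert(5)               # certified and replicated to node-1

apply_error = None
for pk in node2.outbox:
    try:
        node1.apply(pk)
    except DuplicateKeyError as exc:
        apply_error = exc
        node1.running = False  # models the eventual shutdown of node-1

print(apply_error, node1.running)
```

The write-set passes certification because node-1 never announced its local insert, so the inconsistency only surfaces when node-1 tries to apply the replicated row.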