Disclaimer: The following blog post does not try to provide a solution for split-brain situations, and the example provided is for demonstrative purposes only. Inconsistencies resulting from a split-brain scenario might possibly be more complex than the one shown, so do not use the following example as a complete guide.
What is Split-Brain?
A split-brain scenario is the result of two data sets (which were originally synced) losing the ability to sync while potentially continuing to receive DMLs over the same rows and ids on both sides. This could have consequences such as data corruption or inconsistencies where each side has data that does not exist on the other side.
For example before split-brain:
It can be seen that after the split-brain scenario there are many differences between the nodes:
- Customer row id=2 was deleted on Node2
- There was one insert on “Customer” table on each node, but the inserted row has repeated id=3
- There was an insert on the “Customer_product” table on each node with id=2 referencing Customer with id=3
Trying to merge all the rows will result in id collision and inconsistencies.
Why Does Split-Brain Occur?
The most typical reason why split-brain occurs is when an async cluster topology loses connectivity and each node continues to give service independently, but other example cases are:
- A slave node has replication stopped and is out of sync with the master for many days; then the master fails with automatic promotion to slave. When a master server is back online it will have inconsistencies 1) master will have writes that were not replicated to slave 2) slave will have writes that are NOT on the master which occurred while it was offline.
- On a cluster, some node receives writes/deletes with replication disabled, causing data inconsistencies compared to the other nodes.
- A system migration in which old and new servers work concurrently for different customers without syncing data, and at some point the data from the old server needs to be migrated to the new server for the old to be shut down.
Why Is It Hard to Sync Split-Brain Scenarios?
Under the split-brain scenario, there are many inconsistencies that cannot be automatically merged. If restoring, information from one node to the other will lose the changes that occurred on the latter node, and if trying to merge, many issues arise, for instance, the above example:
- What happens with deleted Customer id=2 from Node2? Should it be kept or removed?
- Inserted rows with id=3 should be two different rows, or only 1 row should be kept? In which case for id=3, the correct value would be “Tom” or “Bill”?
- The referenced table has only 1 difference, but merging row id=2 will reference parent_id=3 which is ambiguous
Because of the above, fixing a split-brain scenario will depend on business rules to decide which rows to keep/discard and would result in a manual job to decide for every inconsistency.
How to Avoid Split-Brain Scenarios
When there is a network issue that results in a split-brain situation and as per the CAP theorem, it’s not possible to assure both Consistency and Availability, having to prioritize one over the other.
In the case of favoring Consistency, using Percona XtraDB Cluster allows the primary component (if there is one) to keep serving data, while the non-primary component refuses to operate. If using automatic failover solutions such as Orchestrator, it should be configured so that writes only go to one node, and the promotion of slave only occurs if the slave was up to date at the moment the network failed. If it’s not possible to guarantee data consistency, then nodes should refuse to work.
In case the service needs to be maintained under split-brain scenarios, inconsistencies will arise, but something which can be done to avoid having duplicate ids is using different auto_increment_increment and auto_increment_offset on each server so that, for example, server A ids are even numbers and server B ids are odd numbers.
Updates and deletes must be resolved case by case.
How Do You Fix a Split-Brain Scenario?
As previously shown, if writes occur in different nodes after a split-brain, merging the rows in a consistent way can be difficult. The procedure is manual and cumbersome as rows to be kept have to be checked one by one, and criteria for deciding inconsistencies should be based on business logic.
If only one of the nodes received writes (and provided all required variables are enabled beforehand and binary logs are available), there is a tool called MySQL rewind which can help in reverting the changes on the changed node compared to the original node. The tool has some limitations listed on the official documentation but it’s very useful provided the requirements are met.
In case two (or more) nodes have received writes after split-brain, the merge procedure will be more complex. A high-level procedure on how to fix inconsistencies would be as follows:
First, backup both databases. Then identify which rows are different on each cluster by comparing tables from both nodes. If you have tables with timestamps of creation/edition it might greatly help to identify which rows were updated/inserted after split-brain scenario started, otherwise the best would be to chose one of the databases as the source of truth, and work on the other to identify the differences, while merging the changes one by one and according to business rules.
Possible Scenarios That Can Happen:
- If a row exists on one node but not on the other, it might be a delete. Depending on business logic, decide to keep or delete it.
- If a row exists in both nodes but an update has occurred and some (but not all) column differs, use business logic and available information to choose which row to keep.
- If two rows exist with the same id but all columns differ, it might be two different inserts (on each node) and merging them would be a collision. If you decide to insert the row from Node 2 into the source of truth, you might need to adjust ids by for example adding a big number to the current id (i.e current_id + 100k).
- If a row from Node2 is going to be inserted into the source of truth and references another row, double-check if parent row id is correct, otherwise it id might have been fixed adding 100k as shown on the above example, in which case you need to fix the referenced id.
After merging the data and in case of adding ids, there might be a gap in the ids resulting in a waste of ids, in which case dumping and reloading the table might be a good idea to remove the gaps.
Business logic must decide in advance how to react under split-brain scenarios – whether to favor consistency or availability – by comparing the consequences of losing service vs. the cost and difficulty of data merging after the split-brain is over.