In this second part of the blog post, we will explore how the PXC Replication Manager script handles source and replica failover in a multi-source replication topology.

Multi-source replication is commonly used when data from multiple independent sources needs to be gathered into a single instance, which is often required for reporting, analytics, or specific ad hoc business cases. In this post, we’ll walk through how failover is managed in such a setup when it is integrated into a PXC/Galera-based environment.

For an initial understanding of the basic PXC Replication Manager setup, you can refer to the first part of this blog post.

Let’s dive into the practical use.

Topology:

Async Replication syncing flow:

  • DC1 [DC1-1] will have multi-source replication channels, syncing from the DC2 [DC2-1] and DC3 [DC3-1] nodes.
  • DC2 [DC2-1] will be syncing from DC1 [DC1-1].
  • DC3 [DC3-1] will be syncing from DC1 [DC1-1].

[Figure: Async multi-source topology]

PXC/Async configurations

The configuration for each DC node is available in this GitHub gist: https://gist.github.com/aniljoshi2022/7714c97a9c755e3d12c60e3ead21a55f

At this stage, all three clusters should be bootstrapped and running.

  • First Node:

  • Second and remaining nodes:

We should also make sure the replication user is created on the DC1 [mysql-DC1-1] node.
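
A minimal sketch of creating such a user, assuming a hypothetical account name `repl`, a wildcard host, and a placeholder password; none of these values come from the original setup. The `repl_user_sql` helper is illustrative and only prints the SQL so it can be reviewed first.

```shell
# Print the SQL that creates a replication account.
# The user name, host pattern and password passed in are illustrative assumptions.
repl_user_sql() {
  user="$1"; host="$2"; password="$3"
  cat <<SQL
CREATE USER IF NOT EXISTS '${user}'@'${host}' IDENTIFIED BY '${password}';
GRANT REPLICATION SLAVE ON *.* TO '${user}'@'${host}';
SQL
}

# Review the statements, then pipe them into the mysql client on DC1-1 to apply.
repl_user_sql repl '%' 'change-me'
```

To apply the statements, the output can be piped into the client, for example `repl_user_sql repl '%' 'change-me' | mysql`.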

Replication Manager configuration

Now we will add configuration entries to the replication manager tables on DC1 [DC1-1]. I am not covering what each table does here, as that was already explained in the first part of this blog post.
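
As an illustration, the link entries for this topology could be generated as below. This is a sketch that assumes the `percona.link` table stores the replica cluster first and the source cluster second; verify the column order against your own schema before applying anything. The `link_rows` helper is hypothetical and only prints the statements.

```shell
# Emit one INSERT per replication link, read as "<replica cluster> <source cluster>".
# Assumption: percona.link stores (replica cluster, source cluster) in that order.
link_rows() {
  while read -r replica src; do
    [ -n "$replica" ] || continue
    printf "INSERT INTO percona.link VALUES ('%s','%s');\n" "$replica" "$src"
  done <<PAIRS
DC1 DC2
DC1 DC3
DC2 DC1
DC3 DC1
PAIRS
}

link_rows
```

The two rows with DC1 as the replica cluster are what make DC1 a multi-source replica; DC2 and DC3 each have a single link back to DC1.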

Asynchronous Replication Setup

  • Taking a mysqldump from the DC1 [DC1-1] node.

  • Transferring the dump to DC2 [DC2-1] and DC3 [DC3-1].

  • Restoring the dump on DC2 [DC2-1] and DC3 [DC3-1].

  • Setting up and starting the replication channel.

DC2-1

DC3-1

  • Multi-source async replication setup on DC1 [DC1-1]

Now all the clusters are linked source-to-source: DC1 replicates from DC2 and DC3, and both DC2 and DC3 replicate from DC1.
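
The channel setup in the steps above can be sketched as follows. It assumes GTID-based replication (`SOURCE_AUTO_POSITION=1`), the MySQL 8.0.23 or later syntax, a hypothetical `repl` account with a placeholder password, and channel names of the form `<replica cluster>-<source cluster>` such as `DC1-DC2`. The host names passed in are placeholders for the real node addresses.

```shell
# Print the SQL that attaches one named channel to a source host and starts it.
# Assumptions: GTID auto-positioning, MySQL 8.0.23+ syntax, placeholder credentials.
channel_sql() {
  channel="$1"; source_host="$2"
  cat <<SQL
CHANGE REPLICATION SOURCE TO
  SOURCE_HOST='${source_host}',
  SOURCE_USER='repl',
  SOURCE_PASSWORD='change-me',
  SOURCE_AUTO_POSITION=1
  FOR CHANNEL '${channel}';
START REPLICA FOR CHANNEL '${channel}';
SQL
}

# On DC2-1: a single channel from DC1-1 (DC3-1 is set up the same way).
channel_sql DC2-DC1 DC1-1

# On DC1-1: one channel per source makes it a multi-source replica.
channel_sql DC1-DC2 DC2-1
channel_sql DC1-DC3 DC3-1
```

Each call prints the statements for one channel, so the output for a given node can be reviewed and then piped into the `mysql` client on that node.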

Replication Manager Cron Setup

We need to enable the replication manager cron job on all PXC/async nodes.

For any errors or issues, we can check the /tmp/replication_manager.log log file.
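
A hedged sketch of both steps: the crontab line assumes the script is installed at `/usr/local/bin/replication_manager.sh` and runs every minute, so adjust both to your installation. The `recent_issues` helper is hypothetical and simply greps the log for error-like lines, since the exact log format is not shown here.

```shell
# Example crontab entry (assumed install path and one-minute interval):
#   * * * * * /usr/local/bin/replication_manager.sh

# Show the last few log lines that mention an error or a failure.
# The keywords are a guess at what is useful, not the script's documented log format.
recent_issues() {
  log_file="${1:-/tmp/replication_manager.log}"
  [ -r "$log_file" ] || { echo "cannot read $log_file" >&2; return 1; }
  grep -i -E 'error|fail' "$log_file" | tail -n 20
}

recent_issues /tmp/replication_manager.log || true
```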

Testing Source Failover For Multi-Source Channel

DC1-1:

Now we will stop the source DC2 [DC2-1].

Below, we can see that the connectionName “DC1-DC2” is in a “Failed” state.

After a couple of minutes, when the script runs its next monitoring pass, DC1 [DC1-1] is connected to another source node of DC2, which is [DC2-2].
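
To follow the failover from the replica side, a check along these lines can be run on DC1-1 before and after stopping the source. It is a sketch that assumes the MySQL channel is named after the connectionName (`DC1-DC2`), and it selects every column from `percona.replication` rather than guessing at column names beyond `connectionName`. The `link_status_sql` helper is hypothetical.

```shell
# Print the queries used to inspect one replication link.
# Assumption: the MySQL channel name is the same as the connectionName.
link_status_sql() {
  connection="$1"
  cat <<SQL
SELECT * FROM percona.replication WHERE connectionName = '${connection}';
SHOW REPLICA STATUS FOR CHANNEL '${connection}';
SQL
}

link_status_sql DC1-DC2
```

Piping the output into the `mysql` client shows the state recorded by the replication manager alongside the `Source_Host` the channel is currently attached to.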

Testing Replica Failover For Multi-Source Channel

We will stop DC1 [DC1-1], which is the current multi-source replica connected to both the DC2 and DC3 nodes.

Once we stop the database service on DC1 [DC1-1] and wait for a while, we can check the status again, and it will show DC1 [DC1-2] as the new multi-source replica.

Important consideration:

The topology discussed above is intended solely for demonstration purposes, to observe how the PXC Replication Manager handles failover in complex topologies. In a production environment, such architectures should be avoided, as performing writes across multiple clusters (multiple nodes simultaneously) can lead to inconsistencies. For any similar use cases, thorough and in-depth testing is strongly recommended beforehand.

Summary

The replication manager script can be particularly useful in complex PXC/Galera topologies that require multi-source replication. It automates source and replica failover to help keep all replication channels healthy and in sync. If certain nodes shouldn’t be part of an async/multi-source replication, we can disable the replication manager script there. Alternatively, node participation can be controlled by adjusting the weights in the percona.weight table, allowing replication behavior to be managed more precisely.
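
As a rough sketch of a weight adjustment, the helper below prints an UPDATE for one node. It assumes `percona.weight` has `cluster`, `nodename`, and `weight` columns; the node name `DC1-3` and the value `5` are purely illustrative, and how a given weight affects candidate selection should be checked against the replication manager documentation.

```shell
# Print an UPDATE that changes one node's weight.
# Assumption: percona.weight has cluster, nodename and weight columns.
# The helper rejects anything that is not a plain integer to avoid malformed SQL.
set_weight_sql() {
  cluster="$1"; node="$2"; weight="$3"
  case "$weight" in
    ''|*[!0-9]*) echo "weight must be a plain integer" >&2; return 1 ;;
  esac
  printf "UPDATE percona.weight SET weight = %s WHERE cluster = '%s' AND nodename = '%s';\n" \
    "$weight" "$cluster" "$node"
}

set_weight_sql DC1 DC1-3 5
```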
