keepalived with reader and writer VIPs for Percona XtraDB Cluster

This is a followup to Jay Janssen’s October post, “Using keepalived for HA on top of Percona XtraDB Cluster.” We recently got a request from a customer who wanted two VIPs (Virtual IP addresses) for a three-node cluster, one for reads and one for writes. They wanted to keep it simple, with low latency, and without requiring an external node resource the way HAProxy would.

keepalived is a simple load balancer with HA capabilities, which means it can proxy TCP services behind it and, at the same time, keep itself highly available using VRRP as the failover mechanism. This post is about taking advantage of the VRRP capabilities built into keepalived to intelligently manage your PXC VIPs.

Yves Trudeau also wrote a very interesting and somewhat similar solution using ClusterIP and Pacemaker to load balance VIPs, but the two have different use cases. Both solutions avoid the latency of an external proxy or load balancer, but unlike with ClusterIP, connections to a given VIP with keepalived go to a single node, which means a little less work for each node deciding whether it should respond to a request. ClusterIP is a good fit if you want to spread writes across all nodes in a calculated distribution, while with the keepalived option each VIP is assigned to at most one node – depending on your workload, each approach has its advantages and disadvantages.

The OS I used was CentOS 6.4 with keepalived 1.2.7, which is available in the yum repositories. However, it’s difficult to troubleshoot failover behavior involving VRRP_Instance weights without seeing them from keepalived directly, so I used a custom build with a patch adding a --vrrp-status option that allows me to monitor the instance states and weights as they change.
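
If you do not want to run a patched binary, a rough way to at least follow which node currently owns each VIP (though not the weights) is to watch the interface addresses on every node. The interface name and VIP addresses below are placeholders:

    # run on each node; eth1, 192.168.70.100 (writer) and 192.168.70.101 (reader) are just examples
    watch -n1 "ip -o addr show dev eth1 | grep -E '192\.168\.70\.(100|101)/'"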

So first, let’s compile keepalived from source; the GitHub branch here is where the status patch is available.
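
The exact steps depend on the branch you check out, but on CentOS 6 the build is roughly the usual autotools dance. Treat the package list, branch name and clone URL below as assumptions to adapt to your environment:

    # build dependencies (CentOS 6)
    yum install -y gcc make autoconf automake openssl-devel popt-devel libnl-devel

    # clone the branch carrying the --vrrp-status patch (URL and branch name are placeholders)
    git clone -b vrrp-status https://github.com/<user>/keepalived.git
    cd keepalived

    # regenerate configure when building from a git checkout, then build and install
    autoreconf -if
    ./configure
    make && make install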

Install the custom tracker script below – because compiling keepalived above installs it under /usr/local/bin, I put this script there as well. One might note that this script is somewhat redundant, and that’s true, but beware that keepalived does not validate its configuration, especially track_scripts, so I prefer to keep it in a separate bash script that I can easily debug if it misbehaves. Of course, when everything is working well, you can always merge it back into the keepalived.conf file.
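
A minimal sketch of such a tracker script, assuming the health criterion is simply “the local PXC node is Synced and part of the primary component,” could look like this – the filename and MySQL credentials are placeholders:

    #!/bin/bash
    # /usr/local/bin/chk_pxc.sh (hypothetical name)
    # Exit 0 only when the local PXC node is Synced and in the primary component;
    # any other exit code makes keepalived apply the negative weight for this check.

    MYSQL="mysql -u root -N -B -e"   # adjust credentials for your environment

    STATE=$($MYSQL "SHOW STATUS LIKE 'wsrep_local_state_comment'" 2>/dev/null | awk '{print $2}')
    CLUSTER=$($MYSQL "SHOW STATUS LIKE 'wsrep_cluster_status'" 2>/dev/null | awk '{print $2}')

    [ "$STATE" = "Synced" ] && [ "$CLUSTER" = "Primary" ] && exit 0
    exit 1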

And below is my /etc/keepalived.conf:
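
A simplified sketch of what such a configuration can look like on pxc01 follows – the interface, VIP addresses, virtual_router_ids and script paths are placeholders, and the extra tracking that keeps the two VIPs from landing on the same node while more than one is available is omitted for brevity:

    global_defs {
        router_id pxc01                  # unique per node
    }

    vrrp_script chk_pxc {
        script "/usr/local/bin/chk_pxc.sh"
        interval 2
        weight -100                      # drop the score by 100 when MySQL/PXC is unhealthy
    }

    vrrp_script nopreempt_writer {
        script "/usr/local/bin/nopreempt_vip.sh 192.168.70.100"
        interval 2
        weight 50                        # reward the node that already owns the writer VIP
    }

    vrrp_script nopreempt_reader {
        script "/usr/local/bin/nopreempt_vip.sh 192.168.70.101"
        interval 2
        weight 50                        # reward the node that already owns the reader VIP
    }

    vrrp_instance writer_vip {
        state BACKUP                     # start as BACKUP and let the runtime voting decide
        interface eth1
        virtual_router_id 51
        priority 101                     # slightly higher on pxc01 so the writer prefers it
        advert_int 1
        virtual_ipaddress {
            192.168.70.100
        }
        track_script {
            chk_pxc
            nopreempt_writer
        }
        notify_master "/usr/local/bin/vip_notify.sh"   # hypothetical hook, e.g. log or e-mail on failover
        notify_backup "/usr/local/bin/vip_notify.sh"
    }

    vrrp_instance reader_vip {
        state BACKUP
        interface eth1
        virtual_router_id 52
        priority 100
        advert_int 1
        virtual_ipaddress {
            192.168.70.101
        }
        track_script {
            chk_pxc
            nopreempt_reader
        }
        notify_master "/usr/local/bin/vip_notify.sh"
        notify_backup "/usr/local/bin/vip_notify.sh"
    }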

There are a number of things you can change here, like removing or modifying the notify_* clauses to fit your needs, or sending SMTP notifications during VIP failovers. I also prefer the initial state of the VRRP_Instances to be BACKUP instead of MASTER and to let the voting at runtime dictate where the VIPs should go.
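
A notify hook does not have to be fancy; a hypothetical /usr/local/bin/vip_notify.sh could simply log the transition, or call mailx/sendmail instead if you prefer SMTP notifications:

    #!/bin/bash
    # /usr/local/bin/vip_notify.sh (hypothetical) - called by keepalived on VRRP state changes.
    # Log the event; replace or extend with mailx/sendmail for SMTP notifications.
    logger -t keepalived "$(hostname): VIP state change at $(date)"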

The configuration ensures that the reader and writer VIPs will not share a single node if more than one node is available in the cluster. Even though the writer VIP prefers pxc01 in my example, this does not really matter much and only makes a difference when the reader VIP is not in the picture; and with the help of the nopreempt_* track_scripts, there is no automatic failback.
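
The nopreempt_* scripts referenced above can be as simple as a check for whether this node already holds the given VIP. The helper below is a hypothetical sketch (the path is a placeholder, and it assumes keepalived has added the VIP to a local interface):

    #!/bin/bash
    # /usr/local/bin/nopreempt_vip.sh (hypothetical name)
    # Exit 0 only when this node currently holds the VIP passed as $1, so the
    # owning node gets the +50 weight bonus and the VIP is not preempted back.

    VIP="$1"
    ip -o addr show | grep -qF " ${VIP}/"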

Now, to see it in action: after starting the cluster and keepalived on pxc01, pxc02 and pxc03 in that order, I have these statuses and weights:

The writer is on pxc01 and the reader on pxc02 – even though the reader VIP scores on pxc02 and pxc03 match, it remains on pxc02 because of our nopreempt_* script. Let’s see what happens if I stop MySQL on pxc02:

The reader VIP moved to pxc03 and the weights changed: the reader score on pxc02 dropped by 100 and on pxc03 it gained 50 – again, this is the bonus we set so the current owner is not preempted. Now let’s stop MySQL on pxc03:

All our VIPs are now on pxc01; let’s start MySQL again on pxc02:

Our reader is back on pxc02 and the writer remains intact. When both VIPs end up on a single node (i.e., the last node standing) and a second node comes up, it is the reader that moves, not the writer – this is to avoid the risk of breaking connections that may be writing to the node currently owning the writer VIP.
