Percona XtraDB Cluster (PXC) itself manages quorum and node failure. Nodes on the minority side of a network partition will move themselves into a Non-Primary state and refuse all database activity. Nodes in this state are easily detectable via SHOW GLOBAL STATUS variables.
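For example, a node is only usable when it is in the Primary component and Synced. The helper function below is a hypothetical illustration of that check; the mysql commands in the comments show where the real values would come from on a live node:

```shell
# Hypothetical helper: decide whether a node should receive traffic.
# On a live node the two values would come from, e.g.:
#   mysql -N -s -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'"    # Primary / Non-Primary
#   mysql -N -s -e "SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment'"  # Synced / Donor/Desynced / ...
node_available() {
    local cluster_status="$1" state_comment="$2"
    [ "$cluster_status" = "Primary" ] && [ "$state_comment" = "Synced" ]
}

node_available Primary Synced && echo "node is usable"
node_available Non-Primary Synced || echo "minority side of a partition"
```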
It’s common to use HAProxy with PXC for load balancing purposes, but what if you are planning to just send traffic to a single node? We would typically use keepalived to provide HA for HAProxy itself, and keepalived supports track_scripts that can monitor whatever we want, so why not just monitor PXC directly?
If we have clustercheck working on all hosts:
mysql> GRANT USAGE ON *.* TO 'clustercheck'@'localhost' IDENTIFIED BY PASSWORD '*2470C0C06DEE42FD1618BB99005ADCA2EC9D1E19';

[root@node1 ~]# /usr/bin/clustercheck clustercheck password 0; echo $?
HTTP/1.1 200 OK
Content-Type: text/plain
Connection: close
Content-Length: 40

Percona XtraDB Cluster Node is synced.
0
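Under the hood, clustercheck is just a shell script that queries wsrep_local_state (4 = Synced) and emits an HTTP-style response, so its output works for HAProxy's httpchk while its exit code works for keepalived's track_script. A simplified sketch of that core logic (the real script also handles the available-when-donor flag and read_only checks):

```shell
# Simplified sketch of clustercheck's core logic (not the full script).
# On a live node, state would come from something like:
#   mysql -u clustercheck -ppassword -N -s \
#     -e "SHOW GLOBAL STATUS LIKE 'wsrep_local_state'" | awk '{print $2}'
pxc_check() {
    local state="$1"
    if [ "$state" = "4" ]; then        # 4 == Synced
        echo "HTTP/1.1 200 OK"
        echo ""
        echo "Percona XtraDB Cluster Node is synced."
        return 0
    else
        echo "HTTP/1.1 503 Service Unavailable"
        echo ""
        echo "Percona XtraDB Cluster Node is not synced."
        return 1
    fi
}
```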
Then we can just install keepalived and use this config on all nodes:
vrrp_script chk_pxc {
    script "/usr/bin/clustercheck clustercheck password 0"
    interval 1
}

vrrp_instance PXC {
    state MASTER
    interface eth1
    virtual_router_id 51
    priority 100
    nopreempt
    virtual_ipaddress {
        192.168.70.100
    }

    track_script {
        chk_pxc
    }

    notify_master "/bin/echo 'now master' > /tmp/keepalived.state"
    notify_backup "/bin/echo 'now backup' > /tmp/keepalived.state"
    notify_fault "/bin/echo 'now fault' > /tmp/keepalived.state"
}
And start the keepalived service. The virtual IP above will be brought up on an active node in the cluster and moved around if clustercheck fails.
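On the CentOS/RHEL-style hosts shown in the prompts, starting the service would look something like this (a command fragment, assuming sysvinit as in the rest of the post):

```shell
# Run on every node: start keepalived now and enable it at boot
service keepalived start
chkconfig keepalived on
```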
[root@node1 ~]# cat /tmp/keepalived.state
now backup
[root@node2 ~]# cat /tmp/keepalived.state
now master
[root@node3 ~]# cat /tmp/keepalived.state
now backup

[root@node2 ~]# ip a l | grep 192.168.70.100
    inet 192.168.70.100/32 scope global eth1

[root@node3 ~]# mysql -h 192.168.70.100 -u test -ptest test -e "show global variables like 'wsrep_node_name'"
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| wsrep_node_name | node2 |
+-----------------+-------+
If I shutdown PXC on node2:
[root@node2 keepalived]# service mysql stop
Shutting down MySQL (Percona XtraDB Cluster)....... SUCCESS!
[root@node2 ~]# /usr/bin/clustercheck clustercheck password 0; echo $?
HTTP/1.1 503 Service Unavailable
Content-Type: text/plain
Connection: close
Content-Length: 44

Percona XtraDB Cluster Node is not synced.
1
[root@node1 ~]# cat /tmp/keepalived.state
now master
[root@node2 ~]# cat /tmp/keepalived.state
now fault
[root@node3 ~]# cat /tmp/keepalived.state
now backup

[root@node1 ~]# ip a l | grep 192.168.70.100
    inet 192.168.70.100/32 scope global eth1
[root@node2 ~]# ip a l | grep 192.168.70.100
[root@node2 ~]#
[root@node3 ~]# ip a l | grep 192.168.70.100
[root@node3 ~]#
We can see that node2 moves to a FAULT state and the VIP moves to node1 instead. This gives us a very simple way to provide application-to-PXC high availability.
A few additional notes: