The aim of this post is to enumerate real-world usage of Percona XtraDB Cluster (PXC), and also to solicit use cases from the readers. One of the prominent usages in the production environment that we have come across (and our Percona consultants have assisted) is that of HP Cloud. There is a post about it here by Patrick Galbraith of HP. The post focuses on their deployment of PXC for HP Cloud DNS. The post focuses on the key aspects of synchronous replication setup with high-availability guarantees like split-brain immunity.
Nobody likes to debug async replication while its broken or do the master-master/master-slave switchover when master is dying/dead. Yes, there are wrappers/scripts around this to make life easier, however, wouldn’t it be nice if this was built into the system itself? PXC based on Galera strives to provide that. Scaling makes sense only when addition/removal of hosts from a cluster or a HA setup is simple and uncomplicated.
Their post focuses on following aspects:
To elucidate, their initial setup goes into bootstrapping the first node. Note that in the cloud environment other nodes are not known until they are brought up, hence bootstrapping with an empty gcomm:// is done for the first node by the chef. The second node is then added which SSTs with node1 (based on gcomm://node1 of node2) through Xtrabackup SST (state snapshot transfer). Node3 subsequently joins the cluster with node1 and node2 in its gcomm:// (since by this time node1, node2 are up). After this, a subsequent run of chef-client is done to update the cnf files with IP address of members (excluding itself). The rationale behind this is that when a node is restarted (and there are others when it comes up) it joins the cluster seamlessly. I would like to note here that we are adding a bootstrap parameter to PXC so that any latter modifications like these to cnf files are not required and preset it during cluster startup itself. The only caveat is that the node information – IP address or hostname – should be known in advance (the node itself needn’t be up), which may not be feasible in a cloud environment.
Next, the SST. Xtrabackup SST is used there. SST matters a lot because not only is it used during initial node setup but also it is required when a node has been down for a while and IST (incremental state transfer) is not feasible. It also helps when node data integrity is compromised. So, naturally duration of SST is paramount. We recommend Xtrabackup SST for its reduced locking period from its use (which means the donor is blocked for a shorter while). By using Xtrabackup for SST, you also get its benefits like compression, parallel streaming, encryption, compact backups which can be used for SST (Note, the wsrep_sst_xtrabackup in 5.5.30 can’t do those except parallel, the one in 5.5.31 will handle them all, also XB 2.1 is required for most).
Finally, the HAProxy. HAProxy is one of the loadbalancers recommended for use with PXC. The other one is glb. HAProxy is used with xinetd on the node along with a script which checks PXC for its sync status. As referenced in that post, you can refer a post by Peter Boros (“Percona XtraDB Cluster reference architecture with HaProxy“) for details. In their setup they have automated this with a HAProxy in each AZ (Availability Zone) for the API server. To add, we are looking at reducing the overhead here, through steps like replacing xinetd and clustercheck with a single serving process (we are adding one in 5.5.31), looking for optimizations with HAProxy to account for high connection rates, and using pacemaker with PXC. The goal is to reduce the overhead of status checks, mainly on the node. You can also look this PLMCE talk for HAProxy deployment strategies with PXC.
To conclude, it is interesting to note that they have been able to manage this with a small team. That strongly implies scalability of resources – you scale more with less, and that is how it should be. We would like to hear from you about your architectural setup around PXC – any challenges you faced (and horror stories if any), any special deployment methodologies you employed (Puppet, Chef, Salt, Ansible etc. ), and finally any suggestions.