This past week was marked by a series of personal findings related to the use of Global Transaction IDs (GTIDs) on Galera-based clusters such as Percona XtraDB Cluster (PXC). The main one is that transactions touching MyISAM tables (and FLUSH PRIVILEGES!) issued on a given node of the cluster are recorded in a GTID set bearing the node’s server_uuid as “source id” and added to the binary log (if the node has binlog enabled), thus being replicated to any async replicas connected to it. However, they won’t be replicated across the cluster. All of this is by design, provided wsrep_replicate_myisam is disabled, which it is by default.
My colleague Stéphane covered this story in one of his recent blog posts titled, “Percona XtraDB Cluster 5.6: a tale of 2 GTIDs,” explaining how those transactions, local to the node, are saved in a GTID set separate from the cluster’s main one and the impact this may have when re-pointing an async slave to replicate from a different node.
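To make this concrete, here is a minimal sketch of the behavior (the test table is hypothetical and the output is described rather than shown): a write to a MyISAM table on node1 is assigned a GTID sourced from node1’s own server_uuid, alongside the cluster-wide set.

mysql [node1] > CREATE TABLE test.t_myisam (i INT) ENGINE=MyISAM;
mysql [node1] > INSERT INTO test.t_myisam VALUES (1);
mysql [node1] > select @@global.gtid_executed;
-- Expect two GTID sets here: one bearing the cluster's UUID (regular
-- cluster transactions) and one bearing node1's server_uuid (the MyISAM write).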
GTIDs are a feature introduced in MySQL 5.6 that made replication management much easier, and considering the advantages of having an async replica attached to a PXC cluster, why hadn’t this popped up earlier for either of us? I guess there aren’t that many people using GTIDs with Galera-based clusters yet, so here’s a post to show you how to do it.
Initializing a PXC cluster configured with GTIDs
My testing environment for a 3-node cluster is composed of node1 (192.168.70.2), node2 (.3) and node3 (.4). All of them have the same PXC binaries installed:
$ rpm -qa |grep -i percona-xtradb-cluster
Percona-XtraDB-Cluster-56-5.6.21-25.8.938.el6.x86_64
Percona-XtraDB-Cluster-galera-3-3.8-1.3390.rhel6.x86_64
Percona-XtraDB-Cluster-server-56-5.6.21-25.8.938.el6.x86_64
Percona-XtraDB-Cluster-client-56-5.6.21-25.8.938.el6.x86_64
Percona-XtraDB-Cluster-shared-56-5.6.21-25.8.938.el6.x86_64
and are configured with an almost identical /etc/my.cnf (apart from wsrep_node_address and the prompt line):
[mysql]
prompt='mysql {node1} > '

[mysqld]
datadir = /var/lib/mysql
log_warnings=2

server_id=1
log_bin=percona-bin
log_slave_updates
binlog_format = ROW

enforce_gtid_consistency=1
gtid_mode=on

wsrep_cluster_name = my-three-node-cluster
wsrep_cluster_address = gcomm://192.168.70.2,192.168.70.3,192.168.70.4
wsrep_node_address = 192.168.70.2
wsrep_provider = /usr/lib64/libgalera_smm.so
wsrep_sst_method = xtrabackup-v2
wsrep_sst_auth = sst:secret

innodb_locks_unsafe_for_binlog = 1
innodb_autoinc_lock_mode = 2
server_id, log_bin, log_slave_updates and binlog_format are not needed for the cluster to operate, but they are required to set up regular replication later on, so I’ve added them to leave the nodes ready to operate as masters.
We start with an empty, non-existent datadir on all nodes, so I’ll use the mysql_install_db script to create a base datadir with everything MySQL needs to work on node1, which will be the reference node of the cluster:
[node1]$ mysql_install_db --user=mysql
We’re now ready to bootstrap the cluster from this reference node:
[node1]$ service mysql bootstrap-pxc
With that, we have an operational reference node:
mysql [node1] > select @@server_uuid;
+--------------------------------------+
| @@server_uuid                        |
+--------------------------------------+
| 834bca7f-b45e-11e4-a7b5-0800272e951c |
+--------------------------------------+
1 row in set (0.00 sec)
Note from the my.cnf settings above that I’ve chosen xtrabackup-v2 as the State Snapshot Transfer (SST) method, which requires authentication (wsrep_sst_auth). For this reason, if we now try to start MySQL on node2 it will fail, with its error log showing:
2015-02-14 16:58:26 24149 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '192.168.70.3' --auth 'sst:secret' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '24149' --binlog 'percona-bin' : 1 (Operation not permitted)
2015-02-14 16:58:26 24149 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
2015-02-14 16:58:26 24149 [ERROR] WSREP: SST failed: 1 (Operation not permitted)
2015-02-14 16:58:26 24149 [ERROR] Aborting
The problem here is that the SST method requires MySQL authentication: even though the credentials are shared via the wsrep_sst_auth variable in every node’s my.cnf configuration file, this only tells xtrabackup-v2 which credentials to use; it doesn’t actually create the account in MySQL. That’s a step left for us:
mysql [node1] > GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'sst'@'localhost' IDENTIFIED BY 'secret';
Query OK, 0 rows affected (0.02 sec)
And this constitutes our very first committed transaction, which goes into the cluster’s GTID set:
mysql [node1] > select @@global.gtid_executed;
+----------------------------------------+
| @@global.gtid_executed                 |
+----------------------------------------+
| 7cf02e4b-4ba1-ee1b-4599-0c821ea92393:1 |
+----------------------------------------+
1 row in set (0.00 sec)
This GRANT only needs to be issued once, in the reference node. Now you start MySQL on node2 and node3: they will use XtraBackup to make a backup of node1’s dataset, which will contain this GRANT, and restore it in their own datadir:
[node2]$ service mysql start
and:
[node3]$ service mysql start
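Before moving on, it’s worth a quick sanity check that both joiners completed the SST and that all three nodes agree on the executed GTID set; a minimal check, run from any node (the exact values will of course depend on your environment):

mysql [node1] > show global status like 'wsrep_cluster_size';
-- Expect Value = 3 once node2 and node3 have joined the cluster.
mysql [node1] > select @@global.gtid_executed;
-- All three nodes should report the same cluster-wide GTID set here.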
OK, that’s done. But how do I attach an async replica to the cluster?
Just to make it clear, you cannot attach an async replica to the cluster itself: you need to choose a node that is a member of the cluster and make it the master for the async replica. Considering all nodes should have the same data replicated, it should be easy to change the async replica’s configuration and make it replicate from any other node of the cluster. And it is, though you may need to make some adjustments (more on this later).
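To give an idea of what this looks like in practice (a rough sketch only; the replication account and password below are hypothetical, and the actual setup steps follow later), pointing the async replica at node2 with GTID auto-positioning would be something like this, and re-pointing it to another node is mostly a matter of changing MASTER_HOST:

mysql [replica] > CHANGE MASTER TO
    ->   MASTER_HOST='192.168.70.3',      /* node2; swap for another node's IP to re-point */
    ->   MASTER_USER='repl',              /* hypothetical replication account */
    ->   MASTER_PASSWORD='repl_password', /* hypothetical password */
    ->   MASTER_AUTO_POSITION=1;          /* GTID-based positioning instead of file/position */
mysql [replica] > START SLAVE;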
The initial procedure is exactly the same one used with regular replication: you start by taking a backup of the master and restoring it on the replica. We’ll use XtraBackup again to perform the backup, and we’ll start by having node2 as the master of our async replica (192.168.70.7). We could stream the backup from node2 directly to the async replica and later “prepare” it (by applying the logs, which needs to be done using the same version of Percona XtraBackup that was used to take the backup), but to make things simple we’ll first take the backup on node2:
[node2]$ innobackupex /tmp
(...)
innobackupex: Using server version 5.6.21-70.1-56-log
innobackupex: Created backup directory /tmp/2015-02-14_17-53-22
(...)
150214 17:53:26
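As noted above, the backup still has to be prepared with the same XtraBackup version that took it before it can be restored on the replica; a minimal sketch using the directory created above, whether you run it on node2 or on the replica after copying the directory over:

[node2]$ innobackupex --apply-log /tmp/2015-02-14_17-53-22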