Bypassing SST in Percona XtraDB Cluster with binary logs

July 16, 2015

Author

Jay Janssen

MySQL

Percona Software

In my previous post, I used incremental backups in Percona XtraBackup as a method for rebuilding a Percona XtraDB Cluster (PXC) node without triggering an actual SST. In practice this reproduces the SST steps, but it can be handy if you already have backups available to use.

In this post, I want to present another method that also starts from a full backup but, instead of incrementals, replays the binary logs that the cluster has been producing.

Binary logs on PXC

Binary logs are not strictly needed in PXC for replication, but you may be using them for backups or for asynchronous slaves of the cluster. To set them up properly, we need the following settings added to our config:

server-id=1
log-bin
log-slave-updates

As I stated, none of these are strictly needed for PXC.

- server-id=1 — We recommend PXC nodes share the same server-id.

- log-bin — actually enable the binary log

- log-slave-updates — log ALL updates to the cluster to this server’s binary log

These don't need to be set on every node, but you would likely set them on at least two nodes in the cluster for redundancy.

Note that this strategy should work with or without 5.6 asynchronous GTIDs.

Recovering data with backups and binary logs

This methodology is conventional point-in-time backup recovery for MySQL. We have a full backup that was taken at a specific binary log position:

... backup created in the past...
# innobackupex --no-timestamp /backups/full
# cat /backups/full/xtrabackup_binlog_info
node3-bin.000002	735622700
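As a minimal sketch, the file name and position can be read from xtrabackup_binlog_info by a script instead of by eye. The function name binlog_start is hypothetical, and the sketch assumes the first line holds the file and position as its first two whitespace-separated fields, as in the output above.

```shell
# Hypothetical helper: print "<binlog file> <position>" from xtrabackup_binlog_info.
# Assumes the first two whitespace-separated fields of line 1 are file and position;
# any further fields (for example a GTID set) are ignored.
binlog_start() {
    # $1: path to xtrabackup_binlog_info
    awk 'NR == 1 { print $1, $2 }' "$1"
}

# Example use (path taken from the post):
#   set -- $(binlog_start /backups/full/xtrabackup_binlog_info)
#   start_file=$1; start_pos=$2
```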

We have this binary log and all binary logs since:

-rw-r-----. 1 root root 1.1G Jul 14 18:53 node3-bin.000002
-rw-r-----. 1 root root 1.1G Jul 14 18:53 node3-bin.000003
-rw-r-----. 1 root root 321M Jul 14 18:53 node3-bin.000004
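If the directory also holds older binary logs, the ones needed for recovery can be picked out by name. The following is only a sketch: binlogs_from is a hypothetical helper, and it assumes the default zero-padded numeric suffixes so that sorting by name matches the order in which the logs were written.

```shell
# Hypothetical helper: list the given binary logs, in order, starting at a named file.
# Assumes zero-padded suffixes of equal width (node3-bin.000002 style), so that
# plain string comparison matches the order in which the logs were written.
binlogs_from() {
    # $1: first file needed; remaining arguments: candidate binary log files
    start=$1
    shift
    for f in "$@"; do
        printf '%s\n' "$f"
    done | LC_ALL=C sort | awk -v s="$start" '$0 >= s'
}

# Example use (pass only the numbered logs, not the .index file):
#   binlogs_from node3-bin.000002 node3-bin.0*
```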

Recover the full backup

We start by preparing the backup with --apply-log, then copy it back into the datadir and fix the ownership:

# innobackupex --apply-log --use-memory=1G /backups/full
...
xtrabackup: Recovered WSREP position: 1663c027-2a29-11e5-85da-aa5ca45f600f:60072936
...
InnoDB: Last MySQL binlog file position 0 735622700, file name node3-bin.000002
...
# innobackupex --copy-back /backups/full
# chown -R mysql:mysql /var/lib/mysql

The output confirms the same binary log file and position that we knew from before.

Start MySQL without Galera

We need to start mysql, but without Galera so we can apply the binary log changes before trying to join the cluster. We can do this simply by commenting out all the wsrep settings in the MySQL config.

# grep wsrep /etc/my.cnf
#wsrep_cluster_address           = gcomm://pxc.service.consul
#wsrep_cluster_name              = mycluster
#wsrep_node_name                 = node3
#wsrep_node_address              = 10.145.50.189
#wsrep_provider                  = /usr/lib64/libgalera_smm.so
#wsrep_provider_options          = "gcache.size=8G; gcs.fc_limit=1024"
#wsrep_slave_threads             = 4
#wsrep_sst_method                = xtrabackup-v2
#wsrep_sst_auth                  = sst:secret

# systemctl start mysql
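Commenting the wsrep settings out, and later back in, can be done with sed rather than by hand. This is a sketch under the assumption that every Galera setting in the file starts with wsrep_ at the beginning of a line; the function names are hypothetical, and both functions print the modified config to stdout instead of editing the file in place, so the result can be reviewed first.

```shell
# Hypothetical helpers: toggle wsrep_* settings in a MySQL config file.
# Both print the result to stdout and leave the input file untouched.

# Prefix every line that starts with wsrep_ (optionally indented) with '#'.
comment_wsrep() {
    # $1: path to the config file
    sed 's/^\([[:space:]]*\)wsrep_/\1#wsrep_/' "$1"
}

# Remove the leading '#' again. Note: this also re-enables any wsrep_* line
# that was already commented out before comment_wsrep was used.
uncomment_wsrep() {
    # $1: path to the config file
    sed 's/^\([[:space:]]*\)#[[:space:]]*wsrep_/\1wsrep_/' "$1"
}

# Example use:
#   comment_wsrep /etc/my.cnf > /tmp/my.cnf.nowsrep   # review, then copy into place
```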

Apply the binary logs

We now check our binary log starting position:

# mysqlbinlog -j 735622700 node3-bin.000002 | grep Xid | head -n 1
#150714 18:38:36 server id 1  end_log_pos 735623273 CRC32 0x8426c6bc 	Xid = 60072937
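The Xid printed here can be checked against the WSREP position that --apply-log recovered earlier (seqno 60072936). A small sketch of that check follows; the function names are hypothetical, and it assumes the "Xid = N" text appears in the mysqlbinlog output exactly as shown above.

```shell
# Hypothetical helpers for checking the starting position.

# Print the first Xid found in mysqlbinlog output read from stdin.
first_xid() {
    sed -n 's/.*Xid = \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeed only if the first Xid is exactly one more than the backup's seqno.
xid_follows_seqno() {
    # $1: seqno recovered by --apply-log, $2: first Xid at the start position
    [ "$2" -eq $(( $1 + 1 )) ]
}

# Example use:
#   xid=$(mysqlbinlog -j 735622700 node3-bin.000002 | first_xid)
#   xid_follows_seqno 60072936 "$xid" && echo "start position matches the backup"
```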

We can compare the Xid at this binary log position to that of the backup. The Xid in a binary log produced by PXC will be the seqno of the GTID of that transaction. The backup recovered to seqno 60072936, and the first Xid at our starting position is 60072937, one increment higher, so this makes sense: we can start at this position in the binary log and apply all later changes to get the datadir up to a more current position.

# mysqlbinlog -j 735622700 node3-bin.000002 | mysql
# mysqlbinlog node3-bin.000003 | mysql
# mysqlbinlog node3-bin.000004 | mysql
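The commands above open a separate client connection for each file. As an alternative sketch, not what was run for this post: mysqlbinlog accepts several log files in one invocation, and --start-position (the long form of -j) then applies only to the first file named, so the whole replay can go through a single connection. The helper name replay_command is hypothetical, and it only prints the pipeline so it can be reviewed before being run.

```shell
# Hypothetical helper: print (do not run) a single replay pipeline.
# --start-position applies to the first file named; later files are read in full.
replay_command() {
    # $1: start position in the first file; remaining arguments: binary logs in order
    start_pos=$1
    shift
    printf 'mysqlbinlog --start-position=%s' "$start_pos"
    for f in "$@"; do
        printf ' %s' "$f"
    done
    printf ' | mysql\n'
}

# Example use:
#   replay_command 735622700 node3-bin.000002 node3-bin.000003 node3-bin.000004
```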

This action isn’t particularly fast as binlog events are being applied by a single connection thread. Remember that if the cluster is taking writes while this is happening, the amount of time you have is limited by the size of gcache and the rate at which it is being filled up.
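As a rough back-of-the-envelope sketch of that limit, dividing the gcache size by the cluster's write rate gives an approximate window. The 8G figure comes from the gcache.size in the wsrep_provider_options shown earlier; the 2 MiB/s write rate is purely an assumed number for illustration, and the function name is hypothetical. On a real cluster the rate might be estimated from how fast the wsrep_replicated_bytes and wsrep_received_bytes status counters grow.

```shell
# Hypothetical helper: rough number of seconds before a gcache of the given
# size is overwritten at a steady write rate. This is an estimate only.
gcache_window_seconds() {
    # $1: gcache size in MiB, $2: write rate in MiB per second (integer, > 0)
    if [ "$2" -le 0 ]; then
        echo "write rate must be greater than zero" >&2
        return 1
    fi
    echo $(( $1 / $2 ))
}

# Example: gcache.size=8G (8192 MiB) at an assumed 2 MiB/s
#   gcache_window_seconds 8192 2
```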

Prime the grastate

Once the binary logs are applied, we can check the final log’s last position to get the seqno we need:

[root@node3 backups]# mysqlbinlog node3-bin.000004 | tail -n 500
...
#150714 18:52:52 server id 1  end_log_pos 335782932 CRC32 0xb983e3b3 	Xid = 63105191
...
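Instead of scanning the tail of the output by eye, the last Xid can be extracted directly. This is a sketch with a hypothetical function name; it assumes every binary log was applied to the end and that the final file ends on a complete transaction.

```shell
# Hypothetical helper: print the last Xid found in mysqlbinlog output on stdin.
last_xid() {
    sed -n 's/.*Xid = \([0-9][0-9]*\).*/\1/p' | tail -n 1
}

# Example use:
#   seqno=$(mysqlbinlog node3-bin.000004 | last_xid)
```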

This is the seqno we put in our grastate.dat. As in the last post, we can copy a grastate.dat from another node to get the proper format. However, this time we must put the proper seqno into place:

# cat grastate.dat
# GALERA saved state
version: 2.1
uuid:    1663c027-2a29-11e5-85da-aa5ca45f600f
seqno:   63105191
cert_index:
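If no other node's grastate.dat is at hand, the same content could be generated. This sketch simply reproduces the version 2.1 layout shown above; the function name is hypothetical, and other Galera versions may use a different format, in which case copying the file from a running node is the safer route.

```shell
# Hypothetical helper: print grastate.dat content in the version 2.1 layout
# shown in this post. Other Galera versions may expect a different format.
grastate_content() {
    # $1: cluster state UUID, $2: seqno of the last applied transaction
    printf '# GALERA saved state\n'
    printf 'version: 2.1\n'
    printf 'uuid:    %s\n' "$1"
    printf 'seqno:   %s\n' "$2"
    printf 'cert_index:\n'
}

# Example use:
#   grastate_content 1663c027-2a29-11e5-85da-aa5ca45f600f 63105191 > /var/lib/mysql/grastate.dat
```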

Be sure the grastate.dat has the proper permissions, uncomment the wsrep settings and restart mysql on the node:

# chown mysql:mysql /var/lib/mysql/grastate.dat
# grep wsrep /etc/my.cnf
wsrep_cluster_address           = gcomm://pxc.service.consul
wsrep_cluster_name              = mycluster
wsrep_node_name                 = node3
wsrep_node_address              = 10.145.50.189
wsrep_provider                  = /usr/lib64/libgalera_smm.so
wsrep_provider_options          = "gcache.size=8G; gcs.fc_limit=1024"
wsrep_slave_threads             = 4
wsrep_sst_method                = xtrabackup-v2
wsrep_sst_auth                  = sst:secret
# systemctl restart mysql

The node should now attempt to join the cluster with the proper GTID:

2015-07-14 19:28:50 4234 [Note] WSREP: Found saved state: 1663c027-2a29-11e5-85da-aa5ca45f600f:63105191
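To confirm that the node picked up the intended position, the saved state can be pulled out of the error log and compared with what was written to grastate.dat. This is a sketch with a hypothetical function name; the log path in the usage comment is an assumption and depends on your configuration.

```shell
# Hypothetical helper: print the most recent "uuid:seqno" that the node
# reported as its saved state, reading MySQL error log lines from stdin.
saved_state() {
    sed -n 's/.*WSREP: Found saved state: \([0-9a-f-]*:[0-9-]*\).*/\1/p' | tail -n 1
}

# Example use (log location is an assumption):
#   saved_state < /var/log/mysqld.log
```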

This, of course, still does not guarantee an IST. See my previous post for more details on the conditions needed for that to happen.

1 Comment

Mann

8 years ago

Hi Jay,
We've been using a different server-id on each PXC node, but your note above recommends that PXC nodes share the same server-id. Is that an oversight? What is the problem with using a different one on each node?