September 20, 2014

Percona XtraDB Cluster: Setting up a simple cluster

Percona XtraDB Cluster (PXC) is different enough from async replication that it can be a bit of a puzzle how to do things the Galera way.  This post illustrates the basics of setting up a two-node PXC cluster from scratch.

Requirements

Two servers (could be VMs) that can talk to each other.  I’m using CentOS for this post.  To make this easy, here’s a dirt-simple Vagrant setup (on VirtualBox): https://github.com/jayjanssen/two_centos_nodes

In our example the servers talk over the 192.168.70.0/24 internal network: node1 is 192.168.70.2 and node2 is 192.168.70.3.

Install the software

Repeat these steps on both nodes.
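As a rough sketch, the install on CentOS might look like the function below. It assumes the Percona yum repository is already configured on the node, and the package names are an assumption that depends on your PXC release. The mysql-libs removal reflects the conflict one reader reports in the comments; skip it if the package is not installed.

```shell
# Sketch only: assumes the Percona yum repository is already set up and
# that these package names match the PXC release you are installing.
install_pxc() {
    # A stock CentOS mysql-libs package can conflict with the Percona RPMs.
    yum -y remove mysql-libs
    yum -y install Percona-XtraDB-Cluster-server \
                   Percona-XtraDB-Cluster-client \
                   percona-xtrabackup
}
```

Run `install_pxc` as root on each node.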

Disable iptables and SELinux

It is possible to run PXC with these enabled, but for simplicity here we just disable them (on both nodes!).
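One way to do this on a CentOS release that uses SysV init is sketched below. The function name is made up for illustration, and `setenforce 0` only puts SELinux into permissive mode until the next reboot.

```shell
disable_iptables_and_selinux() {
    # Put SELinux into permissive mode (lasts until the next reboot).
    setenforce 0
    # Stop the firewall now and keep it from starting at boot.
    service iptables stop
    chkconfig iptables off
}
```

Run `disable_iptables_and_selinux` as root on both nodes. To make the SELinux change survive a reboot, you would also set `SELINUX=permissive` in /etc/selinux/config.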

Configure the cluster nodes

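Create a my.cnf file on each node containing the wsrep settings for this cluster: the cluster name, the addresses of both nodes, this node’s own address, and the SST method and credentials.

Note that wsrep_node_address must be set to that node’s own address.  We only need this because in this environment we are not using the default NIC.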
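A minimal configuration along these lines can be generated with a small helper. The cluster name, addresses, SST method and credentials come from this post; the datadir and the Galera provider path are assumptions for a 64-bit CentOS install, and the helper itself is hypothetical.

```shell
# Usage: write_pxc_config <this_node_ip> [output_file]
# Assumptions: datadir and wsrep_provider paths match your install.
write_pxc_config() {
    node_address=$1
    outfile=${2:-/etc/my.cnf}
    cat > "$outfile" <<EOF
[mysqld]
datadir = /var/lib/mysql
binlog_format = ROW
innodb_autoinc_lock_mode = 2

wsrep_provider = /usr/lib64/libgalera_smm.so
wsrep_cluster_name = twonode
wsrep_cluster_address = gcomm://192.168.70.2,192.168.70.3
wsrep_node_address = $node_address

wsrep_sst_method = xtrabackup
wsrep_sst_auth = sst:secret
EOF
}
```

For example, `write_pxc_config 192.168.70.2` on node1 and `write_pxc_config 192.168.70.3` on node2. Only wsrep_node_address differs between the two files.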

Bootstrap node1

Bootstrapping is simply starting up the first node in the cluster.  Any data on this node is taken as the source of truth for the other nodes.
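A sketch of the bootstrap followed by a status check is below. It assumes a PXC init script that provides the `bootstrap-pxc` action (older releases may need a different way to bootstrap) and a fresh install where the local root user can connect without a password.

```shell
bootstrap_node1() {
    # Start mysqld as the first member of a new cluster.
    service mysql bootstrap-pxc || return 1
    # Show the three status values that describe cluster health.
    mysql -e "SHOW GLOBAL STATUS WHERE Variable_name IN
              ('wsrep_cluster_status', 'wsrep_cluster_size',
               'wsrep_local_state_comment')"
}
```

Run `bootstrap_node1` as root on node1 only.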

We can see the cluster is Primary, the size is 1, and our local state is Synced.  This is a one node cluster!

Prep for SST

SST is how new nodes (post-bootstrap) get a copy of data when joining the cluster.  It is in essence (and reality) a full backup.  We specified Xtrabackup as our backup method and a username/password (sst:secret).  We need to set up a GRANT for that user on node1 so Xtrabackup can run against it to SST node2.

You should not need to re-issue this GRANT when adding more nodes to the cluster.
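The grant could look like the sketch below. The privilege list is the usual set for an Xtrabackup-based SST, and the user and password must match wsrep_sst_auth in my.cnf. It again assumes root can connect locally without a password.

```shell
create_sst_user() {
    # Privileges Xtrabackup needs to take the backup; user and password
    # must match wsrep_sst_auth (sst:secret) in my.cnf.
    mysql -e "GRANT RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.*
              TO 'sst'@'localhost' IDENTIFIED BY 'secret'"
}
```

Run `create_sst_user` on node1 before starting node2.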

Start node2

Assuming you’ve installed the software and created my.cnf on node2, it should be ready to start up with a normal service start.
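On node2 this is an ordinary start rather than a bootstrap. The sketch below also prints the end of the error log if the start fails; the log location under /var/lib/mysql is an assumption based on the default datadir.

```shell
start_node2() {
    if ! service mysql start; then
        # Assumed log location: <datadir>/<hostname>.err
        tail -n 50 "${MYSQL_DATADIR:-/var/lib/mysql}"/*.err >&2
        return 1
    fi
}
```

If the start fails, the WSREP lines in the log usually say whether the node could reach the rest of the cluster.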

Now we check the status of the cluster again.
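A small check for the expected cluster size might look like this. The helper names are hypothetical; the `-N` and `-B` options make the mysql client print a bare, tab-separated row that is easy to parse.

```shell
cluster_size() {
    # -N drops the header row, -B prints tab-separated values.
    mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'" |
        awk '{ print $2 }'
}

check_two_nodes() {
    size=$(cluster_size)
    if [ "$size" = "2" ]; then
        echo "OK: cluster has 2 nodes"
    else
        echo "unexpected wsrep_cluster_size: ${size:-unknown}" >&2
        return 1
    fi
}
```

`check_two_nodes` can be run on either node, since both report the same cluster size once node2 has joined.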

We can see that there are now 2 nodes in the cluster!

The network connection between the nodes is established over the default Galera port, 4567.
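One way to see that connection, assuming netstat is installed on the node, is to filter the TCP connection list for the Galera port:

```shell
galera_connections() {
    # -t TCP only, -n numeric addresses and ports.
    netstat -tn | grep ':4567 '
}
```

On a healthy two-node cluster, `galera_connections` should list an ESTABLISHED connection between 192.168.70.2 and 192.168.70.3.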

Summary

In these steps we:

  • Installed the PXC server package and dependencies
  • Did the bare-minimum configuration to get it started
  • Bootstrapped the first node
  • Prepared for SST
  • Started the second node (SST was copied by netcat over port 4444)
  • Confirmed both nodes were in the cluster

The setup can certainly be more involved than this, but this gives a simple illustration of what it takes to get things rolling.

About Jay Janssen

Jay joined Percona in 2011 after 7 years at Yahoo working in a variety of fields including High Availability architectures, MySQL training, tool building, global server load balancing, multi-datacenter environments, operationalization, and monitoring. He holds a B.S. of Computer Science from Rochester Institute of Technology.

Comments

  1. nurettin says:

    The bootstrap on node1 worked (I had to remove mysql-libs before installing percona rpm)

    in node2 I copied the bare minimum my.cnf and changed:

    wsrep_node_address = 192.168.70.3

    then started:

    service mysql start

    ERROR! MySQL (Percona XtraDB Cluster) server startup failed!

  2. nurettin says:

    131207 13:03:32 [Note] WSREP: gcomm: connecting to group 'twonode', peer '192.168.70.2:,192.168.70.3:'
    131207 13:03:32 [Warning] WSREP: (f97436fe-5f3f-11e3-95b1-bfa94871db44, 'tcp://0.0.0.0:4567') address 'tcp://192.168.70.3:4567' points to own listening address, blacklisting
    131207 13:03:35 [Warning] WSREP: no nodes coming from prim view, prim not possible
    131207 13:03:35 [Note] WSREP: view(view_id(NON_PRIM,f97436fe-5f3f-11e3-95b1-bfa94871db44,1) memb {
    f97436fe-5f3f-11e3-95b1-bfa94871db44,
    } joined {
    } left {
    } partitioned {
    })
    131207 13:03:35 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.51248S), skipping check
    131207 13:04:05 [Note] WSREP: view((empty))
    131207 13:04:05 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
    at gcomm/src/pc.cpp:connect():141
    131207 13:04:05 [ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():196: Failed to open backend connection: -110 (Connection timed out)
    131207 13:04:05 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1292: Failed to open channel 'twonode' at 'gcomm://192.168.70.2,192.168.70.3': -110 (Connection timed out)
    131207 13:04:05 [ERROR] WSREP: gcs connect failed: Connection timed out
    131207 13:04:05 [ERROR] WSREP: wsrep::connect() failed: 7
    131207 13:04:05 [ERROR] Aborting

  3. Connection failed:

    131207 13:04:05 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1292: Failed to open channel 'twonode' at 'gcomm://192.168.70.2,192.168.70.3': -110 (Connection timed out)

    That’s the problem — make sure the 2nd node can reach the first on port 4567.
