MHA MySQL Quick Start Guide

high availabilityMHA (Master High Availability Manager and tools for MySQL) is one of the most important pieces of our managed services. When properly set up, it can check replication health, move writer and reader virtual IPs, perform failovers, and have its output constantly monitored by Nagios. Is it easy to deploy and follows the KISS (Keep It Simple, Stupid) philosophy that I love so much.

This blog post is a quick start guide to try it out and play with it in your own testing environment. I assume that you already know how to install software, deal with SSH keys and setup replication in MySQL. The post just covers MHA configuration.

Testing environment

Taken from /etc/hosts

mysql-server1: Our master MySQL server with 5.6
mysql-server2: Slave server
mysql-server3: Slave server
mha-manager: The server monitors the replication and from where we manage MHA. The installation is also required to meet some Perl dependencies.

We just introduced some new concepts, the MHA Node and MHA Manager:

MHA Node

It is installed and runs on each MySQL server. This is the piece of software that it is invoked by the manager every time we want to do something, like for example a failover or a check.

MHA Manager

As explained before, this is our operations center. The manager monitors the services, replication, and includes several administrative command lines.


  • Replication must already be running. MHA manages replication and monitors it, but it is not a tool to deploy it. So MySQL and replication need to be running already.
  • All hosts should be able to connect to each other using public SSH keys.
  • All nodes need to be able to connect to each other’s MySQL servers.
  • All nodes should have the same replication user and password.
  • In the case of multi-master setups, only one writable node is allowed. All others need to be configured with read_only.
  • MySQL version has to be 5.0 or later.
  • Candidates for master failover should have binary log enabled. The replication user must exist there too.
  • Binary log filtering variables should be the same on all servers (replicate-wild%, binlog-do-db…).
  • Disable automatic relay-log purge and do it regularly from a cron task. You can use an MHA-included script called “purge_relay_logs”.

While that is a large list of requisites, I think that they are pretty standard and logical.

MHA installation

As explained before, the MHA Node needs to be installed on all the nodes. You can download it from this Google Drive link.

This post shows you how to install it using the source code, but there are RPM packages available. Deb too, but only for older versions. Use the installation method you prefer. This is how to compile it:

The commands included in the node package are save_binary_logs, filter_mysqlbinlog, purge_relay_logs, apply_diff_relay_logs. Mostly tools that the manager needs to call in order to perform a failover, while trying to minimize or avoid any data loss.

On the manager server, you need to install MHA Node plus MHA Manager. This is due to MHA Manager dependance on a Perl library that comes with MHA Node. The installation process is just the same.


We only need one configuration file on the Manager node. The example below is a good starting point:

So pretty straightforward. It specifies that there are three servers, two that can be master and one that can’t be promoted to master.

Let’s check if we meet some of the pre-requisites. We are going to test if replication is working, can be monitored, and also if SSH connectivity works.

It works. Now let’s check MySQL:

Start the manager and operations

Everything is setup, we meet the pre-requisites. We can start our manager:

The manager found our master and it is now actively monitoring it using a SELECT command. –remove_dead_master_conf tells the manager that if the master goes down, it must edit the config file and remove the master’s configuration from it after a successful failover. This avoids the “there is a dead slave” error when you restart the manager. All servers listed in the conf should be part of the replication and in good health, or the manager will refuse to work.

Automatic and manual failover

Good, everything is running as expected. What happens if the MySQL master dies!?!

First, it tries to connect by SSH to read the binary log and save it. MHA can apply the missing binary log events to the remaining slaves so they are up to date with all the before-failover info. Nice!

Theses different phases follow:

The phases are pretty self-explanatory. MHA tries to get all the data possible from the master’s binary log and slave’s relay log (the one that is more advanced) to avoid losing any data or promote a slave that it was far behind the master. So it tries to promote a slave with the most current data as possible. We see that server2 has been promoted to master, because in our configuration we specified that server3 shouldn’t be promoted.

After the failover, the manager service stops itself. If we check the config file, the failed server is not there anymore. Now the recovery is up to you. You need to get the old master back in the replication chain, then add it again to the config file and start the manager.

It is also possible to perform a manual failover (if, for example, you need to do some maintenance on the master server). To do that you need to:

  • Stop masterha_manager.
  • Run masterha_master_switch –master_state=alive –conf=/etc/app1.cnf. The line says that you want to switch the master, but the actual master is still alive, so no need to mark it as dead or remove it from the conf file.

And that’s it. Here is part of the output. It shows the tool making the decision on the new topology and asking the user for confirmation: