This post is the second of a series that started here.
The first step to build the HA solution is to create two working instances, configure them to be EBS based and create a security group for them. A third instance, the client, will be discussed in part 7. Since this will be a proof of concept, I’ll be using m1.small type instances, while normally the MySQL host would be much larger; using another type is trivial. I will assume you are using the command line API tools; on Ubuntu, install the “ec2-api-tools” package. The use of these tools simplifies the expression of the commands compared to the web based console.
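If you don’t have the tools yet, installing them is a one-liner (just a sketch; on Ubuntu the package normally lives in the multiverse repository, so make sure that repository is enabled):

sudo apt-get install ec2-api-tools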
Create the security group
The instances involved in the MySQL HA setup will need to be inside the same security group, both for networking purposes and to help identify them. To create a security group, simply run these commands:
yves@yves-laptop:~$ export EC2_CERT=cert-yves.pem
yves@yves-laptop:~$ export EC2_PRIVATE_KEY=pk-yves.pem
yves@yves-laptop:~$ ec2-add-group hamysql -d 'nodes for HA MySQL solution'
GROUP hamysql nodes for HA MySQL solution
From now on, I’ll assume the EC2_CERT and EC2_PRIVATE_KEY environment variables are set up in your shell. Next, we need to authorize some communications for the security group. I’ll authorize 3306/tcp (MySQL) from hamysql, 694/udp (Heartbeat) from hamysql and 22/tcp (SSH) from everywhere. You can be more restrictive for SSH if you want to.
yves@yves-laptop:~$ ec2-authorize hamysql -P tcp -p 3306 -o hamysql -u 834362721059
yves@yves-laptop:~$ ec2-authorize hamysql -P udp -p 694 -o hamysql -u 834362721059
yves@yves-laptop:~$ ec2-authorize hamysql -P tcp -p 22 -s 0.0.0.0/0
yves@yves-laptop:~$ ec2-describe-group hamysql
GROUP 834362721059 hamysql nodes for HA MySQL solution
PERMISSION 834362721059 hamysql ALLOWS tcp 3306 3306 FROM USER 834362721059 GRPNAME hamysql
PERMISSION 834362721059 hamysql ALLOWS udp 694 694 FROM USER 834362721059 GRPNAME hamysql
PERMISSION 834362721059 hamysql ALLOWS tcp 22 22 FROM CIDR 0.0.0.0/0
Launch the instances
Now we can start creating our instances. Since this is only a proof of concept, I’ll build 2 m1.small instances; feel free to use other types. At the time I wrote this, the following AMI seemed ok.
yves@yves-laptop:~$ ec2-describe-images ami-1cdf3775
IMAGE ami-1cdf3775 099720109477/ubuntu-images-testing/ubuntu-lucid-daily-i386-server-20100618 099720109477 available public i386 machine aki-aca44cc5 instance-store
So, launching 2 of these:
yves@yves-laptop:~$ ec2-run-instances ami-1cdf3775 -n 2 -g hamysql -t m1.small -k yves-key
RESERVATION r-a29c31c9 834362721059 hamysql
INSTANCE i-a23a21c9 ami-1cdf3775 pending yves-key 0 m1.small 2010-06-18T20:11:14+0000 us-east-1c aki-aca44cc5 monitoring-disabled instance-store
INSTANCE i-a03a21cb ami-1cdf3775 pending yves-key 1 m1.small 2010-06-18T20:11:14+0000 us-east-1c aki-aca44cc5 monitoring-disabled
yves@yves-laptop:~$ ec2-describe-instances
RESERVATION r-a29c31c9 834362721059 hamysql
INSTANCE i-a23a21c9 ami-1cdf3775 ec2-174-129-89-188.compute-1.amazonaws.com domU-12-31-39-02-BD-C5.compute-1.internal running yves-key 0 m1.small 2010-06-18T20:11:14+0000 us-east-1c aki-aca44cc5 monitoring-disabled 174.129.89.188 10.248.194.51 instance-store
INSTANCE i-a03a21cb ami-1cdf3775 ec2-174-129-187-170.compute-1.amazonaws.com domU-12-31-39-03-A4-62.compute-1.internal running yves-key 1 m1.small 2010-06-18T20:11:14+0000 us-east-1c aki-aca44cc5 monitoring-disabled 174.129.187.170 10.249.167.144 instance-store
I don’t know about you, but I don’t like multi-line output, so I wrote a small filter script that outputs, on one line, the parameters I need, separated by a delimiter.
yves@yves-laptop:~$ cat filtre_instances.pl
#!/usr/bin/perl

$SecGroup = '';
$IPAdd = '';
$Instance_ID = '';

while (<STDIN>) {
    chomp $_;
    #print "Processing: $_\n";

    @fields = split /\t/, $_;

    if (/^RESERVATION/) {
        $SecGroup = $fields[3];
    }

    if (/^INSTANCE/) {
        $IPAdd = $fields[17];
        $STORE = $fields[20];
        $Instance_ID = $fields[1];
        $AMI_ID = $fields[2];
        $PUBDNS = $fields[3];
        $STATUS = $fields[5];
        $START = $fields[10];

        print "$SecGroup|$IPAdd|$Instance_ID|$AMI_ID|$PUBDNS|$STATUS|$START|$STORE\n";
    }
}
and now we have
yves@yves-laptop:~$ ec2-describe-instances | ./filtre_instances.pl | grep hamysql
hamysql|10.248.194.51|i-a23a21c9|ami-1cdf3775|ec2-174-129-89-188.compute-1.amazonaws.com|running|2010-06-18T20:11:14+0000|instance-store
hamysql|10.249.167.144|i-a03a21cb|ami-1cdf3775|ec2-174-129-187-170.compute-1.amazonaws.com|running|2010-06-18T20:11:14+0000|instance-store
which is, in my opinion, easier to manipulate.
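Since the output is pipe-delimited, it is easy to feed into other shell tools. For example, a quick sketch that extracts only the internal IP addresses (field 2) of the hamysql instances:

yves@yves-laptop:~$ ec2-describe-instances | ./filtre_instances.pl | grep hamysql | cut -d'|' -f2
10.248.194.51
10.249.167.144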
Configuring Heartbeat
Now, let’s configure Heartbeat. The first thing to do is to set the hostname on both hosts. Heartbeat identifies the host on which it is running by its hostname, so that’s a mandatory step.
First host:
yves@yves-laptop:~$ ssh -i ~/.ssh/yves-key.pem ubuntu@ec2-174-129-89-188.compute-1.amazonaws.com
ubuntu@domU-12-31-39-07-C8-32:~$ sudo su -
root@domU-12-31-39-07-C8-32:~# hostname monitor
Second host:
yves@yves-laptop:~$ ssh -i ~/.ssh/yves-key.pem ubuntu@ec2-174-129-187-170.compute-1.amazonaws.com
ubuntu@domU-12-31-38-04-E5-E4:~$ sudo su -
root@domU-12-31-38-04-E5-E4:~# hostname hamysql
We don’t really need to set /etc/hostname since it is overwritten when the instance is started, even when using an EBS based AMI. The next step is to install Heartbeat and Pacemaker on both hosts; with Ubuntu 10.04, it is very straightforward:
root@monitor:~# apt-get install heartbeat pacemaker

and

root@hamysql:~# apt-get install heartbeat pacemaker
Then we can proceed and configure Heartbeat; Pacemaker will come later. Heartbeat needs 2 configuration files: /etc/ha.d/authkeys for cluster authentication and /etc/ha.d/ha.cf, which is the configuration file per se. On both hosts, the chosen key in the authkeys file must be identical, and a good way to generate a unique one is to run “date | md5sum” and grab a substring from the output.
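For example, one way to produce such a key (just a sketch; the value will obviously differ on your machine, and cut is only used here to grab the first 25 characters of the md5sum output):

root@monitor:/etc/ha.d# date | md5sum | cut -c 1-25
c97f2bb4b5ae90f149dc314ed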
root@monitor:/etc/ha.d# cat authkeys
auth 1
1 sha1 c97f2bb4b5ae90f149dc314ed
Also, don’t forget to restrict the access rights on the file, like this:
root@monitor:/etc/ha.d# chmod 600 authkeys |
For the /etc/ha.d/ha.cf file, since EC2 supports neither broadcast nor multicast within the security group, we need to use unicast (ucast), so the two files will not be identical. The ucast entry on one host will contain the internal network IP address of the other host. On the monitor host, we will have:
root@monitor:/etc/ha.d# cat ha.cf
autojoin none
ucast eth0 10.249.167.144
warntime 5
deadtime 15
initdead 60
keepalive 2
crm respawn
node monitor
node hamysql
and on the hamysql host:
root@hamysql:/etc/ha.d# cat ha.cf
autojoin none
ucast eth0 10.248.194.51
warntime 5
deadtime 15
initdead 60
keepalive 2
crm respawn
node monitor
node hamysql
Let’s briefly review the configuration file. First, we have set “autojoin none”, which means no host not explicitly listed in the configuration file can join the cluster, so we know we have at most 2 members, “monitor” and “hamysql”. Next come the ucast communication channel used to reach the other node and the timing parameters. “warntime” is a soft timeout, in seconds, after which a warning is logged that the other node is late, while “deadtime” is the hard limit after which Heartbeat will consider the other node dead and start actions to restore the service. “initdead” is just a startup delay to allow the host to fully boot before any action is attempted, and “crm respawn” starts the Pacemaker resource manager. Finally, we have the two “node” declarations for the cluster members.
So we are done with the configuration; time to see if it works. On both hosts, run:
service heartbeat start |
and if everything is right, after at most a minute, you should be able to see both Heartbeat processes chatting over the network:
root@monitor:~# tcpdump -i eth0 port 694
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
20:38:36.536302 IP domU-12-31-38-04-E5-E4.compute-1.internal.57802 > domU-12-31-39-07-C8-32.compute-1.internal.694: UDP, length 211
20:38:36.928860 IP domU-12-31-39-07-C8-32.compute-1.internal.34058 > domU-12-31-38-04-E5-E4.compute-1.internal.694: UDP, length 212
20:38:38.580245 IP domU-12-31-38-04-E5-E4.compute-1.internal.57802 > domU-12-31-39-07-C8-32.compute-1.internal.694: UDP, length 211
20:38:38.938814 IP domU-12-31-39-07-C8-32.compute-1.internal.34058 > domU-12-31-38-04-E5-E4.compute-1.internal.694: UDP, length 212
We can also use the “crm” tool to query the cluster status.
root@monitor:~# crm status
============
Last updated: Tue Jun 29 13:56:04 2010
Stack: Heartbeat
Current DC: monitor (504f45ea-7aee-4fa5-b0ee-a5ac07975ce4) - partition with quorum
Version: 1.0.8-042548a451fce8400660f6031f4da6f0223dd5dd
2 Nodes configured, unknown expected votes
1 Resources configured.
============

Online: [ monitor hamysql ]
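If you prefer, the crm_mon tool that ships with Pacemaker should report the same information; a one-shot query would look like this (sketch only):

root@monitor:~# crm_mon -1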
Install MySQL
For the sake of simplicity, we will just install the MySQL version in the Ubuntu repository by doing:
root@hamysql:~# apt-get install mysql-server-5.1 |
The MySQL package installs an automatic startup script controlled by upstart (new to Lucid). That’s fine; it may surprise you, but Pacemaker will not manage MySQL, just the host running it. I’ll also skip the RAID configuration of multiple EBS volumes since it is not the main purpose of this blog series.
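Since MySQL is left to upstart, you can still check on it the usual way; a quick sketch (the process id shown is of course illustrative):

root@hamysql:~# status mysql
mysql start/running, process 1234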
EBS based AMI
Others have produced excellent articles on how to create EBS based AMIs, so I will not reinvent the wheel. I followed this one: http://www.capsunlock.net/2009/12/create-ebs-boot-ami.html
Upcoming in part 3, the configuration of the HA resources.
Comments

It’s important to keep in mind that EC2 network performance degrades significantly if you attempt to cross availability zones (e.g. going from us-east-1c to us-east-1b). You’ve chosen correctly to put the HA servers in the same zone. This goes for any internal communication that might take place.
Running heartbeat with non-identical ha.cf files is a bad idea. You can just use identical ha.cf files with two ucast lines in there. Any heartbeat node will happily ignore any ucast line that matches a locally configured IP address. Check out the ucast entry in the ha.cf man page (http://www.linux-ha.org/doc/re-hacf.html).
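For illustration, with the addresses used in this post, such a shared ha.cf would look like the sketch below; each node simply ignores the ucast line matching its own address:

autojoin none
ucast eth0 10.248.194.51
ucast eth0 10.249.167.144
warntime 5
deadtime 15
initdead 60
keepalive 2
crm respawn
node monitor
node hamysql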
Dear Yves Trudeau,
Thank You very much for the article.
I’m trying to set up Heartbeat and Pacemaker on two Ubuntu 10.04 servers. I followed your steps. I can see in the output of “tcpdump -i eth0 port 694” that messages are sent and received, but “crm status” gives me the error “Connection to cluster failed: connection failed”.
Any idea? I’m quite new to Heartbeat and Pacemaker.
Thanks in advance
ap1285,
Run the crm status command as root.
You can set the hostname and have it stick after a reboot on EC2.
vi /etc/rc.local
then add a line at the bottom
hostname hamysql
Luke,
which version of the tools are you using, and could you send a sample output of ec2-describe-instances?
I cleared the print line statement. Now, when running it, the line is just printed in a loop. It seems that for some reason the ec2-describe-instances output isn’t being handled by the perl script, unless I’m mistaken.
Luke,
remove the “#” in #print “Processing: $_\n”; and look at the output. Maybe we don’t use the same version of the tools and that causes the regexp to fail.
Great article! It has really been informative.
However, I am having a problem with the filtre_instances.pl script: whenever it is run, the command line hangs and then does nothing. Any ideas? As you know, this script is important in order to get the killing script working correctly.
failing command
ec2-describe-instances -K /usr/local/bin/pk-******.pem -C /usr/local/bin/cert-******.pem | /usr/local/bin/filtre_instance.pl