EmergencyEMERGENCY? Get 24/7 Help Now!

Orchestrator and ProxySQL

 | November 9, 2016 |  Posted In: MySQL, Orchestrator, ProxySQL

PREVIOUS POST
NEXT POST

In this blog post, I am going to show you how can you use Orchestrator and ProxySQL together.

In my previous blog post, I showed how to use bash scripts and move virtual IPs with Orchestrator. As in that post, I assume you already have Orchestrator working. If not, you can find the installation steps here.

In the case of a failover, Orchestrator changes the MySQL topology and promotes a new master. But who lets the application know about this change? This is where ProxySQL helps us.

ProxySQL

You can find the ProxySQL install steps here. In our test, we use the following topology:

screen-shot-2016-11-01-at-14-27-09

For this topology we need the next rules in “ProxySQL”:

See the connection pool:

It shows us “192.168.57.105” is in “hostgroup” 600, which means that server is the master.

How does ProxySQL decide who the new master is?

ProxySQL does not know what the topology looks like, which is really important. ProxySQL is monitoring the “read_only” variables on the MySQL servers, and the server where read_only=off is going to get the writes. If the old master went down and we changed our topology, we have to change the read_only variables on the new master. Of course, applications like MHA or Orchestrator can do that for us.

We have two possibilities here: the master went down, or we want to promote a new master.

Master is down

If the master goes down, Orchestrator is going to change the topology and set the read_only = OFF on the promoted master. ProxySQL is going to realize the master went down and send the write traffic to the server where read_only=OFF.

Let’s do a test. After we stopped MySQL on “192.168.56.105”, Orchestrator promoted “192.168.56.106” as the new master. ProxySQL is using it now as a master:

This happens quickly and does not require any application, VIP or DNS modification.

Promoting a new Master

When we perform a graceful-master-takeover with Orchestrator, it promotes a slave as a new master, removes the old master from the replicaset and sets read_only=ON.

From Orchestrator’s point of view, this is great. It promoted a slave as a new master, and old master is not part of the replicaset anymore. But as I mentioned earlier, ProxySQL does not know what the replicaset looks like.

It only knows we changed the read_only variables on some servers. It is going to send reads to the old master, but it does not have up-to-date data anymore. This is not good at all.

We have two options to avoid this.

Remove master from read hostgroup

If the master is not part of the read hostgroup, ProxySQL won’t send any traffic there after we promote a new master. But in this case, if we lose the slaves, ProxySQL cannot redirect the reads to the master. If we have a lot of slaves, and the replication stopped on the saves because of an error or mistake, the master probably won’t be able to handle all the read traffic. But if we only have a few slaves, it would be good if the master can also handle reads if there is an issue on the slaves.

Using Scheduler

In this great blog post from Marco Tusa, we can see that ProxySQL can use “Schedulers”. We can use the same idea here as well. I wrote a script based on Marco’s that can recognize if the old master is no longer a part of the replicaset.

The script checks the followings:

  • read_only=ON – the server is read-only (on the slave servers, this has to be ON)
  • repl_lag  is NULL – on the master, this should be NULL (if the seconds_behind_master is not defined, ProxySQL will report repl_lag is NULL)

If the read_only=ON, it means the server is not the master at the moment. But if the repl_lag is NULL, it means the server is not replicating from anywhere, and it probably was a master. It has to be removed from the Hostgroup.

Adding a Scheduler

The script has parameters like username, password or port. But we also have to define the read Hostgroup (-G).

Let’s see what happens with ProsySQL after we run the command orchestrator -c graceful-master-takeover -i rep1 -d rep2 :

As we can see, the status changed to OFFLINE_HARD:

This is because we changed the “hostgroup_id” to 9601. This is what we wanted so that the old master won’t get more traffic.

Conclusion

Because ProxySQL redirects the traffic based on the read_only  variables, it is important to start the servers with read_only=ON (even on the master). In that case, we can avoid getting writes on many servers at the same time.

If we want to use graceful-master-takeover with Orchestrator, we have to use a scheduler that can remove the old master from the read Hostgroup.

PREVIOUS POST
NEXT POST
Tibor Korocz

Tibi joined Percona in 2015 as a Consultant. Before joining Percona, among many other things, he worked at the world’s largest car hire booking service as a Senior Database Engineer. He enjoys trying and working with the latest technologies and applications which can help or work with MySQL together. In his spare time he likes to spend time with his friends, travel around the world and play ultimate frisbee.

2 Comments

  • The approach of choosing the master node based on reading the read_only value will work in most cases: orchestrator, or another tool, would remove that property from the demoted master, set it on the new master and we’re happy.
    Or, you’d set read_only=1 in /etc/my.cnf for all your servers, such that a server that panics and restarts always starts up as read_only.

    However the approach does not work in the event of network partitioning of the master: to the world it would appear to be truly dead. But no one is able to set read_only=0 on that master. If it suddenly recovers from the network partitioning, during or after new master promotion, you end up with two different servers, both claiming to be read_only=0.

    To mitigate this you’d need to be able to shoot the failing node node (e.g. if it’s AWS you can halt/restart it) through the orchestrator failover scripts.

    Otherwise it would be best to find a more holistic approach to deciding which is the true master. A service discovery (consul/zk) would be a good candidate for that. Orchestrator would be able to tell consul: oh hey, I just demoted _that_ master and promoted _that_ one; ProxySQL would periodically consult with consul as for identify of master and route write queries based on that info.

    Also see this discussion: https://github.com/sysown/proxysql/issues/789 coincidentally taking place at the same day this post was published.

  • Hi Shlomi,

    Thanks for your great comment. Yes, I have seen that discussion after the post was published and I am already testing/working on an Orchestrator+Consul+ProxySQL setup.

    I am also thinking about that, if it is just a traditional replicaset normal master-slave (not galera), ProxySQL might should disable writes if there are two servers with read_only=OFF in the same hostgroup. It might better not having writes for a short period than writing two nodes and might corrupting your data (of course this is depending on the application and the use cases). I am going to ask Rene what is his opinion about this.

    I think even a scheduler can do this but in that case writes could go to two nodes until the scheduler runs and changes the hostgroups etc..

    Thanks,
    Tibi

Leave a Reply