Orchestrator: MySQL Replication Topology Manager

March 8, 2016
Author
Tibor Korocz
Share this Post:

Orchestrator MySQL replication topology managerThis blog post discusses Orchestrator: MySQL Replication Topology Manager.

What is Orchestrator?

Orchestrator is a replication topology manager for MySQL.

It has many great features:

  • The topology and status of the replication tree is automatically detected and monitored
  • Either a GUI, CLI or API can be used to check the status and perform operations
  • Supports automatic failover of the master, and the replication tree can be fixed when servers in the tree fail – either manually or automatically
  • It is not dependent on any specific version or flavor of MySQL (MySQL, Percona Server, MariaDB or even MaxScale binlog servers)
  • Orchestrator supports many different types of topologies, from a single master -> slave  to complex multi-layered replication trees consisting of hundreds of servers
  • Orchestrator can make topology changes and will do so based on the state at that moment; it does not require a configuration to be defined with what corresponds to the database topology
  • The GUI is not only there to report the status – one of the cooler things you can do is change replication just by doing a drag and drop in the web interface (of course you can do this and much more through the CLI and API as well)

Here’s a gif that demonstrates this (click on an image to see a larger version):

Orchestrator Drag N Drop

Orchestrator’s manual is quite extensive and detailed, so the goal of this blogpost is not to go through every installation and configuration step. It will just give a global overview on how Orchestrator works, while mentioning some important and interesting settings.

How Does It Work?

Orchestrator is a go application (binaries, including rpm  and deb  packages are available for download).

It requires it’s own MySQL database as a backend server to store all information related to the Orchestrator managed database cluster topologies.

There should be at least one Orchestrator daemon, but it is recommended to run many Orchestrator daemons on different servers at the same time – they will all use the same backend database but only one Orchestrator is going to be “active” at any given moment in time. (You can check who is active under the Status  menu on the web interface, or in the database in the active_node  table.)

Using MySQL As Database Backend, Isn’t That A SPOF?

If the Orchestrator MySQL database is gone, it doesn’t mean the monitored MySQL clusters stop working. Orchestrator just won’t be able to control the replication topologies anymore. This is similar to how MHA works: everything will work but you can not perform a failover until MHA is back up again.

At this moment, it’s required to have a MySQL backend and there is no clear/tested support for having this in high availability (HA) as well. This might change in the future.

Database Server Installation Requirements

Orchestrator only needs a MySQL user with limited privileges ( SUPER, PROCESS, REPLICATION SLAVE, RELOAD) to connect to the database servers. With those permissions, it is able to check the replication status of the node and perform replication changes if necessary. It supports different ways of replication: binlog file positions, MySQL&MariaDB GTID, Pseudo GTID and Binlog servers.

There is no need to install any extra software on the database servers.

Automatic Master Failure Recovery

One example of what Orchestrator can do is promote a slave if a master is down. It will choose the most up to date slave to be promoted.

Let’s see what it looks like:

dtdggKGF0f

In this test we lost rep1 (master) and Orchestrator promoted rep4  to be the new master, and started replicating the other servers from the new master.

With the default settings, if rep1 comes back  rep4  is going to continue the replication from rep1. This behavior can be changed with the setting  ApplyMySQLPromotionAfterMasterFailover:True in the configuration.

Command Line Interface

Orchestrator has a nice command line interface too. Here are some examples:

Print the topology:

Move a slave:

Print the topology again:

As we can see,  rep2  now is replicating from rep4 .

Long Queries

One nice addition to the GUI is how it displays slow queries on all servers inside the replication tree. You can even kill bad queries from within the GUI.

Orchestrator GUI Long Queries

Orchestrator Configuration Settings

Orchestrator’s daemon configuration can be found in  /etc/orchestrator.conf.json. There are many configuration options, some of which we elaborate here:

  • SlaveLagQuery  – Custom queries can be defined to check slave lag.
  • AgentAutoDiscover  – If set to True , Orchestrator will auto-discover the topology.
  • HTTPAuthPassword  and HTTPAuthUser  –  Avoids everybody being able to access the Web GUI and change your topology.
  • RecoveryPeriodBlockSeconds  – Avoids flapping.
  • RecoverMasterClusterFilters  –  Defines which clusters should auto failover/recover.
  • PreFailoverProcesses  – Orchestrator will execute this command before the failover.
  • PostFailoverProcesses  – Orchestrator will execute this command after the failover.
  • ApplyMySQLPromotionAfterMasterFailover  –  Detaches the promoted slave after failover.
  • DataCenterPattern  – If there are multiple data centers, you can mark them using a pattern (they will get different colors in the GUI):Orchestrator thinks every server is in a different DC now.

Limitations

While being a very feature-rich application, there are still some missing features and limitations of which we should be aware.

One of the key missing features is that there is no easy way to promote a slave to be the new master. This could be useful in scenarios where the master server has to be upgraded, there is a planned failover, etc. (this is a known feature request).

Some known limitations:

  • Slaves can not be manually promoted to be a master
  • Does not support multi-source replication
  • Does not support all types of parallel replication
  • At this moment, combining this with Percona XtraDB Cluster (Galera) is not supported

Is Orchestrator Your High Availability Solution?

In order to integrate this in your HA architecture or include in your fail-over processes you still need to manage many aspects manually, which can all be done by using the different hooks available in Orchestrator:

  • Updating application connectivity:

    • VIP handling,
    • Updating DNS
    • Updating Proxy server (MaxScale , HAProxy , ProxySQL…) connections.

  • Automatically setting slaves to read only to avoid writes happening on non-masters and causing data inconsistencies
  • Fencing (STONITH) of the dead master, to avoid split-brain in case a crashed master comes back online (and applications still try to connect to it)
  • If semi-synchronous replication needs to be used to avoid data loss in case of master failure, this has to be manually added to the hooks as well

The work that needs to be done is comparable to having a setup with MHA or MySQLFailover.

This post also doesn’t completely describe the decision process that Orchestrator takes to determine if a server is down or not. The way we understand it right now, one active Orchestrator node will make the decision if a node is down or not. It does check a broken node’s slaves replication state to determine if Orchestrator isn’t the only one losing connectivity (in which it should just do nothing with the production servers). This is already a big improvement compared to MySQLFailover, MHA or even MaxScale’s failoverscripts, but it still might cause some problems in some cases (more information can be found on Shlomi Noach’s blog).

Summary

The amount of flexibility and power and fun that this tool gives you with a very simple installation process is yet to be matched. Shlomi Noach did a great job developing this at Outbrain, Booking.com and now at GitHub.

If you are looking for MySQL Topology Manager, Orchestrator is definitely worth looking at.

Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Far
Enough.

Said no pioneer ever.
MySQL, PostgreSQL, InnoDB, MariaDB, MongoDB and Kubernetes are trademarks for their respective owners.
© 2026 Percona All Rights Reserved