EmergencyEMERGENCY? Get 24/7 Help Now!

Orchestrator-agent: How to recover a MySQL database

 | April 13, 2016 |  Posted In: MySQL, Orchestrator

PREVIOUS POST
NEXT POST

orchestrator-agentIn our previous post, we showed how Orchestrator can handle complex replication topologies. Today we will discuss how the Orchestrator-agent complements Orchestrator by monitoring our servers, and provides us a snapshot and recovery abilities if there are problems.

Please be aware that the following scripts and settings in this post are not production ready (missing error handling, etc.) –  this post is just a proof of concept.

What is Orchestrator-agent?

Orchestrator-agent is a sub-project of Orchestrator. It is a service that runs on the MySQL servers, and it gives us the seeding/deploying capability.

In this context “seeding” means copying MySQL data files from a donor server to the target machine. Afterwards, the MySQL can start on the target machine and use the new data files. 

Functionalities (list from Github):
  • Detection of the MySQL service, starting and stopping (start/stop/status commands provided via configuration)
  • Detection of MySQL port, data directory (assumes configuration is /etc/my.cnf)
  • Calculation of disk usage on data directory mount point
  • Tailing the error log file
  • Discovery (the mere existence of the orchestrator-agent service on a host may suggest the existence or need of existence of a MySQL service)
  • Detection of LVM snapshots on MySQL host (snapshots that are MySQL specific)
  • Creation of new snapshots
  • Mounting/umounting of LVM snapshots
  • Detection of DC-local and DC-agnostic snapshots available for a given cluster
  • Transmitting/receiving seed data

The following image shows us an overview of a specific host (click on an image to see a larger version):

Screen Shot 2016-03-28 at 12.55.43

How does it work?

The Orchestrator-agent runs on the MySQL server as a service, and it connects to Orchestrator through an HTTP API. Orchestrator-agent is controlled by Orchestrator. It uses and is based on LVM and LVM snapshots: without them it cannot work.

The agent requires external scripts/commands example:

  • Detect where in the local and remote DCs it can find an appropriate snapshot
  • Find said snapshot on server, mount it
  • Stop MySQL on target host, clear data on MySQL data directory
  • Initiate send/receive process
  • Cleanup data after copy

If these external commands are configured, a snapshot can be created through the Orchestrator web interface example. The agent gets the task through the HTTP API and will call an external script, which creates a consistent snapshot.

Orchestrator-agent configuration settings

There are many configuration options, some of which we’ll list here:

  • SnapshotMountPoint – Where should the agent mount the snapshot.
  • AgentsServer  –  Where is the AgentServer example: “http://192.168.56.111:3001” .
  • CreateSnapshotCommand  – Creating a consistent snapshot.
  • AvailableLocalSnapshotHostsCommand  – Shows us the available snapshots on localhost.
  • AvailableSnapshotHostsCommand  – Shows us the available snapshots on remote hosts.
  • SnapshotVolumesFilter  – Free text which identifies MySQL data snapshots.
  • ReceiveSeedDataCommand  – Command that receives the data.
  • SendSeedDataCommand  – Command that sends the data.
  • PostCopyCommand  – Command to be executed after the seed is complete.

Example external scripts

As we mentioned before, these scripts are not production ready.

"CreateSnapshotCommand": "/usr/local/orchestrator-agent/create-snapshot.sh",

This small script creates a consistent snapshot what agent can use later, but it is going to Lock all the tables for 10 seconds.  (Better solutions can be exists but this is a proof of concept script.)

"AvailableLocalSnapshotHostsCommand": "lvdisplay | grep "LV Path" | awk '{print $3}'|grep my-snapshot",

We can filter the available snapshots based on the “SnapshotVolumesFilter” string.

"AvailableSnapshotHostsCommand": "echo rep4",

You can define a command that can show where the available snapshots in your topology are, or you can use a dedicated slave. In our test, we easily used a dedicated server.

"SnapshotVolumesFilter": "-my-snapshot",

“-my-snapshot” is the filter here.

"ReceiveSeedDataCommand": "/usr/local/orchestrator-agent/receive.sh",

The agent passes two parameters to the script, then it calls the script like this:

/usr/local/orchestrator-agent/recive.sh /var/lib/mysql/ 21234

The script cleans the folder (you can not start while mysqld is running; first you have to stop it on the web interface or command line), listens on the specified port and it waits for the compressed input. After it removes “auto.cnf”,  MySQL recreates a new UUID at start time. Finally, make sure every file has the right owner.

"SendSeedDataCommand": "/usr/local/orchestrator-agent/seed.sh",

The agent passes three parameters to the script:

/usr/local/orchestrator-agent/seed.sh /tmp/MySQLSnapshot rep5 21234

The first parameter is the mount point of the snapshot, and the second one is the destination host and the port number. The script easily compresses the data and sends it through “nc”.

Job details

Screen Shot 2016-03-28 at 16.40.15

A detailed log can be found from every seed. These logs can be really helpful in discovering any problems during the seed.

Why do we need Orchestrator-agent?

If you have a larger MySQL topology where you frequently have to provide new servers, or if you have a dev/staging replica set where you want to easily go back to a previous production stage, Orchestrator-agent can be really helpful and save you a lot of time. Finally, you’ll have the time for other fun and awesome stuff!

Features requests

Orchestrator-agent does its job, but adding a few extra abilities could make it even better:

  • Adding XtraBackup support.
  • Adding Mysqlpump/Mysqldump/Mydumper support.
  • Implement some kind of scheduling.
  • Batch seeding (seeding to more than one server with one job.)

Summary

Shlomi did a great job again, just like with Orchestrator.

Orchestrator and Orchestrator-agent together give us a useful platform to manage our topology and deploy MySQL data files to the new servers, or re-sync old ones.

PREVIOUS POST
NEXT POST
Tibor Korocz

Tibi joined Percona in 2015 as a Consultant. Before joining Percona, among many other things, he worked at the world’s largest car hire booking service as a Senior Database Engineer. He enjoys trying and working with the latest technologies and applications which can help or work with MySQL together. In his spare time he likes to spend time with his friends, travel around the world and play ultimate frisbee.

3 Comments

  • Tibor,

    I realize that these scripts are not production ready but I think your “create-snapshot.sh” won’t actually create a consistent snapshot because FTWRL won’t hold a global lock when the session is ended. Do you know some workaround for this?

    Regards

    • Hi Sebastian,

      Thanks for the response, and yes as I said they are not production ready, but this is a good catch , there is an easy workaround for that example you can put it in background for a few sec like this:

      mysql -e”STOP SLAVE; FLUSH TABLES WITH READ LOCK;select sleep(10);” &>/dev/null &

      Probably there is a better solution than locking the tables for 10sec but this could work. I am going to think about it what could be the best solution and come back to you.

      I update the script with this new line.

      Thanks,
      Tibi

  • Pretty Informative Post…

    One must be aware of backup plans and process….

    I like that one can even pick the backup option accordingly…. People like me who are non tech savvy… for them lot of information is been shared above in the post. Generally we hire experts for data retrieval… that can be stopped and with backup plans we can safe lot of time and money.

    I believe that experts must be hired only in the severe case…

    I liked your post and I will definitely apply all the tips and suggestions you have mentioned above in the post.

    Good Read!!

Leave a Reply