Q&A: Putting MySQL Fabric to use

Percona MySQL webinar followup: Q&AMartin Arrieta and I gave an online presentation last week on “Putting MySQL Fabric To Use.” If you missed it, you can find a recording and the slides here, and the vagrant environment we used plus a transcript of the commands we ran here (be sure to check out the ‘sharding’ branch, as that’s what we used during the webinar).

Thank you all for attending and asking interesting questions. We were unable to answer all of them in the scheduled time, so here are our replies to all the questions.

What is GTID? And how does it relate to MySQL Fabric?
GTID stands for Global Transaction Identifier, and MySQL Fabric requires MySQL replication to be configured to used this in order to work. This means that it won’t work with MySQL versions previous to 5.6. You can find more information in the manual, and in the upcoming Percona webinar, “Using MySQL Global Transaction IDs in Production,” that our colleague Stephane Combaudon will present on August 27.

During any issue, does MySQL Fabric internally do a switch over or failover?
MySQL Fabric will only detect failures and initiate failovers for a group that has been activated. Otherwise, any changes must be made manually through the mysqlfabric command.

For an alter table, how do you avoid replication lag?
We think using pt-online-schema for the alter table is a good way to avoid or minimize this problem.

Are there benchmarks available for MySQL Fabric?
We’re not aware of any, but once we wrap up our current series of blog posts on the topic, we plan to do some benchmarks ourselves. Our interest lies in the overhead the connectors can place vs a standard connector accessing a single MySQL Server, but if you have other ideas, let us know in the comments and we’ll consider adding them to our list.

Can MySQL Fabric be used to handle Big Data?
We think the HA features are usable regardless of data set size, with the caveat that MySQL Fabric does not (currently) handle provisioning the data of the managed instances. It just configures and points replication as required, but you’re responsible for making sure all servers have the same set of data.

We feel that the sharding features, however, may not be a good fit for working with very large data sets yet, specially because of the use of mysqldump/mysql on the shard split/move operations. However, since sharding usually goes hand in hand with big data (a data set size too large is one of the reasons to shard after all) we’re sure this will get improved in the future. Or, someone with some python skills can adjust the scripts to use a more efficient mechanism to move data from one server to another, like Percona XtraBackup.

Does sharding require many changes to application coding (database code on shards etc) at the MySQL Fabric level? Should we sit with the developers/architects to help them understand tables, their usage, and to plan sharding?
Sharding with MySQL Fabric is not transparent (and we’d say sharding, in general, is typically not transparent) so some application coding and DBAs/Devs interaction will be needed. But you should be working close to your developers and architects (or DBAs and sysadmins, depending on which side of the counter you stand) anyway 🙂

Would you say that MySQL Fabric is a good HA replacement removing the need for VIPs?
We don’t see them as mutually exclusive. Simplifying, the HA features of MySQL Fabric are just another way to manage replication between a set of MySQL servers. It doesn’t force any way to make writable and readable nodes available to clients. With some coding, you can use MySQL Fabric together with VIPs, or dynamic DNS entries, or a load balancer like HA Proxy.

Does global group mean fabric node?
If you did not attend the webinar, we’d recommend you take a look at slide 7 to get the required context for this question.

Global group means a group that has the schema for all tables, all the data for global tables, and no data for sharded tables. Since it’s a group, it can be one or more nodes, but at any given time, there will only be one PRIMARY node in the group, and the PRIMARY nodes on all shard groups will be replication slaves of this node.

Is the config file at the application side or on the MySQL Fabric node?
There is some configuration at both ends. The MySQL Fabric node (the ‘store’ server in our Vagrant setup) has some configuration, but applications need their own too. The separation is rather clean though. Typically, the application only needs to know how to reach the MySQL Fabric server (host, port, credentials), while the MySQL Fabric node needs to know how to connect to the nodes it manages, how long to wait before starting a failover, etc. The configuration on MySQL Fabric’s side is done by the fabric.cfg file, while on the application side, it depends on the connector used.

When setting a faulty to secondary, does it also automatically come back in replication as a slave?
Yes, though remember you’ll first have to set it to SPARE 🙂

When we set primary and secondary and it sets the replication, it never copies data and builds replication? It only starts replication from that side and we will lose all the data which is not available on secondary
Yes to the first question. As to the second, we reiterate the fact that MySQL Fabric does not handle provisioning data for instances. When you’re using it to manage nodes, it (currently) assumes the nodes all have a copy of the data. It is your responsibility to make sure that is the case, though this may change in the future. To give you a specific example, suppose you run the following commands:

mysqlfabric group create ha
mysqlfabric group add ha node1
mysqlfabric group add ha node2

Then, on node1’s MySQL’s CLI:

create database test;

and finally run:

mysqlfabric group promote ha --slave_id=<uuid_for_node2>

You’ll end up with node2 being PRIMARY, node1 being SECONDARY, without the ‘test’ database on the PRIMARY node.

Are there any limitations around adding shards? Is it possible to re-configure the sharding keys to handle the additional shard(s)?
We don’t know about any limitations, and yes, it is possible to add more shards and configure the mapping. We did not have enough time to do this at the webinar, but the transcript on the repo includes examples for splitting and moving shards. What you want in this case is to add a new group (can be a single node) and then split a shard (optionally providing a pivot value), keeping one part on the original group and the rest on the new one. Fernando encountered one problem when doing this on a busy server, though the bug is not yet verified so that may have been a user error on his side, but we’d recommend you do similar tests yourself before attempting this in production.

Is the addition of Connectors at the MySQL Fabric level or even at application side? Is an application code change a huge effort?
It’s at the application side. It’s difficult to estimate the effort for the code changes, but if your database access is reasonably modularized/abstracted from your application logic, it shouldn’t be huge. In fact, for HA, if you don’t want to split read and writes, you can migrate an application to a MySQL Fabric-aware connector with just a couple of lines changed (changing the connection string and then requesting a MODE_READWRITE connection always).

What is the minimum server requirements to setup mysqlfabric and does it use any multithreading if i have more than one core for each node. Can we get the transcripts with examples..
We don’t know of the official minimum requirements, but it is a python script which in our experience has been lightweight. Remember it does *not* act as a middle man/proxy between your app and the servers, it is only involved when a client establishes a new connection or when a change in status happens in a server. As for multithreading, we know it’s concurrent (i.e. multiple clients can establish connections at the same time) but we don’t know if it’s parallel.

The transcript with examples can be found here.

How does MySQL Fabric handle latency issues? Does latency increase with mysql traffic?
MySQL Fabric uses regular MySQL connections so latency will affect it the same as any other app using MySQL, and yes, it will increase with traffic (but not due to Fabric).

What if MySQL Fabric is hung? Will connections going to the primary stop? And how can we deduce if there is an issue at the fabric side?
If MySQL Fabric is hung, the following scenarios will be problematic:

  • New connections come in. Clients will try to connect to Fabric before trying to connect any hosts (they won’t know what hosts to connect to otherwise!) so if MySQL Fabric is down, no new connections can be established.
  • A failure happens in the managed servers. MySQL Fabric is responsible for monitoring for failures and taking the appropriate actions (including promoting another server, repointing replicas, etc), so if something fails while MySQL Fabric is down, there will be nobody to take action.

Are there any scenarios where a data loss can happen when promoting/demoting nodes (or nodes failure happens) in a production environment?
Given the use of gtid-enforce-consistency and the fact that MySQL Fabric won’t promote a node to PRIMARY until it has caught up with any pending changes, we feel this is unlikely, but we’re planning a blog post specifically focused on evaluating the potential of data loss during failover.

How to configure MySQL Fabric to use synchronous / semi-synchronous mechanisms?
We have not attempted this, but from this manual page, we think currently only async replication is supported. I would expect sync to be in the roadmap though. If this is something you’d like to see, another section on the same page has some suggestions on how to make it happen.

Thanks again for attending the webinar, and feel free to post any further questions in the comments section.

Share this post

Comments (12)

  • Moody

    Nice Q&A .. will any tests be made integrating PXC with Fabric? i.e replacing the replication topologies with xtradb cluster? I’d imagine it would give PXC the ability for write scaling and auto sharding..

    August 5, 2014 at 1:41 pm
  • Fernando Ipar


    I think trying to integrate Fabric’s HA parts with PXC would not add any benefit (that’s my personal opinion), and it would be complicated anyway, since PXC relies on Galera and not on MySQL replication.

    Sharding could work though, I don’t see a reason not to have each shard group be a PXC, with Fabric setting up async M/S replication between the global PXC and the shards. I would only use Fabric for shard routing though, and I’d never promote or activate any of the groups (as again, the HA features wouldn’t work with Galera anyway).

    Finally, to answer your specific question, I don’t have any short term plans to test this kind of integration, but of course that does not mean it may not happen if the needs comes up while working on a case 🙂

    August 5, 2014 at 1:57 pm
  • Longxing

    Hi Fernando,
    I have a question about the fault-tolerant ability of fabric server, would you please do me a favor.

    I used to read a article about fabric which says:

    The MySQL Fabric process itself is not fault-tolerant and must be restarted in the event of it failing.
    Note that this does not represent a single-point-of-failure for the server farm (HA and/or sharding) as
    the connectors are able to continue routing operations using their local caches while the MySQL
    Fabric process is unavailable.

    So I do some tests about the fault-tolerant ability. I stop the fabirc server and do some queries
    to the connector. But a error happens. The error message is as follows:

    [root@JSSIOEtest01 python]# python test_ddl.py
    Traceback (most recent call last):
    File “test_ddl.py”, line 7, in
    File “/usr/lib/python2.6/site-packages/mysql/connector/__init__.py”, line 144, in connect
    return mysql.connector.fabric.connect(*args, **kwargs)
    File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/__init__.py”, line 53, in connect
    return MySQLFabricConnection(**kwargs)
    File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/connection.py”, line 809, in __init__
    File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/connection.py”, line 948, in store_config
    File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/connection.py”, line 929, in _configure_fabric
    File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/connection.py”, line 332, in seed
    File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/connection.py”, line 763, in connect
    self._proxy = self._xmlrpc_get_proxy()
    File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/connection.py”, line 747, in _xmlrpc_get_proxy
    proxy._some_nonexisting_method() # pylint: disable=W0212
    File “/usr/lib64/python2.6/xmlrpclib.py”, line 1199, in __call__
    return self.__send(self.__name, args)
    File “/usr/lib64/python2.6/xmlrpclib.py”, line 1489, in __request
    File “/usr/lib/python2.6/site-packages/mysql/connector/fabric/connection.py”, line 272, in request
    raise InterfaceError(“Connection with Fabric failed: ” + msg)
    mysql.connector.errors.InterfaceError: Connection with Fabric failed:

    It seems that connector need to connect fabric server first rather than using the local cache.
    Am I right? Or something else?


    August 27, 2014 at 4:03 am
  • Martin Arrieta


    Correct, the first time that you use the connector it will need to read the server information from the store server, however after that it will use the local cache. Try the following example:

    1- Start your script, open the connector and keep running queries.
    2- Stop the store server
    3- Check if the status of your script

    You will see that your script will keep running without problems using the local cache even with the store server down.



    August 27, 2014 at 2:55 pm
    • Shen Longxing


      Grateful for your reply, it does helpful to me.

      And I have another question, would you please do me a favor. Thank you!
      I want to reshard the shards, and it seems that there is no api to achieve that.

      I used to read an article from Oracle saying there are tools using to reshard, but I do not know what are they.
      Can you please give me some advice?


      August 28, 2014 at 10:42 pm
  • Fernando Ipar


    To my knowledge, the only currently existing operations that change shards are split and move. You can see an example of a split here: https://github.com/martinarrieta/vagrant-fabric/blob/sharding/webinar-commands/session-0.org#splitting-a-shard

    FYI, at this moment there are two open bugs that I know about that affect the split and move functionality: http://bugs.mysql.com/bug.php?id=73212 and http://bugs.mysql.com/bug.php?id=73546


    September 1, 2014 at 7:16 pm
    • Shen Longxing


      Grateful for your reply, it does helpful to me.
      I do the split according to your example:https://github.com/martinarrieta/vagrant-fabric/blob/sharding/webinar-commands/session-0.org#splitting-a-shard.

      But, I got a error when execute “mysqlfabric sharding split_shard 1 salaries-4”.
      The log is as follows:
      [root@JSSIOEtest05 python]# mysqlfabric sharding split_shard 1 group_id-3 –split_v alue=10000
      Password for admin:
      Procedure :
      { uuid = 2cf79d32-feaa-4355-a6d4-b644d92704f1,
      finished = True,
      success = False,
      return = BackupError: (‘Error while restoring the backup using the mysql client\n, %s’, “ERROR 1840 (HY000) at line 24 in file: ‘MySQL_132.228.239.24_3317.sql’ : @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_EXECUTED is empty.\n”),
      activities =

      And you told me that there is a open bug about the split:http://bugs.mysql.com/bug.php?id=73212.
      Is it the reason for the error that I met?


      September 2, 2014 at 3:46 am
  • Fernando Ipar


    Yes, I believe that bug is the reason for the error you’re getting. In my test case for the bug, I have ongoing write activity to the shard while doing the split. Is that your case too? If you get this error even with no activity, please add that as a comment to the bug report to help the engineers working on it.


    September 2, 2014 at 8:28 am
  • Shen Longxing


    I just split the shard without any activity. And i will add this to the bug report. Thank you!


    September 2, 2014 at 9:45 am
  • Mahesh Patil

    mysqlfabric v1.5.2 is a shit product from Oracle I have never seen worst quality product than this. It’s PITA .. can’t get setup done. Always throws errors on Index size is too large and connection failures.

    October 22, 2014 at 3:18 am
  • Fernando Ipar


    I have not hit any errors like the one you mention, but I did experience odd errors (including ‘mysqlfabric manage start’ being unable to work at all) after upgrading, which where resolved by completely uninstalling the previous version of mysql-utilities and then doing a fresh install of the new one.

    If you’re using our vagrant environment, you can achieve that by doing vagrant destroy; vagrant up.
    If you’re hitting your problems in our vagrant env please file a bug report here https://github.com/percona/vagrant-fabric/issues in case the problem is our fault.

    If you’re running this on your own test systems I’d recommend filing a bug report at https://bugs.mysql.com.

    October 22, 2014 at 8:25 am
  • Fernando Ipar

    I see you’ve reported this already as http://bugs.mysql.com/bug.php?id=74492 so you can disregard my last comment.

    October 22, 2014 at 8:27 am

Comments are closed.

Use Percona's Technical Forum to ask any follow-up questions on this blog topic.