October 31, 2014

Percona Server 5.5.15 + Galera 21.1-beta2

The Codership team has published beta2 of MySQL 5.5.15 with Galera replication
https://launchpad.net/codership-mysql
and we have ported it to Percona Server:

source code:
lp:~percona-dev/percona-server/percona-server-galera-5.5.15
binaries for RedHat/CentOS 6:
http://www.percona.com/downloads/TESTING/Galera/Percona-XtraDB-Galera-5.5.15.tar.gz

What is the difference between Percona Server+Galera and MySQL 5.5.15 with Galera?
First, of course, Percona Server+Galera is based on our XtraDB engine.
Second, we provide the wsrep_sst_xtrabackup script, which allows you to use Percona XtraBackup for node provisioning.
Percona Server+Galera is still at an early stage, and we are making it available so you can play with it and gain some hands-on experience.

So what is Percona Server+Galera, in the end?
I wrote about this previously, but I want to highlight some points again.

1. It is a new High Availability + Scalability solution for MySQL.
This solution is radically different from regular MySQL replication.

You can actually think of an N-node Percona Server+Galera setup as a Cluster
where each node is active (accepts both reads and writes).
You can perform writes to ANY node.

By comparison, setting up 3-master active MySQL Replication is practically impossible.

2. In contrast to MySQL replication, in the Percona Server+Galera scheme
all nodes are CONSISTENT. A transaction is either committed on all nodes or
not committed at all. Forget about slaves being out of sync with the master.

3. Nodes are able to apply events in parallel. This is true parallel replication, not “per-schema” as in MySQL 5.6.

4. A new node joins the Cluster automatically. There is no manual cloning of a slave and copying it to a new box. Using Percona XtraBackup as the transport to transfer data between nodes gives you minimal locking time.

You are welcome to try it.



About Vadim Tkachenko

Vadim leads Percona's development group, which produces Percona Cloud Tools, Percona Server, Percona XtraDB Cluster and Percona XtraBackup. He is an expert in solid-state storage, and has helped many hardware and software providers succeed in the MySQL market.

Comments

  1. Step 2 above sounds like writes are synchronous across all servers, which is going to be a performance bottleneck and probably worse than standard replication (which can just write to a local binlog and carry on without waiting for slaves to catch up). Can you explain how it reconciles that with the claims of scalability? Does it speculatively replicate queries before they are committed (so they don’t have to wait for the initiating server to commit before starting their copy of the transaction)?
    It’s great that it parallelises replication, but it seems it comes at the expense of write performance?

  2. William says:

    Yes, it would be *very* interesting to see what the performance hit was for this kind of solution, as well as what would happen in the event of a failure of a node.

  3. Mark Callaghan says:

    Vadim – Is there an easy way to describe why Galera doesn’t have a bottleneck from doing sync commit? A naive approach that tried to do sync commit on the innodb transaction log would serialize on it and suffer greatly from network round trip times. But Galera does something clever which I need to learn more about.

  4. Galera replication guarantees that a transaction’s write set is entered in the slave queue of each node in the cluster, and the commit returns to the client when this point is reached. That is, the write set is not applied on the slaves during replication, so the replication delay is proportional to network RTT.

    Note that Galera does not need to write binlog files, and innodb_flush_log_at_trx_commit can be safely set to 0, which compensates quite a lot in master-side processing. In slave processing, Galera uses parallel applying with row-level granularity. This has turned out to be very effective for certain SQL load profiles.

    MySQL asynchronous replication can indeed enable the master to process faster, at the expense of slave lag. But is this a nice situation after all? What are these slaves good for then?
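
The commit path described in comment 4 can be pictured with a toy model. The sketch below is an illustration only, not Galera code: the names Node, ToyCluster, commit and apply_pending are hypothetical, and networking, conflict handling and parallel applying are all left out. It only shows the one property stated above: a commit returns once the write set is queued on every node, while the other nodes apply it later.

```python
from collections import deque


class Node:
    """One cluster member: a queue of received write sets plus applied data."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()   # write sets received but not yet applied
        self.data = {}         # rows this node has applied so far

    def apply_pending(self):
        """Apply every queued write set in the order it was queued."""
        while self.queue:
            _seqno, writeset = self.queue.popleft()
            self.data.update(writeset)


class ToyCluster:
    """Hypothetical stand-in for a cluster; not the real replication protocol."""

    def __init__(self, names):
        self.nodes = {name: Node(name) for name in names}
        self.seqno = 0

    def commit(self, origin, writeset):
        """Queue the write set on every node, then return to the client."""
        self.seqno += 1
        for node in self.nodes.values():
            node.queue.append((self.seqno, dict(writeset)))
        # The originating node applies its own queue right away; the other
        # nodes apply later, outside the client's commit path.
        self.nodes[origin].apply_pending()
        return self.seqno


cluster = ToyCluster(["a", "b", "c"])
cluster.commit("a", {"id=1": "alice"})   # write through node a
cluster.commit("b", {"id=2": "bob"})     # write through node b
for node in cluster.nodes.values():
    node.apply_pending()                 # every node catches up
```

Right after a commit, the other nodes hold the write set in their queues but have not applied it yet; once each node has drained its queue, all three hold the same data.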

  5. Alex says:

    Hi everybody,

    About RTT delays and how bad they are.

    At first glance, yes, with Galera replication every transaction suffers from an RTT delay (+ some additional overhead) in the commit phase. And that will make each individual transaction processing slower by that amount. However:

    1. Contrast one RTT to overall transaction execution time (including many client-server trips for multistatement transaction) – in LAN it is simply insignificant. In fact you should always give the fastest interface to client connections as they tend to be the main bottleneck.

    2. The point which many are missing when speculating about synchronous replication: transactions don’t have to be replicated one at a time. Galera replicates transactions that reached commit point concurrently (so this is what should be called “parallel replication”) and even does transaction aggregation when appropriate. So this 1/RTT transaction rate arithmetic is just wrong. We could achieve ~450 sysbench transactions per second (standalone server figures) on a link with 0.1 second RTT. That amounts to ~45 transactions in replication concurrently.

    3. Commits are serialized _after_ transaction is replicated, so RTT is not participating in commit serialization. Transactions are committed at a maximum rate possible (and in the absence of log flushing it is very fast).

    So all in all, what Galera replication does to transaction – it makes it a little bit longer (depending on the replication network RTT), and single connection transaction rate a little bit lower. And it can be compensated by simply increasing the number of concurrent client connections (which is done anyways to compensate for many other transaction processing latencies and to utilize all CPU cores).

    Finally you can check out some benchmarks:
    http://www.codership.com/content/synchronous-replication-loves-you-again
    http://openlife.cc/blogs/2011/august/running-sysbench-tests-against-galera-cluster
    http://linsenraum.de/erkules/2011/06/momentum-galera.html
    …or try it yourself!
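
The arithmetic in Alex's point 2 can be checked with a few lines. This is a minimal sketch that assumes the throughput and RTT figures quoted in the comment and treats the number of transactions in replication as throughput multiplied by delay (Little's law); the function names are hypothetical and nothing here measures a real cluster.

```python
def in_flight(tps, rtt_seconds):
    """Average number of transactions in replication at once (rate x delay)."""
    return tps * rtt_seconds


def serial_rate(rtt_seconds):
    """Commit-rate ceiling if transactions were replicated one at a time."""
    return 1.0 / rtt_seconds


# Figures quoted in the comment: ~450 transactions per second, 0.1 s RTT.
print("transactions in replication concurrently:", in_flight(450, 0.1))
print("one-at-a-time ceiling (tps):", serial_rate(0.1))
```

The gap between the one-at-a-time ceiling and the reported rate is the reason the simple 1/RTT estimate does not apply when transactions replicate concurrently.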

  6. I’ve had an opportunity to really benchmark the guts out of Galera in August:
    http://openlife.cc/category/topic/galera

    For in-memory workloads there isn’t really any overhead when you write to one master. (And if accounting for the group-commit bug with sync_binlog=1 there’s of course a huge performance boost, but I don’t know if that is a fair comparison.) Then you can write to 2 or more masters which will usually result in better performance than you can get with a single master – this assumes that your workload isn’t 100% writes.

    For disk bound workloads it’s still good, but it apparently aggravates the issues with InnoDB redo log flushing, which Vadim has recently blogged about.

  7. Andy says:

    Seppo,

    >Note that Galera does not need to write binlog files

    In that case can Galera run without binlog enabled at all?

    And if binlog is required, can sync_binlog be set to 0?

  8. Andy,

    Galera catches binlog events directly from MySQL transaction cache, before they are written to binlog files. Therefore, log-bin does not even need to be enabled for Galera replication to work.

    You can enable binlogging just by setting ‘log-bin’ and ‘log-slave-updates’ options and do further tuning with sync_binlog to your liking. To my understanding binlog files are usually not enabled in Galera deployments, there is not much use for these files when Galera replicates everything anyways. However, I know that there are some experiments in using Galera cluster in MySQL master or slave roles. But these use cases will require quite careful MySQL replication management.

  9. marrtins says:

    How about Infiniband? Such replications over Infiniband network should be much faster and with much less latency? Are there plans for native Infiniband support (not IP-over-Infiniband)?

  10. marrtins,

    Yes, I think Infiniband would give improvement.
    There are no direct plans for supporting Infiniband, but it may be implemented if there are such requests.

  11. wsrep_sst_xtrabackup script = amazing idea. I was thinking of trying this myself. Thanks guys!

    Tim

  12. Allan D says:

    Hello, please let me qualify by saying that I am certainly no expert on these issues. I am investigating the feasibility of running Solaris 11 with zones, using the integrated load balancer, and installing Percona Server + Galera to produce a high-availability, load-balanced database backend for a website. Can you tell me if Percona Server + Galera runs on Solaris 11? Also, is Galera the CLUSTER software required to make Percona Server clustered? Are there any other solutions, or is this the primary direction you are going for clustering? I am going to be using Solaris ZFS pools as a backing source for the data files, etc. I appreciate your comments.

    Allan

  13. Allan D says:

    Vadim,
    I have heard that Percona has created beta builds of Percona Server with Galera. This is fantastic as I am very interested in trying out the combination of products for my needs. I think it's exactly what I need.

    However, as you may have figured out from my previous post, I am using Solaris 11 because of the advantages of using load balanced zones, and ZFS data set backing. I am thinking better performance because of the ZFS and tuning options available. Alex was saying that he does not have a build ready for Solaris and that the configurator at Severalnines does not have even an option for Solaris.

    So.. My question for you is this. Do you have any binaries for Solaris 11 at this time that include Percona Server + Galera?
    Do you have Percona Server compiled for Solaris? How do I obtain this software? I am very eager to get a test environment setup so I can decide if it will work for me.

    Thank you,
    Allan

  14. Allan,

    We do not have binaries of Percona Server + Galera for Solaris,
    and actually we do not have it in our plans, unless there is commercial interest in this package on Solaris.

  15. Mark Rose says:

    Any chance of a debian package? :-)

  16. Mikael K. says:

    Do you have a timetable for this project? And will you release the complete source code for Percona Server + Galera when you feel it is ‘production ready’? I have also not been able to find out if Galera has parameters to tweak what happens if communication between members in the cluster is severed.

    We are currently looking for a viable clustering solution for MySQL to implement within a year and this is by far the most promising one technically.

  17. Mikael,

    I can’t provide a timetable yet, but you may follow our blog; we may have something in a couple of weeks.

    All our projects are open source. You can find the current source code here:

    https://code.launchpad.net/~percona-dev/percona-server/percona-server-galera-5.5.15

    A good reference on Galera parameters is here:
    http://codership.com/wiki/doku.php?id=reference

  18. This is great news, this looks like the best bet for people who want open source master-master synchronous replication and are not interested in asynchronous replication (again, what good is a slave if it is out of date?)
