The biggest innovation in TokuDB v7.5 is Read Free Replication (RFR). A few days ago I posted a benchmark showing how much additional throughput a replication slave can achieve while, at the same time, lowering its read IO operations to almost zero. The official documentation on the feature is available here.
In this second blog I want to cover the requirements for RFR, as well as some interesting use-cases for the technology.
The only requirement on the master is that replication logging must be row based (BINLOG_FORMAT=ROW). This is key, because the optimization requires the slave servers to know the before and after images of the SQL operations in order to avoid read IO.
There are two requirements on the slave: (1) the server must be in “read only” mode (read_only=1), and (2) you must disable the default uniqueness checking (tokudb_rpl_unique_checks=0) and/or the default read-before-write behavior (tokudb_rpl_lookup_rows=0).
We decided to disable the RFR optimizations by default (both variables default to 1) because we want to make sure users explicitly request this change in behavior. Even in read only mode, MySQL slaves allow insert/update/delete operations from users with SUPER privileges. And without uniqueness checks and read-before-write there is no way for the slave to know if its data has drifted from the master. It is, however, perfectly fine for your slave to have more or fewer indexes than the master.
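To make the checklist above concrete, here is a minimal sketch in Python that tests a pair of servers against the requirements as stated in this post. It assumes you have already fetched each server's variables (for example from SHOW GLOBAL VARIABLES) into plain dictionaries. The function names rfr_problems and _enabled are hypothetical helpers for illustration, not part of TokuDB or MySQL.

```python
def _enabled(value):
    """Interpret a MySQL boolean variable value (1, "1", "ON", True) as on."""
    return str(value).strip().upper() in ("1", "ON", "TRUE")


def rfr_problems(master_vars, slave_vars):
    """Return a list of unmet Read Free Replication requirements.

    Illustrative helper: both arguments are dicts of server variables,
    as you might build from SHOW GLOBAL VARIABLES on each server.
    """
    problems = []

    # Master: row based logging supplies the before and after images.
    if str(master_vars.get("binlog_format", "")).upper() != "ROW":
        problems.append("master: binlog_format must be ROW")

    # Slave: the server must be in read only mode.
    if not _enabled(slave_vars.get("read_only", 0)):
        problems.append("slave: read_only must be 1")

    # Slave: at least one of the two variables must be set to 0.
    # Both default to 1, which leaves the RFR optimizations off.
    unique_checks = _enabled(slave_vars.get("tokudb_rpl_unique_checks", 1))
    lookup_rows = _enabled(slave_vars.get("tokudb_rpl_lookup_rows", 1))
    if unique_checks and lookup_rows:
        problems.append(
            "slave: set tokudb_rpl_unique_checks=0 and/or tokudb_rpl_lookup_rows=0"
        )

    return problems


if __name__ == "__main__":
    master = {"binlog_format": "ROW"}
    slave = {
        "read_only": "ON",
        "tokudb_rpl_unique_checks": "OFF",
        "tokudb_rpl_lookup_rows": "OFF",
    }
    # An empty list means every requirement listed in this post is met.
    print(rfr_problems(master, slave))
```

Note that a check like this only looks at configuration. It cannot tell you whether a user with SUPER privileges has already modified data on the slave.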
Use Case 1 : Read Scaling
If you care about your data, you likely have one or two slaves online should your master fail. Another great use for slaves is read scaling, meaning you send queries to your slaves to put less load on your master. TokuDB’s read free replication means that ALL of your slaves' read IO capacity is available for these queries. Plus, lag free replication means you can send more of your queries to the slaves without worrying about how current their data is.
Use Case 2 : Mixed Server Environments
Since MySQL replication is independent from the storage engine, it’s always been possible to add a TokuDB slave to an existing environment using InnoDB (or MyISAM). By adding a TokuDB slave to your environment you can measure the reduction in IO, increased replication throughput, and how much compression you can achieve with your own data. No benchmark compares to your own actual data and workload.
Use Case 3 : Big Masters, Small Slaves
Adding an additional TokuDB slave can be much more cost effective since it uses far less CPU, RAM, and IO. Keep in mind that if you’re using MySQL replication for availability you need to make sure that one of your slaves is powerful enough to be promoted to master and handle your workload.
What about MySQL 5.6 and MariaDB 10.x?
RFR is 100% compatible with the parallel replication features in MySQL 5.6 and MariaDB 10.0. There is some implementation effort involved, but the features are compatible.
As evidenced by my benchmarks, TokuDB fundamentally changes the landscape of slave replication performance. But as we’ve seen before, removing one bottleneck only opens your eyes to the next one. Next up is reducing the number of fsync() operations on the slave, which will leave even more IO capacity available for read scaling. Keep an eye on this blog; we’ll have updates as we make progress.
To learn more about TokuDB:
Download it, check out the documentation, or get active in our Jira.