While we should be aiming for small and fast transactions with Galera, it is always possible at some point you might want a single large transaction, but what is involved?

First, this is supposed to be controlled by two settings, wsrep_max_ws_rows  and wsrep_max_ws_size . The first variable is not yet enforced and has no effect – see here and here – so don’t bother tuning this knob just yet. In my opinion, I would rather implement only one – having a limit by rows is hard to control as a DBA since each row’s size can be very different per workload.

The second variable restricts the writeset size in bytes and has better control on cluster performance. If your network and CPU can only process N amount of bytes per second, this is a good variable to enforce. Additionally, the maximum allowed value for this setting is only 2GB, with a default of 1GB. You can actually set this higher than 2GB, but only the 2GB is being enforced in my tests.

Let’s see what happens when we adjust this setting to large values. I have a sysbench table with 10M rows below and will update all rows in a single transaction.

On the MySQL error log this translates to:

Let’s see what happens if we set this to 4GB:

We got a warning that it exceeds the allowed value, not the 2GB, but something else a bit lower than the 32bit UNSIGNED MAX VALUE. So let’s try the transaction again:

We clearly got the same error on the client, and on the MySQL error log, it’s a bit different:

If you really must process a large number of rows, one way to try and reduce the amount of writeset size is to set binlog_row_image  to minimal . Let’s test this again with a 2GB wsrep_max_ws_size  and see how it goes:

In our tests, we reduced it to at least 10x the original writeset size. Thanks to my colleague Przemyslaw for pointing this out.

Now we know how big we can go in terms of size and how to go further with minimal row image, although this does not mean that you can and should be allowing it. Make sure to set a sane limit on the sizes depending on your workload and realistic performance expectations. Large transactions will not only cause unexpected performance issues with your cluster, but will lead to usability issues as well in terms of increased deadlocks. Lastly, make sure to review what limitations there would be when switching to Galera-based clusters here for an enjoyable experience :).

Newest Most Voted
Inline Feedbacks
View all comments
Peter Zaitsev


I think there is another interesting note, thing to test when it comes to large transaction sizes.
If I’m not mistaken the write set serialization is serialized, which means if you have transaction which modifies 10M or rows and it takes 2sec to certify (for example) the cluster will have to pause all other write traffic for this duration. Or was this changed in more recent version ?

Peter Zaitsev


Is this instrumented some way ? Is it possible to see how long certification is taking ?



Really none to be specific. You can check flow control and wsrep_cert_deps_distance but these can be approximations. From this idea I opened https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1512940

Perhaps we can also extend slow logging to include certification time.


the only way to avoid this error “Got error 90 “Message too long” during COMMIT”, is to split the transactions into smaller amounts?