October 21, 2014

trx descriptors: MySQL performance improvements in Percona Server 5.5.30-30.2

Percona Server for MySQLOne major problem in terms of MySQL performance that still stands in the way of InnoDB scalability is the trx_list scan on consistent read view creation. It was originally reported as a part of MySQL bug #49169 and can be described as follows. Whenever a connection wants to create a consistent read, it has to make a snapshot of the transaction states to determine which transactions are seen in the view later. To this end, InnoDB scans trx_list (i.e. the list of currently open transactions) and copies IDs of transactions that have not yet been committed at the current point in time, and thus should not be visible in the consistent read. For the REPEATABLE_READ isolation level, the snapshot is created on the first SELECT for each transaction. For lower isolation levels it is created for each SELECT, even within the same transaction.

Why is that a major problem for MySQL scalability? The scan is performed under kernel_mutex (or trx_sys->mutex in 5.6) and that is one of the top reasons for contention on that mutex.

Percona Server as of version 5.5.30-30.2 significantly reduces the impact of the list scan by replacing it with more efficient operations. In a nutshell, it maintains an array of active transactions which we call the trx descriptors array. Creating a snapshot of transaction states is then a matter of copying a relatively small memory block which, depending on many factors, can be from few times to orders of magnitudes faster than a list scan.

You may be curious if MySQL 5.6 brings any notable improvements in this area. It does, but only for a special case. Read-only transactions, while a great optimization in itself, applies only to the following kinds of transactions:

  • MySQL transactions started with START TRANSACTION READ ONLY
  • MySQL transactions created by a non-locking SELECT statement in the autocommit mode.

The relevant optimization in MySQL 5.6 boils down to the fact that read-only transactions are not added to the list of open transactions. Since they don’t modify any data, it is irrelevant whether a consistent read view sees them or not.

The main limitation is obvious: if you want to benefit from that optimization, you basically want all or most of your transactions to be read-only. Sometimes you may also want to modify your applications to issue START TRANSACTION READ ONLY appropriately. Which is not always applicable or possible in practice.

The “trx descriptors” optimization in Percona Server attacks this problem from a different (and a wider!) angle. It requires neither read-only server nor rewriting application queries. Furthermore, it can be combined with the read-only optimization in MySQL 5.6 to achieve even better scalability while removing the read-only limitations.

Let’s take a look at the following benchmark:

point_select_qps_1024

This is the so-called QPS-mode POINT_SELECT benchmark, i.e. all queries are single-statement SELECT statements in autocommit. Which is the perfect case for read-only optimization in MySQL 5.6 and as you can see MySQL 5.6 shines in this benchmark. This is also the case that Oracle seems to focus the most in their MySQL 5.6 benchmarks. However, the “trx descriptors” optimization in Percona Server 5.5 is fairly close in terms of scalability, while the cost of the trx_list scan in MySQL 5.5 basically kills throughput on high concurrency.

The picture changes dramatically when the read-only optimization in MySQL 5.6 does not kick in. The default sysbench POINT_SELECT mode benchmark does PK lookup SELECT queries wrapped into regular (i.e. not marked as read-only) transactions.

point_select_trx

In this case MySQL 5.6 scalability goes back to the good old MySQL 5.5 times and we see identical results with both, i.e. there’s a notable drop in throughput starting from 256 threads.

But Percona Server’s scalability is the same as in the previous benchmark. The overhead of maintaining the descriptors array is negligible as compared to the list scan and it provides stable throughput even at 4096 concurrent connections.

So both the general “trx descriptors” optimization introduced in Percona Server 5.5.30-30.2 and the special-case “read-only transactions” optimization in MySQL 5.6 are nice ways to improve scalability in highly concurrent workloads. We also have plans to combine them in the future releases of Percona Server 5.6, which is currently in the alpha stage.

We are going to blog about other MySQL performance improvements introduced in our latest Percona Server release, stay tuned.

About Alexey Kopytov

Alexey Kopytov is a Principal Software Engineer at Percona. Before joining Percona in 2010 he was a member of the MySQL development team at Oracle. His focus at Percona is development of both Percona Server and Percona XtraBackup.

Comments

  1. Andy says:

    1) In the 1st benchmark Percona server runs at close to 250K qps. In the 2nd benchmark it is closer to 150K qps. What accounts for the drop?

    2) The MySQL Bug #49169 was closed with the comment “lets give this issue a rest because there is a good reason why we have decided to close this bug.” Does that mean they don’t agree with the Percona server fix? Is there any downside to the Percona fix?

  2. Alexey Kopytov says:

    Andy,

    1) The 1st benchmark is single SELECT queries (i.e. the –skip-trx option to sysbench). The 2nd one is the default sysbench mode, i.e. “BEGIN; SELECT …; COMMIT” which is expected to provide lower TPS values than the “QPS” mode.

    2) There was no fix from Percona at the time MySQL bug #49169 was closed. However, as you can see from the comments there’s some disagreement on whether the Closed status correctly reflects the current state of things.

  3. hickey.liu@gmail.com says:

    I backed port percona’s perf tuning for read_view to AliMySQL-5.5, with 1K connections case, RO 13187 -> 40797, RW QPS 2W -> 3W TPS 5K-> 8.5K Amazing ~

Speak Your Mind

*