A TokuDB Stall Caused by Conflicting Transactions When Opening a Table

One of our customers reported that ‘create table select from’ statements stall for a period of time equal to the TokuDB lock timeout.  This indicated a lock conflict between multiple transactions.  In addition, other MySQL clients that were opening unrelated tables were also stalled.  This indicated that some shared mutex is held too long.  We discuss details about this bug and how it was fixed.  The bug fix will be distributed in TokuDB 7.1.0.

Suppose that we set the tokudb lock timeout to 60 seconds just to exaggerate the stall.

mysql> set global tokudb_lock_timeout=60000;
Query OK, 0 rows affected (0.00 sec)

We then create a simple table.

mysql> create table s (id int primary key);
Query OK, 0 rows affected (0.02 sec)

When we create new table (t) and populate it from some other table (s), the table creation stalls until the TokuDB lock timeout expires.

mysql> create table t select * from s;
Query OK, 0 rows affected (1 min 0.05 sec)
Records: 0 Duplicates: 0 Warnings: 0

The long table creation time is the first bug.

Meanwhile, other MySQL client stall when opening an unrelated table.

other mysql> create table x (id int primary key);
Query OK, 0 rows affected (0.02 sec)

other mysql> select * from x;
Empty set (51.98 sec)

The Bug
MySQL will create a TokuDB handler object for each MySQL client that opens a TokuDB table.  Since these handler objects share handles to the fractal trees, there must be some coordination when opening and closing these objects.  The TokuDB storage engine uses the ‘tokudb_mutex’ to serialize parts of the table open and close functions.

The ‘create table select from’ statement uses a single TokuDB transaction to create the new table, open the new table, and populate the new table from the source table.  The table creation uses that transaction to grab a table lock.  The table open uses an unrelated transaction to find out if the table is empty.  This unrelated transaction also tries to grab a lock on the table but it conflicts with the statement transaction. Eventually, the lock request times out.  This is the cause of the ‘create table select from’ stall.

The TokuDB table open function holds a lock on the ‘tokudb_mutex’ when it is checking the table to see if it is empty.  This causes clients that are opening or closing a TokuDB table to stall on the ‘tokudb_mutex’.

The Bug Fix
The bug fix is quite simple.  Just use the statement transaction rather than a new transaction to check whether or not the table is empty when the table is being opened. This avoids the lock timeout since lock conflicts are impossible with a single transaction. Since the lock timeout is avoided, the ‘tokudb_mutex’ is not held for long periods of time, which solves the serialization problem.

Using the TokuDB lock tree tables in the information schema
We added tables in the information schema that provide access to the TokuDB locks.  This feature will be available in TokuDB 7.1.0.

Suppose that a ‘create table select from’ statement is executing and we still have not fixed the bug.  We can use another MySQL client to show the state of the TokuDB locks to figure out what is going on.

mysql> select * from information_schema.tokudb_lock_waits;
| requesting_trx_id | blocking_trx_id | lock_waits_dname | lock_waits_key_left | lock_waits_key_right | lock_waits_start_time |
| 464               | 462             | ./test/t-main    | -infinity           | +infinity            | 1380718930479         |

This shows that transaction 464 is blocked by transaction 462 when trying to get a table lock on the table ‘t’ in the ‘test’ database.

What clients do these transactions belong to?

mysql> select * from information_schema.tokudb_trx;
| trx_id | trx_mysql_thread_id |
| 462    | 1                   |
| 464    | 1                   |

We find that both transactions were created for MySQL client 1, which is executing the ‘create table select from’ statement.

mysql> select * from information_schema.tokudb_trx,information_schema.processlist where trx_mysql_thread_id = id;
| trx_id | trx_mysql_thread_id | ID | USER | HOST      | DB   | COMMAND | TIME  | STATE        | INFO                           |
| 462    | 1                   | 1  | root | localhost | test | Query   | 40    | After create | create table t select * from s |
| 464    | 1                   | 1  | root | localhost | test | Query   | 40    | After create | create table t select * from s |

Note, I merely documented this problem, Zardosht did all of the debugging.

Share this post

Leave a Reply