How to deal with MySQL deadlocksPeiran Song
A deadlock in MySQL happens when two or more transactions mutually hold and request for locks, creating a cycle of dependencies. In a transaction system, deadlocks are a fact of life and not completely avoidable. InnoDB automatically detects transaction deadlocks, rollbacks a transaction immediately and returns an error. It uses a metric to pick the easiest transaction to rollback. Though an occasional deadlock is not something to worry about, frequent occurrences call for attention.
Before MySQL 5.6, only the latest deadlock can be reviewed using SHOW ENGINE INNODB STATUS command. But with Percona Toolkit’s pt-deadlock-logger you can have deadlock information retrieved from SHOW ENGINE INNODB STATUS at a given interval and saved to a file or table for late diagnosis. For more information on using pt-deadlock-logger, see this post. With MySQL 5.6, you can enable a new variable innodb_print_all_deadlocks to have all deadlocks in InnoDB recorded in mysqld error log.
Before and above all diagnosis, it is always an important practice to have the applications catch deadlock error (MySQL error no. 1213) and handle it by retrying the transaction.
How to diagnose a MySQL deadlock
A MySQL deadlock could involve more than two transactions, but the LATEST DETECTED DEADLOCK section only shows the last two transactions. Also it only shows the last statement executed in the two transactions, and locks from the two transactions that created the cycle. What are missed are the earlier statements that might have really acquired the locks. I will show some tips on how to collect the missed statements.
Let’s look at two examples to see what information is given. Example 1:
1 141013 6:06:22
2 *** (1) TRANSACTION:
3 TRANSACTION 876726B90, ACTIVE 7 sec setting auto-inc lock
4 mysql tables in use 1, locked 1
5 LOCK WAIT 9 lock struct(s), heap size 1248, 4 row lock(s), undo log entries 4
6 MySQL thread id 155118366, OS thread handle 0x7f59e638a700, query id 87987781416 localhost msandbox update
7 INSERT INTO t1 (col1, col2, col3, col4) values (10, 20, 30, 'hello')
8 *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
9 TABLE LOCK table `mydb`.`t1` trx id 876726B90 lock mode AUTO-INC waiting
10 *** (2) TRANSACTION:
11 TRANSACTION 876725B2D, ACTIVE 9 sec inserting
12 mysql tables in use 1, locked 1
13 876 lock struct(s), heap size 80312, 1022 row lock(s), undo log entries 1002
14 MySQL thread id 155097580, OS thread handle 0x7f585be79700, query id 87987761732 localhost msandbox update
15 INSERT INTO t1 (col1, col2, col3, col4) values (7, 86, 62, "a lot of things"), (7, 76, 62, "many more")
16 *** (2) HOLDS THE LOCK(S):
17 TABLE LOCK table `mydb`.`t1` trx id 876725B2D lock mode AUTO-INC
18 *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
19 RECORD LOCKS space id 44917 page no 529635 n bits 112 index `PRIMARY` of table `mydb`.`t2` trx id 876725B2D lock mode S locks rec but not gap waiting
20 *** WE ROLL BACK TRANSACTION (1)
Line 1 gives the time when the deadlock happened. If your application code catches and logs deadlock errors,which it should, then you can match this timestamp with the timestamps of deadlock errors in application log. You would have the transaction that got rolled back. From there, retrieve all statements from that transaction.
Line 3 & 11, take note of Transaction number and ACTIVE time. If you log SHOW ENGINE INNODB STATUS output periodically(which is a good practice), then you can search previous outputs with Transaction number to hopefully see more statements from the same transaction. The ACTIVE sec gives a hint on whether the transaction is a single statement or multi-statement one.
Line 4 & 12, the tables in use and locked are only with respect to the current statement. So having 1 table in use does not necessarily mean that the transaction involves 1 table only.
Line 5 & 13, this is worth of attention as it tells how many changes the transaction had made, which is the “undo log entries” and how many row locks it held which is “row lock(s)”. These info hints the complexity of the transaction.
Line 6 & 14, take note of thread id, connecting host and connecting user. If you use different MySQL users for different application functions which is another good practice, then you can tell which application area the transaction comes from based on the connecting host and user.
Line 9, for the first transaction, it only shows the lock it was waiting for, in this case the AUTO-INC lock on table t1. Other possible values are S for shared lock and X for exclusive with or without gap locks.
Line 16 & 17, for the second transaction, it shows the lock(s) it held, in this case the AUTO-INC lock which was what TRANSACTION (1) was waiting for.
Line 18 & 19 shows which lock TRANSACTION (2) was waiting for. In this case, it was a shared not gap record lock on another table’s primary key. There are only a few sources for a shared record lock in InnoDB:
1) use of SELECT … LOCK IN SHARE MODE
2) on foreign key referenced record(s)
3) with INSERT INTO… SELECT, shared locks on source table
The current statement of trx(2) is a simple insert to table t1, so 1 and 3 are eliminated. By checking SHOW CREATE TABLE t1, you could confirm that the S lock was due to a foreign key constraint to the parent table t2.
Example 2: With MySQL community version, each record lock has the record content printed:
1 2014-10-11 10:41:12 7f6f912d7700
2 *** (1) TRANSACTION:
3 TRANSACTION 2164000, ACTIVE 27 sec starting index read
4 mysql tables in use 1, locked 1
5 LOCK WAIT 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
6 MySQL thread id 9, OS thread handle 0x7f6f91296700, query id 87 localhost ro ot updating
7 update t1 set name = 'b' where id = 3
8 *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
9 RECORD LOCKS space id 1704 page no 3 n bits 72 index `PRIMARY` of table `tes t`.`t1` trx id 2164000 lock_mode X locks rec but not gap waiting
10 Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bit s 0
11 0: len 4; hex 80000003; asc ;;
12 1: len 6; hex 000000210521; asc ! !;;
13 2: len 7; hex 180000122117cb; asc ! ;;
14 3: len 4; hex 80000008; asc ;;
15 4: len 1; hex 63; asc c;;
17 *** (2) TRANSACTION:
18 TRANSACTION 2164001, ACTIVE 18 sec starting index read
19 mysql tables in use 1, locked 1
20 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
21 MySQL thread id 10, OS thread handle 0x7f6f912d7700, query id 88 localhost r oot updating
22 update t1 set name = 'c' where id = 2
23 *** (2) HOLDS THE LOCK(S):
24 RECORD LOCKS space id 1704 page no 3 n bits 72 index `PRIMARY` of table `tes t`.`t1` trx id 2164001 lock_mode X locks rec but not gap
25 Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bit s 0
26 0: len 4; hex 80000003; asc ;;
27 1: len 6; hex 000000210521; asc ! !;;
28 2: len 7; hex 180000122117cb; asc ! ;;
29 3: len 4; hex 80000008; asc ;;
30 4: len 1; hex 63; asc c;;
32 *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
33 RECORD LOCKS space id 1704 page no 3 n bits 72 index `PRIMARY` of table `tes t`.`t1` trx id 2164001 lock_mode X locks rec but not gap waiting
34 Record lock, heap no 3 PHYSICAL RECORD: n_fields 5; compact format; info bit s 0
35 0: len 4; hex 80000002; asc ;;
36 1: len 6; hex 000000210520; asc ! ;;
37 2: len 7; hex 17000001c510f5; asc ;;
38 3: len 4; hex 80000009; asc ;;
39 4: len 1; hex 62; asc b;;
Line 9 & 10: The ‘space id’ is tablespace id, ‘page no’ gives which page the record lock is on inside the tablespace. The ‘n bits’ is not the page offset, instead the number of bits in the lock bitmap. The page offset is the ‘heap no’ on line 10,
Line 11~15: It shows the record data in hex numbers. Field 0 is the cluster index(primary key). Ignore the highest bit, the value is 3. Field 1 is the transaction id of the transaction which last modified this record, decimal value is 2164001 which is TRANSACTION (2). Field 2 is the rollback pointer. Starting from field 3 is the rest of the row data. Field 3 is integer column, value 8. Field 4 is string column with character ‘c’. By reading the data, we know exactly which row is locked and what is the current value.
What else can we learn from analysis? Since most MySQL deadlocks happen between two transactions, we could start the analysis based on that assumption. In Example 1, trx (2) was waiting on a shared lock, so trx (1) either held a shared or exclusive lock on that primary key record of table t2. Let’s say col2 is the foreign key column, by checking the current statement of trx(1), we know it did not require the same record lock, so it must be some previous statement in trx(1) that required S or X lock(s) on t2’s PK record(s). Trx (1) only made 4 row changes in 7 seconds. Then you learned a few characteristics of trx(1): it does a lot of processing but a few changes; changes involve table t1 and t2, a single record insertion to t2. These information combined with other data could help developers to locate the transaction.
Where else can we find previous statements of the transactions? Besides application log and previous SHOW ENGINE INNODB STATUS output, you may also leverage binlog, slow log and/or general query log. With binlog, if binlog_format=statement, each binlog event would have the thread_id. Only committed transactions are logged into binlog, so we could only look for Trx(2) in binlog. In the case of Example 1, we know when the deadlock happened, and we know Trx(2) started 9 seconds ago. We can run mysqlbinlog on the right binlog file and look for statements with thread_id = 155097580. It is always good to then cross refer the statements with the application code to confirm.
$ mysqlbinlog -vvv --start-datetime=“2014-10-13 6:06:12” --stop-datatime=“2014-10-13 6:06:22” mysql-bin.000010 > binlog_1013_0606.out
With Percona Server 5.5 and above, you can set log_slow_verbosity to include InnoDB transaction id in slow log. Then if you have long_query_time = 0, you would be able to catch all statements including those rolled back into slow log file. With general query log, the thread id is included and could be used to look for related statements.
How to avoid a MySQL deadlock
There are things we could do to eliminate a deadlock after we understand it.
– Make changes to the application. In some cases, you could greatly reduce the frequency of deadlocks by splitting a long transaction into smaller ones, so locks are released sooner. In other cases, the deadlock rises because two transactions touch the same sets of data, either in one or more tables, with different orders. Then change them to access data in the same order, in another word, serialize the access. That way you would have lock wait instead of deadlock when the transactions happen concurrently.
– Make changes to the table schema, such as removing foreign key constraint to detach two tables, or adding indexes to minimize the rows scanned and locked.
– In case of gap locking, you may change transaction isolation level to read committed for the session or transaction to avoid it. But then the binlog format for the session or transaction would have to be ROW or MIXED.