3 handy tools to remove problematic MySQL processes

February 27, 2015

DBAs often need to kill queries so that long-running queries on a MySQL server do not hurt performance. Long-running queries can be the result of many factors. Fortunately, Percona Server and Percona Toolkit contain some handy tools to remove problematic MySQL processes. I will highlight each of these tools via some examples in this post.

pt-kill:
There have been some good posts on this blog about the pt-kill tool, like this one by Arunjith Aravindan titled “How a set of queries can be killed in MySQL using Percona Toolkit’s pt-kill.” Let’s dive into pt-kill a bit further with a few more examples. What does pt-kill do? It kills MySQL connections. Say you wanted to run pt-kill from a cronjob and then get an email for every killed process/query. Here is a typical example.

 pt-kill --interval 1 --run-time 1 --busy-time 5 --log /path/to/kill_long_running_thread.log --match-info "^(select|SELECT|Select)" --kill --print --daemonize --user=xxxxxx --password=xxxxxxxxxx

Assume this is running from a cronjob. When pt-kill executes, it kills queries that have been running longer than 5 seconds. By default, pt-kill runs forever; the --run-time option tells pt-kill how long to run before exiting. If the --interval and --busy-time parameters are used together, the explicit --interval value is used. If --busy-time is not given, the default interval is 30 seconds. Note: this kills only read queries (statements that begin with SELECT), as per the --match-info parameter.
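
Before letting a command like this kill anything, it can help to preview which threads would qualify. The sketch below only approximates the selection rule just described and is not pt-kill's own logic. It assumes tab-separated SHOW PROCESSLIST output (as produced by mysql --batch), and preview_victims is a hypothetical helper name. It prints the Id of every thread whose Command is Query, whose Time exceeds the given number of seconds, and whose Info starts with SELECT in any letter case.

```shell
# preview_victims SECONDS
# Reads tab-separated SHOW PROCESSLIST rows on stdin
# (columns: Id User Host db Command Time State Info) and prints the Id of
# each row that roughly satisfies --busy-time SECONDS plus
# --match-info "^(select|SELECT|Select)". Illustration only.
preview_victims() {
    awk -F '\t' -v busy="$1" '
        $1 == "Id" { next }                 # skip the header row, if present
        $5 == "Query" && ($6 + 0) > (busy + 0) && tolower($8) ~ /^select/ {
            print $1
        }
    '
}
```

It could be fed with something like mysql --batch -e 'SHOW FULL PROCESSLIST' | preview_victims 5. pt-kill can also do a dry run on its own: with --print and without --kill it prints the KILL statements it would issue without executing them.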

The pt-kill command above logs all killed queries to the file referenced with the --log option, which should be used together with --daemonize. If you need to be notified via email for every killed query, the script below will do it. Of course, the system must be configured to send e-mail.

#!/bin/bash
# Follow the pt-kill log and mail every new line as it is appended.
tail -n 0 -F /path/to/kill_long_running_thread.log | while IFS= read -r LOG
do
    echo "$LOG" | mail -s "pt-kill alert" sample@test.com
done

You can run this shell script in the background, within screen or with the nohup utility, to monitor the log file continuously. It sends an email whenever a newly killed query is written to the log file referenced in the pt-kill command. Unfortunately, pt-kill has no notify-by-email option at the moment, so this is a workaround.
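
If a fixed subject line is too terse for an alert, the subject can carry the thread id as well. This is a sketch under one assumption: that the lines written by --print look like the example in the pt-kill documentation, "# 2009-07-15T15:04:01 KILL 8 (Query 42 sec) SELECT * FROM huge_table". alert_subject is a hypothetical helper, not part of pt-kill.

```shell
# alert_subject LINE
# Prints a mail subject for a pt-kill --print line, or nothing when the
# line does not look like "# <timestamp> KILL <id> (<command> <n> sec) <query>".
alert_subject() {
    printf '%s\n' "$1" |
        sed -n -E 's/^# [^ ]+ KILL ([0-9]+) \(([^)]*)\).*$/pt-kill alert: thread \1 (\2)/p'
}
```

Inside the mail loop, the subject could then be computed with SUBJECT=$(alert_subject "$LOG"), and the mail sent only when SUBJECT is non-empty.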

To log all killed queries into a database table, use the --log-dsn option as in the example below.

$ pt-kill --interval 1 --busy-time 1 --create-log-table --log-dsn=h=localhost,D=percona,t=kill_log --daemonize --match-info "^(select|SELECT|Select)" --kill

All killed queries will be logged into the percona.kill_log table. The --daemonize option runs this command in the background forever, killing all SELECT queries that run longer than 1 second (--busy-time 1). The --interval option instructs pt-kill to scan processes every 1 second (--interval 1).

mysql> select * from kill_log;
+---------+-----------+---------------------+-------------------------+------------+-------+------+-----------+------+---------+------+------------+-----------------+---------+
| kill_id | server_id | timestamp           | reason                  | kill_error | Id    | User | Host      | db   | Command | Time | State      | Info            | Time_ms |
+---------+-----------+---------------------+-------------------------+------------+-------+------+-----------+------+---------+------+------------+-----------------+---------+
|      17 |         1 | 2015-01-10 08:38:33 | Query matches Info spec |            | 35146 | root | localhost | NULL | Query   |    0 | User sleep | SELECT SLEEP(5) |    NULL |
|      20 |         1 | 2015-01-10 08:38:34 | Query matches Info spec |            | 35223 | root | localhost | NULL | Query   |    0 | User sleep | SELECT SLEEP(5) |    NULL |
+---------+-----------+---------------------+-------------------------+------------+-------+------+-----------+------+---------+------+------------+-----------------+---------+

With killed queries logged to a database table, you can easily get trends and statistics on killed queries via SQL.
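
As one possible example of such statistics, the sketch below prints a query that counts kills per day and per user. It assumes the percona.kill_log table and the timestamp and User columns shown in the output above; kill_log_report_sql is a hypothetical helper name, and the function only prints SQL without connecting to any server.

```shell
# kill_log_report_sql [DAYS]
# Prints a query that counts killed queries per day and user over the
# last DAYS days (default 7). Table and column names follow the
# percona.kill_log example; adjust them to your own --log-dsn.
kill_log_report_sql() {
    days="${1:-7}"
    # Accept only a plain integer so nothing unexpected is spliced into the SQL.
    case "$days" in
        ''|*[!0-9]*) echo "usage: kill_log_report_sql [days]" >&2; return 1 ;;
    esac
    printf 'SELECT DATE(`timestamp`) AS kill_day, `User`, COUNT(*) AS kills\n'
    printf 'FROM percona.kill_log\n'
    printf 'WHERE `timestamp` >= NOW() - INTERVAL %s DAY\n' "$days"
    printf 'GROUP BY kill_day, `User`\n'
    printf 'ORDER BY kill_day, kills DESC;\n'
}
```

The output can be piped straight into the client, for instance kill_log_report_sql 30 | mysql --table.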

By default, the tool kills only the oldest of the queries that have been running longer than the given --busy-time. If you need to kill all the threads that have been busy for longer than the specified --busy-time, then this will do it:

$ pt-kill --victims=all --busy-time=60
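
To make the difference between the --victims modes concrete, here is a small simulation. It is not how pt-kill is implemented. It ignores query classes (--group-by), assumes its input is a list of already-matching threads as "thread_id busy_seconds" pairs, and pick_victims is a hypothetical name. The three modes (oldest, all, all-but-oldest) are the values listed in the pt-kill documentation.

```shell
# pick_victims MODE
# Reads "thread_id busy_seconds" pairs on stdin (threads that already matched)
# and prints the ids that would be chosen under the given --victims mode.
pick_victims() {
    case "$1" in
        oldest)         sort -k2,2nr | head -n 1 | awk '{ print $1 }' ;;
        all)            awk '{ print $1 }' ;;
        all-but-oldest) sort -k2,2nr | tail -n +2 | awk '{ print $1 }' ;;
        *)              echo "unknown mode: $1" >&2; return 1 ;;
    esac
}
```

With three matching threads busy for 75, 300, and 120 seconds, oldest selects only the 300-second thread, all selects all three, and all-but-oldest selects the other two.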

Statement Timeout in Percona Server:
The max-statement-time feature is ported from the Twitter patches. This feature can be used to limit the query execution time by specifying the timeout value in the max_statement_time variable. When the specified number of milliseconds is reached the server aborts the statement and returns the error below to the client.

ERROR 1877 (70101): Query execution was interrupted, max_statement_time exceeded

Let me demonstrate this through another example (the server in this session reports error number 1885 rather than 1877, with the same message):

mysql [localhost] {msandbox} (world) > SET max_statement_time=1;
Query OK, 0 rows affected (0.00 sec)

mysql [localhost] {msandbox} (world) > show variables like 'max_statement_time';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| max_statement_time | 1     |
+--------------------+-------+
1 row in set (0.00 sec)

mysql [localhost] {msandbox} (world) > SELECT * FROM City WHERE District = 'abcd';
ERROR 1885 (70101): Query execution was interrupted, max_statement_time exceeded

mysql [localhost] {msandbox} (world) > UPDATE City SET District='abcd' WHERE ID = 2001;
ERROR 1885 (70101): Query execution was interrupted, max_statement_time exceeded

mysql [localhost] {msandbox} (world) > ALTER TABLE City ADD INDEX district_idx (district);
ERROR 1885 (70101): Query execution was interrupted, max_statement_time exceeded

As you can see from this example, the statement timeout feature works for all statements, including SELECT, DML, and DDL queries.

mysql [localhost] {msandbox} (world) > show status like 'Max_statement%';
+-------------------------------+-------+
| Variable_name                 | Value |
+-------------------------------+-------+
| Max_statement_time_exceeded   | 3     |
| Max_statement_time_set        | 19    |
| Max_statement_time_set_failed | 0     |
+-------------------------------+-------+
3 rows in set (0.00 sec)

The status variables above are statistics for the statement timeout feature. Max_statement_time_exceeded reports the total number of statements that exceeded the defined timeout. Max_statement_time_set reports the number of statements for which an execution time limit was set, and Max_statement_time_set_failed the number for which the limit could not be set. You can find more details in this documentation. The statement timeout feature was introduced in Percona Server 5.6. You can check whether your version of Percona Server supports this feature via the have_statement_timeout variable.

mysql [localhost] {msandbox} (world) > show global variables like 'have_statement_timeout';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| have_statement_timeout | YES   |
+------------------------+-------+
1 row in set (0.00 sec)
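
Returning to the Max_statement_time_* counters shown earlier, a quick way to read them is as a ratio: of the statements that had a limit set, how many ran into it. The sketch below assumes tab-separated "Variable_name Value" rows, as mysql --batch prints them for SHOW GLOBAL STATUS LIKE 'Max_statement%'. timeout_ratio is a hypothetical helper name.

```shell
# timeout_ratio
# Reads "Variable_name<TAB>Value" rows on stdin and reports how many of the
# statements that had a time limit set went on to exceed it.
timeout_ratio() {
    awk -F '\t' '
        $1 == "Max_statement_time_exceeded" { exceeded = $2 }
        $1 == "Max_statement_time_set"      { set_count = $2 }
        END {
            if (set_count > 0)
                printf "%d of %d statements exceeded their limit (%.1f%%)\n", exceeded, set_count, 100 * exceeded / set_count
            else
                print "no statements had a time limit set"
        }
    '
}
```

Fed with the counters from the session above (3 exceeded, 19 set), it would report 3 of 19 statements, or 15.8%.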

Bugs you should be aware of:

https://bugs.launchpad.net/percona-server/+bug/1388533 -> This affects how the feature interacts with stored procedures. If you use stored procedures, max_statement_time might not behave as you expect.
https://bugs.launchpad.net/percona-server/+bug/1307432 -> This is a documentation bug. Percona Server timeouts might not be safe for some statements, such as DDL, and should not be used with such queries; the documentation does not reflect this. Be very careful if you set a global statement timeout, because it affects data-changing queries as well. For best results, set the max_statement_time variable in a session before running queries that you want killed if they run too long, instead of setting it globally.
https://bugs.launchpad.net/percona-server/+bug/1376934 -> This affects the statement timeout feature at the query level; as a workaround, set max_statement_time in a session or globally instead. This bug is fixed as of Percona Server 5.6.22-72.0, the latest version at the time of writing.
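
The advice to prefer a session-level limit can be sketched as a small wrapper. with_statement_timeout is a hypothetical helper name; it assumes max_statement_time is given in milliseconds, as described above, and it only prints SQL rather than talking to a server.

```shell
# with_statement_timeout MILLISECONDS 'SQL STATEMENT'
# Prints a two-statement script: set the limit for this session only,
# then run the statement. Nothing is sent to the server here.
with_statement_timeout() {
    case "$1" in
        ''|*[!0-9]*) echo "timeout must be a whole number of milliseconds" >&2; return 1 ;;
    esac
    printf 'SET SESSION max_statement_time = %s;\n' "$1"
    # Drop one trailing semicolon, if any, so the output always ends with exactly one.
    printf '%s;\n' "${2%;}"
}
```

A call such as with_statement_timeout 5000 'SELECT COUNT(*) FROM City' | mysql world runs in its own client session, so the limit should not leak into other connections or into the global setting.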

InnoDB Kill Idle Transactions:
This feature was introduced in Percona Server 5.5. It limits the age of idle XtraDB transactions: any transaction that stays idle longer than the threshold (in seconds) set in innodb_kill_idle_transaction is killed. This feature is useful when autocommit is disabled on the server side, you rely on the application to commit transactions, and you want to avoid long-running uncommitted transactions. Application logic errors sometimes leave transactions uncommitted. Let me demonstrate it quickly through one example:

mysql [localhost] {msandbox} (world) > show variables like 'autocommit';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| autocommit    | OFF   |
+---------------+-------+

mysql [localhost] {msandbox} (world) > show global variables like 'innodb_kill_idle_transaction';
+------------------------------+-------+
| Variable_name                | Value |
+------------------------------+-------+
| innodb_kill_idle_transaction | 10    |
+------------------------------+-------+

mysql [localhost] {msandbox} (world) > START TRANSACTION; SELECT NOW(); INSERT INTO City_backup (Name,CountryCode,District,Population) VALUES ('Karachi','PK','Sindh','1000000');
Query OK, 0 rows affected (0.00 sec)

+---------------------+
| NOW()               |
+---------------------+
| 2015-01-31 07:11:39 |
+---------------------+
1 row in set (0.00 sec)

Query OK, 1 row affected (0.01 sec)

mysql [localhost] {msandbox} (world) > SHOW ENGINE INNODB STATUS\G
*************************** 1. row ***************************
------------
TRANSACTIONS
------------
---TRANSACTION 173076, ACTIVE 10 sec
1 lock struct(s), heap size 360, 0 row lock(s), undo log entries 1
MySQL thread id 15, OS thread handle 0x7f862e9bf700, query id 176 localhost msandbox init
SHOW ENGINE InnoDB STATUS
TABLE LOCK table `world`.`City_backup` trx id 173076 lock mode IX
----------------------------
END OF INNODB MONITOR OUTPUT
============================
mysql [localhost] {msandbox} (world) > SELECT NOW(); SELECT * FROM City_backup;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    16
Current database: world

+---------------------+
| NOW()               |
+---------------------+
| 2015-01-31 07:12:06 |
+---------------------+
1 row in set (0.01 sec)

Empty set (0.00 sec)
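
In the session above, the connection was dropped once the transaction had sat idle past the 10-second threshold, and the later SELECT returned an empty set, which suggests the uncommitted INSERT was rolled back. To see which connections are candidates before enabling the feature, a query along the following lines may help. It is a sketch that assumes the information_schema.INNODB_TRX and information_schema.PROCESSLIST tables available in MySQL 5.5 and later; idle_trx_sql is a hypothetical helper name that only prints the SQL.

```shell
# idle_trx_sql [SECONDS]
# Prints a query listing connections that hold an open InnoDB transaction
# while sitting idle (Command = Sleep) for at least SECONDS (default 10).
idle_trx_sql() {
    idle="${1:-10}"
    case "$idle" in
        ''|*[!0-9]*) echo "usage: idle_trx_sql [seconds]" >&2; return 1 ;;
    esac
    printf 'SELECT p.ID, p.USER, p.HOST, p.TIME AS idle_seconds, t.trx_started\n'
    printf 'FROM information_schema.INNODB_TRX AS t\n'
    printf 'JOIN information_schema.PROCESSLIST AS p ON p.ID = t.trx_mysql_thread_id\n'
    printf "WHERE p.COMMAND = 'Sleep' AND p.TIME >= %s\n" "$idle"
    printf 'ORDER BY p.TIME DESC;\n'
}
```

For example, idle_trx_sql 10 | mysql --table lists the connections that a threshold of 10 would affect at that moment.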

Conclusion:
In this post, I shared some tools that can help you get rid of long-running transactions to help ensure that you don’t run into performance issues. This is one of the many good reasons to use Percona Server, which has some extra features on top of vanilla MySQL server.

3 Comments

Aftab

11 years ago

I think you may also highlight the need to identify and optimise slow queries.

Jaime Crespo

11 years ago

Twitter’s MAX_STATEMENT_TIME feature is also in upstream 5.7, both as a SELECT syntax http://dev.mysql.com/doc/refman/5.7/en/select.html and as a global and session variable: http://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_max_statement_time

Masroor Farooqi

11 years ago

Excellent review of these tools and helpful examples to illustrate them. It’s good to have these tools under your belt for situations when they are needed.

However, I think Aftab’s comment was getting to the root of the problem. One needs to be worried about long queries, but I would be equally worried about a process having the power of a sledgehammer. A simple case, such as a badly formed select query doing a table scan, can be killed easily by any of these tools. But imagine an update query running on a MyISAM table. The update runs for a long time and then gets quashed before all the updates are complete. Now I have committed the cardinal sin of a DBA: I have left the database in an inconsistent state, definitely not cool.

Even in a transaction safe world, I do an insert/update, and half way through Brutus (my pt-kill process name :-)) comes and kills the process. Now the database has to do a rollback. We all know the rollback is an expensive process, so at this point I’m stuck between a rock and a hard place – also not cool.

The point I’m trying to make is that automated tools are excellent, but they come at a price. In the database world the price is a risk (small as it may be) that there may be unintended consequences. As long as we are ready to take those risks – bring them on.