Here is part two of my MySQL with Diagrams series (Here’s part one – MySQL with Diagrams Part One: Replication Architecture). We are going to explore how MySQL handles thread termination using the KILL command, as visualized in the provided diagram, and provide sample demonstrations to help you better understand.
Many people think they know this topic, but it is unknown or known incorrectly. KILLs are not handled by the thread that runs the KILL command; they are handled by the thread itself, which is killed by another thread. It’s a bit confusing, which is why diagrams work well for this.
The diagram illustrates the interaction between two threads:
- Thread ID 10 represents a worker thread that is actively executing a query.
- Thread ID 12 issues the KILL 10 command to terminate Thread ID 10.
Thread 10: The thread enters a loop where it processes a query in chunks. For operations like ORDER BY
or GROUP BY
or ALTER TABLE
, it reads blocks of rows, processes them, and acknowledges the rows. After processing each block, it checks the thd_killed()
flag to determine if it should continue or terminate.
Thread 12: This thread sends a KILL command, which sets the kill flag for Thread 10 using the function thd_set_kill_status()
.
Kill flag behavior
If the kill flag remains unset (thd_killed()=0
), Thread 10 continues its processing. When the flag is set (thd_killed()=1
), the query execution is aborted, the temporary table is discarded, and any active transactions are rolled back.
MySQL function: is_killed()
The function is_killed() checks whether the thread should terminate:
1 2 3 4 5 6 |
bool Sql_data_context::is_killed() const { const auto kill = thd_killed(get_thd()); DBUG_LOG("debug", "is_killed:" << kill); if (0 == kill) return false; return ER_QUERY_INTERRUPTED != kill; } |
If kill == 0
, the thread continues running. If kill != 0the thread is interrupted,
further execution is stopped.
Conclusion
By understanding the KILL command and how MySQL manages thread lifecycle, you can decide or understand why it takes more time than expected. As always, double-check your query and avoid killing threads indiscriminately in production environments, as this could disrupt critical operations. It was good to be safe than sorry.
Also, what should be next in the series? Comments are more than welcome, as always.
Make a drawing for
UPDATE t SET a=a+1 WHERE id=17
Show the redo log, undo log, and tablespace pages in memory and on disk, and what happens on
COMMIT
and on checkpointing.Hi Kristian,
Thank you for your comments. I will definitely consider them. However, what I’m trying to illustrate is general behavior regarding various topics. It’s quite challenging to address specific queries, isolation levels, and so on. Understanding the main topics will allow you to make a comment for specific cases.
Make a drawing for
UPDATE t SET a=a+1 WHERE id=17
.Show multiple concurrent threads:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
.SET TRANSACTION ISOLATION LEVEL READ COMMITTED
.REPEATABLE READ
isolation level, but doesSTART TRANSACTION READ ONLY
, selects twice the same interval, and does aCOMMIT
, starts a new transaction and selects again.SERIALIZED
in a similar setup as theREPEATABLE READ
thread.Optionally: Also show the purge thread.
A thread is starting a
START TRANSACTION READ ONLY
which takes 8 hours (mysqldump
on a large database).Other threads make changes, other threads read. The database is getting slow.
The
mysqldump
finishes. Within a minute the system recovers.Show on the graphic where index bloat and undo log bloat hurt you.
Using the first graphics
> Show the redo log, undo log, and tablespace pages in memory and on disk, and what happens on
COMMIT
and on checkpointing.make many changes, and have the database crash and recover before
COMMIT
, afterCOMMIT
and before checkpointing and, this is simple, after checkpointing.Show in the graphic why data is lost or not lost and how recovery works.
Show a script (in Python?) that exercises this scenario on a server in a loop, demonstrating verifiably that all this is actually true (replication breaks if the server crashes with
CREATE TEMPORARY TABLE
active and SBR is used – why?)