Replay the Execution of MySQL With RR (Record and Replay)

Chasing bugs can be a tedious task, and multi-threaded software doesn’t make it any easier. Threads are scheduled at different times, instructions do not always produce deterministic results, and reproducing a particular issue may require the exact same threads, doing the exact same work, at the exact same time. As you can imagine, this is not straightforward.

Let’s say your database is crashing or even experiencing a transient stall. By the time you get to it, the crash has already happened, and you are stuck restoring service quickly and doing after-the-fact forensics. Wouldn’t it be nice to replay the work from right before or during the crash and see exactly what was happening?

Record and Replay is a technique where we record the execution of a program so it can be replayed over and over, producing the same result each time. Engineers at Mozilla created RR, an open source tool that records the execution of software and replays it under the well-known GDB.
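In its simplest form, the workflow is a record step followed by any number of replay steps. As a sketch (the mysqld path and options here are illustrative, not from the original investigation):

```
$ rr record ./bin/mysqld --datadir=/var/lib/mysql-data
$ rr replay    # replays the most recent recording under GDB
```

Every replay of the same recording executes the same instructions in the same order, so a breakpoint hit on one replay will be hit at the exact same point on the next.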

A Backup Problem

To demonstrate how powerful the tool is, we will be walking through how we used it to narrow down the issue from PXB-2180 (Special thanks to Satya Bodapati, who helped with all the InnoDB internals research for this bug). 

In summary, we were seeing Percona XtraBackup crashing at the prepare stage (remember, always test your backup!). The crash was happening randomly, sometimes after the second incremental, sometimes after the 10th incremental, with no visible pattern.

The stack trace was also not always the same. It was crashing in different parts of InnoDB, but all the crashes had one thing in common – they always happened while trying to apply a redo log record to the same block (the same space id and page number):

Our suspicion was that the page layout on this block diverged between MySQL and XtraBackup. With these types of bugs, the crash is always the consequence of something that happened earlier; e.g., a crash on the sixth incremental backup could be the consequence of an issue introduced on the fourth incremental.

The main goal at this step is to prove that the page layout has diverged and to identify where.

With this information, we ran MySQL under RR and reran the backup until we saw the same issue at prepare. We can now replay the MySQL execution and check how it compares. Our idea is to:

  1. Read the LSNs for this same page before/after each backup prepare.
  2. Identify all changes to m_space = 4294967294 and m_page_no = 5 in mysqld.

Before we progress further, let’s explain a few things:

  1. m_space = 4294967294 corresponds to the MySQL data dictionary (mysql.ibd) – dict0dict.h:1146
  2. On a disk page, the LSN is stored at byte offset 16 of the page and is 8 bytes long – fil0types.h:66
  3. Pages are written sequentially to disk; for example, with the default 16k page size, bytes 1 to 16384 hold the data for page 0, bytes 16385 to 32768 the data for page 1, and so on.
  4. A frame is the raw data of a page – buf0buf.h:1358

Replaying the Execution

To start, let’s read the LSN stored in mysql.ibd for page five before the backup. We will be using od (check man od for more information) and the information explained above:
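As a sketch of this step, the byte offset follows directly from the facts above: pages are laid out sequentially, and the LSN lives 16 bytes into each page. The tablespace path is illustrative; run it from the datadir against your own mysql.ibd.

```shell
# Offset of the on-disk LSN stamp for page 5, assuming the default
# 16k page size: page_no * page_size + LSN offset within the page (16).
offset=$((5 * 16384 + 16))
echo "${offset}"   # 81936

# Dump the 8-byte LSN in hex (decimal addresses, one hex byte per column):
if [ -f mysql.ibd ]; then
    od -A d -t x1 -j "${offset}" -N 8 mysql.ibd
fi
```

Rerunning the same command after each prepare step lets us compare the stamp over time.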

Then we check whether it matches an LSN stamp from mysqld. For that, we add a conditional breakpoint in the replay session of MySQL at the function buf_flush_note_modification:
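As a sketch, the breakpoint could look like the following inside the rr replay session. The exact member path into the page id depends on the server version; m_space and m_page_no are the fields mentioned above, and block is assumed to be the buf_block_t argument of the function:

```
(gdb) break buf_flush_note_modification if block->page.id.m_space == 4294967294 && block->page.id.m_page_no == 5
(gdb) continue
```

Because the replay is deterministic, this breakpoint fires at exactly the same modifications on every replay, so we can inspect each LSN stamp for the page as many times as we need.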

We can see that the LSN stamp from before the prepare of the full backup matches the first stamp from the replay session. Time to prepare the backup, advance the replay execution, and recheck:

Same LSN stamp on both the server and the backup. Time to move on and start applying the incrementals:
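As a sketch of this step (the backup directories are illustrative), each incremental is applied on top of the full backup with --apply-log-only, which keeps the backup open for further incrementals:

```
$ xtrabackup --prepare --apply-log-only --target-dir=/backups/full --incremental-dir=/backups/inc1
```

After each incremental is applied, we reread the LSN from the backup copy of mysql.ibd and advance the replay session to the matching point, comparing the stamps at every step.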