The need for parallel crash recovery in MySQL
Vadim Tkachenko
I recently filed an Oracle feature request to make crash recovery faster by running in multiple threads.
This might not seem very important, because MySQL does not crash that often. When it does crash, however, crash recovery can take 45 minutes, as I showed in an earlier post.
Even in that case, it still might not be a big issue, as you often fail over to a slave.
However, crash recovery plays an important part in the following processes:
- Backups with Percona XtraBackup (and MySQL Enterprise Backup) and backups with filesystem snapshots.
  - Crash recovery is part of the backup process, so faster crash recovery makes the backup task faster.
- State Snapshot Transfer (SST) in Percona XtraDB Cluster.
  - SST, whether XtraBackup-based or rsync-based, also relies on the crash recovery process, so the faster it is done, the faster a new node joins the cluster.
  - It might seem that Oracle shouldn’t care about Percona XtraDB Cluster. But they are working on MySQL Group Replication. I suspect that when Group Replication copies data to a new node, it will also rely on some kind of snapshot technique (unless they aren’t serious about this feature and will recommend mysqldump/mysqlpump for data copying).
- My recent proof of concept for automatic slave propagation in a Docker environment also uses Percona XtraBackup, and therefore crash recovery for new slaves.
In general, any process that involves MySQL/InnoDB data transfer will benefit from faster crash recovery. In its current state, crash recovery uses just one thread to read and process data. This limits performance on modern hardware, which has multiple CPU cores and fast SSDs.
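To make the idea concrete, here is a minimal sketch of one way log replay could be spread across threads. It is not how InnoDB works today and not a proposed design. The record layout `(lsn, page_id, change)`, the string "pages", and the function names are all hypothetical, chosen only for illustration. The assumption is that changes to different pages are independent, so records can be grouped by page and each group replayed in LSN order by a separate worker.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical redo record: (lsn, page_id, change).
# Real InnoDB redo records are far richer; this is only a toy model.

def group_by_page(records):
    """Bucket records by page, keeping LSN order inside each bucket."""
    buckets = defaultdict(list)
    for lsn, page_id, change in sorted(records, key=lambda r: r[0]):
        buckets[page_id].append(change)
    return buckets

def apply_page(page_id, changes, pages):
    """Replay one page's changes in order; no other worker touches this page."""
    content = pages.get(page_id, "")
    for change in changes:
        content += change  # toy "apply": append the change to the page
    return page_id, content

def parallel_recover(records, pages, workers=4):
    """Replay all records, one task per page, on a pool of worker threads."""
    buckets = group_by_page(records)
    recovered = dict(pages)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(apply_page, pid, changes, pages)
                   for pid, changes in buckets.items()]
        for future in futures:
            page_id, content = future.result()
            recovered[page_id] = content
    return recovered
```

Python threads will not give real CPU parallelism here because of the interpreter lock. The point is only the structure: ordering is preserved per page, while different pages are processed independently.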
It is also important to consider that crash recovery time limits how big the InnoDB log files can be, because a larger log means more changes to replay after a crash. If we improve crash recovery time, we can use very large InnoDB log files (which positively affects performance in general).
Percona is working on ways to make crash recovery faster. However, if faster recovery times are important to your environment, I encourage you to let Oracle know that you want to see parallel crash recovery in MySQL.