Demystifying MySQL Replication Crash Safety
Up to MySQL 5.5, replication was not crash safe: after a crash, it would fail with "duplicate key" or "row not found" errors, or might silently corrupt data. It looks like 5.6 is much better, right? The short answer is maybe: in the simplest case, it is possible to achieve replication crash safety, but not with the default settings. MySQL 5.7 is not much better, and although 8.0 has safer defaults, it is still easy to get things wrong.
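To make "the simplest case" concrete, here is a minimal sketch of the replica settings commonly used for it. It assumes single-threaded replication, File+Pos positioning, and replicated tables that all use InnoDB; it is an illustration under those assumptions, not a recipe for the GTID or MTS cases.

```ini
[mysqld]
# Assumption: single-threaded applier, File+Pos positioning, InnoDB tables only.

# Keep the applier position in an InnoDB table so that it is updated in the
# same transaction as the replicated changes (the 5.6 and 5.7 default is FILE).
relay_log_info_repository = TABLE

# After a restart, discard the existing relay logs and fetch them again from
# the source, starting at the applier position (OFF by default).
relay_log_recovery = ON
```

With these two settings, the position survives a crash together with the data it describes, so the relay logs themselves do not need to be synced to disk.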
Crash safety is impacted by replication positioning (File+Pos or GTID), replication type (single-threaded or MTS), MTS settings (Database or Logical Clock, with or without slave_preserve_commit_order), the syncing of relay logs, and the presence of binary logs, log-slave-updates and their syncing. This is very complicated stuff and even the manual is confused about it.
In this talk, I will explain the impact of each of these settings and help you find the path to crash safety nirvana. I will also give details about replication internals, so you might learn a thing or two.