Preventing downtime-causing emergencies in MySQL can be difficult because they are caused by complex combinations of several things going wrong. Proactive efforts may be sincere, but without knowledge of what causes emergencies, they often fail to prevent further problems. This white paper explains dozens of ways that real emergencies in production systems could have been prevented, and suggests specific actions for preventing them. It grew from a study of hundreds of downtime-causing emergencies among Percona's customers, and is the companion to an article in the Q1 2011 issue of IOUG's SELECT magazine, which analyzes the causes and nature of the incidents. Each recommendation in this paper could have prevented at least one incident of production downtime. The paper includes checklists to help perform and document the measures discussed.
Our MySQL white papers are free. They are released under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 license.