
White Paper: Preventing MySQL Emergencies

February 7, 2011
Author
Baron Schwartz

About a year ago, I started a study of emergency incidents that our customers filed with us. What I found was really surprising, and defied conventional wisdom. I learned a lot about preventing emergencies. I just published the outcome as a white paper, including checklists that you can use and modify for your own servers. (Analysis of the nature and causes of emergencies is due to be published in the next issue of IOUG’s quarterly SELECT journal.)

7 Comments
Sean C
15 years ago

Nice paper. Most production issues/outages can be avoided with proper controls and checklists. A book I highly recommend reading, and one that complements this post: Visible Ops Handbook (http://www.amazon.com/Visible-Ops-Handbook-Starting-Practical/dp/0975568604)

Shlomi Noach
15 years ago

Wow! Thanks for this effort!

May I add some shameless plugs?

With regard to shutdowns (3.5, 3.6): openark kit’s oak-prepare-shutdown performs both actions: it reduces the dirty pages percentage until no further improvement is seen, and it verifies that no temporary tables are open on the slave. It also gracefully stops replication beforehand.
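
The "until no improvement is seen" idea can be sketched as a small polling loop. This is only an illustration, not how oak-prepare-shutdown is actually implemented. The function name and the `read_dirty_pages` callback are hypothetical; in practice the callback might read the `Innodb_buffer_pool_pages_dirty` status counter after `innodb_max_dirty_pages_pct` has been lowered with `SET GLOBAL`.

```python
import time
from typing import Callable


def wait_until_dirty_pages_settle(
    read_dirty_pages: Callable[[], int],
    max_stalls: int = 3,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll the dirty-page count until it stops improving.

    read_dirty_pages is a hypothetical callback supplied by the caller, for
    example one that runs SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty'.
    Returns the lowest count observed.
    """
    best = read_dirty_pages()
    stalls = 0
    # Stop when nothing is dirty, or when max_stalls consecutive polls
    # failed to beat the best value seen so far.
    while best > 0 and stalls < max_stalls:
        sleep(interval)
        current = read_dirty_pages()
        if current < best:
            best = current
            stalls = 0
        else:
            stalls += 1
    return best
```

Passing the reader as a callback keeps the loop independent of any particular MySQL driver, so it can be exercised with canned numbers before being pointed at a real server.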

Check configuration before shutdown (3.8): mycheckpoint (a monitoring tool for MySQL) records all status variables as well as all server variables. By querying a single view, you get the entire history of parameter changes, timestamps and all. It is also easy to cross-reference with the “uptime” value per entry, so you can relate changes that were suspiciously close to a restart (an indication that someone may have changed a variable dynamically yet forgot to update the cnf file).

Security (10.4, 10.4, 10.5): all these and *many* others (e.g. different users sharing the same password) are handled by openark kit’s oak-security-audit. Basically, the tool provides you with a report and recommendations on your MySQL server’s security flaws.
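
As a rough illustration of the "different users sharing the same password" check, the sketch below groups accounts by stored password hash. It is not taken from oak-security-audit. The function name and the input shape are assumptions: rows of `(user, host, password_hash)` such as one might select from the `mysql.user` table. The approach only works where equal passwords produce equal stored hashes, which is not the case for salted authentication plugins.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical row shape: (user, host, password_hash).
Account = Tuple[str, str, str]


def find_shared_passwords(accounts: Iterable[Account]) -> Dict[str, List[str]]:
    """Return {password_hash: [users]} for hashes used by more than one user name."""
    users_by_hash = defaultdict(set)
    for user, _host, password_hash in accounts:
        if password_hash:  # accounts with an empty password are a separate finding
            users_by_hash[password_hash].add(user)
    # The same user defined for several hosts does not count as sharing.
    return {
        password_hash: sorted(users)
        for password_hash, users in users_by_hash.items()
        if len(users) > 1
    }
```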

Vojtech Kurka
15 years ago

“Before shutting down the server, stop replication and issue SHOW SLAVE STATUS. Save the result to a file and refer to it after restart to ensure that replication starts in the correct position”

Can anyone please explain why to do this? AFAIK MySQL saves this information in the master.info file, so you only need to sync it to disk before you power the machine off (which the OS does for you).
Have I perhaps missed some other reasons to do this?

Vojtech
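
For reference, the checklist step quoted above could look roughly like the sketch below: save the SHOW SLAVE STATUS row to a file before shutdown, then compare the replication coordinates after restart. It does not settle whether the step is necessary. The function names and the JSON file format are assumptions, and the status row is assumed to be already fetched as a dictionary keyed by the column names that SHOW SLAVE STATUS returns.

```python
import json
from typing import Any, Dict, Tuple

# SHOW SLAVE STATUS columns that identify the master and the executed position.
POSITION_FIELDS = (
    "Master_Host",
    "Master_Port",
    "Relay_Master_Log_File",
    "Exec_Master_Log_Pos",
)


def save_status(status: Dict[str, Any], path: str) -> None:
    """Write the status row to a JSON file before shutting the server down."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(status, handle, indent=2, sort_keys=True)


def load_status(path: str) -> Dict[str, Any]:
    """Read back a status row previously written by save_status."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def position_diff(
    before: Dict[str, Any], after: Dict[str, Any]
) -> Dict[str, Tuple[Any, Any]]:
    """Return {field: (before, after)} for every position field that changed."""
    return {
        field: (before.get(field), after.get(field))
        for field in POSITION_FIELDS
        if before.get(field) != after.get(field)
    }
```

An empty result from `position_diff` means the slave would resume from the saved coordinates; anything else is worth investigating before starting replication.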

