You may have heard about UniSuper’s problem last week; see here. The $135 billion fund’s data was wiped from its cloud environment, and everything disappeared: zip, zero, nada. Everything they relied upon was gone. Their Google Cloud experience has been labeled an “unprecedented occurrence.” Wouldn’t you like to hear the audio recording of the help desk call to Google asking where UniSuper’s data was?
Yikes. Imagine waking up one day to find your total online presence was gone. Google said: “The disruption of UniSuper services was caused by a combination of rare issues at Google Cloud that resulted in an inadvertent misconfiguration during the provisioning of UniSuper’s Private Cloud, which triggered a previously unknown software bug that impacted UniSuper’s systems.”
Luckily, UniSuper had backups that were not on Google Cloud. “UniSuper had backups in place with an additional service provider. These backups have minimized data loss and significantly improved the ability of UniSuper and Google Cloud to complete the restoration.” That quote comes from a joint statement by Google Cloud and UniSuper. What “minimized” works out to for your own operation may be a very different number; just how stuck would you be?
Thankfully, UniSuper is back online. A $135 billion fund managing retiree savings can ride out an event like this better than most organizations can. In 2021, French cloud provider OVH had a data center fire and lost data that could not be recovered. As clouds mushroom in size, the potential for a ‘mushroom cloud’ disaster grows with them. Paranoia in dealing with data is not a bad thing. So, in light of these events, ask yourself in a Clint Eastwood voice, “How lucky do you feel?”
Your lesson: Create a backup plan
What does this mean for you and your databases? Could you be down for a week or more, waiting for your data to be restored? What if an unprecedented bug like this one hits you? Could you switch cloud providers quickly? And what does the SLA you signed say about this sort of thing, anyway? Here are three steps to help you keep your data safe.
Step one: Make sure you are actually taking backups. Tools like Percona XtraBackup or Percona Backup for MongoDB simplify performing full and incremental backups. Take it a step further and run practice restorations to verify that your backups are sound and to get a realistic estimate of how long it takes to go from zero back to operational. So you have the backup, and you are practicing, timing, and documenting what it takes to get your data back online; a rough sketch of what that could look like follows.
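Here is a minimal sketch of that practice, assuming a MySQL server with Percona XtraBackup installed. The backup directory is a placeholder, not a prescribed layout, and the point is simply to time the backup and prepare steps the same way every time.

```python
#!/usr/bin/env python3
"""Minimal sketch: take a full backup with Percona XtraBackup and time the
backup and prepare steps. Paths are placeholders; adjust for your environment.
Assumes xtrabackup is installed and the script runs with access to the datadir."""

import subprocess
import time
from datetime import datetime

# Hypothetical location -- replace with your own backup root.
BACKUP_ROOT = "/backups/mysql"


def run_timed(label, cmd):
    """Run a command, raise on failure, and report how long it took."""
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    elapsed = time.monotonic() - start
    print(f"{label} finished in {elapsed:.1f} seconds")
    return elapsed


def full_backup():
    target = f"{BACKUP_ROOT}/{datetime.now():%Y%m%d_%H%M%S}"

    # Full backup: copies the datadir plus InnoDB redo activity to the target.
    run_timed("backup", ["xtrabackup", "--backup", f"--target-dir={target}"])

    # Prepare: applies the redo log so the copy is consistent and restorable.
    run_timed("prepare", ["xtrabackup", "--prepare", f"--target-dir={target}"])

    return target


if __name__ == "__main__":
    print("Backup prepared at:", full_backup())
```

Recording those timings regularly gives you a defensible recovery estimate rather than a guess.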
Step two: Disk space is relatively cheap these days. Keep copies of your backups in multiple locations, ideally with more than one provider, and make sure your staff knows how to access these alternatives; a sketch of copying a backup offsite follows.
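As a rough illustration, here is one way to push a backup archive to a second, independent provider over an S3-compatible API. The endpoint, bucket, and file names are hypothetical, and credentials are assumed to come from the environment.

```python
#!/usr/bin/env python3
"""Minimal sketch: copy a backup archive to a second, independent location
using the S3-compatible API that many providers expose."""

import boto3

# Hypothetical values -- point these at a provider other than your primary cloud.
OFFSITE_ENDPOINT = "https://s3.example-offsite-provider.com"
OFFSITE_BUCKET = "db-backups-offsite"
LOCAL_ARCHIVE = "/backups/mysql/20240601_0300.tar.gz"

s3 = boto3.client("s3", endpoint_url=OFFSITE_ENDPOINT)

# Upload the archive; the key mirrors the local file name so staff can find it.
s3.upload_file(LOCAL_ARCHIVE, OFFSITE_BUCKET, "mysql/20240601_0300.tar.gz")
print("Offsite copy complete")
```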
Step three: Plan your actions for a situation like UniSuper’s. Document your secondary and tertiary operational bases with checklists of what needs to be done. You already have a plan to back up your data (see step one); now plan what to do when things go wrong. Document who is doing what, where they are doing it, and how to get it all done with minimal service interruption; a sketch of a runbook kept as checkable data follows. It is better to have a plan and never need it than to have no strategy and scramble.
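One way to keep that documentation honest is to store the runbook as structured data that can be checked automatically. The steps, owners, and dates below are invented placeholders; the point is that every step has an owner and a recent rehearsal on record.

```python
#!/usr/bin/env python3
"""Minimal sketch: a failover runbook as data, with a check that each step
has been rehearsed recently."""

from datetime import date, timedelta

# Hypothetical runbook entries for a 'primary cloud is gone' scenario.
RUNBOOK = [
    {"step": "Declare incident and notify stakeholders", "owner": "on-call DBA",
     "last_rehearsed": date(2024, 4, 12)},
    {"step": "Restore latest backup at secondary provider", "owner": "platform team",
     "last_rehearsed": date(2024, 4, 12)},
    {"step": "Repoint application connection strings", "owner": "app team",
     "last_rehearsed": date(2024, 2, 2)},
]

STALE_AFTER = timedelta(days=90)

# Flag any step that has not been rehearsed recently -- a plan nobody has
# practiced is barely better than no plan.
for entry in RUNBOOK:
    if date.today() - entry["last_rehearsed"] > STALE_AFTER:
        print(f"STALE: '{entry['step']}' (owner: {entry['owner']}) "
              f"last rehearsed {entry['last_rehearsed']}")
```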
Now that you have a plan, it may make sense to evaluate your cloud provider. Some providers charge steep fees to move data out of their environment. In light of what happened to UniSuper, it is worth pointing out how catastrophic this event was for both parties and asking whether easing those charges would benefit you both. After completing all the steps and talking with your provider, you can ask yourself again, in your best Clint Eastwood voice, “How lucky do you feel?” And be assured in your answer: you’re not lucky. You are prepared.