Tame Black Friday Gremlins — Optimize Your Database for High Traffic Events

It’s that time of year! The Halloween decorations have come down, the leaves have started to change, and the Black Friday/Cyber Monday buying season is upon us!

For consumers, it can be a magical time of year, but for those of us who have worked in e-commerce or retail, it usually brings up…different emotions. It’s much like the Gremlins, which stay cute and cuddly unless you break the RULES:

  1. Don’t expose them to sunlight,
  2. Don’t let them come in contact with water,
  3. NEVER feed them after midnight!

I love this analogy and how it parallels the difficulties that we experience in the database industry — especially this time of year. When things go well, it’s a great feeling. When things go wrong, they can spiral out of control in destructive and lasting ways.

Let’s put these fun examples to work and optimize your database!

Don’t Expose Your Database to “Sunlight”

One sure-fire way to keep your persistent data store from doing its job, and effectively kill it, is to let it run out of storage. Before entering the high-traffic holiday selling season, make sure that you have ample storage space to make it all the way to the other side. This may sound basic, but so is keeping a cute, fuzzy pet out of the sunlight, and it’s much harder than you think!

Here are some great ways to ensure the storage needs for your database are met (most obvious to least obvious):

  1. If you are on a DBaaS such as Amazon RDS, leverage something like Amazon RDS Storage Auto Scaling
  2. In a cloud or elastic infrastructure:
    1. make sure network-attached storage is extensible on the fly, or
    2. place the database mount point on logical volume management (LVM) or software RAID so that you can add volumes (capacity) on the fly.
  3. In an on-premises or pre-purchased infrastructure, make sure you are overprovisioned by ~25%, even measured against your end-of-season estimates.
  4. Put your logs somewhere other than the main drive. The database may not be happy about running out of log space, but logs can be deleted easily, and data files cannot!
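The ~25% overprovisioning guideline can be turned into a quick pre-season check. The sketch below is only an illustration: the function names, the idea of estimating growth in bytes, and the use of the mount point’s total size as “capacity” are all assumptions, not part of any particular tool.

```python
import math
import shutil


def required_capacity(projected_peak_bytes: int, margin: float = 0.25) -> int:
    """Capacity needed to hold the projected peak plus a safety margin.

    The 0.25 default mirrors the ~25% guideline; tune it to your own risk.
    """
    return math.ceil(projected_peak_bytes * (1 + margin))


def has_headroom(capacity_bytes: int, projected_peak_bytes: int,
                 margin: float = 0.25) -> bool:
    """True if capacity covers the projected peak plus the margin."""
    return capacity_bytes >= required_capacity(projected_peak_bytes, margin)


def check_mount(path: str, projected_growth_bytes: int,
                margin: float = 0.25) -> bool:
    """Check a mount point against expected growth over the season.

    projected_growth_bytes is a hypothetical estimate you supply yourself,
    for example from last year's growth over the same period.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    projected_peak = usage.used + projected_growth_bytes
    return has_headroom(usage.total, projected_peak, margin)
```

For example, `has_headroom(1000, 800)` holds because 800 plus 25% is exactly 1000, while a capacity of 999 falls short. Running `check_mount` against the data directory and the log directory separately fits the advice to keep logs off the main drive.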

Don’t Let Your Database Come in “Contact With Water”

We don’t want to allow simple issues to multiply. Actions we take to get out of a bind in the near term can cause problems that demand more attention in the future, just like a Gremlin that multiplies when you get it wet!

What are some of these scenarios?

  1. Not having a documented plan of action can cause confusion and chaos if something doesn’t go quite right. Having a plan documented and distributed will keep things from getting overly complicated when issues occur.
  2. Throwing hardware at a problem. Unless you know how it will actually fix the issue, it can be like throwing gasoline on a fire, leaving your stack in disarray with blocked and unblocked queries. New hardware also needs database tuning to be effective.
  3. Misunderstanding how users behave when the database slows down:
    1. Do users click to retry five times in five seconds, causing even more load?
    2. Is there a way to steer users toward retrying later?
    3. Can your application(s) ignore retries within a certain time frame?
  4. Not keeping to just a few sources of truth, each with as much availability as possible:
    1. Have at least one failover candidate
    2. Have off-server transaction storage (can you rebuild in a disaster?)
    3. If you have the two above, then delayed replicas are your friend!
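The question of whether an application can ignore retries within a certain time frame can be sketched in a few lines. This is a minimal, in-memory illustration and not a production design: the `RetryGuard` name, the five-second window, and the key format are all hypothetical, and a real deployment would likely need a shared store with expiry rather than a per-process dictionary that grows without bound.

```python
import time
from typing import Callable, Dict, Hashable


class RetryGuard:
    """Drop repeat requests for the same key inside a short time window."""

    def __init__(self, window_seconds: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the logic can be tested
        self._last_accepted: Dict[Hashable, float] = {}

    def should_process(self, key: Hashable) -> bool:
        """Return True if the request should reach the database."""
        now = self._clock()
        last = self._last_accepted.get(key)
        if last is not None and now - last < self.window_seconds:
            # Same user, same action, too soon: ignore the retry.
            return False
        self._last_accepted[key] = now
        return True
```

A caller might build the key from a user ID and an action, for example `guard.should_process(("user-42", "checkout"))`, and show a “please wait” message instead of issuing the query when it returns False. Note that the window is measured from the last accepted request, so ignored retries do not extend it.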

Never “Feed” Your Database After “Midnight”

What’s the one thing that can ensure that all heck breaks loose on Black Friday? CHANGE is the food here, and typically, BLACK FRIDAY is the midnight.

Have you ever felt like there is just one thing that you missed and want to get off your backlog? It could be a schema change, a data type change, or an application change from an adjacent team. The ‘no feeding’ rule parallels a CODE FREEZE in production.

Most companies start this freeze at the beginning of November, on the reasoning that the most stable production environment is the one that is already out there, not one that you have to make stable after a new release. A few things help a freeze hold:

  1. Change Management is your friend; change that needs to happen should still have a way to happen.
  2. Observability is also your friend; know in absolute terms what is happening to your database and stack so you don’t throw a wrench in it (Percona Monitoring and Management can help).
  3. Educate business stakeholders on the release or change process BEFORE the event, not DURING the event.
  4. Don’t be afraid to “turn it off” when absolute chaos is happening. A short outage is better than a site that stays unusable for a longer period of time.
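A code freeze that still leaves a path for necessary change can be expressed as a small gate in a deployment script. The sketch below is an assumption-laden illustration: the function name, the `approved_exception` flag standing in for a change-management sign-off, and the example dates are all hypothetical.

```python
from datetime import date


def change_allowed(today: date, freeze_start: date, freeze_end: date,
                   approved_exception: bool = False) -> bool:
    """Decide whether a production change may go out today.

    Changes are blocked inside the freeze window (inclusive on both ends)
    unless change management has explicitly approved an exception.
    """
    in_freeze = freeze_start <= today <= freeze_end
    return (not in_freeze) or approved_exception


# Hypothetical window: the start follows the "beginning of November"
# convention; the end date is an assumption made for this example only.
FREEZE_START = date(2020, 11, 1)
FREEZE_END = date(2020, 12, 1)
```

With these dates, `change_allowed(date(2020, 11, 27), FREEZE_START, FREEZE_END)` is False, and the same call with `approved_exception=True` is True, which keeps the freeze strict by default while leaving the approved path open.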

Conclusion

Black Friday, Cyber Monday, and the holidays can be the most wonderful time of the year. Now that we’ve covered the rules, your “Gremlins” can stay small and fuzzy, and your business won’t get wrecked by pesky database issues or outages.

How Percona Can Help

Percona experts optimize your database performance with open source database support, highly-rated training, managed services, and professional services.

Contact Us to Tame Your Database Gremlins!


Comment (1)

  • Rick Vasquez Reply

    Feel free to ask any questions, I will do my best to reply to all comments!

    November 17, 2020 at 11:32 am
