This blog post discusses how you can protect your e-commerce database from a high traffic disaster.
Databases power today’s e-commerce. Whether it’s listing items on your site, contacting your distributor for inventory, tracking shipments, payments, or customer data, your database must be up, running, tuned and available for your business to be successful.
There is no time that this is more important than high-volume traffic days. There are specific events that occur throughout the year (such as Black Friday, Cyber Monday, or Singles Day) that you know are going to put extra strain on your database environment. But these are the specific times that your database can’t go down – these are the days that can make or break your year!
So what can you do to guarantee that your database environment is up to the challenge of handling high traffic events? Are there ways of preparing for this type of traffic?
Yes, there are! In this blog post, we’ll look at some of the factors that can help prepare your database environment to handle large amounts of traffic.
Before moving to strategies, we need to discuss the difference between synchronous and asynchronous applications.
In most web-based applications, user input starts a number of requests for resources. Once the server answers the requests, no communication stops until the next input. This type of communication between a client and server is called synchronous communication.
Restricted application updates limit synchronous communication. Even synchronous applications designed to automatically refresh application server information at regular intervals have consistent periods of delay between data refreshes. While usually such delays aren’t an issue, some applications (for example, stock-trading applications) rely on continuously updated information to provide their users optimum functionality and usability.
Web 2.0-based applications address this issue by using asynchronous communication. Asynchronous applications deliver continuously updated data to users. Asynchronous applications separate client requests from application updates, so multiple asynchronous communications between the client and server can occur simultaneously or in parallel.
The strategy you use to scale the two types of applications to meet growing user and traffic demands will differ.
When it comes to synchronous applications, you really have only one option for scaling performance: sharding. With sharding, the tables are divided and distributed across multiple servers, which reduces the total number of rows in each table. This consequently reduces index size, and generally improves search performance.
A shard can also be located on its own hardware, with different shards added to different machines. This database distribution over a large multiple of machines spreads the load out, also improving performance. Sharding allows you to scale read and write performance when latency is important.
Generally speaking, it is better to avoid synchronous applications when possible – they limit your scalability options.
When it comes to scaling asynchronous applications, we have many more options than with synchronous applications. You should try and use asynchronous applications whenever possible:
As you scale up in terms of database workload, you need to be able to avoid bad queries or patterns from your applications.
Scaling horizontally means adding more nodes to a system, such as adding a new server to a database environment to a distributed software application. For example, scaling out from one Web server to three.
Another option to handle the increased traffic load is adding more hardware to your environment: more servers, more CPUs, more memory, etc. This, of course, can be expensive. One option here is to pre-configure your testing environment to become part of the production environment if necessary. Another is to pre-configure more Database-as-a-Service (DaaS) instances for the event (if you are a using cloud-based services).
Whichever method, be sure you verify and test your extra servers and environment before your drop-dead date.
As always, in any situation where your environment is going to be stressed beyond usual limits, testing under real-world conditions is a key factor. This includes not only testing for raw traffic levels, but also the actual workloads that your database will experience, with the same volume and variety of requests.
Finally, it’s important that you understand what applications will be used and querying the database. This sort of common sense idea is often overlooked, especially when teams (such as the development team and the database/operations team) get siloed and don’t communicate.
Get to know who is developing the applications that are using the database, and how they are doing it. As an example, a while back I had the opportunity to speak with a team of developers, mostly to just understand what they were doing. In the process of whiteboarding the app with them, we discovered a simple query issue that – now that we were aware of it – took little effort to fix. These sorts of interactions, early in the process, can save a great deal of headache down the line.
There are many strategies that can help you prepare for high traffic events that will impact your database. I’ve covered a few here briefly. For an even more thorough look at e-commerce database strategies, attend my webinar “Black Friday and Cyber Monday: How to Avoid an E-Commerce Disaster” on Thursday, September 22, 2016 10:00 am Pacific Time.
Resources
RELATED POSTS