Ensuring that databases are highly available is not just a thing these days; it’s the thing

Downtime, whether scheduled or unplanned, is barely, if at all, tolerated by end users. The consequences of downtime can be severe and may include things like loss of customers, damage to your reputation, or penalties for not meeting Service Level Agreements (SLAs). Making your database environment highly available, then, is a top priority that you need to get right. The good news is, you can build a great high availability (HA) solution using open source databases. We’ll touch on that shortly, but let’s start with some basics.

What is high availability and how do you get it?

High availability refers to the continuous operation of a system so that services to end users are largely uninterrupted. A basic high availability database system provides failover (preferably automatic) from a primary database node to redundant nodes within a cluster. 

HA is sometimes confused with “fault tolerance.” Although the two are related, the key difference is that an HA system aims for quick recovery of all system components to minimize downtime; some disruption may occur, but it will be minimal. Fault tolerance, by contrast, aims for zero downtime and zero data loss. As such, fault tolerance is much more expensive to implement because it requires dedicated infrastructure that completely mirrors the primary system, and it demands significant resources to maintain.

Achieving a database HA solution rests on putting three key principles into practice:

  • Single point of failure (SPOF) – Eliminating any single point of failure in the database environment, including the physical or virtual hardware the database system relies on, whose failure would bring the whole system down.
  • Redundancy – Ensuring sufficient redundancy of all components within the database environment and reliable crossover to these components in the event of failure.
  • Failure detection – Monitoring the entire database environment for failures.
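
The failure-detection principle above can be sketched in a few lines. This is a minimal, illustrative polling loop, not any specific tool’s API: the `check_primary` and `on_failure` callbacks are hypothetical placeholders, and production HA managers such as Patroni or Orchestrator implement far more robust versions of this pattern (distributed consensus, fencing, and so on).

```python
import time

def monitor(check_primary, on_failure, interval_s=5, max_failures=3):
    """Poll the primary; trigger failover after consecutive failed checks.

    check_primary -- callable returning True if the primary is healthy
                     (e.g., a TCP connect or a `SELECT 1` round trip)
    on_failure    -- callable that promotes a replica and updates routing
    """
    failures = 0
    while True:
        if check_primary():
            failures = 0          # reset on any successful check
        else:
            failures += 1
            if failures >= max_failures:
                on_failure()      # hand off to the failover procedure
                return
        time.sleep(interval_s)
```

Requiring several consecutive failures before acting is what keeps a brief network blip from triggering an unnecessary (and disruptive) failover.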

How to measure high availability

HA does not guarantee 100% uptime, but it allows you to get pretty close. Within IT, the gold standard for high availability is 99.999%, or “five-nines” of availability, but the level of HA needed really depends on how much downtime you can bear. Streaming services, for example, run mission-critical systems in which excessive downtime could result in significant financial and reputational losses for the business. But many organizations can tolerate a few minutes of downtime without negatively impacting their end users.

The following table shows the amount of downtime for each level of availability from two to five nines. 

| Availability % | Downtime per year | Downtime per month | Downtime per week | Downtime per day |
|---|---|---|---|---|
| 99% (“two nines”) | 3.65 days | 7.31 hours | 1.68 hours | 14.40 minutes |
| 99.5% (“two nines five”) | 1.83 days | 3.65 hours | 50.40 minutes | 7.20 minutes |
| 99.9% (“three nines”) | 8.77 hours | 43.83 minutes | 10.08 minutes | 1.44 minutes |
| 99.95% (“three nines five”) | 4.38 hours | 21.92 minutes | 5.04 minutes | 43.20 seconds |
| 99.99% (“four nines”) | 52.60 minutes | 4.38 minutes | 1.01 minutes | 8.64 seconds |
| 99.995% (“four nines five”) | 26.30 minutes | 2.19 minutes | 30.24 seconds | 4.32 seconds |
| 99.999% (“five nines”) | 5.26 minutes | 26.30 seconds | 6.05 seconds | 864.00 milliseconds |
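
The figures above follow directly from the availability percentage. The short sketch below shows the arithmetic, assuming a 365.25-day year (8,766 hours) to match the table:

```python
# Allowed downtime per period for a given availability percentage.
# Period lengths assume a 365.25-day year, matching the table above.
PERIODS_HOURS = {
    "year": 8766.0,   # 365.25 days * 24 hours
    "month": 730.5,   # one twelfth of a year
    "week": 168.0,
    "day": 24.0,
}

def downtime_minutes(availability_pct: float, period: str) -> float:
    """Return the allowed downtime, in minutes, for the given period."""
    unavailability = 1.0 - availability_pct / 100.0
    return PERIODS_HOURS[period] * 60.0 * unavailability

# "Five nines" leaves roughly 5.26 minutes of downtime per year:
print(round(downtime_minutes(99.999, "year"), 2))  # 5.26
```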

How to make your database highly available

Not all companies are the same and neither are their requirements for HA. When planning your database HA architecture, the size of your company is a great place to start to assess your needs. For example, if you’re a small business, paying for a disaster recovery site outside your local data center is probably unnecessary and may cause you to spend more money than the data loss is worth. All companies regardless of size should consider how to strike a balance between availability goals and cost.                                                       

  • Startups and small businesses. Most startups and small businesses can achieve an effective HA infrastructure within a single data center on a local node. This base architecture keeps the database available for your applications in case the primary node goes down, whether that involves automatic failover in case of a disaster or planned switchover during a maintenance window.
  • Medium to large businesses. If you have a bit more budget, consider adding a disaster recovery site outside your local data center. This architecture spans data centers to add more layers of availability to the database cluster. It keeps your infrastructure available and your data safe and consistent even if a problem occurs in the primary data center. In addition to the disaster recovery site, this design includes an external layer of nodes so if communication between the sites is lost, the external node layer acts as a “source of truth” and decides which replica to promote as a primary. In doing so, it keeps the cluster healthy by preventing a split-brain scenario and keeps the infrastructure highly available.
  • Enterprises. An enterprise HA architecture adds another layer of insurance for companies with additional resources; for whom downtime would mean devastating revenue and reputational losses; and who offer globally distributed services. It features two disaster recovery sites, adding more layers for the infrastructure to stay highly available and keep applications up and running. This architecture, which is based on tightly coupled database clusters spread across data centers and geographic availability zones, can offer 99.999% uptime when used with synchronous streaming replication, the same hardware configuration in all nodes, and fast internode connections.
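
The split-brain protection described above boils down to a majority vote. Here is a toy sketch of the idea (the names are illustrative, not a real tool’s API): a replica is promoted only when a strict majority of voting members, including the external witness layer, agrees the primary is unreachable.

```python
def can_promote(votes_unreachable: int, total_voters: int) -> bool:
    """Allow promotion only with a strict majority of 'primary is down' votes."""
    return votes_unreachable > total_voters // 2

# With 2 database nodes plus 1 external witness (3 voters), a replica that
# has merely lost contact with the primary still needs the witness's vote:
print(can_promote(1, 3))  # False: could just be a network partition
print(can_promote(2, 3))  # True: replica and witness agree, safe to promote
```

This is also why an even number of voters is risky: in a two-node cluster with no witness, a network partition leaves each side with one vote, no majority exists, and without an external tie-breaker both nodes could end up accepting writes.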

Which database is best for high availability?

A common question about database high availability is which database is best. If you’re planning to use proprietary databases, be aware that you’re opening yourself up to vendor lock-in due to burdensome contracts and the high costs associated with loss of data portability (e.g., exorbitant cloud egress fees).

HA is an important goal, but switching to a proprietary database solely for HA limits the other benefits of open source. In addition to enabling a strong database HA architecture, open source avoids costly licensing fees, offers data portability, gives you the freedom to deploy anywhere anytime you want, and delivers great software designed by a community of contributors who prioritize innovation and quality. 

Getting started with open source database high availability

Open source databases like Postgres, MariaDB, MySQL, and Redis are great options for HA but generally don’t include a built-in HA solution. That means you’ll need to carefully review the various extensions and tools available. These extensions and tools are excellent for enriching their respective databases but, as your environment scales, may not be able to keep up with evolving, more complex requirements.

Greater complexity means that you and your team will need certain skills and knowledge, for example, how to write application connectors, integrate multiple systems, and align business requirements with your HA solution. You should also understand open source databases, applications, and infrastructure. Consider the possibility that outside expertise may be necessary to help you manage your HA architecture. 

Learn more about database high availability

Determining which high availability solution is right for your database environment depends greatly on your goals. Find out how Percona high availability database support guarantees application uptime. 

If PostgreSQL is your database of choice for HA, learn how to build highly available PostgreSQL using only battle-tested open source components in our eBook, Achieving High Availability on PostgreSQL With Open Source Tools.
