Estimates vary, but most reports put the average cost of unplanned database downtime at approximately $300,000 to $500,000 per hour, or $5,000 to $8,000 per minute. With so much at stake, database high availability and fault tolerance have become must-have items, but many companies just aren’t certain which one they must have.
This blog article will examine shared attributes of high availability (HA) and fault tolerance (FT). We’ll also look at the differences, as it’s important to know what architecture(s) will help you best meet your unique requirements for maximizing data assets and achieving continuous uptime.
We’ll wrap it up by suggesting high availability open source solutions, and we’ll introduce you to support options for ensuring continuous high performance from your systems.
High availability refers to the continuous operation of a database system with little to no interruption to end users in the event of system failures, power outages, or other disruptions. A basic high availability database system provides failover (preferably automatic) from a primary database node to redundant nodes within a cluster.
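To make the failover idea concrete, here is a minimal sketch of promoting a redundant node when the primary fails. It is a toy model, not how any particular database or cluster manager implements failover: the `Node` class, the `fail_over` function, and the node names are all hypothetical, and real systems also have to handle failure detection, fencing, and replication lag.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical cluster member used only for this illustration."""
    name: str
    healthy: bool = True
    role: str = "replica"  # either "primary" or "replica"


def fail_over(cluster: list[Node]) -> Node:
    """Return the active primary, promoting a healthy replica if needed."""
    primary = next((n for n in cluster if n.role == "primary"), None)
    if primary is not None and primary.healthy:
        return primary  # nothing to do, the primary is still serving

    # Demote the failed primary so that only one node holds the role.
    if primary is not None:
        primary.role = "replica"

    # Promote the first healthy node (assumption: any healthy replica will do).
    candidate = next((n for n in cluster if n.healthy), None)
    if candidate is None:
        raise RuntimeError("no healthy node available for promotion")
    candidate.role = "primary"
    return candidate


cluster = [Node("db1", role="primary"), Node("db2"), Node("db3")]
cluster[0].healthy = False          # simulate a primary failure
new_primary = fail_over(cluster)
print(f"active primary: {new_primary.name}")
```

In an automatic setup, a check like this would run continuously so that promotion happens without an operator stepping in.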
High availability does not guarantee 100% uptime, but an HA system enables you to minimize downtime to the point you’re almost there. Within IT, the gold standard for high availability is 99.999%, or “five-nines” of availability, but the level of HA needed really depends on how much system downtime you can bear. Streaming services, for example, run mission-critical systems in which excessive downtime could result in significant financial and reputational losses for the business. But many organizations can tolerate a few minutes of downtime without negatively affecting their end users.
The following table shows the amount of downtime for each level of availability.
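The downtime allowed at each availability level follows from simple arithmetic, so it can be reproduced with a short script. This is a sketch that assumes a 365-day year; the function names and the specific levels listed are our own choices for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability level."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def format_duration(minutes: float) -> str:
    """Render a duration in the largest convenient unit."""
    if minutes >= 24 * 60:
        return f"{minutes / (24 * 60):.2f} days"
    if minutes >= 60:
        return f"{minutes / 60:.2f} hours"
    return f"{minutes:.2f} minutes"


# Two nines through five nines.
for level in (99.0, 99.9, 99.99, 99.999):
    budget = downtime_minutes_per_year(level)
    print(f"{level}% available -> {format_duration(budget)} of downtime per year")
```

Each additional nine cuts the downtime budget by a factor of ten, which is why five nines works out to only about 5.26 minutes a year.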

High availability works through a combination of key elements. Some of the most important elements include:
Two previously mentioned absolutes — no SPOF and foolproof failover — must apply across the following areas if an HA architecture is to be achieved:
Fault tolerance refers to the ability of a database system to continue functioning in full, with no downtime, amid hardware or software failures. When a failure event occurs — such as a server failure, power outage, or network disruption — a fault-tolerant system will ensure that data integrity is preserved and that the system remains operational.
Here are some of the key components and characteristics of a fault-tolerant database environment:
Fault-tolerant information systems are designed to offer 100% availability. Some of the key elements of such designs include:
High availability and fault tolerance obviously have a lot in common in terms of setup, functionality, and purpose. So with all the similarities (replication, load balancing, redundant components, and more), what are the differences?
In general terms, the purpose of a high availability solution is to minimize downtime and provide continuous access to the database, while the purpose of fault tolerance is to maintain system functionality and data integrity at all times, including during failure events or faults.
Still speaking generally, but now in financial terms, achieving fault tolerance is more costly, and the payoff often does not justify the expense. For example, it takes a lot of time, money, and expertise to completely mirror a system so that if one fails, the other takes over without any downtime. It can be considerably cheaper to establish a high availability database system in which there’s not total redundancy, but there is load balancing that results in minimal downtime (mere minutes a year).
Basically, it comes down to a company’s needs, what’s at stake, and what fits its budget. Here are key questions to consider when deciding between high availability and fault tolerance:
How much downtime can your company endure?
With high availability, you can achieve the gold standard previously mentioned — and your database system will be available 99.999% of the time.
With a fault-tolerant system, you can spend a lot more and perhaps do better than the 5.26 minutes of downtime (in an entire year) that come with the “five nines” described immediately above. But is it mission-critical to do better than 99.999% availability?
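One way to weigh that question is to put a rough price on the remaining downtime. The sketch below multiplies the yearly downtime budget by the $5,000 to $8,000 per-minute estimate quoted at the top of this article. It assumes a 365-day year, that the full downtime budget is actually used, and that cost scales linearly with minutes, so treat the results as ballpark figures only.

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # assumption: a 365-day year
COST_PER_MINUTE_LOW = 5_000        # low estimate quoted in this article (USD)
COST_PER_MINUTE_HIGH = 8_000       # high estimate quoted in this article (USD)


def annual_downtime_cost(availability_percent: float, cost_per_minute: float) -> float:
    """Estimate yearly downtime cost if the whole downtime budget is consumed."""
    downtime_minutes = MINUTES_PER_YEAR * (1 - availability_percent / 100)
    return downtime_minutes * cost_per_minute


for level in (99.9, 99.99, 99.999):
    low = annual_downtime_cost(level, COST_PER_MINUTE_LOW)
    high = annual_downtime_cost(level, COST_PER_MINUTE_HIGH)
    print(f"{level}% available: ${low:,.0f} to ${high:,.0f} per year")
```

If the extra cost of a fault-tolerant architecture is far larger than the five-nines figure, high availability is probably the better fit.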
How much complexity and how many redundant components are you willing to take on?
High availability systems typically have redundant servers or clusters to ensure that if one component fails, another can take over seamlessly.
Fault tolerance incorporates multiple versions of hardware and software, and it also includes power supply backups. The hardware and software can detect failures and instantly switch to redundant components, but they also constitute more complexity, parts, and cost.
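Both approaches lean on redundancy, and a back-of-the-envelope calculation shows why it pays off. The sketch below uses the standard formula for components running in parallel, 1 - (1 - a)^n. It assumes that nodes fail independently and that failover is instant and flawless, neither of which holds exactly in practice, so the output is an upper bound rather than a prediction. The function name and the 99% per-node figure are ours, chosen for illustration.

```python
def combined_availability(node_availability: float, redundant_nodes: int) -> float:
    """Availability of a group in which any single healthy node can serve.

    Assumes independent failures and perfect, instantaneous failover.
    """
    if redundant_nodes < 1:
        raise ValueError("at least one node is required")
    # The group is down only when every node is down at the same time.
    return 1 - (1 - node_availability) ** redundant_nodes


for count in (1, 2, 3):
    total = combined_availability(0.99, count)
    print(f"{count} node(s) at 99% each -> {total * 100:.4f}% combined")
```

Every extra node adds nines, but it also adds the parts, complexity, and cost described above.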
Which option fits your budget?
A high availability database system requires redundancy, load balancing, and failover. It also must ensure that there is no single point of failure. But depending on your needs, there are varying levels of high availability, and an architecture can be fairly simple.
Fault-tolerant systems are designed with more complex architectures, requiring sophisticated hardware and software components, along with specialized expertise to design, configure, and maintain. The additional complexity adds to the cost of the system.
At Percona, we advise that, in most cases, the attributes and cost-effectiveness of high availability are the way to go. It’s plenty available and plenty scalable. We’re also big advocates of using open source software that’s free of vendor lock-in and is backed by the innovation and expertise of the global open source community to achieve high availability.
At the same time, we’re aware that achieving high availability using open source software takes extra effort. HA doesn’t come with the complexity and price tag of a fault-tolerant system, but it still requires considerable time and expertise.
So instead of you having to select, configure, and test architecture for building an HA database environment, why not use ours? You can use Percona architectures on your own, call on us as needed, or have us do it all for you.
Check out these helpful whitepapers, which include ready-to-use architectures:
Percona Distribution for MySQL: High Availability With Group Replication
Percona Distribution for PostgreSQL: High Availability With Streaming Replication
Percona designs, configures, and deploys high availability databases so that applications always have access to essential data. We’ll help you ensure that applications stay up, databases stay bug-free, and your entire system runs at optimal performance levels.
Percona High Availability Database Support
How does disaster recovery (DR) differ from HA and FT?
Disaster recovery focuses on recovering from major disruptive events and restoring operations. Fault tolerance and high availability focus on building systems that can withstand failures and continue operating, either with minimal interruption (HA) or without interruption (FT).
What are the goals of high availability and fault tolerance?
The goal of employing fault tolerance is to maintain system functionality and data integrity at all times, including during failures or faults. The goal of employing a high availability solution is to minimize downtime and provide continuous access to the database.
What is the difference between HA and DR?
High availability is used to prevent downtime due to individual component failures within a live environment, while disaster recovery focuses on recovering from major events that make the database inaccessible or non-functional.