It used to be easy: a company developed a new application, chose a database solution, launched the new application and then tuned the chosen database solution. A team of DBAs looked at the infrastructure as well as the workload and made changes (or suggestions) as needed. The application then stayed in production for years and small tweaks were made as needed.
Those days are long gone.
As technology has evolved, so has the workflow and deployment strategy within the large enterprise. Large, monolithic applications are being split into several microservices, generally decoupled but still working together and somewhat interdependent. Waterfall deployment strategies are being replaced with agile methodology and continuous code deployment. Tuning and maintaining large installations of physical hardware has become less of a focus with the advent of virtualization, containerization, and orchestrated deployments.
Despite all of these changes and radical shifts in the market, one question for executives and management has remained constant: what approach should I use to maximize my return and give me the most productive environment for the lowest cost? As any good consultant will tell you, “it depends”. Even with all the advances in technology, frameworks, and deployment strategies, there is still no silver bullet that achieves everything you need within your organization (while also preparing your meals and walking your dog).
Choosing an Enterprise Database Solution
In this post, we’ll discuss some of the paths you can take as a guide on your journey of choosing an enterprise database solution. It’s not meant to provide technical advice or suggest a “best option.”
Before going into some of the options, let’s put a few assumptions out there:
- Your organization wants to use the right database solution for the job (or a few limited solutions)
- You DO NOT want to rack new physical servers every time you need a new server or expect growth
- Your application teams far outnumber your operations and database team (in terms of number of teams and overall members)
- The question of “what does your application do” is more accurately replaced with several variations of “what does this particular application do”
Now that we have that out of the way, let’s start with buzzword number one: the cloud. While the term is used all the time, it has a few different meanings. Originally (and most commonly), “the cloud” refers to the “public” cloud: entities like Amazon Web Services (AWS), Microsoft Azure, and Google Cloud. When it first emerged, the most common barrier to organizations moving to the cloud was security. As large enterprises store more and more personally identifiable information (PII), the inherent fear of a breach in the public cloud led many companies to shy away. Although this is much less of a concern given all the advances in security, there are some instances where an organization might still believe that storing data in a “public” datacenter is a hard no. If this is your organization, feel free to skip ahead to the on-premise discussion below.
Assuming that you can engineer proper security in the public cloud of your choosing, some of the main benefits of outsourcing your infrastructure quickly bubble to the top:
Elasticity
In many circumstances, you need increased capacity now, but only for a limited time. Does this scenario sound familiar? The beauty of the public cloud is that you generally only pay for what you are using. Looking at things from a long-term cost perspective, if you only need two times your capacity for two weeks out of the year, why should you pay for half of your infrastructure to sit idle for the other fifty weeks annually?
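To put a rough number on that scenario, here is a minimal sketch of the arithmetic. The function name and the capacity units are hypothetical, and it assumes the baseline capacity is fully used outside the peak weeks, which is rarely true in practice.

```python
WEEKS_PER_YEAR = 52


def idle_fraction(baseline_units: float, peak_units: float, peak_weeks: int) -> float:
    """Fraction of provisioned capacity-weeks left unused when you
    provision for the peak all year long (hypothetical helper)."""
    provisioned = peak_units * WEEKS_PER_YEAR
    # Peak capacity is used during peak weeks, baseline capacity otherwise.
    used = peak_units * peak_weeks + baseline_units * (WEEKS_PER_YEAR - peak_weeks)
    return 1 - used / provisioned


# Scenario from the text: twice the normal capacity for two weeks a year.
print(f"{idle_fraction(baseline_units=10, peak_units=20, peak_weeks=2):.1%}")  # 48.1%
```

Under these assumptions, a little under half of what you bought sits idle over the year, which is the cost that pay-as-you-go pricing is meant to remove.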
Since you don’t have to actually maintain any physical gear in the public cloud, you have the ability to add/remove capacity as needed. There is no need to plan for or provision for additional hardware — and everything that comes with that (e.g., maintaining the cooling systems for double the number of data center servers, increased power costs, expanded physical space, etc.).
Flexibility / Agility
Most public clouds offer more than simply instant access to additional compute instances. There are managed services for several common use cases: relational databases, NoSQL databases, big data stores, message queues, and the list goes on. This flexibility is evident in using various managed services as glue to hold other managed services together.
In traditional environments, you may identify the need for a technology (think message queue), but opt against it due to the complexity of needing to actually manage it and use a less efficient alternative (a relational database for example). With these components readily available in most public clouds, your organization has the flexibility to use the correct technology for each use case without the burden of maintaining it.
Along with the flexibility of plugging in the appropriate technology, you greatly increase the speed at which this can be done. There is much less need from an infrastructure standpoint to plan for supporting a new technology. With the click of a button, the new technology is ready to go in your stack. In an agile work environment, having an agile platform to accompany the methodology is very important.
Cost
While the above benefits are all really great, the bottom line is always (the most) important. Depending on how you calculate the overall cost of your infrastructure (hardware only, or also operations staff, building costs, and so on), you may see cost savings. One of the big challenges with running physical gear is the initial cost. If I want to run a rack of 20 servers, I have to buy 20 servers, rack them up and turn them on. My ongoing operational cost is likely going to be less than in the cloud (remember, in the cloud you are paying as you use it), but I also need to spread the initial cost over time.
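One simple way to frame that trade-off is a break-even calculation: how many months it takes for the lower monthly cost of owned hardware to pay back the upfront purchase. The sketch below is only an illustration. The function name and every dollar figure are made up, and it ignores staff, facilities, hardware refresh cycles, and discounts.

```python
import math
from typing import Optional


def break_even_months(upfront_cost: float,
                      onprem_monthly: float,
                      cloud_monthly: float) -> Optional[int]:
    """First month in which cumulative on-premise spend no longer exceeds
    cumulative cloud spend; None if the cloud is never the pricier option
    per month (hypothetical helper)."""
    monthly_saving = cloud_monthly - onprem_monthly
    if monthly_saving <= 0:
        return None
    return math.ceil(upfront_cost / monthly_saving)


# Hypothetical figures: 20 servers at $5,000 each, $2,000/month to run them,
# versus $6,000/month for comparable cloud capacity.
print(break_even_months(upfront_cost=20 * 5_000,
                        onprem_monthly=2_000,
                        cloud_monthly=6_000))  # 25
```

With these invented numbers the hardware pays for itself in just over two years. Change any input and the answer moves considerably, which is why "it depends" keeps coming up.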
While an overall cost analysis is well outside the scope of this post, you can see how determining cost savings using the public cloud vs. an on-premise solution can be challenging. With all else being equal, you will generally have a more predictable monthly cost when using the public cloud and often can get volume (or reserved) discounts. For example, AWS provides a “TCO Calculator” to estimate how much you could save by switching to the public cloud: https://aws.amazon.com/tco-calculator/.
So the powers that be at your company have drawn a line in the sand and said “no” to using the public cloud. Does that mean that each time an application team needs a database, your operations team is racking a server and setting it up? It very well could, but let’s explore a few of the options available to your infrastructure team.
While this option can seem outdated, there are several benefits to provisioning bare metal machines in your data center:
- Complete control over the machine
  - OS tuning
  - Hardware choices
  - Physical control
- Easy to make different “classes” of machine
  - Spinning disks for DR slaves
  - SSD for slaves
  - Flash storage for masters
- Easier troubleshooting
  - Less of a need to determine which “layer” is having problems
- Less overhead for virtualization/containerization
  - No “extra servers” needed for managing the infrastructure
In a relatively static environment, this is still a great choice as you have full access and minimal layers to deal with. If you see disk errors, you don’t have to decide which “layer” is actually having problems – it is likely the disk. While this is nice, it can be cumbersome and a burden on your operations staff when there are always new databases being added (for microservices or scaling).
In this model, each server is assumed to be a static resource. Generally, you wouldn’t provision a bare metal machine with an OS and database and then wipe it and start over repeatedly. Rather, this model of deployment is best suited to an established application running a predictable workload, where scaling is slow and gradual.
A major downside to this approach is resource utilization. Ideally, you would not pay for hardware and then use only half of it. With bare metal machines, however, you generally cannot run everything at maximum capacity all the time, because you need headroom to handle spikes in traffic. This means you either pay for all of your potential resources and watch most of them sit idle much of the time, or risk outages by continuously running at the limits.
Right up there with “the cloud”, another buzzword these days is “containers”. At a high level, containers and virtualization are similar in that they both allow you to use part of a larger physical server to emulate a smaller server. This gives operations teams the ability to create “images” that can be used to quickly provision “servers” on larger bare metal machines.
While this does add a new layer to your stack, and can potentially introduce some additional complexity in tuning and/or troubleshooting, two major problems with bare metal provisioning are addressed:
- Flexibility
- Resource utilization
In terms of flexibility, operations teams are able to have a collection of standard images for various systems, such as application servers or database servers, and quickly spin them up on readily waiting hardware. This makes it much easier when an application team says “we need a new database for this service and will need four application servers with it.” Rather than racking up and setting up five physical machines and installing the OS along with various packages, the operations team simply starts five virtual machines (or containers for those of you “containerites” out there) and hands them off.
This also helps with resource utilization. Rather than setting one application server up on a physical machine and keeping it under 50% utilization all the time, you are able to launch multiple VMs on this machine, each just using a portion. When the physical machine reaches maximum capacity, you can move an image to a new physical machine. This process gets rinsed and repeated as traffic patterns change and resource demands shift. It decreases some of the pain that comes from watching bare metal machines sit idle.
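As a rough illustration of that packing idea, the sketch below places VMs onto hosts using a simple first-fit rule while leaving some headroom on each host. This is an assumption made for illustration, not how any particular hypervisor or orchestrator schedules workloads, and the function name, core counts, and headroom threshold are all hypothetical.

```python
def place_vms(vm_demands, host_capacity, max_utilization=0.8):
    """First-fit placement: put each VM on the first host with room,
    opening a new host only when none can take it (hypothetical helper).

    Returns a list of hosts, each a list of the VM demands placed on it."""
    usable = host_capacity * max_utilization  # keep headroom for spikes
    hosts = []
    for demand in vm_demands:
        if demand > usable:
            raise ValueError(f"VM needing {demand} does not fit on any host")
        for host in hosts:
            if sum(host) + demand <= usable:
                host.append(demand)
                break
        else:
            # No existing host had room, so start a new one.
            hosts.append([demand])
    return hosts


# Six workloads (in CPU cores) on 16-core hosts capped at 75% utilization.
print(place_vms([4, 8, 2, 6, 4, 8], host_capacity=16, max_utilization=0.75))
# [[4, 8], [2, 6, 4], [8]]
```

In this made-up example, six workloads that would each have claimed a bare metal machine fit on three hosts, and the placement can be recomputed as demands shift.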
Now, let’s put it all together and talk about creating a private cloud. It’s the best of both worlds, right? All the flexibility and elasticity of the public cloud, but in your own data center where you can retain full control of everything. In this scenario, an organization is generally doing the following:
- Managing a data center of generic, physical machines
- Leveraging virtualization and/or containerization to quickly launch/destroy server images
- Using an orchestration layer to manage all of the VMs/containers
This is a great fit for organizations that already have made an investment in a large physical infrastructure. You likely already have hundreds of servers at your disposal, so why not get the most utilization you can out of them and make your infrastructure much more dynamic?
While this sounds amazing (and quite often IS the best fit), here’s what to consider. When dealing with a large internal cloud, you will need people experienced in managing this sort of infrastructure. Even though application teams now just hit a button to launch a database and application server, the cloud is still backed by a traditional data center with bare metal servers. An operations team is still very much needed, even though its members may not be your traditional “DBA” or “ops guy”.
Also, the complexity of managing (and definitely troubleshooting) an environment such as this generally increases by an order of magnitude. Generic questions like “why is my application running slow?” used to be easier to answer: you check the application server and the database server, look at some metrics, and can generally pinpoint what is happening. In a large private cloud, now you’ll need to look at:
- Application/query layer
- Orchestration layer
- Virtualization / container layer
- Physical layer
That is not to say it isn’t worth it, but managing an internal cloud is not a trivial task and requires a great deal of thought.
How Can Percona Help?
Having been in the open source database space for years, Percona has seen and worked on just about every MySQL deployment possible. We also focus on picking the proper tool for the job and will meet your organization where you are. Running Postgres on bare metal servers? We can help. Serving your application off of EC2 instances backed by an RDS database? No problem. MongoDB on Kubernetes in your private cloud? Check.
We can also work with your organization to help you choose the best path to follow. We love open source databases and the flexibility that they can provide. Our team has experience designing and deploying architectures ranging from a single database cloud server to hundreds of bare metal machines spanning multiple data centers. With that sort of experience, we can help your organization with an enterprise database solution too!