Kubernetes, along with its most popular distribution OpenShift, receives a lot of interest as a container orchestration platform. However, databases remain a foreign entity there, primarily because they are stateful, while container orchestration systems prefer stateless applications. That said, support for StatefulSet applications and persistent storage has progressed to the point where running a production database instance in Kubernetes may already be practical. With this in mind, we’ve been looking at running Percona XtraDB Cluster in Kubernetes/OpenShift.
While there are already many examples on the Internet of how to start a single MySQL instance in Kubernetes, for serious usage we need to provide:
- High Availability: how can we guarantee availability when an instance (or Pod in Kubernetes terminology) crashes or becomes unresponsive?
- Persistent storage: we do not want to lose our data in case of instance failure
- Backup and recovery
- Traffic routing: in the case of multiple instances, how do we direct an application to the correct one?
Percona XtraDB Cluster in Kubernetes/OpenShift
Schematically it looks like this:
The picture highlights the components we are going to use:
- Percona XtraDB Cluster to provide High Availability. Although it is possible to use regular MySQL Replication, the need for automatic master failover makes it quite complicated. Shlomi describes some possible approaches here, and we may implement some of this in Percona Server for MySQL in the future.
- (Optional) ProxySQL with proxysql-admin tools
- (Optional) Percona Monitoring and Management (PMM) Server with clients installed on each node
- Support for backup volumes
Running this in Kubernetes assumes a high degree of automation and minimal manual intervention.
We provide our proof of concept in this project: https://github.com/Percona-Lab/percona-openshift. Please treat it as a source of ideas and as an alpha-quality project; it is in no way production ready.
In our implementation we rely on Helm, the package manager for Kubernetes. Unfortunately, OpenShift does not officially support Helm out of the box, but there is a guide from Red Hat on how to make it work.
In a clustering setup, it is quite typical to use service discovery software like ZooKeeper, etcd or Consul. This may become necessary for our Percona XtraDB Cluster deployment, but for now, to simplify deployment, we are going to use the DNS service discovery mechanism provided by Kubernetes. It should be enough for our needs.
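To illustrate the idea, here is a minimal Python sketch of DNS-based peer discovery. It assumes the cluster nodes sit behind a headless Kubernetes Service, for which cluster DNS returns one address record per Pod. The function names and the service name in the usage note are hypothetical and are not taken from our project; the `gcomm://` prefix is the standard format of Galera's `wsrep_cluster_address` option.

```python
import socket


def resolve_peers(service_name, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses behind a DNS name.

    The resolver is injectable so the function can be exercised
    without a running Kubernetes cluster.
    """
    records = resolver(service_name, None, proto=socket.IPPROTO_TCP)
    # Each record is (family, type, proto, canonname, sockaddr),
    # and sockaddr[0] holds the IP address.
    return sorted({record[4][0] for record in records})


def cluster_address(peers):
    """Build a wsrep_cluster_address value from a list of peer IPs."""
    # An empty peer list yields "gcomm://", which bootstraps a new cluster.
    return "gcomm://" + ",".join(peers)
```

Inside a Pod, calling `cluster_address(resolve_peers("pxc-nodes.default.svc.cluster.local"))` (a hypothetical service name) would produce the address list a joining node needs, with no external service discovery software involved.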
We also expect the Kubernetes deployment to provide Dynamic Storage Provisioning. The major cloud providers (like Google Cloud, Microsoft Azure or Amazon Cloud) should have it. However, it might not be easy to get Dynamic Storage Provisioning for on-premises deployments. You may need to set up GlusterFS or Ceph to provide it.
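With Dynamic Storage Provisioning in place, a Pod asks for storage through a PersistentVolumeClaim that names a storage class, and the cluster creates the volume on demand. The following Python sketch builds such a claim as a plain dictionary and prints it as JSON. The claim name, size and storage class are placeholders chosen for illustration, not values from our project.

```python
import json


def pvc_manifest(name, size, storage_class):
    """Build a PersistentVolumeClaim requesting dynamically provisioned storage."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteOnce: the volume is mounted read-write by a single node.
            "accessModes": ["ReadWriteOnce"],
            # The storage class selects which provisioner creates the volume.
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Hypothetical claim name, size and storage class.
    print(json.dumps(pvc_manifest("pxc-data-0", "100Gi", "standard"), indent=2))
```

Since `kubectl` accepts JSON as well as YAML, the output can be piped to `kubectl apply -f -`. In a StatefulSet, the same `spec` would typically go into `volumeClaimTemplates` so that each Pod gets its own claim.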
The challenge with a distributed file system is how many copies of the data you will end up having. Percona XtraDB Cluster by itself has three copies, and GlusterFS will also require at least two copies of the data, so in the end we will have six copies of the data. This can’t be good for write-intensive applications, and it’s also not good from a capacity standpoint.
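The arithmetic behind that count is simple multiplication, and it applies to raw capacity as well. Here is a small sketch, where the 100 GB dataset size is only an example figure:

```python
def physical_copies(cluster_nodes, storage_replicas):
    """Each cluster node holds a full copy, and the storage layer replicates each one."""
    return cluster_nodes * storage_replicas


def raw_capacity_gb(dataset_gb, cluster_nodes, storage_replicas):
    """Raw disk space consumed by one logical dataset across all copies."""
    return dataset_gb * physical_copies(cluster_nodes, storage_replicas)


# Three Percona XtraDB Cluster nodes on storage that keeps two replicas.
print(physical_copies(3, 2))       # 6
print(raw_capacity_gb(100, 3, 2))  # 600
```

Every write is multiplied in the same way, which is why the overhead matters most for write-intensive workloads.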
One possible approach is to have local data copies for Percona XtraDB Cluster deployments. This will provide better performance and less impact on the network, but in the case of a big dataset (100GB+) a node failure will require a State Snapshot Transfer (SST), with a big impact on the cluster and network. So the solution should be tailored to your workload and your requirements.
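To get a feel for the cost of a full SST, here is a rough back-of-the-envelope sketch. It assumes the transfer is limited only by network bandwidth and uses decimal units, so the result is a lower bound. Real transfers also depend on disk speed, compression and the load on the donor node, and the link speeds shown are example values rather than measurements.

```python
def sst_lower_bound_seconds(dataset_gb, link_gbit_per_s):
    """Minimum time to copy a dataset over a link, ignoring all other overhead."""
    bytes_total = dataset_gb * 1e9
    bytes_per_second = link_gbit_per_s * 1e9 / 8  # bits to bytes
    return bytes_total / bytes_per_second


# A 100 GB dataset over a fully available 1 Gbit/s link.
print(sst_lower_bound_seconds(100, 1))   # 800.0 seconds, about 13 minutes
# The same dataset over a 10 Gbit/s link.
print(sst_lower_bound_seconds(100, 10))  # 80.0 seconds
```

During that window the link is saturated and the donor node is busy, which is the impact on the cluster and network mentioned above.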
Now that we have a basic setup working, it would be good to understand the performance impact of running Percona XtraDB Cluster in Kubernetes. Is the network and storage overhead acceptable, or is it too big? We plan to look into this in the future.
Once again, our project is located at https://github.com/Percona-Lab/percona-openshift. We are looking for your feedback and for your experience of running databases in Kubernetes/OpenShift.