How to Measure MySQL Performance in Kubernetes with Sysbench

As our Percona Kubernetes Operator for Percona XtraDB Cluster gains in popularity, I am getting questions about its performance and how to measure it properly. Sysbench is the most popular tool for database performance evaluation, so let’s review how we can use it with Percona XtraDB Cluster Operator.

Operator Setup

I will assume that you have an operator running (if not, this is the topic for a different post). We have the documentation on how to get it going, and we will start a three-node cluster using the following cr.yaml file:

If we are successful, we will have three pods running:

It’s important to note that IP addresses allocated are internal to Kubernetes Pods and not routable outside of Kubernetes.

Sysbench on a Host External to Kubernetes

In this part, let’s assume we want to run a client (sysbench) on a separate host that is not part of the Kubernetes cluster. How do we do it? We need to expose one of the pods (or several) to the external world, and for this, we use a Kubernetes service of type NodePort:
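As a minimal sketch of this step, the helpers below wrap `kubectl expose` and a `kubectl get service` query for the allocated node port. The function names and the service name `pxc-0-external` are hypothetical and chosen only for illustration; adjust the pod name to match your cluster.

```shell
# Expose a single PXC pod through a NodePort service.
# The default service name is a hypothetical placeholder.
expose_pxc_pod() {
    pod="${1:-cluster1-pxc-0}"
    svc="${2:-pxc-0-external}"
    kubectl expose pod "$pod" --type=NodePort --port=3306 --name="$svc"
}

# Print the node port that Kubernetes allocated for the service.
show_node_port() {
    svc="${1:-pxc-0-external}"
    kubectl get service "$svc" -o jsonpath='{.spec.ports[0].nodePort}'
}
```

Calling `expose_pxc_pod` followed by `show_node_port` should report the port to use from the external host.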

So here we see that port 3306 (MySQL port) is exposed as port 30160 on node-3 (node where pod cluster1-pxc-0 is running). Please note this will invoke a kube-proxy process on node-3, which will handle incoming traffic on port 30160 and route it to the cluster1-pxc-0 pod. Kube-proxy by itself will introduce some networking overhead.

To find the IP address of Node-3:
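One way to look this up, shown here as a sketch, is `kubectl get node` with `-o wide`, which adds INTERNAL-IP and EXTERNAL-IP columns to the output. Which of the two addresses is reachable from your external host depends on your network setup, so treat the choice as an assumption to verify. The function name is hypothetical.

```shell
# Show the addresses of a node; the INTERNAL-IP and EXTERNAL-IP
# columns appear in the wide output format.
show_node_addresses() {
    node="${1:-node-3}"
    kubectl get node "$node" -o wide
}
```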

So now we can connect the dots: point the mysql client at that IP address on port 30160 and create the sbtest database, which we need in order to run sysbench:
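A minimal sketch of that step with the standard mysql client follows. The `root` user is an assumption (use whatever account your cluster secrets define), the function name is hypothetical, and the node IP must be supplied by you.

```shell
# Create the sbtest schema through the NodePort service.
# $1 = node IP address, $2 = node port (defaults to 30160 from the text).
create_sbtest_db() {
    node_ip="$1"
    node_port="${2:-30160}"
    # -p with no value makes the client prompt for the password.
    # The root user is an assumption, not taken from the original setup.
    mysql --host="$node_ip" --port="$node_port" --user=root -p \
        -e 'CREATE DATABASE IF NOT EXISTS sbtest'
}
```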

And now we can prepare data for sysbench (never mind some of the parameters; we will come to them later).

Sysbench Running Inside Kubernetes

When we have sysbench running inside Kubernetes, it makes all these networking steps unnecessary and it simplifies a lot of things while also making one more complicated: how do you actually start a pod with sysbench?

To start, we need an image with sysbench, and conveniently we already have one on Docker Hub, available as perconalab/sysbench, so we will use that one. With an image, you can prepare a yaml file and start a pod with kubectl create -f sysbench.yaml, or, as I prefer, invoke it directly from the command line (which is a little bit elaborate):

This way, Kubernetes will schedule sysbench-client pod on any available node, which may not be something we want. To schedule sysbench-client on a specific node, we can use:
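One possible way to do this, sketched below, is to pass a `nodeSelector` through the `--overrides` flag of `kubectl run`. The sketch assumes the node's `kubernetes.io/hostname` label matches the node name, which is typical but worth checking with `kubectl get nodes --show-labels`. The function name is hypothetical.

```shell
# Start an interactive sysbench-client pod pinned to one node.
# Assumes the kubernetes.io/hostname label equals the node name.
run_sysbench_on_node() {
    node="${1:-node-3}"
    overrides=$(printf '{"apiVersion":"v1","spec":{"nodeSelector":{"kubernetes.io/hostname":"%s"}}}' "$node")
    kubectl run -it --rm sysbench-client \
        --image=perconalab/sysbench \
        --restart=Never \
        --overrides="$overrides" \
        -- bash
}
```

Dropping the `--overrides` argument gives the unpinned variant, where the scheduler picks any available node.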

This will start sysbench-client on node-3. Now, from the pod's command line, we can access mysql simply by using the cluster1-pxc-0 hostname:

A Quick Intro to Sysbench

Although we have covered sysbench multiple times, I was asked to provide a basic intro for different scenarios, so I would like to review some basic options for sysbench.

Prepare Data

Before running a benchmark, we need to prepare the data. From our previous example:
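A sketch of a prepare invocation consistent with the description that follows is shown here. The table count and table size come from the text; the `oltp_read_only` script choice, the `root` user, the `MYSQL_PASSWORD` environment variable, the thread count, and the function name are assumptions made for illustration.

```shell
# Load the sbtest schema: ten tables of one million rows each.
# $1 = MySQL host, $2 = MySQL port (defaults to 3306).
prepare_sbtest() {
    host="$1"
    port="${2:-3306}"
    # User, password variable, and thread count are illustrative assumptions.
    sysbench oltp_read_only \
        --tables=10 --table-size=1000000 \
        --mysql-host="$host" --mysql-port="$port" \
        --mysql-user=root --mysql-password="${MYSQL_PASSWORD:-}" \
        --mysql-db=sbtest --threads=10 \
        prepare
}
```

From the external host this would be called with the node IP and node port; from inside Kubernetes, with the pod hostname and the default port.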

This will create ten tables with 1mln rows each. Each table is about 250MB in size, for a total of 2.5GB of data. This gives us an idea of which knobs we can use to generate less or more data.
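The sizing arithmetic can be sketched with two small helpers. They assume size scales linearly with row count at roughly 250MB per 1mln rows, as stated above, and use 1GB = 1000MB; real on-disk size will vary somewhat. The function names are hypothetical.

```shell
# Approximate data size in MB for a given number of tables and rows per table,
# assuming about 250MB per one million rows.
estimate_size_mb() {
    tables="$1"
    rows_per_table="$2"
    echo $(( tables * rows_per_table / 1000000 * 250 ))
}

# Number of 1mln-row tables needed to reach a target size in GB.
tables_for_gb() {
    target_gb="$1"
    echo $(( target_gb * 1000 / 250 ))
}
```

For example, `estimate_size_mb 10 1000000` gives 2500 (the 2.5GB above), and `tables_for_gb 25` gives 100.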

If we want, say, 25GB of data, we can use either 100 tables with 1mln rows each or ten tables with 10mln rows each. For 50GB of data, we can use 200 tables with 1mln rows or ten tables with 20mln rows, or any combination of tables