Where the open source community meets: Secure your spot for Percona Live Amsterdam! - Register

Downloads

Blog

Measuring MySQL Performance in Kubernetes

May 9, 2019

Author

Vadim Tkachenko

MySQL

Percona Software

Share this Post:

Measuring MySQL Performance in Kubernetes In my previous post Running MySQL/Percona Server in Kubernetes with a Custom Config I’ve looked at how to set up MySQL in Kubernetes to utilize system resources fully. Today I want to measure if there is any performance overhead of running MySQL in Kubernetes, and show what challenges I faced trying to measure it.

I will use a very simple CPU bound benchmark to measure MySQL performance in OLTP read-only workload:

sysbench oltp_read_only --report-interval=1 --time=1800 --threads=56 --tables=10 --table-size=10000000 --mysql-user=sbtest --mysql-password=sbtest --mysql-socket=/var/lib/mysql/mysql.sock run

1	sysbench oltp_read_only --report-interval=1 --time=1800 --threads=56 --tables=10 --table-size=10000000 --mysql-user=sbtest --mysql-password=sbtest --mysql-socket=/var/lib/mysql/mysql.sock run

The hardware is as follows:

Supermicro server

- Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz

- 2 sockets / 28 cores / 56 threads

- Memory: 256GB of RAM

The most interesting number there is 28 cores / 56 threads. Please keep this in mind; we will need this later.

So let’s see the MySQL performance in the bare metal setup:

[ 607s ] thds: 56 tps: 22154.20 qps: 354451.12 (r/w/o: 310143.73/0.00/44307.39) lat (ms,95%): 2.61 err/s: 0.00 reconn/s: 0.00
[ 608s ] thds: 56 tps: 22247.80 qps: 355955.88 (r/w/o: 311461.27/0.00/44494.61) lat (ms,95%): 2.61 err/s: 0.00 reconn/s: 0.00
[ 609s ] thds: 56 tps: 21984.01 qps: 351641.13 (r/w/o: 307672.12/0.00/43969.02) lat (ms,95%): 2.66 err/s: 0.00 reconn/s: 0.00

[ 607s ] thds: 56 tps: 22154.20 qps: 354451.12 (r/w/o: 310143.73/0.00/44307.39) lat (ms,95%): 2.61 err/s: 0.00 reconn/s: 0.00

[ 608s ] thds: 56 tps: 22247.80 qps: 355955.88 (r/w/o: 311461.27/0.00/44494.61) lat (ms,95%): 2.61 err/s: 0.00 reconn/s: 0.00

[ 609s ] thds: 56 tps: 21984.01 qps: 351641.13 (r/w/o: 307672.12/0.00/43969.02) lat (ms,95%): 2.66 err/s: 0.00 reconn/s: 0.00

So we can get about 22000 qps on this server.

Now, let’s see what we can get if the same server runs a Kubernetes node and we deploy the Percona server image on this node. I will use a modified image of Percona Server 8, which already includes sysbench inside.

You can find my image here: https://hub.docker.com/r/vadimtk/ps-8-vadim

And I use the following deployment yaml :

apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  selector:
    app: mysql
  ports:
    - name: mysql
      port: 3306
      protocol: TCP
      targetPort: 3306
---
apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mysql
    spec:
      nodeSelector:
        kubernetes.io/hostname: smblade01
      volumes:
        - name: mysql-persistent-storage
          hostPath:
            path: /mnt/data/mysql
            type: Directory
        - name: config-volume
          configMap:
            name: mysql-config
            optional: true
      containers:
      - image: vadimtk/ps-8-vadim
        imagePullPolicy: Always
        name: mysql
        env:
          # Use secret in real usage
        - name: MYSQL_ROOT_PASSWORD
          value: password
        ports:
        - containerPort: 3306
          name: mysql
        volumeMounts:
        - name: mysql-persistent-storage
          mountPath: /var/lib/mysql
        - name: config-volume
          mountPath: /etc/my.cnf.d

apiVersion: v1

kind: Service

metadata:

spec:

selector:

app: mysql

ports:

- name: mysql

port: 3306

protocol: TCP

targetPort: 3306

---

apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2

kind: Deployment

metadata:

spec:

selector:

matchLabels:

app: mysql

strategy:

type: Recreate

template:

metadata:

labels:

app: mysql

spec:

nodeSelector:

kubernetes.io/hostname: smblade01

volumes:

- name: mysql-persistent-storage

hostPath:

path: /mnt/data/mysql

type: Directory

- name: config-volume

configMap:

optional: true

containers:

- image: vadimtk/ps-8-vadim

imagePullPolicy: Always

env:

# Use secret in real usage

- name: MYSQL_ROOT_PASSWORD

value: password

ports:

- containerPort: 3306

volumeMounts:

- name: mysql-persistent-storage

mountPath: /var/lib/mysql

- name: config-volume

mountPath: /etc/my.cnf.d

The most important part here is that we deploy our image on smblade01 node (the same one I ran the bare metal benchmark).

Let’s see what kind of performance we get using this setup. The number I’ve got:

[ 605s ] thds: 56 tps: 10561.88 qps: 169045.04 (r/w/o: 147921.29/0.00/21123.76) lat (ms,95%): 12.98 err/s: 0.00 reconn/s: 0.00
[ 606s ] thds: 56 tps: 10552.00 qps: 168790.98 (r/w/o: 147685.98/0.00/21105.00) lat (ms,95%): 15.83 err/s: 0.00 reconn/s: 0.00
[ 607s ] thds: 56 tps: 10566.00 qps: 169073.97 (r/w/o: 147942.97/0.00/21131.00) lat (ms,95%): 5.77 err/s: 0.00 reconn/s: 0.00
[ 608s ] thds: 56 tps: 10581.08 qps: 169359.21 (r/w/o: 148195.06/0.00/21164.15) lat (ms,95%): 5.47 err/s: 0.00 reconn/s: 0.00
[ 609s ] thds: 56 tps: 12873.80 qps: 205861.77 (r/w/o: 180116.17/0.00/25745.60) lat (ms,95%): 5.37 err/s: 0.00 reconn/s: 0.00
[ 610s ] thds: 56 tps: 20196.89 qps: 323184.24 (r/w/o: 282789.46/0.00/40394.78) lat (ms,95%): 3.02 err/s: 0.00 reconn/s: 0.00
[ 611s ] thds: 56 tps: 18033.21 qps: 288487.30 (r/w/o: 252421.88/0.00/36065.41) lat (ms,95%): 5.28 err/s: 0.00 reconn/s: 0.00
[ 612s ] thds: 56 tps: 11444.08 qps: 183129.22 (r/w/o: 160241.06/0.00/22888.15) lat (ms,95%): 5.37 err/s: 0.00 reconn/s: 0.00
[ 613s ] thds: 56 tps: 10597.96 qps: 169511.35 (r/w/o: 148316.43/0.00/21194.92) lat (ms,95%): 5.57 err/s: 0.00 reconn/s: 0.00
[ 614s ] thds: 56 tps: 10566.00 qps: 169103.93 (r/w/o: 147969.94/0.00/21133.99) lat (ms,95%): 5.67 err/s: 0.00 reconn/s: 0.00
[ 615s ] thds: 56 tps: 10640.07 qps: 170227.13 (r/w/o: 148948.99/0.00/21278.14) lat (ms,95%): 5.47 err/s: 0.00 reconn/s: 0.00
[ 616s ] thds: 56 tps: 10579.04 qps: 169264.66 (r/w/o: 148106.58/0.00/21158.08) lat (ms,95%): 5.47 err/s: 0.00 reconn/s: 0.00

[ 605s ] thds: 56 tps: 10561.88 qps: 169045.04 (r/w/o: 147921.29/0.00/21123.76) lat (ms,95%): 12.98 err/s: 0.00 reconn/s: 0.00

[ 606s ] thds: 56 tps: 10552.00 qps: 168790.98 (r/w/o: 147685.98/0.00/21105.00) lat (ms,95%): 15.83 err/s: 0.00 reconn/s: 0.00

[ 607s ] thds: 56 tps: 10566.00 qps: 169073.97 (r/w/o: 147942.97/0.00/21131.00) lat (ms,95%): 5.77 err/s: 0.00 reconn/s: 0.00

[ 608s ] thds: 56 tps: 10581.08 qps: 169359.21 (r/w/o: 148195.06/0.00/21164.15) lat (ms,95%): 5.47 err/s: 0.00 reconn/s: 0.00

[ 609s ] thds: 56 tps: 12873.80 qps: 205861.77 (r/w/o: 180116.17/0.00/25745.60) lat (ms,95%): 5.37 err/s: 0.00 reconn/s: 0.00

[ 610s ] thds: 56 tps: 20196.89 qps: 323184.24 (r/w/o: 282789.46/0.00/40394.78) lat (ms,95%): 3.02 err/s: 0.00 reconn/s: 0.00

[ 611s ] thds: 56 tps: 18033.21 qps: 288487.30 (r/w/o: 252421.88/0.00/36065.41) lat (ms,95%): 5.28 err/s: 0.00 reconn/s: 0.00

[ 612s ] thds: 56 tps: 11444.08 qps: 183129.22 (r/w/o: 160241.06/0.00/22888.15) lat (ms,95%): 5.37 err/s: 0.00 reconn/s: 0.00

[ 613s ] thds: 56 tps: 10597.96 qps: 169511.35 (r/w/o: 148316.43/0.00/21194.92) lat (ms,95%): 5.57 err/s: 0.00 reconn/s: 0.00

[ 614s ] thds: 56 tps: 10566.00 qps: 169103.93 (r/w/o: 147969.94/0.00/21133.99) lat (ms,95%): 5.67 err/s: 0.00 reconn/s: 0.00

[ 615s ] thds: 56 tps: 10640.07 qps: 170227.13 (r/w/o: 148948.99/0.00/21278.14) lat (ms,95%): 5.47 err/s: 0.00 reconn/s: 0.00

[ 616s ] thds: 56 tps: 10579.04 qps: 169264.66 (r/w/o: 148106.58/0.00/21158.08) lat (ms,95%): 5.47 err/s: 0.00 reconn/s: 0.00

You can see the numbers vary a lot, from 10550 tps to 20196 tps, with the most time being in the 10000tps range.
That’s quite disappointing. Basically, we lost half of the throughput by moving to the Kubernetes node.

But don’t panic, we can improve this. But first, we need to understand why this happens.

The answer lies in how Kubernetes applies Quality of Service for Pods. By default (if CPU or Memory limits are not defined) the QoS is BestEffort, which leads to the results we see above. To allocate all CPU resources, we need to make sure QoS Guaranteed. For this, we add the following to the image definition:

resources:
          requests:
            cpu: "55500m"
            memory: "150Gi"
          limits:
            cpu: "55500m"
            memory: "150Gi"

resources:

requests:

cpu: "55500m"

memory: "150Gi"

limits:

cpu: "55500m"

memory: "150Gi"

These are somewhat funny lines to define CPU limits. As you remember we have 56 threads, so initially I tried to set limits: cpu: "56" , but it did not work as Kubernetes was not able to start the pod with the error Insufficient CPU. I guess Kubernetes allocates a few CPU percentages for the internal needs.

So the line cpu: "55500m" works, which means we allocate 55.5 CPU for Percona Server.

Let’s see what results we can have with Guaranteed QoS:

[ 883s ] thds: 56 tps: 20320.06 qps: 325145.96 (r/w/o: 284504.84/0.00/40641.12) lat (ms,95%): 2.81 err/s: 0.00 reconn/s: 0.00
[ 884s ] thds: 56 tps: 20908.89 qps: 334587.21 (r/w/o: 292769.43/0.00/41817.78) lat (ms,95%): 2.81 err/s: 0.00 reconn/s: 0.00
[ 885s ] thds: 56 tps: 20529.03 qps: 328459.46 (r/w/o: 287402.40/0.00/41057.06) lat (ms,95%): 2.81 err/s: 0.00 reconn/s: 0.00
[ 886s ] thds: 56 tps: 17567.75 qps: 281051.03 (r/w/o: 245914.53/0.00/35136.50) lat (ms,95%): 5.47 err/s: 0.00 reconn/s: 0.00
[ 887s ] thds: 56 tps: 18036.82 qps: 288509.07 (r/w/o: 252437.44/0.00/36071.63) lat (ms,95%): 5.47 err/s: 0.00 reconn/s: 0.00
[ 888s ] thds: 56 tps: 18398.23 qps: 294399.67 (r/w/o: 257603.21/0.00/36796.46) lat (ms,95%): 5.47 err/s: 0.00 reconn/s: 0.00
[ 889s ] thds: 56 tps: 18402.90 qps: 294484.45 (r/w/o: 257677.65/0.00/36806.81) lat (ms,95%): 5.47 err/s: 0.00 reconn/s: 0.00
[ 890s ] thds: 56 tps: 19428.12 qps: 310787.86 (r/w/o: 271934.63/0.00/38853.23) lat (ms,95%): 5.37 err/s: 0.00 reconn/s: 0.00
[ 891s ] thds: 56 tps: 19848.69 qps: 317646.11 (r/w/o: 277947.73/0.00/39698.39) lat (ms,95%): 5.28 err/s: 0.00 reconn/s: 0.00
[ 892s ] thds: 56 tps: 20457.28 qps: 327333.49 (r/w/o: 286417.93/0.00/40915.56) lat (ms,95%): 2.86 err/s: 0.00 reconn/s: 0.00

[ 883s ] thds: 56 tps: 20320.06 qps: 325145.96 (r/w/o: 284504.84/0.00/40641.12) lat (ms,95%): 2.81 err/s: 0.00 reconn/s: 0.00

[ 884s ] thds: 56 tps: 20908.89 qps: 334587.21 (r/w/o: 292769.43/0.00/41817.78) lat (ms,95%): 2.81 err/s: 0.00 reconn/s: 0.00

[ 885s ] thds: 56 tps: 20529.03 qps: 328459.46 (r/w/o: 287402.40/0.00/41057.06) lat (ms,95%): 2.81 err/s: 0.00 reconn/s: 0.00

[ 886s ] thds: 56 tps: 17567.75 qps: 281051.03 (r/w/o: 245914.53/0.00/35136.50) lat (ms,95%): 5.47 err/s: 0.00 reconn/s: 0.00

[ 887s ] thds: 56 tps: 18036.82 qps: 288509.07 (r/w/o: 252437.44/0.00/36071.63) lat (ms,95%): 5.47 err/s: 0.00 reconn/s: 0.00

[ 888s ] thds: 56 tps: 18398.23 qps: 294399.67 (r/w/o: 257603.21/0.00/36796.46) lat (ms,95%): 5.47 err/s: 0.00 reconn/s: 0.00

[ 889s ] thds: 56 tps: 18402.90 qps: 294484.45 (r/w/o: 257677.65/0.00/36806.81) lat (ms,95%): 5.47 err/s: 0.00 reconn/s: 0.00

[ 890s ] thds: 56 tps: 19428.12 qps: 310787.86 (r/w/o: 271934.63/0.00/38853.23) lat (ms,95%): 5.37 err/s: 0.00 reconn/s: 0.00

[ 891s ] thds: 56 tps: 19848.69 qps: 317646.11 (r/w/o: 277947.73/0.00/39698.39) lat (ms,95%): 5.28 err/s: 0.00 reconn/s: 0.00

[ 892s ] thds: 56 tps: 20457.28 qps: 327333.49 (r/w/o: 286417.93/0.00/40915.56) lat (ms,95%): 2.86 err/s: 0.00 reconn/s: 0.00

This is much better (mostly ranging in 20000 tps), but we still do not get to 22000 tps.

I do not have the full explanation of why there is still a 10% performance loss, but it might be related to this issue. And I see there is a work in progress to improve Guaranteed QoS performance but it was not merged into the mainstream releases yet. Hopefully, it will be in one of the next releases.

Conclusions:

- Out of the box, you may see quite bad performance when deploying in Kubernetes POD

- To improve your experience you need to make sure you use Guaranteed QoS. Unfortunately, Kubernetes does not make it easy. You need to manually set the number of CPU threads, which is not always obvious if you use dynamic cloud instances.

- With Guaranteed QoS there is still a performance overhead of 10%, but I guess this is the cost we have to accept at the moment.

0 0 votes

Article Rating

4 Comments

Oldest

Newest Most Voted

Matt Lord

7 years ago

Thanks for doing this, Vadim! I’m curious if you have yet done any testing with the beta CPU Manager features available in Kubernetes 1.10+? I wonder if using the –cpu-manager-policy=static option for your kublet(s), while continuing to use the Guaranteed QoS class for your Pod(s) as you showed here, might help to close the noted performance gap completely?
– https://kubernetes.io/blog/2018/07/24/feature-highlight-cpu-manager/
– https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/

I’ll be playing around with similar things myself in the future and will share any interesting findings. Thanks again for sharing this work! Good stuff.

Author

Vadim Tkachenko

7 years ago

Reply to Matt Lord

Matt,

I did not test –cpu-manager-policy=static , but this is something I plan to look into.
This is one area I find problematic with Kubernetes – it is not quite obvious how to changes settings easily cluster wise. I could not figure out it quickly in my current testing.

zainal

7 years ago

Very good inside, but your example is for scale up. Kubernetes design is for scale out. If you add more minion you will have better performance without to increase cpu and memory

Kimo Paul

7 years ago

The Kubernetes cluster consists of a set of nodes – which may be physical or virtual servers – on premise or on cloud – that host applications in the form of containers. https://www.youtube.com/watch?v=8C_SCDbUJTg