VividCortex Agent Benchmark: Measuring the Likely Overhead

Intro

The purpose of this project was to measure the potential overhead of the VividCortex agent, which is used by the VividCortex.com database monitoring system. This benchmark is part of a consulting engagement with VividCortex and was paid for by the customer.

The assumption is that the VividCortex agent uses CPU processing time, so we should see an impact on user queries when the workload is CPU-intensive (how much is what we set out to measure). The impact on IO-bound workloads should be small or insignificant.

Workload Description
For this, we use the LinkBenchX benchmark with a combination of different options.

Workloads

There are 3 different workloads we want to look into:

CPU bound. All data fits into memory, so database performance is limited by CPU and memory speed. In this mode the server is 100% CPU bound, and we should see reduced performance when the agent is running.
CPU bound with limited CPUs. This mode is identical to the previous one, except that only 4 CPUs are available to the mysqld and vc-mysql-query processes. This emulates a “cloud-based” environment. Because even fewer CPUs are available in this workload, we expect an even bigger performance hit when running with the agent.
IO bound. In this workload performance is limited by storage IO performance, so the impact of the agent should be minimal.
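
The text does not say how the 4-CPU limit was applied. As a minimal sketch of one way to do it on Linux, the snippet below pins every thread of a running process to a fixed set of CPU ids. The function name `pin_process_to_cpus` is hypothetical, and walking `/proc/<pid>/task` is an assumption about the approach, not the method used in this benchmark.

```python
import os

def pin_process_to_cpus(pid, cpus):
    """Restrict all current threads of `pid` to the given CPU ids (Linux only).

    Hypothetical helper: the benchmark does not state how CPUs were limited.
    Affinity is set per thread, so each thread id under /proc/<pid>/task is
    pinned. Threads created later inherit the affinity of their creator.
    """
    cpu_set = set(cpus)
    for tid in os.listdir(f"/proc/{pid}/task"):
        os.sched_setaffinity(int(tid), cpu_set)
    # Report the affinity now in effect for the main thread.
    return os.sched_getaffinity(pid)
```

For the limited-CPU workload this would be called once for the mysqld PID and once for the vc-mysql-query PID, for example with `cpus=range(4)`. Threads that start while the loop is running could be missed, so launching the process under an affinity mask from the start is more robust.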

LinkBenchX modes

We use two modes in LinkBenchX:

Throughput mode. This mode measures the maximum throughput that the database server can achieve.
“Request rate” mode. In this mode, LinkBenchX generates load at a specified rate, which makes it possible to measure and compare response times across configurations. Usually, we set the request rate to 75% of the maximum throughput.
We measure throughput and 99th percentile response time for the ADD_LINKS (write) and GET_LINKS_LIST (range select) operations in 10-second intervals.
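
To make the measurement concrete, here is a minimal sketch of how per-interval throughput and 99th percentile response time could be derived from raw samples, and how a 75% request rate follows from the maximum throughput. The input format (timestamp, latency pairs), the function names, and the nearest-rank percentile definition are all assumptions for illustration. LinkBenchX reports these figures itself.

```python
from collections import defaultdict

def summarize(samples, interval_sec=10, pct=99):
    """Group (timestamp_sec, latency_ms) samples of one operation type
    into fixed intervals.

    Returns {interval_start_sec: (ops_per_sec, percentile_latency_ms)}.
    The percentile uses the nearest-rank definition (an assumption).
    """
    buckets = defaultdict(list)
    for ts, latency in samples:
        start = int(ts // interval_sec) * interval_sec
        buckets[start].append(latency)

    summary = {}
    for start, latencies in sorted(buckets.items()):
        latencies.sort()
        n = len(latencies)
        rank = (pct * n + 99) // 100  # ceil(pct * n / 100), 1-based rank
        summary[start] = (n / interval_sec, latencies[rank - 1])
    return summary

def target_request_rate(max_throughput, fraction=0.75):
    """Request rate for 'request rate' mode as a fraction of max throughput."""
    return max_throughput * fraction
```

Running `summarize` separately on the ADD_LINKS and GET_LINKS_LIST samples gives one timeline per operation and configuration, which is the shape of data the supporting graphs plot.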

Agent modes
We wanted to compare the performance impact of the VividCortex agent with that of PERFORMANCE_SCHEMA, so we measured the 4 combinations of the following enabled/disabled settings:

Performance Schema disabled (OFF in my.cnf), marked as "NO-PS" in charts
Performance Schema enabled (ON in my.cnf, no additional probes enabled), marked as "with-PS" in charts
VividCortex agent is not active ("NO-vc-agent" in charts)
VividCortex agent is active ("vc-agent" in charts)

Summary
CPU bound

In the CPU-bound workload, the impact of enabling the agent is the most significant.

Throughput impact:

Enabling PERFORMANCE_SCHEMA: 1.8% overhead in throughput
Enabling vc-agent: 10.7% overhead in throughput
Enabling PERFORMANCE_SCHEMA and vc-agent: 11.7% overhead in throughput

It is worth highlighting that the vc-mysql-query agent’s CPU consumption is related to the amount of traffic it has to sniff, and on CPU-bound workloads with high query traffic it can use up to a single CPU core. We did not benchmark a CPU-bound server with low query traffic. That is why we decided to also measure the impact of vc-agent when only a limited number of CPUs is available (for example, on a cloud or container server).

Response time impact:
on ADD_LINKS operation

PERFORMANCE_SCHEMA impact: added 20% to response time
vc-agent impact: added 42% to response time
PERFORMANCE_SCHEMA and vc-agent impact: added 63% to response time

on GET_LINKS_LIST operation

PERFORMANCE_SCHEMA impact: added 7.7% to response time
vc-agent impact: added 21.6% to response time
PERFORMANCE_SCHEMA and vc-agent impact: added 30.9% to response time
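
The post does not spell out how these percentages were computed. A minimal sketch, assuming the conventional definitions relative to the baseline run (NO-PS, NO-vc-agent), could look like this. The function names are hypothetical.

```python
def throughput_overhead_pct(baseline_ops, measured_ops):
    """Percent of baseline throughput lost in the measured configuration."""
    return (baseline_ops - measured_ops) / baseline_ops * 100.0

def response_time_increase_pct(baseline_ms, measured_ms):
    """Percent added to the baseline response time in the measured configuration."""
    return (measured_ms - baseline_ms) / baseline_ms * 100.0
```

As an illustration with made-up inputs (not measured values), a baseline of 1000 operations per second dropping to 893 corresponds to an overhead of about 10.7%, and a response time rising from 10 ms to 14.2 ms corresponds to about 42% added.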

CPU bound (4 CPU cores are available)
With only 4 CPU cores available, the impact of enabling the agent is even larger than in the unrestricted CPU-bound workload.

Throughput impact:

Enabling PERFORMANCE_SCHEMA: 5% overhead in throughput
Enabling vc-agent: 13% overhead in throughput
Enabling PERFORMANCE_SCHEMA and vc-agent: 17.8% overhead in throughput

Response time impact:
on ADD_LINKS operation

PERFORMANCE_SCHEMA impact: added 14% to response time
vc-agent impact: added 52% to response time
PERFORMANCE_SCHEMA and vc-agent impact: added 84% to response time

on GET_LINKS_LIST operation

PERFORMANCE_SCHEMA impact: added 10% to response time
vc-agent impact: added 51% to response time
PERFORMANCE_SCHEMA and vc-agent impact: added 74% to response time

IO Bound
There is no statistically significant difference in throughput or response time when running with PERFORMANCE_SCHEMA enabled and/or with vc-agent.

Conclusion
The impact of running vc-agent and/or PERFORMANCE_SCHEMA may be negligible or significant, depending on your workload. The numbers above (the impact on throughput and response time) are border cases and can be used as low and high limits when estimating the impact. For a real workload the overhead will most likely fall somewhere in between.
There is no measurable impact from vc-agent in the IO-bound workload, but in a CPU-bound workload you may want to make sure you have spare CPU cycles for vc-agent, as it is computation-intensive and may add visible overhead to response times.
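
As a rough illustration of using the border cases as limits, the sketch below turns a low and a high overhead percentage into an expected throughput range. The function name and the baseline figure are hypothetical. Only the overhead percentages come from the summary above.

```python
def throughput_range(baseline_ops, low_overhead_pct, high_overhead_pct):
    """Return (worst_case_ops, best_case_ops) for a given baseline throughput.

    low_overhead_pct and high_overhead_pct are the border-case overheads,
    e.g. 0 for the IO-bound case and 13 for vc-agent with 4 CPU cores.
    """
    worst = baseline_ops * (1 - high_overhead_pct / 100.0)
    best = baseline_ops * (1 - low_overhead_pct / 100.0)
    return worst, best
```

For a hypothetical server doing 10,000 operations per second without the agent, `throughput_range(10000, 0, 13)` gives a range of roughly 8,700 to 10,000 operations per second with vc-agent enabled. Where a real workload lands in that range depends on how CPU-bound it is.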

Supporting graphs.

CPU bound

Throughput timeline