sysbench Histograms: A Helpful Feature Often Overlooked

In this blog post, I will demonstrate how to run and use sysbench histograms.

One sysbench feature that I often see overlooked (and rarely used) is its ability to produce detailed query response time histograms in addition to computing percentile numbers. Looking at histograms together with throughput or latency over time provides many additional insights into query performance.

Here is how you get detailed sysbench histograms and performance over time:

There are a few command line options to consider:

  • --report-interval=1 prints the current performance measurements every second, which helps you see whether performance is uniform or whether you have stalls or otherwise high variance
  • --percentile=99 computes the 99th percentile response time rather than the 95th percentile (the default); I prefer the 99th percentile because it captures more of the slow tail
  • --histogram=on produces a histogram at the end of the run (as shown below)
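To make the options above concrete, here is a minimal sketch that assembles a sysbench command line containing them. The test name `oltp_point_select`, the helper name `build_sysbench_command` and the `extra_args` placeholder (where connection settings would go) are assumptions for illustration, not the exact command used for this post.

```python
def build_sysbench_command(test_name="oltp_point_select", extra_args=()):
    """Return a sysbench argument list with the reporting options from the list above.

    test_name and extra_args are illustrative assumptions; put your own
    connection and table options in extra_args.
    """
    return [
        "sysbench",
        test_name,
        "--report-interval=1",  # print throughput and latency every second
        "--percentile=99",      # report the 99th percentile instead of the 95th
        "--histogram=on",       # print the latency histogram at the end of the run
        *extra_args,
        "run",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without sysbench installed.
    print(" ".join(build_sysbench_command()))
```

Passing the resulting list to `subprocess.run` would execute the benchmark, provided sysbench is installed and a database is prepared for the chosen test.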

The first thing to note about this histogram is that it is exponential: the buckets get wider as the values get larger. It starts at 0.001 ms (one microsecond) and gradually grows. This design lets sysbench handle workloads with requests that take small fractions of a millisecond, as well as requests that take many seconds (or minutes).
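As a rough illustration of exponential bucketing, the sketch below applies the bucket-index formula quoted by Alexey Kopytov in the comments under this post. The range limits, the bucket count and the way `RANGE_DEDUCT` and `RANGE_MULT` are derived are assumptions chosen for the example; they are not confirmed by the post.

```python
import math

# Assumed, illustrative parameters (not taken from the post).
RANGE_MIN_MS = 0.001      # smallest latency the histogram distinguishes
RANGE_MAX_MS = 100_000.0  # largest latency the histogram distinguishes
NUM_BUCKETS = 1024

# Assumed definitions: map log(latency) linearly onto the bucket indexes.
RANGE_DEDUCT = math.log(RANGE_MIN_MS)
RANGE_MULT = (NUM_BUCKETS - 1) / (math.log(RANGE_MAX_MS) - RANGE_DEDUCT)


def bucket_index(value_ms):
    """Bucket index for a latency, following the formula quoted in the comments."""
    i = math.floor((math.log(value_ms) - RANGE_DEDUCT) * RANGE_MULT + 0.5)
    # Clamp so out-of-range values land in the first or last bucket.
    return min(max(i, 0), NUM_BUCKETS - 1)


def bucket_value(i):
    """Representative latency (in ms) for bucket i, the inverse of the mapping above."""
    return math.exp(i / RANGE_MULT + RANGE_DEDUCT)


if __name__ == "__main__":
    for latency in (0.001, 0.2, 18.0, 1000.0):
        print(latency, "ms -> bucket", bucket_index(latency))
```

Because the index is computed directly from a logarithm rather than by searching a list of bucket boundaries, the cost per request does not depend on the number of buckets.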

Next, we learn some very interesting things about the typical request response time distribution for databases. You might think that this distribution would be close to some “academic” distribution, such as the normal distribution. In reality, what we often observe is something of a “camelback” distribution (not a real term), and our “camel” can have more than two humps (especially for simple requests such as the single primary key lookup shown here).

Why do request response times tend to have this distribution? It is because requests can take multiple paths inside the database. For example, certain requests might get responses from the MySQL Query Cache (which will result in the first hump). A second hump might come from resolving lookups using the InnoDB Adaptive Hash Index. A third hump might come from finding all the data in memory (rather than through the Adaptive Hash Index). Finally, another hump might coalesce around the time (or times) it takes to execute requests that require disk IO.

You will also likely see some long-tail data that highlights the fact that MySQL and Linux are not hard real-time systems. As an example, this very simple run with a single thread (and thus no contention) has an outlier at around 18 ms, while most of the requests are served within 0.2 ms or less.
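To show how a histogram relates to percentile figures and to a long tail, here is a small sketch that reads a percentile off a list of (latency, count) buckets. The bucket counts in `EXAMPLE` are made up for illustration and are not the results of the run described in this post; the function names are hypothetical, and this is not necessarily how sysbench itself computes percentiles.

```python
def percentile_from_histogram(buckets, pct):
    """Return the latency of the bucket where the cumulative count reaches pct percent.

    buckets is a list of (latency_ms, count) pairs sorted by latency.
    """
    total = sum(count for _, count in buckets)
    if total == 0:
        raise ValueError("empty histogram")
    threshold = total * pct / 100.0
    cumulative = 0
    for latency_ms, count in buckets:
        cumulative += count
        if cumulative >= threshold:
            return latency_ms
    return buckets[-1][0]


def share_at_or_below(buckets, limit_ms):
    """Fraction of requests whose bucket latency is at most limit_ms."""
    total = sum(count for _, count in buckets)
    return sum(count for latency_ms, count in buckets if latency_ms <= limit_ms) / total


# Made-up counts: most requests are fast, with a single slow outlier.
EXAMPLE = [(0.05, 400), (0.08, 300), (0.15, 250), (0.2, 40), (1.0, 9), (18.0, 1)]

if __name__ == "__main__":
    print("95th percentile:", percentile_from_histogram(EXAMPLE, 95), "ms")
    print("99th percentile:", percentile_from_histogram(EXAMPLE, 99), "ms")
    print("share at or below 0.2 ms:", share_at_or_below(EXAMPLE, 0.2))
```

With data shaped like this, neither the 95th nor the 99th percentile reveals the single slow request, which is one reason to look at the full histogram as well.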

As you add contention, row-level locking, group commit and other issues, you are likely to see even more complicated diagrams – which can often show you something unexpected:

I hope you give sysbench histograms a try, and see what you can discover!

Comments (6)

  • Mark Callaghan

    That looks like a feature I should be using. Is there an option to set the number of buckets?

    September 20, 2017 at 5:20 pm
    • Peter Zaitsev

      Mark,

      Nope. But I already asked Alexey Kopytov to at least allow specifying the number of buckets to print out.

      A high number of buckets is good for computing an accurate 99th percentile, but we do not need so many for the printout.

      September 20, 2017 at 5:25 pm
      • Mark Callaghan

        Too many lines of output is one concern. Too much CPU spent searching for the right bucket is the other concern. But I can run tests to determine whether my other concern is bogus.

        September 20, 2017 at 5:30 pm
  • Alexey Kopytov

    Mark, I don’t think the number of buckets has any measurable impact on performance (the bucket index is calculated as “i = floor((log(value) - h->range_deduct) * h->range_mult + 0.5)”).

    As to making the output more configurable, there is https://github.com/akopytov/sysbench/issues/170 which I’m going to close together with some other related improvements.

    September 21, 2017 at 12:53 am
  • Rick James

    The buckets are logarithmically spaced, OK. But it closes up the gaps, bad. That distorts the results.

    September 22, 2017 at 12:12 am
  • Peter Zaitsev

    Rick,

    Yep. You can surely imagine some cases where the gaps which are skipped would visually misrepresent the picture.

    September 22, 2017 at 5:12 am