Query Response time histogram – new feature in Percona Server

PREVIOUS POST
NEXT POST

Recently we had couple posts dedicated to performance monitoring, i.e. Color code your performance numbers, Performance Optimization and Six Sigma, so you may understand we consider stability of performance numbers as one of important area for database management.

That’s why we decided to add histogram of queries response times into Percona Server, and our software engineer Oleg Tsarev implemented this feature. The feature adds new INFORMATION_SCHEMA table QUERY_RESPONSE_TIME
( full docs are on page response_time_distribution), which looks like

and which basically means that there were 7040 queries with execution time less than 1 sec and more than 0.1 sec.

The feature will be available in next release of Percona Server 5.1-12.0, but you can try it earlier – we setup our BuildBot farm to generate binary snapshots for each commit into lp:percona-server repository, and now you can take binary version for each particular commit from our TESTING stage

And to demonstrate possible usage of histogram of response time, let me show you
how increasing throughput affects response time. There are histograms and pie charts for sysbench oltp runs
for 1, 32 and 128 concurrent threads

1 thread

so as you see in this case 10% of all queries were executed with execution time in interval (0.1ms, 1 ms] and 86% queries had execution time in interval (0.01ms, 0.1ms]

32 threads

With 32 thread percent of queries in interval (0.1ms, 1ms] increased to 30%, and (0.01ms, 0.1ms] dropped to 68%

128 threads

For 128 threads we see now 12% of queries went into 1 ms < resp time <= 10ms, which we did not see previously,
and (0.1ms, 1ms] increased to 35%, while percentage of queries <=0.1ms is now only 52%.

I expect Baron will add this to Cacti templates, so you can setup monitoring of response times.

PREVIOUS POST
NEXT POST

Comments

  1. Rob Wultsch says

    This will be very useful. I image that it would be useful if there was some way to tag a query with a keyword so that the information is not skewed by slow or fast queries. Without the tagging I imagine that most db’s will have a somewhat broad curve. With tagging I imagine the curve would be much more narrow and it would be easier to discern behavior changes.

  2. says

    Rob,

    I agree, tagging of queries would give even more sense for query monitoring.
    I think we will add it eventually.

  3. says

    Rob,

    This feature acts as a “fingerprint” for a lot of systems – there are going to be some slow queries and some short two, while distribution will stay more or less the same.
    Not all the systems have this property of course some may have cron jobs etc which cause significant behavior variance.

    For systems with stable workload we can monitory such histogram every 5 minutes or so and track for significant variances.

    Periodic spikes in response time is a frequent problem which is often noticed too late – after it significantly impacts production. Though most of such problems start with more subtle changes.

  4. says

    Great stuff!! – specially when monitoring live the difference between periods you may point exactly to the moment when the deviation is happening!..

    Will add it to dim_STAT too :-)

    Rgds,
    -Dimitri

Leave a Reply

Your email address will not be published. Required fields are marked *