Sometimes the question is put as “Are you looking for Performance OR Stability?”, which I believe is a strange way to put it. In real-life systems you care about both Performance AND Stability. I would even say Stability is not the best word here; I would say that in most cases you care about your minimal performance.
If a system can handle 5,000 q/sec for 1 minute and then 20,000 q/sec for the next one, how much can I count on in terms of capacity planning? If this is a typical OLTP system I will have to use the 5,000 q/sec number, as I need my system to always be able to meet performance requirements. If the system is doing batch processing, though, maybe I can count on the average, which is 12.5K q/sec in this case.
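To make the arithmetic concrete, here is a minimal sketch of that choice. The function name `planning_capacity` and the `"oltp"` / `"batch"` labels are made up for illustration; the rule itself (minimum for OLTP, average for batch) is the one described above.

```python
def planning_capacity(per_minute_qps, workload):
    """Return the throughput figure to plan against.

    per_minute_qps: measured throughput (q/sec), one value per minute.
    workload: "oltp" plans on the worst minute, "batch" on the average.
    (Both labels are illustrative assumptions, not a standard API.)
    """
    if workload == "oltp":
        # Interactive load must always be served, so the minimum counts.
        return min(per_minute_qps)
    if workload == "batch":
        # Batch work only needs to finish eventually, so the mean is enough.
        return sum(per_minute_qps) / len(per_minute_qps)
    raise ValueError("unknown workload: %r" % (workload,))


samples = [5000, 20000]  # the two minutes from the example above
print(planning_capacity(samples, "oltp"))   # 5000
print(planning_capacity(samples, "batch"))  # 12500.0
```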
The difference between stability and minimal performance is important, as I can be quite OK with “unstable” performance if it comes as performance bursts rather than stalls. For example, if my system performs 7,000 q/sec and sometimes bursts to 15,000 q/sec, I will prefer it to a system which has a stable 6,000 q/sec performance.
Finally, we have to understand the difference between benchmarks and real life. Most benchmarks perform “stress testing”, throwing load at the system and seeing how much it can handle. In the real world, however, you typically have a given load which falls into a certain range; for example, you may say we have 300-500 queries/sec at our peak time. Because most systems have load based on so-called “random arrivals” rather than a uniform pace of requests, the finer you slice the time, the more variance you’re going to get. For example, the same case could correspond to 20-100 queries in a 100ms period. In real applications you do not drive your system at the complete saturation point, so that it can accommodate such micro spikes, and you care about response time a lot, as it is response time that users will use to judge whether your system is fast or slow.
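The effect of slicing time more finely can be seen in a small simulation. This is only a sketch under stated assumptions: arrivals are modelled as a Poisson process (exponential gaps between queries) at a hypothetical 400 q/sec, and the helper names `window_counts` and `relative_spread` are invented for this example.

```python
import random


def window_counts(rate_qps, duration_s, window_s, seed=42):
    """Count simulated arrivals per time window.

    Assumes Poisson arrivals: gaps between queries are exponentially
    distributed with mean 1/rate_qps. The fixed seed means the same
    arrival stream is produced for every window size.
    """
    rng = random.Random(seed)
    n_windows = int(round(duration_s / window_s))
    counts = [0] * n_windows
    t = rng.expovariate(rate_qps)
    while t < duration_s:
        # Guard against float rounding at the very end of the run.
        idx = min(int(t / window_s), n_windows - 1)
        counts[idx] += 1
        t += rng.expovariate(rate_qps)
    return counts


def relative_spread(counts):
    """(max - min) / mean of the per-window counts."""
    mean = sum(counts) / len(counts)
    return (max(counts) - min(counts)) / mean


per_second = window_counts(400, 60, 1.0)
per_100ms = window_counts(400, 60, 0.1)
print("1s windows:    min %d, max %d" % (min(per_second), max(per_second)))
print("100ms windows: min %d, max %d" % (min(per_100ms), max(per_100ms)))
print("relative spread: %.2f vs %.2f"
      % (relative_spread(per_second), relative_spread(per_100ms)))
```

The same stream of queries looks fairly even at one-second resolution and much more uneven at 100ms resolution, which is why some headroom below saturation is needed.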
You will always see some response time distribution rather than all queries of the same type having the same response time, and this distribution will vary a lot. Typically, the fewer outliers you have at the same average response time, the better.
The relationship between throughput and response time is complicated, and we can’t always say better throughput comes with better response times, but it still serves an important point. If I know my system tops out at 1,000 q/sec for 10 seconds and I have to serve traffic of 2,000 q/sec, I can’t do that: a lot of queries will have to be queued until performance recovers, which pushes their response time into the range of seconds, and that is likely not acceptable to me.
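A rough way to see the queue building up is to track the backlog second by second. This is a simplified sketch: it treats queries as a steady flow, and the recovery capacity of 5,000 q/sec after the slow period is an assumption added for illustration, as is the function name `backlog_over_time`.

```python
def backlog_over_time(inflow_qps, capacity_per_second):
    """Return the number of queued queries at the end of each second.

    inflow_qps: constant arrival rate (q/sec).
    capacity_per_second: how many queries the system can serve in each
    successive second.
    """
    backlog = 0
    history = []
    for capacity in capacity_per_second:
        # Whatever cannot be served this second stays in the queue.
        backlog = max(0, backlog + inflow_qps - capacity)
        history.append(backlog)
    return history


# 10 seconds capped at 1,000 q/sec, then an ASSUMED recovery to 5,000 q/sec.
capacity = [1000] * 10 + [5000] * 5
history = backlog_over_time(2000, capacity)
print(history[9])   # backlog when the slow period ends: 10000
print(history[10:])  # backlog while draining: [7000, 4000, 1000, 0, 0]
```

Even with the system back at full speed, the accumulated backlog takes several more seconds to drain, and every query stuck in it sees its response time inflated accordingly.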
Micro stalls, though, can be acceptable. If my system serves 5,000 q/sec on average but there are some 10ms intervals in which it stalls and processes 0 queries, or slows down to just 1,000 q/sec, while my query inflow rate is 2,000 q/sec and the required response time is within 50ms, it may well be acceptable. Note that if you drill down to small enough time intervals you will find such micro stalls in basically any system.
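The micro-stall case can be checked with a tiny first-in, first-out queue model at 1ms resolution. It is a sketch with simplifying assumptions: arrivals are perfectly even (2 queries per millisecond for 2,000 q/sec), capacity is 5 queries per millisecond outside the stall, and the name `worst_delay_ms` is invented here.

```python
from collections import deque


def worst_delay_ms(inflow_per_ms, capacity_per_ms):
    """Worst queueing delay (ms) seen by any served query.

    inflow_per_ms: queries arriving in every 1ms tick (assumed constant).
    capacity_per_ms: queries the system can serve in each successive tick.
    The trace must be long enough for the queue to drain; queries still
    queued at the end are not counted.
    """
    queue = deque()  # arrival tick of each waiting query, oldest first
    worst = 0
    for tick, capacity in enumerate(capacity_per_ms):
        queue.extend([tick] * inflow_per_ms)
        for _ in range(min(capacity, len(queue))):
            arrived = queue.popleft()
            # A query served in its arrival tick counts as 1ms of delay.
            worst = max(worst, tick - arrived + 1)
    return worst


# 20ms normal, a 10ms full stall, then 70ms normal (5 queries/ms = 5,000 q/sec).
trace = [5] * 20 + [0] * 10 + [5] * 70
delay = worst_delay_ms(2, trace)
print("worst delay: %d ms, within 50 ms: %s" % (delay, delay <= 50))
```

Under these assumptions the worst delay stays close to the length of the stall itself, comfortably inside the 50ms budget, because the spare capacity clears the small backlog within a few milliseconds.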
In summary, most likely you do care about your performance, the minimal performance to be precise. The interval at which you should measure this minimal performance depends on the response time your application can tolerate. In the MySQL benchmarks Vadim posted we see a lot of stalls lasting 1 minute or more, which are not acceptable for any interactive application.