Where the open source community meets: Secure your spot for Percona Live Amsterdam! - Register

Downloads

Blog

What time 18446744073709550.000 means

May 17, 2009

Author

Vadim Tkachenko

Insight for DBAs

Share this Post:

Sometimes when you do profiling you can see number like this in timestamps. Periodically in our patches there was Query_time: 18446744073709550.000 in slow.log file (well, it was fixed recently, but still appears in other places).

I faced this problem several years ago when only 2-core AMD Opteron systems appeared and I noticed sysbench getting crazy showing query execution time 18446744073709550.000 or like this.

Obviously this is unsigned integer which was received by subtraction bigger number from smaller.
But how it can be if we use:

start_time = clock_gettime(CLOCK_THREAD_CPUTIME_ID, &tp);
…
query_execution
…
end_time = clock_gettime(CLOCK_THREAD_CPUTIME_ID, &tp);

total_time = end_time – start_time;

How we can get query executed in negative time ? It would be too good to be real :).

After some research I found the reason, and it is interesting that I want to share it even few years later. The reason is that different CPU on the same motherboard uses different frequency generators (each CPU has its own frequency generator, though exact implementation depends on CPU vendor), and there is no way to get these generators synchronized. That is if in your server you have 4 CPU, each of them lives on its own time. Usually difference between generators is very small (microseconds matter), but it appears this difference is enough to get negative time.
It happens when your query started on one CPU and during execution it was switched to another CPU (where generator delayed to first CPU) and time between two measurements were small enough to catch difference in generators. In result you have negative time, but it actually, it is more serious, it mean – we can’t get exact time between two points in application.
So designing profiling in your application be aware of. I wonder how realtime sensor systems, where microseconds are important, deal with this.

0 0 votes

Article Rating

26 Comments

Oldest

Newest Most Voted

17 years ago

Easy, just make sure the execution between the checkpoints is on the same cpu (using affinity) or use hardware that has synchronized TSC’s or use counters that are not CPU-specific (slower).

Author

Vadim Tkachenko

17 years ago

PH,

Yeah, affinity makes sense. Thanks!

Seth

17 years ago

Good post. Does this also apply to multi-core machines with just one physical CPU?

Author

Vadim Tkachenko

17 years ago

Seth,

In that time it was dual-core single CPU AMD. I am not fully sure how it is currently, I have heard Intel uses single generator per physical CPU, but I may be wrong in this case.

Shlomi Noach

17 years ago

I have encountered the same phenomenon – but under different reasons:
A machine whose clock was really bad was constantly synchronized (in crond) using ntpdate.
Constantly means about every 30 minutes.
So, once every 30 minutes, there was a time “glitch”. Once every 30 minutes, we would see unreasonably long queries in the slow query log – that’s because the clock was set back during the execution of some queries.
I just built a script which ignores such long queries altogether…

Marki

17 years ago

Using ntpdate on production server is really bad idea. Or precisely changing time is a bad idea. Use only “time shifting”, so that time is not changed in step (possible negative), but kernel makes the time go slower/faster which will adjust the time without making any gap. Or the best solution – configure NTP daemon to have constantly synchronized time.

Shlomi Noach

17 years ago

‘@Marki,
Thanks – will look into it.

Joey Parsons

17 years ago

I think I’ve seen this in other places, like Seconds_Behind_Master in “show slave status” where it’ll consistently be at 0 or 1 and then suddenly jump to something ungodly numbers above ten million and then jump right back to 0 or 1 a tick later.

Anyone else seen that?

Tim Bunce

17 years ago

I’ve noted similar issues in the Devel::NYTProf docs at http://search.cpan.org/~timb/Devel-NYTProf/lib/Devel/NYTProf.pm#CAVEATS

Admin

Peter Zaitsev

17 years ago

Vadim,

I think there is a need for TIME_DIFFERENCE macro consistently used in MySQL or any other code which will handle the case of negative times and will replace them with zeroes.

foo

17 years ago

or use a high resolution monotonic clock…

Admin

Peter Zaitsev

17 years ago

foo,

What exactly do you suggest ? Note we use this for profiling which means potentially hundreds of thousands time gets per second so it must be as fast as possible. Because it is profiling we also do not care if it numbers are a bit inaccurate.

http://blog.timbunce.org/

17 years ago

In general you should allow negative times to be accumulated naturally as they’ll tend to balance out over the longer term. Treating them as zero at the time they’re measured would be wrong. Treating them as zero in reports, after accumulation, is less wrong but it’s still better to educate the users to the harsh realities of clocks on SMP/virtual systems.

I’d strongly suggest you allow the clock id passed to clock_gettime() to be configurable. Then the user can make their own choice about the accuracy vs performance trade-off that suits their needs best.

It seems to be difficult to find good detailed information about the semantics of the various CLOCK_*’s that clock_gettime() supports on various systems. If anyone has better links I’d be grateful for them.

Baron Schwartz

17 years ago

If we allow negative times to accumulate naturally, it will skew things horribly. These times go negative because they’re very small to begin with; it would take many of them to add up enough to cancel out a single negative-gone-huge-positive.

Maatkit has handled these for a long time in mk-query-digest by setting them to zero during log aggregation. I think this is reasonable because if it goes negative, it means it wasn’t measurable anyway.

Guillaume Lefranc

17 years ago

‘@Shlomi, regarding negative times in slow query logs I filed this bug http://bugs.mysql.com/bug.php?id=35396, a patch was committed and should solve the problem. Nevertheless the problem was more about the NTP source that was not synchronizing correctly (ask your sysadmin not to run NTP in a virtual machine, quite a bad idea) and my servers having from -10/+10 sec. time readjust.

http://blog.timbunce.org/

17 years ago

Barron, I was referring to the underlying negative values that could arise from the clock_gettime() calls in the original article. The fact they’re presented as a huge positive my mysqld is obviously a bug.

Shlomi Noach

17 years ago

‘@Guillaume – thanks!

Author

Vadim

17 years ago

TimBunce,

It is two different timers. They are not synchronized. If we get negative time from two different clocks I do not see why we should add this time to total time, it has no good information.
It’s like you measure time for runner and measure start time by your watches and measure finish time by your friend’s watches, and difference between both watches is unknown.
It may happen your friend’s watches is slower by XYZ sec. And by your watches runner started at 10:54 and by your friend’s watches he finished at 10:37.
By subtracting you get -0:17, but does it really have any sense ?

http://blog.timbunce.org/

17 years ago

Vadim. I believe the clocks are synchronized, just not as frequently as we’d like so they drift slightly. I had a link that, as I recall, explained that, but it doesn’t work now (http://webnews.giga.net.tw/article/mailing.freebsd.performance/710).

Re your analogy, imagine you have two watches (2 cpus) and each time you measure an interval you pick a watch at random (thread switched to different cpu). Over thousands of measurements you’ll get a more accurate total if you don’t discard negative durations.

foo

17 years ago

peter,

a monotonic clock won’t go backwards and it probably provides equal or better performance when compared to the other clocks supported by clock_gettime.

Admin

Peter Zaitsev

17 years ago

Tim,

I think math wise you’re right. However on the practical measures dealing with the huge values (not small negative values) is quite inconvenient. We also mainly use this for things like query execution time etc where the gain of accuracy by using negative values is not worth loss of convenience.

Admin

Peter Zaitsev

17 years ago

Vadmin – do you know why Google patches do not use monotonic clock in their patches for wall clock timing ?

Mark Callaghan

16 years ago

Why would CLOCK_THREAD_CPUTIME_ID go backwards? It isn’t a wall clock timer, CLOCK_REALTIME and CLOCK_MONOTONIC do that. Ticks/nanoseconds/useconds of CPU time consumed by a process or thread should never go backwards. I think this is a bug in glibc or the Linux kernel.

Mark Callaghan

16 years ago

And this describes the broken behavior in CLOCK_THREAD_CPUTIME_ID and CLOCK_PROCESS_CPUTIME_ID — http://linux.die.net/man/3/clock_gettime. I have no idea what values are returned by them, but CPU time can’t go backwards.

Tim Bunce

16 years ago

That clock_gettime man page says “The CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID clocks are realized on many platforms using timers from the CPUs (TSC on i386, AR.ITC on Itanium). These registers may differ between CPUs and as a consequence these clocks may return bogus results if a process is migrated to another CPU”

I think “may return bogus results” means that CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID can appear to ‘go backwards’ if a process is migrated to a different cpu.

Tim Vaillancourt

11 years ago

Thanks for this helpful post. 5 years later, it is still useful.

I am encountering this number as “min” latency under sysbench 0.5 when using prepare_parallel.lua and oltp.lua, both of which use more than one CPU cores.

Does anyone have suggestions on how to “detect” this problem from a coding perspective? I use a custom perl wrapper to sysbench 0.5 and so far I’ve added hacky things like a match on this exact number, and generally skipping numbers that are “too long”, but this feels less than ideal :). Any suggestions appreciated!

Thanks!

Tim