
What time 18446744073709550.000 means

 | May 17, 2009 |  Posted In: Insight for DBAs


Sometimes when you do profiling you can see a number like this in timestamps. Periodically our patches produced Query_time: 18446744073709550.000 in the slow.log file (well, it was fixed recently, but the same value still appears in other places).

I first faced this problem several years ago, when 2-core AMD Opteron systems had just appeared, and I noticed sysbench going crazy, reporting query execution times like 18446744073709550.000.

Obviously this is an unsigned integer that results from subtracting a bigger number from a smaller one.
But how can that be, given that we use:

struct timespec tp;

clock_gettime(CLOCK_THREAD_CPUTIME_ID, &tp);
start_time = (unsigned long long)tp.tv_sec * 1000000000ULL + tp.tv_nsec;

/* ... execute the query ... */

clock_gettime(CLOCK_THREAD_CPUTIME_ID, &tp);
end_time = (unsigned long long)tp.tv_sec * 1000000000ULL + tp.tv_nsec;

total_time = end_time - start_time;

How can a query execute in negative time? That would be too good to be true :).

After some research I found the reason, and it is interesting enough that I want to share it even a few years later. Different CPUs on the same motherboard use different frequency generators (each CPU has its own, though the exact implementation depends on the CPU vendor), and there is no way to keep these generators synchronized. So if your server has 4 CPUs, each of them lives in its own time. Usually the difference between generators is very small, but when microseconds matter, it is enough to produce negative time.
It happens when your query starts on one CPU and is switched to another CPU during execution (one whose generator lags the first), and the interval between the two measurements is small enough to expose the difference between generators. The result is a negative time, but the implication is more serious: we cannot get the exact time between two points in an application.
So be aware of this when designing profiling in your application. I wonder how real-time sensor systems, where microseconds are important, deal with this.

Vadim Tkachenko

Vadim Tkachenko co-founded Percona in 2006 and serves as its Chief Technology Officer. Vadim leads Percona Labs, which focuses on technology research and performance evaluations of Percona’s and third-party products. Percona Labs designs no-gimmick tests of hardware, filesystems, storage engines, and databases that surpass the standard performance and functionality scenario benchmarks. Vadim’s expertise in LAMP performance and multi-threaded programming helps optimize MySQL and InnoDB internals to take full advantage of modern hardware. Oracle Corporation and its predecessors have incorporated Vadim’s source code patches into the mainstream MySQL and InnoDB products. He also co-authored the book High Performance MySQL: Optimization, Backups, and Replication, 3rd Edition.


  • Easy, just make sure the execution between the checkpoints is on the same CPU (using affinity), or use hardware that has synchronized TSCs, or use counters that are not CPU-specific (slower).

  • Seth,

    At that time it was a dual-core, single-socket AMD CPU. I am not entirely sure how it works currently; I have heard Intel uses a single generator per physical CPU, but I may be wrong about that.

  • I have encountered the same phenomenon – but under different reasons:
    A machine whose clock was really bad was constantly synchronized (in crond) using ntpdate.
    Constantly means about every 30 minutes.
    So, once every 30 minutes, there was a time “glitch”. Once every 30 minutes, we would see unreasonably long queries in the slow query log – that’s because the clock was set back during the execution of some queries.
    I just built a script which ignores such long queries altogether…

  • Using ntpdate on a production server is a really bad idea. Or more precisely, stepping the time is a bad idea. Use only time slewing, so the time is not changed in a step (possibly a negative one); instead the kernel makes the clock run slower or faster, which adjusts the time without creating any gap. Or the best solution: configure an NTP daemon to keep the time constantly synchronized.
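One way to get slewing instead of stepping, assuming a stock ntpd, is its -x option, which forces gradual rate adjustment rather than jumping the clock (a sketch, not a complete production setup):

```shell
# Bad: periodic stepping from cron creates time jumps
# */30 * * * * /usr/sbin/ntpdate pool.ntp.org

# Better: run ntpd continuously; -x forces slewing
# (gradual rate adjustment) instead of stepping the clock
ntpd -x
```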

  • I think I’ve seen this in other places, like Seconds_Behind_Master in “show slave status” where it’ll consistently be at 0 or 1 and then suddenly jump to something ungodly numbers above ten million and then jump right back to 0 or 1 a tick later.

    Anyone else seen that?

  • Vadim,

    I think there is a need for a TIME_DIFFERENCE macro, used consistently in MySQL or any other code, which will handle the case of negative times and replace them with zeroes.

  • foo,

    What exactly do you suggest? Note we use this for profiling, which means potentially hundreds of thousands of time reads per second, so it must be as fast as possible. Because it is profiling, we also do not care if the numbers are a bit inaccurate.

  • In general you should allow negative times to be accumulated naturally as they’ll tend to balance out over the longer term. Treating them as zero at the time they’re measured would be wrong. Treating them as zero in reports, after accumulation, is less wrong but it’s still better to educate the users to the harsh realities of clocks on SMP/virtual systems.

    I’d strongly suggest you allow the clock id passed to clock_gettime() to be configurable. Then the user can make their own choice about the accuracy vs performance trade-off that suits their needs best.

    It seems to be difficult to find good detailed information about the semantics of the various CLOCK_*’s that clock_gettime() supports on various systems. If anyone has better links I’d be grateful for them.

  • If we allow negative times to accumulate naturally, it will skew things horribly. These times go negative because they’re very small to begin with; it would take many of them to add up enough to cancel out a single negative-gone-huge-positive.

    Maatkit has handled these for a long time in mk-query-digest by setting them to zero during log aggregation. I think this is reasonable because if it goes negative, it means it wasn’t measurable anyway.

  • @Shlomi, regarding negative times in slow query logs I filed this bug, a patch was committed and should solve the problem. Nevertheless the problem was more about the NTP source that was not synchronizing correctly (ask your sysadmin not to run NTP in a virtual machine, quite a bad idea) and my servers having from -10/+10 sec. time readjust.

  • TimBunce,

    These are two different timers, and they are not synchronized. If we get a negative time from two different clocks, I do not see why we should add it to the total time; it carries no useful information.
    It is like timing a runner where you record the start time on your watch and your friend records the finish time on his watch, and the difference between the two watches is unknown.
    It may happen that your friend’s watch is slow by some number of seconds: by your watch the runner started at 10:54, and by your friend’s watch he finished at 10:37.
    By subtracting you get -0:17, but does that really make any sense?

  • Vadim. I believe the clocks are synchronized, just not as frequently as we’d like, so they drift slightly. I had a link that, as I recall, explained that, but it doesn’t work now.

    Re your analogy, imagine you have two watches (2 cpus) and each time you measure an interval you pick a watch at random (thread switched to different cpu). Over thousands of measurements you’ll get a more accurate total if you don’t discard negative durations.

  • peter,

    a monotonic clock won’t go backwards and it probably provides equal or better performance when compared to the other clocks supported by clock_gettime.

  • Tim,

    I think math-wise you’re right. However, on the practical side, dealing with the huge values (not small negative values) is quite inconvenient. We also mainly use this for things like query execution time, where the gain in accuracy from keeping negative values is not worth the loss of convenience.

  • Why would CLOCK_THREAD_CPUTIME_ID go backwards? It isn’t a wall clock timer, CLOCK_REALTIME and CLOCK_MONOTONIC do that. Ticks/nanoseconds/useconds of CPU time consumed by a process or thread should never go backwards. I think this is a bug in glibc or the Linux kernel.

  • And this describes the broken behavior in CLOCK_THREAD_CPUTIME_ID and CLOCK_PROCESS_CPUTIME_ID — I have no idea what values are returned by them, but CPU time can’t go backwards.

  • That clock_gettime man page says “The CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID clocks are realized on many platforms using timers from the CPUs (TSC on i386, AR.ITC on Itanium). These registers may differ between CPUs and as a consequence these clocks may return bogus results if a process is migrated to another CPU”

    I think “may return bogus results” means that CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID can appear to ‘go backwards’ if a process is migrated to a different cpu.

  • Thanks for this helpful post. 5 years later, it is still useful.

    I am encountering this number as the “min” latency under sysbench 0.5 when using prepare_parallel.lua and oltp.lua, both of which use more than one CPU core.

    Does anyone have suggestions on how to “detect” this problem from a coding perspective? I use a custom perl wrapper to sysbench 0.5 and so far I’ve added hacky things like a match on this exact number, and generally skipping numbers that are “too long”, but this feels less than ideal :). Any suggestions appreciated!


