What time 18446744073709550.000 means

Sometimes when you do profiling you can see a number like this in timestamps. Periodically our patches produced Query_time: 18446744073709550.000 in the slow.log file (it was fixed there recently, but the number still appears in other places).

I first hit this problem several years ago, when the first dual-core AMD Opteron systems appeared and I noticed sysbench going crazy, reporting query execution times like 18446744073709550.000.

Obviously this is an unsigned integer produced by subtracting a bigger number from a smaller one.
But how can that happen if we use:

struct timespec start_tp, end_tp;

clock_gettime(CLOCK_THREAD_CPUTIME_ID, &start_tp);
/* ... execute the query ... */
clock_gettime(CLOCK_THREAD_CPUTIME_ID, &end_tp);

total_time = (end_tp.tv_sec - start_tp.tv_sec) * 1000000000LL
             + (end_tp.tv_nsec - start_tp.tv_nsec);

How can a query execute in negative time? That would be too good to be true :).

After some research I found the reason, and it is interesting enough that I want to share it even a few years later. The reason is that different CPUs on the same motherboard use different frequency generators (each CPU has its own, though the exact implementation depends on the CPU vendor), and there is no way to keep these generators synchronized. That is, if your server has 4 CPUs, each of them lives in its own time. Usually the difference between generators is very small (we are talking microseconds), but it turns out even that is enough to produce negative time.
It happens when your query starts on one CPU and is switched to another CPU during execution (one whose generator lags behind the first), and the interval between the two measurements is small enough to expose the difference between the generators. The result is negative time. But it is actually more serious than that: it means we cannot get the exact time between two points in an application.
So be aware of this when designing profiling into your application. I wonder how real-time sensor systems, where microseconds matter, deal with it.


Comments (26)

  • PH

    Easy: just make sure the execution between the checkpoints stays on the same CPU (using affinity), or use hardware that has synchronized TSCs, or use counters that are not CPU-specific (slower).

    May 17, 2009 at 1:38 pm
  • Vadim


    Yeah, affinity makes sense. Thanks!

    May 17, 2009 at 3:21 pm
  • Seth

    Good post. Does this also apply to multi-core machines with just one physical CPU?

    May 17, 2009 at 4:27 pm
  • Vadim


    At that time it was a dual-core, single-CPU AMD system. I am not fully sure how it is currently; I have heard Intel uses a single generator per physical CPU, but I may be wrong about that.

    May 17, 2009 at 8:50 pm
  • Shlomi Noach

    I have encountered the same phenomenon, but for a different reason:
    A machine whose clock was really bad was constantly synchronized (in crond) using ntpdate.
    Constantly means about every 30 minutes.
    So, once every 30 minutes, there was a time “glitch”. Once every 30 minutes, we would see unreasonably long queries in the slow query log – that’s because the clock was set back during the execution of some queries.
    I just built a script which ignores such long queries altogether…

    May 17, 2009 at 11:07 pm
  • Marki

    Using ntpdate on a production server is a really bad idea. More precisely, stepping the time is a bad idea. Use only gradual “time shifting” (slewing), so that the time is not changed in a step (possibly a negative one); instead the kernel makes the clock run slower or faster, which adjusts the time without creating any gap. Or, the best solution: run an NTP daemon so the time is constantly synchronized.

    May 18, 2009 at 2:19 am
  • Shlomi Noach

    Thanks – will look into it.

    May 18, 2009 at 2:46 am
  • Joey Parsons

    I think I’ve seen this in other places, like Seconds_Behind_Master in “show slave status”, where it will consistently be at 0 or 1, then suddenly jump to some ungodly number above ten million, and then jump right back to 0 or 1 a tick later.

    Anyone else seen that?

    May 18, 2009 at 8:41 am
  • Tim Bunce

    I’ve noted similar issues in the Devel::NYTProf docs at http://search.cpan.org/~timb/Devel-NYTProf/lib/Devel/NYTProf.pm#CAVEATS

    May 18, 2009 at 12:19 pm
  • peter


    I think there is a need for a TIME_DIFFERENCE macro, used consistently in MySQL or any other code, which would handle the case of negative times and replace them with zeroes.

    May 18, 2009 at 4:11 pm
  • foo

    or use a high resolution monotonic clock…

    May 18, 2009 at 6:54 pm
  • peter


    What exactly do you suggest? Note we use this for profiling, which means potentially hundreds of thousands of time reads per second, so it must be as fast as possible. Because it is profiling, we also do not care if the numbers are a bit inaccurate.

    May 18, 2009 at 9:05 pm
  • http://blog.timbunce.org/

    In general you should allow negative times to be accumulated naturally as they’ll tend to balance out over the longer term. Treating them as zero at the time they’re measured would be wrong. Treating them as zero in reports, after accumulation, is less wrong but it’s still better to educate the users to the harsh realities of clocks on SMP/virtual systems.

    I’d strongly suggest you allow the clock id passed to clock_gettime() to be configurable. Then the user can make their own choice about the accuracy vs performance trade-off that suits their needs best.

    See also:

    It seems to be difficult to find good detailed information about the semantics of the various CLOCK_*’s that clock_gettime() supports on various systems. If anyone has better links I’d be grateful for them.

    May 19, 2009 at 2:09 am
  • Baron Schwartz

    If we allow negative times to accumulate naturally, it will skew things horribly. These times go negative because they’re very small to begin with; it would take many of them to add up enough to cancel out a single negative-gone-huge-positive.

    Maatkit has handled these for a long time in mk-query-digest by setting them to zero during log aggregation. I think this is reasonable because if it goes negative, it means it wasn’t measurable anyway.

    May 19, 2009 at 5:55 am
  • Guillaume Lefranc

    @Shlomi, regarding negative times in slow query logs, I filed this bug http://bugs.mysql.com/bug.php?id=35396; a patch was committed and should solve the problem. Nevertheless, the problem was more about the NTP source not synchronizing correctly (ask your sysadmin not to run NTP in a virtual machine, quite a bad idea) and my servers getting time readjustments of -10/+10 seconds.

    May 19, 2009 at 5:58 am
  • http://blog.timbunce.org/

    Baron, I was referring to the underlying negative values that could arise from the clock_gettime() calls in the original article. The fact that they’re presented as a huge positive by mysqld is obviously a bug.

    May 19, 2009 at 9:51 am
  • Shlomi Noach

    @Guillaume – thanks!

    May 19, 2009 at 10:24 am
  • Vadim


    They are two different timers, and they are not synchronized. If we get a negative time from two different clocks, I do not see why we should add it to the total time; it carries no useful information.
    It’s like timing a runner where you record the start time on your watch and your friend records the finish time on his watch, and the difference between the two watches is unknown.
    It may happen that your friend’s watch is slow by XYZ sec., and by your watch the runner started at 10:54 while by your friend’s watch he finished at 10:37.
    Subtracting gives -0:17, but does that really make any sense?

    May 19, 2009 at 10:28 am
  • http://blog.timbunce.org/

    Vadim. I believe the clocks are synchronized, just not as frequently as we’d like so they drift slightly. I had a link that, as I recall, explained that, but it doesn’t work now (http://webnews.giga.net.tw/article/mailing.freebsd.performance/710).

    Re your analogy, imagine you have two watches (2 cpus) and each time you measure an interval you pick a watch at random (thread switched to different cpu). Over thousands of measurements you’ll get a more accurate total if you don’t discard negative durations.

    May 19, 2009 at 2:23 pm
  • foo


    A monotonic clock won’t go backwards, and it probably provides equal or better performance compared to the other clocks supported by clock_gettime.

    May 19, 2009 at 3:02 pm
  • peter


    I think math-wise you’re right. However, in practical terms, dealing with the huge values (rather than small negative values) is quite inconvenient. We also mainly use this for things like query execution time, where the gain in accuracy from using negative values is not worth the loss of convenience.

    May 20, 2009 at 9:45 am
  • peter

    Vadim, do you know why the Google patches do not use a monotonic clock for wall-clock timing?

    May 20, 2009 at 9:45 am
  • Mark Callaghan

    Why would CLOCK_THREAD_CPUTIME_ID go backwards? It isn’t a wall-clock timer; CLOCK_REALTIME and CLOCK_MONOTONIC are. Ticks/nanoseconds/microseconds of CPU time consumed by a process or thread should never go backwards. I think this is a bug in glibc or the Linux kernel.

    January 8, 2010 at 5:25 pm
  • Mark Callaghan

    And this describes the broken behavior in CLOCK_THREAD_CPUTIME_ID and CLOCK_PROCESS_CPUTIME_ID — http://linux.die.net/man/3/clock_gettime. I have no idea what values are returned by them, but CPU time can’t go backwards.

    January 8, 2010 at 5:29 pm
  • Tim Bunce

    That clock_gettime man page says “The CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID clocks are realized on many platforms using timers from the CPUs (TSC on i386, AR.ITC on Itanium). These registers may differ between CPUs and as a consequence these clocks may return bogus results if a process is migrated to another CPU”

    I think “may return bogus results” means that CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID can appear to ‘go backwards’ if a process is migrated to a different cpu.

    January 9, 2010 at 3:56 pm
  • Tim Vaillancourt

    Thanks for this helpful post. 5 years later, it is still useful.

    I am encountering this number as the “min” latency under sysbench 0.5 when using prepare_parallel.lua and oltp.lua, both of which use more than one CPU core.

    Does anyone have suggestions on how to “detect” this problem from a coding perspective? I use a custom Perl wrapper around sysbench 0.5, and so far I’ve added hacky things like matching this exact number and generally skipping numbers that are “too long”, but this feels less than ideal :). Any suggestions appreciated!



    June 24, 2015 at 8:36 am

Comments are closed.
