Replication Triggers a Performance Schema Issue on Percona XtraDB Cluster

In this blog post, we’ll look at how replication triggers a Performance Schema issue on Percona XtraDB Cluster.

During an upgrade to Percona XtraDB Cluster 5.6, I faced an issue that I wanted to share. In this environment, we set up three Percona XtraDB Cluster nodes (mostly left at their default configuration), with data copied from a production server. We configured one member of the cluster as a slave of the production server.

During the testing process, we found that a full table scan query ran four times faster on the nodes where replication was not configured. After reviewing almost everything related to the query, we decided to use perf.

We executed:
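A typical way to attach perf to a running mysqld is shown below (a sketch: the `pidof` lookup and the 30-second sampling window are assumptions, not the original invocation):

```shell
# Sample on-CPU call stacks of the running mysqld for 30 seconds
perf record -g -p "$(pidof mysqld)" -- sleep 30
```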

And ran the query in another terminal a couple of times. Then we executed:
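The reporting step would be something like this (again a sketch; exact flags may have differed):

```shell
# Fold the recorded samples into a readable text report
perf report --stdio > perf.out
# Then inspect perf.out for symbols with unusually high overhead
```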

Looking at perf.out, we found some useful information.

The my_timer_cycles function alone took 62.03% of the time. Related to this, we found a blog post that explained how performance dropped 10% after enabling the Performance Schema. So, we decided to disable the Performance Schema to see if our issue was related to the one described in that post. We found that after the restart required to disable the Performance Schema, the query took the expected amount of time.
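For reference, disabling the Performance Schema is a single setting, but it is read-only at runtime, so a restart is required (a sketch, assuming a systemd-managed server):

```shell
# In the [mysqld] section of my.cnf, add:
#   performance_schema = OFF
# then restart the server:
sudo systemctl restart mysql
# Verify it is off
mysql -e "SHOW GLOBAL VARIABLES LIKE 'performance_schema'"
```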

We also found out that this was triggered by replication, and that nodes rebuilt from this member might inherit the issue. The reverse also held: even if you rebuilt from a member that was OK, the new member might execute the query more slowly.

Finally, you should take into account that my_timer_cycles seems to be called on a per-row basis, so if your dataset is small you will never notice this issue. However, if you are doing a full table scan of a million-row table, you could face it.
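One way to see the per-row effect is to time the same full scan with the table I/O instrument timed and then untimed (a sketch; `test.big` is a hypothetical million-row table, not from the original post):

```shell
# Time a full scan with the table I/O instrument timed
mysql -e "UPDATE performance_schema.setup_instruments
          SET enabled='YES', timed='YES'
          WHERE name='wait/io/table/sql/handler'"
time mysql -e "SELECT COUNT(*) FROM test.big" > /dev/null

# Repeat with timing off and compare the wall-clock times
mysql -e "UPDATE performance_schema.setup_instruments
          SET timed='NO'
          WHERE name='wait/io/table/sql/handler'"
time mysql -e "SELECT COUNT(*) FROM test.big" > /dev/null
```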


If you are having query performance issues and you can’t find the root cause, try disabling or debugging Performance Schema instruments to see if one of them is causing the issue.
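A quick way to see which instruments are currently enabled and timed, so you can disable them selectively rather than turning the whole Performance Schema off:

```shell
mysql -e "SELECT name, enabled, timed
          FROM performance_schema.setup_instruments
          WHERE enabled = 'YES'"
```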


Comments (12)

  • Peter Zaitsev


    I wonder if you have more details here. As Brendan Gregg explains, this function should be very fast unless something is stalling it. I wonder what conditions caused it in your case.

    Also, did you use Performance Schema with default settings or some more verbose instrumentation?

    October 21, 2016 at 6:12 pm
    • David Ducos

      Hi Peter,

      It was a fresh install, everything set at default.

      There was nothing particularly stalling it, as the only traffic it received was from the replication channel.

      October 28, 2016 at 1:50 pm
  • lefred

    Hi David,

    I have some questions here.

    My first one is related to something I don’t really understand; maybe you could confirm whether what I understood is what you meant, or if I got it completely wrong.
    So, let’s call the servers M (the production master), P1 (the PXC node that will act as an asynchronous slave of M), and P2 and P3 (both PXC nodes), OK?

    P1, P2 and P3 are 3 new nodes with data copied from M (or an async slave of M), the same data on all 3 nodes.
    P1, P2 and P3 are in the same PXC cluster, and finally P1 is configured as a slave of M.

    If you run your specific query on P1 it’s slow, but not on P2 and P3. Just because P1 replicated from M.
    If you add a new node to the cluster (P4) and this node performs SST with P1 as donor, then the query is also slow on P4, but still ok on P2 and P3, right ?!!?

    This is strange, if what I understood is indeed what’s happening…

    My second question is related to the instrumentation that is specific to PXC. I saw in PLAM that PXC now has performance_schema instruments that are not in MySQL Community Edition, nor in Galera… did you try to disable only these (if they are enabled by default)?

    Thank you.

    October 21, 2016 at 6:36 pm
    • David Ducos

      Hi Lefred,

      About “If you add a new node to the cluster (P4) and this node performs SST with P1 as donor, then the query is also slow on P4, but still ok on P2 and P3, right ?!!?”: in my tests, there were times when, after an SST, P4 was not slow, and times when it was. P2 and P3 continued to be OK.
      I agree with you, it was strange.

      I didn’t try to disable individual instruments, as this platform was well tested with the customer workload, and replication was just a step on the migration path.

      October 28, 2016 at 2:00 pm
  • Peter Zaitsev


    Performance Schema instrumentation specific to PXC was only added in PXC 5.7. From what I understand, we’re speaking about full table scan queries, which seems to point to the timed table access instrumentation in Performance Schema, which is disabled by default exactly because its cost is high… It still looks way too high 🙂 This is why I’m very curious about the other details of the configuration – OS, hardware, etc.

    October 21, 2016 at 6:50 pm
  • Mark Leith

    Issues with table IO instrumentation on large scans are what (“PERFORMANCE SCHEMA, BATCH TABLE IO”) was implemented for, within MySQL 5.7.

    Do you see these issues with PXC 5.7?

    October 22, 2016 at 2:14 am
    • David Ducos

      Hi Mark,
      Sorry, I didn’t test it on PXC 5.7

      October 28, 2016 at 2:02 pm
  • Mark Leith

    And to verify if that is the issue, rather than disabling all of performance schema, you could just try disabling the table IO instrument:

    UPDATE performance_schema.setup_instruments SET enabled = 'NO', timed = 'NO' WHERE name = 'wait/io/table/sql/handler';

    October 22, 2016 at 2:17 am
  • tecogyan


    October 22, 2016 at 7:06 am
  • Daniël van Eeden

    This reminds me of a bug: I once noticed the cycle timer taking way too much time. With ‘perf top’ I traced that back to a slow rdtsc instruction. Rebooting the VM fixed it.

    October 26, 2016 at 6:54 am
  • Peter Zaitsev

    Daniel, this is very interesting. Are there some known cases where rdtsc would be very slow?

    October 26, 2016 at 8:30 am
    • Daniël van Eeden

      I only know about the case I encountered, which was probably a VMware or firmware bug. However, more people might have had this issue without finding the real root cause. That’s also why I filed the bug; that should make it easier to discover.

      With different kernel versions, types of hardware and virtualization there are many things that can go wrong.
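      On Linux you can check which clocksource the kernel settled on; if it fell back from tsc to a slower source such as hpet or acpi_pm, every timer read becomes much more expensive:

```shell
# Current clocksource the kernel is using (ideally "tsc")
cat /sys/devices/system/clocksource/clocksource0/current_clocksource
# Sources the kernel considers usable on this machine
cat /sys/devices/system/clocksource/clocksource0/available_clocksource
```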

      I also found a page with lots of info about rdtsc.

      October 26, 2016 at 2:46 pm
