
Replication Triggers a Performance Schema Issue on Percona XtraDB Cluster

October 21, 2016 | Posted In: MySQL, Percona XtraDB Cluster


In this blog post, we’ll look at how replication triggers a Performance Schema issue on Percona XtraDB Cluster.

During an upgrade to Percona XtraDB Cluster 5.6, I faced an issue that I wanted to share. In this environment, we set up three Percona XtraDB Cluster nodes (mostly left at default settings), with data copied from a production server. We configured one of the members of the cluster as a slave of the production server.

During the testing process, we found that a full table scan query took four times less time on the nodes where replication was not configured. After reviewing almost everything related to the query, we decided to use perf.

We executed:
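The exact command has not been preserved in this copy of the post, but a typical way to sample the running mysqld for about a minute looks something like this (the PID lookup and the flags are assumptions, not the original invocation):

# sample mysqld with call graphs for ~60 seconds while the query runs elsewhere
perf record -g -p $(pidof mysqld) -- sleep 60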

And ran the query a couple of times in another terminal. Then we executed:
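The report command is likewise an assumption; writing it to perf.out matches the file name referenced below:

# turn the recorded samples into a plain-text report
perf report --stdio > perf.out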

And we found this useful information in perf.out:
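The full report did not survive here; the only figure quoted in the text is the one for my_timer_cycles, so the relevant line of perf.out looked roughly like this (the command and shared-object columns are illustrative):

62.03%  mysqld  mysqld  [.] my_timer_cycles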

As you can see, the my_timer_cycles function took 62.03% of the time. Related to this, we found a blog post (http://dtrace.org/blogs/brendan/2011/06/27/viewing-the-invisible/) that explained how, after enabling the Performance Schema, performance dropped 10%. So we decided to disable the Performance Schema to see if our issue was related to the one described there. After the restart required to disable the Performance Schema, the query took the expected amount of time.
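For reference, disabling it amounts to a change like the following in my.cnf — a minimal sketch, assuming the option lives in the [mysqld] section (performance_schema is read-only at runtime, which is why the restart is needed):

[mysqld]
performance_schema = OFF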

We also found that this was triggered by replication, and that nodes rebuilt from this member might inherit the issue. The same applied when rebuilding from a member that was fine: the new member might still execute the query more slowly.

Finally, take into account that my_timer_cycles seems to be called on a per-row basis, so if your dataset is small you will never notice this issue. However, if you are doing a full table scan of a million-row table, you could face it.
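As a hypothetical illustration (the table and column names are made up), a query of this shape is the kind that surfaces the overhead, because every examined row goes through the timed handler calls:

-- full scan of a hypothetical million-row table; the per-row timer calls add up here
SELECT SQL_NO_CACHE COUNT(*) FROM big_table WHERE notes LIKE '%something%';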

Conclusion

If you are having query performance issues, and you can’t find the root cause, try disabling or debugging instruments from the Performance Schema to see if that is causing the issue.
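A minimal sketch of that approach, assuming a user with UPDATE privileges on the performance_schema database (this mirrors the suggestion in the comments below):

-- list the instruments that are currently enabled and timed
SELECT NAME, ENABLED, TIMED FROM performance_schema.setup_instruments WHERE ENABLED = 'YES';

-- switch off the table I/O instrument at runtime, no restart needed
UPDATE performance_schema.setup_instruments SET ENABLED = 'NO', TIMED = 'NO' WHERE NAME = 'wait/io/table/sql/handler';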

David Ducos

David studied Computer Science at the National University of La Plata and has worked as a DBA consultant since 2008. He worked with a worldwide free-classifieds platform for three years before joining Percona’s consulting team in November 2014. David lives near Buenos Aires, Argentina, and in his free time loves spending time with his family.

12 Comments

  • David,

    I wonder if you have more details here. As Brendan Gregg explains, this function should be very fast unless something is stalling it. I wonder what conditions caused it in your case.

    Also, did you use Performance Schema with the default settings or some more verbose instrumentation?

    • Hi Peter,

      It was a fresh install, everything set at default.

      There was nothing particularly stalling it, as the only traffic it received was from the replication channel.

  • Hi David,

    I have some questions here.

    My first one is related to something I don’t really understand; maybe you could confirm whether what I understood is what you meant, or whether I got it completely wrong.
    So, let’s call the servers M (the production Master), P1 (the PXC node that will act as an asynchronous slave of M), P2 and P3 (both PXC nodes), OK?

    P1, P2 and P3 are 3 new nodes with data copied from M (or an async slave of M), the same data on all 3 nodes.
    P1, P2 and P3 are in the same PXC, and finally P1 is configured as a slave of M.

    If you run your specific query on P1 it’s slow, but not on P2 and P3. Just because P1 replicated from M.
    If you add a new node to the cluster (P4) and this node performs SST with P1 as donor, then the query is also slow on P4, but still ok on P2 and P3, right ?!!?

    This is strange, if what I understood is indeed what’s happening…

    My second question is related to the instrumentation that is specific to PXC. I saw at PLAM that PXC now has performance_schema instruments that are not in MySQL Community Edition, nor in Galera… did you try to disable only these (if they are enabled by default)?

    Thank you.

    • Hi Lefred,

      About “If you add a new node to the cluster (P4) and this node performs SST with P1 as donor, then the query is also slow on P4, but still ok on P2 and P3, right ?!!?”: in my tests there were times when, after an SST, P4 was not slow, and times when it was. P2 and P3 remained OK.
      I agree with you, it was strange.

      I didn’t try to disable instruments, as this platform was well tested with the customer’s workload, and replication was just a step on the migration path.

  • Fred,

    Performance Schema was only added to PXC 5.7. From what I understand, we’re speaking about full table scan queries, which seems to point to the timed table access instrumentation in performance schema, which is disabled by default exactly because its cost is high… It still looks way too high 🙂 This is why I’m very curious about the other details of the configuration – OS, hardware, etc.

  • Issues with table IO instrumentation for large scans are what https://dev.mysql.com/worklog/task/?id=7802 (“PERFORMANCE SCHEMA, BATCH TABLE IO”) was implemented to address in MySQL 5.7.

    Do you see these issues with PXC 5.7?

  • And to verify if that is the issue, rather than disabling all of performance schema, you could just try disabling the table IO instrument:

    UPDATE performance_schema.setup_instruments SET enabled = 'NO', timed = 'NO' WHERE name = 'wait/io/table/sql/handler';

  • This reminds me of this bug: https://bugs.mysql.com/bug.php?id=76309. I once noticed the cycle timer taking way too much time. With ‘perf top’ I traced that back to a slow rdtsc instruction. Rebooting the VM fixed it.

    • I only know about the case I encountered, which was probably a VMware or firmware bug. However, more people might have had this issue w/o finding the real root cause. That’s also why I filed the bug; that should make it easier to discover.

      With different kernel versions, types of hardware and virtualization there are many things that can go wrong.

      I found this with lots of info about rdtsc: http://oliveryang.net/2015/09/pitfalls-of-TSC-usage/
