First, a thank you to everyone who attended the webinar Today, I appreciate your time and nice comments. As promised, here are answers to questions that couldn’t be answered during the talk:
The manual has full details, but it’s important to know that the latest release for every tool is also available for direct download in a url like this: https://www.percona.com/get/<tool-name>. I use this a lot when I just need a specific tool and I’m working on a server without the toolkit installed.
As part of its safeguards, pt-table-checksum monitors lag on replicas, and pauses as needed to let them catch up (it also pauses if replication stops on a replica). In order to do this, it must connect to all replicas off the master server, and it tries to use the same credentials used in the DSN specified for the master. This is why I recommended having consistent credentials across hosts, but if your setup is different, you can control the way the tool discovers and connects to slaves with the –recursion-method option.
This was replied during the webinar, but I think it is important enough to repeat it here: Both these tools should be run from the master. Writing to slaves is the #1 cause I have seen for replication problems, so it only makes sense that the tools are designed to write to the master.
It will work just fine, though it will obviously take more time! The tool processes the database one table at a time, and processes each table one chunk at a time, checksumming each chunk with a single query. It monitors master load to suspend itself if there is impact, and has other safeguards (as mentioned in a previous reply), so it is safe to use even for very large data sets.
Here‘s a Vagrant box my colleague Marcos Albe and myself prepared for our Percona Toolkit Tutorial at the last Percona Live New York, and it includes an example of pt-table-checksum and pt-table-sync, among others.
Unfortunately, I don’t know about which specific output this question is, so I’ll review all the slides which include output from the tools just in case:
The rename table itself should be fast, but as it needs to lock the table, long running transactions that lock (part of) the table can make the tool abort with a ‘Lock wait timeout exceeded’ error. The table and triggers created during the process are dropped, and the original table is left unmodified.
The same attendee asked about any performance differences when using the tool with or without innodb_file_per_table, and the first thing that comes to my mind is that when using a single shared tablespace, using this on very large tables can make your ibdata file grow a lot. As is known, this space won’t be recovered when the auxiliary table is dropped unless you’re using innodb_file_per_table.
Yes, though you may want to test, as even for large data set sizes, it may be faster to run the alter table command directly (specially so when dropping an index, based on my experience).
The Percona Monitoring Plugins include cacti templates and nagios plugins, and each should be used for its appropriate use case. The cacti templates provide very good performance metrics.
Yes, pt-sift does not collect any diagnosis data. It is only a way to get a bird’s eye view of a capture (or a set of captures) obtained with pt-stalk. If the latest release is invoked without arguments, it looks for captures under /var/lib/pt-stalk/
This is a very good question, and in the end, it is the DBA’s judgement call. As I explained in the webinar, once you have data differences between two (or more) servers, if using statement based replication, it is possible that new differences are introduced while the sync is happening. In other words: bad data usually leads to more bad data! My preference is to use pt-table-sync if the damage is contained (i.e. one or a few tables, in rows that probably won’t be updated short term), or if the data set size, storage engine, network speed, or other factors make cloning a server too difficult, and just clone the host otherwise. If you’re using Innodb only, streaming a backup with xtrabackup could be a good and safe alternative.
Thanks again to all those who attended, if you have further questions, just reply here or drop me an email.