Sometimes crucial data sharing is avoided because of compliance rules, organizational policies, or numerous security concerns. The common use cases involve sharing pt-mysql-summary, pt-stalk, and other OS-related details to assist Support Engineers or any other third-party team troubleshoot database-related issues.
In this context, pt-secure-collect is a very important utility from Percona, which helps capture the required information securely and also provides aid in masking the existing information.
Pt-secure-collect helps in collecting, sanitizing, and encrypting data from various sources. By default, this utility collects the output with the help of pt-stalk, pt-summary, and pt-mysql-summary.
Let’s see how this tool works.
Installation
The tool can be installed via the Percona official repositories:
| 1 | sudo yum install percona-toolkit | 
Another option for downloading pt-secure-collect is either via the Percona Toolkit or directly installing the specific tool.
| 1 2 | shell> sudo wget https://downloads.percona.com/downloads/percona-toolkit/3.5.2/binary/redhat/7/x86_64/percona-toolkit-3.5.2-2.el7.x86_64.rpm  shell> sudo yum install percona-toolkit-3.5.2-2.el7.x86_64.rpm | 
OR
| 1 2 | shell> sudo wget percona.com/get/pt-secure-collect  shell> sudo chmod +x pt-secure-collect | 
Now, let’s run our first command to capture the OS/Database-related details from the tool.
| 1 2 | shell> ./pt-secure-collect collect --bin-dir=/usr/bin/ --temp-dir=/home/vagrant/pt/ --mysql-host=localhost --mysql-port=3306 --mysql-user=root --mysql-password=Root@1234 Encryption password | 
Output:
| 1 2 3 4 5 6 7 8 9 10 11 | <span style="font-weight: 400;">INFO[2023-04-22 06:54:10] Temp directory is "/home/vagrant/pt/" INFO[2023-04-22 06:54:10] Creating output file "/home/vagrant/pt/pt-stalk_2023-04-22_06_54_10.out"   INFO[2023-04-22 06:54:10] Running pt-stalk --no-stalk --iterations=2 --sleep=30 --host=localhost --dest=/home/vagrant/pt/ --port=3306 --user=root --password=********   INFO[2023-04-22 06:55:42] Creating output file "/home/vagrant/pt/pt-summary_2023-04-22_06_55_42.out"   INFO[2023-04-22 06:55:42] Running pt-summary                             INFO[2023-04-22 06:55:48] Creating output file "/home/vagrant/pt/pt-mysql-summary_2023-04-22_06_55_48.out"   INFO[2023-04-22 06:55:48] Running pt-mysql-summary --host=localhost --port=3306 --user=root --password=********   INFO[2023-04-22 06:56:01] Sanitizing output collected data               INFO[2023-04-22 06:56:17] Creating tar file "/home/vagrant/pt/pt.tar.gz"   INFO[2023-04-22 06:56:17] Encrypting "/home/vagrant/pt/pt.tar.gz" file into "/home/vagrant/pt/pt.tar.gz.aes"   INFO[2023-04-22 06:56:17] Skipping encrypted file "pt.tar.gz.aes"   </span> | 
So, here the above command collected the data from the “pt*” tools securely. By default, it encrypts the data and asks for the encryption password as well. However, we can skip that part by mentioning this option “ –no-encrypt” option.
Options:-
| 1 2 | --bin-dir => Directory having the Percona Toolkit binaries (pt* tools).  --temp-dir => Temporary directory used for the data collection. | 
Note – In order to run the command successfully all prerequisites binaries of (pt-stalk, pt-summary, and pt-mysql-summary) must be present and included in the command.
Let’s decrypt the file and observe the captured details:
| 1 2 3 | <span style="font-weight: 400;">shell> ./pt-secure-collect decrypt /home/vagrant/pt/pt.tar.gz.aes  --outfile=/home/vagrant/pt/pt.tar.gz Encryption password: INFO[2023-04-22 07:01:55] Decrypting file "/home/vagrant/pt/pt.tar.gz.aes" into "/home/vagrant/pt/pt.tar.gz" </span> | 
Note – Here, we need to provide the password which we used at the time of encryption.
| 1 | --outfile => Write the output to this file. If omitted, the output file name will be the same as the input file, adding the .aes extension. | 
Now, inside the path, we can see the unencrypted file. Followed by this, we can uncompress the file to see the contents.
| 1 2 | shell> /home/vagrant/pt  -rw-------. 1 vagrant vagrant 500K Apr 22 07:01 pt.tar.gz | 
| 1 | shell> tar -xzvf pt.tar.gz | 
Let’s look at a couple of examples where the sensitive data has been altered or masked.
- With pt-secure-collect:
| 1 2 3 4 5 6 | Hostname | hostname  log_error | /var/log/hostname  Config File | /etc/hostname  pid-file        = /var/run/mysqld/hostname  log-error     = /var/log/hostname  socket        = /var/lib/mysql/hostname | 
- Without pt-secure-collect:
| 1 2 3 4 5 6 | Hostname | localhost.localdomain  log_error | /var/log/mysqld.log  Config File | /etc/my.cnf  pid-file       = /var/run/mysqld/mysqld.pid  log-error     = /var/log/mysqld.log  socket        = /var/lib/mysql/mysql.sock | 
Note – We can clearly see some differences in the both types of outputs. With pt-secure-collection the above information was just replaced with some random value(“hostname”).
Now, let’s see how we can sanitize an existing file “pt-mysql-summary.out” and mask the critical information that ends with the below output section.
| 1 | shell> ./pt-secure-collect sanitize --input-file=/home/vagrant/pt-mysql-summary.out > /home/vagrant/pt-mysql-summary_sanitize.out | 
Output:
| 1 2 3 4 5 6 7 8 9 | Hostname | hostname  Pidfile | /var/run/mysqld/hostname (exists)  log_error | /var/log/hostname  Config File | /etc/hostname  pid-file        = /var/run/mysqld/hostname  log-error     = /var/log/hostname  socket        = /var/lib/mysql/hostname  log-error     = /var/log/mariadb/hostname pid-file        = /var/run/mariadb/hostname | 
You may also control the information which you want to skip from masking with settings with option –no-sanitize-hostnames and –no-sanitize-queries.
Here, we see one more example where the critical information, such as “hostname” details inside the OS log file (“/var/log/messages”), is masked/replaced by some other value.
| 1 | shell> sudo ./pt-secure-collect sanitize --input-file=/var/log/messages > /home/vagrant/messages_sanitize.out | 
Output (without pt-secure-collect):
| 1 | Apr 23 03:37:13 localhost pmm-agent: #033[31mERRO#033[0m[2023-04-23T03:37:13.547+00:00] time="2023-04-23T03:37:13Z" level=error msg="Error opening connection to ProxySQL: dial tcp 127.0.0.1:6032: connect: connection refused" source="exporter.go:169"  #033[31magentID#033[0m=/agent_id/04dd6ad8-5c2e-4c52-a624-eb3bc7357651 #033[31mcomponent#033[0m=agent-process #033[31mtype#033[0m=proxysql_exporter | 
Output (with pt-secure-collect):
| 1 | Apr 23 03:37:13 localhost pmm-agent: #033[31mERRO#033[0m[2023-04-23T03:37:13.547+00:00] time="2023-04-23T03:37:13Z" level=error msg="Error opening connection to ProxySQL: dial tcp hostname:6032: connect: connection refused" source="hostname:169"  #033[31magentID#033[0m=/agent_id/04dd6ad8-5c2e-4c52-a624-eb3bc7357651 #033[31mcomponent#033[0m=agent-process #033[31mtype#033[0m=proxysql_exporte | 
Summary
With the help of this tool, both OS and database-level information/logs can be encrypted or masked with some different values to hide the sensitive data. This tool comes in handy while dealing with critical data troubleshooting with any third-party stakeholders and also maintains security/compliance-related practices.
Percona Toolkit is a collection of advanced open source command-line tools, developed and used by the Percona technical staff, that are engineered to perform a variety of MySQL, MariaDB, MongoDB, and PostgreSQL server and system tasks that are too difficult or complex to perform manually.
Learn more about Percona Toolkit
 
 
 
 
						 
						 
						 
						 
						