Evaluation of PMP Profiling ToolsAlexey Stroganov
In this blog post, we’ll look at some of the available PMP profiling tools.
While debugging or analyzing issues with Percona Server for MySQL, we often need a quick understanding of what’s happening on the server. Percona experts frequently use the pt-pmp tool from Percona Toolkit (inspired by http://poormansprofiler.org).
The pt-pmp tool collects application stack traces GDB and then post-processes them. From this you get a condensed, ordered list of the stack traces. The list helps you understand where the application spent most of the time: either running something or waiting for something.
Getting a profile with pt-pmp is handy, but it has a cost: it’s quite intrusive. In order to get stack traces, GDB has to attach to each thread of your application, which results in interruptions. Under high loads, these stops can be quite significant (up to 15-30-60 secs). This means that the pt-pmp approach is not really usable in production.
Below I’ll describe how to reduce GDB overhead, and also what other tools can be used instead of GDB to get stack traces.
By default, the symbol resolution process in GDB is very slow. As a result, getting stack traces with GDB is quite intrusive (especially under high loads).There are two options available that can help notably reduce GDB tracing overhead:
Shell123456# to check if index already exists:readelf -S | grep gdb_index# to generate index:gdb -batch mysqld -ex "save gdb-index /tmp" -ex "quit"# to embed index:objcopy --add-section .gdb_index=tmp/mysqld.gdb-index --set-section-flags .gdb_index=readonly mysqld mysqld
- Use readnever patch. RHEL and other distros based on it include GDB with the readnever patch applied. This patch allows you to avoid unnecessary symbol resolving with the --readnever option. As a result you get up to 10 times better speed.
- Use gdb_index. This feature was added to address symbol resolving issue by creating and embedding a special index into the binaries. This index is quite compact: I’ve created and embedded gdb_index for Percona server binary (it increases the size around 7-8MB). The addition of the gdb_index speeds up obtaining stack traces/resolving symbols two to three times.
- eu-stack (elfutils)
The eu-stack from the elfutils package prints the stack for each thread in a process or core file.Symbol resolving also is not very optimized in eu-stack. By default, if you run it under load it will take even more time than GDB. But eu-stack allows you to skip resolving completely, so it can get stack frames quickly and then resolve them without any impact on the workload later.
Quickstack is a tool from Facebook that gets stack traces with minimal overheads.
Now let’s compare all the above profilers. We will measure the amount of time it needs to take all the stack traces from Percona Server for MySQL under a high load (sysbench OLTP_RW with 512 threads).
The results show that eu-stack (without resolving) got all stack traces in less than a second, and that Quickstack and GDB (with the readnever patch) got very close results. For other profilers, the time was around two to five times higher. This is quite unacceptable for profiling (especially in production).
There is one more note regarding the pt-pmp tool. The current version only supports GDB as the profiler. However, there is a development version of this tool that supports GDB, Quickstack, eu-stack and eu-stack with offline symbol resolving. It also allows you to look at stack traces for specific threads (tids). So for instance, in the case of Percona Server for MySQL, we can analyze just the purge, cleaner or IO threads.
Below are the command lines used in testing:
# gdb & gdb+gdb_index
time gdb -ex "set pagination 0" -ex "thread apply all bt" -batch -p `pidof mysqld` > /dev/null
time gdb --readnever -ex "set pagination 0" -ex "thread apply all bt" -batch -p `pidof mysqld` > /dev/null
time eu-stack -s -m -p `pidof mysqld` > /dev/null
# eu-stack without resolving
time eu-stack -q -p `pidof mysqld` > /dev/null
# quickstack - 1 sample
time quickstack -c 1 -p `pidof mysqld` > /dev/null
# quickstack - 1000 samples
time quickstack -c 1000 -p `pidof mysqld` > /dev/null