Have you ever had a case where you needed to find a process which sent a HUP/KILL/TERM or other signal to your database? Let me rephrase. Did you ever have to find which process messed up your night? 😉 If so, you might want to read on. I’m going to tell you how you can find it.
Granted, on small and/or meticulously managed systems tracking down the culprit is probably not a big deal. You can likely identify your process simply by checking what processes have enough privileges to send mysqld a HUP/KILL/TERM signal. However, frequently we see cases where this may not work or the elimination process would be too tedious to execute.
We recently had a case where a process was frequently sending SIGHUPs to mysqld and the customer asked us to see if we could get rid of this annoyance. This blog is the direct result of a discussion I had with my colleague Francisco Bordenave on the options available to deal with this issue. I’m only going to cover a few of them in this blog, but I imagine that most of you will be able to find one that works for your case. Note that most tracing tools add some overhead to the system being investigated. The tools presented below are designed to be lightweight, so the impact should be well within an acceptable range for most environments.
DISCLAIMER: While writing this blog I discovered that David Busby has also discussed one of the tools that I’m going to cover in his article. For those who have read that article, note that I’m going to cover other tools as well, along with a few extra SystemTap details. For those who haven’t yet had a chance to read David’s blog, you can read it here.
All right, let’s see what “low hanging tools” there are available to us to deal with our issue!
In this article I’m going to focus on Linux as that’s what people in the MySQL community seem to care about most nowadays. The tools that I will discuss will be SystemTap, Perf and Audit. If you feel that you would like to read about the rest, let me know and I will cover the rest of the options in a followup article.
I’m going to set up SystemTap on a recent, 64-bit CentOS 7 box. I will only cover a basic install; you can find more about how to install SystemTap here.
The strength of SystemTap is definitely its flexibility, which potentially makes it the best tool for solving our problem on the Linux platform. It’s been around for some time and is generally regarded as mature, but I would recommend testing your “tapscripts” in dev/qa before you run them in production.
Follow the steps below to install SystemTap:
```
[root@centos7]~# sed -i 's/enabled=0/enabled=1/' /etc/yum.repos.d/CentOS-Debuginfo.repo
[root@centos7]~# yum repolist
...
base-debuginfo/x86_64 CentOS-7 - Debuginfo 1,688
...
```
```
[root@centos7]~# yum install kernel-debuginfo kernel-debuginfo-common kernel-devel
[root@centos7]~# yum install systemtap systemtap-runtime
```
Create a tapscript like the one below:
```
[root@centos7]~# cat find_sighupper.stp
#!/usr/bin/stap

# Prints information on process which sent HUP signal to mysqld

probe begin {
  printf("%-26s %-8s %-5s %-8s %-5s\n", "TIME", "SOURCE", "SPID", "TARGET", "TPID");
}

probe nd_syscall.kill.return {
  # @entry() captures each value as it was when the syscall was entered
  sname = @entry(execname());
  spid = @entry(pid());
  sig = @entry(uint_arg(2));
  tpid = @entry(uint_arg(1));
  tname = pid2execname(tpid);
  time = ctime(gettimeofday_s());
  # Report only SIGHUP (signal 1) delivered to mysqld
  if (sig == 1 && tname == "mysqld")
    printf("%-26s %-8s %-5d %-8s %-5d\n", time, sname, spid, tname, tpid);
}
```
Then run the tap script in a dedicated terminal:
```
[root@centos7]~# stap find_sighupper.stp
TIME                       SOURCE   SPID  TARGET   TPID
```
Send your HUP signal to mysqld from another terminal:
```
[root@centos7]~# kill -1 1984
```
The culprit should show up in your first window like so:
```
[root@centos7]~# stap find_sighupper.stp
TIME                       SOURCE   SPID  TARGET   TPID
Thu Feb 26 21:20:44 2015   kill     6326  mysqld   1984
^C
```
Note that with this solution I was able to define fairly nice constraints relatively easily. With a single probe (well, almost, as @entry reaches back to the values seen at syscall entry) I was able to get all this information and pick out only the HUP signals sent to mysqld. No other filtering is necessary!
Perf is another neat tool to have. As its name implies, it was originally developed for lightweight profiling, to use the performance counters subsystem in Linux. It became fairly popular and got extended many times over these past years. Since it happens to have probes we can leverage, we are going to use it!
As you can see, installing Perf is relatively simple.
```
# yum install perf
```
Start perf in a separate terminal window. I’m only going to run it for a minute but I could run it in screen for a longer period of time.
```
[root@centos7 ~]# perf record -a -e syscalls:sys_enter_kill sleep 60
```
In a separate terminal window send your test signal and obtain the results via “perf script”:
```
[root@centos7 ~]# echo $$
11380
[root@centos7 ~]# pidof mysqld
1984
[root@centos7 ~]# kill -1 1984
[root@centos7 ~]# perf script
# ========
# captured on: Thu Feb 26 14:25:02 2015
# hostname : centos7.local
# os release : 3.10.0-123.20.1.el7.x86_64
# perf version : 3.10.0-123.20.1.el7.x86_64.debug
# arch : x86_64
# nrcpus online : 2
# nrcpus avail : 2
# cpudesc : Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
# cpuid : GenuineIntel,6,70,1
# total memory : 1885464 kB
# cmdline : /usr/bin/perf record -a -e syscalls:sys_enter_kill sleep 60
# event : name = syscalls:sys_enter_kill, type = 2, config = 0x9b, config1 = 0x0, config2 = 0x0, excl_usr = 0, exc
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: software = 1, tracepoint = 2, breakpoint = 5
# ========
#
 bash 11380 [000] 6689.348219: syscalls:sys_enter_kill: pid: 0x000007c0, sig: 0x00000001
```
As you can see in the above output, process “bash” with pid 11380 sent a HUP signal (0x01) to pid 0x07c0 (decimal: 1984). Thus, we found our culprit with this method as well.
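Since perf prints the target pid and the signal in hex, a small helper can save the mental conversion. The following is only a sketch: `decode_kill_event` is a hypothetical function name, and it assumes the event line has the same layout as the sample output above (command name without spaces, a positive target pid).

```shell
#!/bin/sh
# Hypothetical helper (not part of perf): convert one sys_enter_kill line
# from "perf script" into decimal fields.
# Assumes the line layout shown in the sample output above.
decode_kill_event() {
    line=$1
    # First two whitespace-separated fields: sender command and sender pid.
    sender=$(printf '%s\n' "$line" | awk '{print $1}')
    sender_pid=$(printf '%s\n' "$line" | awk '{print $2}')
    # Hex arguments of the kill syscall as printed by the tracepoint.
    target_hex=$(printf '%s\n' "$line" | sed -n 's/.*pid: \(0x[0-9a-fA-F]*\).*/\1/p')
    sig_hex=$(printf '%s\n' "$line" | sed -n 's/.*sig: \(0x[0-9a-fA-F]*\).*/\1/p')
    # Header and comment lines carry no pid/sig pair, so reject them.
    [ -n "$target_hex" ] && [ -n "$sig_hex" ] || return 1
    # printf %d accepts 0x-prefixed input and prints it in decimal.
    printf 'sender=%s sender_pid=%s target_pid=%d signal=%d\n' \
        "$sender" "$sender_pid" "$target_hex" "$sig_hex"
}

# Example with the event line from the output above.
decode_kill_event ' bash 11380 [000] 6689.348219: syscalls:sys_enter_kill: pid: 0x000007c0, sig: 0x00000001'
# prints: sender=bash sender_pid=11380 target_pid=1984 signal=1
```

To decode a whole capture, the function could be fed line by line, for example with `perf script | while IFS= read -r l; do decode_kill_event "$l"; done`; the header lines are skipped because they do not match.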
You can read more about Audit in the Red Hat Security Guide.
Depending on your OS installation, it may be already installed.
In case it is not, you can install it as follows:
```
[root@centos7 ~]# yum install audit
```
When you are done installing, start your trace and track 64 bit kill system calls that send HUP signals with signal ID of 1:
```
[root@centos7]~# auditctl -l
No rules
[root@centos7]~# auditctl -a exit,always -F arch=b64 -S kill -F a1=1
[root@centos7]~# auditctl -l
LIST_RULES: exit,always arch=3221225534 (0xc000003e) a1=1 (0x1) syscall=kill
[root@centos7]~# auditctl -s
AUDIT_STATUS: enabled=1 flag=1 pid=7010 rate_limit=0 backlog_limit=320 lost=0 backlog=0
[root@centos7]~# pidof mysqld
1984
[root@centos7]~# kill -1 1984
[root@centos7]~# tail -2 /var/log/audit/audit.log
type=SYSCALL msg=audit(1425007202.384:682): arch=c000003e syscall=62 success=yes exit=0 a0=7c0 a1=1 a2=a a3=7c0 items=0 ppid=11380 pid=3319 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=1 comm="zsh" exe="/usr/bin/zsh" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
type=OBJ_PID msg=audit(1425007202.384:682): opid=1984 oauid=-1 ouid=995 oses=-1 obj=system_u:system_r:mysqld_t:s0 ocomm="mysqld"
```
As you can see from the above output, the results showed up nicely in the system audit.log. From the log it’s clear that I sent my SIGHUP to mysqld (pid 1984, “opid” field) from zsh (see the command name in the “comm” field) via the 64-bit kill syscall. Thus, mischief managed, once again!
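The SYSCALL record is long, and its a0 (target pid) and a1 (signal) arguments are logged in hex. As a rough sketch, the interesting fields could be pulled out like this; `audit_field` and `summarise_kill_record` are hypothetical helper names, and the code assumes the record format shown in the sample above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the audit package) for reading a
# type=SYSCALL record produced by a kill rule like the one above.

# Print the value of one key=value field; $1 = record, $2 = key.
audit_field() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n "s/^$2=//p" | head -n 1 | tr -d '"'
}

# Summarise who sent which signal to which pid.
# Assumes a0/a1 are hex without a 0x prefix, as in the sample record.
summarise_kill_record() {
    a0=$(audit_field "$1" a0)
    a1=$(audit_field "$1" a1)
    # Records without syscall arguments (e.g. type=OBJ_PID) are rejected.
    [ -n "$a0" ] && [ -n "$a1" ] || return 1
    printf 'comm=%s pid=%s ppid=%s target_pid=%d signal=%d\n' \
        "$(audit_field "$1" comm)" "$(audit_field "$1" pid)" \
        "$(audit_field "$1" ppid)" "0x$a0" "0x$a1"
}

# Example with fields taken from the sample record above (shortened).
summarise_kill_record 'type=SYSCALL msg=audit(1425007202.384:682): arch=c000003e syscall=62 success=yes exit=0 a0=7c0 a1=1 a2=a a3=7c0 items=0 ppid=11380 pid=3319 comm="zsh" exe="/usr/bin/zsh"'
# prints: comm=zsh pid=3319 ppid=11380 target_pid=1984 signal=1
```

Something like `grep 'type=SYSCALL' /var/log/audit/audit.log | while IFS= read -r rec; do summarise_kill_record "$rec"; done` would apply it to the whole log.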
In this blog I presented three different tools to help you track down the sources of signals. Each of the three has its own strengths. SystemTap is rich in features and really nicely scriptable. The additional features of auditd may make it appealing to deploy to your host. Perf is a great tool for CPU profiling and you might want to install it solely for that reason. On the other hand, your distribution might not have support compiled into its kernel or may make the setup harder for a given tool. In my experience most modern distributions support the tools discussed here, so the choice comes down to personal preference or convenience.
In case you were wondering, I often pick auditd because it is often already installed. SystemTap might be a bit more complicated to set up, but I would likely invest some extra time in the setup if my case were more complex. I primarily use perf for CPU tracing and tend to think of the other two tools before I think of perf for tracing signals.
Hope you enjoyed reading! Happy [h/t]racking!