September 30, 2014

How to track down the source of Aborted_connects

Yesterday I helped someone who was seeing a lot of “server has gone away” error messages on his website. While investigating this problem, I noticed several things amiss, which appeared to be related but really weren’t. The biggest measurable sign was

These two status variables are actually unrelated (see the manual page that explains them). The first was related to the errors the client was seeing: the server was closing inactive connections after a while, and I fixed it by increasing the wait_timeout configuration variable.

The second error does not indicate that an active connection is closed at all. Rather, it shows that a connection cannot be made for some reason. Perhaps it’s networking, or perhaps there’s an issue with permissions or something else. The first thing I did was look for packet loss between the database server and the web server; the network appeared to be working fine.

With that ruled out (at least, to my satisfaction) I turned to tcpdump to see what was happening with these connections. I ran the following command in one window of my screen session, so I could see when a connection was aborted:

And then I started tcpdump in another window:

After I saw an aborted connection, I cancelled tcpdump and looked at the resulting file. Inspecting the session with tcpdump -r showed that there was a complete TCP session; nothing bad was happening at that layer. So I used the strings utility to look at the text sent in the packets:

I’ve anonymized the offending IP address. However, I checked the server’s grant tables and indeed. that IP address (which is a machine in the local network) is not allowed to connect.

I don’t actually use tcpdump much, but this was a fun little exercise that I thought I’d share with you.

About Baron Schwartz

Baron is the lead author of High Performance MySQL.
He is a former Percona employee.

Comments

  1. Nathan Labenz says:

    Baron – I am slow to thank you for responding, but thanks! I can’t get enough of this blog and it’s great that you guys will still respond to questions on old posts. I am approaching Percona fanboy-dom

  2. Baron Schwartz says:

    Nathan, sorry I am slow to respond. There won’t be anything in tcpdump because the connection is actually closed, so the server has no client to send the error message to.

  3. Nathan Labenz says:

    Baron -

    I am using this strategy to try to identify the cause of Aborted_connects that I suspect are due to network timeouts. Any idea what the error output would look like? I am using grep to examine the tcpdump output but I can’t seem to find the error message related to Aborted_connects.

    Thanks for any suggestions!

  4. Nathan Labenz says:

    Baron -

    Any idea what the error output looks like for “Aborted_connects” caused by network timeouts?

    I am using this technique to try to track down the root cause of periodic “Aborted_connects” and so far none of the obvious keywords seem to appear in the tcpdump output.

    Any suggestions will be much appreciated. Thanks!

    PS – Sorry if this is posted twice. I didn’t get a confirmation or waiting for moderation or anything the first time I tried.

  5. KB Benton says:

    Like Antony above, I too agree that mysqld needs audit metrics. Our information protection group is asking us to log all connections to mysqld in utmp-like style so we can see when an account logged in and out and where the connection came from.

  6. KB Benton says:

    Good information, Baron – thanks. :-) Like you, I don’t use tcpdump very often to troubleshoot problems but your example is very helpful.

    Tracking down aborted clients can be a real pain if the DBA isn’t regularly monitoring the error logs and allowing mysqld to log warnngs. Failing to keep up with the warnings, however, can cause disk full. I wrote an article you may find interesting on it here: http://kbcmdba.blogspot.com/2011/05/heavy-improper-connection-termination.html.

  7. Antony Curtis says:

    If you were to dig out one of my earlier Audit patches for MySQL, you could have hooked a plugin on the aborted connect event and no need to mess about with tcpdump.

  8. It’s too bad it’s not in the server! There’s no possibility of patching the server in most cases like this.

  9. Many things that are not being built in/for proxy and enterprise monitor should be direct server features or plugins.
    That not being the case is not a technical decision but one of product marketing. And I *DO NOT LIKE IT*.
    IMHO the “but we need to make money” argument is a wussy cop-out. It’s just so that people don’t have to use their imagination a bit more. What a pity. Such good things could come of looking for real opportunities.

  10. peter says:

    Baron, Antony

    Indeed Audit extensions are must to have in MySQL.
    At least very basics it would be good to split Aborted_connects to network related and Authentication_Failures as these are very different events.
    Aborted_clients may also benefit splitting somehow because unexpected aborts (network related) are not the same as aborts when client exists without closing connection or because wait_timeout seconds passed.

  11. sasi says:

    I have read this somewhere

    MySQL internally increments a per-host counter it uses to track how many “bad connections” it has seen. When this counter exceeds the value of max_connect_errors MySQL will block the host from connecting again until you issue a FLUSH HOSTS command.

    By the above statement it seems clear that mysql server stores the bad client hosts somewhere.I would like to see what are those(if any).Can somebody help?..

  12. Unfortunately it’s invisible :-( I think someone somewhere was working on a SHOW HOSTS patch, but I don’t think it made it into any codebase I know of.

  13. IIRC it was Jeremy Cole who had that patch. We can probably get it into MariaDB. It makes sense to have that info visible.

  14. sasi says:

    Hi Baron and Arjen,
    Thanks a lot for your responses :) .Will be waiting for that patch to come live.It is very useful in trouble shooting.

  15. Daniel says:

    I know this is an old post, but I’m hoping it’s still monitored. Do you have any suggestions on how to accomplish something similar but with a socket rather than TCP?

  16. I’ve never found a way to snoop on data passing through a Unix socket.

  17. Daniel says:

    Darn. Well, thanks for the quick reply. This has me at a loss.

  18. Nick S. says:

    Or…..you can just set the log-warnings variable to greater than 1.

    This will start logging the aborted connections in the mysql log.

    It’s not as detailed as tcpdump. However, in some cases running tcpdump is not an option.

  19. Nick S. says:

    Clarification: the log-warnings variable only addresses “Aborted Clients” counter.

  20. @Nick, no, log-warnings=2 addresses “aborted_connects”, which is what the discussion is about:
    http://dev.mysql.com/doc/refman/5.5/en/server-options.html#option_mysqld_log-warnings

    However it does NOT log for every aborted connects. I have a server with like 10 aborted connects per second, and only one log entry per 30 seconds…

Speak Your Mind

*