October 22, 2014

Rare evil MySQL Bug

There is the rare bug which I ran into every so often. Last time I’ve seen it about 3 years ago on MySQL 4.1 and I hoped it is long fixed since… but it looks like it is not. I now get to see MySQL 5.4.2 in the funny state.

When you see bug happening you would see MySQL log flooded with error messages like this:

091119 23:03:34 [ERROR] Error in accept: Resource temporarily unavailable
091119 23:03:34 [ERROR] Error in accept: Resource temporarily unavailable
091119 23:03:34 [ERROR] Error in accept: Resource temporarily unavailable
091119 23:03:34 [ERROR] Error in accept: Resource temporarily unavailable

filling out disk space

Depending on the case you may be able to connect to MySQL through Unix Socket or TCP/IP or neither.
It also looks like there is a correlation between having a lot of tables and such condition.

Previously I was unlucky with seeing this issue in production so we had to restart MySQL quickly currently I have a test MySQL showing some behavior.

Here is what strace tells me:

[percona@test9 msb_5_4_2]$ strace -f -p 19229
Process 19229 attached with 23 threads – interrupt to quit
[pid 19286] rt_sigtimedwait([HUP QUIT ALRM TERM TSTP],
[pid 19285] futex(0x165962ec, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19284] select(0, NULL, NULL, NULL, {0, 765000}
[pid 19283] select(0, NULL, NULL, NULL, {0, 412000}
[pid 19248] futex(0x1781193c, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19247] futex(0x178118bc, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19246] futex(0x1781183c, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19245] futex(0x178117bc, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19244] futex(0x1781173c, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19232] futex(0x1781113c, FUTEX_WAIT_PRIVATE, 165, NULL
[pid 19229] accept(16392,
[pid 19241] futex(0x178115bc, FUTEX_WAIT_PRIVATE, 319, NULL
[pid 19243] futex(0x178116bc, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19242] futex(0x1781163c, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19240] futex(0x1781153c, FUTEX_WAIT_PRIVATE, 3, NULL
[pid 19239] futex(0x178114bc, FUTEX_WAIT_PRIVATE, 7, NULL
[pid 19238] futex(0x1781143c, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19237] futex(0x178113bc, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19236] futex(0x1781133c, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19235] futex(0x178112bc, FUTEX_WAIT_PRIVATE, 7, NULL
[pid 19234] futex(0x1781123c, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19233] futex(0x178111bc, FUTEX_WAIT_PRIVATE, 207, NULL
[pid 19231] futex(0x178110bc, FUTEX_WAIT_PRIVATE, 1, NULL
[pid 19229] <... accept resumed> 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] accept(16392, 0x7fffffac70b0, [18423225245113516048]) = -1 EAGAIN (Resource temporarily unavailable)
[pid 19229] fcntl(16392, F_SETFL, O_RDWR) = 0
[pid 19229] fcntl(16392, F_SETFL, O_RDWR) = 0
[pid 19229] select(16394, [1025 1040 1042 1044
....
8 9709 9714 9716 9717 15368 15369], NULL, NULL, NULL*** buffer overflow detected ***: strace terminated
======= Backtrace: =========
/lib64/libc.so.6(__chk_fail+0x2f)[0x35f36e6aff]
/lib64/libc.so.6[0x35f36e5ad3]
strace[0x408b60]

So as you can see accept gets pretty high socket number – probably because of large innodb_open_files I tested with in this case – which all can be used during recovery.

Note the process gets accept on the socket to fail with EAGAIN (which is not well described in the manual) and later getting call to select call with 16384 sockets. This does not seems to be the healthy number to work with SELECT call and It is quite possible something goes wrong because of it.

If you have any ideas why could be going wrong in this case.

About Peter Zaitsev

Peter managed the High Performance Group within MySQL until 2006, when he founded Percona. Peter has a Master's Degree in Computer Science and is an expert in database kernels, computer hardware, and application scaling.

Comments

  1. Ah, I think I see it now.

    The select() and accept() calls in the trace are from
    handle_connections_sockets() in sql/mysqld.cc. It is supposed to select() on
    (at most) 2 file descriptors: the IP server socket and the unix server socket.

    Now in this case, one of these somehow ended up with file descriptor number
    16393.

    From `man 2 select`:

    “An fd_set is a fixed size buffer. Executing FD_CLR() or FD_SET() with a
    value of fd that is negative or is equal to or larger than FD_SETSIZE will
    result in undefined behavior. Moreover, POSIX requires fd to be a valid
    file descriptor.”

    And FD_SETSIZE seems to be 1024 on Linux!

    So basically mysqld is now passing random stack memory to select() for file
    descriptors 1024-16393 :-(. This also explains why accept() fails with EAGAIN:
    there isn’t any new connection pending, it is just that select() gets lots of
    random file descriptors in the undefined memory, and one of them happens to
    return ready.

    So the code in handle_connections_sockets() is really broken, the only reason
    it works at all is probably that the server sockets are usually created early
    and hence get low file descriptors. But in this case for some reason at least
    one of them got the very high number 16393.

    Even if we do not exceed the FD_SETSIZE it can still be inefficient to
    search a potentially big fd_set for just two descriptors. The code should be
    changed to use poll(). I seem to remember seeing a patch for this, or I might
    make on up for MariaDB.

  2. peter says:

    Kristian,

    Thanks. Good catch. This matches my guess as well. select() is way to the trouble especially these days when we deal with a lot of connections, a lot of open files etc.

    I filed a bug on it BTW:
    http://bugs.mysql.com/bug.php?id=48929

  3. Brant says:

    I don’t have anything to add other than Kristian is awesome.

  4. Ryan H says:

    Thanks for this. We have this problem all the time on our production servers.

    -Ryan

  5. peter says:

    Ryan,

    Looks like this series of posts of workings with large amount of tables matches your envinronment significantly :)

  6. domas says:

    yet more arguments for shared tablespaces, mwaha.

  7. Antony says:

    Or simply rearrange the server code so that the socket handles are created first, before any storage engine is initialized.

Speak Your Mind

*