Repeatable crash with Percona Server 5.5.13-15

  • Repeatable crash with Percona Server 5.5.13-15

    Recently I noticed that MySQL was crashing, and upon further investigation discovered that a database had been upgraded to Magento 1.6.

    Details of the error:

    len 232; hex c821d40700000000b1c05895ad7f000030322a6bad7f00000000000000000000000000000000000000000000000000000003000000000000000100000000000000000000000000000003000000000000000100000000000000030000000000000000000000000000000000000000000000000000000000000000ffffffffffffffff000000000000000001000000000000008344510700000000009d33be07000000000200000000000000010000000000000030322a6bad7f0000000000000000000060e111770000000002000000000000000000000000000000009833be07000000001000000000000000; asc ! X 02*k DQ 3 02*k ` w 3 ;
    110906 15:30:55 InnoDB: Assertion failure in thread 140382468241152 in file btr0pcur.c line 242
    InnoDB: We intentionally generate a memory trap.
    InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
    InnoDB: If you get repeated assertion failures or crashes, even
    InnoDB: immediately after the mysqld startup, there may be
    InnoDB: corruption in the InnoDB tablespace. Please refer to
    InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html
    InnoDB: about forcing recovery.
    110906 15:30:55 - mysqld got signal 6 ;
    This could be because you hit a bug. It is also possible that this binary
    or one of the libraries it was linked against is corrupt, improperly built,
    or misconfigured. This error can also be caused by malfunctioning hardware.
    We will try our best to scrape up some info that will hopefully help diagnose
    the problem, but since we have already crashed, something is definitely wrong
    and this may fail.

    It is possible that mysqld could use up to
    key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 987531 K
    bytes of memory
    Hope that's ok; if not, decrease some variables in the equation.

    Thread pointer: 0x7b27400
    Attempting backtrace. You can use the following information to find out
    where mysqld died. If you see no messages after this, something went
    terribly wrong...
    stack_bottom = 0x7fad5726ee48 thread_stack 0x60000
    /usr/local/mysql55/bin/mysqld(my_print_stacktrace+0x35)[0x8e31b6]
    /usr/local/mysql55/bin/mysqld(handle_segfault+0x321)[0x55a8d8]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0xfc60)[0x7fae19c02c60]
    /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x7fae180b0d05]
    /usr/local/mysql55/bin/mysqld(_ZN26QUICK_GROUP_MIN_MAX_SELECT11next_prefixEv+0x171)[0x85ffc3]
    /usr/local/mysql55/bin/mysqld(_ZN26QUICK_GROUP_MIN_MAX_SELECT8get_nextEv+0x4d)[0x85f7f3]
    /usr/local/mysql55/bin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0xc1)[0x646d60]
    /usr/local/mysql55/bin/mysqld(_ZN4JOIN4execEv+0x23d5)[0x62fe91]
    /usr/local/mysql55/bin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x368)[0x630630]
    /usr/local/mysql55/bin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x1e5)[0x62887b]
    /usr/local/mysql55/bin/mysqld(_Z21mysql_execute_commandP3THD+0x98a)[0x5fb7d5]
    /usr/local/mysql55/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x340)[0x604fac]
    /usr/local/mysql55/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xaf7)[0x5f87e1]
    /usr/local/mysql55/bin/mysqld(_Z10do_commandP3THD+0x2f0)[0x5f7ad7]
    /usr/local/mysql55/bin/mysqld(_Z24do_handle_one_connectionP3THD+0x1a1)[0x6e16c8]
    /usr/local/mysql55/bin/mysqld(handle_one_connection+0x33)[0x6e1176]
    /lib/x86_64-linux-gnu/libpthread.so.0(+0x6d8c)[0x7fae19bf9d8c]

    Trying to get some variables.
    Some pointers may be invalid and cause the dump to abort.
    Query (0x7bd8ab0): SELECT COUNT(DISTINCT parent_id) FROM `catalog_product_relation` WHERE (child_id IN('17'))
    Connection ID (thread ID): 1
    Status: NOT_KILLED

    The server was compiled from source, and I can reproduce this on both Ubuntu 11.04 and CentOS 5.6. All tables have been checked and are OK.
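
    For reference, "checking" the table named in the crashing query might look like the following (the table name is taken from the query in the log above; note that for InnoDB, CHECK TABLE catches only some forms of corruption, so a null-op rebuild is also shown as a general technique):

    ```sql
    -- Validate the table referenced by the crashing query
    CHECK TABLE catalog_product_relation;

    -- For InnoDB, CHECK TABLE can miss some corruption; a null ALTER
    -- forces a full rebuild of the table and all of its indexes
    ALTER TABLE catalog_product_relation ENGINE=InnoDB;
    ```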

    If I enter the above query, the crash happens every time. I have tried the following versions:


    Apologies if this is the wrong place to post this.

  • #2
    Have you tried this on a vanilla MySQL 5.5 server?


    • #3
      Not yet, about to do that tonight.


      • #4

        With a stock MySQL 5.5.15 from Oracle, built with the same compile line, it crashes with the exact same fault. MariaDB 5.2.8 works absolutely fine.


        • #5
          Please file a bug! If you have a support contract we can fix this.


          • #6
            I filed a bug over at mysql.com and it is an already-confirmed bug. There is no ETA on a fix, and the original bug report is marked private.


            • #7
              I too am seeing this error on a recently built Percona Server. Note: we are seeing the crash during replication, on the same table and command. So I suspect a corrupted data file, but have no way of telling. We are using innodb_file_per_table. We produced this MySQL data folder by rsyncing another slave's data and logs over, modifying the configs, and restarting replication. This workflow has worked for us every previous time we tried it, even on other Percona builds.

              Here is the crash data:

              Thread pointer: 0x33f1be00
              Attempting backtrace. You can use the following information to find out
              where mysqld died. If you see no messages after this, something went
              terribly wrong...
              stack_bottom = 0x59f7f0f8 thread_stack 0x40000
              /lib64/libc.so.6(gsignal+0x35)[0x3f5aa30265]
              /lib64/libc.so.6(abort+0x110)[0x3f5aa31d10]
              /usr/sbin/mysqld[0x84954a]
              /usr/sbin/mysqld[0x82c961]
              /usr/sbin/mysqld(_ZN7handler13ha_delete_rowEPKh+0x54)[0x6917f4]
              /usr/sbin/mysqld(_ZN12ha_partition10delete_rowEPKh+0x7f)[0x97a05f]
              /usr/sbin/mysqld(_ZN7handler13ha_delete_rowEPKh+0x54)[0x6917f4]
              /usr/sbin/mysqld(_Z12mysql_deleteP3THDP10TABLE_LISTP4ItemP10SQL_I_ListI8st_orderEyy+0x8e8)[0x789b88]
              /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x1aa7)[0x578df7]
              /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x38c)[0x57d6ec]
              /usr/sbin/mysqld(_ZN15Query_log_event14do_apply_eventEPK14Relay_log_infoPKcj+0xf59)[0x74e3d9]
              /usr/sbin/mysqld(_Z26apply_event_and_update_posP9Log_eventP3THDP14Relay_log_info+0x152)[0x516be2]
              /usr/sbin/mysqld[0x5192f2]
              Trying to get some variables.
              Some pointers may be invalid and cause the dump to abort.
              Query (0x35d220bf): DELETE FROM [hidden schema].[hidden table] WHERE date < '2008-09-06'
              Connection ID (thread ID): 11
              Status: NOT_KILLED


              • #8
                You need to describe your rsync method in more detail. Just saying that you rsync data and that it worked in the past doesn't mean anything; you could have just been lucky. In general you can't rsync InnoDB files while InnoDB is running. Did you do it from a snapshot, or while the server was shut down?


                • #9
                  Hi Baron, I appreciate your attention and hope that this is a constructive exchange. I'm actually hoping that this is a problem on my end that can be fixed by slapping myself around.

                  Here's the scenario:

                  The target and source hosts both have data and logs on separate volumes, and are almost identical except that the target host is running Percona Server and the disk volumes are slightly different hardware. /etc/my.cnf is the same on both except for server-id.

                  The rsync process is as follows:
                  01. On the source host (a redundant slave), "stop slave" and "service mysql stop".
                  02. On the source host rsync the mysql data folder (contains all table spaces) to the target host in a tmp folder.
                  03. On the source host rsync the logs folder (relay, bin, etc) to the target host in a tmp folder.
                  04. On the target host, mv the tmp folders to proper names to match the /etc/my.cnf
                  05. chown -Rf mysql:mysql the data and logs folders.
                  06. Make sure the server-id in /etc/my.cnf is unique.
                  07. "service mysql start"
                  08. "mysql_upgrade"
                  09. "service mysql restart"
                  10. Compare records from both hosts, spot checking qty of rows and sampling data.
                  11. "show slave status\G" to see relay and bin indexes are set along with replication settings.
                  12. "start slave"
                  13. Wait for crash.
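
                  The SQL side of steps 10-12 amounts to roughly the following (the schema and table names here are placeholders, not names from this thread):

                  ```sql
                  -- Step 10: spot-check row counts on both hosts
                  SELECT COUNT(*) FROM some_schema.some_table;

                  -- Step 11: confirm relay/binlog positions and replication settings
                  SHOW SLAVE STATUS\G

                  -- Step 12: begin applying relayed events
                  START SLAVE;
                  ```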


                  • #10
                    Nothing's wrong with your rsync procedure, so it does look like a server bug.


                    • #11
                      Hello Randy,

                      Did you try to run the query that caused the crash:
                      Query (0x35d220bf): DELETE FROM [hidden schema].[hidden table] WHERE date < '2008-09-06'
                      Connection ID (thread ID): 11

                      on the slave that you rsynced from? Did it crash too?


                      • #12
                        I did a manual run with no errors, warnings, or crashes. So I'm thinking this has to be a hardware or file-system issue. Brand new box, brand new drives... guess we'll send it back to Dell. :roll:

                        Thanks for your help!


                        • #13
                          Problem solved. The root cause was corruption in InnoDB tables. The fix was to force a rebuild of each affected table using ALTER.

                          ALTER TABLE tbl1 ENGINE=InnoDB;

                          or in my specific case:

                          ALTER TABLE tbl1 COALESCE PARTITION 8;

                          After applying the ALTERs to all partitioned tables, replication caught up without issue.
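
                          To find every partitioned table that might need the same treatment, one option (a sketch; 'your_schema' is a placeholder for the affected database) is to generate the rebuild statements from information_schema:

                          ```sql
                          -- Emit a null-op rebuild statement for each partitioned table;
                          -- PARTITION_NAME is NULL for non-partitioned tables
                          SELECT DISTINCT
                                 CONCAT('ALTER TABLE `', TABLE_SCHEMA, '`.`',
                                        TABLE_NAME, '` ENGINE=InnoDB;')
                          FROM   information_schema.PARTITIONS
                          WHERE  PARTITION_NAME IS NOT NULL
                            AND  TABLE_SCHEMA = 'your_schema';
                          ```

                          The generated statements can then be reviewed and run one at a time, as in the fix above.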