All nodes in cluster failed.

  • All nodes in cluster failed.

    Hello,

    Today my two-node cluster unexpectedly went offline. Both nodes went down with the following error in the error log:

    12:07:53 UTC - mysqld got signal 11 ;
    This could be because you hit a bug. It is also possible that this binary
    or one of the libraries it was linked against is corrupt, improperly built,
    or misconfigured. This error can also be caused by malfunctioning hardware.
    We will try our best to scrape up some info that will hopefully help
    diagnose the problem, but since we have already crashed,
    something is definitely wrong and this may fail.
    Please help us make Percona XtraDB Cluster better by reporting any
    bugs at https://bugs.launchpad.net/percona-xtradb-cluster

    key_buffer_size=2147483648
    read_buffer_size=20971520
    max_used_connections=54
    max_threads=153
    thread_count=21
    connection_count=21
    It is possible that mysqld could use up to
    key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 8366310 K bytes of memory
    Hope that's ok; if not, decrease some variables in the equation.

    Thread pointer: 0xc1d3db0
    Attempting backtrace. You can use the following information to find out
    where mysqld died. If you see no messages after this, something went
    terribly wrong...
    stack_bottom = 7fc52c43dd28 thread_stack 0x40000
    /usr/sbin/mysqld(my_print_stacktrace+0x35)[0x901e65]
    /usr/sbin/mysqld(handle_fatal_signal+0x4c4)[0x67fc34]
    /lib64/libpthread.so.0(+0xfae0)[0x7fc55a946ae0]
    /usr/lib64/libgalera_smm.so(_ZN6galera13Certification10do_test_v3EPNS_9TrxHandleEb+0x1b1)[0x7fc538a4f571]
    /usr/lib64/libgalera_smm.so(_ZN6galera13Certification7do_testEPNS_9TrxHandleEb+0x36f)[0x7fc538a52d1f]
    /usr/lib64/libgalera_smm.so(_ZN6galera13Certification4testEPNS_9TrxHandleEb+0x28)[0x7fc538a52ee8]
    /usr/lib64/libgalera_smm.so(_ZN6galera13Certification10append_trxEPNS_9TrxHandleE+0x8a)[0x7fc538a52f8a]
    /usr/lib64/libgalera_smm.so(_ZN6galera13ReplicatorSMM4certEPNS_9TrxHandleE+0x8b)[0x7fc538a7f3fb]
    /usr/lib64/libgalera_smm.so(_ZN6galera13ReplicatorSMM10pre_commitEPNS_9TrxHandleEP14wsrep_trx_meta+0x59)[0x7fc538a7f8d9]
    /usr/lib64/libgalera_smm.so(galera_pre_commit+0x148)[0x7fc538a90728]
    /usr/sbin/mysqld(_Z22wsrep_run_wsrep_commitP3THDP10handlertonb+0x9a2)[0x7ba0f2]
    /usr/sbin/mysqld[0x7baa83]
    /usr/sbin/mysqld(_Z14ha_prepare_lowP3THDb+0x8c)[0x5c37bc]
    /usr/sbin/mysqld(_Z15ha_commit_transP3THDbb+0x27c)[0x5c50dc]
    /usr/sbin/mysqld(_Z12trans_commitP3THD+0x49)[0x79ff59]
    /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x3571)[0x7045a1]
    /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x608)[0x7072e8]
    /usr/sbin/mysqld[0x707411]
    /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1ad4)[0x709694]
    /usr/sbin/mysqld(_Z10do_commandP3THD+0x1e3)[0x70aa63]
    /usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x17f)[0x6d430f]
    /usr/sbin/mysqld(handle_one_connection+0x47)[0x6d44e7]
    /usr/sbin/mysqld(pfs_spawn_thread+0x12a)[0xb3b63a]
    /lib64/libpthread.so.0(+0x7ddb)[0x7fc55a93eddb]
    /lib64/libc.so.6(clone+0x6d)[0x7fc55907ca1d]

    Trying to get some variables.
    Some pointers may be invalid and cause the dump to abort.
    Query (7fc2e803b660): is an invalid pointer
    Connection ID (thread ID): 9004
    Status: NOT_KILLED

    You may download the Percona XtraDB Cluster operations manual by visiting
    http://www.percona.com/software/percona-xtradb-cluster/. You may find information
    in the manual which will help you identify the cause of the crash.
    140214 12:07:54 mysqld_safe Number of processes running now: 0
    140214 12:07:54 mysqld_safe WSREP: not restarting wsrep node automatically
    140214 12:07:54 mysqld_safe mysqld from pid file /var/lib/mysql/ip-10-1-7-144.pid ended


    Server version:

    Server version: 5.6.15-56 Percona XtraDB Cluster (GPL), Release 25.2, Revision 692, wsrep_25.2.r4034

    And my.cnf:

    [mysqld]
    datadir=/var/lib/mysql
    user=mysql
    wsrep_provider=/usr/lib64/libgalera_smm.so
    wsrep_cluster_address=gcomm://10.1.7.144,10.1.8.218
    binlog_format=ROW
    default_storage_engine=InnoDB
    innodb_locks_unsafe_for_binlog=1
    innodb_buffer_pool_size=6G
    key_buffer_size = 2048M
    max_allowed_packet = 50M
    table_open_cache = 1024
    sort_buffer_size = 20M
    read_buffer_size = 20M
    read_rnd_buffer_size = 80M
    myisam_sort_buffer_size = 64M
    thread_cache_size = 32
    query_cache_size = 32M
    thread_concurrency = 4
    innodb_flush_method=O_DIRECT
    innodb_log_file_size=1G
    innodb_buffer_pool_size=6G
    innodb_autoinc_lock_mode=2
    wsrep_node_address=10.1.7.144
    wsrep_sst_method=xtrabackup
    wsrep_cluster_name=my_centos_cluster
    wsrep_sst_auth="sstuser:s3cret"
    [mysql]
    prompt=\\u@\\h [\\d]>\\_


    It is weird that both nodes went offline. Can someone suggest how to avoid this in the future, please?

  • #2
    There could be many causes for this type of problem. Make sure you are not setting values so high that the server could run out of resources (memory, disk space, etc.).
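
To put a rough number on the memory point, here is a minimal sketch that evaluates the formula printed in the crash log, `key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads`, with the values from the log and the posted my.cnf. The function name is made up for illustration. The result comes out slightly below the 8366310 K reported by mysqld, probably because the server adds some per-connection overhead that the printed formula does not show. Note that this estimate does not include the 6G InnoDB buffer pool.

```python
def estimate_mysqld_memory_kib(key_buffer_size, read_buffer_size,
                               sort_buffer_size, max_threads):
    """Evaluate the formula from the crash log; inputs in bytes, result in KiB."""
    total_bytes = key_buffer_size + (read_buffer_size + sort_buffer_size) * max_threads
    return total_bytes // 1024


MIB = 1024 * 1024

# Values taken from the crash log and the posted my.cnf.
estimate_kib = estimate_mysqld_memory_kib(
    key_buffer_size=2147483648,   # key_buffer_size = 2048M
    read_buffer_size=20971520,    # read_buffer_size = 20M
    sort_buffer_size=20 * MIB,    # sort_buffer_size = 20M (from my.cnf)
    max_threads=153,              # max_threads from the crash log
)

print(f"formula estimate: {estimate_kib} KiB")
print(f"about {estimate_kib / (1024 * 1024):.2f} GiB, before the InnoDB buffer pool")
```

Lowering the per-thread buffers (read_buffer_size, sort_buffer_size) shrinks the second term of the formula, which is multiplied by max_threads.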

    In your config above you have specified innodb_buffer_pool_size twice.
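
A repeated option like this is easy to miss by eye. Below is a small sketch of a checker that lists options appearing more than once within the same section of a my.cnf-style file. The function name and the sample text are illustrative only (the sample is an excerpt of the config posted above), and the parsing is deliberately simplified: it ignores `!include` directives and treats `-` and `_` in option names as equivalent. When an option is repeated, MySQL uses the last occurrence, so a duplicate with the same value (as here) is harmless but still worth cleaning up.

```python
from collections import Counter


def duplicate_options(cnf_text):
    """Return sorted (section, option) pairs that occur more than once."""
    section = ""
    seen = Counter()
    for raw in cnf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        # Track the current [section] header.
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        # Option name is everything before the first '='.
        name = line.split("=", 1)[0].strip().replace("-", "_")
        seen[(section, name)] += 1
    return sorted(key for key, count in seen.items() if count > 1)


# Excerpt of the my.cnf from the first post.
SAMPLE = """\
[mysqld]
datadir=/var/lib/mysql
innodb_buffer_pool_size=6G
key_buffer_size = 2048M
innodb_flush_method=O_DIRECT
innodb_log_file_size=1G
innodb_buffer_pool_size=6G
[mysql]
prompt=\\\\u@\\\\h [\\\\d]>\\\\_
"""

duplicates = duplicate_options(SAMPLE)
for section, name in duplicates:
    print(f"[{section}] {name} is set more than once")
```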

    For more info you can go through the links below (see if altering the values helps):
    http://www.mysqlperformanceblog.com/...zation-basics/
    http://www.mysqlperformanceblog.com/...fer_pool_size/



    • #3
      Thank you for your response, I will check these links. As for innodb_buffer_pool_size, it was an error in my my.cnf template.
