Unable to initialize cluster: "terminate called after throwing an instance of 'gu::NotFound'"

  • Unable to initialize cluster: "terminate called after throwing an instance of 'gu::NotFound'"

    Greetings,

    I'm at my wits' end trying to get this working. I think I've tried everything short of downgrading to 5.5.20. Below you'll find some details about my setup and the other resources I've consulted.

    Pretty sure there's just some little config setting I'm missing, but can't figure it out!

    Thanks!

    Erik Osterman

    ---------------------------------------------------------------

    Using RPMs downloaded directly from Percona:


    Percona-XtraDB-Cluster-client-5.5.23-23.5.333.rhel5
    Percona-XtraDB-Cluster-devel-5.5.23-23.5.333.rhel5
    Percona-XtraDB-Cluster-galera-2.0-1.109.rhel5
    Percona-XtraDB-Cluster-server-5.5.23-23.5.333.rhel5
    Percona-XtraDB-Cluster-shared-5.5.23-23.5.333.rhel5


    With boost 1.41:


    boost141-program-options-1.41.0-2.el5


    With the following interface:


    # ifconfig | grep 10.254.167.5
    inet addr:10.254.167.5 Bcast:10.254.167.255 Mask:255.255.254.0



    Using the following wsrep.cnf:

    [mysqld]
    wsrep_provider=/usr/lib/libgalera_smm.so
    wsrep_cluster_address=gcomm://
    wsrep_slave_threads=2
    wsrep_cluster_name=sentry
    wsrep_sst_method=rsync
    wsrep_node_name=sentry1
    wsrep_node_address=10.254.167.5
    wsrep_sst_receive_address=10.254.167.5
    wsrep_provider_options="gmcast.listen_addr=10.254.167.5; ist.recv_addr=10.254.167.5"
    binlog_format=ROW
    innodb_locks_unsafe_for_binlog=1
    innodb_autoinc_lock_mode=2


    Fresh install of CentOS 5.4 (i386).
    No iptables.
    No selinux.

    Found these two related postings, neither of which seems to fix my problems:
    https://bugs.launchpad.net/percona-xtradb-cluster/+bug/914976
    http://forum.percona.com/index.php?t=msg&goto=8318&

    Checked the FAQs:
    http://www.codership.com/wiki/doku.php?id=faq
    http://www.percona.com/doc/percona-x...uster/faq.html


    This cluster has never been started and no data has even been loaded other than that which gets installed by mysql_install_db.


    Error log below:


    120605 21:52:39 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
    120605 21:52:39 [Note] Flashcache bypass: disabled
    120605 21:52:39 [Note] Flashcache setup error is : ioctl failed
    120605 21:52:39 [Note] WSREP: Read nil XID from storage engines, skipping position init
    120605 21:52:39 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
    120605 21:52:39 [Note] WSREP: wsrep_load(): Galera 2.1dev(r109) by Codership Oy loaded succesfully.
    120605 21:52:39 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
    120605 21:52:39 [Note] WSREP: Passing config to GCS: base_host = 10.254.167.5; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 0.5; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 2147483647; gcs.recv_q_soft_limit = 0.25; gmcast.listen_addr = 10.254.167.5; ist.recv_addr = 10.254.167.5; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
    120605 21:52:39 [Note] WSREP: wsrep_sst_grab()
    120605 21:52:39 [Note] WSREP: Start replication
    120605 21:52:39 [Warning] WSREP: state file not found: /var/lib/mysql//grastate.dat
    120605 21:52:39 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
    120605 21:52:39 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
    terminate called after throwing an instance of 'gu::NotFound'
    04:52:39 UTC - mysqld got signal 6 ;


    Contents of /var/lib/mysql/



    # ls -al /var/lib/mysql/
    total 280600
    drwxr-xr-x 5 mysql mysql 4096 2012-06-05 21:52 .
    drwxr-xr-x 3 27 27 50 2012-06-04 16:11 ..
    -rw------- 1 mysql mysql 134219040 2012-06-05 19:27 galera.cache
    -rw-rw---- 1 mysql mysql 18874368 2012-06-05 00:53 ibdata1
    -rw-rw---- 1 mysql mysql 67108864 2012-06-05 00:53 ib_logfile0
    -rw-rw---- 1 mysql mysql 67108864 2012-06-04 16:11 ib_logfile1
    drwx------ 2 mysql mysql 4096 2012-06-04 16:11 mysql
    drwx------ 2 mysql mysql 4096 2012-06-05 00:30 performance_schema
    -rw-r--r-- 1 root root 347 2012-06-05 00:30 RPM_UPGRADE_HISTORY
    -rw-r--r-- 1 root root 347 2012-06-05 00:30 RPM_UPGRADE_MARKER-LAST
    drwx------ 2 mysql mysql 6 2012-06-04 16:11 test

  • #2
    Ahh, CentOS 5.4! Why 5.4, may I enquire, when there is 5.8 already?

    But, besides 5.4 having a buggy libstdc++, you have really over-configured. You have also made a subtle but fatal configuration error: it triggers an exception that fails to be caught on CentOS 5.4.

    The error (well, not really an error, but CentOS 5.4 makes it such) is in

    gmcast.listen_addr=10.254.167.5

    it should be

    gmcast.listen_addr=tcp://10.254.167.5


    But, unless you have some very specific needs, all you need to set up is wsrep_node_address. It will be used for everything, unless explicitly overridden.

    Regards,
    Alex
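
For readers checking their own configuration, the missing-scheme issue described in this reply can be spotted before mysqld is started. What follows is a minimal Python sketch, not part of Galera or Percona tooling. The helper names `parse_provider_options` and `missing_scheme` are hypothetical, and the sketch assumes `wsrep_provider_options` is a semicolon-separated list of `key=value` pairs, as in the wsrep.cnf posted above.

```python
def parse_provider_options(raw):
    """Split a wsrep_provider_options string into a dict of key -> value."""
    options = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, _, value = pair.partition("=")
        options[key.strip()] = value.strip()
    return options


def missing_scheme(options, key="gmcast.listen_addr"):
    """Return True if the option is set but carries no URI scheme (e.g. tcp://)."""
    value = options.get(key)
    return value is not None and "://" not in value


# Value from the original wsrep.cnf in this thread.
original = parse_provider_options(
    "gmcast.listen_addr=10.254.167.5; ist.recv_addr=10.254.167.5"
)
# Value with the scheme suggested in the reply.
corrected = parse_provider_options("gmcast.listen_addr=tcp://10.254.167.5")

print(missing_scheme(original))   # True
print(missing_scheme(corrected))  # False
```

If only `wsrep_node_address` is set and `wsrep_provider_options` is left out, as the reply recommends, there is nothing for such a check to flag.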

    • #3
      Alex,

      Thanks so much for the explanation of what's going wrong. I tried with and without the tcp:// scheme for "gmcast.listen_addr" with the same unfortunate outcome.

      I simplified the configuration as you suggested to the one below:


      [mysqld]
      datadir=/var/lib/mysql
      user=mysql
      binlog_format=ROW
      wsrep_debug=1
      wsrep_provider=/usr/lib/libgalera_smm.so
      wsrep_cluster_address="gcomm://"
      wsrep_slave_threads=2
      wsrep_node_name=node1
      wsrep_node_address=10.254.167.5
      wsrep_cluster_name=cluster
      wsrep_sst_method=rsync
      bind_address=0.0.0.0
      default_storage_engine=InnoDB
      innodb_locks_unsafe_for_binlog=1
      innodb_autoinc_lock_mode=2
      max_connections=10
      max_allowed_packet=32M
      table_cache=2048
      thread_cache_size=32
      query_cache_size=256M
      innodb_buffer_pool_size=128M
      sort_buffer_size=64M
      read_rnd_buffer_size=2M
      innodb_log_file_size = 64M
      innodb_log_buffer_size = 8M
      log-bin=/mnt/mysql-binlogs/mysql-bin
      log-bin=/mnt/mysql-binlogs/mysql-bin.index
      relay-log=/mnt/mysql-binlogs/mysql-relay-bin
      relay-log-index=/mnt/mysql-binlogs/mysql-relay-bin.index



      It terminated with the same outcome:


      # /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql
      120607 12:07:56 [Note] Flashcache bypass: disabled
      120607 12:07:56 [Note] Flashcache setup error is : ioctl failed
      120607 12:07:56 [Note] WSREP: Read nil XID from storage engines, skipping position init
      120607 12:07:56 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
      120607 12:07:56 [Note] WSREP: wsrep_load(): Galera 2.1dev(r112) by Codership Oy loaded succesfully.
      120607 12:07:56 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
      120607 12:07:56 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
      120607 12:07:56 [Note] WSREP: Passing config to GCS: base_host = 10.254.167.5; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 0.5; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 2147483647; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
      120607 12:07:57 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
      120607 12:07:57 [Note] WSREP: wsrep_sst_grab()
      120607 12:07:57 [Note] WSREP: Start replication
      120607 12:07:57 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
      terminate called after throwing an instance of 'gu::NotFound'
      19:07:57 UTC - mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail.
      key_buffer_size=0
      read_buffer_size=131072
      max_used_connections=0
      max_threads=10
      thread_count=0
      connection_count=0
      It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 656721 K bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
      Thread pointer: 0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0 thread_stack 0x30000
      /usr/sbin/mysqld(my_print_stacktrace+0x33)[0x8409693]
      /usr/sbin/mysqld(handle_fatal_signal+0x48c)[0x82d309c]
      [0xc5d420]
      /lib/i686/nosegneg/libc.so.6(abort+0x101)[0x941a21]
      /usr/lib/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x150)[0x20c4d0]
      /usr/lib/libstdc++.so.6[0x209f35]
      /usr/lib/libstdc++.so.6[0x209f72]
      /usr/lib/libstdc++.so.6[0x20a0aa]
      /usr/lib/libgalera_smm.so(_ZNK2gu3URI10get_optionERKSs+0xf6)[0x4c73e6]
      /usr/lib/libgalera_smm.so(_ZN9GCommConnC2ERKN2gu3URIERNS0_6ConfigE+0x25d)[0x5b8d2d]
      /usr/lib/libgalera_smm.so(gcs_gcomm_create+0xd4)[0x5b41b4]
      /usr/lib/libgalera_smm.so(gcs_backend_init+0xa9)[0x5a1cb9]
      /usr/lib/libgalera_smm.so(gcs_core_open+0x6f)[0x5a7eaf]
      /usr/lib/libgalera_smm.so(gcs_open+0x2c8)[0x5aeb58]
      /usr/lib/libgalera_smm.so(_ZN6galera13ReplicatorSMM7connectERKSsS2_S2_+0x296)[0x5f97d6]
      /usr/lib/libgalera_smm.so(galera_connect+0xae)[0x6146ae]
      /usr/sbin/mysqld(_Z23wsrep_start_replicationv+0x107)[0x8289cb7]
      /usr/sbin/mysqld(_Z18wsrep_init_startupb+0x7c)[0x828aa9c]
      /usr/sbin/mysqld[0x813b069]
      /usr/sbin/mysqld(_Z11mysqld_mainiPPc+0xa22)[0x813d402]
      /usr/sbin/mysqld(main+0x27)[0x8130e27]
      /lib/i686/nosegneg/libc.so.6(__libc_start_main+0xdc)[0x92ce9c]
      /usr/sbin/mysqld[0x8130d41]
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.


      I noticed that it's always in the "get_option" call that it appears to crash, so I agree with you that there's some "fatal error in configuration"!

      As for why we're running CentOS 5.4, it's just due to a very large infrastructure already built on top of it. With over 26G of RPMs comprising 12K packages, an OS upgrade is a project we keep pushing off! That said, much of the reason for all the packages is that we've upgraded the OS well beyond 5.4 by backporting RPMs. Of course, libstdc++ is not one of them =).

      If upgrading to 5.8 might alleviate some of these issues, I can try it out for these particular servers. Also, fwiw, this is on EC2.

      -Erik

      • #4
        Hi,

        So it looks like there are more places where catching this exception fails. Basically this is happening when the code is checking for some option, and if it is not found in user-supplied configuration, it uses the default. By trial and error we can find all options that must be set to avoid this, but you can imagine that if the catch() statement fails there, it may as well fail elsewhere in a critical path. So "workability" of Galera on CentOS 5.4 can't be guaranteed. I suggest that you upgrade the distribution.

        I may be mistaken, but CentOS 5.8 should be fully binary compatible with 5.4, so all your existing packages should work fine in 5.8 (except when they rely on bugs for operation).

        EC2 is totally fine. We do it all the time.

        Regards,
        Alex
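
The lookup pattern described in this reply, where a missing option raises a not-found exception that the caller is expected to catch and replace with a default, can be illustrated as follows. This is a Python sketch of the pattern only, not Galera's C++ code. `NotFound`, `get_option`, `option_or_default`, and the default values shown are hypothetical stand-ins chosen for illustration.

```python
class NotFound(Exception):
    """Raised when a requested option is absent (stand-in for gu::NotFound)."""


def get_option(options, key):
    """Return the user-supplied value for key, or raise NotFound."""
    try:
        return options[key]
    except KeyError:
        raise NotFound(key) from None


def option_or_default(options, key, default):
    """Look up key and fall back to default when the lookup raises NotFound."""
    try:
        return get_option(options, key)
    except NotFound:
        # If this handler were never reached, the exception would propagate
        # and end the process, which matches the abort in the backtrace.
        return default


user_options = {"gmcast.listen_addr": "tcp://10.254.167.5"}

# Option supplied by the user: the supplied value is returned.
print(option_or_default(user_options, "gmcast.listen_addr", "illustrative-default"))
# Option not supplied: the handler runs and the default is returned.
print(option_or_default(user_options, "some.unset.option", "illustrative-default"))
```

When the handler works, an unset option is harmless. Per the diagnosis above, on CentOS 5.4 the equivalent catch is not reached, so any option left at its default can abort mysqld.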

        • #5
          Thanks! I'll work on getting a 5.8 image set up soon and see how it goes from there.

          -Erik

          • #6
            Upgraded to a CentOS 5.6 image (we had one ready to go), and Percona XtraDB starts up just fine and is listening on TCP:4567 with the same configuration that failed on 5.4.

            Thanks @ayurchen again for the help!

            -Erik
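
One way to confirm the outcome reported here, a node accepting connections on TCP port 4567, is a plain connection test. The sketch below is a generic Python check, not a Galera tool. The function name `is_listening` is hypothetical, and the host shown is a placeholder to be replaced with the node's own address (10.254.167.5 in this thread).

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder host: substitute the node address being checked.
    print(is_listening("127.0.0.1", 4567))
```

A successful connection only shows that something accepts connections on that port; the mysqld error log remains the place to confirm that the node actually joined or formed a cluster.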
