Percona Cluster: 2nd node can't start on "Ubuntu 12.04.1 LTS"

  • #1

    Hi,

    I have 3 servers running "Ubuntu 12.04.1 LTS", and I have installed Percona XtraDB Cluster server from the binary repositories using the apt-get method. Everything appears to be installed correctly, because the mysql server runs standalone on each node.
    The first node (with wsrep_cluster_address=gcomm://) starts without problems, and I can run mysql --defaults-file=/etc/mysql/debian.cnf -e "show status like 'wsrep%';" and get the expected output saying that a cluster is initialized with wsrep_cluster_size=1.

    But when I start the second (or third) node, "service mysql start" finishes with a [fail] message. Whether I try to start mysql with "/etc/init.d/mysql start" or "service mysql start", I get the message that the start failed.

    "my.cnf" file for the first node is:
    [client]
    port = 3306
    socket = /var/run/mysqld/mysqld.sock
    [mysqld]
    user = mysql
    default_storage_engine = InnoDB
    port = 3306
    pid-file = /var/run/mysqld/mysqld.pid
    socket = /var/run/mysqld/mysqld.sock
    # MyISAM
    key_buffer_size = 32M
    # SAFETY
    max_allowed_packet = 16M
    max_connect_errors = 1000
    # DATA STORAGE
    datadir = /var/opt/hosting/db
    # BINARY LOGGING
    log_bin = /var/opt/hosting/tmp/mysql-bin-log/log-bin-node01.log
    expire_logs_days = 10
    # INNODB
    innodb_flush_method = ALL_O_DIRECT
    innodb_log_files_in_group = 2
    innodb_log_file_size = 150M
    innodb_import_table_from_xtrabackup = 1
    innodb_flush_log_at_trx_commit = 1
    innodb_file_per_table = 1
    innodb_buffer_pool_size = 1G
    # LOGGING
    log-error = /var/opt/hosting/db/node1.err
    long_query_time = 2
    slow-query-log-file = /var/opt/hosting/log/mysql/mysql-slow-queries.node1.log
    # GALERA
    # Path to Galera library
    wsrep_provider = /usr/lib/libgalera_smm.so
    # Cluster connection URL contains the IPs of node#1, node#2 and node#3
    wsrep_cluster_address = gcomm://
    # In order for Galera to work correctly binlog format should be ROW
    binlog_format = ROW
    # This is a recommended tuning variable for performance
    innodb_locks_unsafe_for_binlog = 1
    # This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
    innodb_autoinc_lock_mode = 2
    # Node address
    wsrep_node_address = node1_IP
    # SST method
    wsrep_sst_method = xtrabackup
    # Cluster name
    wsrep_cluster_name = my_first_node
    # Authentication for SST method
    wsrep_sst_auth = "sst_userassword"
    -------------------------------------------------

    "my.cnf" file for the 2nd node:
    [client]
    port = 3306
    socket = /var/run/mysqld/mysqld.sock
    [mysqld]
    user = mysql
    default_storage_engine = InnoDB
    port = 3306
    pid-file = /var/run/mysqld/mysqld.pid
    socket = /var/run/mysqld/mysqld.sock
    # MyISAM
    key_buffer_size = 32M
    # SAFETY
    max_allowed_packet = 16M
    max_connect_errors = 1000
    # DATA STORAGE
    datadir = /var/opt/hosting/db
    # BINARY LOGGING
    log_bin = /var/opt/hosting/tmp/mysql-bin-log/log-bin-node02.log
    expire_logs_days = 10
    # INNODB
    innodb_flush_method = ALL_O_DIRECT
    innodb_log_files_in_group = 2
    innodb_log_file_size = 150M
    innodb_import_table_from_xtrabackup = 1
    innodb_flush_log_at_trx_commit = 1
    innodb_file_per_table = 1
    innodb_buffer_pool_size = 1G
    # LOGGING
    log-error = /var/opt/hosting/db/poolm/node2.err
    long_query_time = 2
    slow-query-log-file = /var/opt/hosting/log/mysql/mysql-slow-queries.node2.log
    # GALERA
    # Path to Galera library
    wsrep_provider = /usr/lib/libgalera_smm.so
    # Cluster connection URL contains the IPs of node#1, node#2 and node#3
    wsrep_cluster_address = gcomm://node1_IP
    # In order for Galera to work correctly binlog format should be ROW
    binlog_format = ROW
    # This is a recommended tuning variable for performance
    innodb_locks_unsafe_for_binlog = 1
    # This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
    innodb_autoinc_lock_mode = 2
    # Node address
    wsrep_node_address = node2_IP
    # SST method
    wsrep_sst_method = xtrabackup
    # Cluster name
    wsrep_cluster_name = my_first_node
    # Authentication for SST method
    wsrep_sst_auth = "sst_userassword"

    I followed the Percona XtraDB Cluster documentation: http://www.percona.com/doc/percona-x...ter/index.html

    During startup with the "/etc/init.d/mysql start" command on node2, it looks like the synchronization of this node (node2) with node1 takes more time than the 14 seconds that were defined. I think the "/etc/init.d/mysql start" command does not wait for this synchronization to finish before exiting with an error.

    What do you think about it? Can you help me with this setup? I need it urgently.

    Thanks in advance.

    PS : PERCONA SERVER VERSION : 5.5.30-30.2-log Percona Server (GPL), Release 30.2, wsrep_23.7.4.r3843

  • #2
    So -- what does the log say on the 2nd node? My guess would be this is some SST error, so you can also check the innobackup.backup.log file in the datadir on the first node for clues.
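A quick way to scan that log for failures is sketched below. The path is an assumption based on the datadir in the poster's my.cnf (/var/opt/hosting/db); the helper function name is hypothetical.

```shell
# Sketch: scan an SST/backup log for failure markers.
# On this setup the donor's datadir is /var/opt/hosting/db, so the file
# would be /var/opt/hosting/db/innobackup.backup.log (path is an example).
sst_errors() {
    # print lines that indicate a failed backup/SST; exit 0 either way
    grep -E 'ERROR|FATAL|failed' "$1" || true
}

# usage (hypothetical path):
# sst_errors /var/opt/hosting/db/innobackup.backup.log
```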


    • #3
      Thank you for your reply.
      For the logs, please see the attached files "node1log.txt" and "node2log.txt".
      I noticed that the "/etc/init.d/mysql" script deployed during installation of "percona-xtradb-cluster-server.5.5" was erroneous; see the attached file "etc_initd_mysql_deployed".
      I replaced it with the "etc_initd_mysql_replaced" file before my installation, and the 2nd node started well and joined the cluster, maybe because in this script the startup timeout is set to "service_startup_timeout=900" seconds.

      Expecting the 2nd node to synchronize with the cluster within 15 minutes does not seem very deterministic to me. If the data in the cluster amounts to 10 GB, 15 GB, 50 GB, or 1 TB..., 15 minutes seems too short. Wouldn't it be better to wait until the synchronization of the second node has finished instead of waiting a fixed 15 minutes?
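A back-of-envelope calculation supports this: a full SST copies the entire datadir over the network, so any fixed timeout only covers data up to a certain size. A sketch, with purely illustrative throughput numbers:

```shell
# Rough SST duration estimate: data size (GB) * 1024 / network throughput (MB/s).
# Integer shell arithmetic, so results are approximate.
sst_seconds() {
    data_gb=$1
    net_mb_per_s=$2
    echo $(( data_gb * 1024 / net_mb_per_s ))
}

sst_seconds 50 100     # 512 s  -- still under a 900 s timeout
sst_seconds 1024 100   # 10485 s -- far beyond it
```

So at roughly 1 TB of data, even the 900-second timeout from the replaced script would not be enough.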


      • #4
        Attachment Attachment


        • #5
          node2log.txt ==> Attachment


          • #6
            Based on your logs, I'm not sure I agree with the assessment that your problems are due to init script timeouts -- there are some clear crashes with stack traces in the log. I'd try clearing the datadir on your second node to see if that helps at all. I'd also check the innobackup.apply.log on that node to see if there were any errors recovering the Xtrabackup SST before the node starts up.


            • #7
              Before every installation I clean my datadir on the 2nd node, but that doesn't seem to solve my problem. Even before the 2nd node tries to connect to the cluster, I already have an error in the /var/opt/hosting/db/poolm/node2.err log:
              130704 16:27:40 Percona XtraDB (http://www.percona.com) 5.5.30-rel30.2 started; log sequence number 1597945
              ERROR: 1064 You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'ALTER TABLE user ADD column Show_view_priv enum('N','Y') CHARA
              130704 16:27:40 [ERROR] Aborting

              What do you think of the difference between the "etc_initd_mysql_deployed" and "etc_initd_mysql_replaced" files? (I got etc_initd_mysql_replaced from here: http://www.percona.com/downloads/Per.../linux/x86_64/) Why, during installation of "percona-xtradb-cluster-server.5.5", was "etc_initd_mysql_deployed" deployed in place of "etc_initd_mysql_replaced"?

              As in my previous post: if the data in the cluster amounts to 10 GB, 15 GB, 50 GB, or 1 TB, wouldn't it be better to wait until the synchronization of the second node has finished instead of waiting 15 minutes?


              • #8
                In my opinion, a check should be implemented in the "/etc/init.d/mysql" file to verify that synchronization of the 2nd node is in progress. In that case, which mechanism could be set up in this file to implement that?
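One possible mechanism, sketched below, would be to poll wsrep_local_state_comment until the node reports "Synced" instead of sleeping for a fixed interval. This is only a sketch, not the official init script; it assumes the /etc/mysql/debian.cnf credentials work on the joining node.

```shell
# Sketch: wait for Galera sync instead of a fixed timeout.
# Assumes /etc/mysql/debian.cnf holds working credentials on this node.
MYSQL_CMD="mysql --defaults-file=/etc/mysql/debian.cnf -N -s"

wait_for_sync() {
    timeout_secs=$1
    waited=0
    while [ "$waited" -lt "$timeout_secs" ]; do
        state=$($MYSQL_CMD -e "SHOW STATUS LIKE 'wsrep_local_state_comment'" \
                | awk '{print $2}')
        if [ "$state" = "Synced" ]; then
            return 0            # node is in sync with the cluster
        fi
        sleep 5
        waited=$((waited + 5))
    done
    return 1                    # still not synced after timeout_secs
}

# usage: wait_for_sync 900 || echo "node did not sync in time"
```

During an SST the node cannot answer queries at all, so in practice the loop would also have to tolerate connection errors until mysqld accepts connections.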


                • #9
                  I'm not a huge fan of all the "help" the init scripts try to give you in Ubuntu/Debian. In my experience, errors like "ERROR: 1064 You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'ALTER TABLE user ADD column Show_view_priv enum('N','Y') CHARA" happen on newly created datadirs, and you may be better off leaving a working datadir on node2 before you start it.

                  I can't comment on the inner workings of the init script. If you think there's something wrong there, then I'd direct you to: http://www.percona.com/doc/percona-x...bugreport.html


                  • #10
                    OK, thank you. For information, I tried installing the latest version of "percona-xtradb-cluster-server-5.5 (5.5.31-23.7.5-438.precise)" and I have no errors on node2 during its synchronization with node1. However, something is still wrong in the "/etc/init.d/mysql" script and I am going to report it.


                    • #11
                      A hardcoded timeout in the init script can be too short; this was addressed in Percona XtraDB Cluster version 5.5.31 - see this bug: https://bugs.launchpad.net/percona-x...r/+bug/1099428 The other thing you should make sure of is that newly joined nodes have the same credentials in /etc/mysql/debian.cnf as the primary node. This is also needed for the init script to work properly.
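Concretely, after an SST the joiner's /etc/mysql/debian.cnf must carry the same debian-sys-maint password as the donor's copy, since the init script authenticates with it. A sketch of the relevant section (the password value is a placeholder, not a real credential):

```ini
# /etc/mysql/debian.cnf -- must match the primary node after SST
[client]
host     = localhost
user     = debian-sys-maint
password = <same password as on node1>
socket   = /var/run/mysqld/mysqld.sock
```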
