Cluster doesn't accept wsrep_cluster_address in my.cnf


  • Cluster doesn't accept wsrep_cluster_address in my.cnf

    hi experts,

    I'm new to Percona cluster.
    I just installed it:

    sudo apt-get install percona-xtradb-cluster-client-5.5 \
    percona-xtradb-cluster-server-5.5 percona-xtrabackup

    and am now trying to get it running, so far without success.

    I created /etc/mysql/my.cnf on all 3 nodes; here is the one from the first node:

    [client]
    port = 3306
    socket = /var/run/mysqld/mysqld.sock

    [mysqld_safe]
    socket = /var/run/mysqld/mysqld.sock
    nice = 0

    [mysqld]

    datadir=/var/lib/mysql/

    # Path to Galera library
    wsrep_provider=/usr/lib64/libgalera_smm.so
    # Cluster connection URL contains the IPs of node#1, node#2 and node#3
    #wsrep_cluster_address=gcomm://xxx.xx.xxx.xx1,xxx.xx.xxx.xx2,xxx.xx.xxx.xx3
    # In order for Galera to work correctly binlog format should be ROW
    binlog_format=ROW
    # MyISAM storage engine has only experimental support
    default_storage_engine=InnoDB
    # This is a recommended tuning variable for performance
    innodb_locks_unsafe_for_binlog=1
    # This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
    innodb_autoinc_lock_mode=2
    # Node #1 address
    wsrep_node_address=xxx.xx.xxx.xx1
    # SST method
    wsrep_sst_method=xtrabackup
    # Cluster name
    wsrep_cluster_name=my_debian_cluster
    # Authentication for SST method
    wsrep_sst_auth="username:password"

    With wsrep_cluster_address commented out like this, mysql starts. But when I uncomment it:
    wsrep_cluster_address=gcomm://xxx.xx.xxx.xx1,xxx.xx.xxx.xx2,xxx.xx.xxx.xx3
    mysql won't start.

    the first node was started this way:
    /etc/init.d/mysql start --wsrep-cluster-address="gcomm://"

    the two others this way:
    /etc/init.d/mysql start

    So none of the nodes accepts the wsrep_cluster_address option in my.cnf.

    OS: Debian 6

    mysql> show status like 'wsrep%'; on all 3 nodes looks like this:
    +----------------------------+-----------------------------------+
    | Variable_name | Value |
    +----------------------------+-----------------------------------+
    | wsrep_local_state_uuid | |
    | wsrep_protocol_version | 18446744073709551615 |
    | wsrep_last_committed | 18446744073709551615 |
    | wsrep_replicated | 0 |
    | wsrep_replicated_bytes | 0 |
    | wsrep_received | 0 |
    | wsrep_received_bytes | 0 |
    | wsrep_local_commits | 0 |
    | wsrep_local_cert_failures | 0 |
    | wsrep_local_bf_aborts | 0 |
    | wsrep_local_replays | 0 |
    | wsrep_local_send_queue | 0 |
    | wsrep_local_send_queue_avg | 0.000000 |
    | wsrep_local_recv_queue | 0 |
    | wsrep_local_recv_queue_avg | 0.000000 |
    | wsrep_flow_control_paused | 0.000000 |
    | wsrep_flow_control_sent | 0 |
    | wsrep_flow_control_recv | 0 |
    | wsrep_cert_deps_distance | 0.000000 |
    | wsrep_apply_oooe | 0.000000 |
    | wsrep_apply_oool | 0.000000 |
    | wsrep_apply_window | 0.000000 |
    | wsrep_commit_oooe | 0.000000 |
    | wsrep_commit_oool | 0.000000 |
    | wsrep_commit_window | 0.000000 |
    | wsrep_local_state | 0 |
    | wsrep_local_state_comment | Initialized |
    | wsrep_cert_index_size | 0 |
    | wsrep_causal_reads | 0 |
    | wsrep_incoming_addresses | |
    | wsrep_cluster_conf_id | 18446744073709551615 |
    | wsrep_cluster_size | 0 |
    | wsrep_cluster_state_uuid | |
    | wsrep_cluster_status | Disconnected |
    | wsrep_connected | OFF |
    | wsrep_local_index | 18446744073709551615 |
    | wsrep_provider_name | Galera |
    | wsrep_provider_vendor | Codership Oy <info@codership.com> |
    | wsrep_provider_version | 2.5(r150) |
    | wsrep_ready | OFF |

    What am I doing wrong?
    Thank you!

  • #2
    Now I can start the first node with:
    wsrep_cluster_address=gcomm://
    in my.cnf.

    However, the documentation recommends:
    After this single-node cluster is started, variable wsrep_cluster_address should be updated to the list of all nodes in the cluster. For example:
    wsrep_cluster_address=gcomm://192.168.70.2,192.168.70.3,192.168.70.4

    So I stop mysql, change it in my.cnf to wsrep_cluster_address=gcomm://xxx.xx.xxx.xx1,xxx.xx.xxx.xx2,xxx.xx.xxx.xx3,
    and it fails again!
    Where is the mistake?



    • #3
      Are all of your nodes (including the first node) in the Initialized state? Please paste your logs.



      • #4
        here you go



        • #5
          attachments from 3 nodes



          • #6
            the attachment function in this forum doesn't seem to work



            • #7
              Hmm, I'll let someone know. In the meantime, you can put them on pastebin or sprunge.us (or similar) and just paste the links.



              • #8
                https://www.dropbox.com/sh/odcod4fnwivcx6q/EvD7g8Y9Ti



                • #9
                  3 uploaded files here



                  • #10
                    1 file here



                    • #11
                      OK, that doesn't work; please use the Dropbox link.
                      By the way, this forum's threads can't be viewed in the Chrome browser, at least in the Chrome version for Linux.



                      • #12
                        no worries, but bump



                        • #13
                          Sorry for the delayed reply here Zuri.

                          Rereading your comments, I think you're confused about how to bootstrap the cluster. The first node must be bootstrapped by providing 'wsrep_cluster_address=gcomm://'.
                          This first node should have a status like this:

                          mysql> show status like 'wsrep%';
                          +----------------------------+-----------------------------------+
                          | Variable_name | Value |
                          +----------------------------+-----------------------------------+
                          | wsrep_local_state_comment | Synced |
                          | wsrep_cluster_conf_id | 1 |
                          | wsrep_cluster_size | 1 |
                          | wsrep_cluster_status | Primary |
                          | wsrep_connected | ON |
                          | wsrep_ready | ON |

                          After this node is started in this state, you can start the other nodes with the full wsrep_cluster_address. They *should* SST and join the cluster (you should see the cluster size increase, and all nodes in the Synced and Primary states).

                          After you get the other nodes started, you do want to make sure the first node's my.cnf has your full cluster address (a restart is not required after you get the other nodes up).
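                          To recap, the whole sequence can be sketched like this (the IPs in the last line are placeholders for your own addresses):

                          # node 1 only: bootstrap the cluster
                          /etc/init.d/mysql start --wsrep-cluster-address="gcomm://"

                          # nodes 2 and 3: my.cnf already lists all members, so a plain start is enough
                          /etc/init.d/mysql start

                          # finally, set node 1's my.cnf to the full list for future restarts (no restart needed now):
                          # wsrep_cluster_address=gcomm://node1_ip,node2_ip,node3_ip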



                          • #14
                            hi percona.jayj,

                            I've given it one more try.

                            first node's my.cnf:
                            wsrep_cluster_address=gcomm://
                            started like this:
                            /etc/init.d/mysql start --wsrep-cluster-address="gcomm://"

                            so far, so good, and

                            mysql> show status like 'wsrep%';
                            +----------------------------+-----------------------------------+
                            | Variable_name | Value |
                            +----------------------------+-----------------------------------+
                            | wsrep_local_state_comment | Synced |
                            | wsrep_cluster_conf_id | 1 |
                            | wsrep_cluster_size | 1 |
                            | wsrep_cluster_status | Primary |
                            | wsrep_connected | ON |
                            | wsrep_ready | ON |

                            looks ok.

                            second node's my.cnf:
                            wsrep_cluster_address=gcomm://first_node_ip_here,second_node_ip_here

                            second node started like this:
                            /etc/init.d/mysql start

                            The start was not successful. The error log shows:

                            130708 15:41:28 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
                            130708 15:41:28 mysqld_safe WSREP: Running position recovery with --log_error=/tmp/tmp.QIFh4U6aiJ
                            130708 15:41:33 mysqld_safe WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
                            130708 15:41:33 [Note] WSREP: wsrep_start_position var submitted: '00000000-0000-0000-0000-000000000000:-1'
                            130708 15:41:33 [Note] WSREP: Read nil XID from storage engines, skipping position init
                            130708 15:41:33 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/libgalera_smm.so'
                            130708 15:41:33 [Note] WSREP: wsrep_load(): Galera 2.5(r150) by Codership Oy <info@codership.com> loaded succesfully.
                            130708 15:41:33 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
                            130708 15:41:33 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
                            130708 15:41:33 [Note] WSREP: Passing config to GCS: base_host = second_node_ip_here; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
                            130708 15:41:33 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
                            130708 15:41:33 [Note] WSREP: wsrep_sst_grab()
                            130708 15:41:33 [Note] WSREP: Start replication
                            130708 15:41:33 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
                            130708 15:41:33 [Note] WSREP: protonet asio version 0
                            130708 15:41:33 [Note] WSREP: backend: asio
                            130708 15:41:33 [Note] WSREP: GMCast version 0
                            130708 15:41:33 [Note] WSREP: (1a786454-e7d4-11e2-0800-4bed3fdb41d0, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
                            130708 15:41:33 [Note] WSREP: (1a786454-e7d4-11e2-0800-4bed3fdb41d0, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
                            130708 15:41:33 [Note] WSREP: EVS version 0
                            130708 15:41:33 [Note] WSREP: PC version 0
                            130708 15:41:33 [Note] WSREP: gcomm: connecting to group 'my_wsrep_cluster', peer 'first_node_ip_here:,second_node_ip_here:'
                            130708 15:41:33 [Warning] WSREP: (1a786454-e7d4-11e2-0800-4bed3fdb41d0, 'tcp://0.0.0.0:4567') address 'tcp://second_node_ip_here:4567' points to own listening address, blacklisting
                            130708 15:41:36 [Warning] WSREP: no nodes coming from prim view, prim not possible
                            130708 15:41:36 [Note] WSREP: view(view_id(NON_PRIM,1a786454-e7d4-11e2-0800-4bed3fdb41d0,1) memb {
                            1a786454-e7d4-11e2-0800-4bed3fdb41d0,
                            } joined {
                            } left {
                            } partitioned {
                            })
                            130708 15:41:37 [Warning] WSREP: last inactive check more than PT1.5S ago, skipping check
                            130708 15:42:06 [Note] WSREP: view((empty))
                            130708 15:42:06 [ERROR] WSREP: failed to open gcomm backend connection: 110: failed to reach primary view: 110 (Connection timed out)
                            at gcomm/src/pc.cpp:connect():139
                            130708 15:42:06 [ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():195: Failed to open backend connection: -110 (Connection timed out)
                            130708 15:42:06 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1290: Failed to open channel 'my_wsrep_cluster' at 'gcomm://first_node_ip_here,second_node_ip_here': -110 (Connection timed out)
                            130708 15:42:06 [ERROR] WSREP: gcs connect failed: Connection timed out
                            130708 15:42:06 [ERROR] WSREP: wsrep::connect() failed: 6
                            130708 15:42:06 [ERROR] Aborting

                            130708 15:42:06 [Note] WSREP: Service disconnected.
                            130708 15:42:07 [Note] WSREP: Some threads may fail to exit.
                            130708 15:42:07 [Note] /usr/sbin/mysqld: Shutdown complete

                            130708 15:42:07 mysqld_safe mysqld from pid file /var/lib/mysql/first_node_dnsname_here.pid ended
                            130708 15:42:56 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
                            130708 15:42:56 mysqld_safe WSREP: Running position recovery with --log_error=/tmp/tmp.LgQydon7i3
                            130708 15:43:01 mysqld_safe WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
                            130708 15:43:01 [Note] WSREP: wsrep_start_position var submitted: '00000000-0000-0000-0000-000000000000:-1'
                            130708 15:43:01 [Note] WSREP: Read nil XID from storage engines, skipping position init
                            130708 15:43:01 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/libgalera_smm.so'
                            130708 15:43:01 [Note] WSREP: wsrep_load(): Galera 2.5(r150) by Codership Oy <info@codership.com> loaded succesfully.
                            130708 15:43:01 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
                            130708 15:43:01 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
                            130708 15:43:01 [Note] WSREP: Passing config to GCS: base_host = second_node_ip_here; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
                            130708 15:43:01 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
                            130708 15:43:01 [Note] Plugin 'FEDERATED' is disabled.
                            130708 15:43:01 InnoDB: The InnoDB memory heap is disabled
                            130708 15:43:01 InnoDB: Mutexes and rw_locks use GCC atomic builtins
                            130708 15:43:01 InnoDB: Compressed tables use zlib 1.2.3
                            130708 15:43:01 InnoDB: Using Linux native AIO
                            130708 15:43:01 InnoDB: Initializing buffer pool, size = 128.0M
                            130708 15:43:01 InnoDB: Completed initialization of buffer pool
                            130708 15:43:01 InnoDB: highest supported file format is Barracuda.
                            130708 15:43:02 InnoDB: Waiting for the background threads to start
                            130708 15:43:03 Percona XtraDB (http://www.percona.com) 5.5.30-rel30.2 started; log sequence number 1598139
                            130708 15:43:03 [Note] Event Scheduler: Loaded 0 events
                            130708 15:43:03 [Note] /usr/sbin/mysqld: ready for connections.
                            Version: '5.5.30-30.2' socket: '/var/run/mysqld/mysqld.sock' port: 3306 Percona Server (GPL), Release 30.2, wsrep_23.7.4.r3843

                            After deleting wsrep_cluster_address=gcomm://first_node_ip_here,second_node_ip_here
                            from my.cnf on the second node, mysql on the second node starts successfully.

                            Any thoughts or ideas?

                            regards,
                            zuri



                            • #15
                              Can your second node connect to your first node on TCP port 4567? It looks to me like it cannot.
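                              One quick way to check (assuming nc is available on Debian 6; 4567 is Galera's default group-communication port, and the address is a placeholder):

                              # run on the second node, against the first node's address
                              nc -z -v first_node_ip_here 4567

                              # if it times out, inspect the firewall rules on the first node, e.g.:
                              sudo iptables -L -n | grep 4567

                              If the port is blocked, open TCP 4567 between all nodes (and 4444/4568 for SST/IST).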

