rsync or xtrabackup in wsrep_sst_method?

  • rsync or xtrabackup in wsrep_sst_method?

    I just installed XtraDB Cluster on 3 nodes, but I have some problems:

    - If I use xtrabackup, nodes 2 and 3 refuse to start.
    - If I use rsync, these 2 nodes start, but when I create a database on one node, it is not replicated to the others.

    Can someone please help me find out what is wrong with my configuration?

    Thanks a lot.


    Here is the configuration I have in my.cnf

    node 1

    wsrep_provider_options="gmcast.listen_addr=tcp://0.0.0.0:4567"
    wsrep_cluster_address=gcomm://


    datadir=/var/lib/mysql
    user=mysql
    # Path to Galera library
    wsrep_provider=/usr/lib64/libgalera_smm.so
    # Cluster connection URL contains the IPs of node#1, node#2 and node#3

    # In order for Galera to work correctly binlog format should be ROW
    binlog_format=ROW
    # MyISAM storage engine has only experimental support
    default_storage_engine=InnoDB
    # This is a recommended tuning variable for performance
    innodb_locks_unsafe_for_binlog=1
    # This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
    innodb_autoinc_lock_mode=2
    # Node #1 address
    wsrep_node_address=@IP node 1
    wsrep_node_name=node1
    # SST method
    wsrep_sst_method=rsync
    # Cluster name
    wsrep_cluster_name=my_cluster
    # Authentication for SST method
    wsrep_sst_auth="sstuser:mdp"
    server-id = 1

    node 2

    wsrep_provider_options="gmcast.listen_addr=tcp://0.0.0.0:4567"
    wsrep_cluster_address=gcomm://@IP node1


    datadir=/var/lib/mysql
    user=mysql
    # Path to Galera library
    wsrep_provider=/usr/lib64/libgalera_smm.so

    # In order for Galera to work correctly binlog format should be ROW
    binlog_format=ROW
    # MyISAM storage engine has only experimental support
    default_storage_engine=InnoDB
    # This is a recommended tuning variable for performance
    innodb_locks_unsafe_for_binlog=1
    # This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
    innodb_autoinc_lock_mode=2
    # Node #2 address
    wsrep_node_address=@IP node2
    # SST method
    wsrep_sst_method=rsync
    # Cluster name
    wsrep_cluster_name=my_cluster
    # Authentication for SST method
    wsrep_sst_auth="sstuser:mdp"
    server-id = 2


    node 3

    wsrep_provider_options="gmcast.listen_addr=tcp://0.0.0.0:4567"
    wsrep_cluster_address=gcomm://@IP node1

    datadir=/var/lib/mysql
    user=mysql
    # Path to Galera library
    wsrep_provider=/usr/lib64/libgalera_smm.so

    # In order for Galera to work correctly binlog format should be ROW
    binlog_format=ROW
    # MyISAM storage engine has only experimental support
    default_storage_engine=InnoDB
    # This is a recommended tuning variable for performance
    innodb_locks_unsafe_for_binlog=1
    # This changes how InnoDB autoincrement locks are managed and is a requirement for Galera
    innodb_autoinc_lock_mode=2
    # Node #3 address
    wsrep_node_address=@IP node3
    # SST method
    wsrep_sst_method=rsync
    # Cluster name
    wsrep_cluster_name=my_cluster
    # Authentication for SST method
    wsrep_sst_auth="sstuser:mdp"
    server-id = 3



    When I try to start nodes 2 and 3, this is the log I get:

    130710 16:30:02 mysqld_safe mysqld from pid file /var/lib/mysql/node2.pid ended
    130710 16:39:08 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
    130710 16:39:08 mysqld_safe WSREP: Running position recovery with --log_error=/tmp/tmp.RtwzeB83XD
    130710 16:39:13 mysqld_safe WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
    130710 16:39:13 [Note] WSREP: wsrep_start_position var submitted: '00000000-0000-0000-0000-000000000000:-1'
    130710 16:39:13 [Note] WSREP: Read nil XID from storage engines, skipping position init
    130710 16:39:13 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/libgalera_smm.so'
    130710 16:39:13 [Note] WSREP: wsrep_load(): Galera 2.5(r150) by Codership Oy <info@codership.com> loaded succesfully.
    130710 16:39:13 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
    130710 16:39:13 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.
    130710 16:39:13 [Note] WSREP: Passing config to GCS: base_host = @IPnode2; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3
    130710 16:39:13 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
    130710 16:39:13 [Note] WSREP: wsrep_sst_grab()
    130710 16:39:13 [Note] WSREP: Start replication
    130710 16:39:13 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
    130710 16:39:13 [Note] WSREP: protonet asio version 0
    130710 16:39:13 [Note] WSREP: backend: asio
    130710 16:39:13 [Note] WSREP: GMCast version 0
    130710 16:39:13 [Note] WSREP: (7d9463fd-e96e-11e2-0800-2d863130a295, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
    130710 16:39:13 [Note] WSREP: (7d9463fd-e96e-11e2-0800-2d863130a295, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
    130710 16:39:13 [Note] WSREP: EVS version 0
    130710 16:39:13 [Note] WSREP: PC version 0
    130710 16:39:13 [Note] WSREP: gcomm: connecting to group 'my_cluster', peer '@IPnode1:'
    130710 16:39:14 [Note] WSREP: declaring 43070462-e96e-11e2-0800-a8d5b7bb6865 stable
    130710 16:39:14 [Note] WSREP: Node 43070462-e96e-11e2-0800-a8d5b7bb6865 state prim
    130710 16:39:14 [Note] WSREP: view(view_id(PRIM,43070462-e96e-11e2-0800-a8d5b7bb6865,2) memb {
    43070462-e96e-11e2-0800-a8d5b7bb6865,
    7d9463fd-e96e-11e2-0800-2d863130a295,
    } joined {
    } left {
    } partitioned {
    })
    130710 16:39:14 [Note] WSREP: gcomm: connected
    130710 16:39:14 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
    130710 16:39:14 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
    130710 16:39:14 [Note] WSREP: Opened channel 'my_cluster'
    130710 16:39:14 [Note] WSREP: Waiting for SST to complete.
    130710 16:39:14 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
    130710 16:39:14 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
    130710 16:39:14 [Note] WSREP: STATE EXCHANGE: sent state msg: 7de25592-e96e-11e2-0800-6c7412230421
    130710 16:39:14 [Note] WSREP: STATE EXCHANGE: got state msg: 7de25592-e96e-11e2-0800-6c7412230421 from 0 (node1)
    130710 16:39:14 [Note] WSREP: STATE EXCHANGE: got state msg: 7de25592-e96e-11e2-0800-6c7412230421 from 1 (node2)
    130710 16:39:14 [Note] WSREP: Quorum results:
    version = 2,
    component = PRIMARY,
    conf_id = 1,
    members = 1/2 (joined/total),
    act_id = 0,
    last_appl. = -1,
    protocols = 0/4/2 (gcs/repl/appl),
    group UUID = bce764fb-e884-11e2-0800-f7e913b7323c
    130710 16:39:14 [Note] WSREP: Flow-control interval: [23, 23]
    130710 16:39:14 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 0)
    130710 16:39:14 [Note] WSREP: State transfer required:
    Group state: bce764fb-e884-11e2-0800-f7e913b7323c:0
    Local state: 00000000-0000-0000-0000-000000000000:-1
    130710 16:39:14 [Note] WSREP: New cluster view: global state: bce764fb-e884-11e2-0800-f7e913b7323c:0, view# 2: Primary, number of nodes: 2, my index: 1, protocol version 2
    130710 16:39:14 [Warning] WSREP: Gap in state sequence. Need state transfer.
    130710 16:39:16 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'joiner' --address '@IPnode2' --auth 'sstuser:mdp' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '4209''
    130710 16:39:16 [Note] WSREP: Prepared SST request: xtrabackup|@IPnode2:4444/xtrabackup_sst
    130710 16:39:16 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
    130710 16:39:16 [Note] WSREP: Assign initial position for certification: 0, protocol version: 2
    130710 16:39:16 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (bce764fb-e884-11e2-0800-f7e913b7323c): 1 (Operation not permitted)
    at galera/src/replicator_str.cpp:prepare_for_IST():436. IST will be unavailable.
    130710 16:39:16 [Note] WSREP: Node 1 (node2) requested state transfer from '*any*'. Selected 0 (node1)(SYNCED) as donor.
    130710 16:39:16 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 0)
    130710 16:39:16 [Note] WSREP: Requesting state transfer: success, donor: 0
    WSREP_SST: [ERROR] xtrabackup process ended without creating '/var/lib/mysql//xtrabackup_galera_info' (20130710 16:39:25.555)
    130710 16:39:25 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup --role 'joiner' --address '@IPnode2' --auth 'sstuser:mdp' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '4209': 32 (Broken pipe)
    130710 16:39:25 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
    130710 16:39:25 [ERROR] WSREP: SST failed: 32 (Broken pipe)
    130710 16:39:25 [ERROR] Aborting

    130710 16:39:25 [Warning] WSREP: 0 (node1): State transfer to 1 (node2) failed: -1 (Operation not permitted)
    130710 16:39:25 [ERROR] WSREP: gcs/src/gcs_group.c:gcs_group_handle_join_msg():719: Will never receive state. Need to abort.
    130710 16:39:25 [Note] WSREP: gcomm: terminating thread
    130710 16:39:25 [Note] WSREP: gcomm: joining thread
    130710 16:39:25 [Note] WSREP: gcomm: closing backend
    130710 16:39:26 [Note] WSREP: view(view_id(NON_PRIM,43070462-e96e-11e2-0800-a8d5b7bb6865,2) memb {
    7d9463fd-e96e-11e2-0800-2d863130a295,
    } joined {
    } left {
    } partitioned {
    43070462-e96e-11e2-0800-a8d5b7bb6865,
    })
    130710 16:39:26 [Note] WSREP: view((empty))
    130710 16:39:26 [Note] WSREP: gcomm: closed
    130710 16:39:26 [Note] WSREP: gcomm: closed
    130710 16:39:26 [Note] WSREP: /usr/sbin/mysqld: Terminated.
    130710 16:39:26 mysqld_safe mysqld from pid file /var/lib/mysql/node2.pid ended
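
A log this long is easier to read once the failure lines are pulled out. The following is a minimal sketch in Python, assuming the log format shown above; the function names `error_lines` and `sst_scripts` are hypothetical helpers, not part of any MySQL or Galera tooling. It lists the `[ERROR]` lines and reports which SST script the joiner actually ran.

```python
import re


def error_lines(log_text):
    """Return every log line that carries an [ERROR] tag, stripped of indentation."""
    return [line.strip() for line in log_text.splitlines() if "[ERROR]" in line]


def sst_scripts(log_text):
    """Return the SST methods seen in "Running: 'wsrep_sst_<method> ..." entries."""
    return sorted(set(re.findall(r"Running: 'wsrep_sst_(\w+) ", log_text)))


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python scan_log.py node2.err
    with open(sys.argv[1]) as handle:
        text = handle.read()
    print("SST script(s) run:", ", ".join(sst_scripts(text)) or "none found")
    for line in error_lines(text):
        print(line)
```

Run against the log above, this kind of filter would surface the `wsrep_sst_xtrabackup` invocation and the lines tagged `[ERROR]`, which is where the SST failure is reported.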









  • #2
    Check innobackupex.backup.log in the datadir on the first node after the second node fails with xtrabackup; it will tell you what error xtrabackup is hitting. Did you execute the grants necessary for xtrabackup?


    As for rsync, I'm not sure what's happening. Can you confirm that all nodes have joined the cluster, using SHOW GLOBAL STATUS LIKE 'wsrep%'?



    • #3
      Hi,

      After updating the sstuser privileges in MySQL and adding the option wsrep_sst_donor=node1 to my.cnf, everything went well.

      All the nodes joined the cluster. Here is the output of SHOW GLOBAL STATUS LIKE 'wsrep%';

      | wsrep_local_state_comment | Synced |
      | wsrep_cert_index_size | 134145 |
      | wsrep_causal_reads | 0 |
      | wsrep_incoming_addresses | node1:3306,node2:3306,node3:3306 |
      | wsrep_cluster_conf_id | 17 |
      | wsrep_cluster_size | 3 |
      | wsrep_cluster_state_uuid | bce354fb-e884-11e2-0800-f7e913b7323c |
      | wsrep_cluster_status | Primary |
      | wsrep_connected | ON |
      | wsrep_local_index | 2 |
      | wsrep_provider_name | Galera |
      | wsrep_provider_vendor | Codership Oy <info@codership.com> |
      | wsrep_provider_version | 2.5(r150) |
      | wsrep_ready | ON |
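
The rows above can also be checked mechanically. Below is a minimal sketch in Python, assuming the status output has been pasted as `| name | value |` rows like the ones shown; `parse_status` and `cluster_problems` are hypothetical helper names, and the expected values are simply the ones reported in this post for a healthy three-node cluster.

```python
def parse_status(text):
    """Parse '| name | value |' rows into a dict, keeping only wsrep_* variables."""
    status = {}
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) >= 2 and cells[0].startswith("wsrep_"):
            status[cells[0]] = cells[1]
    return status


def cluster_problems(status, expected_size):
    """Return a list of human-readable mismatches; an empty list means all checks pass."""
    expected = {
        "wsrep_cluster_size": str(expected_size),
        "wsrep_cluster_status": "Primary",
        "wsrep_connected": "ON",
        "wsrep_ready": "ON",
        "wsrep_local_state_comment": "Synced",
    }
    return [
        f"{name}: expected {want!r}, got {status.get(name)!r}"
        for name, want in expected.items()
        if status.get(name) != want
    ]
```

Calling `cluster_problems(parse_status(pasted_text), 3)` on each node's output gives a quick yes/no answer to whether that node sees all three members and is synced.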



      Thanks a lot for your answer.
