Replicate data from normal Master-Slave setup into a new PXC-cluster

  • Replicate data from normal Master-Slave setup into a new PXC-cluster

    Hi all,
    I would like to use Percona XtraDB Cluster instead of our current master-slave setup. I'm currently evaluating everything and have already set up the PXC cluster with 3 nodes. Here is some hardware information about the nodes: 3 x HP ProLiant DL380 G5 with
    - 8 CPU-Cores (2 x Intel E5410)
    - 32 GB RAM
    - 2 x 72G HDD (15k RPM / RAID 1) as root partition
    - 5 x 146G HDD (15k RPM / RAID 10 / 1 hotspare) mounted on /var/lib/mysql
    - OS: Debian Wheezy
    - PXC-Version: 5.6.19-25

    The problem is getting the data from the master-slave setup into the PXC cluster.
    The current database size is ~35G.
    The PXC cluster is already set up and running well. The first PXC node (let's call it pxc-01) is also set up as a MySQL slave replicating from the current master. That also works well.
    I have set log-slave-updates in my.cnf.
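
    For reference, the async slave setup on pxc-01 was done with the usual replication commands; a sketch (host and user match what later shows up in the error log, the password and binlog coordinates here are just placeholders):

    ```sql
    -- Point pxc-01 at the existing 5.1 master.
    -- 'replication'@'172.20.2.3' appears in the slave I/O log below;
    -- password and log coordinates are illustrative placeholders.
    CHANGE MASTER TO
      MASTER_HOST = '172.20.2.3',
      MASTER_USER = 'replication',
      MASTER_PASSWORD = '...',
      MASTER_LOG_FILE = 'mysql-bin.000001',
      MASTER_LOG_POS  = 4;
    START SLAVE;
    ```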

    The /etc/mysql/my.cnf of pxc-01 looks like this:

    # Slave settings:
    skip-slave-start
    server-id = 1234
    log-slave-updates = 1
    log_bin = /var/log/mysql/mysql-bin.log
    relay-log = mysqld-relay-bin
    expire_logs_days = 1
    max_binlog_size = 100M

    # Performance settings:
    innodb_buffer_pool_size = 24G
    innodb_flush_log_at_trx_commit = 2
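
    A quick sanity check that these settings are actually in effect at runtime (standard 5.6 statements):

    ```sql
    SHOW VARIABLES LIKE 'log_slave_updates';  -- should be ON
    SHOW VARIABLES LIKE 'log_bin';            -- should be ON
    SHOW MASTER STATUS;                       -- confirms the binlog is being written
    ```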

    I've tried many times to get the whole PXC cluster working, but so far without any success.

    pxc-01 is able to replicate without any problems as long as the other PXC nodes do not join the cluster. Whenever I start them or have them join the cluster (they get the data via SST), the replication on pxc-01 fails/stops working.
    I always check the replication status on pxc-01 with SHOW SLAVE STATUS NOLOCK\G, and I get the information as quickly as on every other slave, as long as the other PXC nodes are not in the cluster. If they are in the cluster, the replication stops working: SHOW SLAVE STATUS NOLOCK\G on pxc-01 doesn't print anything, master.info is no longer updated, and the SHOW SLAVE STATUS NOLOCK\G process has to be killed or I have to restart MySQL on pxc-01. I can also see in the MySQL processlist that nothing is being done by the slave anymore. The error log doesn't give any information about this, and I haven't found anyone with the same problem.
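
    One thing worth checking while the slave appears frozen is whether Galera flow control is pausing the node — just a guess, but it would explain a stalled SQL thread without any error in the log. The standard wsrep status variables:

    ```sql
    SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';  -- e.g. Synced, Donor/Desynced
    SHOW GLOBAL STATUS LIKE 'wsrep_flow_control_paused';  -- fraction of time paused since last FLUSH STATUS
    SHOW GLOBAL STATUS LIKE 'wsrep_local_send_queue';
    SHOW GLOBAL STATUS LIKE 'wsrep_local_recv_queue';
    ```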

    The current master is running MySQL 5.1.49. But this can't be the problem, because the MySQL slaves of that master are running MySQL 5.5 and everything works fine. Is it possible that the throughput of the replication, combined with the work the PXC cluster does, is simply too much for the system? But I would guess there are more complex use cases than this one, with more throughput.
    I can also confirm that pxc-01 replicates from the master without any errors, but only as long as the other nodes don't join the PXC cluster.
    I've also tried to import the mysqldump on pxc-01 while the whole PXC cluster was running. The import worked, but as soon as I start the replication on pxc-01, the slave hangs. I've tried this several times. Sometimes the replication worked for an hour, but never longer than that.

    Please let me know if you need more information.

    I would really like to use PXC, but I have to ensure that everything works as expected, and I would like to have no (or only a very short) downtime.
    Last edited by Kasu; 07-31-2014, 09:45 AM.

  • #2
    What is happening in the PXC cluster once you add the second node? Does the cluster actually form, or does it fail? Please paste the error log fragment showing the moment when the second node joins and the slave node stops working.
    Also, do you have any MyISAM tables that may be replicated from the async master?
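
    For example, something like this lists any non-InnoDB user tables:

    ```sql
    SELECT table_schema, table_name, engine
    FROM information_schema.tables
    WHERE engine <> 'InnoDB'
      AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');
    ```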



    • #3
      Hi przemek!
      Thanks for your fast reply

      The cluster is formed, although the slave is not working.
      Only InnoDB tables are replicated; here is the log fragment:

      Additional information: pxc-01 is named backenddb-01 and the second PXC node is backenddb-02
      Code:
      2014-07-31 19:14:33 1503 [Note] WSREP: (fe9842b9, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
      2014-07-31 19:14:34 1503 [Note] WSREP: declaring 244a7896 at tcp://172.20.0.9:4567 stable
      2014-07-31 19:14:34 1503 [Note] WSREP: Node fe9842b9 state prim
      2014-07-31 19:14:34 1503 [Note] WSREP: save pc into disk
      2014-07-31 19:14:34 1503 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
      2014-07-31 19:14:34 1503 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
      2014-07-31 19:14:34 1503 [Note] WSREP: STATE EXCHANGE: sent state msg: 24e38cd1-18d6-11e4-abbc-af0bcdbb236f
      2014-07-31 19:14:34 1503 [Note] WSREP: STATE EXCHANGE: got state msg: 24e38cd1-18d6-11e4-abbc-af0bcdbb236f from 0 (backenddb-02)
      2014-07-31 19:14:34 1503 [Note] WSREP: STATE EXCHANGE: got state msg: 24e38cd1-18d6-11e4-abbc-af0bcdbb236f from 1 (backenddb-01)
      2014-07-31 19:14:34 1503 [Note] WSREP: Quorum results:
              version    = 3,
              component  = PRIMARY,
              conf_id    = 1,
              members    = 1/2 (joined/total),
              act_id     = 3662350,
              last_appl. = 3662240,
              protocols  = 0/6/3 (gcs/repl/appl),
              group UUID = a296889f-13fd-11e4-9807-f2390d3c4de7
      2014-07-31 19:14:34 1503 [Note] WSREP: Flow-control interval: [23, 23]
      2014-07-31 19:14:34 1503 [Note] WSREP: New cluster view: global state: a296889f-13fd-11e4-9807-f2390d3c4de7:3662350, view# 2: Primary, number of nodes: 2, my index: 1, protocol version 3
      2014-07-31 19:14:34 1503 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      2014-07-31 19:14:34 1503 [Note] WSREP: REPL Protocols: 6 (3, 2)
      2014-07-31 19:14:34 1503 [Note] WSREP: Service thread queue flushed.
      2014-07-31 19:14:34 1503 [Note] WSREP: Assign initial position for certification: 3662350, protocol version: 3
      2014-07-31 19:14:34 1503 [Note] WSREP: Service thread queue flushed.
      2014-07-31 19:14:37 1503 [Note] WSREP: (fe9842b9, 'tcp://0.0.0.0:4567') turning message relay requesting off
      2014-07-31 19:14:37 1503 [Note] WSREP: Member 0.0 (backenddb-02) requested state transfer from '*any*'. Selected 1.0 (backenddb-01)(SYNCED) as donor.
      2014-07-31 19:14:37 1503 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 3662350)
      2014-07-31 19:14:37 1503 [Note] WSREP: IST request: a296889f-13fd-11e4-9807-f2390d3c4de7:2201852-3662350|tcp://172.20.0.9:4568
      2014-07-31 19:14:37 1503 [Note] WSREP: IST first seqno 2201853 not found from cache, falling back to SST
      2014-07-31 19:14:37 1503 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      2014-07-31 19:14:37 1503 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.20.0.9:4444/xtrabackup_sst' --auth 'sstuser:s3cretPass' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf'   '' --gtid 'a296889f-13fd-11e4-9807-f2390d3c4de7:3662350''
      2014-07-31 19:14:37 1503 [Note] WSREP: sst_donor_thread signaled with 0
      WSREP_SST: [INFO] Streaming with xbstream (20140731 19:14:37.753)
      WSREP_SST: [INFO] Using socat as streamer (20140731 19:14:37.756)
      WSREP_SST: [INFO] Using /tmp/tmp.ACIfiRagij as innobackupex temporary directory (20140731 19:14:38.163)
      WSREP_SST: [INFO] Streaming GTID file before SST (20140731 19:14:38.168)
      WSREP_SST: [INFO] Evaluating xbstream -c ${INFO_FILE} | socat -u stdio TCP:172.20.0.9:4444; RC=( ${PIPESTATUS[@]} ) (20140731 19:14:38.171)
      WSREP_SST: [INFO] Sleeping before data transfer for SST (20140731 19:14:38.201)
      WSREP_SST: [INFO] Streaming the backup to joiner at 172.20.0.9 4444 (20140731 19:14:48.204)
      WSREP_SST: [INFO] Evaluating innobackupex --defaults-file=/etc/mysql/my.cnf --no-version-check  $tmpopts $INNOEXTRA --galera-info --stream=$sfmt $itmpdir 2>${DATA}/innobackup.backup.log | socat -u stdio TCP:172.20.0.9:4444; RC=( ${PIPESTATUS[@]} ) (20140731 19:14:48.208)
      2014-07-31 19:21:57 1503 [Note] WSREP: Provider paused at a296889f-13fd-11e4-9807-f2390d3c4de7:3702665 (48769)
      2014-07-31 19:22:00 1503 [Note] WSREP: resuming provider at 48769
      2014-07-31 19:22:00 1503 [Note] WSREP: Provider resumed.
      2014-07-31 19:22:08 1503 [Note] WSREP: 1.0 (backenddb-01): State transfer to 0.0 (backenddb-02) complete.
      2014-07-31 19:22:08 1503 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 3703347)
      2014-07-31 19:22:08 1503 [Note] WSREP: Member 1.0 (backenddb-01) synced with group.
      2014-07-31 19:22:08 1503 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 3703347)
      2014-07-31 19:22:08 1503 [Note] WSREP: Synchronized with group, ready for connections
      2014-07-31 19:22:08 1503 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      WSREP_SST: [INFO] Total time on donor: 0 seconds (20140731 19:22:08.571)
      WSREP_SST: [INFO] Cleaning up temporary directories (20140731 19:22:08.574)
      2014-07-31 19:24:48 1503 [Note] WSREP: 0.0 (backenddb-02): State transfer from 1.0 (backenddb-01) complete.
      After stopping MySQL on backenddb-02, the slave started working again and everything was back to normal.

      I'm getting more and more desperate



      • #4
        Update:

        I tried the same thing again: stopped MySQL on backenddb-02, waited a few minutes, and started it again. Here is the resulting log from backenddb-01:
        Code:
        140731 20:10:47 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
        140731 20:10:47 mysqld_safe Skipping wsrep-recover for a296889f-13fd-11e4-9807-f2390d3c4de7:3878257 pair
        140731 20:10:47 mysqld_safe Assigning a296889f-13fd-11e4-9807-f2390d3c4de7:3878257 to wsrep_start_position
        2014-07-31 20:10:47 0 [Warning] Using unique option prefix key_buffer instead of key_buffer_size is deprecated and will be removed in a future release. Please use the full name instead.
        2014-07-31 20:10:47 0 [Note] WSREP: wsrep_start_position var submitted: 'a296889f-13fd-11e4-9807-f2390d3c4de7:3878257'
        2014-07-31 20:10:47 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
        2014-07-31 20:10:47 31574 [Note] WSREP: Read nil XID from storage engines, skipping position init
        2014-07-31 20:10:47 31574 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/libgalera_smm.so'
        2014-07-31 20:10:47 31574 [Note] WSREP: wsrep_load(): Galera 3.6(r3a949e6) by Codership Oy <info@codership.com> loaded successfully.
        2014-07-31 20:10:47 31574 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
        2014-07-31 20:10:47 31574 [Note] WSREP: Found saved state: a296889f-13fd-11e4-9807-f2390d3c4de7:3878257
        2014-07-31 20:10:47 31574 [Note] WSREP: Passing config to GCS: base_host = 172.20.0.8; base_port = 4567; cert.log_conflicts = no; debug = no; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 50G; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.recovery = 1; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = P30S; pc.weigh
        2014-07-31 20:10:47 31574 [Note] WSREP: Service thread queue flushed.
        2014-07-31 20:10:47 31574 [Note] WSREP: Assign initial position for certification: 3878257, protocol version: -1
        2014-07-31 20:10:47 31574 [Note] WSREP: wsrep_sst_grab()
        2014-07-31 20:10:47 31574 [Note] WSREP: Start replication
        2014-07-31 20:10:47 31574 [Note] WSREP: Setting initial position to a296889f-13fd-11e4-9807-f2390d3c4de7:3878257
        2014-07-31 20:10:47 31574 [Note] WSREP: protonet asio version 0
        2014-07-31 20:10:47 31574 [Note] WSREP: Using CRC-32C (optimized) for message checksums.
        2014-07-31 20:10:47 31574 [Note] WSREP: backend: asio
        2014-07-31 20:10:47 31574 [Warning] WSREP: access file(gvwstate.dat) failed(No such file or directory)
        2014-07-31 20:10:47 31574 [Note] WSREP: restore pc from disk failed
        2014-07-31 20:10:47 31574 [Note] WSREP: GMCast version 0
        2014-07-31 20:10:47 31574 [Note] WSREP: (ff66a051, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
        2014-07-31 20:10:47 31574 [Note] WSREP: (ff66a051, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
        2014-07-31 20:10:47 31574 [Note] WSREP: EVS version 0
        2014-07-31 20:10:47 31574 [Note] WSREP: PC version 0
        2014-07-31 20:10:47 31574 [Note] WSREP: gcomm: bootstrapping new group 'backenddb_cluster'
        2014-07-31 20:10:47 31574 [Note] WSREP: start_prim is enabled, turn off pc_recovery
        2014-07-31 20:10:47 31574 [Note] WSREP: Node ff66a051 state prim
        2014-07-31 20:10:47 31574 [Note] WSREP: save pc into disk
        2014-07-31 20:10:47 31574 [Note] WSREP: discarding pending addr without UUID: tcp://172.20.0.10:4567
        2014-07-31 20:10:47 31574 [Note] WSREP: discarding pending addr proto entry 0x2d71a50
        2014-07-31 20:10:47 31574 [Note] WSREP: discarding pending addr without UUID: tcp://172.20.0.8:4567
        2014-07-31 20:10:47 31574 [Note] WSREP: discarding pending addr proto entry 0x2d7a110
        2014-07-31 20:10:47 31574 [Note] WSREP: discarding pending addr without UUID: tcp://172.20.0.9:4567
        2014-07-31 20:10:47 31574 [Note] WSREP: discarding pending addr proto entry 0x2d82910
        2014-07-31 20:10:47 31574 [Note] WSREP: gcomm: connected
        2014-07-31 20:10:47 31574 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
        2014-07-31 20:10:47 31574 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
        2014-07-31 20:10:47 31574 [Note] WSREP: Opened channel 'backenddb_cluster'
        2014-07-31 20:10:47 31574 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
        2014-07-31 20:10:47 31574 [Note] WSREP: Waiting for SST to complete.
        2014-07-31 20:10:47 31574 [Note] WSREP: STATE_EXCHANGE: sent state UUID: ff675558-18dd-11e4-9055-6fd8d8ced1a3
        2014-07-31 20:10:47 31574 [Note] WSREP: STATE EXCHANGE: sent state msg: ff675558-18dd-11e4-9055-6fd8d8ced1a3
        2014-07-31 20:10:47 31574 [Note] WSREP: STATE EXCHANGE: got state msg: ff675558-18dd-11e4-9055-6fd8d8ced1a3 from 0 (backenddb-01)
        2014-07-31 20:10:47 31574 [Note] WSREP: Quorum results:
            version    = 3,
            component  = PRIMARY,
            conf_id    = 0,
            members    = 1/1 (joined/total),
            act_id     = 3878257,
            last_appl. = -1,
            protocols  = 0/6/3 (gcs/repl/appl),
            group UUID = a296889f-13fd-11e4-9807-f2390d3c4de7
        2014-07-31 20:10:47 31574 [Note] WSREP: Flow-control interval: [16, 16]
        2014-07-31 20:10:47 31574 [Note] WSREP: Restored state OPEN -> JOINED (3878257)
        2014-07-31 20:10:47 31574 [Note] WSREP: Member 0.0 (backenddb-01) synced with group.
        2014-07-31 20:10:47 31574 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 3878257)
        2014-07-31 20:10:47 31574 [Note] WSREP: New cluster view: global state: a296889f-13fd-11e4-9807-f2390d3c4de7:3878257, view# 1: Primary, number of nodes: 1, my index: 0, protocol version 3
        2014-07-31 20:10:47 31574 [Note] WSREP: SST complete, seqno: 3878257
        2014-07-31 20:10:47 31574 [Warning] Using unique option prefix myisam-recover instead of myisam-recover-options is deprecated and will be removed in a future release. Please use the full name instead.
        2014-07-31 20:10:47 31574 [Note] Plugin 'FEDERATED' is disabled.
        2014-07-31 20:10:47 31574 [Note] InnoDB: Using atomics to ref count buffer pool pages
        2014-07-31 20:10:47 31574 [Note] InnoDB: The InnoDB memory heap is disabled
        2014-07-31 20:10:47 31574 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
        2014-07-31 20:10:47 31574 [Note] InnoDB: Compressed tables use zlib 1.2.7
        2014-07-31 20:10:47 31574 [Note] InnoDB: Using Linux native AIO
        2014-07-31 20:10:47 31574 [Note] InnoDB: Not using CPU crc32 instructions
        2014-07-31 20:10:47 31574 [Note] InnoDB: Initializing buffer pool, size = 24.0G
        2014-07-31 20:10:50 31574 [Note] InnoDB: Completed initialization of buffer pool
        2014-07-31 20:10:50 31574 [Note] InnoDB: Highest supported file format is Barracuda.
        2014-07-31 20:10:51 31574 [Note] InnoDB: 128 rollback segment(s) are active.
        2014-07-31 20:10:51 31574 [Note] InnoDB: Waiting for purge to start
        2014-07-31 20:10:51 31574 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.19-67.0 started; log sequence number 381227495665
        2014-07-31 20:10:51 31574 [Note] RSA private key file not found: /var/lib/mysql//private_key.pem. Some authentication plugins will not work.
        2014-07-31 20:10:51 31574 [Note] RSA public key file not found: /var/lib/mysql//public_key.pem. Some authentication plugins will not work.
        2014-07-31 20:10:51 31574 [Note] Server hostname (bind-address): '172.20.0.8'; port: 3306
        2014-07-31 20:10:51 31574 [Note]   - '172.20.0.8' resolves to '172.20.0.8';
        2014-07-31 20:10:51 31574 [Note] Server socket created on IP: '172.20.0.8'.
        2014-07-31 20:10:51 31574 [Note] Event Scheduler: Loaded 0 events
        2014-07-31 20:10:51 31574 [Note] /usr/sbin/mysqld: ready for connections.
        Version: '5.6.19-67.0-56-log'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  Percona XtraDB Cluster (GPL), Release 25.6, wsrep_25.6.r4111



        • #5

          Code:
          2014-07-31 20:10:51 31574 [Note] WSREP: inited wsrep sidno 2
          2014-07-31 20:10:51 31574 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
          2014-07-31 20:10:51 31574 [Note] WSREP: REPL Protocols: 6 (3, 2)
          2014-07-31 20:10:51 31574 [Note] WSREP: Service thread queue flushed.
          2014-07-31 20:10:51 31574 [Note] WSREP: Assign initial position for certification: 3878257, protocol version: 3
          2014-07-31 20:10:51 31574 [Note] WSREP: Service thread queue flushed.
          2014-07-31 20:10:51 31574 [Note] WSREP: Synchronized with group, ready for connections
          2014-07-31 20:10:51 31574 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
          2014-07-31 20:11:21 31574 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
          2014-07-31 20:11:21 31574 [Warning] Slave SQL: If a crash happens this configuration does not guarantee that the relay log info will be consistent, Error_code: 0
          2014-07-31 20:11:21 31574 [Note] WSREP: ready state reached
          2014-07-31 20:11:21 31574 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.235894' at position 79854239, relay log './mysqld-relay-bin.001643' position: 79854398
          2014-07-31 20:11:21 31574 [Note] Slave I/O thread: connected to master 'replication@172.20.2.3:3306',replication started in log 'mysql-bin.235895' at position 52096169
          2014-07-31 20:11:21 31574 [Warning] Slave I/O: Notifying master by SET @master_binlog_checksum= @@global.binlog_checksum failed with error: Unknown system variable 'binlog_checksum', Error_code: 1193
          2014-07-31 20:11:21 31574 [Warning] Slave I/O: Unknown system variable 'SERVER_UUID' on master. A probable cause is that the variable is not supported on the master (version: 5.1.49-3-log), even though it is on the slave (version: 5.6.19-67.0-56-log), Error_code: 1193
          2014-07-31 20:11:55 31574 [Note] WSREP: (ff66a051, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
          2014-07-31 20:11:55 31574 [Note] WSREP: declaring 27ef482c at tcp://172.20.0.9:4567 stable
          2014-07-31 20:11:55 31574 [Note] WSREP: Node ff66a051 state prim
          2014-07-31 20:11:55 31574 [Note] WSREP: save pc into disk
          2014-07-31 20:11:55 31574 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
          2014-07-31 20:11:55 31574 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
          2014-07-31 20:11:56 31574 [Note] WSREP: STATE EXCHANGE: sent state msg: 283c0b18-18de-11e4-9378-7b8ab369abcd
          2014-07-31 20:11:56 31574 [Note] WSREP: STATE EXCHANGE: got state msg: 283c0b18-18de-11e4-9378-7b8ab369abcd from 0 (backenddb-02)
          2014-07-31 20:11:56 31574 [Note] WSREP: STATE EXCHANGE: got state msg: 283c0b18-18de-11e4-9378-7b8ab369abcd from 1 (backenddb-01)
          2014-07-31 20:11:56 31574 [Note] WSREP: Quorum results:
          version = 3,
          component = PRIMARY,
          conf_id = 1,
          members = 1/2 (joined/total),
          act_id = 3878257,
          last_appl. = 0,
          protocols = 0/6/3 (gcs/repl/appl),
          group UUID = a296889f-13fd-11e4-9807-f2390d3c4de7
          2014-07-31 20:11:56 31574 [Note] WSREP: Flow-control interval: [23, 23]
          2014-07-31 20:11:56 31574 [Note] WSREP: New cluster view: global state: a296889f-13fd-11e4-9807-f2390d3c4de7:3878257, view# 2: Primary, number of nodes: 2, my index: 1, protocol version 3
          2014-07-31 20:11:56 31574 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
          2014-07-31 20:11:56 31574 [Note] WSREP: REPL Protocols: 6 (3, 2)
          2014-07-31 20:11:56 31574 [Note] WSREP: Service thread queue flushed.
          2014-07-31 20:11:56 31574 [Note] WSREP: Assign initial position for certification: 3878257, protocol version: 3
          2014-07-31 20:11:56 31574 [Note] WSREP: Service thread queue flushed.
          2014-07-31 20:11:58 31574 [Note] WSREP: (ff66a051, 'tcp://0.0.0.0:4567') turning message relay requesting off
          2014-07-31 20:11:59 31574 [Note] WSREP: Member 0.0 (backenddb-02) requested state transfer from '*any*'. Selected 1.0 (backenddb-01)(SYNCED) as donor.
          2014-07-31 20:11:59 31574 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 3878257)
          2014-07-31 20:11:59 31574 [Note] WSREP: IST request: a296889f-13fd-11e4-9807-f2390d3c4de7:3702665-3878257|tcp://172.20.0.9:4568
          2014-07-31 20:11:59 31574 [Note] WSREP: IST first seqno 3702666 not found from cache, falling back to SST
          2014-07-31 20:11:59 31574 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
          2014-07-31 20:11:59 31574 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.20.0.9:4444/xtrabackup_sst' --auth 'sstuser:s3cretPass' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' '' --gtid 'a296889f-13fd-11e4-9807-f2390d3c4de7:3878257''
          2014-07-31 20:11:59 31574 [Note] WSREP: sst_donor_thread signaled with 0
          WSREP_SST: [INFO] Streaming with xbstream (20140731 20:11:59.315)
          WSREP_SST: [INFO] Using socat as streamer (20140731 20:11:59.317)
          WSREP_SST: [INFO] Using /tmp/tmp.6i6qzI1XGo as innobackupex temporary directory (20140731 20:11:59.599)
          WSREP_SST: [INFO] Streaming GTID file before SST (20140731 20:11:59.605)
          WSREP_SST: [INFO] Evaluating xbstream -c ${INFO_FILE} | socat -u stdio TCP:172.20.0.9:4444; RC=( ${PIPESTATUS[@]} ) (20140731 20:11:59.608)
          WSREP_SST: [INFO] Sleeping before data transfer for SST (20140731 20:11:59.614)
          WSREP_SST: [INFO] Streaming the backup to joiner at 172.20.0.9 4444 (20140731 20:12:09.617)
          WSREP_SST: [INFO] Evaluating innobackupex --defaults-file=/etc/mysql/my.cnf --no-version-check $tmpopts $INNOEXTRA --galera-info --stream=$sfmt $itmpdir 2>${DATA}/innobackup.backup.log | socat -u stdio TCP:172.20.0.9:4444; RC=( ${PIPESTATUS[@]} ) (20140731 20:12:09.621)
          2014/07/31 20:12:11 socat[32352] E write(3, 0x14732b0, 8192): Broken pipe
          WSREP_SST: [ERROR] innobackupex finished with error: 1. Check /var/lib/mysql//innobackup.backup.log (20140731 20:12:11.888)
          WSREP_SST: [ERROR] Cleanup after exit with status:22 (20140731 20:12:11.890)
          WSREP_SST: [INFO] Cleaning up temporary directories (20140731 20:12:11.893)
          2014-07-31 20:12:11 31574 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.20.0.9:4444/xtrabackup_sst' --auth 'sstuser:s3cretPass' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' '' --gtid 'a296889f-13fd-11e4-9807-f2390d3c4de7:3878257'
          2014-07-31 20:12:11 31574 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.20.0.9:4444/xtrabackup_sst' --auth 'sstuser:s3cretPass' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' '' --gtid 'a296889f-13fd-11e4-9807-f2390d3c4de7:3878257': 22 (Invalid argument)
          2014-07-31 20:12:11 31574 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.20.0.9:4444/xtrabackup_sst' --auth 'sstuser:s3cretPass' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' '' --gtid 'a296889f-13fd-11e4-9807-f2390d3c4de7:3878257'
          2014-07-31 20:12:11 31574 [Warning] WSREP: 1.0 (backenddb-01): State transfer to 0.0 (backenddb-02) failed: -22 (Invalid argument)
          2014-07-31 20:12:11 31574 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 3879661)
          2014-07-31 20:12:11 31574 [Note] WSREP: Member 1.0 (backenddb-01) synced with group.
          2014-07-31 20:12:11 31574 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 3879661)
          2014-07-31 20:12:11 31574 [Note] WSREP: Synchronized with group, ready for connections
          2014-07-31 20:12:11 31574 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
          2014-07-31 20:12:12 31574 [Note] WSREP: forgetting 27ef482c (tcp://172.20.0.9:4567)
          2014-07-31 20:12:12 31574 [Note] WSREP: Node ff66a051 state prim
          2014-07-31 20:12:12 31574 [Note] WSREP: save pc into disk
          2014-07-31 20:12:12 31574 [Note] WSREP: forgetting 27ef482c (tcp://172.20.0.9:4567)
          2014-07-31 20:12:12 31574 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 1
          2014-07-31 20:12:12 31574 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 32067541-18de-11e4-ae17-bb9f72bd71e4
          2014-07-31 20:12:12 31574 [Note] WSREP: STATE EXCHANGE: sent state msg: 32067541-18de-11e4-ae17-bb9f72bd71e4
          2014-07-31 20:12:12 31574 [Note] WSREP: STATE EXCHANGE: got state msg: 32067541-18de-11e4-ae17-bb9f72bd71e4 from 0 (backenddb-01)
          2014-07-31 20:12:12 31574 [Note] WSREP: Quorum results:
          version = 3,
          component = PRIMARY,
          conf_id = 2,
          members = 1/1 (joined/total),
          act_id = 3879661,
          last_appl. = 3879607,
          protocols = 0/6/3 (gcs/repl/appl),
          group UUID = a296889f-13fd-11e4-9807-f2390d3c4de7
          2014-07-31 20:12:12 31574 [Note] WSREP: Flow-control interval: [16, 16]
          2014-07-31 20:12:12 31574 [Note] WSREP: New cluster view: global state: a296889f-13fd-11e4-9807-f2390d3c4de7:3879661, view# 3: Primary, number of nodes: 1, my index: 0, protocol version 3



          • #6
            Code:
            2014-07-31 20:12:12 31574 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
            2014-07-31 20:12:12 31574 [Note] WSREP: REPL Protocols: 6 (3, 2)
            2014-07-31 20:12:12 31574 [Note] WSREP: Service thread queue flushed.
            2014-07-31 20:12:12 31574 [Note] WSREP: Assign initial position for certification: 3879661, protocol version: 3
            2014-07-31 20:12:12 31574 [Note] WSREP: Service thread queue flushed.
            2014-07-31 20:12:13 31574 [Note] WSREP: remote endpoint tcp://172.20.0.9:4567 changed identity 27ef482c -> 32873135
            2014-07-31 20:12:13 31574 [Note] WSREP: (ff66a051, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
            2014-07-31 20:12:13 31574 [Note] WSREP: declaring 32873135 at tcp://172.20.0.9:4567 stable
            2014-07-31 20:12:13 31574 [Note] WSREP: Node ff66a051 state prim
            2014-07-31 20:12:13 31574 [Note] WSREP: save pc into disk
            2014-07-31 20:12:13 31574 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
            2014-07-31 20:12:13 31574 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
            2014-07-31 20:12:14 31574 [Note] WSREP: STATE EXCHANGE: sent state msg: 32d3fc47-18de-11e4-a846-8bde725628ce
            2014-07-31 20:12:14 31574 [Note] WSREP: STATE EXCHANGE: got state msg: 32d3fc47-18de-11e4-a846-8bde725628ce from 0 (backenddb-02)
            2014-07-31 20:12:14 31574 [Note] WSREP: STATE EXCHANGE: got state msg: 32d3fc47-18de-11e4-a846-8bde725628ce from 1 (backenddb-01)
            2014-07-31 20:12:14 31574 [Note] WSREP: Quorum results:
            version = 3,
            component = PRIMARY,
            conf_id = 3,
            members = 1/2 (joined/total),
            act_id = 3879871,
            last_appl. = 3879787,
            protocols = 0/6/3 (gcs/repl/appl),
            group UUID = a296889f-13fd-11e4-9807-f2390d3c4de7
            2014-07-31 20:12:14 31574 [Note] WSREP: Flow-control interval: [23, 23]
            2014-07-31 20:12:14 31574 [Note] WSREP: New cluster view: global state: a296889f-13fd-11e4-9807-f2390d3c4de7:3879871, view# 4: Primary, number of nodes: 2, my index: 1, protocol version 3
            2014-07-31 20:12:14 31574 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
            2014-07-31 20:12:14 31574 [Note] WSREP: REPL Protocols: 6 (3, 2)
            2014-07-31 20:12:14 31574 [Note] WSREP: Service thread queue flushed.
            2014-07-31 20:12:14 31574 [Note] WSREP: Assign initial position for certification: 3879871, protocol version: 3
            2014-07-31 20:12:14 31574 [Note] WSREP: Service thread queue flushed.
            2014-07-31 20:12:16 31574 [Note] WSREP: Member 0.0 (backenddb-02) requested state transfer from '*any*'. Selected 1.0 (backenddb-01)(SYNCED) as donor.
            2014-07-31 20:12:16 31574 [Note] WSREP: Shifting SYNCED -> DONOR/DESYNCED (TO: 3880283)
            2014-07-31 20:12:16 31574 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
            2014-07-31 20:12:16 31574 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.20.0.9:4444/xtrabackup_sst' --auth 'sstuser:s3cretPass' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' '' --gtid 'a296889f-13fd-11e4-9807-f2390d3c4de7:3880283''
            2014-07-31 20:12:16 31574 [Note] WSREP: sst_donor_thread signaled with 0
            2014-07-31 20:12:16 31574 [Note] WSREP: (ff66a051, 'tcp://0.0.0.0:4567') turning message relay requesting off
            WSREP_SST: [INFO] Streaming with xbstream (20140731 20:12:17.094)
            WSREP_SST: [INFO] Using socat as streamer (20140731 20:12:17.096)
            WSREP_SST: [INFO] Using /tmp/tmp.QwS3wr9Hw4 as innobackupex temporary directory (20140731 20:12:17.391)
            WSREP_SST: [INFO] Streaming GTID file before SST (20140731 20:12:17.397)
            WSREP_SST: [INFO] Evaluating xbstream -c ${INFO_FILE} | socat -u stdio TCP:172.20.0.9:4444; RC=( ${PIPESTATUS[@]} ) (20140731 20:12:17.400)
            WSREP_SST: [INFO] Sleeping before data transfer for SST (20140731 20:12:17.406)
            WSREP_SST: [INFO] Streaming the backup to joiner at 172.20.0.9 4444 (20140731 20:12:27.410)
            WSREP_SST: [INFO] Evaluating innobackupex --defaults-file=/etc/mysql/my.cnf --no-version-check $tmpopts $INNOEXTRA --galera-info --stream=$sfmt $itmpdir 2>${DATA}/innobackup.backup.log | socat -u stdio TCP:172.20.0.9:4444; RC=( ${PIPESTATUS[@]} ) (20140731 20:12:27.413)
            2014-07-31 20:16:33 7fe5e69e8700 InnoDB: unable to purge a record
            InnoDB: tuple DATA TUPLE: 2 fields;
            0: len 4; hex 002bad15; asc + ;;
            1: len 4; hex 000c8529; asc );;
            
            InnoDB: record PHYSICAL RECORD: n_fields 2; compact format; info bits 0
            0: len 4; hex 002bad15; asc + ;;
            1: len 4; hex 000c8529; asc );;
            
            space 551 offset 7760 (1095 records, index id 728)
            InnoDB: Submit a detailed bug report to http://bugs.mysql.com
            2014-07-31 20:16:52 31574 [ERROR] Transaction not registered for MySQL 2PC, but transaction is active
            2014-07-31 20:18:46 7fe5e59e6700 InnoDB: unable to purge a record
            InnoDB: tuple DATA TUPLE: 2 fields;
            0: len 4; hex 001604de; asc ;;
            1: len 4; hex 0005951e; asc ;;
            
            InnoDB: record PHYSICAL RECORD: n_fields 2; compact format; info bits 0
            0: len 4; hex 001604de; asc ;;
            1: len 4; hex 0005951e; asc ;;
            
            space 551 offset 4544 (937 records, index id 728)
            InnoDB: Submit a detailed bug report to http://bugs.mysql.com
            2014-07-31 20:19:04 31574 [Note] WSREP: Provider paused at a296889f-13fd-11e4-9807-f2390d3c4de7:3914349 (36433)
            2014-07-31 20:19:07 31574 [Note] WSREP: resuming provider at 36433
            2014-07-31 20:19:07 31574 [Note] WSREP: Provider resumed.
            2014-07-31 20:19:21 31574 [Note] WSREP: 1.0 (backenddb-01): State transfer to 0.0 (backenddb-02) complete.
            2014-07-31 20:19:21 31574 [Note] WSREP: Shifting DONOR/DESYNCED -> JOINED (TO: 3916957)
            2014-07-31 20:19:21 31574 [Note] WSREP: Member 1.0 (backenddb-01) synced with group.
            2014-07-31 20:19:21 31574 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 3916957)
            2014-07-31 20:19:21 31574 [Note] WSREP: Synchronized with group, ready for connections
            2014-07-31 20:19:21 31574 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
            WSREP_SST: [INFO] Total time on donor: 0 seconds (20140731 20:19:21.425)
            WSREP_SST: [INFO] Cleaning up temporary directories (20140731 20:19:21.429)
            2014-07-31 20:23:36 31574 [Note] WSREP: 0.0 (backenddb-02): State transfer from 1.0 (backenddb-01) complete.
            2014-07-31 20:27:53 31574 [Note] WSREP: Member 0.0 (backenddb-02) synced with group.
            2014-07-31 20:29:38 7fe5e71e9700 InnoDB: unable to purge a record
            InnoDB: tuple DATA TUPLE: 2 fields;
            0: len 4; hex 000d5d7c; asc ]|;;
            1: len 4; hex 0003f422; asc ";;
            
            InnoDB: record PHYSICAL RECORD: n_fields 2; compact format; info bits 0
            0: len 4; hex 000d5d7c; asc ]|;;
            1: len 4; hex 0003f422; asc ";;
            
            space 551 offset 2319 (806 records, index id 728)
            InnoDB: Submit a detailed bug report to http://bugs.mysql.com
            2014-07-31 20:31:50 31574 [ERROR] Error reading packet from server: Lost connection to MySQL server during query ( server_errno=2013)
            2014-07-31 20:31:50 31574 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.235900' at position 64232560
            2014-07-31 20:31:50 31574 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
            2014-07-31 20:31:50 31574 [Warning] Slave I/O: Notifying master by SET @master_binlog_checksum= @@global.binlog_checksum failed with error: Unknown system variable 'binlog_checksum', Error_code: 1193
            2014-07-31 20:31:50 31574 [Warning] Slave I/O: Unknown system variable 'SERVER_UUID' on master. A probable cause is that the variable is not supported on the master (version: 5.1.49-3-log), even though it is on the slave (version: 5.6.19-67.0-56-log), Error_code: 1193
            I did a SHOW SLAVE STATUS NOLOCK\G on backenddb-01 and the slave was hanging again - no output.
            Here is the output of SHOW PROCESSLIST:
            Code:
            mysql> show processlist;
            +-----+-------------+-----------+--------------------+---------+------+----------------------------------------+--------------------------+-----------+---------------+
            | Id  | User        | Host      | db                 | Command | Time | State                                  | Info                     | Rows_sent | Rows_examined |
            +-----+-------------+-----------+--------------------+---------+------+----------------------------------------+--------------------------+-----------+---------------+
            |   1 | system user |           | NULL               | Sleep   | 1882 | wsrep aborter idle                     | NULL                     |         0 |             0 |
            |   2 | system user |           | NULL               | Sleep   | 1882 | NULL                                   | NULL                     |         0 |             0 |
            |   4 | root        | localhost | NULL               | Query   |  403 | init                                   | SHOW SLAVE STATUS NOLOCK |         0 |             0 |
            |   5 | system user |           | NULL               | Connect | 1848 | Queueing master event to the relay log | NULL                     |         0 |             0 |
            |   6 | system user |           | NULL               | Connect |  674 | wsrep in pre-commit stage              | NULL                     |         0 |             0 |
            +-----+-------------+-----------+--------------------+---------+------+----------------------------------------+--------------------------+-----------+---------------+
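            The processlist above can also be checked mechanically instead of by eye. Below is a rough sketch of that idea; the 300-second threshold and the exclusion of "idle" wsrep states are my own assumptions based on the output shown, not anything official:

```python
# Sketch: flag threads that look stuck, given SHOW PROCESSLIST rows as dicts.
# Assumption: a thread running longer than STUCK_AFTER seconds in a wsrep-related
# state (excluding idle housekeeping threads), or a hung SHOW SLAVE STATUS,
# suggests the node is blocked by flow control.

STUCK_AFTER = 300  # seconds; arbitrary threshold for this sketch

def find_stuck_threads(rows):
    """rows: list of dicts with at least 'Id', 'Time', 'State', 'Info' keys."""
    stuck = []
    for row in rows:
        time_s = int(row["Time"])
        state = (row["State"] or "").lower()
        info = row["Info"] or ""
        if "idle" in state:
            continue  # e.g. "wsrep aborter idle" is normal
        if time_s > STUCK_AFTER and ("wsrep" in state or "SLAVE STATUS" in info):
            stuck.append(row["Id"])
    return stuck

# The three interesting rows from the processlist above:
rows = [
    {"Id": 1, "Time": 1882, "State": "wsrep aborter idle", "Info": None},
    {"Id": 4, "Time": 403, "State": "init", "Info": "SHOW SLAVE STATUS NOLOCK"},
    {"Id": 6, "Time": 674, "State": "wsrep in pre-commit stage", "Info": None},
]
print(find_stuck_threads(rows))  # [4, 6]
```

            Here that correctly flags the hung SHOW SLAVE STATUS (Id 4) and the write stuck in the wsrep pre-commit stage (Id 6), while skipping the idle aborter thread.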



            • #7
              And thats the output on backenddb-02:

              Code:
              mysql> show status like 'wsrep%';
              +------------------------------+--------------------------------------+
              | Variable_name                | Value                                |
              +------------------------------+--------------------------------------+
              | wsrep_local_state_uuid       | a296889f-13fd-11e4-9807-f2390d3c4de7 |
              | wsrep_protocol_version       | 6                                    |
              | wsrep_last_committed         | 3959061                              |
              | wsrep_replicated             | 0                                    |
              | wsrep_replicated_bytes       | 0                                    |
              | wsrep_repl_keys              | 0                                    |
              | wsrep_repl_keys_bytes        | 0                                    |
              | wsrep_repl_data_bytes        | 0                                    |
              | wsrep_repl_other_bytes       | 0                                    |
              | wsrep_received               | 79654                                |
              | wsrep_received_bytes         | 452958986                            |
              | wsrep_local_commits          | 0                                    |
              | wsrep_local_cert_failures    | 0                                    |
              | wsrep_local_replays          | 0                                    |
              | wsrep_local_send_queue       | 0                                    |
              | wsrep_local_send_queue_avg   | 0.001200                             |
              | wsrep_local_recv_queue       | 25                                   |
              | wsrep_local_recv_queue_avg   | 21744.541661                         |
              | wsrep_local_cached_downto    | 3879872                              |
              | wsrep_flow_control_paused_ns | 537270776867                         |
              | wsrep_flow_control_paused    | 0.371022                             |
              | wsrep_flow_control_sent      | 133                                  |
              | wsrep_flow_control_recv      | 133                                  |
              | wsrep_cert_deps_distance     | 2551.067470                          |
              | wsrep_apply_oooe             | 0.000000                             |
              | wsrep_apply_oool             | 0.000000                             |
              | wsrep_apply_window           | 1.000000                             |
              | wsrep_commit_oooe            | 0.000000                             |
              | wsrep_commit_oool            | 0.000000                             |
              | wsrep_commit_window          | 1.000000                             |
              | wsrep_local_state            | 4                                    |
              | wsrep_local_state_comment    | Synced                               |
              | wsrep_cert_index_size        | 83147                                |
              | wsrep_causal_reads           | 0                                    |
              | wsrep_cert_interval          | 0.000000                             |
              | wsrep_incoming_addresses     | 172.20.0.9:3306,172.20.0.8:3306      |
              | wsrep_evs_repl_latency       | 0/0/0/0/0                            |
              | wsrep_cluster_conf_id        | 4                                    |
              | wsrep_cluster_size           | 2                                    |
              | wsrep_cluster_state_uuid     | a296889f-13fd-11e4-9807-f2390d3c4de7 |
              | wsrep_cluster_status         | Primary                              |
              | wsrep_connected              | ON                                   |
              | wsrep_local_bf_aborts        | 0                                    |
              | wsrep_local_index            | 0                                    |
              | wsrep_provider_name          | Galera                               |
              | wsrep_provider_vendor        | Codership Oy <info@codership.com>    |
              | wsrep_provider_version       | 3.6(r3a949e6)                        |
              | wsrep_ready                  | ON                                   |
              +------------------------------+--------------------------------------+
              48 rows in set (0.00 sec)
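              A status dump like the one above can be turned into a quick health check. The following is a minimal sketch; the parsing matches the mysql client's table layout, but the two thresholds are illustrative assumptions, not official Galera limits:

```python
# Sketch: parse `SHOW STATUS LIKE 'wsrep%'` table output into a dict and run a
# few basic health checks on it.

def parse_wsrep_status(table_text):
    status = {}
    for line in table_text.strip().splitlines():
        # strip the outer "| ... |" frame, then split name from value
        parts = [p.strip() for p in line.strip("| ").split("|")]
        if len(parts) == 2 and parts[0].startswith("wsrep_"):
            status[parts[0]] = parts[1]
    return status

# Four of the values shown above for backenddb-02:
sample = """
| wsrep_local_recv_queue       | 25       |
| wsrep_flow_control_paused    | 0.371022 |
| wsrep_cluster_size           | 2        |
| wsrep_ready                  | ON       |
"""

s = parse_wsrep_status(sample)
healthy = (
    s["wsrep_ready"] == "ON"
    and float(s["wsrep_flow_control_paused"]) < 0.1  # assumed threshold
    and int(s["wsrep_local_recv_queue"]) < 10        # assumed threshold
)
print(healthy)  # False: the node is spending ~37% of its time paused
```

              With the numbers from this post the check fails, which matches the flow-control diagnosis below in the thread.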



              • #8
                Hi, sorry for getting back to this so late. I was busy and then it slipped my mind.
                I can see several problems here. In one log the SST fails:

                Code:
                2014-07-31 20:12:11 31574 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.20.0.9:4444/xtrabackup_sst' --auth 'sstuser:s3cretPass' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' '' --gtid 'a296889f-13fd-11e4-9807-f2390d3c4de7:3878257'
                2014-07-31 20:12:11 31574 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.20.0.9:4444/xtrabackup_sst' --auth 'sstuser:s3cretPass' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' '' --gtid 'a296889f-13fd-11e4-9807-f2390d3c4de7:3878257': 22 (Invalid argument)
                2014-07-31 20:12:11 31574 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup-v2 --role 'donor' --address '172.20.0.9:4444/xtrabackup_sst' --auth 'sstuser:s3cretPass' --socket '/var/run/mysqld/mysqld.sock' --datadir '/var/lib/mysql/' --defaults-file '/etc/mysql/my.cnf' '' --gtid 'a296889f-13fd-11e4-9807-f2390d3c4de7:3878257'
                2014-07-31 20:12:11 31574 [Warning] WSREP: 1.0 (backenddb-01): State transfer to 0.0 (backenddb-02) failed: -22 (Invalid argument)


                In another place there seems to be some corruption:

                Code:
                2014-07-31 20:16:33 7fe5e69e8700 InnoDB: unable to purge a record
                InnoDB: tuple DATA TUPLE: 2 fields;
                0: len 4; hex 002bad15; asc + ;;
                1: len 4; hex 000c8529; asc );;

                Check if maybe this is similar issue described here: https://bugs.launchpad.net/percona-server/+bug/1006095
                Then, in the last status output, I can see the cluster is formed of two nodes, but this one is blocked by flow control:

                Code:
                | wsrep_local_recv_queue       | 25           |
                | wsrep_local_recv_queue_avg   | 21744.541661 |
                | wsrep_local_cached_downto    | 3879872      |
                | wsrep_flow_control_paused_ns | 537270776867 |
                | wsrep_flow_control_paused    | 0.371022     |
                | wsrep_flow_control_sent      | 133          |
                | wsrep_flow_control_recv      | 133          |
                So at least this explains why the asynchronous replication stopped: the slave SQL thread cannot proceed while the node is blocked by flow control. The question now is why the nodes are blocked, and whether the state transfer was really successful. The InnoDB errors should be investigated. Also, how did you create the first node in the cluster - by copying the data set from the 5.1 master? Note that, in general, an in-place upgrade straight from 5.1 to 5.6 is neither recommended nor really supported.
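                Since wsrep_flow_control_paused_ns is a monotonically increasing counter of nanoseconds spent paused, the pause fraction over a chosen window can be computed from two samples of it. A small sketch of that calculation (the sample numbers are made up for illustration):

```python
# Sketch: fraction of a sampling interval a node spent paused by flow control,
# from two readings of wsrep_flow_control_paused_ns (a monotonic counter of
# nanoseconds spent paused).

def flow_control_pause_fraction(ns_start, ns_end, interval_seconds):
    paused_ns = ns_end - ns_start
    return paused_ns / (interval_seconds * 1_000_000_000)

# Illustrative numbers: the counter grew by 30 s of pause over a 60 s window,
# so the node was paused half the time - the async slave cannot keep up.
frac = flow_control_pause_fraction(507_270_776_867, 537_270_776_867, 60)
print(round(frac, 2))  # 0.5
```

                Sampling the counter this way gives a per-interval view, whereas wsrep_flow_control_paused alone only shows an average since the last FLUSH STATUS.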
                Last edited by przemek; 10-28-2014, 05:50 PM.

