Local state UUID does not match group state UUID

  • Local state UUID does not match group state UUID

    I had two servers set up with mysql-server. I uninstalled MySQL and installed Percona XtraDB Cluster.

    I'm trying to bootstrap my cluster. I've started the first node (its settings file contains wsrep_cluster_address=gcomm://).





    The first node seems to start without issue but the second node refuses to sync and start up.





    Below is what I see in the MySQL error log from the second node:





    130425 15:44:38 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql


    130425 15:44:38 mysqld_safe WSREP: Running position recovery with --log_error=/tmp/tmp.n8hkZ04kO8


    130425 15:44:43 mysqld_safe WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1


    130425 15:44:43 [Note] WSREP: wsrep_start_position var submitted: '00000000-0000-0000-0000-000000000000:-1'


    130425 15:44:43 [Note] WSREP: Read nil XID from storage engines, skipping position init


    130425 15:44:43 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/libgalera_smm.so'


    130425 15:44:43 [Note] WSREP: wsrep_load(): Galera 2.5(r150) by Codership Oy <info@codership.com> loaded succesfully.


    130425 15:44:43 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1


    130425 15:44:43 [Note] WSREP: Reusing existing '/var/lib/mysql//galera.cache'.


    130425 15:44:43 [Note] WSREP: Passing config to GCS: base_host = 10.99.108.194; base_port = 4567; cert.log_conflicts = no; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1; gcs.fc_limit = 16; gcs.fc_master_slave = NO; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = NO; replicator.causal_read_timeout = PT30S; replicator.commit_order = 3


    130425 15:44:43 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1


    130425 15:44:43 [Note] WSREP: wsrep_sst_grab()


    130425 15:44:43 [Note] WSREP: Start replication


    130425 15:44:43 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1


    130425 15:44:43 [Note] WSREP: protonet asio version 0


    130425 15:44:43 [Note] WSREP: backend: asio


    130425 15:44:43 [Note] WSREP: GMCast version 0


    130425 15:44:43 [Note] WSREP: (9374e2ae-ade0-11e2-0800-ba787389444a, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567


    130425 15:44:43 [Note] WSREP: (9374e2ae-ade0-11e2-0800-ba787389444a, 'tcp://0.0.0.0:4567') multicast: , ttl: 1


    130425 15:44:43 [Note] WSREP: EVS version 0


    130425 15:44:43 [Note] WSREP: PC version 0


    130425 15:44:43 [Note] WSREP: gcomm: connecting to group 'db-clstr1', peer '10.99.108.183:,10.99.108.194:'


    130425 15:44:43 [Warning] WSREP: (9374e2ae-ade0-11e2-0800-ba787389444a, 'tcp://0.0.0.0:4567') address 'tcp://10.99.108.194:4567' points to own listening address, blacklisting


    130425 15:44:43 [Note] WSREP: (9374e2ae-ade0-11e2-0800-ba787389444a, 'tcp://0.0.0.0:4567') address 'tcp://10.99.108.194:4567' pointing to uuid 9374e2ae-ade0-11e2-0800-ba787389444a is blacklisted, skipping


    130425 15:44:43 [Note] WSREP: declaring 2a9ab4b8-ae17-11e2-0800-e1ab0a761d5f stable


    130425 15:44:43 [Note] WSREP: Node 2a9ab4b8-ae17-11e2-0800-e1ab0a761d5f state prim


    130425 15:44:43 [Note] WSREP: view(view_id(PRIM,2a9ab4b8-ae17-11e2-0800-e1ab0a761d5f,2) memb {


    2a9ab4b8-ae17-11e2-0800-e1ab0a761d5f,


    9374e2ae-ade0-11e2-0800-ba787389444a,


    } joined {


    } left {


    } partitioned {


    })


    130425 15:44:44 [Note] WSREP: gcomm: connected


    130425 15:44:44 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636


    130425 15:44:44 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)


    130425 15:44:44 [Note] WSREP: Opened channel 'db-clstr1'


    130425 15:44:44 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2


    130425 15:44:44 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.


    130425 15:44:44 [Note] WSREP: Waiting for SST to complete.


    130425 15:44:44 [Note] WSREP: STATE EXCHANGE: sent state msg: a4ca303d-ae17-11e2-0800-ea8c731d7c4d


    130425 15:44:44 [Note] WSREP: STATE EXCHANGE: got state msg: a4ca303d-ae17-11e2-0800-ea8c731d7c4d from 0 (db2)


    130425 15:44:44 [Note] WSREP: STATE EXCHANGE: got state msg: a4ca303d-ae17-11e2-0800-ea8c731d7c4d from 1 (db3)


    130425 15:44:44 [Note] WSREP: Quorum results:


    version = 2,


    component = PRIMARY,


    conf_id = 1,


    members = 1/2 (joined/total),


    act_id = 4,


    last_appl. = -1,


    protocols = 0/4/2 (gcs/repl/appl),


    group UUID = ec007a0c-adf3-11e2-0800-6096aac0da36


    130425 15:44:44 [Note] WSREP: Flow-control interval: [23, 23]


    130425 15:44:44 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 4)


    130425 15:44:44 [Note] WSREP: State transfer required:


    Group state: ec007a0c-adf3-11e2-0800-6096aac0da36:4


    Local state: 00000000-0000-0000-0000-000000000000:-1


    130425 15:44:44 [Note] WSREP: New cluster view: global state: ec007a0c-adf3-11e2-0800-6096aac0da36:4, view# 2: Primary, number of nodes: 2, my index: 1, protocol version 2


    130425 15:44:44 [Warning] WSREP: Gap in state sequence. Need state transfer.


    130425 15:44:46 [Note] WSREP: Running: 'wsrep_sst_xtrabackup --role 'joiner' --address '10.99.108.194' --auth 'username:password' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --parent '4146''


    130425 15:44:46 [Note] WSREP: Prepared SST request: xtrabackup|10.99.108.194:4444/xtrabackup_sst


    130425 15:44:46 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.


    130425 15:44:46 [Note] WSREP: Assign initial position for certification: 4, protocol version: 2


    130425 15:44:46 [Warning] WSREP: Failed to prepare for incremental state transfer: Local state UUID (00000000-0000-0000-0000-000000000000) does not match group state UUID (ec007a0c-adf3-11e2-0800-6096aac0da36): 1 (Operation not permitted)


    at galera/src/replicator_str.cpp:prepare_for_IST():436. IST will be unavailable.


    130425 15:44:46 [Note] WSREP: Node 1 (db3) requested state transfer from '*any*'. Selected 0 (db2)(SYNCED) as donor.


    130425 15:44:46 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 4)


    130425 15:44:46 [Note] WSREP: Requesting state transfer: success, donor: 0





    At the command line what I see is:





    Starting MySQL (Percona XtraDB Cluster)...[FAILED]





    Why am I getting this and how can I fix it? I've poked holes in iptables (and I've tried turning iptables off entirely). I've also attempted to turn off SELinux, but that doesn't seem to solve the problem either.





    Thanks


    Brad

  • #2
    I was unable to solve this. I ended up flattening and rebuilding the box. After the rebuild, Percona XtraDB Cluster set up without any issues.



    • #3
      Nothing apparently wrong in those logs. Sometimes the init scripts return success or failure prematurely without waiting for SST to complete. Or, SST may have had some issue -- for xtrabackup you can see logs on the DONOR and JOINER in <datadir>/innobackup.something-or-other.log.
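
      To make the two checks above concrete, here is a minimal Python sketch; it is an illustration, not a Percona tool. It reads the "Group state" and "Local state" lines from a joiner's error log and reports whether the local UUID is all zeros, which is what a node with no saved cluster state reports. In the log above, that is the case where IST cannot be prepared and the node goes on to request a full SST. It also lists candidate SST log files in the datadir. The helper names and the `innobackup.*.log` glob are assumptions made for this example; the exact file names may differ on your system.

```python
import glob
import os
import re

ZERO_UUID = "00000000-0000-0000-0000-000000000000"

# Matches lines such as "Group state: <uuid>:<seqno>" and "Local state: <uuid>:<seqno>".
STATE_RE = re.compile(r"(Group|Local) state:\s*([0-9a-f-]{36}):(-?\d+)")


def read_states(log_text):
    """Return a dict mapping 'Group'/'Local' to (uuid, seqno); the last match of each kind wins."""
    states = {}
    for kind, uuid, seqno in STATE_RE.findall(log_text):
        states[kind] = (uuid, int(seqno))
    return states


def joiner_has_no_prior_state(log_text):
    """True when the local state UUID is all zeros, i.e. the node has no saved cluster state."""
    local = read_states(log_text).get("Local")
    return local is not None and local[0] == ZERO_UUID


def sst_log_candidates(datadir):
    """List files matching innobackup.*.log in datadir (the glob pattern is an assumption)."""
    return sorted(glob.glob(os.path.join(datadir, "innobackup.*.log")))


if __name__ == "__main__":
    # Two lines copied from the joiner log in the original post.
    sample = (
        "Group state: ec007a0c-adf3-11e2-0800-6096aac0da36:4\n"
        "Local state: 00000000-0000-0000-0000-000000000000:-1\n"
    )
    print(read_states(sample))
    print(joiner_has_no_prior_state(sample))
    # Datadir taken from the log above; run this on both the DONOR and the JOINER.
    for path in sst_log_candidates("/var/lib/mysql"):
        print(path)
```

      If the second check is true, the UUID mismatch warning by itself is expected on a first join, and the SST logs on both nodes are the next place to look.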
