
[ERROR] WSREP: Node consistency compromised, aborting...



    Hello,

    I have a 3-node cluster. One node (NodeB) keeps getting evicted and terminated minutes or hours after SST completes. Please advise, as I have been unable to identify the cause or fix it. I have tried a few suggestions, including adding it as a new node.

    Problem:
    Unexpected eviction of a cluster node due to an error executing a row event?

    Analysis:

    I frequently see many InnoDB [Warning] messages indicating "Cannot open table from internal data dictionary though .frm file exists",
    followed by an "Error executing row event" error, as below:

    [ERROR] Slave SQL: Error executing row event: 'Table '<db>'.<table>' doesn't exist', Error_code: 1146
    ...
    [ERROR] WSREP: Failed to apply trx: source: ...
    [ERROR] WSREP: Failed to apply trx 10703962 4 times
    [ERROR] WSREP: Node consistency compromised, aborting...
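To narrow down which tables are involved, something like the following could be used to count the tables named in the 1146 row-event errors. This is only a rough sketch: the function name `missing_tables` and the regular expression are my own assumptions, based on the message format shown above, and may need adjusting for other log formats.

```python
import re
from collections import Counter

# Assumed pattern: captures whatever sits between "Table '" and
# "' doesn't exist" in the row-event error line shown above.
TABLE_RE = re.compile(r"Table '(.+?)' doesn't exist")


def missing_tables(lines):
    """Count how often each table appears in 'doesn't exist' (1146) errors."""
    counts = Counter()
    for line in lines:
        # Only look at lines carrying the 1146 error code.
        if "Error_code: 1146" not in line:
            continue
        match = TABLE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

It accepts any iterable of lines, so an open file handle for the error log on NodeB can be passed in directly, e.g. `missing_tables(open(path))`, where `path` is wherever the MySQL error log lives on that node.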


    This then leads to complete eviction and termination, as follows:

    2019-05-13T19:24:45.626814-05:00 2 [Note] WSREP: turning isolation on
    2019-05-13T19:24:45.626926-05:00 2 [Note] WSREP: Closing send monitor...
    2019-05-13T19:24:45.626968-05:00 2 [Note] WSREP: Closed send monitor.
    2019-05-13T19:24:45.627056-05:00 2 [Note] WSREP: gcomm: terminating thread
    2019-05-13T19:24:45.627301-05:00 2 [Note] WSREP: gcomm: joining thread
    2019-05-13T19:24:45.627344-05:00 2 [Note] WSREP: gcomm: closing backend
    2019-05-13T19:24:46.056813-05:00 2 [Note] WSREP: (111f9965, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://<ipNodeA>:4567 tcp://<ipNodeC>:4567
    2019-05-13T19:24:47.326144-05:00 0 [Note] InnoDB: Buffer pool(s) load completed at 190513 19:24:47
    2019-05-13T19:24:50.627502-05:00 2 [Note] WSREP: declaring node with index 0 suspected, timeout PT5S (evs.suspect_timeout)
    2019-05-13T19:24:50.627538-05:00 2 [Note] WSREP: declaring node with index 2 suspected, timeout PT5S (evs.suspect_timeout)
    2019-05-13T19:24:50.627548-05:00 2 [Note] WSREP: evs::proto(111f9965, LEAVING, view_id(REG,02457e1e,240)) suspecting node: 02457e1e
    2019-05-13T19:24:50.627553-05:00 2 [Note] WSREP: evs::proto(111f9965, LEAVING, view_id(REG,02457e1e,240)) suspected node without join message, declaring inactive
    2019-05-13T19:24:50.627559-05:00 2 [Note] WSREP: evs::proto(111f9965, LEAVING, view_id(REG,02457e1e,240)) suspecting node: fc467d1a
    2019-05-13T19:24:50.627563-05:00 2 [Note] WSREP: evs::proto(111f9965, LEAVING, view_id(REG,02457e1e,240)) suspected node without join message, declaring inactive
    2019-05-13T19:24:50.627581-05:00 2 [Note] WSREP: Current view of cluster as seen by this node
    view (view_id(NON_PRIM,02457e1e,240)
    memb {
    111f9965,0
    }
    joined {
    }
    left {
    }
    partitioned {
    02457e1e,0
    fc467d1a,0
    }
    )
    2019-05-13T19:24:50.627630-05:00 2 [Note] WSREP: Current view of cluster as seen by this node
    view ((empty))
    2019-05-13T19:24:50.627789-05:00 2 [Note] WSREP: gcomm: closed
    2019-05-13T19:24:50.627866-05:00 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
    2019-05-13T19:24:50.627883-05:00 0 [Note] WSREP: Flow-control interval: [100, 100]
    2019-05-13T19:24:50.627887-05:00 0 [Note] WSREP: Trying to continue unpaused monitor
    2019-05-13T19:24:50.627890-05:00 0 [Note] WSREP: Received NON-PRIMARY.
    2019-05-13T19:24:50.627893-05:00 0 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 10703962)
    2019-05-13T19:24:50.627900-05:00 0 [Note] WSREP: Received self-leave message.
    2019-05-13T19:24:50.627935-05:00 0 [Note] WSREP: Flow-control interval: [0, 0]
    2019-05-13T19:24:50.627938-05:00 0 [Note] WSREP: Trying to continue unpaused monitor
    2019-05-13T19:24:50.627940-05:00 0 [Note] WSREP: Received SELF-LEAVE. Closing connection.
    2019-05-13T19:24:50.627942-05:00 0 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 10703962)
    2019-05-13T19:24:50.627948-05:00 0 [Note] WSREP: RECV thread exiting 0: Success
    2019-05-13T19:24:50.627979-05:00 2 [Note] WSREP: recv_thread() joined.
    2019-05-13T19:24:50.627985-05:00 2 [Note] WSREP: Closing replication queue.
    2019-05-13T19:24:50.627992-05:00 2 [Note] WSREP: Closing slave action queue.
    2019-05-13T19:24:50.627999-05:00 2 [Note] WSREP: /usr/sbin/mysqld: Terminated.