Emergency

MySQL GTID: gtid_executed split brain after SST transfer


    Tested with server version: 5.7.19-17-57-log Percona XtraDB Cluster (GPL), Release rel17, Revision 35cdc81, WSREP version 29.22, wsrep_29.22

    pxc-1: Ubuntu 16.04.3 LTS (percona-xtradb-cluster-server-5.7|5.7.19-29.22-3.xenial)
    pxc-2: Ubuntu 16.04.3 LTS (percona-xtradb-cluster-server-5.7|5.7.19-29.22-3.xenial)
    pxc-3: Ubuntu 16.04.3 LTS (percona-xtradb-cluster-server-5.7|5.7.19-29.22-3.xenial)


    I deployed a PXC cluster in which all nodes are GTID enabled, then added a new node to it.

    I made sure wsrep_local_state_comment was SYNCED on every node, and gtid_executed was identical on all nodes.

    pxc-1 mysql> show global variables like '%gtid%';
    +----------------------------------+------------------------------------------+
    | Variable_name                    | Value                                    |
    +----------------------------------+------------------------------------------+
    | binlog_gtid_simple_recovery      | ON                                       |
    | enforce_gtid_consistency         | ON                                       |
    | gtid_executed                    | db3242e0-0aec-ee18-551c-04f5dfbea3f9:1-6 |
    | gtid_executed_compression_period | 1000                                     |
    | gtid_mode                        | ON                                       |
    | gtid_owned                       |                                          |
    | gtid_purged                      |                                          |
    | session_track_gtids              | OFF                                      |
    +----------------------------------+------------------------------------------+
    8 rows in set (0.01 sec)
    pxc-2 mysql> show global variables like '%gtid%';
    +----------------------------------+------------------------------------------+
    | Variable_name                    | Value                                    |
    +----------------------------------+------------------------------------------+
    | binlog_gtid_simple_recovery      | ON                                       |
    | enforce_gtid_consistency         | ON                                       |
    | gtid_executed                    | db3242e0-0aec-ee18-551c-04f5dfbea3f9:1-6 |
    | gtid_executed_compression_period | 1000                                     |
    | gtid_mode                        | ON                                       |
    | gtid_owned                       |                                          |
    | gtid_purged                      | db3242e0-0aec-ee18-551c-04f5dfbea3f9:1-3 |
    | session_track_gtids              | OFF                                      |
    +----------------------------------+------------------------------------------+
    8 rows in set (0.02 sec)
    pxc-3 mysql> show global variables like '%gtid%';
    +----------------------------------+------------------------------------------+
    | Variable_name                    | Value                                    |
    +----------------------------------+------------------------------------------+
    | binlog_gtid_simple_recovery      | ON                                       |
    | enforce_gtid_consistency         | ON                                       |
    | gtid_executed                    | db3242e0-0aec-ee18-551c-04f5dfbea3f9:1-6 |
    | gtid_executed_compression_period | 1000                                     |
    | gtid_mode                        | ON                                       |
    | gtid_owned                       |                                          |
    | gtid_purged                      | db3242e0-0aec-ee18-551c-04f5dfbea3f9:1-3 |
    | session_track_gtids              | OFF                                      |
    +----------------------------------+------------------------------------------+
    8 rows in set (0.01 sec)
    Then, as a test, I emptied the datadir of one node (pxc-3) and triggered an SST.

    pxc-3 operator steps
    1. /etc/init.d/mysql stop
    2. rm -fr /var/lib/mysql/*
    3. /etc/init.d/mysql start
    4. Check wsrep_local_state_comment is SYNCED
    Here is the problem: after the SST, gtid_executed on pxc-3 contains a second GTID set with a new source UUID (gtid_executed split brain).

    pxc-3 mysql> show global variables like '%gtid%';
    +----------------------------------+------------------------------------------+
    | Variable_name                    | Value                                    |
    +----------------------------------+------------------------------------------+
    | binlog_gtid_simple_recovery      | ON                                       |
    | enforce_gtid_consistency         | ON                                       |
    | gtid_executed                    | ae3ee703-f517-11e7-88e7-fa163e6aa620:1,  |
    |                                  | db3242e0-0aec-ee18-551c-04f5dfbea3f9:1-6 |
    | gtid_executed_compression_period | 1000                                     |
    | gtid_mode                        | ON                                       |
    | gtid_owned                       |                                          |
    | gtid_purged                      | db3242e0-0aec-ee18-551c-04f5dfbea3f9:1-6 |
    | session_track_gtids              | OFF                                      |
    +----------------------------------+------------------------------------------+
    8 rows in set (0.02 sec)
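
For illustration, the mismatch in the output above can be detected mechanically by comparing the source UUIDs in each node's gtid_executed. This is only a minimal sketch in Python, not something that was run as part of the test; the helper names `parse_gtid_set` and `extra_uuids` are hypothetical, and the input strings are copied from the outputs in this post.

```python
def parse_gtid_set(text):
    """Parse a GTID set string into {uuid: [(start, end), ...]}."""
    result = {}
    # Entries are comma-separated; the mysql client may wrap them across lines.
    for entry in text.replace("\n", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        uuid, *intervals = entry.split(":")
        ranges = result.setdefault(uuid.lower(), [])
        for interval in intervals:
            start, _, end = interval.partition("-")
            # A single transaction such as ":1" has no end; treat it as 1-1.
            ranges.append((int(start), int(end or start)))
    return result


def extra_uuids(node_set, reference_set):
    """Return source UUIDs present on a node but absent from the reference."""
    return sorted(set(parse_gtid_set(node_set)) - set(parse_gtid_set(reference_set)))


# gtid_executed as reported by pxc-1 and by pxc-3 after the SST.
pxc1 = "db3242e0-0aec-ee18-551c-04f5dfbea3f9:1-6"
pxc3 = """ae3ee703-f517-11e7-88e7-fa163e6aa620:1,
db3242e0-0aec-ee18-551c-04f5dfbea3f9:1-6"""

print(extra_uuids(pxc3, pxc1))
```

With the values above, the only UUID reported is ae3ee703-f517-11e7-88e7-fa163e6aa620, the one that appeared on pxc-3 after the SST.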
    The new GTID set prevents my replication test from continuing; the slave reports the following error.

    Replication ERROR Message
    mysql> SHOW SLAVE STATUS\G
    Slave_IO_State: Waiting for master to send event
    Master_Host: 192.168.1.3
    Master_User: sstuser
    Master_Port: 3306
    Connect_Retry: 60
    Master_Log_File: db-bin.000009
    Read_Master_Log_Pos: 234

    Last_Errno: 1524
    Last_Error: Error 'Plugin 'auth_socket' is not loaded' on query. Default database: 'mysql'. Query: 'ALTER USER 'root'@'localhost' IDENTIFIED WITH 'auth_socket''

    Last_SQL_Errno: 1524
    Last_SQL_Error: Error 'Plugin 'auth_socket' is not loaded' on query. Default database: 'mysql'. Query: 'ALTER USER 'root'@'localhost' IDENTIFIED WITH 'auth_socket''

    Master_UUID: ed591402-f50b-11e7-9a2a-fa163e6aa620
    Retrieved_Gtid_Set: ae3ee703-f517-11e7-88e7-fa163e6aa620:1
    Executed_Gtid_Set: db3242e0-0aec-ee18-551c-04f5dfbea3f9:1-6

    1 row in set (0.00 sec)
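
To make the slave status easier to read, the transactions that were retrieved but not yet executed can be computed by subtracting Executed_Gtid_Set from Retrieved_Gtid_Set. On a live server the built-in GTID_SUBTRACT() function does this comparison; the Python below is only an offline sketch of the same idea, with hypothetical helper names (`expand`, `pending`), and it assumes small sets like the ones in this post.

```python
def expand(text):
    """Expand a GTID set string into a set of (uuid, transaction_id) pairs.

    Adequate for small sets; large ranges would need interval arithmetic.
    """
    pairs = set()
    for entry in text.replace("\n", "").split(","):
        entry = entry.strip()
        if not entry:
            continue
        uuid, *intervals = entry.split(":")
        for interval in intervals:
            start, _, end = interval.partition("-")
            for txn in range(int(start), int(end or start) + 1):
                pairs.add((uuid.lower(), txn))
    return pairs


def pending(retrieved, executed):
    """Return transactions in `retrieved` that are not in `executed`."""
    return sorted(expand(retrieved) - expand(executed))


# Values copied from the SHOW SLAVE STATUS output above.
retrieved = "ae3ee703-f517-11e7-88e7-fa163e6aa620:1"
executed = "db3242e0-0aec-ee18-551c-04f5dfbea3f9:1-6"

print(pending(retrieved, executed))
```

The single pending transaction carries the new UUID, which matches the failing ALTER USER statement shown in Last_SQL_Error.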
    The following parameters were changed from the default settings:

    pxc-1: /etc/mysql/percona-xtradb-cluster.conf.d/mysqld.cnf
    server-id=101
    log_slave_updates
    enforce_gtid_consistency=1
    gtid_mode=on
    log-bin="db-bin."

    pxc-1: /etc/mysql/percona-xtradb-cluster.conf.d/wsrep.cnf
    wsrep_cluster_address=gcomm://192.168.1.1,192.168.1.2,192.168.1.3
    pxc-2: /etc/mysql/percona-xtradb-cluster.conf.d/mysqld.cnf
    server-id=102
    log_slave_updates
    enforce_gtid_consistency=1
    gtid_mode=on
    log-bin="db-bin."

    pxc-2: /etc/mysql/percona-xtradb-cluster.conf.d/wsrep.cnf
    wsrep_cluster_address=gcomm://192.168.1.1,192.168.1.2,192.168.1.3
    pxc-3: /etc/mysql/percona-xtradb-cluster.conf.d/mysqld.cnf
    server-id=103
    log_slave_updates
    enforce_gtid_consistency=1
    gtid_mode=on
    log-bin="db-bin."

    pxc-3: /etc/mysql/percona-xtradb-cluster.conf.d/wsrep.cnf
    wsrep_cluster_address=gcomm://192.168.1.1,192.168.1.2,192.168.1.3
    Please help confirm whether this is a bug.

    Thanks.

    Ref: https://jira.percona.com/browse/PXC-918