During the last few weeks I’ve been testing and playing a bit with the new group replication plugin available for MySQL 5.7. Before continuing, I’d like to clarify some aspects: the plugin is only available in labs and is not yet ready for production. The current version is 0.6. I used 5.7.9 GA running in a three-node Vagrant cluster with CentOS 7.
As an additional note, I tested the previous version of the plugin (0.5) against 5.7.8-rc, and there have been some good changes since, so I recommend starting with the GA version.
For my tests I followed the instructions from this post. It’s not as straightforward as it looks; there were some issues that needed to be handled, but I finally managed to get a consistent 3-node cluster running:
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 80cc27ce-7cb4-11e5-9055-0800275ff74d | node2       | 3306        | ONLINE       |
| group_replication_applier | a0ef74a1-7cb3-11e5-845e-0800275ff74d | node1       | 3306        | ONLINE       |
| group_replication_applier | e5263489-7cb7-11e5-a8ee-0800275ff74d | node3       | 3306        | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)
1- The previous version relied on Corosync for group communication, meaning that you needed to install and configure an extra piece of software. For those who aren’t very familiar with it (like me), it can be a bit tricky and complex. In version 0.6.0 for MySQL 5.7.9, a new communication engine called XCom was added. It is now the default communication framework and is included in the plugin, so no extra software is needed.
2- When you initialize the MySQL database (the mysqld --initialize option now replaces the mysql_install_db script), you need to disable binary logging in the configuration file; otherwise statements such as ‘create database mysql’ will be pushed to the binary logs and cause issues with nodes joining the cluster, due to errors like:

2015-10-21T20:18:52.059231Z 8 [Warning] Slave: Can't create database 'mysql'; database exists Error_code: 1007
3- In group replication there is no concept like SST (State Snapshot Transfer), which in Galera basically drops and recreates the datadir if it finds data discrepancies. With group replication you may end up with different datasets, and replication will continue working (more on this later in the post).
4- For incremental state transfer (a.k.a. IST in Galera), group replication relies on the binary logs present in any of the potential donors (at the moment the donor is selected randomly). So if a node is disconnected, when it comes back online it requests binary logs from the donor using the same IO thread as regular replication. The problem here is that if the binary logs were purged on the donor, then the joiner can’t be recovered and a full rebuild is needed. This is a similar approach to the gcache in Galera, but when the gcache is not able to provide the transactions needed for IST, an SST is performed instead. Group replication can’t do this (yet?).
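The binlog-based recovery described in point 4 can be sketched as a set operation on GTIDs. This is a hypothetical model, not the plugin’s actual code: the function name and the use of plain integer sets for GTID sets are my own simplifications.

```python
# Hypothetical sketch (not the plugin's actual code): models whether a
# rejoining node can catch up from a donor's binary logs, GTID-style.
def can_recover(joiner_executed, donor_executed, donor_purged):
    """Each argument is a set of transaction ids (a simplified GTID set).

    The joiner needs every transaction the donor has executed but the
    joiner has not. If any of those were purged from the donor's binary
    logs, incremental recovery fails and a full rebuild is needed.
    """
    needed = donor_executed - joiner_executed
    return needed.isdisjoint(donor_purged)

# Donor executed 1-10, purged 1-4; a joiner that stopped at 5 can recover...
print(can_recover(set(range(1, 6)), set(range(1, 11)), set(range(1, 5))))   # True
# ...but a joiner that stopped at 3 cannot: it needs purged transaction 4.
print(can_recover(set(range(1, 4)), set(range(1, 11)), set(range(1, 5))))   # False
```

In Galera the second case would fall back to an SST; here it means rebuilding the node by hand.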
These are some of the installation issues I faced. Now that we have the cluster running, what works? Well let’s try some samples.
Simple write tests
I tried running simple write operations (a few inserts, create tables and so on) using sysbench like this:
sysbench --test=/vagrant/sysb/doc/sysbench/tests/db/update_index.lua --mysql-user=root --mysql-db=test --mysql-socket=/data/mysql.sock --oltp-table-size=25000 --oltp-tables-count=8 --num-threads=8 --report-interval=1 --max-requests=0 run
And checked the status on the other nodes. It does what it is supposed to do; the data and records are found on the rest of the nodes, just like this:
Node1:
mysql> select * from sbtest1 order by 1 limit 4;
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id | k     | c                                                                                                                       | pad                                                         |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
|  8 | 12410 | 05879259692-02024081996-76022818191-82933803603-81845875017-31928300264-16934042125-67052432228-92123768050-95121478647 | 79362588344-09017007031-35411714211-52054317597-26283585383 |
| 15 | 12593 | 86109785327-21666363106-71483869798-96435839840-61691707322-34771340621-77881281887-34657680322-46195586522-19795644808 | 74925946566-05246863347-88207617885-76849666822-72200234338 |
| 22 | 12544 | 57481185690-89398636500-16888148413-67987678267-15604944838-94210794401-18107184012-91338377776-83386272438-09451188763 | 35227182905-15234265621-59793845249-15413569710-23749555118 |
| 29 | 12436 | 29279855805-99348203463-85191104223-39587263726-81794340135-73817557808-54578801760-64404111877-55434439976-37212880746 | 59222897263-22759023974-22020489960-93434521232-77981152534 |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
4 rows in set (0.08 sec)
Node2:
mysql> select * from sbtest1 order by 1 limit 4;
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id | k     | c                                                                                                                       | pad                                                         |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
|  8 | 12410 | 05879259692-02024081996-76022818191-82933803603-81845875017-31928300264-16934042125-67052432228-92123768050-95121478647 | 79362588344-09017007031-35411714211-52054317597-26283585383 |
| 15 | 12593 | 86109785327-21666363106-71483869798-96435839840-61691707322-34771340621-77881281887-34657680322-46195586522-19795644808 | 74925946566-05246863347-88207617885-76849666822-72200234338 |
| 22 | 12544 | 57481185690-89398636500-16888148413-67987678267-15604944838-94210794401-18107184012-91338377776-83386272438-09451188763 | 35227182905-15234265621-59793845249-15413569710-23749555118 |
| 29 | 12436 | 29279855805-99348203463-85191104223-39587263726-81794340135-73817557808-54578801760-64404111877-55434439976-37212880746 | 59222897263-22759023974-22020489960-93434521232-77981152534 |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
4 rows in set (0.08 sec)
Well we expected this so, yay, we are fine.
What about trying to write to 2 nodes at the same time? This should trigger things like conflict resolution during certification; in a nutshell, if we expect to use group replication to write to multiple nodes at the same time, we need a way to resolve conflicts in the data. These most commonly show up as PK/UK violations; in other words, 2 transactions trying to insert the same record/id. This is not recommended because it is not an approach we can use to scale up writes (same as with Galera), but it’s still possible to do.
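The certification idea can be sketched as optimistic, first-committer-wins conflict detection on write sets. This is a hypothetical model, not the plugin’s actual code: the function name and the per-row version map are my own simplifications.

```python
# Hypothetical sketch (not the plugin's actual code): first-committer-wins
# certification. A transaction passes certification if no row in its write
# set was committed by someone else after the transaction took its snapshot.
def certify(txn_writeset, txn_snapshot_version, last_committed):
    """last_committed maps row key -> version at which it was last committed."""
    for row in txn_writeset:
        if last_committed.get(row, 0) > txn_snapshot_version:
            return False  # conflict: the row changed under us -> rollback
    # certified: bump the version of every row this transaction wrote
    new_version = max(last_committed.values(), default=0) + 1
    for row in txn_writeset:
        last_committed[row] = new_version
    return True

versions = {}
# Two transactions that both read at snapshot 0 and update the same row:
print(certify({"sbtest6:12608"}, 0, versions))  # True  - first writer commits
print(certify({"sbtest6:12608"}, 0, versions))  # False - second one is rolled back
```

The second transaction failing certification is what sysbench surfaces below as "Error 1180 Got error 149 during COMMIT".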
An easy way to test this is to run sysbench on more than one member of the cluster and wait for a failure. As expected, it does what it is supposed to do:
Node1
[root@node1 data]# sysbench --test=/vagrant/sysb/doc/sysbench/tests/db/update_index.lua --mysql-user=root --mysql-db=test --mysql-socket=/data/mysql.sock --oltp-table-size=25000 --oltp-tables-count=8 --num-threads=8 --report-interval=1 --max-requests=0 run |grep tps
[ 1s] threads: 8, tps: 0.00, reads: 0.00, writes: 1416.75, response time: 25.65ms (95%), errors: 0.00, reconnects: 0.00
[ 2s] threads: 8, tps: 0.00, reads: 0.00, writes: 1893.78, response time: 20.94ms (95%), errors: 0.00, reconnects: 0.00
[ 3s] threads: 8, tps: 0.00, reads: 0.00, writes: 1421.19, response time: 28.44ms (95%), errors: 0.00, reconnects: 0.00
[ 4s] threads: 8, tps: 0.00, reads: 0.00, writes: 1397.83, response time: 34.92ms (95%), errors: 0.00, reconnects: 0.00
[ 5s] threads: 8, tps: 0.00, reads: 0.00, writes: 1734.31, response time: 22.75ms (95%), errors: 0.00, reconnects: 0.00
Node2
[root@node2 vagrant]# sysbench --test=/vagrant/sysb/doc/sysbench/tests/db/update_index.lua --mysql-user=root --mysql-db=test --mysql-socket=/data/mysql.sock --oltp-table-size=25000 --oltp-tables-count=8 --num-threads=8 --report-interval=1 --max-requests=0 run
WARNING: Both max-requests and max-time are 0, running endless test
sysbench 0.5: multi-threaded system evaluation benchmark

Running the test with following options:
Number of threads: 8
Report intermediate results every 1 second(s)
Random number generator seed is 0 and will be ignored

Threads started!

ALERT: failed to execute MySQL query: `UPDATE sbtest6 SET k=k+1 WHERE id=12608`:
ALERT: Error 1180 Got error 149 during COMMIT
FATAL: failed to execute function `event': (null)
ALERT: failed to execute MySQL query: `UPDATE sbtest6 SET k=k+1 WHERE id=12468`:
ALERT: Error 1180 Got error 149 during COMMIT
FATAL: failed to execute function `event': (null)
Node2 eventually failed, but what happened? Let’s check the error log to see what it reports:
2015-10-27T17:12:33.894759Z 3 [Note] InnoDB: Blocking transaction: ID: 223804 - Blocked transaction ID: 223820 - MySQL thread id 59, OS thread handle 139838350866176, query id 197913 localhost root query end
UPDATE sbtest5 SET k=k+1 WHERE id=12510
At the commit stage there was a conflict with a transaction already committed on Node1, so the operation was forced to fail and roll back. So far, so good.
What about a node going down?
One of the tests I ran was to kill one of the nodes during operations, to see if it resumes replication properly when it comes back to life. For this we need to set some variables in the configuration file, as follows:
loose-group_replication_group_name="8a94f357-aab4-11df-86ab-c80aa9429562"
loose-group_replication_start_on_boot=1
loose-group_replication_recovery_user='rpl_user'
loose-group_replication_recovery_password='rpl_pass'
Note: Interestingly, the replication credentials are not saved into a table (as is done with slave_master_info in regular replication). I guess this is part of the to-do list, but it’s something to keep in mind, since it implies a security risk.
Back to our test. I ran the regular sysbench command on one of the nodes, then went to node2 and killed the mysqld daemon. After the regular crash recovery messages we can see:
[root@node2 data]# killall mysqld
[root@node2 data]# tail -500 error.log
2015-10-27T17:15:26.460674Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
2015-10-27T17:15:26.460711Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
2015-10-27T17:15:26.461001Z 0 [Note] InnoDB: Waiting for purge to start
2015-10-27T17:15:26.514015Z 0 [Note] InnoDB: 5.7.9 started; log sequence number 77370253
2015-10-27T17:15:26.515417Z 0 [Note] Plugin 'FEDERATED' is disabled.
2015-10-27T17:15:26.525466Z 0 [Note] InnoDB: not started
2015-10-27T17:15:26.525914Z 0 [Note] InnoDB: Loading buffer pool(s) from /data/ib_buffer_pool
[XCOM_BINDING_DEBUG] ::initialize_xcom():: Configuring Xcom group: XCom Group ID=1827295128 Name=8a94f357-aab4-11df-86ab-c80aa9429562
[XCOM_BINDING_DEBUG] ::initialize_xcom():: Configured Peer Nodes: 192.168.70.2:10300
[XCOM_BINDING_DEBUG] ::initialize_xcom():: Configured Peer Nodes: 192.168.70.3:10300
[XCOM_BINDING_DEBUG] ::initialize_xcom():: Configured Total number of peers: 2
[XCOM_BINDING_DEBUG] ::initialize_xcom():: Configured Local Node: 192.168.70.3:10300
[XCOM_BINDING_DEBUG] ::initialize_xcom():: Configured Bootstrap: false
2015-10-27T17:15:26.723392Z 0 [Note] InnoDB: Buffer pool(s) load completed at 151027 17:15:26
2015-10-27T17:15:27.135089Z 0 [Warning] Failed to set up SSL because of the following SSL library error: SSL context is not usable without certificate and private key
2015-10-27T17:15:27.136449Z 0 [Note] Server hostname (bind-address): '*'; port: 3306
2015-10-27T17:15:27.145198Z 0 [Note] IPv6 is available.
2015-10-27T17:15:27.145247Z 0 [Note] - '::' resolves to '::';
2015-10-27T17:15:27.145265Z 0 [Note] Server socket created on IP: '::'.
2015-10-27T17:15:27.171615Z 0 [Warning] 'user' entry 'root@localhost' ignored in --skip-name-resolve mode.
2015-10-27T17:15:27.171711Z 0 [Warning] 'user' entry 'mysql.sys@localhost' ignored in --skip-name-resolve mode.
2015-10-27T17:15:27.172447Z 0 [Warning] 'db' entry 'sys mysql.sys@localhost' ignored in --skip-name-resolve mode.
2015-10-27T17:15:27.173089Z 0 [Warning] 'proxies_priv' entry '@ root@localhost' ignored in --skip-name-resolve mode.
2015-10-27T17:15:27.192881Z 0 [Warning] 'tables_priv' entry 'sys_config mysql.sys@localhost' ignored in --skip-name-resolve mode.
2015-10-27T17:15:27.205764Z 0 [Warning] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--relay-log=node2-relay-bin' to avoid this problem.
2015-10-27T17:15:27.676222Z 1 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_applier' executed'. Previous state master_host='<NULL>', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='<NULL>', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''.
2015-10-27T17:15:27.685374Z 3 [Note] Slave SQL thread for channel 'group_replication_applier' initialized, starting replication in log 'FIRST' at position 0, relay log './node2-relay-bin-group_replication_applier.000002' position: 51793711
2015-10-27T17:15:27.685985Z 0 [Note] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2015-10-27T17:15:27.686009Z 0 [Note] Plugin group_replication reported: 'auto_increment_increment is set to 7'
2015-10-27T17:15:27.686017Z 0 [Note] Plugin group_replication reported: 'auto_increment_offset is set to 2'
[XCOM BINDING DEBUG] ::join()
connecting to 192.168.70.3 10300
state 0 action xa_init
connecting to 192.168.70.3 10300
connected to 192.168.70.3 10300
connecting to 192.168.70.3 10300
connected to 192.168.70.3 10300
...
[XCOM BINDING DEBUG] ::join():: I am NOT the boot node.
[XCOM BINDING DEBUG] ::join():: xcom_client_open_connection to 192.168.70.2:10300
connecting to 192.168.70.2 10300
connected to 192.168.70.2 10300
[XCOM BINDING DEBUG] ::join():: Calling xcom_client_add_node
cli_err 0
state 3489 action xa_snapshot
new state x_recover
state 3505 action xa_complete
new state x_run
get_nodeno(get_site_def()) = 2 task_now() = 1445966128.920615 n = 0 median_time() = 1.000000 executed_msg={84bca5ce 12385 1}
[XCOM BINDING DEBUG] ::join():: GCS_OK
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
2015-10-27T17:15:28.926806Z 0 [Note] Event Scheduler: Loaded 0 events
2015-10-27T17:15:28.928485Z 0 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.7.9-log' socket: '/data/mysql.sock' port: 3306 MySQL Community Server (GPL)
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
[XCOM_BINDING_DEBUG] : Rejecting this message. I'm not in a view yet.
....
[XCOM_BINDING_DEBUG] ::process_possible_control_message():: Received a control message
[XCOM_BINDING_DEBUG] ::process_possible_control_message()::Install new view
[XCOM_BINDING_DEBUG] ::install_view():: Processing exchanged data
[XCOM_BINDING_DEBUG] ::install_view():: Processing exchanged data
[XCOM_BINDING_DEBUG] ::install_view():: Processing exchanged data
2015-10-27T17:15:30.084101Z 0 [Note] Plugin group_replication reported: 'Starting group replication recovery with view_id 46750407:7'
[XCOM_BINDING_DEBUG] ::install_view():: View delivered to client handler= 1382963399
2015-10-27T17:15:30.091414Z 5 [Note] Plugin group_replication reported: 'Establishing group recovery connection with a possible donor. Attempt 1/86400'
2015-10-27T17:15:30.107078Z 5 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='node3', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''.
2015-10-27T17:15:30.117379Z 5 [Note] Plugin group_replication reported: 'Establishing connection to a group replication recovery donor e5263489-7cb7-11e5-a8ee-0800275ff74d at node3 port: 3306.'
2015-10-27T17:15:30.118109Z 6 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2015-10-27T17:15:30.130001Z 7 [Note] Slave SQL thread for channel 'group_replication_recovery' initialized, starting replication in log 'FIRST' at position 0, relay log './node2-relay-bin-group_replication_recovery.000001' position: 4
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12456 message_id.node= 0
[XCOM_BINDING_DEBUG] ::xcom_receive_data():: Delivered message to client handler= 684195134
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12456 message_id.node= 0
[XCOM_BINDING_DEBUG] ::xcom_receive_data():: Delivered message to client handler= 684195134
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12456 message_id.node= 0
[XCOM_BINDING_DEBUG] ::xcom_receive_data():: Delivered message to client handler= 684195134
...
2015-10-27T17:15:30.169817Z 6 [ERROR] Slave I/O for channel 'group_replication_recovery': error connecting to master 'rpl_user@node3:3306' - retry-time: 60 retries: 1, Error_code: 1130
2015-10-27T17:15:30.169856Z 6 [Note] Slave I/O thread for channel 'group_replication_recovery' killed while connecting to master
2015-10-27T17:15:30.169862Z 6 [Note] Slave I/O thread exiting for channel 'group_replication_recovery', read up to log 'FIRST', position 4
2015-10-27T17:15:30.174955Z 5 [ERROR] Plugin group_replication reported: 'There was an error when connecting to the donor server. Check group replication recovery's connection credentials.'
2015-10-27T17:15:30.175573Z 5 [Note] Plugin group_replication reported: 'Retrying group recovery connection with another donor. Attempt 2/86400'
2015-10-27T17:15:30.178016Z 5 [Note] 'CHANGE MASTER TO FOR CHANNEL 'group_replication_recovery' executed'. Previous state master_host='node3', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''. New state master_host='node1', master_port= 3306, master_log_file='', master_log_pos= 4, master_bind=''.
2015-10-27T17:15:30.189233Z 5 [Note] Plugin group_replication reported: 'Establishing connection to a group replication recovery donor a0ef74a1-7cb3-11e5-845e-0800275ff74d at node1 port: 3306.'
2015-10-27T17:15:30.190787Z 8 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2015-10-27T17:15:30.610531Z 8 [Note] Slave I/O thread for channel 'group_replication_recovery': connected to master 'rpl_user@node1:3306',replication started in log 'FIRST' at position 4
get_nodeno(get_site_def()) = 2 task_now() = 1445966131.000425 n = 74 (n - old_n) / (task_now() - old_t) = 11.136156
During this process we can check the status in any node as follows:
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 80cc27ce-7cb4-11e5-9055-0800275ff74d | node2       | 3306        | RECOVERING   |
| group_replication_applier | a0ef74a1-7cb3-11e5-845e-0800275ff74d | node1       | 3306        | ONLINE       |
| group_replication_applier | e5263489-7cb7-11e5-a8ee-0800275ff74d | node3       | 3306        | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)

mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 80cc27ce-7cb4-11e5-9055-0800275ff74d | node2       | 3306        | ONLINE       |
| group_replication_applier | a0ef74a1-7cb3-11e5-845e-0800275ff74d | node1       | 3306        | ONLINE       |
| group_replication_applier | e5263489-7cb7-11e5-a8ee-0800275ff74d | node3       | 3306        | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)
Again, as expected, the node connected to the cluster, requested binary logs from the latest executed GTID position and applied the remaining changes to get back online.
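The donor-retry behaviour visible in the log (attempt 1/86400 against node3 fails with a credentials error, attempt 2 against node1 succeeds) can be sketched roughly like this. This is a hypothetical model, not the plugin’s actual code: the function name and the callback-based connectivity check are my own simplifications.

```python
import random

# Hypothetical sketch (not the plugin's actual code) of the donor-retry
# behaviour seen in the log above: pick a donor at random, and on a
# connection error move on to another candidate, up to a retry limit.
def recover(donors, can_connect, max_attempts=86400):
    """donors: list of member ids; can_connect: member id -> bool."""
    candidates = list(donors)
    random.shuffle(candidates)  # donor selection is currently random
    for attempt, donor in enumerate(candidates[:max_attempts], start=1):
        if can_connect(donor):
            return donor, attempt  # stream binlogs from this donor
    return None, len(candidates)  # no usable donor

# node3 rejects the replication user (as in the log), node1 accepts it:
donor, attempts = recover(["node1", "node3"], lambda d: d == "node1")
print(donor)  # node1
```

The very large attempt limit mirrors the "Attempt 1/86400" counter in the log; the joiner keeps cycling through donors until one works.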
The final test I’ve done so far concerns data consistency. For example, what if I stop group replication on a node and make some data changes? When it rejoins replication, will it send those changes?
Let’s see a very simple example:
Node2:
mysql> select * from sbtest1 where id=15;
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id | k     | c                                                                                                                       | pad                                                         |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| 15 | 12593 | 86109785327-21666363106-71483869798-96435839840-61691707322-34771340621-77881281887-34657680322-46195586522-19795644808 | 74925946566-05246863347-88207617885-76849666822-72200234338 |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
1 row in set (0.12 sec)
Node1:
mysql> select * from sbtest1 where id=15;
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id | k     | c                                                                                                                       | pad                                                         |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| 15 | 12593 | 86109785327-21666363106-71483869798-96435839840-61691707322-34771340621-77881281887-34657680322-46195586522-19795644808 | 74925946566-05246863347-88207617885-76849666822-72200234338 |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> stop group_replication;
Query OK, 0 rows affected (0.03 sec)

mysql> delete from sbtest1 where id=15;
Query OK, 1 row affected (0.02 sec)

mysql> start group_replication;
Query OK, 0 rows affected (0.02 sec)

mysql> select * from sbtest1 where id=15;
Empty set (0.00 sec)
And now Node2 again:
mysql> select * from sbtest1 where id=15;
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| id | k     | c                                                                                                                       | pad                                                         |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
| 15 | 12593 | 86109785327-21666363106-71483869798-96435839840-61691707322-34771340621-77881281887-34657680322-46195586522-19795644808 | 74925946566-05246863347-88207617885-76849666822-72200234338 |
+----+-------+-------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------+
1 row in set (0.00 sec)
Hmmmm, not cool. What if I try to remove the row from Node2?
mysql> delete from sbtest1 where id=15;
Query OK, 1 row affected (0.01 sec)

mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 80cc27ce-7cb4-11e5-9055-0800275ff74d | node2       | 3306        | ONLINE       |
| group_replication_applier | a0ef74a1-7cb3-11e5-845e-0800275ff74d | node1       | 3306        | ONLINE       |
| group_replication_applier | e5263489-7cb7-11e5-a8ee-0800275ff74d | node3       | 3306        | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.00 sec)
Hmmmm, strange: everything seems to be working correctly. Is it? Let’s check node1 again:
mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | 80cc27ce-7cb4-11e5-9055-0800275ff74d | node2       | 3306        | ONLINE       |
| group_replication_applier | a0ef74a1-7cb3-11e5-845e-0800275ff74d | node1       | 3306        | ONLINE       |
| group_replication_applier | e5263489-7cb7-11e5-a8ee-0800275ff74d | node3       | 3306        | ONLINE       |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
3 rows in set (0.01 sec)

mysql> stop group_replication;
Query OK, 0 rows affected (4.01 sec)

mysql> start group_replication;
Query OK, 0 rows affected (2.41 sec)

mysql> SELECT * FROM performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
| group_replication_applier | a0ef74a1-7cb3-11e5-845e-0800275ff74d | node1       | 3306        | OFFLINE      |
+---------------------------+--------------------------------------+-------------+-------------+--------------+
1 row in set (0.00 sec)

mysql> exit
Bye
[root@node1 data]# tail -100 error.log
2015-10-27T17:52:50.075274Z 15 [ERROR] Slave SQL for channel 'group_replication_applier': Could not execute Delete_rows event on table test.sbtest1; Can't find record in 'sbtest1', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log FIRST, end_log_pos 346, Error_code: 1032
2015-10-27T17:52:50.075274Z 15 [Warning] Slave: Can't find record in 'sbtest1' Error_code: 1032
2015-10-27T17:52:50.075274Z 15 [ERROR] Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'FIRST' position 0
2015-10-27T17:52:50.075294Z 2 [Note] Plugin group_replication reported: 'Group Replication applier module successfully initialized!'
2015-10-27T17:52:50.075308Z 2 [Note] Plugin group_replication reported: 'auto_increment_increment is set to 7'
2015-10-27T17:52:50.075312Z 2 [Note] Plugin group_replication reported: 'auto_increment_offset is set to 1'
[XCOM BINDING DEBUG] ::join()
connecting to 192.168.70.2 10300
state 3489 action xa_init
connecting to 192.168.70.2 10300
....

[XCOM BINDING DEBUG] ::join():: I am NOT the boot node.
[XCOM BINDING DEBUG] ::join():: Skipping own address.
[XCOM BINDING DEBUG] ::join():: xcom_client_open_connection to 192.168.70.3:10300
connecting to 192.168.70.3 10300
connected to 192.168.70.3 10300
[XCOM BINDING DEBUG] ::join():: Calling xcom_client_add_node
cli_err 0
state 3489 action xa_snapshot
new state x_recover
state 3505 action xa_complete
new state x_run
[XCOM BINDING DEBUG] ::join():: GCS_OK
[XCOM_BINDING_DEBUG] ::cb_xcom_receive_global_view():: message_id.group= 2226955726 message_id.msgno= 12585 message_id.node= 0
[XCOM_BINDING_DEBUG] ::cb_xcom_receive_global_view():: node set: peer: 0 flag: 1 peer: 1 flag: 1 peer: 2 flag: 1
[XCOM_BINDING_DEBUG] ::xcom_receive_global_view():: Processing new view on Handler
[XCOM_BINDING_DEBUG] ::xcom_receive_global_view():: My node_id is 2
[XCOM_BINDING_DEBUG] ::xcom_receive_global_view():: State Exchange started.
get_nodeno(get_site_def()) = 2 task_now() = 1445968372.450627 n = 0 median_time() = 1.000000 executed_msg={84bca5ce 12585 1}
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12585 message_id.node= 1
[XCOM_BINDING_DEBUG] ::process_possible_control_message():: Received a control message
[XCOM_BINDING_DEBUG] ::process_possible_control_message()::Still waiting for more State Exchange messages
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12585 message_id.node= 2
[XCOM_BINDING_DEBUG] ::process_possible_control_message():: Received a control message
[XCOM_BINDING_DEBUG] ::process_possible_control_message()::Still waiting for more State Exchange messages
[XCOM_BINDING_DEBUG] ::xcom_receive_data_internal():: xcom_receive_data message_id.group= 2226955726 message_id.msgno= 12586 message_id.node= 0
[XCOM_BINDING_DEBUG] ::process_possible_control_message():: Received a control message
[XCOM_BINDING_DEBUG] ::process_possible_control_message()::Install new view
[XCOM_BINDING_DEBUG] ::install_view():: Processing exchanged data
[XCOM_BINDING_DEBUG] ::install_view():: Processing exchanged data
[XCOM_BINDING_DEBUG] ::install_view():: Processing exchanged data
2015-10-27T17:52:52.455340Z 0 [Note] Plugin group_replication reported: 'Starting group replication recovery with view_id 46750407:11'
2015-10-27T17:52:52.456474Z 16 [ERROR] Plugin group_replication reported: 'Can't evaluate the group replication applier execution status. Group replication recovery will shutdown to avoid data corruption.'
2015-10-27T17:52:52.456503Z 16 [ERROR] Plugin group_replication reported: 'Fatal error during the Recovery process of Group Replication. The server will leave the group.'
[XCOM BINDING DEBUG] ::leave()
[XCOM BINDING DEBUG] ::leave():: Skipping own address.
[XCOM BINDING DEBUG] ::leave():: xcom_client_open_connection to 192.168.70.3:10300
connecting to 192.168.70.3 10300
[XCOM_BINDING_DEBUG] ::install_view():: View delivered to client handler= 750756806
connected to 192.168.70.3 10300
[XCOM BINDING DEBUG] ::leave():: Calling xcom_client_remove_node
cli_err 0
handle_remove_node /export/home2/pb2/build/sb_0-16846472-1445524610.82/build/BUILD/mysql-server/plugin/group_replication/gcs/src/bindings/xcom/xcom/xcom_base.c:1987 nodes: 0x3d05fa8 nodes->node_list_len = 1 nodes->node_list_val: 0x3da7da0 node_address n.address: 0x3d238d0 192.168.70.2:10300
getstart group_id 84bca5ce
state 3551 action xa_terminate
new state x_start
state 3489 action xa_exit
Exiting xcom thread
new state x_start
[XCOM BINDING DEBUG] ::leave():: Installing Leave view
[XCOM_BINDING_DEBUG] ::install_view():: No exchanged data
[XCOM_BINDING_DEBUG] ::install_view():: View delivered to client handler= 750756806
[XCOM BINDING DEBUG] ::leave():: Exiting with error=GCS_OK
So it looks like a member with data inconsistencies may erroneously be reported as ONLINE, but as soon as group replication is restarted on that member, recovery fails and it can't rejoin the cluster. It seems there should be better error handling when a data inconsistency is found.
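Until that improves, bringing an inconsistent member back is a manual task. A rough sketch of the procedure I'd follow (the restore step is just one possible approach on my test cluster, not something the plugin prescribes):

```sql
-- On the inconsistent member (node1 in my case): make sure it is
-- fully out of the group before touching its data.
STOP GROUP_REPLICATION;

-- Re-provision the datadir from a consistent member here,
-- e.g. restore a backup taken on node2 or node3 (manual step).

-- Then attempt to rejoin the group.
START GROUP_REPLICATION;

-- And verify the member actually came back ONLINE.
SELECT MEMBER_HOST, MEMBER_STATE
  FROM performance_schema.replication_group_members;
```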
What about the operational perspective?
It looks very limited: just a few variables and status counters, plus a couple of status tables in performance_schema, as follows:
mysql> show global variables like '%group_repli%';
+---------------------------------------------------+---------------------------------------+
| Variable_name                                     | Value                                 |
+---------------------------------------------------+---------------------------------------+
| group_replication_allow_local_lower_version_join  | OFF                                   |
| group_replication_auto_increment_increment        | 7                                     |
| group_replication_bootstrap_group                 | OFF                                   |
| group_replication_components_stop_timeout         | 31536000                              |
| group_replication_gcs_engine                      | xcom                                  |
| group_replication_group_name                      | 8a94f357-aab4-11df-86ab-c80aa9429562  |
| group_replication_local_address                   | 192.168.70.3:10300                    |
| group_replication_peer_addresses                  | 192.168.70.2:10300,192.168.70.3:10300 |
| group_replication_pipeline_type_var               | STANDARD                              |
| group_replication_recovery_complete_at            | TRANSACTIONS_CERTIFIED                |
| group_replication_recovery_password               |                                       |
| group_replication_recovery_reconnect_interval     | 60                                    |
| group_replication_recovery_retry_count            | 86400                                 |
| group_replication_recovery_ssl_ca                 |                                       |
| group_replication_recovery_ssl_capath             |                                       |
| group_replication_recovery_ssl_cert               |                                       |
| group_replication_recovery_ssl_cipher             |                                       |
| group_replication_recovery_ssl_crl                |                                       |
| group_replication_recovery_ssl_crlpath            |                                       |
| group_replication_recovery_ssl_key                |                                       |
| group_replication_recovery_ssl_verify_server_cert | OFF                                   |
| group_replication_recovery_use_ssl                | OFF                                   |
| group_replication_recovery_user                   | rpl_user                              |
| group_replication_start_on_boot                   | ON                                    |
+---------------------------------------------------+---------------------------------------+
24 rows in set (0.00 sec)

mysql> show global status like '%group_repli%';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Com_group_replication_start | 8     |
| Com_group_replication_stop  | 7     |
+-----------------------------+-------+
2 rows in set (0.01 sec)

mysql> show tables from performance_schema like '%group%';
+----------------------------------------+
| Tables_in_performance_schema (%group%) |
+----------------------------------------+
| replication_group_member_stats         |
| replication_group_members              |
+----------------------------------------+
2 rows in set (0.00 sec)
Most of the values above are self-descriptive. I still need to dig into this a bit more to figure out what some of them are for.
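For instance, the two status tables can be explored like any other performance_schema tables. The exact columns of replication_group_member_stats may change between plugin versions, so vertical output is the safest way to look at it:

```sql
-- Per-member certification/transaction counters; column names may
-- vary across plugin versions, hence SELECT * with vertical output.
SELECT * FROM performance_schema.replication_group_member_stats\G

-- Quick health check of the whole group.
SELECT MEMBER_HOST, MEMBER_PORT, MEMBER_STATE
  FROM performance_schema.replication_group_members;
```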
So far the work done on group replication is very impressive. Of course there is still a long road to travel, and it doesn't yet seem fair to compare group replication against Galera, at least not in a side-by-side comparison.
Even though I like the idea of reusing an established component, I don't like having to install and configure Corosync: it's another piece of software that could eventually fail. Fortunately this can be avoided with the newer version of the plugin, which can use the new XCom communication framework. I tested both versions, and XCom is far easier to set up and configure; however, the error log can become very verbose, maybe too verbose in my opinion.
With regard to installation and configuration, it's pretty easy once you find the proper way to do it. Only a few variables need to be set to get a working cluster, and most of the settings work just fine at their defaults (group_replication_auto_increment_increment, for example).
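To give an idea of the scale, here is a minimal, illustrative my.cnf sketch for one of my nodes. The group_replication values mirror the variables shown earlier; the prerequisite replication settings and the plugin-load line are taken from the setup post I followed for the labs build, so treat them as a starting point rather than a definitive reference:

```ini
# Illustrative fragment for one node (node3 in my setup); adjust the
# local address per node. Prerequisites and plugin name are assumptions
# based on the labs build I tested, not an official reference.
[mysqld]
binlog_format                    = ROW
gtid_mode                        = ON
enforce_gtid_consistency         = ON
master_info_repository           = TABLE
relay_log_info_repository        = TABLE
log_slave_updates                = ON
transaction_write_set_extraction = MURMUR32

plugin-load                      = group_replication.so
group_replication_group_name     = "8a94f357-aab4-11df-86ab-c80aa9429562"
group_replication_local_address  = "192.168.70.3:10300"
group_replication_peer_addresses = "192.168.70.2:10300,192.168.70.3:10300"
group_replication_recovery_user  = "rpl_user"
group_replication_start_on_boot  = ON
```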
I would still like to see some automatic handling of data inconsistencies (like SST in Galera), but in my opinion this new feature is a good approach to consider in the future when looking for high availability solutions. A lot of testing remains to be done, and I'd also like to see some benchmarks. These are just my first impressions, and we should wait some time before this feature reaches GA. Paraphrasing that song: "it's a long way to the top if you wanna rock 'n' roll."