Schema changes are the big challenges in Galera replication. So, it is recommended to understand the schema changes operation for everyone who uses the Percona XtraDB Cluster (PXB)/Galera clusters. In this blog, I am going to explain the operation and impact of the various schema changes methods used in the PXB/Galera cluster.
- Schema changes with “wsrep_OSU_method = TOI”
- Schema changes with “wsrep_OSU_method = RSU”
- Schema changes with “ONLINE ALGORITHMS”
- Schema changes with “pt-osc”
- Schema changes with “gh-ost”
For testing:
- I have configured the 3-node Percona Xtradb Cluster (8.0.19).
- Executing read/write load using the sysbench.
1 2 3 4 5 6 7 8 9 10 |
mysql> select @@wsrep_cluster_address\G *************************** 1. row *************************** @@wsrep_cluster_address: gcomm://pxc81,pxc82,pxc83 1 row in set (0.00 sec) mysql> select @@version, @@version_comment\G *************************** 1. row *************************** @@version: 8.0.19-10 @@version_comment: Percona XtraDB Cluster (GPL), Release rel10, Revision 727f180, WSREP version 26.4.3 1 row in set (0.00 sec) |
What is the Impact of Schema Changes in Clusters?
- By default (TOI), all the nodes in the cluster will be pause during the ALTER process. Because the ALTER needs to be replicated on all the nodes. If the ALTER is big it will affect the performance and could be the cause of the downtime.
- Rollback is not possible on schema upgrade.
- You can’t kill the ALTER query immediately during the operation. So, your application may need to wait until the ALTER completion.
1 2 3 4 5 6 7 8 9 |
mysql> pager grep alter PAGER set to 'grep alter' mysql> show processlist; | 19 | root | localhost | schema_changes | Query | 18 | altering table | alter table sbtest1 add index idx_c(c) | 0 | 0 | 7 rows in set (0.00 sec) mysql> kill 19; ERROR 1095 (HY000): You are not owner of thread 19 |
- MDLs are set only on one node. Not across all the nodes in the cluster. So, you need additional control over this.
Schema Changes with “wsrep_OSU_method = TOI”
TOI: Total Order Isolation
- TOI is the default method ( wsrep_OSU_method = TOI ) for schema changes.
- DDL statements are processed in the same order with regard to other transactions in each node.
- The full cluster will be blocked/locked during the DDL operation.
- This guarantees data consistency.
1 2 3 4 |
mysql> select @@wsrep_OSU_method\G *************************** 1. row *************************** @@wsrep_OSU_method: TOI 1 row in set (0.00 sec) |
Example:
I am going to run the below ALTER on “pxc81”.
1 |
alter table sbtest1 add index idx_c(c) |
After initiating the ALTER on pxc81, My processlist states the COMMIT and UPDATE ( from sysbench ) statements are paused. Only ALTER is in progress. The COMMIT and UPDATE will be resumed once the ALTER is completed.
1 2 3 |
| 17 | root | localhost | schema_changes | Execute | 15 | closing tables | COMMIT | 0 | 0 | | 17 | root | localhost | schema_changes | Execute | 15 | updating | UPDATE sbtest1 SET c='91668836759-30934071579-18064439108-53267873872-79461377960-32104006456-143369 | 0 | 1 | | 24 | root | localhost | schema_changes | Query | 15 | altering table | alter table sbtest1 add index idx_c(c) | 0 | 0 | |
But, still, the SELECT statement can be run with “wsrep_sync_wait != 1” because “wsrep_sync_wait = 1” needs the casualty checks from other nodes. So, it will fail.
SELECT with “wsrep_sync_wait=1”
1 2 3 4 5 |
| 1 | system user | | schema_changes | Query | 0 | altering table | alter table sbtest1 add index idx_c(c) | 0 | 0 | | 15 | root | localhost | schema_changes | Query | 40 | starting | select * from sbtest1 where id=1 | 0 | 0 | mysql> select * from sbtest1 where id=1; ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction |
TOI can be the right choice for quick operations.
- CREATE STATEMENTS
- RENAME INDEX
- RENAME TABLE
- DROP INDEX
- ALGORITHM=INSTANT
Schema Changes with “wsrep_OSU_method = RSU”
RSU – Rolling Schema Upgrade
- In this method, DDL statements will not replicate across the cluster nodes. Need to execute the DDL individually on all nodes.
- The node which is executing the DDL will desync from the cluster group. The other nodes in the cluster are still operational and receive the application connections.
- Once the node executes the DDL, it will start to apply the missing writesets.
- In this method, the important thing is the WRITEs should not be performed on that particular table until the schema upgrade completes on all the nodes. Users should be very clear on this because the failure will break the cluster and the data may be unrecoverable.
- Gcache should be good enough to store the writesets.
Example:
At pxc82, I am going to execute the ALTER.
Session 1: (setting up the value to RSU – session-level)
1 2 |
mysql> set wsrep_OSU_method=RSU; Query OK, 0 rows affected (0.09 sec) |
Session 2: (checking the node status)
1 2 3 4 5 6 7 8 |
mysql> show global status where Variable_name like 'wsrep_local_recv_queue' or Variable_name like 'wsrep_local_state_comment'; +---------------------------+--------+ | Variable_name | Value | +---------------------------+--------+ | wsrep_local_recv_queue | 0 | | wsrep_local_state_comment | Synced | +---------------------------+--------+ 2 rows in set (0.00 sec) |
Session 1: (executing the ALTER )