Percona XtraDB Cluster 8.0.25 (PXC) has introduced a new option to perform online schema modifications: NBO (Non-Blocking Operation).
When using PXC, the cluster relies on the wsrep_OSU_method parameter to define the Online Schema Upgrade (OSU) method the node uses to replicate DDL statements.
Until now, we normally have three options:
Each method has some positive and negative aspects. TOI will lock the whole cluster from being able to accept data modifications for the entire time it takes to perform the DDL operation. RSU will misalign the schema definition between the nodes, and in any case, the node performing the DDL operation is still locked. Finally, TOI+PTOSC will rely on creating triggers and copying data, so in some cases, this can be very impactful.
The new Non-Blocking Operation (NBO) method is to help to reduce the impact on the cluster and make it easier to perform some DDL operations.
At the moment we only support a limited set of operations with NBO like:
Any other command will result in an error message ER_NOT_SUPPORTED_YET.
But let us see how it works and what the impact is while we will also compare it with the default method TOI.
What we will do is work with four connections:
1 – to perform ddl
2 – to perform insert data in the table being altered
3 – to perform insert data on a different table
4-5 – checking the other two nodes operations
PXC must be at least Version 8.0.25-15.1.
The table we will modify is :
|
1 |
DC1-1(root@localhost) [windmills_s]>show create table windmills_testG<br>*************************** 1. row ***************************<br> Table: windmills_test<br>Create Table: CREATE TABLE `windmills_test` (<br> `id` bigint NOT NULL AUTO_INCREMENT,<br> `uuid` char(36) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,<br> `millid` smallint NOT NULL,<br> `kwatts_s` int NOT NULL,<br> `date` date NOT NULL,<br> `location` varchar(50) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,<br> `active` tinyint NOT NULL DEFAULT '1',<br> `time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,<br> `strrecordtype` char(3) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL,<br> PRIMARY KEY (`id`),<br> KEY `IDX_millid` (`millid`,`active`),<br> KEY `IDX_active` (`id`,`active`),<br> KEY `kuuid_x` (`uuid`),<br> KEY `millid_x` (`millid`),<br> KEY `active_x` (`active`)<br>) ENGINE=InnoDB AUTO_INCREMENT=8199260 DEFAULT CHARSET=utf8mb3 COLLATE=utf8_bin ROW_FORMAT=DYNAMIC<br>1 row in set (0.00 sec)<br> |
And contains ~five million rows.
|
1 |
DC1-1(root@localhost) [windmills_s]>select count(*) from windmills_test;<br>+----------+<br>| count(*) |<br>+----------+<br>| 5002909 |<br>+----------+<br>1 row in set (0.44 sec)<br> |
Connection 1:
|
1 |
ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE;<br> ALTER TABLE windmills_test drop INDEX idx_1, ALGORITHM=INPLACE;<br> |
Connection 2:
|
1 |
while [ 1 = 1 ];do da=$(date +'%s.%3N');/opt/mysql_templates/PXC8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_s -e "insert into windmills_test select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype from windmills7 limit 1;" -e "select count(*) from windmills_s.windmills_test;" > /dev/null;db=$(date +'%s.%3N'); echo "$(echo "($db - $da)"|bc)";sleep 1;done |
Connection 3:
|
1 |
while [ 1 = 1 ];do da=$(date +'%s.%3N');/opt/mysql_templates/PXC8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_s -e "insert into windmills8 select null,uuid,millid,kwatts_s,date,location,active,time,strrecordtype from windmills7 limit 1;" -e "select count(*) from windmills_s.windmills_test;" > /dev/null;db=$(date +'%s.%3N'); echo "$(echo "($db - $da)"|bc)";sleep 1;done |
Connections 4-5:
|
1 |
while [ 1 = 1 ];do echo "$(date +'%T.%3N')";/opt/mysql_templates/PXC8P/bin/mysql --defaults-file=./my.cnf -uroot -D windmills_s -e "show full processlist;"|egrep -i -e "(windmills_test|windmills_s)"|grep -i -v localhost;sleep 1;done |
|
1 |
DC1-1(root@localhost) [windmills_s]>SET SESSION wsrep_OSU_method=TOI; |
|
1 |
DC1-1(root@localhost) [windmills_s]>SET SESSION wsrep_OSU_method=NBO; |
|
1 |
DC1-1(root@localhost) [windmills_s]>ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE, LOCK=shared; |
|
1 |
DC1-1(root@localhost) [windmills_s]>ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE;<br>Query OK, 0 rows affected (1 min 4.74 sec)<br>Records: 0 Duplicates: 0 Warnings: 0<br> |
Inserts in the altering table (connection 2):
|
1 |
.450<br>.492<br>64.993 <--- Alter blocks all inserts on the table we are altering<br>.788<br>.609<br> |
Inserts on the other table (connection 3):
|
1 |
.455<br>.461<br>64.161 <--- Alter blocks all inserts on all the other tables as well<br>.641<br>.483<br> |
On the other nodes at the same time of the ALTER we can see:
|
1 |
Id User db Command Time State Info Time_ms Rows_sent Rows_examined <br>15 system user windmills_s Query 102 altering table ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE 102238 0 0 <--- time from start <br> |
In short, we have the whole cluster locked for ~64 seconds. During this period of time, all the operations to modify data or structure were on hold.
Inserts in the altering table:
|
1 |
.437<br>.487<br>120.758 <---- Execution time increase<br>.617<br>.510<br> |
Inserts on the other table:
|
1 |
.468<br>.485<br>25.061 <---- still a metalock, but not locking the other tables for the whole duration <br>.494<br>.471<br> |
On the other nodes at the same time of the ALTER we can see:
|
1 |
Id User db Command Time State Info Time_ms Rows_sent Rows_examined <br>110068 system user windmills_s Connect 86 altering table ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE 120420 0 0<br> |
In this case, what is also interesting to note is that:
|
1 |
110174 pmm 127.0.0.1:42728 NULL Query 2 Waiting for table metadata lock SELECT x FROM information_schema.tables WHERE TABLE_SCHEMA = 'windmills_s' 1486 10 0 |
|
1 |
110068 system user connecting host windmills_s Connect 111 closing tables ALTER TABLE windmills_test ADD INDEX idx_1 (`uuid`,`active`), ALGORITHM=INPLACE 111000 0 0 |
Summarizing:
|
1 |
TOI NBO<br>Time on hold for insert for altering table ~64 sec ~120 sec<br>Time on hold for insert for another table ~64 sec ~25 sec <br>metalock whole time only at the end<br> |
Let’s see at a very high level how the two work:
In short, the cluster server behavior changes significantly when using NBO, offering significant flexibility compared to TOI. The cost in time should not linearly increase with the dimension of the table, but more in relation to the single node efficiency in performing the ALTER operation.
NBO can be significantly helpful to reduce the impact of DDL on the cluster, for now, limited to the widely used creation/modification/drop of an index. But in the future … we may expand it.
The feature is still a technology preview, so do not trust it in production, but test it and let us know what you think.
Final comment: another distribution has introduced NBO, but only if you buy the enterprise version.
Percona, which is truly open source with facts not just words, has implemented NBO in standard PXC, and the code is fully open source. This is not the first one, but just another of the many features Percona is offering for free while others ask you to buy the enterprise version.
Enjoy the product and let us have your feedback! Great MySQL to all!