When working with Group Replication, MySQL Router would be the obvious choice for the connection layer: it is tightly coupled with the rest of the technologies, being part of the InnoDB Cluster stack.
The problem is that, except for simple workloads, MySQL Router's performance is still not on par with other proxies such as HAProxy or even ProxySQL. Fortunately, HAProxy and Group Replication can work together as well. Let's see how this can be accomplished.
For our testing, let’s deploy an InnoDB ClusterSet in single-writer mode.
There are three nodes on the primary site (mysql1-t1, mysql2-t1, mysql3-t1) and three nodes on the DR site (mysql1-t2, mysql2-t2, mysql3-t2).
In this scenario, Group Replication is used locally between the nodes on each site, and the two sites are linked together via asynchronous replication. Local or regional failover can be controlled from MySQL Shell.
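As a sketch of what a controlled switchover looks like from MySQL Shell (the cluster name `dr` is illustrative; use whatever names you chose when creating the ClusterSet):

```js
// mysqlsh root@mysql1-t1
var cs = dba.getClusterSet();
cs.status({extended: 1});      // inspect the state of both clusters
cs.setPrimaryCluster('dr');    // regional switchover to the DR site
```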
Similarly to using HAProxy as the connection layer for Percona XtraDB Cluster, health checks are used to detect which members of the cluster should receive reads or writes. This is usually accomplished through an xinetd service that runs the actual health check script. Let's discuss all the components in more detail.
We need to create a database user with enough permissions to run the queries in the health check script. Let's create the user on the current primary:
```sql
CREATE USER clustercheck@localhost IDENTIFIED BY 'Clust3rCh3ck@' WITH MAX_USER_CONNECTIONS 10;
GRANT PROCESS ON *.* TO 'clustercheck'@'localhost';
GRANT SELECT ON sys.* TO 'clustercheck'@'localhost';
```
The queries in our sample script require a few special objects to be created in the SYS schema:
```sql
USE sys;

DELIMITER $$

CREATE FUNCTION my_id() RETURNS TEXT(36) DETERMINISTIC NO SQL
RETURN (SELECT @@global.server_uuid as my_id)$$

CREATE FUNCTION IFZERO(a INT, b INT)
RETURNS INT
DETERMINISTIC
RETURN IF(a = 0, b, a)$$

CREATE FUNCTION LOCATE2(needle TEXT(10000), haystack TEXT(10000), offset INT)
RETURNS INT
DETERMINISTIC
RETURN IFZERO(LOCATE(needle, haystack, offset), LENGTH(haystack) + 1)$$

CREATE FUNCTION GTID_NORMALIZE(g TEXT(10000))
RETURNS TEXT(10000)
DETERMINISTIC
RETURN GTID_SUBTRACT(g, '')$$

CREATE FUNCTION GTID_COUNT(gtid_set TEXT(10000))
RETURNS INT
DETERMINISTIC
BEGIN
  DECLARE result BIGINT DEFAULT 0;
  DECLARE colon_pos INT;
  DECLARE next_dash_pos INT;
  DECLARE next_colon_pos INT;
  DECLARE next_comma_pos INT;
  SET gtid_set = GTID_NORMALIZE(gtid_set);
  SET colon_pos = LOCATE2(':', gtid_set, 1);
  WHILE colon_pos != LENGTH(gtid_set) + 1 DO
    SET next_dash_pos = LOCATE2('-', gtid_set, colon_pos + 1);
    SET next_colon_pos = LOCATE2(':', gtid_set, colon_pos + 1);
    SET next_comma_pos = LOCATE2(',', gtid_set, colon_pos + 1);
    IF next_dash_pos < next_colon_pos AND next_dash_pos < next_comma_pos THEN
      SET result = result +
        SUBSTR(gtid_set, next_dash_pos + 1,
               LEAST(next_colon_pos, next_comma_pos) - (next_dash_pos + 1)) -
        SUBSTR(gtid_set, colon_pos + 1, next_dash_pos - (colon_pos + 1)) + 1;
    ELSE
      SET result = result + 1;
    END IF;
    SET colon_pos = next_colon_pos;
  END WHILE;
  RETURN result;
END$$

CREATE FUNCTION gr_applier_queue_length()
RETURNS INT
DETERMINISTIC
BEGIN
  RETURN (SELECT sys.gtid_count( GTID_SUBTRACT(
    (SELECT Received_transaction_set
       FROM performance_schema.replication_connection_status
      WHERE Channel_name = 'group_replication_applier'),
    (SELECT @@global.GTID_EXECUTED) )));
END$$

CREATE FUNCTION gr_member_in_primary_partition()
RETURNS VARCHAR(3)
DETERMINISTIC
BEGIN
  RETURN (SELECT IF( MEMBER_STATE='ONLINE' AND
    ((SELECT COUNT(*) FROM performance_schema.replication_group_members
       WHERE MEMBER_STATE NOT IN ('ONLINE', 'RECOVERING')) >=
     ((SELECT COUNT(*) FROM performance_schema.replication_group_members)/2) = 0),
    'YES', 'NO')
    FROM performance_schema.replication_group_members
    JOIN performance_schema.replication_group_member_stats USING(member_id)
    WHERE member_id=my_id());
END$$

CREATE VIEW gr_member_routing_candidate_status AS
SELECT sys.gr_member_in_primary_partition() as viable_candidate,
       IF( (SELECT (SELECT GROUP_CONCAT(variable_value)
              FROM performance_schema.global_variables
             WHERE variable_name IN ('read_only', 'super_read_only')) != 'OFF,OFF'),
           'YES', 'NO') as read_only,
       Count_Transactions_Remote_In_Applier_Queue as transactions_behind,
       Count_Transactions_in_queue as transactions_to_cert
  FROM performance_schema.replication_group_member_stats
 WHERE member_id=my_id()$$

DELIMITER ;
```
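With these objects in place, each node can report its own routing status. For example (the output below is illustrative; the actual values depend on the member's state):

```sql
mysql> SELECT * FROM sys.gr_member_routing_candidate_status;
+------------------+-----------+---------------------+----------------------+
| viable_candidate | read_only | transactions_behind | transactions_to_cert |
+------------------+-----------+---------------------+----------------------+
| YES              | YES       |                   0 |                    0 |
+------------------+-----------+---------------------+----------------------+
```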
Here is an example of a simple checker script that is good enough for testing purposes. For production use, you might want to develop something more robust.
```shell
tee /usr/local/bin/mysql_gr_routing_check.sh <<'EOF'
#!/bin/bash
#
# Frederic -lefred- Descamps
# Based on the original script from Unai Rodriguez and later work
# by Olaf van Zandwijk and Raghavendra Prabhu
#
# version 0.1 - first release
# version 0.2 - add read & write check + queue length check
#
# Usage: mysql_gr_routing_check.sh [maxqueue] [READ|WRITE]

# This password method is insecure and should not be used in a
# production environment!
MYSQL_USERNAME="clustercheck"
MYSQL_PASSWORD="Clust3rCh3ck@"
MYSQL_HOST=localhost
MYSQL_PORT=3306

MAXQUEUE=${1-100}
ROLE=${2-WRITE}

if ! [ "$MAXQUEUE" -eq "$MAXQUEUE" ] 2>/dev/null
then
    # maxqueue argument is not numeric => return HTTP 503
    # Shell return-code is 1
    echo -en "HTTP/1.1 503 Service Unavailable\r\n"
    echo -en "Content-Type: text/plain\r\n"
    echo -en "Connection: close\r\n"
    echo -en "Content-Length: 44\r\n"
    echo -en "\r\n"
    echo -en "Group Replication member is not a viable routing candidate:\r\n"
    echo -en "maxqueue argument is not a valid value: ($MAXQUEUE).\r\n"
    exit 1
fi

echo $(mysql --no-defaults -BN --connect-timeout=10 --host=$MYSQL_HOST \
    --port=$MYSQL_PORT --user="$MYSQL_USERNAME" --password="$MYSQL_PASSWORD" \
    -e 'SELECT * FROM sys.gr_member_routing_candidate_status' 2>/dev/null) | \
while read candidate readonly queue tx_to_cert
do
    if [ "$candidate" == "YES" ]
    then
        if [ "${ROLE^^}" == "READ" ]
        then
            if [ $queue -lt $MAXQUEUE ]
            then
                # Member is a viable routing candidate => return HTTP 200
                # Shell return-code is 0
                echo -en "HTTP/1.1 200 OK\r\n"
                echo -en "Content-Type: text/plain\r\n"
                echo -en "Connection: close\r\n"
                echo -en "Content-Length: 40\r\n"
                echo -en "\r\n"
                echo -en "Group Replication member is a viable routing candidate for $ROLE.\r\n"
                exit 0
            else
                # Member is not a viable routing candidate => return HTTP 503
                # Shell return-code is 1
                echo -en "HTTP/1.1 503 Service Unavailable\r\n"
                echo -en "Content-Type: text/plain\r\n"
                echo -en "Connection: close\r\n"
                echo -en "Content-Length: 44\r\n"
                echo -en "\r\n"
                echo -en "Group Replication member is not a viable routing candidate:\r\n"
                echo -en "queue exceeds ($queue) threshold ($MAXQUEUE).\r\n"
                exit 1
            fi
        elif [ "${ROLE^^}" == "WRITE" ]
        then
            if [ "$readonly" == "YES" ]
            then
                # Member is not a viable routing candidate => return HTTP 503
                # Shell return-code is 1
                echo -en "HTTP/1.1 503 Service Unavailable\r\n"
                echo -en "Content-Type: text/plain\r\n"
                echo -en "Connection: close\r\n"
                echo -en "Content-Length: 44\r\n"
                echo -en "\r\n"
                echo -en "Group Replication member is not a viable routing candidate:\r\n"
                echo -en "$ROLE cannot be routed to a readonly member.\r\n"
                exit 1
            else
                if [ $queue -lt $MAXQUEUE ]
                then
                    # Member is a viable routing candidate => return HTTP 200
                    # Shell return-code is 0
                    echo -en "HTTP/1.1 200 OK\r\n"
                    echo -en "Content-Type: text/plain\r\n"
                    echo -en "Connection: close\r\n"
                    echo -en "Content-Length: 40\r\n"
                    echo -en "\r\n"
                    echo -en "Group Replication member is a viable routing candidate for $ROLE.\r\n"
                    exit 0
                else
                    # Member is not a viable routing candidate => return HTTP 503
                    # Shell return-code is 1
                    echo -en "HTTP/1.1 503 Service Unavailable\r\n"
                    echo -en "Content-Type: text/plain\r\n"
                    echo -en "Connection: close\r\n"
                    echo -en "Content-Length: 44\r\n"
                    echo -en "\r\n"
                    echo -en "Group Replication member is not a viable routing candidate:\r\n"
                    echo -en "queue exceeds ($queue) threshold ($MAXQUEUE).\r\n"
                    exit 1
                fi
            fi
        else
            # ROLE is not READ or WRITE => return HTTP 503
            # Shell return-code is 1
            echo -en "HTTP/1.1 503 Service Unavailable\r\n"
            echo -en "Content-Type: text/plain\r\n"
            echo -en "Connection: close\r\n"
            echo -en "Content-Length: 44\r\n"
            echo -en "\r\n"
            echo -en "Group Replication member is not a viable routing candidate:\r\n"
            echo -en "$ROLE is not a valid argument.\r\n"
            exit 1
        fi
    fi
done
EOF
```
Don’t forget to give the script execute permissions:

```shell
chmod +x /usr/local/bin/mysql_gr_routing_check.sh
```
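One detail worth fixing for production: the sample script hardcodes `Content-Length: 40`/`44`, which does not match the length of the body it actually sends. HAProxy's `httpchk` only looks at the status line, so the check still works, but stricter HTTP clients may truncate the message. A more robust script would compute the length, for example with a hypothetical helper like this (not part of the original script):

```shell
#!/bin/bash
# Hypothetical helper: emit a well-formed HTTP response where
# Content-Length matches the body that is actually sent.
http_response() {
    local status="$1"   # e.g. "200 OK" or "503 Service Unavailable"
    local body="$2"
    printf 'HTTP/1.1 %s\r\n' "$status"
    printf 'Content-Type: text/plain\r\n'
    printf 'Connection: close\r\n'
    printf 'Content-Length: %d\r\n' "${#body}"
    printf '\r\n'
    printf '%s' "$body"
}

http_response "200 OK" "Group Replication member is a viable routing candidate for WRITE."
```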
We need to deploy a custom xinetd service to expose the state of MySQL on each node. We deploy one service using port 6446 to check for the write availability of a node and a service on port 6447 to check if the node is available for reads.
First, install the xinetd package, e.g.,
```shell
yum -y install xinetd
```
Now we prepare the definitions of the xinetd services:
```shell
tee /etc/xinetd.d/mysql_gr_routing_check_write <<EOF
# default: on
# description: check to see if the node is a viable routing candidate
service mysql_gr_routing_check_write
{
    disable         = no
    flags           = REUSE
    socket_type     = stream
    port            = 6446
    wait            = no
    user            = mysql
    server          = /usr/local/bin/mysql_gr_routing_check.sh
    server_args     = 100 write
    log_on_failure  += USERID
    per_source      = UNLIMITED
}
EOF
```
And:
```shell
tee /etc/xinetd.d/mysql_gr_routing_check_read <<EOF
# default: on
# description: check to see if the node is a viable routing candidate
service mysql_gr_routing_check_read
{
    disable         = no
    flags           = REUSE
    socket_type     = stream
    port            = 6447
    wait            = no
    user            = mysql
    server          = /usr/local/bin/mysql_gr_routing_check.sh
    server_args     = 100 read
    log_on_failure  += USERID
    per_source      = UNLIMITED
}
EOF
```
Start the service:
```shell
service xinetd start
```
We can verify our service is working by using telnet (or curl) from a remote host. For example, to check if a node is available as a writer, we query the service on port 6446:
```
# telnet mysql1-t2 6446
Trying 10.11.100.133...
Connected to mysql1-t2.
Escape character is '^]'.
HTTP/1.1 503 Service Unavailable
Content-Type: text/plain
Connection: close
Content-Length: 44

Group Replication member is not a viable routing candidate:
write cannot be routed to a readonly member.
Connection closed by foreign host.
```
We can also verify if a node can be a reader using port 6447:
```
# telnet mysql1-t2 6447
Trying 10.11.100.133...
Connected to mysql1-t2.
Escape character is '^]'.
HTTP/1.1 200 OK
Content-Type: text/plain
Connection: close
Content-Length: 40

Group Replication member is a viable routing candidate for read.
Connection closed by foreign host.
```
HAProxy needs to be configured with two dedicated endpoints, for reads and writes respectively. The HTTP checks we define will query our custom xinetd services to determine each node's read/write availability.
Here's an example of the HAProxy configuration:
```
frontend mysql-gr-front_write
    bind *:3307
    mode tcp
    default_backend mysql-gr-back_write

backend mysql-gr-back_write
    mode tcp
    balance leastconn
    option httpchk
    server mysql1-t1 mysql1-t1:3306 check port 6446 inter 1000 rise 1 fall 2 on-marked-up shutdown-backup-sessions
    server mysql2-t1 mysql2-t1:3306 check port 6446 inter 1000 rise 1 fall 2 backup
    server mysql3-t1 mysql3-t1:3306 check port 6446 inter 1000 rise 1 fall 2 backup
    server mysql1-t2 mysql1-t2:3306 check port 6446 inter 1000 rise 1 fall 2 backup
    server mysql2-t2 mysql2-t2:3306 check port 6446 inter 1000 rise 1 fall 2 backup
    server mysql3-t2 mysql3-t2:3306 check port 6446 inter 1000 rise 1 fall 2 backup

frontend mysql-gr-front_read
    bind *:3308
    mode tcp
    default_backend mysql-gr-back_read

backend mysql-gr-back_read
    mode tcp
    balance leastconn
    option httpchk
    server mysql1-t1 mysql1-t1:3306 check port 6447 inter 1000 rise 1 fall 2
    server mysql2-t1 mysql2-t1:3306 check port 6447 inter 1000 rise 1 fall 2
    server mysql3-t1 mysql3-t1:3306 check port 6447 inter 1000 rise 1 fall 2
    server mysql1-t2 mysql1-t2:3306 check port 6447 inter 1000 rise 1 fall 2
    server mysql2-t2 mysql2-t2:3306 check port 6447 inter 1000 rise 1 fall 2
    server mysql3-t2 mysql3-t2:3306 check port 6447 inter 1000 rise 1 fall 2

frontend stats
    mode http
    bind *:443
    stats enable
    stats uri /stats
    stats refresh 10s
    stats admin if LOCALHOST
```
Here’s an example of how the application would connect to the database through HAProxy to write:
```shell
mysql -h haproxy1 -P3307 -u testuser -p'testpwd'
```
For read-only connections, it would use the alternative port as configured above:
```shell
mysql -h haproxy1 -P3308 -u testuser -p'testpwd'
```
Using this architecture, both local and regional failovers can be triggered through MySQL Shell, and HAProxy will automatically adjust the writer/reader nodes without human intervention.
We can also take individual nodes out of the pool by simply stopping the MySQL service or pausing Group Replication.
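For instance, stopping Group Replication on a member makes both health checks stop returning 200 (the member is no longer ONLINE), so HAProxy will stop routing traffic to it:

```sql
-- On the node to drain:
STOP GROUP_REPLICATION;
-- ... perform maintenance ...
START GROUP_REPLICATION;
```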
Hopefully, in the future, MySQL Router will offer performance comparable to other routing solutions so that we can take advantage of its integration with the rest of the InnoDB Cluster components.