MySQL 8.0.22: Asynchronous Replication Automatic Connection (IO Thread) Failover

MySQL 8.0.22 Asynchronous Replication Automatic ConnectionMySQL 8.0.22 was released on Oct 19, 2020, and came with nice features and a lot of bug fixes. Now, you can configure your async replica to choose the new source in case the existing source connection (IO thread) fails. In this blog, I am going to explain the entire process involved in this configuration with a use case.

Overview

This feature is very helpful to keep your replica server in sync in case of current source fails. 

To activate asynchronous connection failover, we need to set the “SOURCE_CONNECTION_AUTO_FAILOVER=1” on the “CHANGE MASTER” statement.

Once the IO connection fails, it will try to connect the existing source based on the “MASTER_RETRY_COUNT, MASTER_CONNECT_RETRY”. Then only it will do the failover. 

The feature will only work when the IO connection is failed, maybe the source crashed or stopped, or any network failures. This will not work if the replica is manually stopped using the “STOP REPLICA”. 

We have two new functions, which will help to add and delete the server entries from the source list. 

  • asynchronous_connection_failover_add_source → Arguments (‘channel’,’host’,port,’network_namespace’,weight)
  • asynchronous_connection_failover_delete_source — Arguments (‘channel’,’host’,port,’network_namespace)

The source servers need to be configured in the table “mysql.replication_asynchronous_connection_failover”. We can also use the table “performance_schema.replication_asynchronous_connection_failover” to view the available servers in source list.

Requirements

  • GTID should be enabled on all the servers.
  • Regarding auto-positioning purpose, MASTER_AUTO_POSITION should be enabled on the replica ( CHANGE MASTER ).
  • The user and password should be the same on all the source servers.
  • Replication user and password must be set for the channel using the CHANGE MASTER .. FOR CHANNEL statement. 

Use Case

I have two data centers and three servers (dc1, dc2, report).

  • “dc1” and “report” servers are in the same data center.
  • “dc2” is in a different data center.
  • “dc1” and “dc2” are in active-passive async replication setup (dc1 – active, dc2 – passive)
  • “report” is configured as an async replica under “dc1” for reporting purposes.

Here, my requirement is, if the active node “dc1” is failed, I need to configure the “report” server under “dc2” to get the live data without manual work after a failure happens.

Configuration for Automatic Connection Failover

I have installed MySQL 8.0.22 on all three servers and configured the active – passive replication between “dc1” and “dc2”.

At dc1,

At dc2,

Now, I need to configure the “report” server as an async replica under “dc1” with automatic failover options.

At report,

 

source_connection_auto_failover : To activate the automatic failover feature.

master_retry_count, master_connect_retry : The default setting is huge ( master_retry_count = 86400, master_connect_retry = 60 ), with that we need to wait 60 days ( 86400 * 60 /60/60/24 ) for the failover. So, i reduced the settings to 30 seconds ( 10 *3 )

You can see the replication is started and the failover settings are applied. The current primary source is “dc1”. Now, I am going to use the function to add the server details into the source list for the failover to dc2.

At “report”,

It shows the source list is updated with dc2 details. We are good to perform the failover now. 

I am going to shut down the MySQL service on dc1. 

At dc1, 

At report server,

The IO thread is in “connecting” state. This means it is trying to establish the connection to the existing source (dc1) based on the “master_retry_count” and “master_connect_retry” settings. 

 After 30 seconds,