First I want to thank everyone who attended my May 21, 2020 webinar “How Safe is Asynchronous Master-Master Setup in MySQL?“. Recording and slides are available on the webinar page.
Here are answers to the questions from participants which I was not able to provide during the webinar.
Q: What do you generally think of hosting Relational Databases on VM’s as opposed to Bare metals?
A: With modern hardware and modern virtual machines this is absolutely possible. I know about many successful high loaded applications that run MySQL on VMs.
Just note that running a few VMs on a single physical machine may lead to resource loss rather than saving. For example, MySQL uses the InnoDB buffer pool to optimize operations on InnoDB. This option is essential for performance and should be big enough. Ideally to store whole active working set in memory. Still, MySQL uses memory for other operations. By installing a few MySQL instances on the same physical machine you will have a smaller amount of data in those individual buffer pools than if you’d run a single server. The same reasoning applies to CPU cores which are used to serve active threads.
Q: The ORACLE RDBMS currently has an ACTIVE/ACTIVE synchronous tool, called the Active Data Guard, to ensure deduplication amongst other functions. Are there any such tools in production, for MySQL?
A: Active Data Guard is a physical replication solution which uses redo log to transfer data:
WHITE PAPER/ Oracle (Active) Data Guard 19cHOW DATA GUARD SYNCHRONIZES STANDBYDATABASE(S)A Data Guard configuration includes a production database referred to as the primary database, and up to 30 directly connected replicas referred to as standby databases. Primary and standby databases connect over TCP/IP using Oracle Net Services. There are no restrictions on where the databases are physically located provided they can communicate with each other. A standby database is created from a backup of the primary database without requiring any downtime of the Production application or database. Once a standby database has been created and configured, Data Guard automatically synchronizes the primary database and the standby database by transmitting the primary database redo -the change vector information used by every Oracle Database to protect transactions –as it is generated at the Primary database and applying it to the standby database
Open Source versions of MySQL do not support physical replication. Rather they support only logical replication when updates, stored in the special update log files, called binary logs, are replayed to the slave server.
Physical replication is by nature synchronous and will simply stop working if data mismatches. With logical replication, data mismatch does not immediately lead to work interruption.
However, if a data mismatch is undesirable, there are few things that you can do.
If synchronous replication cannot be used, consider the following safety measures for the built-in, asynchronous, replication:
- Start the slave server with the option [super-]read-only (available in all supported versions of Percona Server)
- Use row-based replication to avoid data mismatch in case of not safe statements
- Use GTIDs to prevent applying the same transaction twice
- Periodically run pt-table-checksum to check if data mismatch happened
- Fix errors with the help of pt-table-sync
Note that physical replication does not support active-active setup. Active Data Guard redirects DML statements to the master node to imitate active-active behavior. For MySQL, when you do not have to write directly to two nodes, you can use ProxySQL for the same purpose.
Q: Also there is no equivalent to active data guard in MySQL. Oracle has MySQL utilities called mysqlfailover and mysqlreplicate that allow for automatic failover and fast promotion of former master to slave
A: mysqlfailover and mysqlreplicate are part of MySQL Utilities package which is currently deprecated. For failover capabilities, we currently recommend using ProxySQL and Orchestrator. Check also this blog post with details on how to set up both products.
Q: What is the best architecture to use master-master replication with active-active mode?
A: It depends on your needs. If you can install your nodes close enough to use synchronous replication – do it. If nodes should be geographically distributed, the only option is asynchronous replication. In this case, consider safety recommendations that I discussed at the webinar.