
PMM disc space usage

Latest Forum Posts - September 20, 2016 - 6:53am
Just installed an instance of PMM as a trial, monitoring 3 Percona XtraDB Cluster nodes.

After just 48 hours of gathering statistics, the Prometheus data storage has reached 9.7GiB. So, roughly 35GiB of storage to monitor just 3 MySQL servers for a week?

Seems a little excessive to me...
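
A rough way to project this growth is to measure the Prometheus data directory inside the PMM Server container and extrapolate (a sketch; /opt/prometheus/data is assumed to be the data path used by the PMM 1.x image):

# Current Prometheus usage inside the pmm-server container
docker exec -it pmm-server du -sh /opt/prometheus/data

# Back-of-the-envelope projection: 9.7 GiB / 2 days ~ 4.85 GiB/day ~ 34 GiB/week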

Open Source Databases at Percona Live Europe, Amsterdam

Latest MySQL Performance Blog posts - September 19, 2016 - 3:18pm

In this blog post, I’ll review some of the open source database technologies discussions at Percona Live Europe.

I’ve already written about the exciting PostgreSQL and MongoDB content at Percona Live Europe in Amsterdam, and now I’m going to highlight some of our open source database content.  

In the last five years, the open source database community has been flourishing. There has been an explosion of creativity and innovation. The community has created many niche (and not so niche) solutions for various problems.

As a software engineer or architect, the number of available database options might excite you. You might also be intimidated about how to make the right technology choice. At Percona Live Europe, we have introductory talks for the relevant technologies that we find particularly interesting. These talks will help expand your knowledge about the available solutions and lessen intimidation at the same time.

I’m looking forward to the exciting technologies and talks that we’ll cover this year.

For talks and tutorials on specific use cases, check out the following sessions:

  • RocksDB is a very cool write-optimized (LSM) storage engine, and one of the few that has been used in more than one database. In addition to the RocksDB-based systems inside Facebook, it can be used with MongoDB as MongoRocks and with MySQL as MyRocks. It is also used inside next-generation database systems such as CockroachDB and TiDB. We have a lot of talks about RocksDB and related integrations, ranging from a MyRocks tutorial by Yoshinori Matsunobu, to a talk about MongoRocks by Igor Canadi, and a performance-focused talk by Mark Callaghan.
  • Elastic is the leading technology for open source full-text search implementations (hence its previous name, ElasticSearch) — but it is much more than that. ElasticSearch, Kibana, Logstash and Beats allow you to get data from a variety of data sources and to analyze and visualize it. Philip Krenn will talk about full-text search in general in his Full-Text Search Explained talk, as well as in more detail about ElasticSearch in ElasticSearch for SQL Users.
  • I am sure you’ve heard about Redis, the Swiss army knife of different data structures and operations. Redis covers many typical data tasks, from caching to maintaining counters and queues. Justin Starry will talk about Redis at scale in Powering Millions of live streams at Periscope, and Itamar Haber will talk about Extending Redis with Modules to make Redis an even more powerful data store.
  • Apache Spark is another technology you’ve surely heard about. Apache Spark adoption has skyrocketed in recent years due to its high-performance in-memory data analysis, replacing or supplementing Hadoop installations. We will hear about Badoo’s experience processing 11 billion events a day with Spark from Alexander Krasheninnikov, and also learn how to use Spark with MongoDB, MySQL and Redis with Tim Vaillancourt.
  • Apache Cassandra is a database focused on high availability and high performance, even when replicating among several data centers. When you think “eventual consistency,” perhaps Cassandra is the first technology that comes to mind. Cassandra allows you to do some impressive things, and Duy Hai Doan will show us some of them in his talk 7 things in Cassandra that you cannot find in RDBMS.
  • ClickHouse is the new kid on the block, but I’m very excited about this distributed column store system for high-performance analytics. Built by the Yandex team to power real-time analytics on the scale of trillions of database records, ClickHouse went open source earlier this year. Victor Tarnavsky will share more details in his talk.
  • Apache Ignite is another new but very exciting technology. Described as an in-memory data fabric, it can be used for a variety of applications to supplement or replace relational databases — ranging from advanced data caching strategies to parallel in-memory processing of large quantities of data. Christos Erotocritou will talk about some of these use cases in his talk Turbocharge Your SQL Queries In-Memory with Apache Ignite.
  • RethinkDB is an interesting open source NoSQL database built from the ground up for scalable real-time applications. The end-to-end real-time data streaming feature is really cool, and allows you to build interactive real-time applications much more easily. Ilya Verbitskiy will talk about RethinkDB in his Agile web-development with RethinkDB talk.
  • CockroachDB is a distributed database focused on survivability and high performance (borrowing some ideas from Google’s innovative Spanner database). Marc Berhault will talk about database rocket science in his Inside CockroachDB’s Survivability Model.
  • TiDB is another open source NewSQL database, inspired by Google Spanner and F1. It can use a variety of storage engines for data storage, and it supports the MySQL wire protocol to ease application migration. Max Liu explains How TiDB was built in his talk.
  • ToroDB is a very interesting piece of technology. It is protocol-compatible with MongoDB, but stores data through a relational format in PostgreSQL. This can offer substantial space reduction and performance improvements for some workloads. Álvaro Hernández from 8Kdata will discuss this technology in his ToroDB: All your MongoDB data are belong to SQL talk.

As you can see, we cover a wealth of exciting open source database technologies at Percona Live Europe. Do not miss the chance to expand your database horizons and learn about new developments in the industry. There is still time to register! Use the code PZBlog for a €30 discount off your registration price!

Amsterdam eWeek

Percona Live Europe 2016 is part of Amsterdam eWeek. Amsterdam eWeek provides a platform for national and international companies focused on online marketing, media and technology, and for the business managers and entrepreneurs who use them, whether in retail, healthcare, finance, the games industry or media. Check it out!

Percona Server for MongoDB 3.2.9-2.1 is now available

Latest MySQL Performance Blog posts - September 19, 2016 - 9:17am

Percona announces the release of Percona Server for MongoDB 3.2.9-2.1 on September 19, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB 3.2.9-2.1 is an enhanced, open-source, fully compatible, highly scalable, zero-maintenance downtime database supporting the MongoDB v3.2 protocol and drivers. It extends MongoDB with MongoRocks, Percona Memory Engine, and PerconaFT storage engine, as well as enterprise-grade features like external authentication and audit logging at no extra cost. Percona Server for MongoDB requires no changes to MongoDB applications or code.

Note:

We deprecated the PerconaFT storage engine. It will not be available in future releases.

This release is based on MongoDB 3.2.9. There are no additional improvements or new features on top of those upstream fixes.

The release notes are available in the official documentation.

 

pt-table-sync on master-master replication usage help

Latest Forum Posts - September 18, 2016 - 11:00pm
Good day everyone

I am hoping someone can assist me here.

I have always used pt-table-checksum to verify integrity on master-slave replication and we had our own processes in place to sync those.

However, I recently started looking at someone else's databases and came across pt-table-sync option.

However, before I run pt-table-sync, I have some questions about its use, to make sure I do not overwrite data through incorrect usage.


The databases are in a master-master replication setup, and while the tables are being synced, replication from master2 back to master1 has been temporarily disabled.

I have run pt-table-checksum and have a list of databases and tables that are out of sync.

I now need to run pt-table-sync to actually synchronize the tables, however here are my questions / concerns :

1. On which of the hosts do I actually run the pt-table-sync command? (master2 is the database that is mostly out of sync)
2. Leaving aside the --dry-run option in the command below (I will change it to --execute once I am happy with it, or to --print and then process the output from there), is the command itself correct?

pt-table-sync --dry-run --wait=0 --sync-to-master h=<ip of master2 [slave]>,u=checksum_user,p=<password> --replicate percona.checksums
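
For reference, a cautious workflow that is often used with --replicate looks like the sketch below (placeholders only, not specific to this setup; always review the --print output before switching to --execute):

# 1. Refresh the checksums from the current master (master1)
pt-table-checksum --replicate=percona.checksums h=<ip of master1>,u=checksum_user,p=<password>

# 2. Preview the statements pt-table-sync would run, connecting to the out-of-sync server (master2)
pt-table-sync --print --sync-to-master --replicate percona.checksums h=<ip of master2>,u=checksum_user,p=<password>

# 3. Re-run the same command with --execute once the printed SQL looks correct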


I would really appreciate the help here, as I have not used this tool before and am therefore being very cautious.

Regards

Can we initialize a new node from a slave?

Latest Forum Posts - September 17, 2016 - 7:27pm
In case we have a 3-node cluster (n1/n2/n3) and a slave s1 replicating from n1, and we want to add another node n4 to the cluster: I did an innobackupex full backup from s1, and after apply-logs I have
!1160 $ cat xtrabackup_galera_info
c11c2c75-6e7f-11e6-a8e4-f3070c514224:5975027

and put that into grastate.dat, but the node always goes to SST and IST does not work. Can you help us here?
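
For reference, a grastate.dat seeded from xtrabackup_galera_info would normally look like the sketch below (Galera 3.x format assumed; the datadir path is taken from the log further down, and the file must be readable by mysql):

cat > /export/data/mysql/grastate.dat <<'EOF'
# GALERA saved state
version: 2.1
uuid:    c11c2c75-6e7f-11e6-a8e4-f3070c514224
seqno:   5975027
cert_index:
EOF
chown mysql:mysql /export/data/mysql/grastate.dat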


2016-09-12 09:05:53 29469 [Note] WSREP: save pc into disk
2016-09-12 09:05:53 29469 [Note] WSREP: discarding pending addr without UUID: tcp://10.2.1.239:4567
2016-09-12 09:05:54 29469 [Note] WSREP: gcomm: connected
2016-09-12 09:05:54 29469 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
2016-09-12 09:05:54 29469 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
2016-09-12 09:05:54 29469 [Note] WSREP: Opened channel 'Cluster01'
2016-09-12 09:05:54 29469 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = no, my_idx = 0, memb_num = 3
2016-09-12 09:05:54 29469 [Note] WSREP: Waiting for SST to complete.
2016-09-12 09:05:54 29469 [Note] WSREP: STATE_EXCHANGE: sent state UUID: 1c101e3d-78c8-11e6-b8c1-9bfa0f3f96d4
2016-09-12 09:05:54 29469 [Note] WSREP: STATE EXCHANGE: sent state msg: 1c101e3d-78c8-11e6-b8c1-9bfa0f3f96d4
2016-09-12 09:05:54 29469 [Note] WSREP: STATE EXCHANGE: got state msg: 1c101e3d-78c8-11e6-b8c1-9bfa0f3f96d4 from 0 (Node3)
2016-09-12 09:05:54 29469 [Note] WSREP: STATE EXCHANGE: got state msg: 1c101e3d-78c8-11e6-b8c1-9bfa0f3f96d4 from 1 (Node1)
2016-09-12 09:05:54 29469 [Note] WSREP: STATE EXCHANGE: got state msg: 1c101e3d-78c8-11e6-b8c1-9bfa0f3f96d4 from 2 (Node2)
2016-09-12 09:05:54 29469 [Note] WSREP: Quorum results:
version = 4,
component = PRIMARY,
conf_id = 24,
members = 2/3 (joined/total),
act_id = 8368131,
last_appl. = -1,
protocols = 0/7/3 (gcs/repl/appl),
group UUID = c11c2c75-6e7f-11e6-a8e4-f3070c514224
2016-09-12 09:05:54 29469 [Note] WSREP: Flow-control interval: [28, 28]
2016-09-12 09:05:54 29469 [Note] WSREP: Shifting OPEN -> PRIMARY (TO: 8368131)
2016-09-12 09:05:54 29469 [Note] WSREP: State transfer required:
Group state: c11c2c75-6e7f-11e6-a8e4-f3070c514224:8368131
Local state: c11c2c75-6e7f-11e6-a8e4-f3070c514224:5975027
2016-09-12 09:05:54 29469 [Note] WSREP: New cluster view: global state: c11c2c75-6e7f-11e6-a8e4-f3070c514224:8368131, view# 25: Primary, number of nodes: 3, my index: 0, protocol version 3
2016-09-12 09:05:54 29469 [Warning] WSREP: Gap in state sequence. Need state transfer.
2016-09-12 09:05:54 29469 [Note] WSREP: Running: 'wsrep_sst_xtrabackup-v2 --role 'joiner' --address 'pxc03.stg' --datadir '/export/data/mysql/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '29469' --binlog 'mysql-bin' '
WSREP_SST: [INFO] Streaming with xbstream (20160912 09:05:54.230)
WSREP_SST: [INFO] Using socat as streamer (20160912 09:05:54.232)
WSREP_SST: [INFO] Evaluating timeout -k 110 100 socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20160912 09:05:54.263)
2016-09-12 09:05:54 29469 [Note] WSREP: Prepared SST request: xtrabackup-v2|pxc03.stg:4444/xtrabackup_sst//1
2016-09-12 09:05:54 29469 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
2016-09-12 09:05:54 29469 [Note] WSREP: REPL Protocols: 7 (3, 2)
2016-09-12 09:05:54 29469 [Note] WSREP: Service thread queue flushed.
2016-09-12 09:05:54 29469 [Note] WSREP: Assign initial position for certification: 8368131, protocol version: 3
2016-09-12 09:05:54 29469 [Note] WSREP: Service thread queue flushed.
2016-09-12 09:05:54 29469 [Note] WSREP: IST receiver addr using tcp://pxc03.stg:4568
2016-09-12 09:05:54 29469 [Note] WSREP: Prepared IST receiver, listening at: tcp://10.2.1.238:4568
2016-09-12 09:05:54 29469 [Note] WSREP: Member 0.0 (Node3) requested state transfer from '*any*'. Selected 1.0 (Node1)(SYNCED) as donor.
2016-09-12 09:05:54 29469 [Note] WSREP: Shifting PRIMARY -> JOINER (TO: 8368131)
2016-09-12 09:05:54 29469 [Note] WSREP: Requesting state transfer: success, donor: 1
WSREP_SST: [INFO] Proceeding with SST (20160912 09:05:54.624)
WSREP_SST: [INFO] Evaluating socat -u TCP-LISTEN:4444,reuseaddr stdio | xbstream -x; RC=( ${PIPESTATUS[@]} ) (20160912 09:05:54.624)
WSREP_SST: [INFO] Cleaning the existing datadir and innodb-data/log directories (20160912 09:05:54.626)
removed ‘/export/data/mysql/fwmrm_track/custom_report_schedule.frm’

auth_pam_compat delayed failure?

Latest Forum Posts - September 16, 2016 - 8:29am
We have multiple instances of MySQL (Percona 5.6) on a single box. All are utilizing the auth_pam_compat plugin to allow AD authentication to each instance.

Two nights ago (8PM EDT) we made some changes to the *.cnf files for all instances and restarted the instances.

As of noon yesterday, we've lost the ability to log in to one instance (PETE) with Active Directory credentials. We can still access PETE with local accounts, and we are still able to access the other instances via AD authentication.

My /var/log/secure is showing these errors when trying to connect to PETE with AD credentials:

account checks fail for '<user>@<domain>': user disallowed by .k5login file for '<user>'
authentication fails for '<user>' (<user>@<domain>): Permission denied (Success)
pam_krb5[8245]: pam_authenticate returning 6 (Permission denied)


Of interest: there are no $HOME directories for these AD users on this server (hence no .k5login file). My users connect to the MySQL instances through Windows clients (MySQL Workbench, SQLyog, etc.).

In the *.cnf file for PETE, the entries we changed are these:
# 2016-9-13
innodb-write-io-threads = 64 (original value: unset)
innodb-read-io-threads = 64 (original value: unset)
innodb-use-native-aio = 0 (original value: unset)
innodb-buffer-pool-size = 250G (original value: 25G)
innodb-log-file-size = 5000M (original value: 32M)


The same changes were made to 3 other instances on the same box last night (except that innodb-buffer-pool-size was not set as large on those instances), and they are all working fine except this one. I can find nothing to correlate innodb-buffer-pool-size with authentication errors, but could it be related?

After exhausting all options we finally just gave in and restarted this instance (not the entire server), and AD authentication was restored!

For about 15 hours.

The behavior is back, and now I'm suspecting the log changes to be the culprit, but I can find no evidence of such a thing.

Could there be a relation between innodb-buffer-pool-size or innodb-log-file-size and Percona's auth_pam(_compat) plugin?

Consul, ProxySQL and MySQL HA

Latest MySQL Performance Blog posts - September 16, 2016 - 8:20am

When it comes to “decision time” about which type of MySQL HA (high-availability) solution to implement, and how to architect the solution, many questions come to mind. The most important questions are:

  • “What are the best tools to provide HA and Load Balancing?”
  • “Should I be deploying this proxy tool on my application servers or on a standalone server?”

Ultimately, the best tool really depends on the needs of your application and your environment. You might already be using specific tools such as Consul or MHA, or you might be looking to implement tools that provide richer features. The dilemma of deploying a proxy instance per application host versus a standalone proxy instance is usually a trade-off between “a less effective load balancing algorithm” or “a single point of failure.” Neither is desirable, but there are ways to implement a solution that balances all aspects.

In this article, we’ll go through a solution that is suitable for an application that has not been coded to split reads and writes over separate MySQL instances. An application like this would rely on a proxy or third-party tool to split reads/writes, and preferably a solution that has high availability at the proxy layer. The solution described here consists of ProxySQL, Consul and Master High Availability (MHA). Within this article, we’ll focus on the configuration required for ProxySQL and Consul since there are many articles that cover MHA configuration (such as Miguel’s recent MHA Quick Start Guide blog post).

When deploying Consul in production, a minimum of three instances is recommended. In this example, the Consul agents run on the application server (appserver) as well as on the two “ProxySQL servers” mysql1 and mysql2 (which act as the HA proxy pair). This is not a hard requirement, and these instances can easily run on another host or Docker container. MySQL is deployed locally on mysql1 and mysql2, however this could just as well be 1..n separate standalone DB server instances.

So let’s move on to the actual configuration of this HA solution, starting with Consul.

Installation of Consul:

Firstly, we’ll need to install the required packages, download the Consul archive and perform the initial configuration. We’ll need to perform the same installation on each of the nodes (i.e., appserver, mysql1 and mysql2).

### Install pre-requisite packages:
sudo yum -y install wget unzip bind-utils dnsmasq

### Install Consul:
sudo useradd consul
sudo mkdir -p /opt/consul /etc/consul.d
sudo touch /var/log/consul.log /etc/consul.d/proxysql.json
cd /opt/consul
sudo wget https://releases.hashicorp.com/consul/0.6.4/consul_0.6.4_linux_amd64.zip
sudo unzip consul_0.6.4_linux_amd64.zip
sudo ln -s /opt/consul/consul /usr/bin/consul
sudo chown consul:consul -R /etc/consul* /opt/consul* /var/log/consul.log

Configuration of Consul on Application Server (used as ‘bootstrap’ node):

Now that we’re done with the installation on each of the hosts, let’s continue with the configuration. In this example we’ll bootstrap the Consul cluster using “appserver”:

### Edit configuration files $ sudo vi /etc/consul.conf { "datacenter": "dc1", "data_dir": "/opt/consul/", "log_level": "INFO", "node_name": "agent1", "server": true, "ui": true, "bootstrap": true, "client_addr": "0.0.0.0", "advertise_addr": "192.168.1.119" ## Add server IP here } ###### $ sudo vi /etc/consul.d/proxysql.json {"services": [ { "id": "proxy1", "name": "proxysql", "address": "192.168.1.120", "tags": ["mysql"], "port": 6033, "check": { "script": "mysqladmin ping --host=192.168.1.120 --port=6033 --user=root --password=123", "interval": "3s"} }, { "id": "proxy2", "name": "proxysql", "address": "192.168.1.121", "tags": ["mysql"], "port": 6033, "check": { "script": "mysqladmin ping --host=192.168.1.121 --port=6033 --user=root --password=123", "interval": "3s"} } ] } ###### ### Start Consul agent $ sudo su - consul -c 'consul agent -config-file=/etc/consul.conf -config-dir=/etc/consul.d > /var/log/consul.log &' ### Setup DNSMASQ (as root) echo "server=/consul/127.0.0.1#8600" > /etc/dnsmasq.d/10-consul service dnsmasq restart ### Remember to add the localhost as a DNS server (this step can vary ### depending on how your DNS servers are managed... here I'm just ### adding the following line to resolve.conf: sudo vi /etc/resolve.conf #... snippet ...# nameserver 127.0.0.1 #... snippet ...# ### Restart dnsmasq sudo service dnsmasq restart

The service should now be started, and you can verify this in the logs in “/var/log/consul.log”.
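
A quick sanity check of the agent at this point might look like this (a sketch using the standard Consul CLI):

# List the members the local agent knows about (only itself until the other agents join)
consul members

# Follow the agent log for errors
tail -f /var/log/consul.log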

Configuration of Consul on Proxy Servers:

The next item is to configure each of the proxy Consul agents. Note that the “agent name” and the “IP address” need to be updated for each host (values for both must be unique):

### Edit configuration files $ sudo vi /etc/consul.conf { "datacenter": "dc1", "data_dir": "/opt/consul/", "log_level": "INFO", "node_name": "agent2", ### Agent node name must be unique "server": true, "ui": true, "bootstrap": false, ### Disable bootstrap on joiner nodes "client_addr": "0.0.0.0", "advertise_addr": "192.168.1.xxx", ### Set to local instance IP "dns_config": { "only_passing": true } } ###### $ sudo vi /etc/consul.d/proxysql.json {"services": [ { "id": "proxy1", "name": "proxysql", "address": "192.168.1.120", "tags": ["mysql"], "port": 6033, "check": { "script": "mysqladmin ping --host=192.168.1.120 --port=6033 --user=root --password=123", "interval": "3s"} }, { "id": "proxy2", "name": "proxysql", "address": "192.168.1.121", "tags": ["mysql"], "port": 6033, "check": { "script": "mysqladmin ping --host=192.168.1.121 --port=6033 --user=root password=123", "interval": "3s"} } ] } ###### ### Start Consul agent: $ sudo su - consul -c 'consul agent -config-file=/etc/consul.conf -config-dir=/etc/consul.d > /var/log/consul.log &' ### Join Consul cluster specifying 1st node IP e.g. $ consul join 192.168.1.119 ### Verify logs and look out for the following messages: $ cat /var/log/consul.log ==> Starting Consul agent... ==> Starting Consul agent RPC... ==> Consul agent running! Node name: 'agent2' Datacenter: 'dc1' Server: true (bootstrap: false) Client Addr: 0.0.0.0 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400) Cluster Addr: 192.168.1.120 (LAN: 8301, WAN: 8302) Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false Atlas: ==> Log data will now stream in as it occurs: # ... snippet ... 2016/09/05 19:48:04 [INFO] agent: Synced service 'consul' 2016/09/05 19:48:04 [INFO] agent: Synced check 'service:proxysql1' 2016/09/05 19:48:04 [INFO] agent: Synced check 'service:proxysql2' # ... snippet ...

At this point, we have Consul installed, configured and running on each of our hosts (appserver, mysql1 and mysql2). Now it’s time to install and configure ProxySQL on mysql1 and mysql2.

Installation & Configuration of ProxySQL:

The same procedure should be run on both mysql1 and mysql2 hosts:

### Install ProxySQL packages and initialise ProxySQL DB
sudo yum -y install https://github.com/sysown/proxysql/releases/download/v1.2.2/proxysql-1.2.2-1-centos7.x86_64.rpm
sudo service proxysql initial
sudo service proxysql stop

### Edit the ProxySQL configuration file to update username / password
vi /etc/proxysql.cnf
###
admin_variables=
{
    admin_credentials="admin:admin"
    mysql_ifaces="127.0.0.1:6032;/tmp/proxysql_admin.sock"
}
###

### Start ProxySQL
sudo service proxysql start

### Connect to ProxySQL and configure
mysql -P6032 -h127.0.0.1 -uadmin -padmin

### First we create a replication hostgroup:
mysql> INSERT INTO mysql_replication_hostgroups VALUES (10,11,'Standard Replication Groups');

### Add both nodes to hostgroup 11 (ProxySQL will automatically put the writer node in hostgroup 10)
mysql> INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.120',11,3306,1000);
mysql> INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.121',11,3306,1000);

### Save server configuration
mysql> LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;

### Add query rules for RW split
mysql> INSERT INTO mysql_query_rules (active, match_pattern, destination_hostgroup, cache_ttl, apply) VALUES (1, '^SELECT .* FOR UPDATE', 10, NULL, 1);
mysql> INSERT INTO mysql_query_rules (active, match_pattern, destination_hostgroup, cache_ttl, apply) VALUES (1, '^SELECT .*', 11, NULL, 1);
mysql> LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;

### Finally configure ProxySQL user and save configuration
mysql> INSERT INTO mysql_users (username,password,active,default_hostgroup,default_schema) VALUES ('root','123',1,10,'test');
mysql> LOAD MYSQL USERS TO RUNTIME; SAVE MYSQL USERS TO DISK;
mysql> EXIT;

MySQL Configuration:

We also need to perform one configuration step on the MySQL servers in order to create a user for ProxySQL to monitor the instances:

### Create ProxySQL's monitor user on the master MySQL server (default username and password is monitor/monitor)
mysql -h192.168.1.120 -P3306 -uroot -p123 -e"GRANT USAGE ON *.* TO monitor@'%' IDENTIFIED BY 'monitor';"

We can view the configuration of the monitor user on the ProxySQL host by checking the global variables on the admin interface:

mysql> SHOW VARIABLES LIKE 'mysql-monitor%';
+----------------------------------------+---------+
| Variable_name                          | Value   |
+----------------------------------------+---------+
| mysql-monitor_enabled                  | true    |
| mysql-monitor_connect_timeout          | 200     |
| mysql-monitor_ping_max_failures        | 3       |
| mysql-monitor_ping_timeout             | 100     |
| mysql-monitor_replication_lag_interval | 10000   |
| mysql-monitor_replication_lag_timeout  | 1000    |
| mysql-monitor_username                 | monitor |
| mysql-monitor_password                 | monitor |
| mysql-monitor_query_interval           | 60000   |
| mysql-monitor_query_timeout            | 100     |
| mysql-monitor_slave_lag_when_null      | 60      |
| mysql-monitor_writer_is_also_reader    | true    |
| mysql-monitor_history                  | 600000  |
| mysql-monitor_connect_interval         | 60000   |
| mysql-monitor_ping_interval            | 10000   |
| mysql-monitor_read_only_interval       | 1500    |
| mysql-monitor_read_only_timeout        | 500     |
+----------------------------------------+---------+

Testing Consul:

Now that Consul and ProxySQL are configured we can do some tests from the “appserver”. First, we’ll verify that the hosts we’ve added are both reporting [OK] on our DNS requests:

$ dig @127.0.0.1 -p 53 proxysql.service.consul

; <<>> DiG 9.9.4-RedHat-9.9.4-29.el7_2.3 <<>> @127.0.0.1 -p 53 proxysql.service.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 9975
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;proxysql.service.consul. IN A

;; ANSWER SECTION:
proxysql.service.consul. 0 IN A 192.168.1.121
proxysql.service.consul. 0 IN A 192.168.1.120

;; Query time: 1 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Sep 05 19:32:12 UTC 2016
;; MSG SIZE rcvd: 158

As you can see from the output above, DNS is reporting both 192.168.1.120 and 192.168.1.121 as available for the ProxySQL service. As soon as the ProxySQL check fails for a node, that node will no longer appear in the output above.
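
If you also want the service port (6033) from DNS rather than just the A records, Consul publishes SRV records as well; a quick check against the agent's DNS port could look like this (a sketch):

# Query Consul's DNS interface directly (port 8600) for SRV records
dig @127.0.0.1 -p 8600 proxysql.service.consul SRV +short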

We can also view the status of our cluster and agents through the Consul Web GUI which runs on port 8500 of all the Consul servers in this configuration (e.g. http://192.168.1.120:8500/):

Testing ProxySQL:

So now that we have this configured we can also do some basic tests to see that ProxySQL is load balancing our connections:

[percona@appserver consul.d]$ mysql -hproxysql.service.consul -e"select @@hostname"
+--------------------+
| @@hostname         |
+--------------------+
| mysql1.localdomain |
+--------------------+
[percona@appserver consul.d]$ mysql -hproxysql.service.consul -e"select @@hostname"
+--------------------+
| @@hostname         |
+--------------------+
| mysql2.localdomain |
+--------------------+

Perfect! We’re ready to use the hostname “proxysql.service.consul” to connect to our MySQL instances through a round-robin load balanced and highly available proxy solution. If one of the two ProxySQL instances fails, we’ll continue communicating with the database through the other. Of course, this configuration is not limited to just two hosts, so feel free to add as many as you need. Be aware that in this example the two hosts’ replication hierarchy is managed by MHA in order to allow for master/slave promotion. By performing an automatic or manual failover using MHA, ProxySQL automatically detects the change in replication topology and redirects writes to the newly promoted master instance.

To make this configuration more durable, we encourage you to create a more intelligent Consul check – i.e., a check that verifies more than just the availability of the MySQL service (for example, by selecting some data from a table). It is also recommended to fine-tune the interval of the check to suit the requirements of your application.
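
As a sketch of such a check, the "script" entry in proxysql.json could run a real query through the ProxySQL port instead of a bare ping (test.sbtest1 is just a placeholder for whatever table your application actually reads):

# Example check command: fetch a real row through ProxySQL rather than only pinging it
mysql --host=192.168.1.120 --port=6033 --user=root --password=123 -BN \
      -e "SELECT id FROM test.sbtest1 LIMIT 1" > /dev/null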

Not getting query analytics with 1.0.4

Latest Forum Posts - September 16, 2016 - 2:30am
I've added 4 clients - 3 nodes of PXC and one node of MariaDB. OS and MySQL metrics show up for all the hosts, but no query analytics. On the QAN page I get this message for MariaDB: "There is no data for the selected MySQL instance, time range or search query". For the PXC nodes I get this message: "QAN API error: "MySQL Query Analytics configuration not found". Check the /var/log/qan-api.log file in docker container for more information."

[root@mysql-cluster2 data]# pmm-admin list
pmm-admin 1.0.4

PMM Server | 192.168.12.92
Client Name | mysql-cluster2
Client Address | 192.168.12.66
Service manager | unix-systemv

-------------- --------------- ------------ -------- ----------------------------------------- ---------------------
SERVICE TYPE NAME CLIENT PORT RUNNING DATA SOURCE OPTIONS
-------------- --------------- ------------ -------- ----------------------------------------- ---------------------
linux:metrics mysql-cluster2 42000 YES -
mysql:queries mysql-cluster2 42001 YES root:***@unix(/var/lib/mysql/mysql.sock) query_source=slowlog
mysql:metrics mysql-cluster2 42002 YES root:***@unix(/var/lib/mysql/mysql.sock)

[root@mysql-cluster3 ~]# pmm-admin list
pmm-admin 1.0.4

PMM Server | 192.168.12.92
Client Name | mysql-cluster3
Client Address | 192.168.12.67
Service manager | unix-systemv

-------------- --------------- ------------ -------- ----------------------------------------- ---------------------
SERVICE TYPE NAME CLIENT PORT RUNNING DATA SOURCE OPTIONS
-------------- --------------- ------------ -------- ----------------------------------------- ---------------------
linux:metrics mysql-cluster3 42000 YES -
mysql:queries mysql-cluster3 42001 YES root:***@unix(/var/lib/mysql/mysql.sock) query_source=slowlog
mysql:metrics mysql-cluster3 42002 YES root:***@unix(/var/lib/mysql/mysql.sock)


[root@mysql-cluster1 ~]# pmm-admin list
pmm-admin 1.0.4

PMM Server | 192.168.12.92
Client Name | mysql-cluster1
Client Address | 192.168.12.65
Service manager | unix-systemv

-------------- --------------- ------------ -------- ----------------------------------------- ---------------------
SERVICE TYPE NAME CLIENT PORT RUNNING DATA SOURCE OPTIONS
-------------- --------------- ------------ -------- ----------------------------------------- ---------------------
linux:metrics mysql-cluster1 42000 YES -
mysql:queries mysql-cluster1 42001 YES root:***@unix(/var/lib/mysql/mysql.sock) query_source=slowlog
mysql:metrics mysql-cluster1 42002 YES root:***@unix(/var/lib/mysql/mysql.sock)


[root@pmm-dev-noc01 vmware-tools-distrib]# pmm-admin list
pmm-admin 1.0.4

PMM Server | 192.168.12.92
Client Name | pmm-dev-noc01
Client Address | 192.168.12.92
Service manager | linux-systemd

-------------- -------------- ------------ -------- ----------------------------- ---------------------
SERVICE TYPE NAME CLIENT PORT RUNNING DATA SOURCE OPTIONS
-------------- -------------- ------------ -------- ----------------------------- ---------------------
linux:metrics pmm-dev-noc01 42000 YES -
mysql:queries pmm-dev-noc01 42001 YES root:***@tcp(localhost:3306) query_source=slowlog
mysql:metrics pmm-dev-noc01 42002 YES root:***@tcp(localhost:3306)


I've checked that the slowlog is collecting queries, but nothing shows up on the qan page.
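
Two checks that usually narrow this kind of problem down (a sketch; check-network availability depends on your pmm-admin build, and the container name pmm-server is assumed):

# Verify client <-> server connectivity and agent status from each client
pmm-admin check-network

# Inspect the QAN API log referenced in the error message
docker exec -it pmm-server tail -n 100 /var/log/qan-api.log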

Thanks.

PMM Server on Suse SLES12

Latest Forum Posts - September 15, 2016 - 3:57pm
Hello,
Trying to run version 1.0.4 on a SLES12 server and getting this error message:
docker: Error response from daemon: driver failed programming external connectivity on endpoint pmm-server (f5f26a224832a1a6fa5a57a81faa2d2a10209f0d5d49ccd8f179bcde34306f05): iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 0/0 --dport 32807 -j DNAT --to-destination 172.17.0.2:443 ! -i docker0: iptables: No chain/target/match by that name.
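
This particular iptables failure usually means Docker's NAT chains were flushed (for example by a firewall reload); restarting the Docker daemon normally recreates them (a sketch, assuming systemd manages Docker on this SLES12 host):

# Recreate Docker's iptables chains, then start the container again
sudo systemctl restart docker
sudo docker start pmm-server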

David

ProxySQL and Percona XtraDB Cluster (Galera) Integration

Latest MySQL Performance Blog posts - September 15, 2016 - 3:37pm

In this post, we’ll discuss how an integrated ProxySQL and Percona XtraDB Cluster (Galera) helps manage node states and failovers.

ProxySQL is designed to not perform any specialized operation in relation to the servers with which it communicates. Instead, it uses an event scheduler to extend functionalities and cover any special needs.

Given that specialized products like Percona XtraDB Cluster are not managed by ProxySQL, they require the design and implementation of good/efficient extensions.

In this article, I will illustrate how Percona XtraDB Cluster/Galera can be integrated with ProxySQL to get the best from both.

Brief digression

Before discussing their integration, we need to review a couple of very important concepts in ProxySQL. ProxySQL has a very important logical component: Hostgroup(s) (HG).

A hostgroup is a relation of:

+-----------+       +------------------------+
|Host group +------>|Server (1:N)            |
+-----------+       +------------------------+

In ProxySQL, QueryRules (QR) can be directly mapped to an HG. Using QRs, you can define a specific user to ONLY go to that HG. For instance, you may want to have user app1_user go only on servers A-B-C. Simply set a QR that says app1_user has the destination hostgroup 5, where HG 5 has the servers A-B-C:

INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.5',5,3306,10);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.6',5,3306,10);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.7',5,3306,10);
INSERT INTO mysql_query_rules (username,destination_hostgroup,active) values('app1_user',5,1);

Easy isn’t it?

Another important concept in ProxySQL also related to HG is ReplicationHostgroup(s) (RHG). This is a special HG that ProxySQL uses to automatically manage the nodes that are connected by replication and configured in Write/Read and Read_only mode.

What does this mean? Let’s say you have four nodes A-B-C-D, connected by standard asynchronous replication. A is the master and B-C-D are the slaves. What you want is to have your application pointing writes to server A, and reads to B-C (keeping D as a backup slave). Also, you don’t want to have any reads go to B-C if the replication delay is more than two seconds.

Using an RHG in conjunction with HGs, ProxySQL can manage all of this for you. Simply instruct the proxy to:

  1. Use RHG
  2. Define the value of the maximum latency

Using the example above:

INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) VALUES ('192.168.1.5',5,3306,10,2);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) VALUES ('192.168.1.6',5,3306,10,2);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) VALUES ('192.168.1.7',5,3306,10,2);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_replication_lag) VALUES ('192.168.1.8',10,3306,10,2);
INSERT INTO mysql_query_rules (username,destination_hostgroup,active) values('app1_user',5,1);
INSERT INTO mysql_query_rules (username,destination_hostgroup,active) values('app1_user',6,1);
INSERT INTO mysql_replication_hostgroups VALUES (5,6);

From now on ProxySQL will split the R/W using the RHG and the nodes defined in HG 5. The flexibility introduced by using HGs is obviously not limited to what I mention here. It will play a good part in the integration of Percona XtraDB Cluster and ProxySQL, as I illustrate below.

Percona XtraDB Cluster/Galera Integration

In an XtraDB cluster, a node has many different states and conditions that affect if and how your application operates on the node.

The most common one is when a node becomes a DONOR. If you’ve ever installed Percona XtraDB Cluster (or any Galera implementation), you’ve faced the situation where a node becomes a DONOR and changes state to DESYNC. If the node is under a heavy load, the DONOR process might affect the node itself.

But that is just one of the possible node states:

  • A node can be JOINED but not synced
  • It can have wsrep_rejectqueries, wsrep_donorrejectqueries, wsrep_ready (off)
  • It can be in a different segment
  • The number of nodes per segment is relevant.

To show what can be done and how, we will use the following setup:

  • Five nodes
  • Two segments
  • Applications requiring R/W split

And two options:

  • Single writer node
  • Multiple writer nodes

We’ll analyze how the proxy behaves under the use of a script run by the ProxySQL scheduler.

The use of a script is necessary for ProxySQL to respond correctly to Percona XtraDB Cluster state modifications. ProxySQL comes with two scripts for Galera, but both of them are too basic and don’t consider a lot of relevant conditions. I’ve written a more complete script, galera_check.pl: https://github.com/Tusamarco/proxy_sql_tools

This script is a prototype and requires QA and debugging, but is still more powerful than the default ones.

The script is designed to manage X number of nodes that belong to a given HG. The script works per HG and, as such, performs isolated actions/checks per HG. It is not possible to have more than one check running on the same HG. The check creates a lock file {proxysql_galera_check_${hg}.pid} that is used to prevent duplicates. galera_check connects to the ProxySQL node and retrieves all the information regarding the nodes/ProxySQL configuration. It then checks each node in parallel, retrieving its status and configuration. galera_check analyzes and manages the following node states:

  • read_only
  • wsrep_status
  • wsrep_rejectqueries
  • wsrep_donorrejectqueries
  • wsrep_connected
  • wsrep_desinccount
  • wsrep_ready
  • wsrep_provider
  • wsrep_segment
  • Number of nodes per segment
  • Retry loop

As mentioned, the number of nodes inside a segment is relevant. If a node is the only one in a segment, the check behaves accordingly. For example, if a node is the only one in the MAIN segment, it will not put the node in OFFLINE_SOFT when the node becomes a donor, to prevent the cluster from becoming unavailable for applications.

The script allows you to declare a segment as MAIN — quite useful when managing production and DR sites, as the script manages the segment acting as main in a more conservative way. The check can be configured to perform retries after a given interval, where the interval is the time defined in the ProxySQL scheduler. As such, if the check is set to two retries for UP and three for DOWN, it will loop that many times before doing anything.

Percona XtraDB Cluster/Galera performs some actions under the hood, some of them not totally correct. The retry feature is useful in some uncommon circumstances where Galera behaves weirdly. For example, whenever a node is set to READ_ONLY=1, Galera desyncs and resyncs the node. A check that doesn’t take this into account would set the node to OFFLINE and back for no reason.

Another important differentiation for this check is that it uses special HGs for maintenance, all in the range of 9000. So if a node belongs to HG 10 and the check needs to put it in maintenance mode, the node will be moved to HG 9010. Once all is normal again, the node will be put back into its original HG.
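
To see whether the check has parked anything in a maintenance hostgroup, you can query the ProxySQL admin interface (a sketch; the admin port 3310 and admin:admin credentials match the setup used later in this article, so adjust them to yours):

mysql -h127.0.0.1 -P3310 -uadmin -padmin -t \
      -e "SELECT hostgroup_id, hostname, port, status FROM mysql_servers WHERE hostgroup_id >= 9000;"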

This check does NOT modify any node states. This means it will NOT modify any variables or settings in the original node. It will ONLY change node states in ProxySQL.

Multi-writer mode

The recommended way to use Galera is in multi-writer mode. You can then play with the weights to have one node act as the MAIN node and prevent/reduce certification failures and brute force aborts from Percona XtraDB Cluster. Use this configuration:

Delete from mysql_replication_hostgroups where writer_hostgroup=500 ;
delete from mysql_servers where hostgroup_id in (500,501);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.5',500,3306,1000000000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.5',501,3306,100);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.6',500,3306,1000000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.6',501,3306,1000000000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.7',500,3306,100);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.7',501,3306,1000000000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.8',500,3306,1);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.8',501,3306,1);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.9',500,3306,1);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight) VALUES ('192.168.1.9',501,3306,1);
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;

In this test, we will NOT use Replication HostGroup. We will do that later when testing a single writer. For now, we’ll focus on multi-writer.

Segment 1 covers HG 500 and 501, while segment two only covers 501. Weight for the servers in HG 500 is progressive from 1 to 1 billion, in order to reduce the possible random writes on the non-main node.

As such, the nodes and weights are:

  • HG 500
    • S1 192.168.1.5 – 1.000.000.000
    • S1 192.168.1.6 – 1.000.000
    • S1 192.168.1.7 – 100
    • S2 192.168.1.8 – 1
    • S2 192.168.1.9 – 1
  • HG 501
    • S1 192.168.1.5 – 100
    • S1 192.168.1.6 – 1.000.000.000
    • S1 192.168.1.7 – 1.000.000.000
    • S2 192.168.1.8 – 1
    • S2 192.168.1.9 – 1

The following command shows what ProxySQL is doing:

watch -n 1 'mysql -h 127.0.0.1 -P 3310 -uadmin -padmin -t -e "select * from stats_mysql_connection_pool where hostgroup in (500,501,9500,9501) order by hostgroup,srv_host ;" -e " select hostgroup_id,hostname,status,weight,comment from mysql_servers where hostgroup_id in (500,501,9500,9501) order by hostgroup_id,hostname ;"'

Download the check from GitHub (https://github.com/Tusamarco/proxy_sql_tools) and activate it in ProxySQL. Be sure to set the parameters that match your installation:

delete from scheduler where id=10;
INSERT INTO scheduler (id,active,interval_ms,filename,arg1) values
  (10,0,2000,"/var/lib/proxysql/galera_check.pl","-u=admin -p=admin -h=192.168.1.50 -H=500:W,501:R -P=3310 --execution_time=1 --retry_down=2 --retry_up=1 --main_segment=1 --debug=0 --log=/var/lib/proxysql/galeraLog");
LOAD SCHEDULER TO RUNTIME;
SAVE SCHEDULER TO DISK;

If you want to activate it:

update scheduler set active=1 where id=10;
LOAD SCHEDULER TO RUNTIME;

The following is the kind of scenario we have:

+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| 500 | 192.168.1.9 | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 413 |
| 500 | 192.168.1.8 | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 420 |
| 500 | 192.168.1.7 | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 227 |
| 500 | 192.168.1.6 | 3306 | ONLINE | 0 | 10 | 10 | 0 | 12654 | 1016975 | 0 | 230 |
| 500 | 192.168.1.5 | 3306 | ONLINE | 0 | 9 | 29 | 0 | 107358 | 8629123 | 0 | 206 |
| 501 | 192.168.1.9 | 3306 | ONLINE | 0 | 4 | 6 | 0 | 12602425 | 613371057 | 34467286486 | 413 |
| 501 | 192.168.1.8 | 3306 | ONLINE | 0 | 6 | 7 | 0 | 12582617 | 612422028 | 34409606321 | 420 |
| 501 | 192.168.1.7 | 3306 | ONLINE | 0 | 6 | 6 | 0 | 18580675 | 905464967 | 50824195445 | 227 |
| 501 | 192.168.1.6 | 3306 | ONLINE | 0 | 6 | 14 | 0 | 18571127 | 905075154 | 50814832276 | 230 |
| 501 | 192.168.1.5 | 3306 | ONLINE | 0 | 1 | 10 | 0 | 169570 | 8255821 | 462706881 | 206 |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+

To generate a load, use the following commands (or whatever you like, but use a different one for read-only and reads/writes):

Write:
sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-host=192.168.1.50 --mysql-port=3311 --mysql-user=stress_RW --mysql-password=test --mysql-db=test_galera --db-driver=mysql --oltp-tables-count=50 --oltp-tablesize=50000 --max-requests=0 --max-time=9000 --oltp-point-selects=5 --oltp-read-only=off --oltp-dist-type=uniform --oltp-reconnect-mode=transaction --oltp-skip-trx=off --num-threads=10 --report-interval=10 --mysql-ignore-errors=all run

Read only:
sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-host=192.168.1.50 --mysql-port=3311 --mysql-user=stress_RW --mysql-password=test --mysql-db=test_galera --db-driver=mysql --oltp-tables-count=50 --oltp-tablesize=50000 --max-requests=0 --max-time=9000 --oltp-point-selects=5 --oltp-read-only=on --num-threads=10 --oltp-reconnect-mode=query --oltp-skip-trx=on --report-interval=10 --mysql-ignore-errors=all run

The most common thing that can happen to a cluster node is to become a donor. This is a planned activity for Percona XtraDB Cluster, and it is supposed to be managed in a relatively harmless way.

We’re going to simulate crashing a node and forcing it to elect our main node as DONOR (the one with the highest WEIGHT).

To do so, we need to have the parameter wsrep_sst_donor set.

show global variables like 'wsrep_sst_donor';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| wsrep_sst_donor | node1 | <---
+-----------------+-------+
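
The value of wsrep_sst_donor must match the donor's wsrep_node_name; a quick way to confirm that on the intended donor (a sketch):

# Run on the node you expect to act as donor
mysql -e "SHOW GLOBAL VARIABLES LIKE 'wsrep_node_name';"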

Activate the check if not already done:

update scheduler set active=1 where id=10;

And now run traffic. Check load:

select * from stats_mysql_connection_pool where hostgroup in (500,501,9500,9501) order by hostgroup,srv_host ;
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| 500 | 192.168.1.5 | 3306 | ONLINE | 10 | 0 | 30 | 0 | 112662 | 9055479 | 0 | 120 | <--- our Donor
| 500 | 192.168.1.6 | 3306 | ONLINE | 0 | 10 | 10 | 0 | 12654 | 1016975 | 0 | 111 |
| 500 | 192.168.1.7 | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 115 |
| 500 | 192.168.1.8 | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 316 |
| 500 | 192.168.1.9 | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 329 |
| 501 | 192.168.1.5 | 3306 | ONLINE | 0 | 1 | 10 | 0 | 257271 | 12533763 | 714473854 | 120 |
| 501 | 192.168.1.6 | 3306 | ONLINE | 0 | 10 | 18 | 0 | 18881582 | 920200116 | 51688974309 | 111 |
| 501 | 192.168.1.7 | 3306 | ONLINE | 3 | 6 | 9 | 0 | 18927077 | 922317772 | 51794504662 | 115 |
| 501 | 192.168.1.8 | 3306 | ONLINE | 0 | 1 | 8 | 0 | 12595556 | 613054573 | 34447564440 | 316 |
| 501 | 192.168.1.9 | 3306 | ONLINE | 1 | 3 | 6 | 0 | 12634435 | 614936148 | 34560620180 | 329 |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+

Now, on one of the nodes, do the following (a rough shell sketch follows the list):

  1. Kill mysql
  2. Remove the content of the data directory
  3. Restart the node
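
A rough shell equivalent of those steps (destructive, only on the node you intend to rebuild; the datadir path is an assumption, use the one from the node's my.cnf):

# WARNING: wipes the local data so the node must rejoin via state transfer
service mysql stop
rm -rf /var/lib/mysql/*
service mysql start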

The node will go into SST, and our galera_check script will manage it:

+--------------+-------------+--------------+------------+--------------------------------------------------+
| hostgroup_id | hostname | status | weight | comment |
+--------------+-------------+--------------+------------+--------------------------------------------------+
| 500 | 192.168.1.5 | OFFLINE_SOFT | 1000000000 | 500_W_501_R_retry_up=0;500_W_501_R_retry_down=0; | <---- the donor
| 500 | 192.168.1.6 | ONLINE | 1000000 | |
| 500 | 192.168.1.7 | ONLINE | 100 | |
| 500 | 192.168.1.8 | ONLINE | 1 | |
| 500 | 192.168.1.9 | ONLINE | 1 | |
| 501 | 192.168.1.5 | OFFLINE_SOFT | 100 | 500_W_501_R_retry_up=0;500_W_501_R_retry_down=0; |
| 501 | 192.168.1.6 | ONLINE | 1000000000 | |
| 501 | 192.168.1.7 | ONLINE | 1000000000 | |
| 501 | 192.168.1.8 | ONLINE | 1 | |
| 501 | 192.168.1.9 | ONLINE | 1 | |
+--------------+-------------+--------------+------------+--------------------------------------------------+

We can also check the galera_check log and see what happened:

2016/09/02 16:13:27.298:[WARN] Move node:192.168.1.5;3306;500;3010 SQL: UPDATE mysql_servers SET status='OFFLINE_SOFT' WHERE hostgroup_id=500 AND hostname='192.168.1.5' AND port='3306'
2016/09/02 16:13:27.303:[WARN] Move node:192.168.1.5;3306;501;3010 SQL: UPDATE mysql_servers SET status='OFFLINE_SOFT' WHERE hostgroup_id=501 AND hostname='192.168.1.5' AND port='3306'

The node will remain in OFFLINE_SOFT for as long as it is in the DONOR state, while the other node (192.168.1.6, which has the second-highest weight) serves the writes.

All as expected: the node was set to the OFFLINE_SOFT state, which means existing connections could finish while the node did not accept any NEW connections.

As soon as the node stopped sending data to the joiner, it was moved back and traffic restarted:

2016/09/02 16:14:58.239:[WARN] Move node:192.168.1.5;3306;500;1000 SQL: UPDATE mysql_servers SET status='ONLINE' WHERE hostgroup_id=500 AND hostname='192.168.1.5' AND port='3306'
2016/09/02 16:14:58.243:[WARN] Move node:192.168.1.5;3306;501;1000 SQL: UPDATE mysql_servers SET status='ONLINE' WHERE hostgroup_id=501 AND hostname='192.168.1.5' AND port='3306'

+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+
| 500 | 192.168.1.5 | 3306 | ONLINE | 6 | 1 | 37 | 0 | 153882 | 12368557 | 0 | 72 | <---
| 500 | 192.168.1.6 | 3306 | ONLINE | 1 | 9 | 10 | 0 | 16008 | 1286492 | 0 | 42 |
| 500 | 192.168.1.7 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 1398 | 112371 | 0 | 96 |
| 500 | 192.168.1.8 | 3306 | ONLINE | 0 | 0 | 24545 | 791 | 24545 | 122725 | 0 | 359 |
| 500 | 192.168.1.9 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 15108 | 1214366 | 0 | 271 |
| 501 | 192.168.1.5 | 3306 | ONLINE | 1 | 0 | 11 | 0 | 2626808 | 128001112 | 7561278884 | 72 |
| 501 | 192.168.1.6 | 3306 | ONLINE | 5 | 7 | 20 | 0 | 28629516 | 1394974468 | 79289633420 | 42 |
| 501 | 192.168.1.7 | 3306 | ONLINE | 2 | 8 | 10 | 0 | 29585925 | 1441400648 | 81976494740 | 96 |
| 501 | 192.168.1.8 | 3306 | ONLINE | 0 | 0 | 16779 | 954 | 12672983 | 616826002 | 34622768228 | 359 |
| 501 | 192.168.1.9 | 3306 | ONLINE | 0 | 4 | 6 | 0 | 13567512 | 660472589 | 37267991677 | 271 |
+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+

This was easy, and more or less managed by the standard script. But what would happen if my donor was set to NOT serve queries while in the DONOR state?

Wait, what?? Yes, Percona XtraDB Cluster (and Galera in general) can be set to refuse any query when the node goes into DONOR state. If not managed, this can cause issues, as the node will simply reject queries (but ProxySQL sees the node as alive).

Let me show you:

show global variables like 'wsrep_sst_donor_rejects_queries';
+---------------------------------+-------+
| Variable_name                   | Value |
+---------------------------------+-------+
| wsrep_sst_donor_rejects_queries | ON    |
+---------------------------------+-------+

For the moment, let's deactivate the check. Then do the same stop, delete the data dir, and restart the node, so that SST takes place.
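Deactivating the check just means flipping its scheduler entry off in the ProxySQL admin interface. A minimal sketch, assuming the galera_check entry has id 10 (check your own scheduler table first):

-- ProxySQL admin interface (port 6032 by default)
SELECT id, active, filename FROM scheduler;    -- find the galera_check entry
UPDATE scheduler SET active=0 WHERE id=10;     -- id=10 is an assumption, use yours
LOAD SCHEDULER TO RUNTIME;                     -- apply the change immediately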

Sysbench will report:

ALERT: mysql_drv_query() returned error 2013 (Lost connection to MySQL server during query) for query 'BEGIN'
FATAL: failed to execute function `event': 3
ALERT: mysql_drv_query() returned error 2013 (Lost connection to MySQL server during query) for query 'BEGIN'
FATAL: failed to execute function `event': 3

But ProxySQL?

+-----------+-------------+----------+---------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+---------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+ | 500 | 192.168.1.5 | 3306 | ONLINE | 0 | 0 | 101 | 0 | 186331 | 14972717 | 0 | 118 | <-- no writes in wither HG | 500 | 192.168.1.6 | 3306 | ONLINE | 0 | 9 | 10 | 0 | 20514 | 1648665 | 0 | 171 | | | 500 | 192.168.1.7 | 3306 | ONLINE | 0 | 1 | 3 | 0 | 5881 | 472629 | 0 | 134 | | | 500 | 192.168.1.8 | 3306 | ONLINE | 0 | 0 | 205451 | 1264 | 205451 | 1027255 | 0 | 341 | | | 500 | 192.168.1.9 | 3306 | ONLINE | 0 | 1 | 2 | 0 | 15642 | 1257277 | 0 | 459 | - | 501 | 192.168.1.5 | 3306 | ONLINE | 1 | 0 | 13949 | 0 | 4903347 | 238627310 | 14089708430 | 118 | | 501 | 192.168.1.6 | 3306 | ONLINE | 2 | 10 | 20 | 0 | 37012174 | 1803380964 | 103269634626 | 171 | | 501 | 192.168.1.7 | 3306 | ONLINE | 2 | 11 | 13 | 0 | 38782923 | 1889507208 | 108288676435 | 134 | | 501 | 192.168.1.8 | 3306 | SHUNNED | 0 | 0 | 208452 | 1506 | 12864656 | 626156995 | 34622768228 | 341 | | 501 | 192.168.1.9 | 3306 | ONLINE | 1 | 3 | 6 | 0 | 14451462 | 703534884 | 39837663734 | 459 | +-----------+-------------+----------+---------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+ mysql> select * from mysql_server_connect_log where hostname in ('192.168.1.5','192.168.1.6','192.168.1.7','192.168.1.8','192.168.1.9') order by time_start_us desc limit 10; +-------------+------+------------------+-------------------------+--------------------------------------------------------------------------------------------------------+ | hostname | port | time_start_us | connect_success_time_us | connect_error | +-------------+------+------------------+-------------------------+--------------------------------------------------------------------------------------------------------+ | 192.168.1.9 | 3306 | 1472827444621954 | 1359 | NULL | | 192.168.1.8 | 3306 | 1472827444618883 | 0 | Can't connect to MySQL server on '192.168.1.8' (107) | | 192.168.1.7 | 3306 | 1472827444615819 | 433 | NULL | | 192.168.1.6 | 3306 | 1472827444612722 | 538 | NULL | | 192.168.1.5 | 3306 | 1472827444606560 | 473 | NULL | <-- donor is seen as up | 192.168.1.9 | 3306 | 1472827384621463 | 1286 | NULL | | 192.168.1.8 | 3306 | 1472827384618442 | 0 | Lost connection to MySQL server at 'handshake: reading inital communication packet', system error: 107 | | 192.168.1.7 | 3306 | 1472827384615317 | 419 | NULL | | 192.168.1.6 | 3306 | 1472827384612241 | 415 | NULL | | 192.168.1.5 | 3306 | 1472827384606117 | 454 | NULL | <-- donor is seen as up +-------------+------+------------------+-------------------------+--------------------------------------------------------------------------------------------------------+ select * from mysql_server_ping_log where hostname in ('192.168.1.5','192.168.1.6','192.168.1.7','192.168.1.8','192.168.1.9') order by time_start_us desc limit 10; +-------------+------+------------------+----------------------+------------------------------------------------------+ | hostname | port | time_start_us | ping_success_time_us | ping_error | +-------------+------+------------------+----------------------+------------------------------------------------------+ | 
192.168.1.9 | 3306 | 1472827475062217 | 311 | NULL | | 192.168.1.8 | 3306 | 1472827475060617 | 0 | Can't connect to MySQL server on '192.168.1.8' (107) | | 192.168.1.7 | 3306 | 1472827475059073 | 108 | NULL | | 192.168.1.6 | 3306 | 1472827475057281 | 102 | NULL | | 192.168.1.5 | 3306 | 1472827475054188 | 74 | NULL | <-- donor is seen as up | 192.168.1.9 | 3306 | 1472827445061877 | 491 | NULL | | 192.168.1.8 | 3306 | 1472827445060254 | 0 | Can't connect to MySQL server on '192.168.1.8' (107) | | 192.168.1.7 | 3306 | 1472827445058688 | 53 | NULL | | 192.168.1.6 | 3306 | 1472827445057124 | 131 | NULL | | 192.168.1.5 | 3306 | 1472827445054015 | 98 | NULL | <-- donor is seen as up +-------------+------+------------------+----------------------+------------------------------------------------------+

As you can see, all seems OK. Let’s turn on galera_check and see what happens when we run some read and write loads.

And now let me do the stop-delete-restart-SST process again:

kill -9 <mysqld_safe_pid> <mysqld_pid>; rm -fr data/*;rm -fr logs/*;sleep 2;./start

As soon as the node goes down, ProxySQL shuns the node.

+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | 500 | 192.168.1.5 | 3306 | ONLINE | 7 | 3 | 34 | 0 | 21570 | 1733833 | 0 | 146 | | 500 | 192.168.1.6 | 3306 | ONLINE | 1 | 8 | 12 | 0 | 9294 | 747063 | 0 | 129 | | 500 | 192.168.1.7 | 3306 | ONLINE | 1 | 0 | 4 | 0 | 3396 | 272950 | 0 | 89 | | 500 | 192.168.1.8 | 3306 | SHUNNED | 0 | 0 | 1 | 6 | 12 | 966 | 0 | 326 | <-- crashed | 500 | 192.168.1.9 | 3306 | ONLINE | 1 | 0 | 2 | 0 | 246 | 19767 | 0 | 286 | | 501 | 192.168.1.5 | 3306 | ONLINE | 0 | 1 | 2 | 0 | 772203 | 37617973 | 2315131214 | 146 | | 501 | 192.168.1.6 | 3306 | ONLINE | 9 | 3 | 12 | 0 | 3439458 | 167514166 | 10138636314 | 129 | | 501 | 192.168.1.7 | 3306 | ONLINE | 1 | 12 | 13 | 0 | 3183822 | 155064971 | 9394612877 | 89 | | 501 | 192.168.1.8 | 3306 | SHUNNED | 0 | 0 | 1 | 6 | 11429 | 560352 | 35350726 | 326 | <-- crashed | 501 | 192.168.1.9 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 312253 | 15227786 | 941110520 | 286 | +-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+

Immediately after, galera_check identifies that the node is requesting an SST and that the DONOR is our writer. Given that it is NOT the only writer in the HG and that it has wsrep_sst_donor_rejects_queries active, it cannot simply be set to OFFLINE_SOFT, and we do not want ProxySQL to consider it OFFLINE_HARD (because it is not).

As such, the script moves it to a special HG:

2016/09/04 16:11:22.091:[WARN] Move node:192.168.1.5;3306;500;3001 SQL: UPDATE mysql_servers SET hostgroup_id=9500 WHERE hostgroup_id=500 AND hostname='192.168.1.5' AND port='3306'
2016/09/04 16:11:22.097:[WARN] Move node:192.168.1.5;3306;501;3001 SQL: UPDATE mysql_servers SET hostgroup_id=9501 WHERE hostgroup_id=501 AND hostname='192.168.1.5' AND port='3306'

+--------------+-------------+------+--------+------------+-------------+-----------------+---------------------+---------+----------------+--------------------------------------------------+ | hostgroup_id | hostname | port | status | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment | +--------------+-------------+------+--------+------------+-------------+-----------------+---------------------+---------+----------------+--------------------------------------------------+ | 500 | 192.168.1.6 | 3306 | ONLINE | 1000000 | 0 | 1000 | 0 | 0 | 0 | | | 500 | 192.168.1.7 | 3306 | ONLINE | 100 | 0 | 1000 | 0 | 0 | 0 | | | 500 | 192.168.1.8 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | 500_W_501_R_retry_up=0;500_W_501_R_retry_down=0; | | 500 | 192.168.1.9 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | 500_W_501_R_retry_up=0;500_W_501_R_retry_down=0; | | 501 | 192.168.1.6 | 3306 | ONLINE | 1000000000 | 0 | 1000 | 0 | 0 | 0 | | | 501 | 192.168.1.7 | 3306 | ONLINE | 1000000000 | 0 | 1000 | 0 | 0 | 0 | | | 501 | 192.168.1.9 | 3306 | ONLINE | 1 | 0 | 1000 | 0 | 0 | 0 | 500_W_501_R_retry_up=0;500_W_501_R_retry_down=0; | | 9500 | 192.168.1.5 | 3306 | ONLINE | 1000000000 | 0 | 1000 | 0 | 0 | 0 | 500_W_501_R_retry_up=0;500_W_501_R_retry_down=0; | <-- Special HG | 9501 | 192.168.1.5 | 3306 | ONLINE | 100 | 0 | 1000 | 0 | 0 | 0 | 500_W_501_R_retry_up=0;500_W_501_R_retry_down=0; | <-- Special HG +--------------+-------------+------+--------+------------+-------------+-----------------+---------------------+---------+----------------+--------------------------------------------------+

The Donor continues to serve the Joiner, but applications won’t see it.

What applications see is also very important. Applications doing WRITEs will see:

[ 10s] threads: 10, tps: 9.50, reads: 94.50, writes: 42.00, response time: 1175.77ms (95%), errors: 0.00, reconnects: 0.00
...
[ 40s] threads: 10, tps: 2.80, reads: 26.10, writes: 11.60, response time: 3491.45ms (95%), errors: 0.00, reconnects: 0.10
[ 50s] threads: 10, tps: 4.80, reads: 50.40, writes: 22.40, response time: 10062.13ms (95%), errors: 0.80, reconnects: 351.60 <--- Main writer moved to another HG
[ 60s] threads: 10, tps: 5.90, reads: 53.10, writes: 23.60, response time: 2869.82ms (95%), errors: 0.00, reconnects: 0.00
...

When the writer shifts from one node to another, the application has to manage the RE-TRY, but the window is short and the impact on the production flow is limited.

Application readers see no errors:

[ 10s] threads: 10, tps: 0.00, reads: 13007.31, writes: 0.00, response time: 9.13ms (95%), errors: 0.00, reconnects: 0.00
[ 50s] threads: 10, tps: 0.00, reads: 9613.70, writes: 0.00, response time: 10.66ms (95%), errors: 0.00, reconnects: 0.20 <-- just a glitch in reconnect
[ 60s] threads: 10, tps: 0.00, reads: 10807.90, writes: 0.00, response time: 11.07ms (95%), errors: 0.00, reconnects: 0.20
[ 70s] threads: 10, tps: 0.00, reads: 9082.61, writes: 0.00, response time: 23.62ms (95%), errors: 0.00, reconnects: 0.00
...
[ 390s] threads: 10, tps: 0.00, reads: 13050.80, writes: 0.00, response time: 8.97ms (95%), errors: 0.00, reconnects: 0.00

When the Donor stops providing the SST, the script detects it and galera_check puts the node back in the right HG:

2016/09/04 16:12:34.266:[WARN] Move node:192.168.1.5;3306;9500;1010 SQL: UPDATE mysql_servers SET hostgroup_id=500 WHERE hostgroup_id=9500 AND hostname='192.168.1.5' AND port='3306'
2016/09/04 16:12:34.270:[WARN] Move node:192.168.1.5;3306;9501;1010 SQL: UPDATE mysql_servers SET hostgroup_id=501 WHERE hostgroup_id=9501 AND hostname='192.168.1.5' AND port='3306'

The crashed node is rebuilt by the SST process and comes back up. But if the level of load in the cluster is mid/high, it will remain in the JOINED state for some time, becoming visible to ProxySQL again. ProxySQL on its own, however, will not correctly recognize that state.

2016-09-04 16:17:15 21035 [Note] WSREP: 3.2 (node4): State transfer from 1.1 (node1) complete.
2016-09-04 16:17:15 21035 [Note] WSREP: Shifting JOINER -> JOINED (TO: 254515)

To avoid this issue, the script moves it to the special HG, allowing it to recover without receiving any real load.

+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | 500 | 192.168.1.5 | 3306 | ONLINE | 6 | 2 | 15 | 0 | 3000 | 241060 | 0 | 141 | | 500 | 192.168.1.6 | 3306 | ONLINE | 1 | 9 | 13 | 0 | 13128 | 1055268 | 0 | 84 | | 500 | 192.168.1.7 | 3306 | ONLINE | 1 | 0 | 4 | 0 | 3756 | 301874 | 0 | 106 | | 500 | 192.168.1.9 | 3306 | ONLINE | 1 | 0 | 2 | 0 | 4080 | 327872 | 0 | 278 | | 501 | 192.168.1.5 | 3306 | ONLINE | 1 | 0 | 2 | 0 | 256753 | 12508935 | 772048259 | 141 | | 501 | 192.168.1.6 | 3306 | ONLINE | 4 | 8 | 12 | 0 | 5116844 | 249191524 | 15100617833 | 84 | | 501 | 192.168.1.7 | 3306 | ONLINE | 2 | 11 | 13 | 0 | 4739756 | 230863200 | 13997231724 | 106 | | 501 | 192.168.1.9 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 496524 | 24214563 | 1496482104 | 278 | | 9500 | 192.168.1.8 | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 331 |<-- Joined not Sync | 9501 | 192.168.1.8 | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 331 |<-- Joined not Sync +-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+

Once the node fully recovers, galera_check puts it back in the original HG, ready to serve requests:

+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | 500 | 192.168.1.5 | 3306 | ONLINE | 0 | 1 | 15 | 0 | 3444 | 276758 | 0 | 130 | | 500 | 192.168.1.6 | 3306 | ONLINE | 0 | 9 | 13 | 0 | 13200 | 1061056 | 0 | 158 | | 500 | 192.168.1.7 | 3306 | ONLINE | 0 | 0 | 4 | 0 | 3828 | 307662 | 0 | 139 | | 500 | 192.168.1.8 | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |<-- up again | 500 | 192.168.1.9 | 3306 | ONLINE | 0 | 0 | 2 | 0 | 4086 | 328355 | 0 | 336 | | 501 | 192.168.1.5 | 3306 | ONLINE | 0 | 1 | 2 | 0 | 286349 | 13951366 | 861638962 | 130 | | 501 | 192.168.1.6 | 3306 | ONLINE | 0 | 12 | 12 | 0 | 5239212 | 255148806 | 15460951262 | 158 | | 501 | 192.168.1.7 | 3306 | ONLINE | 0 | 13 | 13 | 0 | 4849970 | 236234446 | 14323937975 | 139 | | 501 | 192.168.1.8 | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |<-- up again | 501 | 192.168.1.9 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 507910 | 24768898 | 1530841172 | 336 | +-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+

A summary of the logical steps is:

+---------+ | Crash | +----+----+ | v +--------+-------+ | ProxySQL | | shun crashed | | node | +--------+-------+ | | v +-----------------+-----------------+ | Donor has one of the following? | | wsrep_sst_dono _rejects_queries | | OR | | wsrep_reject_queries | +-----------------------------------+ |No |Yes v v +-----+----------+ +-----------+----+ | Galera_check | | Galera_check | | put the donor | | put the donor | | in OFFLINE_SOFT| | in special HG | +---+------------+ +-----------+----+ | | | | v v +---+--------------------------------+-----+ | Donor SST ends | +---+---------------+----------------+-----+ | | | | | | +---+------------+ | +-----------+----+ | Galera_check | | | Galera_check | | put the donor | | | put the donor | | ONLINE | | | in Original HG | +----------------+ | +----------------+ | | +------------------------------------------+ | crashed SST ends | +-------------------+----------------------+ | | +------------+-------------+ | Crashed node back but +<------------+ | Not Sync? | | +--------------------------+ | |No |Yes | | | | | | | +---------+------+ +------+---------+ | | Galera_check | | Galera_check | | | put the node | | put the node +-----+ | back orig. HG | | Special HG | +--------+-------+ +----------------+ | | | | +---------+ +------> END | +---------+

As mentioned, galera_check can manage several node states.

Another case is when we can’t have the node accept ANY queries. We might need that for several reasons, including preparing the node for maintenance (or whatever).

In Percona XtraDB Cluster (and other Galera implementations) we can set the value of wsrep_reject_queries to:

  • NONE
  • ALL
  • ALL_KILL

Let's see how it works. Run some load, then on the main writer node (192.168.1.5):

set global wsrep_reject_queries=ALL;

This blocks any new query execution, while the queries already running are allowed to complete. Do a simple select on the node:

(root@localhost:pm) [test]>select * from tbtest1;
ERROR 1047 (08S01): WSREP has not yet prepared node for application use

ProxySQL's checks won't see this condition:

+-------------+------+------------------+----------------------+------------+ | hostname | port | time_start_us | ping_success_time_us | ping_error | +-------------+------+------------------+----------------------+------------+ | 192.168.1.5 | 3306 | 1473005467628001 | 35 | NULL | <--- ping ok | 192.168.1.5 | 3306 | 1473005437628014 | 154 | NULL | +-------------+------+------------------+----------------------+------------+ +-------------+------+------------------+-------------------------+---------------+ | hostname | port | time_start_us | connect_success_time_us | connect_error | +-------------+------+------------------+-------------------------+---------------+ | 192.168.1.5 | 3306 | 1473005467369575 | 246 | NULL | <--- connect ok | 192.168.1.5 | 3306 | 1473005407369441 | 353 | NULL | +-------------+------+------------------+-------------------------+---------------+

The script galera_check will instead manage it:

+-----------+-------------+----------+--------------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+--------------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | 500 | 192.168.1.5 | 3306 | OFFLINE_SOFT | 0 | 0 | 8343 | 0 | 10821 | 240870 | 0 | 93 | <--- galera check put it OFFLINE | 500 | 192.168.1.6 | 3306 | ONLINE | 10 | 0 | 15 | 0 | 48012 | 3859402 | 0 | 38 | <--- writer | 500 | 192.168.1.7 | 3306 | ONLINE | 0 | 1 | 6 | 0 | 14712 | 1182364 | 0 | 54 | | 500 | 192.168.1.8 | 3306 | ONLINE | 0 | 1 | 2 | 0 | 1092 | 87758 | 0 | 602 | | 500 | 192.168.1.9 | 3306 | ONLINE | 0 | 1 | 4 | 0 | 5352 | 430152 | 0 | 238 | | 501 | 192.168.1.5 | 3306 | OFFLINE_SOFT | 0 | 0 | 1410 | 0 | 197909 | 9638665 | 597013919 | 93 | | 501 | 192.168.1.6 | 3306 | ONLINE | 2 | 10 | 12 | 0 | 7822682 | 380980455 | 23208091727 | 38 | | 501 | 192.168.1.7 | 3306 | ONLINE | 0 | 13 | 13 | 0 | 7267507 | 353962618 | 21577881545 | 54 | | 501 | 192.168.1.8 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 241641 | 11779770 | 738145270 | 602 | | 501 | 192.168.1.9 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 756415 | 36880233 | 2290165636 | 238 | +-----------+-------------+----------+--------------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+

In this case, the script puts the node in OFFLINE_SOFT, given that set global wsrep_reject_queries=ALL means "do not accept NEW queries and let the existing ones complete", which is exactly the OFFLINE_SOFT behavior.

The script also manages the case of set global wsrep_reject_queries=ALL_KILL;. From the ProxySQL point of view, this condition does not exist either:

+-------------+------+------------------+----------------------+------------+ | hostname | port | time_start_us | ping_success_time_us | ping_error | +-------------+------+------------------+----------------------+------------+ | 192.168.1.5 | 3306 | 1473005827629069 | 59 | NULL |<--- ping ok | 192.168.1.5 | 3306 | 1473005797628988 | 57 | NULL | +-------------+------+------------------+----------------------+------------+ +-------------+------+------------------+-------------------------+---------------+ | hostname | port | time_start_us | connect_success_time_us | connect_error | +-------------+------+------------------+-------------------------+---------------+ | 192.168.1.5 | 3306 | 1473005827370084 | 370 | NULL | <--- connect ok | 192.168.1.5 | 3306 | 1473005767369915 | 243 | NULL | +-------------+------+------------------+-------------------------+---------------+ +-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | 9500 | 192.168.1.5 | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |<--- galera check put it in special HG | 9501 | 192.168.1.5 | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | +-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+

The difference here is that the script moves the node to the special HG to isolate it, instead of leaving it in the original HG: with ALL_KILL the existing connections are killed as well, so OFFLINE_SOFT would not be appropriate.
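A hedged sketch of the two actions, reusing the UPDATE statements the script logs in this setup (hostgroups 500/501 and the 9000+ offset are the ones configured here):

-- wsrep_reject_queries=ALL: existing connections may finish, so OFFLINE_SOFT is enough
UPDATE mysql_servers SET status='OFFLINE_SOFT'
 WHERE hostgroup_id IN (500,501) AND hostname='192.168.1.5' AND port='3306';

-- wsrep_reject_queries=ALL_KILL: existing connections are killed too, so isolate the node
UPDATE mysql_servers SET hostgroup_id=9500 WHERE hostgroup_id=500 AND hostname='192.168.1.5' AND port='3306';
UPDATE mysql_servers SET hostgroup_id=9501 WHERE hostgroup_id=501 AND hostname='192.168.1.5' AND port='3306';

-- in both cases the change takes effect in ProxySQL after:
LOAD MYSQL SERVERS TO RUNTIME;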

The integration between ProxySQL and Percona XtraDB Cluster (Galera) works perfectly for multi-writer if you have a script like galera_check that correctly manages the different Percona XtraDB Cluster/Galera states.

ProxySQL and PXC using Replication HostGroup

Sometimes we might need to have 100% of the writes going to only one node at a time. As explained above, ProxySQL uses weights to redirect a percentage of the load to a specific node.

In most cases, it will be enough to set the weight in the main writer to a very high value (like 10 billion) and one thousand on the next node to almost achieve a single writer.

But this is not 100% effective: it still allows ProxySQL to send a query to the other node(s) once every X queries. To keep the behavior consistent with the ProxySQL logic, the solution is to use replication hostgroups.

Replication HGs are special HGs that the proxy sees as connected for R/W operations. ProxySQL analyzes the value of the READ_ONLY variable and assigns the nodes that have it enabled to the reader HG.

The node with READ_ONLY=0 resides in both HGs. As such, the first thing we need to do is tell ProxySQL that HGs 500 and 501 are replication HGs.

INSERT INTO mysql_replication_hostgroups VALUES (500,501,'');
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;

select * from mysql_replication_hostgroups;
+------------------+------------------+---------+
| writer_hostgroup | reader_hostgroup | comment |
+------------------+------------------+---------+
| 500              | 501              |         |
+------------------+------------------+---------+

Now whenever I set the value of READ_ONLY on a node, ProxySQL will move the node accordingly. Let's see how. Current state:

+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+ | 500 | 192.168.1.5 | 3306 | ONLINE | 6 | 1 | 7 | 0 | 16386 | 1317177 | 0 | 97 | | 500 | 192.168.1.6 | 3306 | ONLINE | 1 | 9 | 15 | 0 | 73764 | 5929366 | 0 | 181 | | 500 | 192.168.1.7 | 3306 | ONLINE | 1 | 0 | 6 | 0 | 18012 | 1447598 | 0 | 64 | | 500 | 192.168.1.8 | 3306 | ONLINE | 1 | 0 | 2 | 0 | 1440 | 115728 | 0 | 341 | | 501 | 192.168.1.5 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 1210029 | 58927817 | 3706882671 | 97 | | 501 | 192.168.1.6 | 3306 | ONLINE | 1 | 11 | 12 | 0 | 16390790 | 798382865 | 49037691590 | 181 | | 501 | 192.168.1.7 | 3306 | ONLINE | 1 | 12 | 13 | 0 | 15357779 | 748038558 | 45950863867 | 64 | | 501 | 192.168.1.8 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 1247662 | 60752227 | 3808131279 | 341 | | 501 | 192.168.1.9 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 1766309 | 86046839 | 5374169120 | 422 | +-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+

Set global READ_ONLY=1 on the following nodes: 192.168.1.6/7/8/9.
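On each of those nodes, the change is the standard MySQL statement (shown here for one node; it requires the SUPER privilege):

-- run on 192.168.1.6, 192.168.1.7, 192.168.1.8 and 192.168.1.9
SET GLOBAL read_only=1;
SHOW GLOBAL VARIABLES LIKE 'read_only';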

After:

+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+ | 500 | 192.168.1.5 | 3306 | ONLINE | 10 | 0 | 20 | 0 | 25980 | 2088346 | 0 | 93 | | 501 | 192.168.1.5 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 1787979 | 87010074 | 5473781192 | 93 | | 501 | 192.168.1.6 | 3306 | ONLINE | 4 | 8 | 12 | 0 | 18815907 | 916547402 | 56379724890 | 79 | | 501 | 192.168.1.7 | 3306 | ONLINE | 1 | 12 | 13 | 0 | 17580636 | 856336023 | 52670114510 | 131 | | 501 | 192.168.1.8 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 15324 | 746109 | 46760779 | 822 | | 501 | 192.168.1.9 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 16210 | 789999 | 49940867 | 679 | +-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+

IF in this scenario a reader node crashes, the application will not suffer at all given the redundancy.

But if the writer crashes, THEN the issue exists, because there is NO node available to manage the failover. The solution is either to do the node election manually, or to have the script elect the node with the lowest read weight in the same segment as the new writer.

Below is what happens when a node crashes (bird's-eye view):

+---------+ | Crash | +----+----+ | v +--------+-------+ | ProxySQL | | shun crashed | | node | +--------+-------+ | | v +-----------------+-----------------+ +-----------> HostGroup has another active | | | Node in HG writer? | | +--+--------------+---------------+-+ | | | | | | | | | |No | |Yes | | | | | +-----v----------+ | +-----------v----+ | |ProxySQL will | | |ProxySQL will | | |stop serving | | |redirect load >--------+ | |writes | | |there | | | +----------------+ | +----------------+ | | | | | v | | +-------+--------+ | | |ProxySQL checks | | | |READ_ONLY on | | | |Reader HG | | | | | | | +-------+--------+ | | | | | v | | +-------+--------+ | | |Any Node with | | | |READ_ONLY = 0 ? | | | +----------------+ | | |No |Yes | | | | | | +----------v------+ +--v--------------+ | | |ProxySQL will | |ProxySQL will | | | |continue to | |Move node to | | +<---------<+do not serve | |Writer HG | | | |Writes | | | | | +-----------------+ +--------v--------+ | | | | +-------------------------------------------+ | +---------+ | | END <------------------------+ +---------+

The script should act immediately after the "ProxySQL SHUNNED the node" step, simply replacing READ_ONLY=1 with READ_ONLY=0 on the reader node with the lowest READ WEIGHT.

ProxySQL does the rest, copying the node into the WRITER HG while keeping its low weight, so that WHEN/IF the original node comes back, the new node will not compete with it for traffic.
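A minimal sketch of that election logic, assuming HG 501 is the reader hostgroup as in this setup: pick the lowest-weight ONLINE reader from the ProxySQL admin interface, then flip read_only on that node.

-- ProxySQL admin: find the candidate (lowest read weight, still ONLINE)
SELECT hostname, port, weight
  FROM runtime_mysql_servers
 WHERE hostgroup_id=501 AND status='ONLINE'
 ORDER BY weight ASC
 LIMIT 1;

-- then, on the elected node itself:
SET GLOBAL read_only=0;   -- ProxySQL copies it into the writer HG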

Since that special function is included in the check, the feature allows automatic failover. This experimental feature is active only if explicitly enabled through the parameters the scheduler passes to the script: to activate it, add --active_failover to the scheduler arguments. My recommendation is to have two entries in the scheduler, activate the one with --active_failover for testing, and remember to deactivate the other one.
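A sketch of that toggle, assuming the scheduler ids 10 and 20 shown later in this post (only one entry should ever be active):

-- ProxySQL admin: entry 10 carries --active_failover, entry 20 does not
UPDATE scheduler SET active=1 WHERE id=10;   -- enable the auto-failover entry
UPDATE scheduler SET active=0 WHERE id=20;   -- keep the plain entry off
LOAD SCHEDULER TO RUNTIME;
SAVE SCHEDULER TO DISK;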

Let's see the manual procedure first.

The process is:

  1. Generate some load
  2. Kill the writer node
  3. Manually elect a reader as writer
  4. Recover the crashed node

Current load:

+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | 500 | 192.168.1.5 | 3306 | ONLINE | 10 | 0 | 10 | 0 | 30324 | 2437438 | 0 | 153 | | 501 | 192.168.1.5 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 1519612 | 74006447 | 4734427711 | 153 | | 501 | 192.168.1.6 | 3306 | ONLINE | 4 | 8 | 12 | 0 | 7730857 | 376505014 | 24119645457 | 156 | | 501 | 192.168.1.7 | 3306 | ONLINE | 2 | 10 | 12 | 0 | 7038332 | 342888697 | 21985442619 | 178 | | 501 | 192.168.1.8 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 612523 | 29835858 | 1903693835 | 337 | | 501 | 192.168.1.9 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 611021 | 29769497 | 1903180139 | 366 | +-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+

Kill the main node 192.168.1.5:

+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | 501 | 192.168.1.5 | 3306 | SHUNNED | 0 | 0 | 1 | 11 | 1565987 | 76267703 | 4879938857 | 119 | | 501 | 192.168.1.6 | 3306 | ONLINE | 1 | 11 | 12 | 0 | 8023216 | 390742215 | 25033271548 | 112 | | 501 | 192.168.1.7 | 3306 | ONLINE | 1 | 11 | 12 | 0 | 7306838 | 355968373 | 22827016386 | 135 | | 501 | 192.168.1.8 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 638326 | 31096065 | 1984732176 | 410 | | 501 | 192.168.1.9 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 636857 | 31025014 | 1982213114 | 328 | +-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ +-------------+------+------------------+----------------------+------------------------------------------------------+ | hostname | port | time_start_us | ping_success_time_us | ping_error | +-------------+------+------------------+----------------------+------------------------------------------------------+ | 192.168.1.5 | 3306 | 1473070640798571 | 0 | Can't connect to MySQL server on '192.168.1.5' (107) | | 192.168.1.5 | 3306 | 1473070610798464 | 0 | Can't connect to MySQL server on '192.168.1.5' (107) | +-------------+------+------------------+----------------------+------------------------------------------------------+ +-------------+------+------------------+-------------------------+------------------------------------------------------+ | hostname | port | time_start_us | connect_success_time_us | connect_error | +-------------+------+------------------+-------------------------+------------------------------------------------------+ | 192.168.1.5 | 3306 | 1473070640779903 | 0 | Can't connect to MySQL server on '192.168.1.5' (107) | | 192.168.1.5 | 3306 | 1473070580779977 | 0 | Can't connect to MySQL server on '192.168.1.5' (107) | +-------------+------+------------------+-------------------------+------------------------------------------------------+

When the node is killed, ProxySQL shuns it and reports issues with the checks (connect and ping). During this time frame the application will experience issues and, if it is not designed to manage retries (and eventually a queue), it will crash.

Sysbench reports the errors:

Writes

[ 10s] threads: 10, tps: 6.70, reads: 68.50, writes: 30.00, response time: 1950.53ms (95%), errors: 0.00, reconnects: 0.00
...
[1090s] threads: 10, tps: 4.10, reads: 36.90, writes: 16.40, response time: 2226.45ms (95%), errors: 0.00, reconnects: 1.00 <-+ killing the node
[1100s] threads: 10, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 1.00, reconnects: 0.00        |
[1110s] threads: 10, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 1.00, reconnects: 0.00        |
[1120s] threads: 10, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 1.00, reconnects: 0.00        |
[1130s] threads: 10, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 1.00, reconnects: 0.00        |-- Gap waiting for a node to become
[1140s] threads: 10, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 1.00, reconnects: 0.00        |   READ_ONLY=0
[1150s] threads: 10, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 1.00, reconnects: 0.00        |
[1160s] threads: 10, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 1.00, reconnects: 0.00        |
[1170s] threads: 10, tps: 4.70, reads: 51.30, writes: 22.80, response time: 80430.18ms (95%), errors: 0.00, reconnects: 0.00 <-+
[1180s] threads: 10, tps: 8.90, reads: 80.10, writes: 35.60, response time: 2068.39ms (95%), errors: 0.00, reconnects: 0.00
...
[1750s] threads: 10, tps: 5.50, reads: 49.80, writes: 22.80, response time: 2266.80ms (95%), errors: 0.00, reconnects: 0.00 -- No additional errors

I decided to promote node 192.168.1.6, given that the reader weights were equal and there was therefore no difference in this setup.

(root@localhost:pm) [(none)]>set global read_only=0;
Query OK, 0 rows affected (0.00 sec)

Checking ProxySQL:

+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | 500 | 192.168.1.6 | 3306 | ONLINE | 10 | 0 | 10 | 0 | 1848 | 148532 | 0 | 40 | | 501 | 192.168.1.5 | 3306 | SHUNNED | 0 | 0 | 1 | 72 | 1565987 | 76267703 | 4879938857 | 38 | | 501 | 192.168.1.6 | 3306 | ONLINE | 2 | 10 | 12 | 0 | 8843069 | 430654903 | 27597990684 | 40 | | 501 | 192.168.1.7 | 3306 | ONLINE | 1 | 11 | 12 | 0 | 8048826 | 392101994 | 25145582384 | 83 | | 501 | 192.168.1.8 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 725820 | 35371512 | 2259974847 | 227 | | 501 | 192.168.1.9 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 723582 | 35265066 | 2254824754 | 290 | +-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+

As soon as the READ_ONLY value is modified, ProxySQL moves the node to the writer HG and writes can take place again. At this point production activity is recovered.

Reads had just a minor glitch:

Reads

[ 10s] threads: 10, tps: 0.00, reads: 20192.15, writes: 0.00, response time: 6.96ms (95%), errors: 0.00, reconnects: 0.00
...
[ 410s] threads: 10, tps: 0.00, reads: 16489.03, writes: 0.00, response time: 9.41ms (95%), errors: 0.00, reconnects: 2.50
...
[ 710s] threads: 10, tps: 0.00, reads: 18789.40, writes: 0.00, response time: 6.61ms (95%), errors: 0.00, reconnects: 0.00

The glitch happened when node 192.168.1.6 was copied over to HG 500, but it caused no interruptions or errors. At this point, let us put back the crashed node, which on coming back elects Node2 (192.168.1.6) as its Donor.

This was a Percona XtraDB Cluster/Galera choice, and we have to accept and manage it.

Note that other, more basic scripts would put the node in OFFLINE_SOFT, given that the node will become a DONOR.
Galera_check instead recognizes that Node2 (192.168.1.6) is the only active node in the segment for that specific HG (the writer), while it is not the only one present in the READER HG.

As such, it puts the node in OFFLINE_SOFT only for the READER HG, trying to reduce the load on the node, but it keeps it active in the WRITER HG to prevent a service interruption.

Restart the node and ask for a donor:

2016-09-05 12:21:43 8007 [Note] WSREP: Flow-control interval: [67, 67]
2016-09-05 12:21:45 8007 [Note] WSREP: Member 1.1 (node1) requested state transfer from '*any*'. Selected 0.1 (node2)(SYNCED) as donor.
2016-09-05 12:21:46 8007 [Note] WSREP: (ef248c1f, 'tcp://192.168.1.8:4567') turning message relay requesting off
2016-09-05 12:21:52 8007 [Note] WSREP: New cluster view: global state: 234bb6ed-527d-11e6-9971-e794f632b140:324329, view# 7: Primary, number of nodes: 5, my index: 3, protocol version 3

galera_check sets 192.168.1.6 to OFFLINE_SOFT only for the READER HG, and ProxySQL uses the other nodes to serve reads.

+-----------+-------------+----------+--------------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+--------------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+ | 500 | 192.168.1.6 | 3306 | ONLINE | 10 | 0 | 10 | 0 | 7746 | 622557 | 0 | 86 | | 501 | 192.168.1.5 | 3306 | ONLINE | 0 | 0 | 1 | 147 | 1565987 | 76267703 | 4879938857 | 38 | | 501 | 192.168.1.6 | 3306 | OFFLINE_SOFT | 0 | 0 | 12 | 0 | 9668944 | 470878452 | 30181474498 | 86 | <-- Node offline | 501 | 192.168.1.7 | 3306 | ONLINE | 9 | 3 | 12 | 0 | 10932794 | 532558667 | 34170366564 | 62 | | 501 | 192.168.1.8 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 816599 | 39804966 | 2545765089 | 229 | | 501 | 192.168.1.9 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 814893 | 39724481 | 2541760230 | 248 | +-----------+-------------+----------+--------------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+

When the SST donor task is over, galera_check moves 192.168.1.6 back to ONLINE as expected. At the same time, it moves the recovering node to the special HG to avoid having it included in any activity until it is ready.

2016-09-05 12:22:36 27352 [Note] WSREP: 1.1 (node1): State transfer from 0.1 (node2) complete.
2016-09-05 12:22:36 27352 [Note] WSREP: Shifting JOINER -> JOINED (TO: 325062)

+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+ | 500 | 192.168.1.6 | 3306 | ONLINE | 10 | 0 | 10 | 0 | 1554 | 124909 | 0 | 35 | | 501 | 192.168.1.6 | 3306 | ONLINE | 2 | 8 | 22 | 0 | 10341612 | 503637989 | 32286072739 | 35 | | 501 | 192.168.1.7 | 3306 | ONLINE | 3 | 9 | 12 | 0 | 12058701 | 587388598 | 37696717375 | 13 | | 501 | 192.168.1.8 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 890102 | 43389051 | 2776691164 | 355 | | 501 | 192.168.1.9 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 887994 | 43296865 | 2772702537 | 250 | | 9500 | 192.168.1.5 | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 57 | <-- Special HG for recover | 9501 | 192.168.1.5 | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 57 | <-- Special HG for recover +-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+

Once the node is finally in SYNC with the group, it is put back online in the READER HG and in the WRITER HG:


+-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+ | 500 | 192.168.1.5 | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | <-- Back on line | 500 | 192.168.1.6 | 3306 | ONLINE | 10 | 0 | 10 | 0 | 402 | 32317 | 0 | 68 | | 501 | 192.168.1.5 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 6285 | 305823 | 19592814 | 312 | <-- Back on line | 501 | 192.168.1.6 | 3306 | ONLINE | 4 | 6 | 22 | 0 | 10818694 | 526870710 | 33779586475 | 68 | | 501 | 192.168.1.7 | 3306 | ONLINE | 0 | 12 | 12 | 0 | 12492316 | 608504039 | 39056093665 | 26 | | 501 | 192.168.1.8 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 942023 | 45924082 | 2940228050 | 617 | | 501 | 192.168.1.9 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 939975 | 45834039 | 2935816783 | 309 | +-----------+-------------+----------+--------+----------+----------+--------+---------+----------+-----------------+-----------------+------------+ +--------------+-------------+------+--------+------------+ | hostgroup_id | hostname | port | status | weight | +--------------+-------------+------+--------+------------+ | 500 | 192.168.1.5 | 3306 | ONLINE | 100 | | 500 | 192.168.1.6 | 3306 | ONLINE | 1000000000 | | 501 | 192.168.1.5 | 3306 | ONLINE | 100 | | 501 | 192.168.1.6 | 3306 | ONLINE | 1000000000 | | 501 | 192.168.1.7 | 3306 | ONLINE | 1000000000 | | 501 | 192.168.1.8 | 3306 | ONLINE | 1 | | 501 | 192.168.1.9 | 3306 | ONLINE | 1 | +--------------+-------------+------+--------+------------+

But given that it comes back with its READER weight, it will NOT compete with the previously elected WRITER.

The recovered node stays on "hold", waiting for a DBA to act and eventually put it back as the writer, or to set it as READ_ONLY and thus fully remove it from the WRITER HG.
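A hedged sketch of those two options (the weights and hostgroups are the ones used in this setup):

-- Option 1: make the recovered node the preferred writer again (ProxySQL admin)
UPDATE mysql_servers SET weight=1000000000 WHERE hostgroup_id=500 AND hostname='192.168.1.5' AND port=3306;
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;

-- Option 2: keep it as a reader only (run on the node itself);
-- with the replication hostgroups in place, ProxySQL removes it from the writer HG
SET GLOBAL read_only=1;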

Let's see the automatic procedure now.

This time the script performs the failover automatically. The process is:

  1. Generate some load
  2. Kill the writer node
  3. Script will do auto-failover
  4. Recover crashed node

Check our scheduler config:

+----+--------+-------------+-----------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+------+------+------+---------+ | id | active | interval_ms | filename | arg1 | arg2 | arg3 | arg4 | arg5 | comment | +----+--------+-------------+-----------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+------+------+------+---------+ | 10 | 1 | 2000 | /var/lib/proxysql/galera_check.pl | -u=admin -p=admin -h=192.168.1.50 -H=500:W,501:R -P=3310 --execution_time=1 --retry_down=2 --retry_up=1 --main_segment=1 --active_failover --debug=0 --log=/var/lib/proxysql/galeraLog | NULL | NULL | NULL | NULL | | <--- Active | 20 | 0 | 1500 | /var/lib/proxysql/galera_check.pl | -u=admin -p=admin -h=192.168.1.50 -H=500:W,501:R -P=3310 --execution_time=1 --retry_down=2 --retry_up=1 --main_segment=1 --debug=0 --log=/var/lib/proxysql/galeraLog | NULL | NULL | NULL | NULL | | +----+--------+-------------+-----------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------+------+------+------+---------+

The active one is the one with auto-failover. Start load and check current load:

+-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | 500 | 192.168.1.5 | 3306 | ONLINE | 10 | 0 | 10 | 0 | 952 | 76461 | 0 | 0 | | 501 | 192.168.1.5 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 53137 | 2587784 | 165811100 | 167 | | 501 | 192.168.1.6 | 3306 | ONLINE | 5 | 5 | 11 | 0 | 283496 | 13815077 | 891230826 | 109 | | 501 | 192.168.1.7 | 3306 | ONLINE | 3 | 7 | 10 | 0 | 503516 | 24519457 | 1576198138 | 151 | | 501 | 192.168.1.8 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 21952 | 1068972 | 68554796 | 300 | | 501 | 192.168.1.9 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 21314 | 1038593 | 67043935 | 289 | +-----------+-------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+

Kill the main node 192.168.1.5:

+-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | 500 | 192.168.1.6 | 3306 | ONLINE | 10 | 0 | 10 | 0 | 60 | 4826 | 0 | 0 | | 501 | 192.168.1.5 | 3306 | SHUNNED | 0 | 0 | 1 | 11 | 177099 | 8626778 | 552221651 | 30 | | 501 | 192.168.1.6 | 3306 | ONLINE | 3 | 7 | 11 | 0 | 956724 | 46601110 | 3002941482 | 49 | | 501 | 192.168.1.7 | 3306 | ONLINE | 2 | 8 | 10 | 0 | 1115685 | 54342756 | 3497575125 | 42 | | 501 | 192.168.1.8 | 3306 | ONLINE | 0 | 1 | 1 | 0 | 76289 | 3721419 | 240157393 | 308 | | 501 | 192.168.1.9 | 3306 | ONLINE | 1 | 0 | 1 | 0 | 75803 | 3686067 | 236382784 | 231 | +-----------+-------------+----------+---------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+

When the node is killed it is SHUNNED, but this time the script has already set the new node 192.168.1.6 to ONLINE as the writer. See the script log:

2016/09/08 14:04:02.494:[INFO] END EXECUTION Total Time:102.347850799561
2016/09/08 14:04:04.478:[INFO] This Node Try to become a WRITER set READ_ONLY to 0 192.168.1.6:3306:HG501
2016/09/08 14:04:04.479:[INFO] This Node NOW HAS READ_ONLY = 0 192.168.1.6:3306:HG501
2016/09/08 14:04:04.479:[INFO] END EXECUTION Total Time:71.8140602111816

More importantly, let’s look at the application experience:

Writes

[ 10s] threads: 10, tps: 9.40, reads: 93.60, writes: 41.60, response time: 1317.41ms (95%), errors: 0.00, reconnects: 0.00
[ 20s] threads: 10, tps: 8.30, reads: 74.70, writes: 33.20, response time: 1350.96ms (95%), errors: 0.00, reconnects: 0.00
[ 30s] threads: 10, tps: 8.30, reads: 74.70, writes: 33.20, response time: 1317.81ms (95%), errors: 0.00, reconnects: 0.00
[ 40s] threads: 10, tps: 7.80, reads: 70.20, writes: 31.20, response time: 1407.51ms (95%), errors: 0.00, reconnects: 0.00
[ 50s] threads: 10, tps: 6.70, reads: 60.30, writes: 26.80, response time: 2259.35ms (95%), errors: 0.00, reconnects: 0.00
[ 60s] threads: 10, tps: 6.60, reads: 59.40, writes: 26.40, response time: 3275.78ms (95%), errors: 0.00, reconnects: 0.00
[ 70s] threads: 10, tps: 5.70, reads: 60.30, writes: 26.80, response time: 1492.56ms (95%), errors: 0.00, reconnects: 1.00 <-- just a reconnect experience
[ 80s] threads: 10, tps: 6.70, reads: 60.30, writes: 26.80, response time: 7959.74ms (95%), errors: 0.00, reconnects: 0.00
[ 90s] threads: 10, tps: 6.60, reads: 59.40, writes: 26.40, response time: 2109.03ms (95%), errors: 0.00, reconnects: 0.00
[ 100s] threads: 10, tps: 6.40, reads: 57.60, writes: 25.60, response time: 1883.96ms (95%), errors: 0.00, reconnects: 0.00
[ 110s] threads: 10, tps: 5.60, reads: 50.40, writes: 22.40, response time: 2167.27ms (95%), errors: 0.00, reconnects: 0.00

With no errors and no huge delay, our application (which manages reconnections) experienced only a glitch and had to reconnect once.

Reads had no errors or reconnects.

The connection errors were managed by ProxySQL: given that it found five of them in one second, it SHUNNED the node. The galera_check script was then able to promote a reader and, because this is a failover, there was no delay from a retry loop. The whole thing was done in such a brief time that the application barely saw it.
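That shunning behavior is governed by ProxySQL's own variables; a quick way to check (and, if needed, tune) them from the admin interface, assuming the default names mysql-shun_on_failures and mysql-shun_recovery_time_sec:

-- ProxySQL admin interface
SELECT variable_name, variable_value
  FROM global_variables
 WHERE variable_name IN ('mysql-shun_on_failures','mysql-shun_recovery_time_sec');

-- example: shun after 5 connection errors per second, retry the backend after 10 seconds
UPDATE global_variables SET variable_value='5'  WHERE variable_name='mysql-shun_on_failures';
UPDATE global_variables SET variable_value='10' WHERE variable_name='mysql-shun_recovery_time_sec';
LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;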

Obviously, an application with thousands of connections per second will experience a larger impact, but the time window will be very narrow. Once the failed node is ready to come back, we can either start it with READ_ONLY=1, so that it comes back as a reader, or keep it as it is, so that it comes back as the writer.

No matter what, the script manages the case as it had done in the previous (manual) exercise.

Conclusions

ProxySQL and galera_check, when working together, are quite efficient in managing the cluster and its different scenarios. When using single-writer mode, automating the manual part of the failover dramatically improves recovery of the production state, going from a few minutes to seconds or less.

The multi-writer mode remains the preferred and most recommended way to use ProxySQL/Percona XtraDB Cluster, given that it performs failover without the need for additional scripts or extensions. It is also the preferred method when a script is required to manage the integration with ProxySQL.

In both cases, the use of a script that can identify the multiple states of Percona XtraDB Cluster and the changing node scenarios is a crucial part of the implementation, without which ProxySQL might not behave correctly.

Percona XtraDB Cluster 5.6.30-25.16.2 is now available (CVE-2016-6662 fix)

Latest MySQL Performance Blog posts - September 15, 2016 - 6:53am

Percona announces the new release of Percona XtraDB Cluster 5.6 on September 15, 2016. Binaries are available from the downloads area or from our software repositories.

Percona XtraDB Cluster 5.6.30-25.16.2 is now the current release, based on the following:

  • Percona Server 5.6.30-76.3
  • Galera Replication library 3.16
  • Codership wsrep API version 25

This release provides a fix for CVE-2016-6662. More information about this security issue can be found here.

Bug Fixed:

  • Due to security reasons ld_preload libraries can now only be loaded from the system directories (/usr/lib64, /usr/lib) and the MySQL installation base directory.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Unable to monitor mysql queries with 1.0.4

Lastest Forum Posts - September 15, 2016 - 5:41am
Hello, I have installed 1.0.4 and then tried to add a client.
I get this error

Code:
pmm-admin add mysql --user=root --password=mysql-password
[linux:metrics] OK, now monitoring this system.
[mysql:metrics] OK, now monitoring MySQL metrics using DSN root:***@unix(/var/lib/mysql/mysql.sock)
[mysql:queries] Error adding MySQL queries: cannot find os instance on QAN API.

I had an old installation of PMM 1.0.3 that I removed.
I have installed the client using the yum repository; the client is running on percona server Percona-Server-server-56-5.6.28-rel76.1.el6.x86_64 on centos 6

Docker timesync

Lastest Forum Posts - September 15, 2016 - 5:13am
This is my first experience with Docker, and I am trying to diagnose why the Grafana monitor essentially stopped working. For the first 15 minutes it did work and was showing stats, but after that everything was empty. I figured out how to get into the Docker console and checked prometheus.log, which has filled up with:

time="2016-09-14T15:29:27Z" level=warning msg="Scrape health sample discarded" error="sample timestamp out of order" sample=up{instance="db2", job="mysql"} => 0 @[1473866966.329] source="scrape.go:500"

So I checked the time between both the host and the docker image:
Host: Thu Sep 15 07:09:17 CDT 2016
Docker: Thu Sep 15 12:09:00 UTC 2016

I am guessing the next step is to get the time synced, but I have not figured out how to do that. The host is Oracle Linux 7.2.

MySQL Default Configuration Changes between 5.6 and 5.7

Latest MySQL Performance Blog posts - September 14, 2016 - 3:26pm

In this blog post, we’ll discuss the MySQL default configuration changes between 5.6 and 5.7.

MySQL 5.7 has added a variety of new features that might excite you. However, there are also changes in the current variables that you might have overlooked. MySQL 5.7 updated nearly 40 of the defaults from 5.6. Some of the changes could severely impact your server performance, while others might go unnoticed. I’m going to go over each of the changes and what they mean.

The change that can have the largest impact on your server is likely sync_binlog. My colleague, Roel Van de Paar, wrote about this impact in depth in another blog post, so I won't go into much detail. sync_binlog controls how MySQL flushes the binlog to disk. The new value of 1 forces MySQL to write every transaction to disk prior to committing. Previously, MySQL did not force flushing the binlog, and trusted the OS to decide when to flush it.

(https://www.percona.com/blog/2016/06/03/binary-logs-make-mysql-5-7-slower-than-5-6/)

Variables     5.6.29   5.7.11
sync_binlog   0        1
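sync_binlog is dynamic, so if the stricter default turns out to be too expensive for your workload you can check and adjust it at runtime (a sketch; weigh the durability trade-off first):

SHOW GLOBAL VARIABLES LIKE 'sync_binlog';
-- revert to the 5.6 behavior, trading durability for throughput
SET GLOBAL sync_binlog = 0;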


The performance schema variables stand out as unusual, as many have a default of -1. MySQL uses this notation to call out variables that are automatically adjusted. The only performance schema variable change that doesn’t adjust itself is  performance_schema_max_file_classes. This is the number of file instruments used for the performance schema. It’s unlikely you will ever need to alter it.

Variables                                   5.6.29   5.7.11
performance_schema_accounts_size            100      -1
performance_schema_hosts_size               100      -1
performance_schema_max_cond_instances       3504     -1
performance_schema_max_file_classes         50       80
performance_schema_max_file_instances       7693     -1
performance_schema_max_mutex_instances      15906    -1
performance_schema_max_rwlock_instances     9102     -1
performance_schema_max_socket_instances     322      -1
performance_schema_max_statement_classes    168      -1
performance_schema_max_table_handles        4000     -1
performance_schema_max_table_instances      12500    -1
performance_schema_max_thread_instances     402      -1
performance_schema_setup_actors_size        100      -1
performance_schema_setup_objects_size       100      -1
performance_schema_users_size               100      -1


The optimizer_switch and sql_mode variables each have a variety of flags that can be enabled to cause slightly different behavior. MySQL 5.7 enables additional flags in both variables, increasing their strictness and security. These additions make the optimizer more efficient in determining how to correctly interpret your queries.

Three flags have been added to the optimizer_switch, all of which existed in MySQL 5.6 and are set as the default in MySQL 5.7 (with the intent of increasing the optimizer's efficiency): duplicateweedout=on, condition_fanout_filter=on, and derived_merge=on. duplicateweedout is part of the optimizer's semi-join materialization strategy, condition_fanout_filter controls the use of condition filtering, and derived_merge controls the merging of derived tables and views into the outer query block.

https://dev.mysql.com/doc/refman/5.7/en/switchable-optimizations.html

http://www.chriscalender.com/tag/condition_fanout_filter/

The additions to sql_mode do not affect performance directly; however, they will improve the way you write queries (which can increase performance). One notable change is that every field in a SELECT … GROUP BY statement must either be aggregated using a function like SUM, or appear in the GROUP BY clause. MySQL will not assume how fields should be grouped, and will raise an error if a field is missing. STRICT_TRANS_TABLES has a different effect depending on whether it is used with a transactional table.

Statements are rolled back on transactional tables if there is an invalid or missing value in a data change statement. For tables that do not use a transactional engine, MySQL's behavior depends on the row in which the invalid data occurs. If it is the first row, then the behavior matches that of a transactional engine. If not, then the invalid value is converted to the closest valid value, or to the default value for the column. A warning is generated, but the data is still inserted.
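As an illustration of the ONLY_FULL_GROUP_BY behavior described above (the table and column names are made up for the example):

-- hypothetical table
CREATE TABLE orders (id INT PRIMARY KEY, customer_id INT, status VARCHAR(10), amount DECIMAL(10,2));

-- rejected under ONLY_FULL_GROUP_BY: status is neither aggregated nor in the GROUP BY
SELECT customer_id, status, SUM(amount) FROM orders GROUP BY customer_id;

-- accepted: every non-aggregated column is grouped, or explicitly wrapped in ANY_VALUE()
SELECT customer_id, status, SUM(amount) FROM orders GROUP BY customer_id, status;
SELECT customer_id, ANY_VALUE(status), SUM(amount) FROM orders GROUP BY customer_id;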

optimizer_switch

  5.6.29: index_merge=on, index_merge_union=on, index_merge_sort_union=on, index_merge_intersection=on, engine_condition_pushdown=on, index_condition_pushdown=on, mrr=on, mrr_cost_based=on, block_nested_loop=on, batched_key_access=off, materialization=on, semijoin=on, loosescan=on, firstmatch=on, subquery_materialization_cost_based=on, use_index_extensions=on

  5.7.11: index_merge=on, index_merge_union=on, index_merge_sort_union=on, index_merge_intersection=on, engine_condition_pushdown=on, index_condition_pushdown=on, mrr=on, mrr_cost_based=on, block_nested_loop=on, batched_key_access=off, materialization=on, semijoin=on, loosescan=on, firstmatch=on, duplicateweedout=on, subquery_materialization_cost_based=on, use_index_extensions=on, condition_fanout_filter=on, derived_merge=on

sql_mode

  5.6.29: NO_ENGINE_SUBSTITUTION

  5.7.11: ONLY_FULL_GROUP_BY, STRICT_TRANS_TABLES, NO_ZERO_IN_DATE, NO_ZERO_DATE, ERROR_FOR_DIVISION_BY_ZERO, NO_AUTO_CREATE_USER, NO_ENGINE_SUBSTITUTION

 

There have been a couple of variable changes surrounding the binlog. MySQL 5.7 changed binlog_error_action so that the server aborts if there is an error while writing to the binlog. These kinds of incidents are rare, but when they do occur they have a big impact on your application and replication, as the server will not perform any further transactions until the problem is corrected.

The default binlog format was changed to ROW from the previously used statement format. Statement-based logging writes less data to the logs, but many statements cannot be replicated correctly with it, including "update … order by rand()". These non-deterministic statements can produce different result sets on the master and the slave. Row format writes more data to the binlog, but the information is more accurate and ensures correct replication.
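As a rough sketch of why the default changed (assuming binary logging is enabled, the account is allowed to set binlog_format, and using the same hypothetical orders table as above):

SET SESSION binlog_format = 'STATEMENT';
-- Non-deterministic: the row picked by RAND() can differ on the slave, so the
-- server typically reports an "unsafe statement" warning in statement mode
UPDATE orders SET amount = amount * 1.1 ORDER BY RAND() LIMIT 1;

SET SESSION binlog_format = 'ROW';
-- In row mode the changed row images themselves are logged, so the slave
-- applies exactly the same change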

MySQL has begun to focus on replication using GTIDs instead of the traditional binlog position. When MySQL is started, or restarted, it must generate a list of the previously used GTIDs. If binlog_gtid_simple_recovery is OFF (FALSE), the server starts with the newest binlog and iterates backwards through the binlog files searching for a previous_gtids_log_event. With it set to ON (TRUE), the server only examines the newest and oldest binlog files and computes the used GTIDs from those. binlog_gtid_simple_recovery makes it much faster to identify the binlogs, especially when there are a large number of binary logs without GTID events. However, in specific cases it can cause gtid_executed and gtid_purged to be populated incorrectly. This should only happen when the newest binary log was generated by MySQL 5.7.5 or older, or if a SET GTID_PURGED statement was run on MySQL earlier than version 5.7.7.
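You can check how your server is configured and what GTID bookkeeping it recovered with a few standard statements:

SHOW GLOBAL VARIABLES LIKE 'binlog_gtid_simple_recovery';
SELECT @@GLOBAL.gtid_executed, @@GLOBAL.gtid_purged;   -- the sets recomputed at startup
SHOW BINARY LOGS;                                      -- the files the server would otherwise have to scan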

Another replication-based variable updated in 5.7 is slave_net_timeout, which was lowered to 60 seconds. Previously, the replication thread would not consider its connection to the master broken until the problem had existed for at least an hour. This change informs you much sooner if there is a connectivity problem, and ensures replication does not fall significantly behind before you learn of the issue.

Variable                     5.6.29         5.7.11
binlog_error_action          IGNORE_ERROR   ABORT_SERVER
binlog_format                STATEMENT      ROW
binlog_gtid_simple_recovery  OFF            ON
slave_net_timeout            3600           60

 

InnoDB buffer pool changes affect how long starting and stopping the server takes. innodb_buffer_pool_dump_at_shutdown and innodb_buffer_pool_load_at_startup are used together to avoid having to "warm up" the server after a restart. As the names suggest, they cause a buffer pool dump at shutdown and a load at startup. Even if your buffer pool is hundreds of gigabytes, you do not need to reserve the same amount of disk space, because the data written is much smaller: only the information needed to locate the actual pages, the tablespace and page IDs, is written to disk. A short example follows the table below.

Variable                              5.6.29   5.7.11
innodb_buffer_pool_dump_at_shutdown   OFF      ON
innodb_buffer_pool_load_at_startup    OFF      ON
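You can also trigger the dump on demand and watch the warm-up progress after a restart; a quick sketch:

SET GLOBAL innodb_buffer_pool_dump_now = ON;            -- dump immediately instead of waiting for shutdown
SHOW STATUS LIKE 'Innodb_buffer_pool_dump_status';      -- e.g. "Buffer pool(s) dump completed at ..."
SHOW STATUS LIKE 'Innodb_buffer_pool_load_status';      -- after a restart, shows the load progress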

 

MySQL has also promoted to defaults several options that were implemented in InnoDB during 5.6 and earlier. InnoDB's checksum algorithm was changed from innodb to crc32, allowing you to benefit from the hardware acceleration available in recent Intel CPUs.

The Barracuda file format has been available since 5.5, but had many improvements in 5.6. It is now the default in 5.7. The Barracuda format allows you to use the compressed and dynamic row formats. My colleague Alexey has written about the utilization of the compressed format and the results he saw when optimizing a server: https://www.percona.com/blog/2008/04/23/real-life-use-case-for-barracuda-innodb-file-format/

innodb_large_prefix now defaults to "on", and combined with the Barracuda file format it allows index key prefixes of up to 3072 bytes, so larger text columns can benefit from an index. If it is set to "off", or the row format is neither dynamic nor compressed, any index prefix larger than 767 bytes is silently truncated. MySQL 5.7.6 also introduced larger InnoDB page sizes (32k and 64k).
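A short sketch of what the new defaults allow (table and column names are made up): a utf8mb4 VARCHAR(500) needs up to 2000 bytes of index key, which fits under the 3072-byte limit but would have been truncated or rejected under the old 767-byte limit.

CREATE TABLE articles (
    id    INT PRIMARY KEY,
    title VARCHAR(500),
    KEY idx_title (title)        -- up to 2000 bytes with utf8mb4; fine with large prefixes enabled
) ENGINE=InnoDB ROW_FORMAT=DYNAMIC DEFAULT CHARSET=utf8mb4;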

MySQL 5.7 increased the innodb_log_buffer_size value as well. InnoDB uses the log buffer to hold transaction log records before writing them to the redo log files on disk. The increased size lets the buffer be flushed to disk less often, reducing I/O, and allows larger transactions to fit in the buffer without having to write to disk before committing.

InnoDB's purge operations run in background threads to reduce the thread contention seen in MySQL 5.5. The latest version raises the default to four purge threads; innodb_purge_threads can be set anywhere from 1 to 32.

MySQL 5.7 now enables innodb_strict_mode by default, turning some warnings into errors. Syntax errors in create table, alter table, create index, and optimize table statements generate errors and force the user to correct them before the statement runs. It also enables a record size check, so that an insert or update statement will not fail because the record is too large for the selected page size. A short example follows the table below.

Variable                   5.6.29    5.7.11
innodb_checksum_algorithm  innodb    crc32
innodb_file_format         Antelope  Barracuda
innodb_file_format_max     Antelope  Barracuda
innodb_large_prefix        OFF       ON
innodb_log_buffer_size     8388608   16777216
innodb_purge_threads       1         4
innodb_strict_mode         OFF       ON
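A hedged example of strict mode in action; with strict mode OFF the conflicting option below would only produce a warning and be ignored, while the 5.7 default rejects the statement outright (table name is made up):

SET SESSION innodb_strict_mode = ON;
-- KEY_BLOCK_SIZE conflicts with ROW_FORMAT=COMPACT, so this CREATE TABLE now
-- fails with an error instead of silently dropping the option
CREATE TABLE t_strict (
    id INT PRIMARY KEY
) ENGINE=InnoDB ROW_FORMAT=COMPACT KEY_BLOCK_SIZE=8;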

 

MySQL has increased the number of times the optimizer dives into the index when evaluating equality ranges. If the number of ranges exceeds eq_range_index_dive_limit, which defaults to 200 in MySQL 5.7, the optimizer falls back to the existing index statistics. You can adjust this limit from 0 (eliminating index dives) up to 4294967295. This can have a significant impact on query performance, because index statistics are based on the cardinality of a random sample: they can cause the optimizer to estimate a much larger set of rows to examine than it would with index dives, changing the execution plan the optimizer chooses.
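For example (again using a hypothetical orders table, this time with an index on customer_id), lowering the limit below the number of values in the IN() list forces the optimizer to use index statistics instead of dives, and the row estimates in EXPLAIN usually change accordingly:

SET SESSION eq_range_index_dive_limit = 200;   -- 5.7 default: dive for up to 200 equality ranges
EXPLAIN SELECT * FROM orders WHERE customer_id IN (1,2,3,4,5,6,7,8,9,10);

SET SESSION eq_range_index_dive_limit = 5;     -- fewer than the 10 ranges above: fall back to index statistics
EXPLAIN SELECT * FROM orders WHERE customer_id IN (1,2,3,4,5,6,7,8,9,10);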

MySQL 5.7 deprecated log_warnings. The new preference is to use log_error_verbosity. By default it is set to 3, which logs errors, warnings, and notes to the error log. You can lower it to 1 (errors only) or 2 (errors and warnings). When consulting the error log, verbosity is often a good thing, but it does increase the I/O and disk space the error log needs (see the sketch after the table below).

Variable                   5.6.29   5.7.11
eq_range_index_dive_limit  10       200
log_warnings               1        2
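If the notes are too chatty for your environment, the setting can be lowered at runtime or in the configuration file; a minimal sketch:

SET GLOBAL log_error_verbosity = 2;   -- errors and warnings, but no notes
-- or, persistently, in my.cnf under [mysqld]:
--   log_error_verbosity = 2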

 

There are many changes to the defaults in 5.7, but most of these options have existed for a long time and should be familiar to users. Many users had already adopted these settings, and making them the defaults is the best way to push MySQL forward. Remember, however, that you can still edit these variables and configure them so that your server works its best for your data.

pmp-check-pt-table-checksum Percona Monitoring Plugin

Latest MySQL Performance Blog posts - September 14, 2016 - 1:52pm

Recently, I worked on a customer case where the customer needed to monitor the checksum via Nagios monitoring. The pmp-check-pt-table-checksum plugin from Percona Monitoring Plugins for MySQL achieves this goal. I thought it was worth a blogpost.

pmp-check-pt-table-checksum alerts you when the pt-table-checksum tool from Percona Toolkit finds data drift on a replication slave. It reads the checksum table written by the last pt-table-checksum run and reports any data differences recorded for the slave. By default, the plugin queries the percona.checksums table; you can override this with the "-T" option. Check the pmp-check-pt-table-checksum documentation for details.

Let’s demonstrate checksum monitoring via Nagios. My setup contains a master with two slave(s) connected, as follows:

  • Host 10.0.3.131 is the master.
  • Host 10.0.3.83 is slave1.
  • Host 10.0.3.36 is slave2.

I intentionally generated more data on the master so pt-table-checksum can catch the differences on the slave(s). Here’s what it looks like:

mysql-master> SELECT * FROM test.t1;
+------+
| id   |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
|    6 |
|    7 |
|    8 |
|    9 |
|   10 |
+------+
10 rows in set (0.00 sec)

mysql-slave1> SELECT * FROM test.t1;
+------+
| id   |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
+------+
5 rows in set (0.00 sec)

mysql-slave2> SELECT * FROM test.t1;
+------+
| id   |
+------+
|    1 |
|    2 |
|    3 |
|    4 |
|    5 |
+------+
5 rows in set (0.00 sec)

As you can see, slave1 and slave2 are different from the master: the master has ten rows while the slave(s) have five rows each (table t1).

Then, I executed pt-table-checksum from the master to check for data discrepancies:

[root@master]# pt-table-checksum --replicate=percona.checksums --ignore-databases mysql h=10.0.3.131,u=checksum_user,p=checksum_password
            TS ERRORS  DIFFS  ROWS  CHUNKS  SKIPPED   TIME  TABLE
08-25T04:57:10      0      1    10       1        0  0.018  test.t1

[root@master]# pt-table-checksum --replicate=percona.checksums --replicate-check-only --ignore-databases mysql h=10.0.3.131,u=checksum_user,p=checksum_password
Differences on slave1
TABLE   CHUNK  CNT_DIFF  CRC_DIFF  CHUNK_INDEX  LOWER_BOUNDARY  UPPER_BOUNDARY
test.t1     1        -5         1
Differences on slave2
TABLE   CHUNK  CNT_DIFF  CRC_DIFF  CHUNK_INDEX  LOWER_BOUNDARY  UPPER_BOUNDARY
test.t1     1        -5         1

pt-table-checksum correctly identifies the differences for the test.t1 table on slave1 and slave2. Now, you can use the pmp-check-pt-table-checksum  Percona checksum monitoring plugin. Let’s try to run it locally (via CLI) from the Nagios host.

[root@nagios]# pmp-check-pt-table-checksum -H slave1 -l checksum_user -p checksum_password -P 3306
WARN pt-table-checksum found 1 chunks differ in 1 tables, including test.t1
[root@nagios]# pmp-check-pt-table-checksum -H slave2 -l checksum_user -p checksum_password -P 3306
WARN pt-table-checksum found 1 chunks differ in 1 tables, including test.t1

NOTE: The checksum_user database user needs SELECT privileges on the checksum table (percona.checksums) on the slave(s) so the plugin can report checksum differences.
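A minimal sketch of grants that satisfy this note; the host pattern and password are placeholders, and you may want to restrict the account further:

-- Run on the master so the account replicates to the slaves
CREATE USER 'checksum_user'@'%' IDENTIFIED BY 'checksum_password';
GRANT SELECT ON percona.checksums TO 'checksum_user'@'%';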

On the Nagios monitoring server, you need to add the pmp-check-pt-table-checksum command to the commands.cfg file:

define command{
    command_name    pmp-check-pt-table-checksum
    command_line    $USER1$/pmp-check-pt-table-checksum -H $HOSTADDRESS$ -c $ARG1$
}

NOTE: I used the "-c" option for pmp-check-pt-table-checksum, which raises a critical alert instead of a warning.

Then, in the existing host configuration files (i.e., slave1.cfg and slave2.cfg), add a matching service definition, as below:

define service{
    use                  generic-service
    host_name            slave1
    service_description  Checksum Status
    check_command        pmp-check-pt-table-checksum!1
}

Here "1" is the argument passed to "-c $ARG1$", so pmp-check-pt-table-checksum raises a critical alert when one or more chunks on the slave(s) differ from the master.

Last but not least, restart the Nagios daemon on the monitoring host to apply the change.

(Screenshot: the Checksum Status service as it appears in the Nagios web interface.)


I also think the “INTERVAL” option is useful:

-i INTERVAL     Interval over which to ensure pt-table-checksum was run, in days; default - not to check.

It makes sure that the chunks in the checksum table are recent; put another way, it checks how old your chunks are. This option verifies that your checksum cron job is actually running every defined number of days. Say you have a pt-table-checksum cron job running once per week: setting INTERVAL to 14 or 21 alerts you if the chunks are older than that number of days.
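A rough manual equivalent of what the INTERVAL check looks at, useful when debugging an alert, is to query the timestamps stored in the checksum table on the slave:

SELECT db, tbl, MAX(ts) AS last_checksummed
FROM percona.checksums
GROUP BY db, tbl
ORDER BY last_checksummed;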

Conclusion:

The Percona Monitoring Plugins for MySQL are very useful and easy to embed in your centralized monitoring dashboard. You can schedule pt-table-checksum via a cron job and get reports about master/slave data drift (if any) from one global dashboard on the monitoring host. Percona offers various other plugins as well, e.g. the processlist plugin and the replication delay plugin, along with Cacti and Zabbix templates to graph various MySQL activities.

PMM 1.0.4 available - Not beta any more!

Lastest Forum Posts - September 14, 2016 - 11:02am
We've launched Percona Monitoring and Management 1.0.4 - full blog post about it here.
https://www.percona.com/blog/2016/09...now-available/

PMM and pt-heartbeat

Lastest Forum Posts - September 14, 2016 - 9:11am
Hi,
are there any plans to integrate the PMM replication graphs (under /graph/dashboard/db/mysql-replication ) with the pt-heartbeat tool? I don't much trust the "seconds behind master" information and prefer replication monitoring based on pt-heartbeat, like the cacti monitoring templates are doing, so I'm wondering if that's already configurable or on the roadmap?

Thanks
Alex

Webinar Thursday Sept. 15: Identifying and Solving Database Performance Issues with PMM

Latest MySQL Performance Blog posts - September 14, 2016 - 6:51am

Please join Roman Vynar, Lead Platform Engineer, on Thursday, September 15, 2016 at 10 am PDT (UTC-7) for a webinar on Identifying and Solving Database Performance Issues with PMM.

Database performance is the key to high-performance applications. Gaining visibility into the database is the key to improving database performance. Percona’s Monitoring and Management (PMM) provides the insight you need into your database environment.

In this webinar, we will demonstrate how using PMM for query analytics, in combination with database and host performance metrics, can more efficiently drive tuning, issue management and application development. Using PMM can result in faster resolution times, more focused development and a more efficient IT team.

Register for the webinar here.

Roman Vynar, Lead Platform Engineer
Roman is a Lead Platform Engineer at Percona. He joined the company to establish and develop the Remote DBA service from scratch; over time, the growing service successfully expanded into Managed Services. Roman develops monitoring tools, automated scripts, a backup solution, and a notification and incident tracking web system, and currently leads the Percona Monitoring and Management project.

Updates cause Lock wait timeout exceeded

Lastest Forum Posts - September 14, 2016 - 6:11am
Hello
We set up an XtraDB cluster with 3 nodes in failover mode. We are using JDBC for the failover and experience a lock issue that shows the following error message:
Caused by: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction

It looks like the JDBC driver fails over to the second node even though the first node is still available. We investigated flow control, but could not find an issue there. Any recommendation on how we can further debug our issue?

We use the following versions:
MySQL 5.1.73
Percona XtraDB 5.6.21
Java MySQL Connector 5.1.18

JDBC Configuration: jdbc:mysql://host1,host2,host3/somedb?useConfigs=maxPerformance&characterEncoding=utf8&connectTimeout=15&autoReconnect=true&autoReconnectForPools=true&failOverReadOnly=false&explainSlowQueries=true

