
Latest MySQL Performance Blog posts

Percona's MySQL & InnoDB performance and scalability blog

Tips on benchmarking Go + MySQL

May 14, 2014 - 9:00am

We have just released our new open source percona-agent (https://github.com/percona/percona-agent), the agent that works with Percona Cloud Tools. This agent is written in Go.

I will give a webinar titled “Monitoring All MySQL Metrics with Percona Cloud Tools” on June 25 that will cover the new features in percona-agent and Percona Cloud Tools, where I will also explain how it works. You are welcome to register now and join me.

There will be more posts about percona-agent, but in the meantime I want to dedicate this one to Go, Go with MySQL and some performance topics.

I have had an interest in the Go programming language for a long time, but in its initial versions I did not quite like the performance of the goroutine scheduler. See my report from more than two years ago on runtime: reduce scheduling contention for large $GOMAXPROCS.

Supposedly this performance issue was fixed in Go 1.1, so this is a good time to revisit my benchmark experiment.

A simple run of prime or Fibonacci number calculations in N threads is quite boring, so I am going to run queries against Percona Server. Of course this adds some complication, as there are more moving parts (the Go scheduler, the Go SQL driver, and MySQL itself), but it just makes the experiment more interesting.

Source code of my benchmark: Go-pk-bench:
This is probably not the best example of how to code in Go, but that was not the point of this exercise. This post is really about some tips to take into account when writing an application in Go using a MySQL (Percona Server) database.

So, first, we will need a MySQL driver for Go. The one I used two years ago (https://github.com/Philio/GoMySQL) is quite outdated. It seems the most popular choice today is Go-MySQL-Driver, and this is the one we use for internal development. This driver is built on the standard Go “database/sql” package, which provides a standard Go way to deal with SQL-like databases. “database/sql” seems to work out OK, with some design decisions that are questionable for my taste. So using “database/sql” and Go-MySQL-Driver you will need to deal with some quirks, such as an almost unmanageable connection pool.
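
To make the setup concrete, here is a minimal sketch of wiring Go-MySQL-Driver into “database/sql” (the DSN credentials and database name are placeholders, not values taken from the benchmark):

package main

import (
	"database/sql"
	"fmt"

	_ "github.com/go-sql-driver/mysql" // blank import registers the "mysql" driver with database/sql
)

func main() {
	// sql.Open only validates the DSN; no connection is made yet.
	db, err := sql.Open("mysql", "user:password@tcp(127.0.0.1:3306)/sbtest")
	if err != nil {
		panic(err)
	}
	defer db.Close()

	// Ping forces a real connection so configuration errors surface early.
	if err := db.Ping(); err != nil {
		panic(err)
	}
	fmt.Println("connected")
}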

The first thing you should take into account is a proper setting of
runtime.GOMAXPROCS().

If you do not do that, the Go scheduler will use the default, which is 1, and that binary will use one and only one CPU (so much for a modern concurrent language).

The command runtime.GOMAXPROCS(runtime.NumCPU())
tells the runtime to use all available CPUs. Always remember to use this if you care about multi-threaded performance.
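
A minimal sketch of what this looks like at the top of main():

package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Let the scheduler use every core; with Go releases of this era the
	// default is GOMAXPROCS=1, so a concurrent program stays on one CPU.
	runtime.GOMAXPROCS(runtime.NumCPU())

	// Passing 0 queries the current value without changing it.
	fmt.Println("GOMAXPROCS =", runtime.GOMAXPROCS(0))
}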

The next problem I faced in the benchmark is that when I ran queries in a loop, i.e. repeating them as fast as possible…

rows, err := db.Query("select k from sbtest"+strconv.Itoa(tab+1)+" where id = "+strconv.Itoa(i))

… very soon we ran out of TCP ports. Apparently “database/sql” with Go-MySQL-Driver and its smart connection pool creates a NEW CONNECTION for each query. I can’t explain why this happens, but using the following statement:

db.SetMaxIdleConns(10000)

helps (I hope somebody with “database/sql” knowledge will explain what it is doing).
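
For what it is worth, SetMaxIdleConns controls how many idle connections “database/sql” keeps in its pool instead of closing them; the default is only 2, so with many concurrent goroutines the pool keeps closing and re-opening connections, each on a fresh TCP port. Here is a minimal sketch of the pool setup and query loop (DSN values are placeholders; the table follows the sysbench schema used in this benchmark):

package main

import (
	"database/sql"
	"fmt"
	"strconv"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	db, err := sql.Open("mysql", "user:password@tcp(127.0.0.1:3306)/sbtest")
	if err != nil {
		panic(err)
	}
	defer db.Close()

	// Keep connections around between queries instead of closing them,
	// otherwise each query may land on a brand new TCP connection.
	db.SetMaxIdleConns(10000)

	for i := 1; i <= 1000; i++ {
		var k int
		// Primary key lookup against one of the sysbench tables.
		err := db.QueryRow("SELECT k FROM sbtest1 WHERE id = " + strconv.Itoa(i)).Scan(&k)
		if err != nil {
			panic(err)
		}
	}
	fmt.Println("done")
}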

So after these adjustments we can now run the benchmark, which, as you can see from the query, is quite simple: run primary key lookups against Percona Server, which we know scales perfectly in this scenario (I used sysbench to create 64 tables with 1 mln rows each; all of this fits into memory). I am going to run this benchmark with 1, 2, 4, 8, 16, 24, 32, 48 and 64 user threads.

Below you can see graphs for MySQL Throughput and CPU Usage (both graphs are built using the new metrics graphing in Percona Cloud Tools).

MySQL Throughput (user threads are increasing from 1 to 64)

CPU Usage (user threads are increasing from 1 to 64)

I would say the result scales quite nicely; at least it is much better than it was two years ago. It is interesting to compare with something, so here is a graph from an identical run, but now using sysbench + Lua as the main workload driver.

MySQL Throughput (sysbench, user threads are increasing from 1 to 64)

CPU Usage (sysbench, user threads are increasing from 1 to 64)

From the graphs (this is what I like them for), we can clearly see the increase in User CPU utilization (and we are actually able to push the CPUs to 100% in user+system usage), which clearly corresponds to the increased throughput.

And if you are a fan of raw numbers:

MySQL Throughput, q/s (more is better)

Threads | Go-MySQL | sysbench
      1 |   13,189 |   16,765
      2 |   26,837 |   33,534
      4 |   52,629 |   65,943
      8 |   95,553 |  116,953
     16 |  146,979 |  182,781
     24 |  169,739 |  231,895
     32 |  181,334 |  245,939
     48 |  198,238 |  250,497
     64 |  207,732 |  251,972

(Anyone with knowledge of the Universal Scalability Law can draw a prediction up to 1,000 threads; I leave that as homework.)
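
For the curious, the Universal Scalability Law models throughput as X(N) = X(1) * N / (1 + sigma*(N-1) + kappa*N*(N-1)), where sigma is the contention coefficient and kappa the coherency coefficient. A minimal Go sketch of such a prediction follows; the coefficients are made-up placeholders for illustration, not a fit of the numbers above:

package main

import "fmt"

// usl returns the predicted throughput at concurrency n, given the
// single-thread throughput x1 and the USL contention (sigma) and
// coherency (kappa) coefficients.
func usl(x1, n, sigma, kappa float64) float64 {
	return x1 * n / (1 + sigma*(n-1) + kappa*n*(n-1))
}

func main() {
	// x1 is the 1-thread Go-MySQL result from the table above;
	// sigma and kappa are placeholder values, fit your own from the data.
	x1, sigma, kappa := 13189.0, 0.03, 0.00001
	for _, n := range []float64{64, 128, 256, 512, 1000} {
		fmt.Printf("N=%4.0f predicted q/s: %8.0f\n", n, usl(x1, n, sigma, kappa))
	}
}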

So, in conclusion, I can say that Go+MySQL is able to show decent results, but it is still not as efficient as plain raw C (sysbench), as it seems to spend some extra CPU time in system calls.

If you want to try these new graphs in Percona Cloud Tools and see how it works with your system – join the free beta!

The post Tips on benchmarking Go + MySQL appeared first on MySQL Performance Blog.

max_allowed_packet and binary log corruption in MySQL

May 14, 2014 - 1:00am

The combination of the max_allowed_packet variable and replication in MySQL is a common source of headaches. In a nutshell, max_allowed_packet is the maximum size of a MySQL network protocol packet that the server can create or read. It has a default value of 1MB (<= 5.6.5) or 4MB (>= 5.6.6) and a maximum size of 1GB. This adds some constraints to our replication environment:

  • The master server shouldn’t write events to the binary log larger than max_allowed_packet
  • All the slaves in the replication chain should have the same max_allowed_packet as the master server (a quick way to check this is sketched below)
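
Here is a minimal Go sketch of such a check across a replication chain, reusing the Go-MySQL-Driver discussed elsewhere in this feed (hostnames and credentials are hypothetical placeholders):

package main

import (
	"database/sql"
	"fmt"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	// Replace with your own master/slave hosts and credentials.
	hosts := []string{"master:3306", "slave1:3306", "slave2:3306"}
	for _, h := range hosts {
		db, err := sql.Open("mysql", "monitor:secret@tcp("+h+")/")
		if err != nil {
			panic(err)
		}
		var v int64
		// @@global.max_allowed_packet is the value the server is actually running with.
		if err := db.QueryRow("SELECT @@global.max_allowed_packet").Scan(&v); err != nil {
			panic(err)
		}
		fmt.Printf("%s: max_allowed_packet = %d\n", h, v)
		db.Close()
	}
}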

Sometimes, even following those two basic rules we can have problems.

For example, there are situations (also called bugs) where the master writes more data than the max_allowed_packet limit, causing the slaves to stop working. In order to fix this, Oracle created a new variable called slave_max_allowed_packet. This new configuration variable, available from 5.1.64, 5.5.26 and 5.6.6, overrides the max_allowed_packet value for slave threads. Therefore, regardless of the max_allowed_packet value, the slave threads will have a 1GB limit, the default value of slave_max_allowed_packet. A nice trick that works as expected.

Sometimes even with that workaround we can get the max_allowed_packet error in the slave servers. That means that there is a packet larger than 1GB, something that shouldn’t happen in a normal situation. Why? Usually it is caused by a binary log corruption. Let’s see the following example:

Slave stops working with the following message:

Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'log event entry exceeded max_allowed_packet; Increase max_allowed_packet on master'

The important part is “got fatal error 1236 from master”. The master cannot read the event it wrote to the binary log seconds ago. To check the problem we can:

  • Use mysqlbinlog to read the binary log from the position where it failed, using --start-position.

This is an example taken from our Percona Forums:

#121003 5:22:26 server id 1 end_log_pos 398528
# Unknown event
# at 398528
#960218 6:48:44 server id 1813111337 end_log_pos 1835008
# Unknown event
ERROR: Error in Log_event::read_log_event(): 'Event too big', data_len: 1953066613, event_type: 8
DELIMITER ;
# End of log file

Check the size of the event, 1953066613 bytes, or the “Unknown event” messages. Something is clearly wrong there. Another usual thing to check is the server id, which sometimes doesn’t correspond to the real value. In this example the person who posted the binary log event confirmed that the server id was wrong.

  • Check master’s error log.

[ERROR] Error in Log_event::read_log_event(): 'Event too big', data_len: 1953066613, event_type: 8

Again, the event is bigger than expected. There is no way the master and slave can read/write it, so the solution is to skip that event in the slave and rotate the logs on the master. Then, use pt-table-checksum to check data consistency.

MySQL 5.6 includes replication checksums to avoid problems with log corruptions. You can read more about it in Stephan’s blog post.

Conclusion

Errors on slave servers about max_allowed_packet can be caused by very different reasons. Although binary log corruption is not a common one, it is something worth checking when you have run out of ideas.

The post max_allowed_packet and binary log corruption in MySQL appeared first on MySQL Performance Blog.

Using Percona Server 5.6 with the Docker open-source engine

May 13, 2014 - 12:00am

There are a couple of posts about setting up Percona XtraDB Cluster on Vagrant and Percona Server on MySQL Sandbox – those are two of the top tools used by the Percona Support team for testing and bug processing among other things.

In this post, however, I will show you how to use Docker with Percona Server on Ubuntu 12.04.

As per Docker’s official site:

Docker is an open-source engine that automates the deployment of any application as a lightweight, portable, self-sufficient container that will run virtually anywhere.

Docker containers can encapsulate any payload, and will run consistently on and between virtually any server. The same container that a developer builds and tests on a laptop will run at scale, in production*, on VMs, bare-metal servers, OpenStack clusters, public instances, or combinations of the above.

To install Docker on Ubuntu 12.04 you need to follow instructions from Docker’s official documentation:
http://docs.docker.io/installation/ubuntulinux/#ubuntu-precise-1204-lts-64-bit

After installing Docker, you may either download Docker images via ‘docker pull’ and store them on your server so you can spin up a new container in an instant, or you may do a ‘docker run’ in the terminal, which implicitly downloads/stores the specific image from index.docker.io and then runs it.

root@Perconallc-Support / # docker images | grep centos
centos              centos6             0b443ba03958        2 weeks ago         297.6 MB
centos              latest              0b443ba03958        2 weeks ago         297.6 MB
centos              6.4                 539c0211cd76        13 months ago       300.6 MB

Let us create a CentOS docker container by running the following command:

root@Perconallc-Support / # docker run -i -t centos:latest bash
bash-4.1# cat /etc/redhat-release
CentOS release 6.5 (Final)

As you may have noticed, we have just created a new interactive (-i) CentOS 6.5 docker container and ran bash in a single command. Detaching from the container is as easy as typing CTRL+p followed by CTRL+q; you’ll get back to your terminal if you typed the right keys.

Verify the active containers:

root@Perconallc-Support / # docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
202765d754b7        centos:centos6      bash                11 minutes ago      Up 11 minutes                           elegant_rosalind

To list all existing containers active or not use the following command:

root@Perconallc-Support / # docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                     PORTS               NAMES
202765d754b7        centos:centos6      bash                12 minutes ago      Up 12 minutes                                  elegant_rosalind
aae47d193a22        centos:6.4          bash                2 hours ago         Exited (1) 2 hours ago                         boring_bardeen

To attach to the active docker container:

root@Perconallc-Support / # docker attach 202765d754b7
bash-4.1#

*Tip: Hit enter twice to get to the container’s bash prompt.*

Install Percona Server 5.6

Now that you have a working docker container you will then have to install the needed packages and repository.

bash-4.1# rpm -Uhv http://www.percona.com/downloads/percona-release/percona-release-0.0-1.x86_64.rpm Retrieving http://www.percona.com/downloads/percona-release/percona-release-0.0-1.x86_64.rpm Preparing... ########################################### [100%] 1:percona-release ########################################### [100%] bash-4.1# yum install Percona-Server-server-56 Percona-Server-client-56 Percona-Server-shared-56 Loaded plugins: fastestmirror Loading mirror speeds from cached hostfile * base: ftp.hosteurope.de * extras: ftp.plusline.de * updates: ftp.plusline.de Setting up Install Process Resolving Dependencies --> Running transaction check ---> Package Percona-Server-client-56.x86_64 0:5.6.17-rel65.0.el6 will be installed --> Processing Dependency: /usr/bin/perl for package: Percona-Server-client-56-5.6.17-rel65.0.el6.x86_64 ---> Package Percona-Server-server-56.x86_64 0:5.6.17-rel65.0.el6 will be installed --> Processing Dependency: libaio.so.1(LIBAIO_0.1)(64bit) for package: Percona-Server-server-56-5.6.17-rel65.0.el6.x86_64 --> Processing Dependency: libaio.so.1(LIBAIO_0.4)(64bit) for package: Percona-Server-server-56-5.6.17-rel65.0.el6.x86_64 --> Processing Dependency: libaio.so.1()(64bit) for package: Percona-Server-server-56-5.6.17-rel65.0.el6.x86_64 ---> Package Percona-Server-shared-56.x86_64 0:5.6.17-rel65.0.el6 will be installed --> Running transaction check ---> Package libaio.x86_64 0:0.3.107-10.el6 will be installed ---> Package perl.x86_64 4:5.10.1-136.el6 will be installed --> Processing Dependency: perl-libs = 4:5.10.1-136.el6 for package: 4:perl-5.10.1-136.el6.x86_64 --> Processing Dependency: perl-libs for package: 4:perl-5.10.1-136.el6.x86_64 --> Processing Dependency: perl(version) for package: 4:perl-5.10.1-136.el6.x86_64 --> Processing Dependency: perl(Pod::Simple) for package: 4:perl-5.10.1-136.el6.x86_64 --> Processing Dependency: perl(Module::Pluggable) for package: 4:perl-5.10.1-136.el6.x86_64 --> Processing Dependency: libperl.so()(64bit) for package: 4:perl-5.10.1-136.el6.x86_64 --> Running transaction check ---> Package perl-Module-Pluggable.x86_64 1:3.90-136.el6 will be installed ---> Package perl-Pod-Simple.x86_64 1:3.13-136.el6 will be installed --> Processing Dependency: perl(Pod::Escapes) >= 1.04 for package: 1:perl-Pod-Simple-3.13-136.el6.x86_64 ---> Package perl-libs.x86_64 4:5.10.1-136.el6 will be installed ---> Package perl-version.x86_64 3:0.77-136.el6 will be installed --> Running transaction check ---> Package perl-Pod-Escapes.x86_64 1:1.04-136.el6 will be installed --> Finished Dependency Resolution Dependencies Resolved ======================================================================================================================================================================== Package Arch Version Repository Size ======================================================================================================================================================================== Installing: Percona-Server-client-56 x86_64 5.6.17-rel65.0.el6 percona 6.8 M Percona-Server-server-56 x86_64 5.6.17-rel65.0.el6 percona 19 M Percona-Server-shared-56 x86_64 5.6.17-rel65.0.el6 percona 714 k Installing for dependencies: libaio x86_64 0.3.107-10.el6 base 21 k perl x86_64 4:5.10.1-136.el6 base 10 M perl-Module-Pluggable x86_64 1:3.90-136.el6 base 40 k perl-Pod-Escapes x86_64 1:1.04-136.el6 base 32 k perl-Pod-Simple x86_64 1:3.13-136.el6 base 212 k perl-libs x86_64 4:5.10.1-136.el6 base 578 k perl-version x86_64 3:0.77-136.el6 base 51 k 
Transaction Summary ======================================================================================================================================================================== Install 10 Package(s) Total download size: 38 M Installed size: 158 M Is this ok [y/N]: y Downloading Packages: (1/10): Percona-Server-client-56-5.6.17-rel65.0.el6.x86_64.rpm | 6.8 MB 00:02 (2/10): Percona-Server-server-56-5.6.17-rel65.0.el6.x86_64.rpm | 19 MB 00:00 (3/10): Percona-Server-shared-56-5.6.17-rel65.0.el6.x86_64.rpm | 714 kB 00:00 (4/10): libaio-0.3.107-10.el6.x86_64.rpm | 21 kB 00:00 (5/10): perl-5.10.1-136.el6.x86_64.rpm | 10 MB 00:00 (6/10): perl-Module-Pluggable-3.90-136.el6.x86_64.rpm | 40 kB 00:00 (7/10): perl-Pod-Escapes-1.04-136.el6.x86_64.rpm | 32 kB 00:00 (8/10): perl-Pod-Simple-3.13-136.el6.x86_64.rpm | 212 kB 00:00 (9/10): perl-libs-5.10.1-136.el6.x86_64.rpm | 578 kB 00:00 (10/10): perl-version-0.77-136.el6.x86_64.rpm | 51 kB 00:00 ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Total 7.4 MB/s | 38 MB 00:05 Running rpm_check_debug Running Transaction Test Transaction Test Succeeded Running Transaction Warning: RPMDB altered outside of yum. ** Found 2 pre-existing rpmdb problem(s), 'yum check' output follows: udev-147-2.51.el6.x86_64 has missing requires of /sbin/service udev-147-2.51.el6.x86_64 has missing requires of MAKEDEV >= ('0', '3.11', None) Installing : Percona-Server-shared-56-5.6.17-rel65.0.el6.x86_64 1/10 Installing : 1:perl-Pod-Escapes-1.04-136.el6.x86_64 2/10 Installing : 1:perl-Module-Pluggable-3.90-136.el6.x86_64 3/10 Installing : 4:perl-libs-5.10.1-136.el6.x86_64 4/10 Installing : 3:perl-version-0.77-136.el6.x86_64 5/10 Installing : 1:perl-Pod-Simple-3.13-136.el6.x86_64 6/10 Installing : 4:perl-5.10.1-136.el6.x86_64 7/10 Installing : Percona-Server-client-56-5.6.17-rel65.0.el6.x86_64 8/10 Installing : libaio-0.3.107-10.el6.x86_64 9/10 Installing : Percona-Server-server-56-5.6.17-rel65.0.el6.x86_64 10/10 ... Installed: Percona-Server-client-56.x86_64 0:5.6.17-rel65.0.el6 Percona-Server-server-56.x86_64 0:5.6.17-rel65.0.el6 Percona-Server-shared-56.x86_64 0:5.6.17-rel65.0.el6 Dependency Installed: libaio.x86_64 0:0.3.107-10.el6 perl.x86_64 4:5.10.1-136.el6 perl-Module-Pluggable.x86_64 1:3.90-136.el6 perl-Pod-Escapes.x86_64 1:1.04-136.el6 perl-Pod-Simple.x86_64 1:3.13-136.el6 perl-libs.x86_64 4:5.10.1-136.el6 perl-version.x86_64 3:0.77-136.el6 Complete! bash-4.1# /etc/init.d/mysql start Starting MySQL (Percona Server). SUCCESS! bash-4.1# mysql Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 5.6.17-65.0-56 Percona Server (GPL), Release 65.0, Revision 587 Copyright (c) 2009-2014 Percona LLC and/or its affiliates Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql>

There you have it, a freshly installed Percona Server 5.6.17 on CentOS 6.5 Linux container. Yay!

You can now create a new MySQL user to access the server outside the container:

mysql> grant all privileges on *.* to 'test_user'@'%' identified by 'test_pass';
Query OK, 0 rows affected (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

root@Perconallc-Support / # mysql -h 172.17.0.2 -utest_user -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.6.17-65.0-56 Percona Server (GPL), Release 65.0, Revision 587

Copyright (c) 2009-2014 Percona LLC and/or its affiliates
Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

We’re basically following installation instructions for Percona Server from the documentation.

Just the same way as you would do on any server, you can install other equally important Percona software on the same container to complete your docker environment.

As an added bonus, you can contribute to the Docker community by committing your container and pushing it to the Docker registry.

Summary

I’ve shown you how easy it is to spin a docker container and install Percona Server 5.6 in it. This is by far one of the fastest ways to create test/staging environments to simulate production servers. For further reading you may want to read Docker’s official documentation here.

I hope that this has piqued your interest to try it out yourself. And if that happens we would love to know how Percona Server and other Percona software perform on your docker environment. Cheers!

The post Using Percona Server 5.6 with the Docker open-source engine appeared first on MySQL Performance Blog.

Practical MySQL performance optimization: May 14 Webinar

May 12, 2014 - 3:00am

Achieving the best possible MySQL Performance doesn’t have to be complicated. It’s all about knowing which tools are designed for the task at hand – along with some basic (yet often overlooked) best practices.

Join me Wednesday, May 14 at 10 a.m. Pacific for a free webinar titled, “Practical MySQL performance optimization.” I’ll be sharing the main areas for improving MySQL performance – along with what to specifically focus on in each. These will include:

  • Hardware
  • MySQL Configuration
  • Schema and Queries
  • Application Architecture

And as I mentioned earlier, I’ll also show you the best tools for the job and how to use them efficiently. This will help you optimize your time by focusing on the queries that are most important for your application.

At the end of this webinar, you will know how to optimize MySQL performance in the most practical way possible. The webinar is free but I recommend registering now to reserve your spot. I hope to see you on May 14!

The post Practical MySQL performance optimization: May 14 Webinar appeared first on MySQL Performance Blog.

GTIDs in MySQL 5.6: New replication protocol; new ways to break replication

May 9, 2014 - 12:00am

One of the MySQL 5.6 features many people are interested in is Global Transaction IDs (GTIDs). This is for a good reason: reconnecting a slave to a new master has always been a challenge, while it is trivial when GTIDs are enabled. However, using GTIDs is not only about replacing the good old binlog file/position with unique identifiers; it also means using a new replication protocol. And if you are not aware of it, it can bite.

Replication protocols: old vs new

The old protocol is pretty straightforward: the slave connects to a given binary log file at a specific offset, and the master sends all the transactions from there.

The new protocol is slightly different: the slave first sends the range of GTIDs it has executed, and then the master sends every missing transaction. It also guarantees that a transaction with a given GTID can only be executed once on a specific slave.

In practice, does it change anything? Well, it may change a lot of things. Imagine the following situation: you want to start replicating from trx 4, but trx 2 is missing on the slave for some reason.

With the old replication protocol, trx 2 will never be executed while with the new replication protocol, it WILL be executed automatically.

Here are 2 common situations where you can see the new replication protocol in action.

Skipping transactions

It is well known that the good old SET GLOBAL sql_slave_skip_counter = N is no longer supported when you want to skip a transaction and GTIDs are enabled. Instead, to skip the transaction with GTID XXX:N, you have to inject an empty transaction:

mysql> SET gtid_next = 'XXX:N';
mysql> BEGIN; COMMIT;
mysql> SET gtid_next = 'AUTOMATIC';

Why can’t we use sql_slave_skip_counter? Because of the new replication protocol!

Imagine that we have 3 servers like the picture below:

Let’s assume that sql_slave_skip_counter is allowed and has been used on S2 to skip trx 2. What happens if you make S2 a slave of S1?

Both servers will exchange the range of executed GTIDs, and S1 will realize that it has to send trx 2 to S2. Two options then:

  • If trx 2 is still in the binary logs of S1, it will be sent to S2, and the transaction is no longer skipped.
  • If trx 2 no longer exists in the binary logs of S1, you will get a replication error.

This is clearly not safe, that’s why sql_slave_skip_counter is not allowed with GTIDs. The only safe option to skip a transaction is to execute a fake transaction instead of the real one.

Errant transactions

If you execute a transaction locally on a slave (called errant transaction in the MySQL documentation), what will happen if you promote this slave to be the new master?

With the old replication protocol, basically nothing (to be accurate, data will be inconsistent between the new master and its slaves, but that can probably be fixed later).

With the new protocol, the errant transaction will be identified as missing everywhere and will be automatically executed on failover, which has the potential to break replication.

Let’s say you have a master (M), and 2 slaves (S1 and S2). Here are 2 simple scenarios where reconnecting slaves to the new master will fail (with different replication errors):

# Scenario 1

# S1
mysql> CREATE DATABASE mydb;

# M
mysql> CREATE DATABASE IF NOT EXISTS mydb;
# Thanks to 'IF NOT EXISTS', replication doesn't break on S1. Now move S2 to S1:

# S2
mysql> STOP SLAVE; CHANGE MASTER TO MASTER_HOST='S1'; START SLAVE;

# This creates a conflict with existing data!
mysql> SHOW SLAVE STATUS\G
[...]
Last_SQL_Errno: 1007
Last_SQL_Error: Error 'Can't create database 'mydb'; database exists' on query. Default database: 'mydb'. Query: 'CREATE DATABASE mydb'
[...]

# Scenario 2

# S1
mysql> CREATE DATABASE mydb;

# Now, we'll remove this transaction from the binary logs
# S1
mysql> FLUSH LOGS;
mysql> PURGE BINARY LOGS TO 'mysql-bin.000008';

# M
mysql> CREATE DATABASE IF NOT EXISTS mydb;

# S2
mysql> STOP SLAVE; CHANGE MASTER TO MASTER_HOST='S1'; START SLAVE;

# The missing transaction is no longer available in the master's binary logs!
mysql> SHOW SLAVE STATUS\G
[...]
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
[...]

As you can understand, errant transactions should be avoided with GTID-based replication. If you need to run a local transaction, your best option is to disable binary logging for that specific statement:

mysql> SET SQL_LOG_BIN = 0;
mysql> # Run local transaction

Conclusion

GTIDs are a great step forward in the way we are able to reconnect replicas to other servers. But they also come with new operational challenges. If you plan to use GTIDs, make sure you correctly understand the new replication protocol, otherwise you may end up breaking replication in new and unexpected ways.

I’ll do more exploration about errant transactions in a future post.

The post GTIDs in MySQL 5.6: New replication protocol; new ways to break replication appeared first on MySQL Performance Blog.

Percona XtraDB Cluster 5.5.37-25.10 is now available

May 8, 2014 - 10:40am

Percona is glad to announce the release of Percona XtraDB Cluster 5.5.37-25.10 on May 8, 2014. Binaries are available from the downloads area or from our software repositories.

Based on Percona Server 5.5.37-35.0 including all the bug fixes in it, Galera Replicator 2.10 (including fixes in 2.9 and 2.10 milestones) and on wsrep API 25.10 (including fixes in 25.10), Percona XtraDB Cluster 5.5.37-25.10 is now the current stable release. All of Percona‘s software is open-source and free.

New Features:

  • Percona XtraDB Cluster now supports stream compression/decompression with new xtrabackup-sst compressor/decompressor options.
  • Garbd init script and configuration files have been packaged for CentOS and Debian; in addition, on Debian garbd is packaged separately in the percona-xtradb-cluster-garbd-2.x package.
  • New meta packages are now available in order to make the Percona XtraDB Cluster installation easier.
  • Debian/Ubuntu debug packages are now available for Galera and garbd.
  • New SST options have been implemented: inno-backup-opts, inno-apply-opts, inno-move-opts which pass options to backup, apply and move stages of innobackupex.
  • An initial configurable timeout, of 100 seconds, to receive the first packet via SST has been implemented, so that if the donor dies somewhere in between, the joiner doesn’t hang. The timeout can be configured with the sst-initial-timeout variable.
  • Previously, the joiner would wait and not fall back to choosing other potential donor nodes (not listed in wsrep_sst_donor) by their state, even when a comma was added at the end. This has been fixed for that particular case.
  • Support for Query Cache has been implemented.
  • Percona XtraDB Cluster packages are now available for Ubuntu 14.04.

Bugs Fixed:

  • To avoid disabling Parallel Apply in case of SAVEPOINT or ROLLBACK TO SAVEPOINT the wsrep_cert_deps_distance was set to 1.0 at all times. Bug fixed #1277703.
  • First connection would hang after changing the wsrep_cluster_address variable. Bug fixed #1022250.
  • When gmcast.listen_addr variable was set manually it did not allow node’s own address in gcomm address list. Bug fixed #1099478.
  • wsrep_sst_rsync would silently fail on joiner when rsync server port was already taken. Bug fixed #1099783.
  • Server would segfault on INSERT DELAYED with wsrep_replicate_myisam set to 1 due to unchecked dereference of NULL pointer. Bug fixed #1165958.
  • When grastate.dat file was not getting zeroed appropriately it would lead to RBR error during the IST. Bug fixed #1180791.
  • Due to the Provides: line in Percona XtraDB Cluster (which provides Percona-Server-server), the command yum install Percona-Server-server would install Percona XtraDB Cluster instead of the expected Percona Server. Bug fixed #1201499.
  • Replication of partition tables without binlogging enabled failed, partition truncation didn’t work because of lack of TO isolation there. Bug fixed #1219605.
  • Exception during group merge after partitioning event has been fixed. Bug fixed #1232747.
  • Default value for binlog_format is now ROW. This is done so that Percona XtraDB Cluster is not started with wrong defaults leading to non-deterministic outcomes like crash. Bug fixed #1243228.
  • CREATE TABLE AS SELECT was not replicated, if the select result set was empty. Bug fixed #1246921.
  • INSERT would return deadlock instead of duplicate key on secondary unique key collision. Bug fixed #1255147.
  • Joiner node would not initialize storage engines if rsync was used for SST and the first view was non-primary. Bug fixed #1257341.
  • Table level lock conflict resolving was releasing the wrong lock. Bug fixed #1257678.
  • Resolved the perl dependencies needed for Percona XtraDB Cluster 5.5. Bug fixed #1258563.
  • Obsolete dependencies have been removed from Percona XtraDB Cluster. Bug fixed #1259256.
  • GCache file allocation could fail if file size was a multiple of page size. Bug fixed #1259952.
  • Percona XtraDB Cluster didn’t validate the parameters of wsrep_provider_options when starting it up. Bug fixed #1260193.
  • Runtime checks have been added for dynamic variables which are Galera incompatible. Bug fixed #1262188.
  • Node would get stuck and required restart if DDL was performed after FLUSH TABLES WITH READ LOCK. Bug fixed #1265656.
  • xtrabackup-v2 is now used as default SST method in wsrep_sst_method. Bug fixed #1268837.
  • FLUSH TABLES WITH READ LOCK behavior on the same connection was changed to conform to MySQL behavior. Bug fixed #1269085.
  • Read-only detection has been added in clustercheck, which can be helpful during major upgrades (this is used by xinetd for HAProxy etc.) Bug fixed #1269469.
  • Binary log directory is now being cleaned up as part of the XtraBackup SST. Bug fixed #1273368.
  • Deadlock would happen when NULL unique key was inserted. Workaround has been implemented to support NULL keys, by using the md5 sum of full row as key value. Bug fixed #1276424.

This release contains almost 100 fixed bugs; the complete list of fixed bugs can be found in our release notes.

Please see the 5.5.37-25.10 release notes for complete details on all of the new features and bug fixes. Help us improve quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Percona XtraDB Cluster Errata can be found in our documentation.

The post Percona XtraDB Cluster 5.5.37-25.10 is now available appeared first on MySQL Performance Blog.

Percona XtraBackup 2.2.2 beta release is now available

May 8, 2014 - 7:58am

Percona is glad to announce the release of Percona XtraBackup 2.2.2-beta1 on May 8th, 2014. Downloads are available from our download site here. This BETA release will be available in our Debian experimental and CentOS testing repositories.

This is a BETA quality release and it is not intended for production. If you want a high quality, Generally Available release, the current Stable version should be used (Percona XtraBackup 2.1.9 in the 2.1 series at the time of writing).

New Features

  • Percona XtraBackup has now been rebased on MySQL 5.6.17.
  • Percona XtraBackup package is now available for Ubuntu 14.04.

Bugs Fixed

  • Percona XtraBackup couldn’t be built with Bison 3.0. Bug fixed #1262439.
  • The xtrabackup binaries now recognize a new my.cnf option, open_files_limit. The effect is the same as for the server: it changes the maximum number of file descriptors available to the xtrabackup process. The actual limit depends on the platform and ulimit settings. Bug fixed #1183793.
  • If a remote InnoDB tablespace got CREATEd or ALTERed during the backup, an attempt to prepare such a backup later would lead to Percona XtraBackup crash. Bug fixed #1291299.
  • Code in both innobackupex and xtrabackup, that was supposed to make sure no child processes are left running in case innobackupex got killed or failed with an error, relied on the fact the SIGTERM and SIGINT signals were not blocked by the xtrabackup process. However, both SIGTERM and SIGINT might be blocked by the process that had invoked innobackupex, for example, by the Percona XtraDB Cluster server processes doing an SST, in which case they were also blocked by the xtrabackup process, since the signal mask is inherited by child processes. Fixed by replacing SIGTERM in innobackupex and SIGINT in xtrabackup auto-termination with SIGKILL. Bug fixed #1294782.

Release notes with all the bugfixes for Percona XtraBackup 2.2.2 are available in our online documentation. All of Percona‘s software is open source and free, all the details of the release can be found in the 2.2.2 milestone at Launchpad. Bugs can be reported on the launchpad bug tracker.

The post Percona XtraBackup 2.2.2 beta release is now available appeared first on MySQL Performance Blog.

Doing a rolling upgrade of Percona XtraDB Cluster from 5.5 to 5.6

May 8, 2014 - 3:00am
Overview

Percona XtraDB Cluster 5.6 has been GA for several months now and people are thinking more and more about moving from 5.5 to 5.6. Most people don’t want to upgrade all at once, but would prefer a rolling upgrade to avoid downtime and ensure 5.6 is behaving in a stable fashion before putting all of production on it. The official guide to a rolling upgrade can be found in the PXC 5.6 manual. This blog post will attempt to summarize the basic process.

However, there are a few caveats to trying to do a rolling 5.6 upgrade from 5.5:

  1. If you mix Galera 2 and Galera 3 nodes, you must set wsrep_provider_options="socket.checksum=1" on the Galera 3 nodes for backwards compatibility between Galera versions.
  2. You must set some 5.6 settings for 5.5 compatibility with respect to replication:
    1. You can’t enable GTID async replication on the 5.6 nodes
    2. You should use --log-bin-use-v1-row-events on the 5.6 nodes
    3. You should set binlog_checksum=NONE  on the 5.6 nodes
  3. You must not SST a 5.5 donor to a 5.6 joiner as the SST script does not handle mysql_upgrade
  4. You should set the 5.6 nodes read_only and not write to them!
The basic upgrade flow

The basic upgrade flow is:

  1. For some node(s): upgrade to 5.6, do all the above stuff, put them into a read_only pool
  2. Repeat step 1 as desired
  3. Once your 5.6 pool is sufficiently large, cut over writes to the 5.6 pool (turn off read_only, etc.) and upgrade the rest.

This is, in essence, exactly like upgrading a 5.5 master/slave cluster to 5.6 — you upgrade the slaves first, promote a slave and upgrade the master; we just have more masters to think about.

Once your upgrade is fully on 5.6, you can go back through and remove all the 5.5 backwards-compatibility settings.

Why can’t I write to the 5.6 nodes?

The heaviest caveat is probably the fact that in a mixed 5.5 / 5.6 cluster, you are not supposed to write to the 5.6 nodes.  Why is that?  Well, the reason goes back to MySQL itself.  PXC/Galera uses standard RBR binlog events from MySQL for replication.   Replication between major MySQL versions is only ever officially supported:

  • across 1 major version (i.e., 5.5 to 5.6, though multiple version hops do often work)
  • from a lower version master to a higher version slave (i.e., 5.5 is your master and 5.6 is your slave, but not the other way around)

This compatibility requirement (which has existed for a very long time in MySQL) works great when you have a single Master replication topology, but true multi-master (multi-writer) has obviously never been considered.

Some alternatives

Does writing to the 5.6 nodes REALLY break things?

The restriction against 5.5 slaves of a 5.6 master is probably too strict in many cases. Technically, only older-to-newer replication is ever truly supported, but in practice you may be able to run a mixed cluster with writes to all nodes as long as you are careful. This means (at least) that any modifications to column type formats in the newer version must NOT be applied while the old version remains active in the cluster. There might be other issues; I'm not sure, as I cannot say I've tested every possible circumstance.

So, can I truly say I recommend this?  I cannot say that officially, but you may find it works fine.  As long as you acknowledge that something unforeseen may break your cluster and your migration plan, it may be reasonable.  If you decide to explore this option, please test this thoroughly and be willing to accept the consequences of it not working before trying it in production!

Using Async replication to upgrade

Another alternative: rather than trying to mix the clusters and keeping the 5.6 nodes read_only, why not just set up the 5.6 cluster as an async slave of your 5.5 cluster and migrate your application to the new cluster when you are ready? This is practically the same as maintaining a split 5.5/5.6 read_write/read_only cluster, but with less risk and a smaller list of don’ts. Cutover in this case would effectively be like promoting a 5.6 slave to master, except you would promote the whole 5.6 cluster.

One caveat with this approach might be replication throughput: async may not be able to keep up replicating your 5.5 cluster writes to a separate 5.6 cluster. Definitely check out wsrep_preordered to speed things up; it may help. But realize that some busy workloads may never be able to use async replication into another cluster.

Just take the outage

A final alternative for this post is the idea of simply upgrading the entire cluster to 5.6 all at once during a maintenance window.  I grant that this defeats the point of a rolling upgrade, but it may offer a lot of simplicity in the longer run.

Conclusion

A rolling PXC / Galera upgrade across major MySQL versions is limited by the fact that there is no official support or reason for Oracle to support newer master to older slave.  In practice, it may work much of the time, but these situations should be considered carefully and the risk weighed against all other options.

The post Doing a rolling upgrade of Percona XtraDB Cluster from 5.5 to 5.6 appeared first on MySQL Performance Blog.

Percona XtraBackup 2.1.9 is now available

May 7, 2014 - 7:21am

Percona is glad to announce the release of Percona XtraBackup 2.1.9 on May 7th, 2014. Downloads are available from our download site here and Percona Software Repositories.

This release is the current GA (Generally Available) stable release in the 2.1 series. Percona XtraBackup is an open source, free MySQL hot backup software that performs non-blocking backups for InnoDB and XtraDB databases.

New Features

  • Percona XtraBackup has now been rebased on MySQL 5.6.17.
  • Percona XtraBackup package is now available for Ubuntu 14.04.

Bugs Fixed

  • Percona XtraBackup couldn’t be built with Bison 3.0. Bug fixed #1262439.
  • The xtrabackup binaries now recognize a new my.cnf option, open_files_limit. The effect is the same as for the server: it changes the maximum number of file descriptors available to the xtrabackup process. The actual limit depends on the platform and ulimit settings. Bug fixed #1183793.
  • If a remote InnoDB tablespace got CREATEd or ALTERed during the backup, an attempt to prepare such a backup later would lead to Percona XtraBackup crash. Bug fixed #1291299.
  • Code in both innobackupex and xtrabackup, that was supposed to make sure no child processes are left running in case innobackupex got killed or failed with an error, relied on the fact the SIGTERM and SIGINT signals were not blocked by the xtrabackup process. However, both SIGTERM and SIGINT might be blocked by the process that had invoked innobackupex, for example, by the Percona XtraDB Cluster server processes doing an SST, in which case they were also blocked by the xtrabackup process, since the signal mask is inherited by child processes. Fixed by replacing SIGTERM in innobackupex and SIGINT in xtrabackup auto-termination with SIGKILL. Bug fixed #1294782.
  • A new table flag in the InnoDB data dictionary, introduced in MySQL 5.6.16, wasn’t recognized by Percona XtraBackup which lead to “InnoDB: in InnoDB data dictionary has unknown flags 50.” warnings. Fixed by rebasing Percona XtraBackup on MySQL 5.6.17. Bug fixed #1302882.

Release notes with all the bugfixes for Percona XtraBackup 2.1.9 are available in our online documentation. All of Percona‘s software is open source and free, all the details of the release can be found in the 2.1.9 milestone at Launchpad. Bugs can be reported on the launchpad bug tracker.

The post Percona XtraBackup 2.1.9 is now available appeared first on MySQL Performance Blog.

MySQL Audit Plugin now available in Percona Server 5.5 and 5.6

May 7, 2014 - 6:00am

The new Percona Server 5.5.37-35.0 and Percona Server 5.6.17-65.0-56, announced yesterday (May 6), both include the open source version of the MySQL Audit Plugin. The MySQL Audit Plugin is used to log all queries or connections (“audit” MySQL usage). Until yesterday’s release, the MySQL Audit Plugin was only available in MySQL Enterprise.

Logging all MySQL usage is very important for a number of applications, for example:

  • Required: applications which deal with sensitive data (credit cards, medical records, etc.); required for security compliance (e.g. HIPAA)
  • Very helpful: multi-tenant applications or MySQL as a service; MySQL administrators can audit MySQL usage from the security and performance standpoint
  • Very helpful: investigating and troubleshooting; it is great to have a full log of all queries, which can help a lot when troubleshooting MySQL and even for performance audits.

Originally, the only “easy” option was to enable the general log. (Other options included using binary logs, which do not include SELECT queries, or enabling a query “trace” in the application or MySQL connector.) However, logging all queries using the general log may dramatically decrease performance in highly loaded MySQL applications: Aleksandr Kuzminsky published a benchmark in 2009 showing the overhead of the MySQL general and slow logs. The main benefit of the MySQL Audit Plugin is that it logs all queries asynchronously (this can be changed in the config). I decided to try the new audit plugin in Percona Server and measure the performance impact of the new plugin compared to enabling the general log for CPU-bound applications.

How to start with MySQL Audit Plugin

First, we will need to enable (or “install”) the MySQL Audit Plugin as described in the documentation:

mysql> select version();
+-------------+
| version()   |
+-------------+
| 5.5.37-35.0 |
+-------------+
1 row in set (0.00 sec)

mysql> INSTALL PLUGIN audit_log SONAME 'audit_log.so';
Query OK, 0 rows affected (0.00 sec)

Now we can see all the MySQL Audit Plugin options:

mysql> show global variables like '%audit%';
+--------------------------+--------------+
| Variable_name            | Value        |
+--------------------------+--------------+
| audit_log_buffer_size    | 1048576      |
| audit_log_file           | audit.log    |
| audit_log_flush          | OFF          |
| audit_log_format         | OLD          |
| audit_log_policy         | ALL          |
| audit_log_rotate_on_size | 0            |
| audit_log_rotations      | 0            |
| audit_log_strategy       | ASYNCHRONOUS |
+--------------------------+--------------+
8 rows in set (0.00 sec)

There are a bunch of options we can tweak here, the most important for MySQL performance are:

  • audit_log_buffer_size; this buffer is used to cache the queries (for asynchronous operation).
  • audit_log_strategy; All options are listed in the documentation page:
Value           | Meaning
ASYNCHRONOUS    | Log asynchronously, wait for space in output buffer
PERFORMANCE     | Log asynchronously, drop request if insufficient space in output buffer
SEMISYNCHRONOUS | Log synchronously, permit caching by operating system
SYNCHRONOUS     | Log synchronously, call sync() after each request

The most useful option in my mind is ASYNCHRONOUS, providing us with a good balance between performance and not losing transactions if the output buffer is not large enough.

  •  audit_log_policy; we can log all queries or MySQL logins only (very useful if we only need to audit MySQL connections)

Open Source Audit Plugin in MySQL Community server

You can also use the Percona open source version of the Audit Plugin with the MySQL Community version (5.5.37 and 5.6.17). Simply download the Linux tarball of Percona Server and copy the audit_log.so to your MySQL plugin dir.

Find plugin dir:

mysql> show global variables like '%plugin%';
+---------------+------------------------------+
| Variable_name | Value                        |
+---------------+------------------------------+
| plugin_dir    | /usr/local/mysql/lib/plugin/ |
+---------------+------------------------------+
1 row in set (0.00 sec)

Copy the file:

# cp audit_log.so /usr/local/mysql/lib/plugin/

Install plugin:

Server version: 5.5.37 MySQL Community Server (GPL)

mysql> INSTALL PLUGIN audit_log SONAME 'audit_log.so';
Query OK, 0 rows affected (0.00 sec)

Server version: 5.6.17 MySQL Community Server (GPL)

mysql> INSTALL PLUGIN audit_log SONAME 'audit_log.so';
Query OK, 0 rows affected (0.00 sec)

Using MySQL audit plugin

When the plugin is enabled, it will log entries to the audit.log file in XML format. Example:

<AUDIT_RECORD "NAME"="Audit" "RECORD"="1_2014-04-30T00:04:42" "TIMESTAMP"="2014-04-30T00:04:42 UTC" "MYSQL_VERSION"="5.5.37-35.0" "STARTUP_OPTIONS"="--basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/lib/mysql/localhost.localdomain.pid --socket=/var/lib/mysql/mysql.sock" "OS_VERSION"="x86_64-Linux", />
<AUDIT_RECORD "NAME"="Query" "RECORD"="2_2014-04-30T00:04:42" "TIMESTAMP"="2014-04-30T00:04:42 UTC" "COMMAND_CLASS"="install_plugin" "CONNECTION_ID"="1" "STATUS"="0" "SQLTEXT"="INSTALL PLUGIN audit_log SONAME 'audit_log.so'" "USER"="root[root] @ localhost []" "HOST"="localhost" "OS_USER"="" "IP"="" />
<AUDIT_RECORD "NAME"="Query" "RECORD"="3_2014-04-30T00:04:42" "TIMESTAMP"="2014-04-30T00:05:07 UTC" "COMMAND_CLASS"="show_variables" "CONNECTION_ID"="1" "STATUS"="0" "SQLTEXT"="show global variables like '%audit%'" "USER"="root[root] @ localhost []" "HOST"="localhost" "OS_USER"="" "IP"="" />

<AUDIT_RECORD "NAME"="Query" "RECORD"="10_2014-04-30T00:04:42" "TIMESTAMP"="2014-04-30T12:33:20 UTC" "COMMAND_CLASS"="grant" "CONNECTION_ID"="2" "STATUS"="0" "SQLTEXT"="grant all on sbtest.* to sb@localhost identified by 'sb'" "USER"="root[root] @ localhost []" "HOST"="localhost" "OS_USER"="" "IP"="" />
<AUDIT_RECORD "NAME"="Connect" "RECORD"="11_2014-04-30T00:04:42" "TIMESTAMP"="2014-04-30T12:34:53 UTC" "CONNECTION_ID"="3" "STATUS"="0" "USER"="sb" "PRIV_USER"="sb" "OS_LOGIN"="" "PROXY_USER"="" "HOST"="localhost" "IP"="" "DB"="sbtest" />
<AUDIT_RECORD "NAME"="Query" "RECORD"="1292_2014-04-30T00:04:42" "TIMESTAMP"="2014-04-30T12:45:07 UTC" "COMMAND_CLASS"="select" "CONNECTION_ID"="32" "STATUS"="1146" "SQLTEXT"="SELECT pad FROM sbtest8 WHERE id=5036031" "USER"="sb[sb] @ localhost []" "HOST"="localhost" "OS_USER"="" "IP"="" />

 Important notes: 

  • As all queries will be logged here, the passwords from “GRANT” will also be saved in clear text (as you can see above). It is very important to secure the file on disk.
  • The file can grow very large on disk:

ls -lah /var/lib/mysql/audit.log
-rw-rw---- 1 mysql mysql 7.1G May 4 07:30 /var/lib/mysql/audit.log

Searching the Audit Log entries

MySQL Utilities provide a useful tool, mysqlauditgrep, to search/grep the log file. Unfortunately, I was not able to make it work (I tried both v1.3 and v1.4) with the audit plugin format created by Percona Server. According to this bug, it can’t parse the “new” audit format. In my case, mysqlauditgrep returned a parsing error when I used the default format and returned no results when I set audit_log_format=NEW. It would be nice to use mysqlauditgrep, as it looks like a very powerful tool, but for now our searching options are limited to conventional Linux grep (which is not very easy for XML documents) or a custom application to parse/search the XML.
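
As a starting point for such a custom application, here is a minimal Go sketch that splits the log into AUDIT_RECORD chunks and prints the ones matching a substring; the file path and search term are hypothetical examples, and for a multi-gigabyte log you would want to stream the file rather than read it all at once:

package main

import (
	"fmt"
	"os"
	"strings"
)

func main() {
	data, err := os.ReadFile("/var/lib/mysql/audit.log")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// Each entry starts with "<AUDIT_RECORD", so splitting on that marker
	// gives one chunk per record (the first chunk is whatever precedes it).
	records := strings.Split(string(data), "<AUDIT_RECORD")
	for _, r := range records[1:] {
		// Filter on any substring: a user, a table name, a COMMAND_CLASS, etc.
		if strings.Contains(r, `"USER"="sb`) {
			fmt.Println("<AUDIT_RECORD" + strings.TrimRight(r, "\n"))
		}
	}
}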

Performance overhead of Audit Log Plugin and General Log 

Finally, I wanted to measure the overhead of the Audit Log Plugin compared to General Log. I did a quick benchmark with sysbench OLTP test (CPU bound workload) with 4 modes:

  1. Audit Plugin disabled (to measure baseline)
  2. Audit Plugin enabled and logs all queries
  3. Audit Plugin enabled and logs only logins
  4. General Log enabled, Audit Plugin disabled

Here are the results:

Test                               | Overhead
Plugin + audit_log_policy = ALL    | ~15% overhead
Plugin + audit_log_policy = LOGINS | ~0% overhead (sysbench only connects once, so there may be bigger overhead here)
General_log                        | ~62% overhead

As we can see here, the audit log is not free of overhead; however, it is much smaller than enabling general_log to log each and every query. Those are quick benchmark results and more tests are needed for more accurate measurements. Also, as always, your mileage may vary.

Nice to have features

What I would love to have in the audit plugin is the ability to log only some specific actions. For example, only log activity from a specific user or access to a specific table (i.e. a table with sensitive data), etc. This would give more control and less overhead (= better performance).

Conclusion

The MySQL Audit Plugin is a great feature: it is a valuable tool for MySQL security and performance audits. The performance overhead may be a concern for highly loaded systems; however, it looks reasonable and is much better than using the general log to log all queries.

If you use general log or any other audit plugins, please share your experience in the comments.

The post MySQL Audit Plugin now available in Percona Server 5.5 and 5.6 appeared first on MySQL Performance Blog.

Percona Server 5.6.17-65.0 is now available

May 6, 2014 - 9:50am

Percona Server version 5.6.17-65.0

Percona is glad to announce the release of Percona Server 5.6.17-65.0 on May 6th, 2014. Downloads are available here and from the Percona Software Repositories.

Based on MySQL 5.6.17, including all the bug fixes in it, Percona Server 5.6.17-65.0 is the current GA release in the Percona Server 5.6 series. All of Percona’s software is open-source and free, all the details of the release can be found in the 5.6.17-65.0 milestone at Launchpad.

New Features:

  • Percona Server now supports Metrics for scalability measurement.
  • Percona Server now supports Audit Log Plugin.
  • Percona Server parser and query optimizer now support Multiple Clustering Keys when TokuDB engine is used.
  • Storage engine handler interface has been extended with new calls to notify the storage engine of imminent table or index scan. The calls are used by TokuDB to improve performance of these operations.
  • Percona Server packages are now available for Ubuntu 14.04.

Bugs Fixed:

  • Percona Server couldn’t be built with Bison 3.0. Bug fixed #1262439, upstream #71250 (Ryan Gordon).
  • Fixed the inadequate background LRU flushing for write workloads with InnoDB compression that could lead to lower performance. Bug fixed #1295268.
  • Percona Server debug packages were not built for the previous releases. Bug fixed #1298352.
  • Queries that no longer exceed long_query_time were written to the slow query log if they matched the previous long_query_time value when slow_query_log_use_global_control variable was set to all. Bug fixed #1016991.
  • When writing audit plugins it was not possible to get notifications for general-log events without enabling the general-log. Bug fixed #1182535 (upstream #60782).
  • mysqld_safe did not correctly parse flush_caches and numa_interleave options. Bug fixed #1231110.
  • Thread Pool would handle a new client connection without notifying Audit Plugin. Bug fixed #1282008.
  • Fixed a performance issue in extending tablespaces if running under fusionIO with atomic writes enabled. Bug fixed #1286114 (Jan Lindström).
  • Previous implementation of the log_slow_rate_type set to query with log_slow_rate_limit feature would log every nth query deterministically instead of each query having a 1/n probability to get logged. Fixed by randomly selecting the queries to be logged instead of logging every nth query. Bug fixed #1287650.
  • Percona Server source files were referencing Maatkit instead of Percona Toolkit. Bug fixed #1174779.
  • Maximum allowed value for log_slow_rate_limit was ULONG_MAX (ie. either 4294967295 or 18446744073709551615, depending on the platform). As it was unreasonable to configure the slow log for every four billionth session/query, new maximum allowed value is set to 1000. Bug fixed #1290714.

Other bugs fixed #1295523, #1299688 (upstream #72163) and #1272732.

Release notes for Percona Server 5.6.17-65.0 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

The post Percona Server 5.6.17-65.0 is now available appeared first on MySQL Performance Blog.

Percona Server 5.5.37-35.0 is now available

May 6, 2014 - 9:47am

Percona Server version 5.5.37-35.0

Percona is glad to announce the release of Percona Server 5.5.37-35.0 on May 6th, 2014 (Downloads are available here and from the Percona Software Repositories). Based on MySQL 5.5.37, including all the bug fixes in it, Percona Server 5.5.37-35.0 is now the current stable release in the 5.5 series. All of Percona‘s software is open-source and free, all the details of the release can be found in the 5.5.37-35.0 milestone at Launchpad.

New Features:

Bugs Fixed:

  • Percona Server couldn’t be built with Bison 3.0. Bug fixed #1262439, upstream #71250.
  • Backported the upstream fix for overflow which would caused replication SQL thread to fail to execute events. Bug fixed #1070255 (upstream #67352).
  • Percona Server debug packages were not built for the previous releases. Bug fixed #1298352.
  • Queries that no longer exceed long_query_time were written to the slow query log if they matched the previous long_query_time value when slow_query_log_use_global_control variable was set to all. Bug fixed #1016991.
  • When writing audit plugins it was not possible to get notifications for general-log events without enabling the general-log. Bug fixed #1182535 (upstream #60782).
  • mysqld_safe did not correctly parse flush_caches and numa_interleave options. Bug fixed #1231110.
  • Thread Pool would handle a new client connection without notifying Audit Plugin. Bug fixed #1282008.
  • Fixed a performance issue in extending tablespaces if running under fusionIO with atomic writes enabled. Bug fixed #1286114 (Jan Lindström).
  • Previous implementation of the log_slow_rate_type set to query with log_slow_rate_limit feature would log every nth query deterministically instead of each query having a 1/n probability to get logged. Fixed by randomly selecting the queries to be logged instead of logging every nth query. Bug fixed #1287650.
  • Percona Server source files were referencing Maatkit instead of Percona Toolkit. Bug fixed #1174779.
  • The maximum allowed value for log_slow_rate_limit was ULONG_MAX (i.e. either 4294967295 or 18446744073709551615, depending on the platform). As it was unreasonable to configure the slow log for every four-billionth session/query, the new maximum allowed value is 1000. Bug fixed #1290714.

Other bugs fixed: #1272732.
Release notes for Percona Server 5.5.37-35.0 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

The post Percona Server 5.5.37-35.0 is now available appeared first on MySQL Performance Blog.

Row-based replication, MySQL 5.6 upgrades and temporal data types

May 6, 2014 - 8:31am
Whither your rollback plan?

MySQL 5.6 upgrades are in full swing these days and knowing how to safely upgrade from MySQL 5.5 to 5.6 is important. When upgrading a replication environment, it’s important that you can build a migration plan that safely allows for your upgrade with minimal risk — rollback is often a very important component to this.

For many people this means upgrading slaves first and then the master.  The strategy of an older master replicating to a newer slave is well known and has been supported in MySQL replication for a very long time.  To be specific:  you can have a MySQL 5.6 slave of a 5.5 master and this should work fine until you upgrade your master and/or promote one of the slaves to be the master.

However, there are those of us who like to live on the edge and do unsupported things. Suppose that when you cut over to that MySQL 5.6 master your application completely breaks. What would your rollback plan be? In such a case, keeping a 5.5 slave of the new 5.6 master (or perhaps a dual-master setup with 5.5 and 5.6) would be useful: it gives you something to roll back to while still retaining the data written on the 5.6 master.

What might break?

With Statement-based replication (SBR), you are generally OK with this type of setup, provided you aren’t using any MySQL 5.6-specific syntax until you no longer have any 5.5 slaves. However, with Row-based replication (RBR), things are a bit trickier, particularly when column formats change.

Now, one nice new feature of MySQL 5.6 is the improvement of the storage requirements for DATETIME fields as well as the addition of fractional second support for TIME, DATETIME, and TIMESTAMP.   This is great, but unfortunately this is a new column format that 5.5 clearly would not understand.  Does this put our 5.6 to 5.5 replication in jeopardy?    The answer is, if we’re careful, NO.

Quite simply, MySQL 5.6 supports both old and new types and mysql_upgrade does not make such a conversion on existing tables.  Only NEW tables or REBUILT tables in 5.6 will use the new format.  Any tables from 5.5 with a simple mysql_upgrade to 5.6 will still be using the old types.  For more information on how to find columns in 5.6 that are using the old format, see Ike Walker’s excellent blog post on the topic.  (Thanks Ike!)

An imperfect test

To test this out, I created a simple experiment.  I have a master and slave using RBR, both on 5.5, and I setup pt-heartbeat to update the master.  I realized that pt-heartbeat actually uses a varchar for the timestamp field — I suspect this makes multiple database support easier.  However, since pt-heartbeat’s update uses a NOW() to populate that field, I can convert it to a DATETIME:

[root@master ~]# pt-heartbeat --update --database percona --create-table

CREATE TABLE `heartbeat` (
  `ts` varchar(26) NOT NULL,
  `server_id` int(10) unsigned NOT NULL,
  `file` varchar(255) DEFAULT NULL,
  `position` bigint(20) unsigned DEFAULT NULL,
  `relay_master_log_file` varchar(255) DEFAULT NULL,
  `exec_master_log_pos` bigint(20) unsigned DEFAULT NULL,
  PRIMARY KEY (`server_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

master mysql> alter table heartbeat drop column ts, add column ts DATETIME;

slave mysql> select * from heartbeat\G
*************************** 1. row ***************************
            server_id: 1
                 file: master-bin.000002
             position: 5107583
relay_master_log_file: NULL
  exec_master_log_pos: NULL
                   ts: 2014-05-02 17:03:59
1 row in set (0.00 sec)

CREATE TABLE `heartbeat` (
  `server_id` int(10) unsigned NOT NULL,
  `file` varchar(255) DEFAULT NULL,
  `position` bigint(20) unsigned DEFAULT NULL,
  `relay_master_log_file` varchar(255) DEFAULT NULL,
  `exec_master_log_pos` bigint(20) unsigned DEFAULT NULL,
  `ts` datetime DEFAULT NULL,
  PRIMARY KEY (`server_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

So my heartbeat table now has a 5.5 DATETIME, pt-heartbeat is working properly, and the heartbeat is replicating to the slave.  Now I will upgrade my master to MySQL 5.6:

[root@master ~]# rpm -e Percona-Server-devel-55-5.5.36-rel34.2.el6.x86_64 Percona-Server-shared-55-5.5.36-rel34.2.el6.x86_64 Percona-Server-client-55-5.5.36-rel34.2.el6.x86_64 Percona-Server-server-55-5.5.36-rel34.2.el6.x86_64 --nodeps [root@master ~]# yum install Percona-Server-server-56.x86_64 ============================================================================================================== Package Arch Version Repository Size ============================================================================================================== Installing: Percona-Server-server-56 x86_64 5.6.16-rel64.2.el6 Percona 19 M Installing for dependencies: Percona-Server-client-56 x86_64 5.6.16-rel64.2.el6 Percona 6.8 M Percona-Server-shared-56 x86_64 5.6.16-rel64.2.el6 Percona 712 k Transaction Summary ============================================================================================================== Install 3 Package(s) ... [root@master ~]# service mysql start Starting MySQL (Percona Server)....... SUCCESS! [root@master ~]# mysql Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 1 Server version: 5.6.16-64.2-56-log Percona Server (GPL), Release 64.2, Revision 569 [root@master ~]# mysql_upgrade Looking for 'mysql' as: mysql Looking for 'mysqlcheck' as: mysqlcheck Running 'mysqlcheck with default connection arguments Running 'mysqlcheck with default connection arguments mysql.columns_priv OK mysql.db OK mysql.event OK mysql.func OK mysql.general_log OK mysql.help_category OK mysql.help_keyword OK mysql.help_relation OK mysql.help_topic OK mysql.host OK mysql.ndb_binlog_index OK mysql.plugin OK mysql.proc OK mysql.procs_priv OK mysql.proxies_priv OK mysql.servers OK mysql.slow_log OK mysql.tables_priv OK mysql.time_zone OK mysql.time_zone_leap_second OK mysql.time_zone_name OK mysql.time_zone_transition OK mysql.time_zone_transition_type OK mysql.user OK Running 'mysql_fix_privilege_tables'... Running 'mysqlcheck with default connection arguments Running 'mysqlcheck with default connection arguments percona.heartbeat OK OK

I can now verify that Ike’s INFORMATION_SCHEMA queries correctly detect the ‘heartbeat.ts’ column as the old format:

master mysql> select t.table_schema,t.engine,t.table_name,c.column_name,c.column_type from information_schema.tables t inner join information_schema.columns c on c.table_schema = t.table_schema and c.table_name = t.table_name left outer join information_schema.innodb_sys_tables ist on ist.name = concat(t.table_schema,'/',t.table_name) left outer join information_schema.innodb_sys_columns isc on isc.table_id = ist.table_id and isc.name = c.column_name where c.column_type in ('time','timestamp','datetime') and t.table_schema not in ('mysql','information_schema','performance_schema') and t.table_type = 'base table' and (t.engine != 'innodb' or (t.engine = 'innodb' and isc.mtype = 6)) order by t.table_schema,t.table_name,c.column_name; +--------------+--------+------------+-------------+-------------+ | table_schema | engine | table_name | column_name | column_type | +--------------+--------+------------+-------------+-------------+ | percona | InnoDB | heartbeat | ts | datetime | +--------------+--------+------------+-------------+-------------+ 1 row in set (0.04 sec)

To make replication work from MySQL 5.6 to 5.5, I also had to add a few backwards compatibility options on the master:

log_bin_use_v1_row_events = ON
binlog_checksum = NONE
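If you want these settings to survive a restart, the same two options can go into the master’s my.cnf; a minimal sketch using only the options shown above:

[mysqld]
log_bin_use_v1_row_events = ON
binlog_checksum           = NONE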

Once I fixed that up, I can verify my slave is still working after this and receiving heartbeats. Clearly the new formats are not a show-stopper for backwards replication compatibility.

slave mysql> show slave status\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.70.2 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: master-bin.000005 Read_Master_Log_Pos: 120 Relay_Log_File: slave-relay-bin.000002 Relay_Log_Pos: 267 Relay_Master_Log_File: master-bin.000005 Slave_IO_Running: Yes Slave_SQL_Running: Yes master mysql> select * from heartbeat; +-----------+-------------------+----------+-----------------------+---------------------+---------------------+ | server_id | file | position | relay_master_log_file | exec_master_log_pos | ts | +-----------+-------------------+----------+-----------------------+---------------------+---------------------+ | 1 | master-bin.000002 | 5115935 | NULL | NULL | 2014-05-02 17:04:23 | +-----------+-------------------+----------+-----------------------+---------------------+---------------------+ 1 row in set (0.01 sec) slave mysql> select * from heartbeat; +-----------+-------------------+----------+-----------------------+---------------------+---------------------+ | server_id | file | position | relay_master_log_file | exec_master_log_pos | ts | +-----------+-------------------+----------+-----------------------+---------------------+---------------------+ | 1 | master-bin.000002 | 5115935 | NULL | NULL | 2014-05-02 17:04:23 | +-----------+-------------------+----------+-----------------------+---------------------+---------------------+ 1 row in set (0.00 sec)

But, if I’m not careful on MySQL 5.6, and rebuild the table, the new format does clearly bite me:

master mysql> set sql_log_bin=0; Query OK, 0 rows affected (0.00 sec) master mysql> alter table percona.heartbeat force; Query OK, 1 row affected, 1 warning (0.18 sec) Records: 1 Duplicates: 0 Warnings: 1 master mysql> show warnings; +-------+------+-------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+-------------------------------------------------------------------------------------+ | Note | 1880 | TIME/TIMESTAMP/DATETIME columns of old format have been upgraded to the new format. | +-------+------+-------------------------------------------------------------------------------------+ 1 row in set (0.00 sec) slave mysql> show slave status\G *************************** 1. row *************************** ... Slave_IO_Running: Yes Slave_SQL_Running: No ... Last_Errno: 1677 Last_Error: Column 5 of table 'percona.heartbeat' cannot be converted from type '' to type 'datetime' ... Last_SQL_Errno: 1677 Last_SQL_Error: Column 5 of table 'percona.heartbeat' cannot be converted from type '' to type 'datetime' Replicate_Ignore_Server_Ids: Master_Server_Id: 1 1 row in set (0.00 sec)

TL;DR

What does all this teach us?

While the MySQL version is important, for RBR what matters most is the actual current format for each column.  Your master and slave(s) MUST have the same column formats for RBR to work right.

So, the new temporal formats do not necessarily break RBR replication back to 5.5, provided:

  • All base MySQL 5.6 enhancements to replication are disabled (binlog checksums and the RBR v2 format)
  • Tables with temporal formats are preserved in their 5.5 formats until all 5.5 nodes are retired.
  • You avoid creating any new tables with the new temporal formats on the MySQL 5.6 master

However, I want  to make it clear that MySQL 5.6 to 5.5 replication is technically unsupported.  I have not exhausted all possibilities for problems with 5.6 to 5.5 RBR replication, just this specific one. If you choose to make an upgrade strategy that relies on backwards replication in this way, be prepared for it to not work and test it thoroughly in advance.  The purpose of this post is to simply point out that data type formats, in and of themselves, do not necessarily break RBR backwards compatibility.

The post Row-based replication, MySQL 5.6 upgrades and temporal data types appeared first on MySQL Performance Blog.

Exploring message brokers

May 5, 2014 - 5:00am

Message brokers are not regularly covered here but are, nonetheless, important web-related technologies. Some time ago, I was asked by one of our customers to review a selection of OSS message brokers and propose a couple of good candidates. The requirements were fairly simple: behave well when there’s a large backlog of messages, be able to form a cluster, and in case of the failure of a node in a cluster, try to protect the data but never block the publishers, even though that might imply data loss. Nothing fancy regarding queues and topics management. I decided to write my findings here, before I forget…

I don’t consider myself a message broker specialist and I spent only about a day or two on each, so I may have made some big mistakes configuration-wise. I’ll take the blame if something is misconfigured or not used correctly.

RabbitMQ

RabbitMQ is a well-known and popular message broker and it has many powerful features. The documentation on the RabbitMQ web site is excellent and there are many books available. RabbitMQ is written in Erlang, not a widely used programming language but one well adapted to such tasks. The company Pivotal develops and maintains RabbitMQ. I reviewed version 3.2.2 on CentOS 6 servers.

The installation was easy: I installed Erlang version R14B from EPEL and the RabbitMQ rpm. The only small issue I had was that the server expects “127.0.0.1” to resolve in /etc/hosts, and the OpenStack VMs I used were missing that. Easy to fix. I also installed and enabled the management plugin.

The RabbitMQ configuration is set in the rabbitmq.config file, and it has tons of adjustable parameters. I used the defaults. In terms of client API, RabbitMQ supports a long list of languages, and some standard protocols, like STOMP, are available with a plugin. Queues and topics can be created either via the web interface or through the client API directly. If you have more than one node, they can be clustered, and then queues and topics can be replicated to other servers.

I created 4 queues, wrote a Ruby client and started inserting messages. I got a publishing rate of about 20k/s using multiple threads, but I hit a few stalls caused by the vm_memory_high_watermark; from my understanding, during those stalls it was writing to disk. Not exactly awesome given my requirements. Also, some part is always kept in memory even if a queue is durable, so even though I had plenty of disk space, the memory usage grew and eventually hit the vm_memory_high_watermark setting. The CPU load was pretty high during the load, between 40% and 50% on an 8-core VM.
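For reference, that watermark is adjustable in rabbitmq.config; a minimal sketch (0.6 here is only an example value, not a recommendation):

%% /etc/rabbitmq/rabbitmq.config
[
  {rabbit, [
    {vm_memory_high_watermark, 0.6}
  ]}
].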

Even though my requirements were not met, I set up a replicated queue on 2 nodes and inserted a few million objects. I killed one of the two nodes and inserts were even faster, but then… I made a mistake. I restarted the node and asked for a resync. Either I didn’t set it up correctly or the resync is poorly implemented, but it took forever and slowed down as it progressed. At 58% done, it had been running for 17h, one thread at 100%. My patience was exhausted.

So, lots of features and decent performance, but behavior not compatible with the requirements.

Kafka

Kafka was originally designed by LinkedIn, it is written in Java and it is now under the Apache project umbrella. Sometimes you look at a technology and you just say: wow, this is really done the way it should be. At least I could say that for the purpose I had. What is so special about Kafka is the architecture: it stores the messages in flat files and consumers ask for messages based on an offset. Think of it like a MySQL server (producer) saving messages (update SQL) to its binlogs and slaves (consumers) asking for messages based on an offset. The server is pretty simple and just doesn’t care about the consumers much. That simplicity makes it super fast and low on resources. Old messages can be retained on a time basis (like expire_logs_days) and/or on a storage usage basis.

So, if the server doesn’t keep track of what has been consumed on each topic, how can you have multiple consumers? The missing element here is Zookeeper. The Kafka server uses Zookeeper for cluster membership and routing, while the consumers can also use Zookeeper or something else for synchronization. The sample consumer provided with the server uses Zookeeper, so you can launch many instances and they’ll synchronize automatically. For those who don’t know Zookeeper: it is a highly available, synchronous, distributed storage system. If you know Corosync, it provides somewhat the same functionality.

Feature-wise, Kafka isn’t that great. There’s no web frontend built in, although a few are available in the ecosystem. Routing and rules are nonexistent and stats are only exposed through JMX. But the performance… I reached a publishing speed of 165k messages/s over a single thread; I didn’t bother tuning for more. Consuming was essentially disk-bound on the server, 3M messages/s… amazing. That was without Zookeeper coordination. Memory and CPU usage were modest.
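To give an idea of how little ceremony is involved, the console tools that ship with Kafka 0.8 are enough to push and pull messages for this kind of quick test; a rough sketch (host, port and topic name are placeholders, and the script names assume a 0.8.x tarball layout):

# Publish lines from a file (or stdin) to a topic
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test < messages.txt

# Consume the topic from the beginning, using Zookeeper for offset coordination
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning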

To test clustering, I created a replicated queue, inserted a few messages, stopped a replica, inserted a few million more messages and restarted the replica. It took only a few seconds to resync.

So, Kafka is a very good fit for the requirements: stellar performance, low resource usage and a nice match with what was asked for.

ActiveMQ

ActiveMQ is another big player in the field with an impressive feature set. ActiveMQ is more in the RabbitMQ league than Kafka’s and, like Kafka, it is written in Java. HA can be provided by the storage backend; LevelDB supports replication, but I got some issues with it. My requirements do not call for full HA, just for making sure the publishers are never blocked, so I dropped the storage backend replication in favor of a mesh of brokers.

My understanding of the mesh of brokers is that you connect to one of the members and you publish or consume a message. You don’t know on which node(s) the queue is located, the broker you connect to knows and routes your request. To further help, you can specify all the brokers on the connection string and the client library will just reconnect to another if the one you are connected to goes down. That looks pretty good for the requirements.

With the mesh of brokers setup, I got an insert rate of about 5000 msg/s over 15 threads and a single consumer was able to read 2000 msg/s. I let it run for a while and got 150M messages. At this point though, I lost the web interface and the publishing rate was much slower.

So, a big beast: lots of features, decent performance, and on the edge of meeting the requirements.

Kestrel

Kestrel is another interesting broker, this time more like Kafka. Written in Scala, the Kestrel broker speaks the memcached protocol. Basically, the key becomes the queue name and the object is the message. Kestrel is very simple: queues are defined in a configuration file, but you can specify, per queue, storage limits, expiration and behavior when limits are reached. With a setting like “discardOldWhenFull = true”, my requirement of never blocking the publishers is easily met.
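Because the protocol really is memcached, you can publish and fetch with nothing but nc; a rough sketch (host, queue name and payload are placeholders, 22133 is Kestrel’s default port, and the byte count must match the payload length):

# Publish the 5-byte message "hello" to the queue "jobs"
printf 'set jobs 0 0 5\r\nhello\r\n' | nc kestrel-host 22133

# Fetch (and remove) the next message from "jobs"
printf 'get jobs\r\n' | nc kestrel-host 22133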

In terms of clustering, Kestrel is a bit limited, but each server can publish its availability to Zookeeper so that publishers and consumers can be informed of a missing server and adjust. Of course, if you have many Kestrel servers with the same queue defined, the consumers will need to query all of the brokers to get the messages back, and strict ordering can be a bit hard.

In terms of performance, a few simple bash scripts using nc to publish messages easily reached 10k messages/s, which is very good. The rate is stable over time and likely limited by the reconnection for each message. The presence of consumers slightly reduces the publishing rate, but nothing drastic. The only issue I had was that when a large number of messages expired the server froze for some time, but that was because I forgot to set maxExpireSweep to something like 100, so all the messages were removed in one pass.

So, fairly good impression on Kestrel, simple but works well.

Conclusion

For the requirements given by the customer, Kafka was like a natural fit. It offers a high guarantee that the service will be available and non-blocking under any circumstances. In addition, messages can easily be replicated for higher data availability. Kafka performance is just great and resource usage modest.

The post Exploring message brokers appeared first on MySQL Performance Blog.

The top 9 Percona Toolkit tools that can make your job easier: May 7 Webinar

May 5, 2014 - 3:00am

Tools for MySQL are a vital part of any deployment, so it’s important to use ones that are reliable and well-designed. Percona Toolkit is a collection of more than 30 command-line tools for MySQL, Percona Server, and MariaDB that can help database administrators automate a variety of database and system tasks. With so many available tools, however, it can be difficult knowing where to start.

For this reason I invite you to join me on Wednesday, May 7 at 10 a.m. Pacific time for a free webinar titled, “The top 9 Percona Toolkit tools that can make your job easier.” You can register directly here (you’ll also be able to download the slides and catch the recording following the webinar at that same link).

Spoiler alert!  The 9 most popular tools in Percona Toolkit are:

Percona Toolkit for MySQL is derived from Maatkit and Aspersa, two of the best-known MySQL management software utility toolkits for MySQL server administration. It has been downloaded more than 250,000 times. Some of the best-known Internet sites powered by MySQL, Percona Server, MariaDB, and Amazon RDS use tools from Percona Toolkit for MySQL to manage their databases. If you have never heard of Percona Toolkit or if you have but are still new to it, this webinar will orient you to the tools and help you get the most out of them.

I’ll also be fielding questions at the end of the webinar. I recommend registering now to reserve your spot – see you Wednesday!

The post The top 9 Percona Toolkit tools that can make your job easier: May 7 Webinar appeared first on MySQL Performance Blog.

How to identify and cure MySQL replication slave lag

May 2, 2014 - 12:00am

Here on the Percona MySQL Support team, we often see issues where a customer is complaining about replication delays – and many times the problem ends up being tied to MySQL replication slave lag. This of course is nothing new for MySQL users and we’ve had a few posts here on the MySQL Performance Blog on this topic over the years (two particularly popular posts in the past were: “Reasons for MySQL Replication Lag” and “Managing Slave Lag with MySQL Replication,” both by Percona CEO Peter Zaitsev).

In today’s post, however, I will share some new ways of identifying delays in replication – including possible causes of lagging slaves – and how to cure this problem.

How to identify Replication Delay
MySQL replication works with two threads, IO_THREAD and SQL_THREAD. IO_THREAD connects to a master, reads binary log events from the master as they come in and just copies them over to a local log file called the relay log. SQL_THREAD, on the other hand, reads events from the relay log stored locally on the replication slave (the file written by the IO thread) and then applies them as fast as possible. Whenever replication lags, it’s important to discover first whether the delay is on the slave IO_THREAD or the slave SQL_THREAD.

Normally, the I/O thread would not cause a huge replication delay, as it is just reading the binary logs from the master. However, it depends on the network connectivity and network latency – how fast that is between the servers. The slave I/O thread could be slow because of high bandwidth usage. Usually, when the slave IO_THREAD is able to read binary logs quickly enough, it copies and piles up the relay logs on the slave – which is one indication that the slave IO_THREAD is not the culprit of slave lag.

On the other hand, when the slave SQL_THREAD is the source of replication delays, it is probably because queries coming from the replication stream are taking too long to execute on the slave. This is sometimes because of different hardware between master and slave, different schema indexes, or workload. Moreover, the slave OLTP workload sometimes causes replication delays because of locking: for instance, a long-running read against a MyISAM table can block the SQL thread, or a transaction against an InnoDB table can create an IX lock and block DDL in the SQL thread. Also, take into account that the slave is single-threaded prior to MySQL 5.6, which is another reason for delays on the slave SQL_THREAD.

Let me show you, via a master status/slave status example, how to identify whether the slave is lagging on the slave IO_THREAD or the slave SQL_THREAD.

mysql-master> SHOW MASTER STATUS; +------------------+--------------+------------------+------------------------------------------------------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +------------------+--------------+------------------+------------------------------------------------------------------+ | mysql-bin.018196 | 15818564 | | bb11b389-d2a7-11e3-b82b-5cf3fcfc8f58:1-2331947 | +------------------+--------------+------------------+------------------------------------------------------------------+ mysql-slave> SHOW SLAVE STATUS\G *************************** 1. row *************************** Slave_IO_State: Queueing master event to the relay log Master_Host: master.example.com Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.018192 Read_Master_Log_Pos: 10050480 Relay_Log_File: mysql-relay-bin.001796 Relay_Log_Pos: 157090 Relay_Master_Log_File: mysql-bin.018192 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 5395871 Relay_Log_Space: 10056139 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 230775 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 2 Master_UUID: bb11b389-d2a7-11e3-b82b-5cf3fcfc8f58:2-973166 Master_Info_File: /var/lib/mysql/i1/data/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Reading event from the relay log Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: bb11b389-d2a7-11e3-b82b-5cf3fcfc8f58:2-973166 Executed_Gtid_Set: bb11b389-d2a7-11e3-b82b-5cf3fcfc8f58:2-973166, ea75c885-c2c5-11e3-b8ee-5cf3fcfc9640:1-1370 Auto_Position: 1

This clearly suggests that the slave IO_THREAD is lagging, and because of that the slave SQL_THREAD is lagging too, which yields replication delays. As you can see, the master’s log file is mysql-bin.018196 (File parameter from master status) and the slave IO_THREAD is on mysql-bin.018192 (Master_Log_File from slave status), which indicates the slave IO_THREAD is reading from that file while the master is writing to mysql-bin.018196, so the slave IO_THREAD is behind by 4 binlogs. Meanwhile, the slave SQL_THREAD is reading from the same file, i.e. mysql-bin.018192 (Relay_Master_Log_File from slave status). This indicates that the slave SQL_THREAD is applying events fast enough, but it’s lagging too, which can be observed from the difference between Read_Master_Log_Pos and Exec_Master_Log_Pos in the show slave status output.

In general, you can calculate the slave SQL_THREAD lag from Read_Master_Log_Pos minus Exec_Master_Log_Pos, as long as the Master_Log_File and Relay_Master_Log_File parameters in the show slave status output are the same. This will give you a rough idea of how fast the slave SQL_THREAD is applying events. As mentioned above, the slave IO_THREAD is lagging in this example, so of course the slave SQL_THREAD is behind too. You can read a detailed description of the show slave status output fields here.
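As a quick worked example from the output above: Read_Master_Log_Pos - Exec_Master_Log_Pos = 10050480 - 5395871 = 4654609, i.e. roughly 4.6MB of events within mysql-bin.018192 that the SQL_THREAD still has to apply (the calculation is valid here because Master_Log_File and Relay_Master_Log_File point to the same file).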

Also, the Seconds_Behind_Master parameter shows a huge delay in seconds. However, this can be misleading, because it only measures the difference between the timestamp of the relay log event most recently executed versus the relay log entry most recently downloaded by the IO_THREAD. If there are more binlogs on the master, the slave doesn’t figure them into the calculation of Seconds_Behind_Master. You can get a more accurate measure of slave lag using pt-heartbeat from Percona Toolkit. So, we learned how to check replication delays – whether it’s the slave IO_THREAD or the slave SQL_THREAD. Now, let me provide some tips and suggestions on what exactly is causing this delay.

Tips and Suggestions What Causing Replication Delay & Possible Fixes
Usually, the slave IO_THREAD is behind because of a slow network between master and slave. Most of the time, enabling slave_compressed_protocol helps mitigate slave IO_THREAD lag. Another suggestion is to disable binary logging on the slave, as it’s IO-intensive too, unless you require it for point-in-time recovery.
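A minimal sketch of enabling the compressed protocol on the slave (the setting only takes effect once the IO thread reconnects, hence the stop/start; add it to my.cnf as well if it should survive a restart):

slave mysql> STOP SLAVE;
slave mysql> SET GLOBAL slave_compressed_protocol = ON;
slave mysql> START SLAVE;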

To minimize slave SQL_THREAD lag, focus on query optimization. My recommendation is to enable the configuration option log_slow_slave_statements so that the queries executed by slave that take more than long_query_time will be logged to the slow log. To gather more information about query performance, I would also recommend setting the configuration option log_slow_verbosity to “full”.

This way we can see if there are queries executed by the slave SQL_THREAD that are taking a long time to complete. You can follow my previous post about how to enable the slow query log for a specific time period with the mentioned options here. As a reminder, log_slow_slave_statements was first introduced as a variable in Percona Server 5.1 and is part of vanilla MySQL from version 5.6.11; in the upstream version of MySQL Server, log_slow_slave_statements was originally introduced as a command-line option (details can be found here), while log_slow_verbosity is a Percona Server-specific feature.
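A minimal my.cnf sketch of those slow-log options for a Percona Server slave (the threshold is just an example value; log_slow_verbosity is Percona Server-specific):

[mysqld]
slow_query_log            = 1
long_query_time           = 1
log_slow_slave_statements = 1
log_slow_verbosity        = full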

Another reason for delay on the slave SQL_THREAD, if you use row-based binlog format, is that if any of your database tables is missing a primary key or unique key, the slave will scan all rows of the table for each DML statement, causing replication delays – so make sure all your tables have a primary key or unique key. Check this bug report for details: http://bugs.mysql.com/bug.php?id=53375. You can use the query below on the slave to identify which database tables are missing a primary or unique key.

mysql> SELECT t.table_schema,t.table_name,engine FROM information_schema.tables t INNER JOIN information_schema .columns c on t.table_schema=c.table_schema and t.table_name=c.table_name GROUP BY t.table_schema,t.table_name HAVING sum(if(column_key in ('PRI','UNI'), 1,0)) =0;

One improvement for this case was made in MySQL 5.6, where an in-memory hash can be used to apply row events: the slave_rows_search_algorithms option comes to the rescue.
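A minimal sketch of enabling it on a 5.6 slave (treat this as a mitigation to test against your workload, not a substitute for proper primary keys):

slave mysql> SET GLOBAL slave_rows_search_algorithms = 'INDEX_SCAN,HASH_SCAN';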

Note that Seconds_Behind_Master is not updated while the slave reads a huge RBR event, so “lagging” may be related to just that – it has not yet completed reading the event. For example, in row-based replication huge transactions may cause delays on the slave side: if you have a 10-million-row table and you run “DELETE FROM table WHERE id < 5000000”, 5M rows will be sent to the slave, each row separately, which will be painfully slow. So, if you have to delete the oldest rows from a huge table from time to time, partitioning might be a good alternative for some kinds of workloads: instead of using DELETE, dropping the old partition may be good, and only the statement is replicated because it is a DDL operation.

To explain it better, suppose you have partition1 holding rows with IDs from 1 to 1000000, partition2 with IDs from 1000001 to 2000000, and so on. Instead of deleting via the statement “DELETE FROM table WHERE ID<=1000000;” you can run “ALTER TABLE ... DROP PARTITION partition1;” instead (see the sketch below). For ALTER partitioning operations, check the manual. Also check this wonderful post from my colleague Roman explaining possible grounds for replication delays here.
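A minimal sketch of that idea with a hypothetical table (the table, column and partition names and the ranges are illustrative only):

CREATE TABLE history (
  id      BIGINT NOT NULL,
  created DATETIME NOT NULL,
  payload VARCHAR(255),
  PRIMARY KEY (id)
)
PARTITION BY RANGE (id) (
  PARTITION partition1 VALUES LESS THAN (1000001),
  PARTITION partition2 VALUES LESS THAN (2000001),
  PARTITION pmax       VALUES LESS THAN MAXVALUE
);

-- Instead of "DELETE FROM history WHERE id <= 1000000;" (millions of row events under RBR),
-- drop the whole partition; only this DDL statement goes to the binary log:
ALTER TABLE history DROP PARTITION partition1;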

pt-stalk is one of the finest tools in Percona Toolkit; it collects diagnostic data when problems occur. You can set up pt-stalk so that whenever there is slave lag it logs diagnostic information, which we can later analyze to see what exactly is causing the lag.

Here is how you can set up pt-stalk so that it captures diagnostic data when there is slave lag:

------- pt-plug.sh contents
#!/bin/bash

trg_plugin() {
   mysqladmin $EXT_ARGV ping &> /dev/null
   mysqld_alive=$?
   if [[ $mysqld_alive == 0 ]]
   then
      seconds_behind_master=$(mysql $EXT_ARGV -e "show slave status" --vertical | grep Seconds_Behind_Master | awk '{print $2}')
      echo $seconds_behind_master
   else
      echo 1
   fi
}
# Uncomment below to test that trg_plugin function works as expected
#trg_plugin
-------

-- That's the pt-plug.sh file you would need to create and then use it as below with pt-stalk:

$ /usr/bin/pt-stalk --function=/root/pt-plug.sh --variable=seconds_behind_master --threshold=300 --cycles=60 --notify-by-email=muhammad@example.com --log=/root/pt-stalk.log --pid=/root/pt-stalk.pid --daemonize

You can adjust the threshold – currently it’s 300 seconds. Combined with --cycles, this means that if the seconds_behind_master value is >= 300 for 60 seconds or more, pt-stalk will start capturing data. Adding the --notify-by-email option will send an email notification when pt-stalk captures data. You can adjust the pt-stalk thresholds accordingly; that’s how it is triggered to collect diagnostic data during a problem.

Conclusion
A lagging slave is a tricky problem but a common issue in MySQL replication. I’ve tried to cover most aspects of replication delays in this post. Please let me know in the comments section if you know of any other reasons for replication delay.

The post How to identify and cure MySQL replication slave lag appeared first on MySQL Performance Blog.

Parallel Query for MySQL with Shard-Query

May 1, 2014 - 3:00am

While Shard-Query can work over multiple nodes, this blog post focuses on using Shard-Query with a single node. Shard-Query can add parallelism to queries which use partitioned tables. Very large tables can often be partitioned fairly easily. Shard-Query can leverage partitioning to add parallelism, because each partition can be queried independently. Because MySQL 5.6 supports the partition hint, Shard-Query can add parallelism to any partitioning method (even subpartitioning) on 5.6, but it is limited to RANGE/LIST partitioning methods on earlier versions.

The output from Shard-Query shown here is from the command-line client, but you can use MySQL proxy to communicate with Shard-Query too.

In the examples I am going to use the schema from the Star Schema Benchmark.  I generated data for scale factor 10, which means about 6GB of data in the largest table. I am going to show a few different queries, and explain how Shard-Query executes them in parallel.

Here is the DDL for the lineorder table, which I will use for the demo queries:

CREATE TABLE IF NOT EXISTS lineorder
(
  LO_OrderKey bigint not null,
  LO_LineNumber tinyint not null,
  LO_CustKey int not null,
  LO_PartKey int not null,
  LO_SuppKey int not null,
  LO_OrderDateKey int not null,
  LO_OrderPriority varchar(15),
  LO_ShipPriority char(1),
  LO_Quantity tinyint,
  LO_ExtendedPrice decimal,
  LO_OrdTotalPrice decimal,
  LO_Discount decimal,
  LO_Revenue decimal,
  LO_SupplyCost decimal,
  LO_Tax tinyint,
  LO_CommitDateKey int not null,
  LO_ShipMode varchar(10),
  primary key(LO_OrderDateKey,LO_PartKey,LO_SuppKey,LO_Custkey,LO_OrderKey,LO_LineNumber)
) PARTITION BY HASH(LO_OrderDateKey) PARTITIONS 8;

Notice that the lineorder table is partitioned by HASH(LO_OrderDateKey) into 8 partitions.  I used 8 partitions and my test box has 4 cores. It does not hurt to have more partitions than cores. A number of partitions that is two or three times the number of cores generally works best because it keeps each partition small, and smaller partitions are faster to scan. If you have a very large table, a larger number of partitions may be acceptable. Shard-Query will submit a query to Gearman for each partition, and the number of Gearman workers controls the parallelism.

The SQL for the first demo is:

SELECT COUNT(DISTINCT LO_OrderDateKey) FROM lineorder;

Here is the explain from regular MySQL:

mysql> explain select count(distinct LO_OrderDateKey) from lineorder\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: lineorder type: index possible_keys: PRIMARY key: PRIMARY key_len: 25 ref: NULL rows: 58922188 Extra: Using index 1 row in set (0.00 sec)

 

So it is basically a full table scan. It takes a long time:

mysql> select count(distinct LO_OrderDateKey) from lineorder; +---------------------------------+ | count(distinct LO_OrderDateKey) | +---------------------------------+ | 2406 | +---------------------------------+ 1 row in set (4 min 48.63 sec)

 

Shard-Query executes this query differently from MySQL. It sends a query to each partition, in parallel like the following queries:

Array
(
    [0] => SELECT LO_OrderDateKey AS expr_2839651562 FROM lineorder PARTITION(p0) AS `lineorder` WHERE 1=1 AND 1=1 GROUP BY LO_OrderDateKey
    [1] => SELECT LO_OrderDateKey AS expr_2839651562 FROM lineorder PARTITION(p1) AS `lineorder` WHERE 1=1 AND 1=1 GROUP BY LO_OrderDateKey
    [2] => SELECT LO_OrderDateKey AS expr_2839651562 FROM lineorder PARTITION(p2) AS `lineorder` WHERE 1=1 AND 1=1 GROUP BY LO_OrderDateKey
    [3] => SELECT LO_OrderDateKey AS expr_2839651562 FROM lineorder PARTITION(p3) AS `lineorder` WHERE 1=1 AND 1=1 GROUP BY LO_OrderDateKey
    [4] => SELECT LO_OrderDateKey AS expr_2839651562 FROM lineorder PARTITION(p4) AS `lineorder` WHERE 1=1 AND 1=1 GROUP BY LO_OrderDateKey
    [5] => SELECT LO_OrderDateKey AS expr_2839651562 FROM lineorder PARTITION(p5) AS `lineorder` WHERE 1=1 AND 1=1 GROUP BY LO_OrderDateKey
    [6] => SELECT LO_OrderDateKey AS expr_2839651562 FROM lineorder PARTITION(p6) AS `lineorder` WHERE 1=1 AND 1=1 GROUP BY LO_OrderDateKey
    [7] => SELECT LO_OrderDateKey AS expr_2839651562 FROM lineorder PARTITION(p7) AS `lineorder` WHERE 1=1 AND 1=1 GROUP BY LO_OrderDateKey
)

You will notice that there is one query for each partition.  Those queries will be sent to Gearman and executed in parallel by as many Gearman workers as possible (in this case 4.)  The output of the queries go into a coordinator table, and then another query does a final aggregation.  That query looks like this:

SELECT COUNT(distinct expr_2839651562) AS `count` FROM `aggregation_tmp_73522490`

The Shard-Query time:

select count(distinct LO_OrderDateKey) from lineorder; Array ( [count ] => 2406 ) 1 rows returned Exec time: 0.10923719406128

That isn’t a typo, it really is sub-second compared to minutes in regular MySQL.

This is because Shard-Query uses GROUP BY to answer this query and a  loose index scan of the PRIMARY KEY is possible:

mysql> explain partitions SELECT LO_OrderDateKey AS expr_2839651562 -> FROM lineorder PARTITION(p7) AS `lineorder` WHERE 1=1 AND 1=1 GROUP BY LO_OrderDateKey -> \G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: lineorder partitions: p7 type: range possible_keys: PRIMARY key: PRIMARY key_len: 4 ref: NULL rows: 80108 Extra: Using index for group-by 1 row in set (0.00 sec)

Next another simple query will be tested, first on regular MySQL:

mysql> select count(*) from lineorder; +----------+ | count(*) | +----------+ | 59986052 | +----------+ 1 row in set (4 min 8.70 sec)

Again, the EXPLAIN shows a full table scan:

mysql> explain select count(*) from lineorder\G *************************** 1. row *************************** id: 1 select_type: SIMPLE table: lineorder type: index possible_keys: NULL key: PRIMARY key_len: 25 ref: NULL rows: 58922188 Extra: Using index 1 row in set (0.00 sec)

Now, Shard-Query can’t do anything special to speed up this query, except to execute it in parallel, similar to the first query:

[0] => SELECT COUNT(*) AS expr_3190753946 FROM lineorder PARTITION(p0) AS `lineorder` WHERE 1=1 AND 1=1
[1] => SELECT COUNT(*) AS expr_3190753946 FROM lineorder PARTITION(p1) AS `lineorder` WHERE 1=1 AND 1=1
[2] => SELECT COUNT(*) AS expr_3190753946 FROM lineorder PARTITION(p2) AS `lineorder` WHERE 1=1 AND 1=1
[3] => SELECT COUNT(*) AS expr_3190753946 FROM lineorder PARTITION(p3) AS `lineorder` WHERE 1=1 AND 1=1
...

The aggregation SQL is similar, but this time the aggregate function is changed to SUM to combine the COUNT from each partition:

SELECT SUM(expr_3190753946) AS ` count ` FROM `aggregation_tmp_51969525`

And the query is quite a bit faster, at 140.24 seconds compared with MySQL’s 248.7-second result:

Array ( [count ] => 59986052 ) 1 rows returned Exec time: 140.24419403076

Finally, I want to look at a more complex query that uses joins and aggregation.

mysql> explain select d_year, c_nation, sum(lo_revenue - lo_supplycost) as profit from lineorder join dim_date on lo_orderdatekey = d_datekey join customer on lo_custkey = c_customerkey join supplier on lo_suppkey = s_suppkey join part on lo_partkey = p_partkey where c_region = 'AMERICA' and s_region = 'AMERICA' and (p_mfgr = 'MFGR#1' or p_mfgr = 'MFGR#2') group by d_year, c_nation order by d_year, c_nation; +----+-------------+-----------+--------+---------------+---------+---------+--------------------------+------+---------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-----------+--------+---------------+---------+---------+--------------------------+------+---------------------------------+ | 1 | SIMPLE | dim_date | ALL | PRIMARY | NULL | NULL | NULL | 5 | Using temporary; Using filesort | | 1 | SIMPLE | lineorder | ref | PRIMARY | PRIMARY | 4 | ssb.dim_date.D_DateKey | 89 | NULL | | 1 | SIMPLE | supplier | eq_ref | PRIMARY | PRIMARY | 4 | ssb.lineorder.LO_SuppKey | 1 | Using where | | 1 | SIMPLE | customer | eq_ref | PRIMARY | PRIMARY | 4 | ssb.lineorder.LO_CustKey | 1 | Using where | | 1 | SIMPLE | part | eq_ref | PRIMARY | PRIMARY | 4 | ssb.lineorder.LO_PartKey | 1 | Using where | +----+-------------+-----------+--------+---------------+---------+---------+--------------------------+------+---------------------------------+ 5 rows in set (0.01 sec)

Here is the query on regular MySQL:

mysql> select d_year, c_nation, sum(lo_revenue - lo_supplycost) as profit from lineorder join dim_date on lo_orderdatekey = d_datekey join customer on lo_custkey = c_customerkey join supplier on lo_suppkey = s_suppkey join part on lo_partkey = p_partkey where c_region = 'AMERICA' and s_region = 'AMERICA' and (p_mfgr = 'MFGR#1' or p_mfgr = 'MFGR#2') group by d_year, c_nation order by d_year, c_nation; +--------+---------------+--------------+ | d_year | c_nation | profit | +--------+---------------+--------------+ | 1992 | ARGENTINA | 102741829748 | ... | 1998 | UNITED STATES | 61345891337 | +--------+---------------+--------------+ 35 rows in set (11 min 56.79 sec)

Again, Shard-Query splits up the query to run over each partition (I won’t bore you with the details) and it executes the query faster than MySQL, in 343.3 seconds compared to ~720 seconds:

Array ( [d_year] => 1998 [c_nation] => UNITED STATES [profit] => 61345891337 ) 35 rows returned Exec time: 343.29854893684

I hope you see how using Shard-Query can speed up queries without using sharding, on just a single server. All you really need to do is add partitioning.

You can get Shard-Query from GitHub at http://github.com/greenlion/swanhart-tools

Please note: Configure and install Shard-Query as normal, but simply use one node and set the column option (the shard column) to “nocolumn” or false, because you are not required to use a shard column if you are not sharding.

The post Parallel Query for MySQL with Shard-Query appeared first on MySQL Performance Blog.

Galera 3.5 for Percona XtraDB Cluster 5.6 is now available

April 30, 2014 - 10:00am

Percona is glad to announce the release of Galera 3.5 package for Percona XtraDB Cluster 5.6. Binaries are available from our software repositories. All of Percona‘s software is open-source and free, all the details about this release can be found in the Galera 3.5 milestone at Launchpad.

This update contains the following bug fixes:

  • A crash due to an off-by-one error in certification index cleanup has been fixed. Bugs fixed #1309227 and #1267507.
  • Fixed the corruption in gcache that could lead to a cluster crash. Bugs fixed #1301616 and #1152565.
  • Joining node crashed under IST when number of writesets was high and under load. Bug fixed #1284803.
  • Due to a bug in certification index cleanup, attempt to match against an empty key would cause node shutdown and possible node consistency compromise. Bug fixed #1274199.
  • New wsrep_provider_options option repl.max_ws_size has been introduced to make the maximum writeset size configurable. Bug fixed #1270921.

Documentation for Percona XtraDB Cluster is available online along with the installation and upgrade instructions. We did our best to eliminate bugs and problems during testing of the release, but this is software, so bugs are expected. If you encounter them, please report them to our bug tracking system.

The post Galera 3.5 for Percona XtraDB Cluster 5.6 is now available appeared first on MySQL Performance Blog.

ScaleArc: Real-world application testing with WordPress (benchmark test)

April 29, 2014 - 6:00am

ScaleArc recently hired Percona to perform various tests on its database traffic management product. This post is the outcome of the benchmarks carried out by me and ScaleArc co-founder and chief architect, Uday Sawant.

The goal of this benchmark was to identify ScaleArc’s overhead using a real-world application – the world’s most popular (according to Wikipedia) content management system and blog engine: WordPress.

The tests also sought to identify the benefit of caching for this type of workload. The caching parameters represent more real-life circumstances than we applied in the sysbench performance tests – the goal here was not just to saturate the cache. For this reason, we created an artificial WordPress blog with generated data.

The size of the database was roughly 4G. For this particular test, we saw that using ScaleArc introduces very little overhead and caching increased the throughput 3.5 times at peak capacity. In terms of response times, response times on queries for which we had a cache hit decreased substantially. For example, a 5-second main page load became less than 1 second when we had cache hits on certain queries. It’s a bit hard to talk about response time here in general, because WordPress itself has different requests that are associated with different costs (computationally) and which have different response times.

Test description

The pre-generated test database contained the following:

  • 100 users
  • 25 categories
  • 100,000 posts (stories)
  • 300,000 comments (3 per post)

One iteration of the load contained the following:

  • Homepage retrieval
  • 10 story (post) page retrieval
  • 3 category page retrieval
  • Log in as a random user
  • That random user posted a new story and commented on an existing post

We think that the usage pattern is close to reality – most people just visit blogs, but some write posts and comments. For the test, we used WordPress version 3.8.1. We wrote a simple shell script that could do these iterations using multiple processes. Some of this testing pattern, however, is not realistic. Some posts will always have many more comments than others, and some posts won’t have any comments at all. This test doesn’t take that nuance into account, but that doesn’t change the big picture. Choosing a random post to comment on will give us a uniform comment distribution.
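To make the iteration concrete, a rough sketch of one read-only pass is shown below (this is an illustration only, not the script we actually ran; the base URL and URL parameters are placeholders, and the login, post and comment steps are omitted):

#!/bin/bash
# One read-only iteration: home page, 10 random posts, 3 random category pages
BASE="http://wordpress.test"

curl -s -o /dev/null "$BASE/"

for i in $(seq 1 10); do
  post_id=$(shuf -i 1-100000 -n 1)
  curl -s -o /dev/null "$BASE/?p=$post_id"
done

for i in $(seq 1 3); do
  cat_id=$(shuf -i 1-25 -n 1)
  curl -s -o /dev/null "$BASE/?cat=$cat_id"
done

# For N parallel iterations, run N copies of this script in the background and wait for them.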

We measured 3 scenarios:

  • Direct connection to the database (direct_wp).
  • Connection through ScaleArc without caching.
  • Connection through ScaleArc with caching enabled.

When caching is enabled, queries belonging to comments were cached for 5 minutes, queries belonging to the home page were cached for 15 minutes, and queries belonging to stories (posts) were cached for 30 minutes.

We varied the number of parallel iterations. Each test ran for an hour.

Results for direct database connection

Threads: 1, Iterations: 180, Time[sec]: 3605 Threads: 2, Iterations: 356, Time[sec]: 3616 Threads: 4, Iterations: 780, Time[sec]: 3618 Threads: 8, Iterations: 1408, Time[sec]: 3614 Threads: 16, Iterations: 2144, Time[sec]: 3619 Threads: 32, Iterations: 2432, Time[sec]: 3646 Threads: 64, Iterations: 2368, Time[sec]: 3635 Threads: 128, Iterations: 2432, Time[sec]: 3722

The result above is the summary output of the script we used. The data shows we reach peak capacity at 32 concurrent threads.

Results for connecting through ScaleArc

Threads: 1, Iterations: 171, Time[sec]: 3604 Threads: 2, Iterations: 342, Time[sec]: 3606 Threads: 4, Iterations: 740, Time[sec]: 3619 Threads: 8, Iterations: 1304, Time[sec]: 3609 Threads: 16, Iterations: 2048, Time[sec]: 3625 Threads: 32, Iterations: 2336, Time[sec]: 3638 Threads: 64, Iterations: 2304, Time[sec]: 3678 Threads: 128, Iterations: 2304, Time[sec]: 3675

The results are almost identical. Because a typical query in this example is quite expensive, the overhead of ScaleArc here is barely measurable.

Results for connecting through ScaleArc with caching enabled

Threads: 1, Iterations: 437, Time[sec]: 3601 Threads: 2, Iterations: 886, Time[sec]: 3604 Threads: 4, Iterations: 1788, Time[sec]: 3605 Threads: 8, Iterations: 3336, Time[sec]: 3600 Threads: 16, Iterations: 6880, Time[sec]: 3606 Threads: 32, Iterations: 8832, Time[sec]: 3600 Threads: 64, Iterations: 9024, Time[sec]: 3614 Threads: 128, Iterations: 8576, Time[sec]: 3630

Caching improved response time even for a single thread. At 32 threads, we see more than 3.5x improvement in throughput. Caching is a great help here for the same reason the overhead is barely measurable: the queries are more expensive in general, so more resources are spared when they are not run.

Throughput
From the web server’s access log, we created a per-second throughput graph. We are talking about requests per second here. Please note that the variance is relatively high, because the requests are not identical – retrieving the main page is a different request and has a different cost than retrieving a story page.

The red and blue dots are literally plotted on top of each other – the green bar is always on top of them. The green ones have a greater variance because even though we had caching enabled during the test, we used more realistic TTLs in this cache, so cached items did actually expire during the test. When the cache was expired, requests took longer, so the throughput was lower. When the cache was populated, requests took a shorter amount of time, so the throughput was higher.

CPU utilization

CPU utilization characteristics are pretty much the same on the left and right sides (direct connection on the left and ScaleArc without caching on the right). In the middle, we can see that the web server’s CPU gets completely utilized sooner with caching. Because data comes faster from the cache, it serves more requests, which costs more resources computationally. On the other hand, the database server’s CPU utilization is significantly lower when caching is used. The bar is on the top on the left and right sides – in the middle, we have bars both at the top and at the bottom. The test is utilizing the database server’s CPU completely only when we hit cache misses.

Because ScaleArc serves the cache hits, and these requests are not hitting the database, the database is not used at all when requests are served from the cache. In the case of tests with caching on, the bottleneck became the web server, which is a component that is a lot easier to scale than the database.

There are two more key points to take away here. First, regardless of whether caching is turned on or off, this workload is not too much for ScaleArc. Second, the client we ran the measurement scripts on was not the bottleneck.

Conclusion
The goal of these benchmarks was to show that ScaleArc has very little overhead and that caching can be beneficial for a real-world application that has a “read mostly” workload with relatively expensive reads (expensive meaning that the network round trip is not a significant contributor to the read’s response time). A blog is exactly that type – typically, more people are visiting than commenting. The test showed that ScaleArc is capable of supporting this scenario well, delivering 3.5x throughput at peak capacity. It’s worth mentioning that if this system needs to be scaled, more web servers could be added, as well as more read slaves. Those read slaves can take up read queries via a WordPress plugin which allows this, or via ScaleArc’s read-write splitting facility (it treats autocommit selects as reads); in the latter case, the caching benefit is present for the slaves as well.

The post ScaleArc: Real-world application testing with WordPress (benchmark test) appeared first on MySQL Performance Blog.

OOM relation to vm.swappiness=0 in new kernel

April 28, 2014 - 7:52am

I have recently been involved in diagnosing the reasons behind OOM invocation that would kill the MySQL server process. Of course these servers were primarily running MySQL. As such the MySQL server process was the one with the largest amount of memory allocated.

But the strange thing was that in all the cases, there was no swapping activity seen and there were enough pages in the page cache. Ironically all of these servers were CentOS 6.4 running kernel version 2.6.32-358. Another commonality was the fact that vm.swappiness was set to 0. This is a pretty much standard practice and one that is applied on nearly every server that runs MySQL.

Looking into this further, I realized that there was a change introduced in kernel 3.5-rc1 that altered the swapping behavior when “vm.swappiness=0”.

Below is the description of the commit that changed “vm.swappiness=0” behavior, together with the diff:

$ git show fe35004fbf9eaf67482b074a2e032abb9c89b1dd
commit fe35004fbf9eaf67482b074a2e032abb9c89b1dd
Author: Satoru Moriya <satoru.moriya@hds.com>
Date:   Tue May 29 15:06:47 2012 -0700

    mm: avoid swapping out with swappiness==0

    Sometimes we'd like to avoid swapping out anonymous memory.  In
    particular, avoid swapping out pages of important process or process
    groups while there is a reasonable amount of pagecache on RAM so that we
    can satisfy our customers' requirements.

    OTOH, we can control how aggressive the kernel will swap memory pages
    with /proc/sys/vm/swappiness for global and
    /sys/fs/cgroup/memory/memory.swappiness for each memcg.

    But with current reclaim implementation, the kernel may swap out even if
    we set swappiness=0 and there is pagecache in RAM.

    This patch changes the behavior with swappiness==0.  If we set
    swappiness==0, the kernel does not swap out completely (for global
    reclaim until the amount of free pages and filebacked pages in a zone
    has been reduced to something very very small (nr_free + nr_filebacked
    < high watermark)).

    Signed-off-by: Satoru Moriya <satoru.moriya@hds.com>
    Acked-by: Minchan Kim <minchan@kernel.org>
    Reviewed-by: Rik van Riel <riel@redhat.com>
    Acked-by: Jerome Marchand <jmarchan@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 67a4fd4..ee97530 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1761,10 +1761,10 @@ static void get_scan_count(struct mem_cgroup_zone *mz, struct scan_control *sc,
      * proportional to the fraction of recently scanned pages on
      * each list that were recently referenced and in active use.
      */
-    ap = (anon_prio + 1) * (reclaim_stat->recent_scanned[0] + 1);
+    ap = anon_prio * (reclaim_stat->recent_scanned[0] + 1);
     ap /= reclaim_stat->recent_rotated[0] + 1;

-    fp = (file_prio + 1) * (reclaim_stat->recent_scanned[1] + 1);
+    fp = file_prio * (reclaim_stat->recent_scanned[1] + 1);
     fp /= reclaim_stat->recent_rotated[1] + 1;
     spin_unlock_irq(&mz->zone->lru_lock);

@@ -1777,7 +1777,7 @@ out:
         unsigned long scan;

         scan = zone_nr_lru_pages(mz, lru);
-        if (priority || noswap) {
+        if (priority || noswap || !vmscan_swappiness(mz, sc)) {
             scan >>= priority;
             if (!scan && force_scan)
                 scan = SWAP_CLUSTER_MAX;

This change was merged into the RHEL kernel 2.6.32-303:

* Mon Aug 27 2012 Jarod Wilson <jarod@redhat.com> [2.6.32-303.el6]
...
- [mm] avoid swapping out with swappiness==0 (Satoru Moriya) [787885]

This obviously changed the way we think about “vm.swappiness=0”. Previously, setting this to 0 was thought to reduce the tendency to swap userland processes, but not to disable that completely. As such, it was expected to see a little swapping instead of OOM.

This applies to all RHEL/CentOS kernels > 2.6.32-303 and to other distributions that provide newer kernels, such as Debian and Ubuntu, or to any other distribution where this change has been backported, as in RHEL.

Let me share with you the memory zone-related statistics that were logged to the system log during one of the OOM events.

Mar 11 11:01:45 db01 kernel: Node 0 DMA free:15680kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15284kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Mar 11 11:01:45 db01 kernel: Node 0 DMA32 free:45448kB min:25140kB low:31424kB high:37708kB active_anon:1741812kB inactive_anon:520348kB active_file:4792kB inactive_file:462576kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3072160kB mlocked:0kB dirty:386328kB writeback:76268kB mapped:936kB shmem:0kB slab_reclaimable:20420kB slab_unreclaimable:6964kB kernel_stack:0kB pagetables:572kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:142592 all_unreclaimable? no
Mar 11 11:01:45 db01 kernel: Node 0 Normal free:42436kB min:42316kB low:52892kB high:63472kB active_anon:3041852kB inactive_anon:643624kB active_file:340156kB inactive_file:1003512kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:5171200kB mlocked:0kB dirty:979444kB writeback:22040kB mapped:15616kB shmem:180kB slab_reclaimable:41052kB slab_unreclaimable:35996kB kernel_stack:2720kB pagetables:19912kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:31552 all_unreclaimable? no

As can be seen, the amount of free memory and the amount of memory in the page cache were greater than the high watermark, which prevented any swapping activity. Yet unnecessary memory pressure caused OOM to be invoked, which killed the MySQL server process.

MySQL getting OOM’ed is bad for many reasons and can have an undesirable impact, such as causing the loss of uncommitted transactions or of transactions not yet flushed to the log (because of innodb_flush_log_at_trx_commit=0), or a much heavier impact because of cold caches upon restart.

I prefer the old behavior of vm.swappiness, and as such I now set it to a value of “1”. Setting vm.swappiness=0 would mean that you now have to be much more accurate in how you configure the size of the various global and session buffers.
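For reference, a minimal sketch of applying and persisting that value on a typical Linux box:

# Apply the setting at runtime
sysctl -w vm.swappiness=1

# Persist it across reboots
echo 'vm.swappiness = 1' >> /etc/sysctl.conf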

The post OOM relation to vm.swappiness=0 in new kernel appeared first on MySQL Performance Blog.
