
Latest MySQL Performance Blog posts

Percona's MySQL & InnoDB performance and scalability blog

Percona XtraBackup 2.1.7 is now available

January 24, 2014 - 12:50am

Percona is glad to announce the release of Percona XtraBackup 2.1.7 on January 24th, 2014. Downloads are available from our download site here and Percona Software Repositories.

This release is the current GA (Generally Available) stable release in the 2.1 series. Percona XtraBackup is open source, free MySQL hot backup software that performs non-blocking backups for InnoDB and XtraDB databases.

New Features

  • Percona XtraBackup has now been rebased on MySQL versions 5.1.73, 5.5.35, 5.6.15 and Percona Server versions 5.1.73-rel14.11 and 5.5.35-rel33.0.

Bugs Fixed

  • After being rebased on MySQL 5.6.11, Percona XtraBackup was affected by the upstream bug #69780. This was fixed by rebasing Percona XtraBackup on MySQL 5.6.15, which contains a fix for the upstream bug. Bug fixed #1203669.
  • A backup directory needed to be specified even when running innobackupex with the --help and --version options. Bug fixed #1223716.
  • When creating an incremental backup with the changed page tracking feature available in Percona Server, innobackupex would fail if the server had the ANSI_QUOTES SQL mode enabled. Bug fixed #1248331.
  • When innobackupex --galera-info is specified and Percona XtraDB Cluster is using GTID replication (version 5.6 only), innobackupex will execute FLUSH BINARY LOGS and then carry the current binary log as indicated in SHOW MASTER STATUS into the backup set. Bug fixed #1250375.
  • Percona XtraBackup did not roll back prepared XA transactions when applying the log, a regression introduced with the original port of the Percona XtraBackup patches to 5.6. Fixed by restoring code that was lost in the port. Bug fixed #1254227.
  • Percona XtraBackup now uses libgcrypt's built-in randomization functions for setting the Initialization Vector. Bug fixed #1255300.
  • xtrabackup_56 didn’t support ALL_O_DIRECT option for innodb_flush_method in Percona Server 5.6. Bug fixed #1261877.

Other bugs fixed: #1255019, #1268325, #1265759, #1269694, #1271501.

Release notes with all the bugfixes for Percona XtraBackup 2.1.7 are available in our online documentation. All of Percona's software is open source and free; all the details of the release can be found in the 2.1.7 milestone at Launchpad. Bugs can be reported on the Launchpad bug tracker.

The post Percona XtraBackup 2.1.7 is now available appeared first on MySQL Performance Blog.

Percona Server: Improve Scalability with Percona Thread Pool

January 23, 2014 - 2:28pm

By default, the MySQL server spawns a separate thread for every client connection, and that thread processes all statements for the connection. This is the ‘one-thread-per-connection’ model. It is simple and efficient until some number of connections N is reached. Beyond that point, performance of the MySQL server degrades, mostly due to contention caused by N threads trying to access shared resources: system ones such as CPU, IO and memory, or MySQL-specific ones such as internal structures and locks. To keep the system stable and avoid performance degradation we need to limit the number of active threads, while at the same time not limiting the number of client connections. The ‘Thread Pool’ model helps us achieve that: it maps N client connections to M active threads (the ones actually performing work) while maintaining smooth and stable throughput for the MySQL server.

There are several implementations of the Thread Pool model:
- Commercial: Oracle/MySQL provides thread_pool plugin as part of the Enterprise subscription
- Open Source: The MariaDB thread_pool developed by Vladislav Vaintroub

In Percona Server we have included the latter, yet we further enhanced and improved it.

To demonstrate how the thread pool may help improve scalability, we ran a sysbench OLTP_RW workload up to 16,384 threads against the latest MySQL server, Percona Server, and Percona Server with thread_pool, for both IO-bound and CPU-bound loads, on a Dell R720 server with 16 cores/32 vCPUs.

The current thread pool implementation in Percona Server is built into the server, unlike Oracle’s commercial version, which is implemented as a plugin. To enable the thread pool in Percona Server, you simply need to specify ‘thread_handling=pool-of-threads’ in the my.cnf file (before startup/restart) and adjust the number of thread pool groups with the ‘thread_pool_size’ variable; the latter can be changed after the server starts. In our runs with thread_pool we used ‘thread_pool_size=36’.
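Put concretely, the my.cnf fragment for these runs would look like the following; the thread_pool_size of 36 matches our 16-core/32-vCPU box, so tune it to your own hardware:

```ini
[mysqld]
# Replace the default one-thread-per-connection model with the built-in thread pool
thread_handling  = pool-of-threads
# Number of thread groups; can also be changed at runtime via SET GLOBAL
thread_pool_size = 36
```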

IO bound: sysbench dataset 32 tables/12M rows each (~100GB), InnoDB buffer pool=25GB

In-memory/CPU bound: sysbench dataset 32 tables/12M rows each (~100GB), InnoDB buffer pool=100GB

As you can see in both scenarios above, beyond 1024 threads the standalone server is not capable of keeping throughput at the same level. With thread_pool enabled, however, the throughput is quite stable and smooth up to 16,384 client connections.

Conclusion: if you regularly go over 512-1024 connections, it is definitely worth trying Percona’s thread pool implementation to protect your server from serious performance degradation due to overload.

The post Percona Server: Improve Scalability with Percona Thread Pool appeared first on MySQL Performance Blog.

FOSDEM 2014 MySQL & Friends devroom

January 23, 2014 - 12:00am

FOSDEM is a free, yearly event that offers open source communities a place to meet, share ideas and collaborate. It takes place in Brussels, Belgium each February.

The first MySQL devroom was organized there 10 years ago by David Axmark, marking the first time MySQL users got their own space to come together and discuss at the event. For this anniversary edition, Percona consultants will be present at the MySQL & Friends Community Booth, as well as giving talks in the devroom (full schedule):

* Frédéric Descamps at 12:35 with 15 Tips to improve your Galera Cluster
* Stéphane Combaudon at 15:05 with Handling failover with MySQL 5.6 and Global Transaction IDs
* Peter Boros at 17:35 with Practical sysbench

The MySQL & Friends devroom will end with its traditional Community Dinner, organized by Percona’s Liz, Kenny and Dimitri and co-sponsored by Percona, Oracle and MariaDB. Registration for the dinner is MANDATORY this year, so don’t forget to reserve your spot!

Looking forward to seeing you all on Friday night during the traditional beer event opening FOSDEM 2014!

The post FOSDEM 2014 MySQL & Friends devroom appeared first on MySQL Performance Blog.

10 MySQL settings to tune after installation

January 22, 2014 - 8:46am

When we are hired for a MySQL performance audit, we are expected to review the MySQL configuration and to suggest improvements. Many people are surprised because in most cases, we only suggest to change a few settings even though hundreds of options are available. The goal of this post is to give you a list of some of the most critical settings.

We already made such suggestions in the past here on this blog, but things have changed a lot in the MySQL world since then!

Before we start…

Even experienced people can make mistakes that can cause a lot of trouble. So before blindly applying the recommendations of this post, please keep in mind the following items:

  • Change one setting at a time! This is the only way to estimate if a change is beneficial.
  • Most settings can be changed at runtime with SET GLOBAL. It is very handy and it allows you to quickly revert the change if it creates any problem. But in the end, you want the setting to be adjusted permanently in the configuration file.
  • A change in the configuration is not visible even after a MySQL restart? Did you use the correct configuration file? Did you put the setting in the right section? (all settings in this post belong to the [mysqld] section)
  • The server refuses to start after a change: did you use the correct unit? For instance, innodb_buffer_pool_size is set in bytes (or with a unit suffix such as M or G) while max_connections is dimensionless.
  • Do not allow duplicate settings in the configuration file. If you want to keep track of the changes, use version control.
  • Don’t do naive math, like “my new server has 2x RAM, I’ll just make all the values 2x the previous ones”.
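As an example of the runtime-first workflow from the checklist above (the value 300 is purely illustrative):

```sql
-- Try the change at runtime first; it can be reverted instantly if it misbehaves
SET GLOBAL max_connections = 300;
-- Verify the new value took effect
SHOW GLOBAL VARIABLES LIKE 'max_connections';
-- Once validated, make it permanent in the [mysqld] section of the config file:
--   max_connections = 300
```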
Basic settings

Here are 3 settings that you should always look at. If you do not, you are very likely to run into problems very quickly.

innodb_buffer_pool_size: this is the #1 setting to look at for any installation using InnoDB. The buffer pool is where data and indexes are cached: having it as large as possible will ensure you use memory and not disks for most read operations. Typical values are 5-6GB (8GB RAM), 20-25GB (32GB RAM), 100-120GB (128GB RAM).

innodb_log_file_size: this is the size of the redo logs. The redo logs are used to make sure writes are fast and durable and also during crash recovery. Up to MySQL 5.1, it was hard to adjust, as you wanted both large redo logs for good performance and small redo logs for fast crash recovery. Fortunately crash recovery performance has improved a lot since MySQL 5.5 so you can now have good write performance and fast crash recovery. Until MySQL 5.5 the total redo log size was limited to 4GB (the default is to have 2 log files). This has been lifted in MySQL 5.6.

Starting with innodb_log_file_size = 512M (giving 1GB of redo logs) should give you plenty of room for writes. If you know your application is write-intensive and you are using MySQL 5.6, you can start with innodb_log_file_size = 4G.

max_connections: if you often face the ‘Too many connections’ error, max_connections is too low. Because applications frequently do not close connections to the database correctly, you may need many more than the default 151 connections. The main drawback of high values for max_connections (1,000 or more) is that the server will become unresponsive if for any reason it has to run that many active transactions. Using a connection pool at the application level or a thread pool at the MySQL level can help here.
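Putting the three basic settings together, a minimal [mysqld] fragment might look like this; the values assume a dedicated server with 32GB of RAM and must be adapted to your hardware:

```ini
[mysqld]
# Roughly 75-80% of RAM on a dedicated 32GB server
innodb_buffer_pool_size = 24G
# Two 512M redo log files = 1GB of redo space
innodb_log_file_size    = 512M
# Raise from the default 151 only as far as your workload needs
max_connections         = 300
```

Note that on versions before MySQL 5.6, changing innodb_log_file_size requires a clean shutdown and moving the old ib_logfile* files aside before restarting.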

InnoDB settings

InnoDB has been the default storage engine since MySQL 5.5 and it is much more frequently used than any other storage engine. That’s why it should be configured carefully.

innodb_file_per_table: this setting will tell InnoDB if it should store data and indexes in the shared tablespace (innodb_file_per_table = OFF) or in a separate .ibd file for each table (innodb_file_per_table= ON). Having a file per table allows you to reclaim space when dropping, truncating or rebuilding a table. It is also needed for some advanced features such as compression. However it does not provide any performance benefit. The main scenario when you do NOT want file per table is when you have a very high number of tables (say 10k+).

With MySQL 5.6, the default value is ON so you have nothing to do in most cases. For previous versions, you should set it to ON prior to loading data as it has an effect on newly created tables only.

innodb_flush_log_at_trx_commit: the default setting of 1 means that InnoDB is fully ACID compliant. It is the best value when your primary concern is data safety, for instance on a master. However it can have a significant overhead on systems with slow disks because of the extra fsyncs that are needed to flush each change to the redo logs. Setting it to 2 is a bit less reliable because committed transactions will be flushed to the redo logs only once a second, but that can be acceptable in some situations for a master and it is definitely a good value for a replica. 0 is even faster but you are more likely to lose some data in case of a crash: it is only a good value for a replica.

innodb_flush_method: this setting controls how data and logs are flushed to disk. Popular values are O_DIRECT when you have a hardware RAID controller with a battery-protected write-back cache and fdatasync (default value) for most other scenarios. sysbench is a good tool to help you choose between the 2 values.

innodb_log_buffer_size: this is the size of the buffer for transactions that have not been committed yet. The default value (1MB) is usually fine but as soon as you have transactions with large blob/text fields, the buffer can fill up very quickly and trigger extra I/O load. Look at the Innodb_log_waits status variable and if it is not 0, increase innodb_log_buffer_size.
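A quick way to check, assuming nothing beyond a standard client session:

```sql
-- A non-zero, growing value means transactions had to wait for log buffer space
SHOW GLOBAL STATUS LIKE 'Innodb_log_waits';
-- innodb_log_buffer_size is not dynamic: raise it in my.cnf (e.g. 16M) and restart
```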

Other settings

query_cache_size: the query cache is a well known bottleneck that can be seen even when concurrency is moderate. The best option is to disable it from day 1 by setting query_cache_size = 0 (now the default on MySQL 5.6) and to use other ways to speed up read queries: good indexing, adding replicas to spread the read load or using an external cache (memcache or redis for instance). If you have already built your application with the query cache enabled and if you have never noticed any problem, the query cache may be beneficial for you. So you should be cautious if you decide to disable it.

log_bin: enabling binary logging is mandatory if you want the server to act as a replication master. If so, don’t forget to also set server_id to a unique value. It is also useful for a single server when you want to be able to do point-in-time recovery: restore your latest backup and apply the binary logs. Once created, binary log files are kept forever. So if you do not want to run out of disk space, you should either purge old files with PURGE BINARY LOGS or set expire_logs_days to specify after how many days the logs will be automatically purged.

Binary logging, however, is not free, so if you do not need it, for instance on a replica that is not also acting as a master, it is recommended to keep it disabled.
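A minimal master-side fragment covering these points might look like this (the log name and retention period are illustrative):

```ini
[mysqld]
log_bin          = mysql-bin   # enable binary logging
server_id        = 1           # must be unique in the replication topology
expire_logs_days = 7           # auto-purge binary logs older than 7 days
```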

skip_name_resolve: when a client connects, the server will perform hostname resolution, and when DNS is slow, establishing the connection will become slow as well. It is therefore recommended to start the server with skip-name-resolve to disable all DNS lookups. The only limitation is that the GRANT statements must then use IP addresses only, so be careful when adding this setting to an existing system.
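Before adding the setting, it is worth checking for accounts whose host part is a hostname rather than an IP address or wildcard, since those grants would stop matching. A sketch, to be adjusted to your account layout:

```sql
-- List accounts not defined by IP address or % wildcard
SELECT user, host FROM mysql.user
WHERE host NOT IN ('localhost', '127.0.0.1', '::1')
  AND host NOT REGEXP '^[0-9.:%]+$';
```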

Conclusion

There are of course other settings that can make a difference depending on your workload or your hardware: low memory and fast disks, high concurrency, write-intensive workloads for instance are cases when you will need specific tuning. However the goal here is to allow you to quickly get a sane MySQL configuration without spending too much time on changing non-essential MySQL settings or on reading documentation to understand which settings do matter to you.

The post 10 MySQL settings to tune after installation appeared first on MySQL Performance Blog.

Past, present and future of MySQL and variants: linux.conf.au 2014 video

January 22, 2014 - 3:00am

Early this month I spoke at linux.conf.au in Perth, Australia, where I presented “Past, Present and Future of MySQL and variants.”

Here’s my presentation in its entirety. Please let me know if you have any questions or comments. (linux.conf.au is widely regarded by delegates as one of the best community run Linux conferences worldwide.)

The post Past, present and future of MySQL and variants: linux.conf.au 2014 video appeared first on MySQL Performance Blog.

Beware of MySQL 5.6 server UUID when cloning slaves

January 21, 2014 - 7:18am

The other day I was working on an issue where one of the slaves was showing unexpected lag. Interestingly, with only the IO thread running, the slave was doing significantly more IO than could be explained by the rate at which the IO thread was fetching binary log events from the master. A closer inspection showed that this was caused by the error log filling up with messages apparently suggesting that the slave IO thread was disconnecting and reconnecting repeatedly.

Below is the type of error messages that were being generated and were filling up the error log:

2014-01-18 03:33:46 22921 [Note] Slave: received end packet from server, apparent master shutdown:
2014-01-18 03:33:46 22921 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.000001' at position 4239
2014-01-18 03:33:46 22921 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2014-01-18 03:33:46 22921 [Note] Slave: received end packet from server, apparent master shutdown:
2014-01-18 03:33:46 22921 [Note] Slave I/O thread: Failed reading log event, reconnecting to retry, log 'mysql-bin.000001' at position 4239
2014-01-18 03:33:46 22921 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.

These error messages are quite vague and do not indicate the actual problem at all; they appear to point to the IO thread disconnecting and reconnecting. The rate at which these messages were being generated was close to 20MB/s, which was producing unnecessary IO load.

Importance of a Unique MySQL 5.6 Server UUID

Coming back to the problem: looking at the output of SHOW SLAVE HOSTS on the master showed what the problem was:

master [localhost] {msandbox} (test) > show slave hosts;
+-----------+----------+-------+-----------+--------------------------------------+
| Server_id | Host     | Port  | Master_id | Slave_UUID                           |
+-----------+----------+-------+-----------+--------------------------------------+
|       101 | SBslave1 | 19681 |         1 | 6c27ed6d-7ee1-11e3-be39-6c626d957cff |
+-----------+----------+-------+-----------+--------------------------------------+
1 row in set (0.00 sec)

master [localhost] {msandbox} (test) > show slave hosts;
+-----------+----------+-------+-----------+--------------------------------------+
| Server_id | Host     | Port  | Master_id | Slave_UUID                           |
+-----------+----------+-------+-----------+--------------------------------------+
|       102 | SBslave2 | 19682 |         1 | 6c27ed6d-7ee1-11e3-be39-6c626d957cff |
+-----------+----------+-------+-----------+--------------------------------------+
1 row in set (0.00 sec)

The slave hosts for the UUID "6c27ed6d-7ee1-11e3-be39-6c626d957cff" were fluctuating between "SBslave1" and "SBslave2". So the actual problem was that two slaves were running with the same server UUID, and that was confusing the master. This was caused by the new slave "SBslave2" being cloned from the slave "SBslave1". The cloning process ended up copying the file "auto.cnf", which is present in the MySQL data directory. The auto.cnf file stores the server's UUID, which is used to uniquely identify a server. This is a new feature in MySQL 5.6 and is explained very well in the MySQL manual here.

So essentially when a server starts up, it reads the UUID value from the auto.cnf file. If the auto.cnf file is not present or the UUID value cannot be read, then the server generates a new UUID. When a new UUID is generated you should see a message similar to the following in the MySQL error log:

2014-01-18 03:44:43 27365 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: 7c6858fa-7fea-11e3-84fa-6c626d957cff.

Conclusion

So the important lesson is to remember not to reuse the auto.cnf file when cloning a server, as the server UUID is an important identifier of a particular server and is used by the master for server identification. I also believe that MySQL should show a different error message that is more pertinent to the actual error; the master could in theory send a different type of event to the slave when it notices one with a duplicate UUID.
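For illustration, the extra cloning step can be sketched like this; the directory path and UUID below are stand-ins for a real data directory:

```shell
# Stand-in for the cloned data directory (on a real slave: e.g. /var/lib/mysql)
DATADIR=/tmp/cloned-datadir
mkdir -p "$DATADIR"

# The clone carried over the source server's auto.cnf and thus its UUID
printf '[auto]\nserver-uuid=6c27ed6d-7ee1-11e3-be39-6c626d957cff\n' > "$DATADIR/auto.cnf"

# Remove it before the first start: mysqld will then generate a fresh, unique UUID
rm -f "$DATADIR/auto.cnf"
```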

The post Beware of MySQL 5.6 server UUID when cloning slaves appeared first on MySQL Performance Blog.

Percona is hiring a Go back-end engineer

January 20, 2014 - 6:24pm

We’re looking for a full-time web back-end developer to join our team and help continue building the Percona Cloud Tools (https://cloud.percona.com) back end. We use the Go language, so we would love to find a Go programmer, but since Go is a very new language, we’re looking for any great programmer who wants to learn and master Go; it’s fun and easy!
We understand that Go is a new technology, so you may not have direct experience with it, but if you want to become an expert in it together with our team and participate in the development of new-generation tools for MySQL, please submit your application!

The post Percona is hiring a Go back-end engineer appeared first on MySQL Performance Blog.

Percona Replication Manager (PRM) now supporting 5.6 GTID

January 20, 2014 - 6:05am

Over the last few days, I integrated the MySQL 5.6 GTID version of the Percona Replication Manager (PRM), work of Frédéric Descamps, a colleague at Percona. The agent supports the GTID replication mode of MySQL 5.6: if the master suffers a hard crash, it promotes the slave that has applied the most transactions from the dead master. Given the nature of GTID-based replication, all the other slaves then resync appropriately to their new master, which is pretty cool and has yet to be matched by the regular PRM agent.

For now, it is part of a separate agent, mysql_prm56, which may be integrated with the regular agent in the future. To use it, download the agent with the link above; the pacemaker configuration is similar to that of the regular PRM agent. If you start from scratch, have a look here and, of course, replace "mysql_prm" with "mysql_prm56". Keep in mind that although it successfully ran many tests, this is the first release and there is no field experience yet. I invite you to report any issue or successful usage to PRM-discuss.

As a side note, dealing with GTID-based replication is slightly different from regular replication. I invite you to consult these posts for more details:

Replication in MySQL 5.6: GTIDs benefits and limitations – Part 1

How to create/restore a slave using GTID replication in MySQL 5.6
How to create a new (or repair a broken) GTID based slave with Percona XtraBackup
Repair MySQL 5.6 GTID replication by injecting empty transactions

The post Percona Replication Manager (PRM) now supporting 5.6 GTID appeared first on MySQL Performance Blog.

keepalived with reader and writer VIPs for Percona XtraDB Cluster

January 17, 2014 - 12:00am

This is a followup to Jay Janssen’s October post, “Using keepalived for HA on top of Percona XtraDB Cluster.” We recently got a request from a customer with a cluster of 3 nodes who wanted 2 VIPs (Virtual IP addresses), one for reads and one for writes. They wanted to keep it simple, with low latency, and without requiring an external node resource as HAProxy would.

keepalived is a simple load balancer with HA capabilities, which means it can proxy TCP services behind it and at the same time, keep itself highly available using VRRP as failover mechanism. This post is about taking advantage of the VRRP capabilities built into keepalived to intelligently manage your PXC VIPs.

While Yves Trudeau also wrote a very interesting and somewhat similar solution using ClusterIP and Pacemaker to load balance VIPs, the two have different use cases. Both solutions avoid the latency of an external proxy or load balancer, but unlike ClusterIP, connections to the desired VIP with keepalived go to a single node, which means a little less work for each node trying to decide whether it should respond to the request. ClusterIP is good if you want to spread writes across all nodes in a calculated distribution, while with our keepalived option each VIP is at best assigned to only a single node; depending on your workload, each approach has advantages and disadvantages.

The OS I used was CentOS 6.4 with keepalived 1.2.7 available in the yum repositories; however, it is difficult to troubleshoot failover behavior with VRRP_Instance weights without seeing them from keepalived directly. So I used a custom build, with a patch for the --vrrp-status option that allows me to monitor something like this:

[root@pxc01 keepalived]# keepalived --vrrp-status
VRRP Instance : writer_vip
  Interface : eth5
  Virtual Router ID : 60
  State : BACKUP
  Virtual IP address : 192.168.56.83
  Advertisement interval : 1 sec
  Preemption : Enabled, delay 0 secs
  Priority : 101
  Effective Priority : 101
  Authentication : NONE
  Master router : 192.168.56.44 priority 151
  Master down interval : 3.6
VRRP Instance : reader_vip
  Interface : eth5
  Virtual Router ID : 61
  State : MASTER
  Virtual IP address : 192.168.56.84
  Advertisement interval : 1 sec
  Preemption : Enabled, delay 0 secs
  Priority : 101
  Effective Priority : 181
  Authentication : NONE
  Master router : 192.168.56.42 (local)

So first, let’s compile keepalived from source; the GitHub branch here is where the status patch is available.

cd ~
git clone https://github.com/jonasj76/keepalived.git
cd keepalived
git checkout 5c5b2cc51760967c92b968d6e886ab6ecc2ee86d
git branch 5c5b2cc51760967c92b968d6e886ab6ecc2ee86d
./configure
make && make install

Install the custom tracker script below. Because compiling keepalived above installs it in /usr/local/bin, I put this script there as well. One might note that this script is somewhat redundant; that is true, but beware that keepalived does not validate its configuration, especially track_scripts, so I prefer to keep it in a separate bash script that I can easily debug. Of course, when all is working well, you can always merge this into the keepalived.conf file.

#!/bin/bash
# Modify these addresses to match your reader and writer VIPs
WRITER_VIP=192.168.56.83
READER_VIP=192.168.56.84
# Make sure your clustercheck script also works
PXC_CHECK='/usr/bin/clustercheck clustercheck password 0'
SCRIPT=$1
WEIGHT=101
case $SCRIPT in
  'bad_pxc')
    $PXC_CHECK || exit 1
    ;;
  'nopreempt_writer')
    [[ "$(hostname|cut -d'.' -f1)" != 'pxc01' && $(ip ad sh|grep $WRITER_VIP) && $(ip ad sh|grep $READER_VIP|grep -c inet) -eq 0 ]] || exit 1
    ;;
  'nopreempt_reader')
    [[ "$(hostname|cut -d'.' -f1)" != 'pxc02' && $(ip ad sh|grep $READER_VIP) && $(ip ad sh|grep $WRITER_VIP|grep -c inet) -eq 0 ]] || exit 1
    ;;
  'repel_writer')
    [ $(ip ad sh|grep $WRITER_VIP|grep -c inet) -eq 0 ] || exit 1
    ;;
esac
exit 0

And below is my /etc/keepalived.conf:

vrrp_script nopreempt_writer_vip {
    script "/usr/local/bin/pxc-track nopreempt_writer"
    interval 2
}
vrrp_script nopreempt_reader_vip {
    script "/usr/local/bin/pxc-track nopreempt_reader"
    interval 2
}
vrrp_script repel_writer_vip {
    script "/usr/local/bin/pxc-track repel_writer"
    interval 2
}
vrrp_script bad_pxc {
    script "/usr/local/bin/pxc-track bad_pxc"
    interval 2
}
vrrp_instance writer_vip {
    interface eth5
    state BACKUP
    virtual_router_id 60
    priority 101
    virtual_ipaddress {
        192.168.56.83
    }
    track_script {
        nopreempt_writer_vip weight 50
        bad_pxc weight -100
    }
    track_interface {
        eth5
    }
    notify_master "/bin/echo 'writer now master' > /tmp/keepalived-w.state"
    notify_backup "/bin/echo 'writer now backup' > /tmp/keepalived-w.state"
    notify_fault "/bin/echo 'writer now fault' > /tmp/keepalived-w.state"
}
vrrp_instance reader_vip {
    interface eth5
    state BACKUP
    virtual_router_id 61
    priority 101
    virtual_ipaddress {
        192.168.56.84
    }
    track_script {
        repel_writer_vip weight 30
        nopreempt_reader_vip weight 50
        bad_pxc weight -100
    }
    track_interface {
        eth5
    }
    ! This does not work properly if we stop the MySQL process
    ! VIP seems to stick on the node so we have separate nopreempt_* track_scripts
    !nopreempt
    notify_master "/bin/echo 'reader now master' > /tmp/keepalived-r.state"
    notify_backup "/bin/echo 'reader now backup' > /tmp/keepalived-r.state"
    notify_fault "/bin/echo 'reader now fault' > /tmp/keepalived-r.state"
}

There are a number of things you can change here, like removing or modifying the notify_* clauses to fit your needs or sending SMTP notifications during VIP failovers. I also prefer the initial state of the VRRP_Instances to be BACKUP instead of MASTER, letting the voting at runtime dictate where the VIPs should go.

The configuration ensures that the reader and writer will not share a single node if more than one is available in the cluster. Even though the writer VIP prefers pxc01 in my example, this does not really matter much and only makes a difference when the reader VIP is not in the picture; there is no automatic failback, thanks to the nopreempt_* track_scripts.

Now, to see it in action, after starting the cluster and keepalived in order pxc01, pxc02, pxc03, I have these statuses and weights:

[revin@forge ~]$ for h in pxc01 pxc02 pxc03; do echo "$h $(ssh root@$h 'cat /tmp/keepalived-w.state /tmp/keepalived-r.state'|xargs)"; done
pxc01 writer now master reader now backup
pxc02 writer now backup reader now master
pxc03 writer now backup reader now backup

pxc01 2014-01-15_20_58_23 writer_vip 161 reader_vip 101
pxc02 2014-01-15_20_58_28 writer_vip 101 reader_vip 131
pxc03 2014-01-15_20_58_36 writer_vip 131 reader_vip 131

The writer is on pxc01 and the reader on pxc02; even though the reader VIP score matches between pxc02 and pxc03, it remains on pxc02 because of our nopreempt_* script. Let’s see what happens if I stop MySQL on pxc02:

[revin@forge ~]$ for h in pxc01 pxc02 pxc03; do echo "$h $(ssh root@$h 'cat /tmp/keepalived-w.state /tmp/keepalived-r.state'|xargs)"; done
pxc01 writer now master reader now backup
pxc02 writer now backup reader now backup
pxc03 writer now backup reader now master

pxc01 2014-01-15_21_01_17 writer_vip 161 reader_vip 101
pxc02 2014-01-15_21_01_24 writer_vip 31 reader_vip 31
pxc03 2014-01-15_21_01_36 writer_vip 101 reader_vip 181

The reader VIP moved to pxc03 and the weights changed: the pxc02 reader dropped by 100 and the pxc03 reader gained 50; again, we set this weight higher to achieve non-preemption. Now let’s stop MySQL on pxc03:

[revin@forge ~]$ for h in pxc01 pxc02 pxc03; do echo "$h $(ssh root@$h 'cat /tmp/keepalived-w.state /tmp/keepalived-r.state'|xargs)"; done
pxc01 writer now master reader now master
pxc02 writer now backup reader now backup
pxc03 writer now backup reader now backup

pxc01 2014-01-15_21_04_43 writer_vip 131 reader_vip 101
pxc02 2014-01-15_21_04_49 writer_vip 31 reader_vip 31
pxc03 2014-01-15_21_04_56 writer_vip 31 reader_vip 31

All our VIPs are now on pxc01, let’s start MySQL on pxc02:

[revin@forge ~]$ for h in pxc01 pxc02 pxc03; do echo "$h $(ssh root@$h 'cat /tmp/keepalived-w.state /tmp/keepalived-r.state'|xargs)"; done
pxc01 writer now master reader now backup
pxc02 writer now backup reader now master
pxc03 writer now backup reader now backup

pxc01 2014-01-15_21_06_41 writer_vip 161 reader_vip 101
pxc02 2014-01-15_21_06_50 writer_vip 101 reader_vip 131
pxc03 2014-01-15_21_06_55 writer_vip 31 reader_vip 31

Our reader is back on pxc02 and the writer remains intact. When both VIPs end up on a single node (i.e. the last node standing) and a second node comes up, the reader moves, not the writer; this prevents breaking any connections that may be writing to the node currently owning the VIP.

The post keepalived with reader and writer VIPs for Percona XtraDB Cluster appeared first on MySQL Performance Blog.

FOSDEM 2014: MySQL And Friends Community Dinner

January 16, 2014 - 11:04am

FOSDEM is coming to town again here in Brussels this February 1-2, and as usual, there is a MySQL Devroom. Traditionally, a MySQL and Friends community dinner is also held during the event and this year is no different – we’ve organized one for Saturday, February 1 from 8:30 p.m. to 11:30 p.m. (CET).

If you want to sign up for the dinner, go to http://fosdemmysqlcommunitydinner2014.eventbrite.com/

This year’s edition is going to be slightly different…

We have rented a private cafeteria at a company called Mundo-B, where catering will be served by Kamilou.

The whole place is ours – there will be high tables and normal tables, and plenty of time and space to socialize and meet people, even the ones that are sitting 3 chairs away from you!

There’s also no need to wait for your food anymore. There will be a 3-course buffet and drinks are included! We’ve ensured that vegetarians also have a choice. Gone is all the hassle of deciphering bills and calculating how much money you’ve spent.

We’ve requested a typically Belgian menu to be put together for the occasion. Here is what you can expect:

Starters:

- Brussels Sprouts Salad with Mustard
- Chicory and apple salad

Mains:

- Beef stew with Grimbergen beer
- “Waterzooi” with fish, Flemish style
- Vegetarian option (to be announced)
- Fries, salad and bread included

Desserts:

- Cheesecake with “speculoos”
- Chocolate fondant

Available drinks (included):

- Choice of Belgian Beers
- Red Wine / White Wine
- Sparkling Water / Still Water

An event like this would not be possible without a very generous list of sponsors who helped us greatly in bringing the price down for each attendee and allowing us to organize it at the listed ticket price – including our own Percona, along with Oracle and MariaDB.

On that subject, we would like to point out that we will not be able to get an exact fix on the price until we have a full list of attendees, so with that in mind, the price we’ve set is based on an assumed number of attendees. If the number of people ends up lower than our target, any remaining funds will be donated to the FOSDEM organization.

The reason we charge money up front is two-fold:

1) We need to be able to give the caterer a rough estimate of the number of attendees ahead of time. Free signups tend to make that number a bit less reliable. We want to make sure there is enough food to go around for everyone attending!

2) Stress-free financials on the night of the dinner! We all know the effects of Belgian Beer on arithmetic skills. Flash your ticket, get your food.

We’re looking forward to meeting you all again at Fosdem and the Community Dinner. See you then!

Party-Squad - Lizz, Dim0, Kenny

Generously sponsored by:

Wondering how to get there from Fosdem?

The venue itself is located very close to the Royal Palace. You can find the route to get there right here.

The total distance from the ULB campus is about 3.3km, so you could walk there, but it might be more comfortable to take the bus.

- Bus 71 departs from the “ULB” or the “Jeanne” stops regularly, and will take you up to “Matonge”, where you need to get out, and walk the remaining 350m.

The post FOSDEM 2014: MySQL And Friends Community Dinner appeared first on MySQL Performance Blog.

Analyzing WordPress MySQL queries with Query Analytics

January 16, 2014 - 5:00am

This blog, MySQLPerformanceBlog.com, is powered by WordPress, but we never really looked into which MySQL queries WordPress uses. So for a couple of months we ran a Query Analytics (part of Percona Cloud Tools) agent there, and now it is interesting to take a look at the queries. Query Analytics uses reports produced by pt-query-digest, but it is quite different, as it lets you see the trends and dynamics of a particular query, in contrast to pt-query-digest, which is just one static report.

Why is looking into queries important? I gave an intro in my previous post in this series.

So Query Analytics gives a report on the top queries. How do we detect which query is “bad”?
One of the metrics I typically look into is the ratio of “Rows examined” to “Rows sent”. In an OLTP workload
I expect “Rows sent” to be close to “Rows examined”; otherwise the query handles a lot of rows (“examined”) that are not used in the final result set (“sent”), which means wasted CPU cycles and even unnecessary I/Os if the rows are not in memory.
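As a quick illustration of that rule of thumb, here is a minimal sketch that flags queries with a bad examined-to-sent ratio. The numbers and the threshold of 100 are illustrative assumptions, not Query Analytics code:

```python
# Flag queries whose "Rows examined" far exceeds "Rows sent".
def examine_to_send_ratio(rows_examined, rows_sent):
    """Rows scanned per row actually returned to the client."""
    return rows_examined / max(rows_sent, 1)  # guard against 0 rows sent

def is_wasteful(rows_examined, rows_sent, threshold=100):
    """In OLTP, examined should stay close to sent; a large ratio means
    wasted CPU cycles and possibly unnecessary I/O."""
    return examine_to_send_ratio(rows_examined, rows_sent) >= threshold

print(is_wasteful(186000, 1))   # True: a clear optimization target
print(is_wasteful(120, 100))    # False: examined is close to sent
```

The query discussed next scores about 186,000 on this ratio, which is exactly the kind of outlier this check surfaces.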

Looking at the WordPress queries, it does not take long to find one:

This one actually looks quite bad… It examines up to ~186000 rows to return 0 or, in the best case, 1 row.
The full query text is (and it is available in Query Analytics – you do not need to dig through logs to find it):

SELECT comment_ID FROM wp_comments WHERE comment_post_ID = '154' AND comment_parent = '0' AND comment_approved != 'trash' AND ( comment_author = 'poloralphlauren.redhillrecords' OR comment_author_email = 'spam@gmail.com' ) AND comment_content = 'Probabilities are in case you are like the ma spam jorityof people nowadays, you\'re f lululemonaddictoutletcanadaSale.angelasauceda ighting tooth and nail just looking to keep up together with your existence. Acquiring organized can help you win the fight. Appear to the ugg factors in just abo spam ut every of your spaces (desk, workplace, living room, bed' LIMIT 1;

We can see how the execution time of this query changes over time

and also how many rows it examines for the last month

It is clearly an upward trend, and obviously the query does not scale well as there is more and more data.
I find these trending graphs very useful, and they are available in Query Analytics as we continuously digest and analyze queries. We can see that over the last month alone, the number of rows this query examines increased from ~130K to ~180K.

So, the obvious question is how to optimize this query?

Let’s look into the EXPLAIN plan:

+----+-------------+-------------+------+----------------------------------------------------------+-----------------+---------+-------+--------+-------------+
| id | select_type | table       | type | possible_keys                                            | key             | key_len | ref   | rows   | Extra       |
+----+-------------+-------------+------+----------------------------------------------------------+-----------------+---------+-------+--------+-------------+
|  1 | SIMPLE      | wp_comments | ref  | comment_post_ID,comment_approved_date_gmt,comment_parent | comment_post_ID | 8       | const | 188482 | Using where |
+----+-------------+-------------+------+----------------------------------------------------------+-----------------+---------+-------+--------+-------------+

and SHOW CREATE TABLE

CREATE TABLE `wp_comments` (
  `comment_ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `comment_post_ID` bigint(20) unsigned NOT NULL DEFAULT '0',
  `comment_author` tinytext NOT NULL,
  `comment_author_email` varchar(100) NOT NULL DEFAULT '',
  `comment_author_url` varchar(200) NOT NULL DEFAULT '',
  `comment_author_IP` varchar(100) NOT NULL DEFAULT '',
  `comment_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `comment_date_gmt` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `comment_content` text NOT NULL,
  `comment_karma` int(11) NOT NULL DEFAULT '0',
  `comment_approved` varchar(20) NOT NULL DEFAULT '1',
  `comment_agent` varchar(255) NOT NULL DEFAULT '',
  `comment_type` varchar(20) NOT NULL DEFAULT '',
  `comment_parent` bigint(20) unsigned NOT NULL DEFAULT '0',
  `user_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`comment_ID`),
  KEY `comment_post_ID` (`comment_post_ID`),
  KEY `comment_approved_date_gmt` (`comment_approved`,`comment_date_gmt`),
  KEY `comment_date_gmt` (`comment_date_gmt`),
  KEY `comment_parent` (`comment_parent`)
)

Obviously WordPress did not design this schema to handle 180000 comments to a single post.
There are several ways to fix this; I will take the easiest way and change the key
KEY comment_post_ID (comment_post_ID)
to
KEY comment_post_ID (comment_post_ID,comment_content(300))

and it changes execution plan to

+----+-------------+-------------+------+----------------------------------------------------------+--------------+---------+-------------+------+-------------+
| id | select_type | table       | type | possible_keys                                            | key          | key_len | ref         | rows | Extra       |
+----+-------------+-------------+------+----------------------------------------------------------+--------------+---------+-------------+------+-------------+
|  1 | SIMPLE      | wp_comments | ref  | comment_post_ID,comment_approved_date_gmt,comment_parent | comment_post | 910     | const,const | 1    | Using where |
+----+-------------+-------------+------+----------------------------------------------------------+--------------+---------+-------------+------+-------------+

From 186000 rows examined down to just 1 (the key_len of 910 reflects the wider composite key) – that’s quite an improvement!
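To see why the composite key helps, here is a toy model contrasting how many rows each index forces the engine to examine. The rows are made up for illustration; this is not the real wp_comments data:

```python
# Toy model of rows examined under each index definition.
comments = [{"post_id": 154, "content": "comment %d" % i} for i in range(1000)]
comments.append({"post_id": 154, "content": "target text"})
comments.append({"post_id": 200, "content": "other post"})

# Index on (post_id) alone: every comment of post 154 must be examined;
# the content condition is checked row by row afterwards.
examined_single = [r for r in comments if r["post_id"] == 154]

# Index on (post_id, content(300)): the content prefix is part of the
# index, so only entries matching both columns are examined.
examined_composite = [r for r in comments
                      if r["post_id"] == 154
                      and r["content"][:300] == "target text"[:300]]

print(len(examined_single), len(examined_composite))  # 1001 1
```

The same filtering happens either way; the composite index simply pushes it down into the index lookup instead of the row scan.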

How does it affect execution time? Let’s let the query run for a while and check our trending graph again:

The drop from ~600ms to ~34ms

and for Rows examined:

The second query is also not too hard to find, and it is again on the wp_comments table

The query examines up to 16K rows, sending only 123 in the best case.

The query text is (this one is from a different WordPress instance, so the table structure is different):

SELECT comment_post_ID FROM wp_comments WHERE LCASE(comment_author_email) = 'spam@gmail.com' AND comment_subscribe='Y' AND comment_approved = '1' GROUP BY comment_post_ID

and EXPLAIN for this particular one

+----+-------------+-------------+------+---------------------------+---------------------------+---------+-------+------+----------------------------------------------+
| id | select_type | table       | type | possible_keys             | key                       | key_len | ref   | rows | Extra                                        |
+----+-------------+-------------+------+---------------------------+---------------------------+---------+-------+------+----------------------------------------------+
|  1 | SIMPLE      | wp_comments | ref  | comment_approved_date_gmt | comment_approved_date_gmt | 62      | const | 6411 | Using where; Using temporary; Using filesort |
+----+-------------+-------------+------+---------------------------+---------------------------+---------+-------+------+----------------------------------------------+

This table structure is

CREATE TABLE `wp_comments` (
  `comment_ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `comment_post_ID` bigint(20) unsigned NOT NULL DEFAULT '0',
  `comment_author` tinytext NOT NULL,
  `comment_author_email` varchar(100) NOT NULL DEFAULT '',
  `comment_author_url` varchar(200) NOT NULL DEFAULT '',
  `comment_author_IP` varchar(100) NOT NULL DEFAULT '',
  `comment_date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `comment_date_gmt` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  `comment_content` text NOT NULL,
  `comment_karma` int(11) NOT NULL DEFAULT '0',
  `comment_approved` varchar(20) NOT NULL DEFAULT '1',
  `comment_agent` varchar(255) NOT NULL DEFAULT '',
  `comment_type` varchar(20) NOT NULL DEFAULT '',
  `comment_parent` bigint(20) unsigned NOT NULL DEFAULT '0',
  `user_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `comment_reply_ID` int(11) NOT NULL DEFAULT '0',
  `comment_subscribe` enum('Y','N') NOT NULL DEFAULT 'N',
  `openid` tinyint(1) NOT NULL DEFAULT '0',
  PRIMARY KEY (`comment_ID`),
  KEY `comment_post_ID` (`comment_post_ID`),
  KEY `comment_date_gmt` (`comment_date_gmt`),
  KEY `comment_approved_date_gmt` (`comment_approved`,`comment_date_gmt`),
  KEY `comment_parent` (`comment_parent`)
)

There are again several ways to make the query more optimal, but let’s make it a little exercise: please
propose your solution in the comments, and for the one I like the most by the end of February, I will give my personal discount code to the Percona Live MySQL Conference and Expo 2014.

So, in conclusion:

  • Query Analytics from Percona Cloud Tools gives immediate insight as to which query requires attention
  • With continuously running reports we can see the trends and effects from our optimizations

Please also note that Query Analytics does not require you to install a MySQL proxy, third-party middleware or any kind of TCP traffic analyzer. It operates entirely on the slow log generated by MySQL, Percona Server or MariaDB (Percona Server provides much more information in the slow log than vanilla MySQL).
So try Percona Cloud Tools for free while it’s still in beta. You’ll be up and running in minutes!
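Since everything is driven by the slow log, the metrics used above are simply fields from each entry’s comment header. A minimal parsing sketch, using a hand-written example entry rather than pt-agent code:

```python
import re

# Per-query metrics live in the "#" comment header of each slow-log entry.
entry = (
    "# Query_time: 0.598  Lock_time: 0.000  Rows_sent: 1  Rows_examined: 186000\n"
    "SELECT comment_ID FROM wp_comments WHERE comment_post_ID = '154' LIMIT 1;"
)

header = entry.splitlines()[0]
metrics = dict(re.findall(r"(\w+): (\d+(?:\.\d+)?)", header))

# The examined-to-sent ratio discussed earlier in this post:
ratio = int(metrics["Rows_examined"]) / max(int(metrics["Rows_sent"]), 1)
print(ratio)  # 186000.0
```

A real analyzer aggregates these values per query fingerprint over time, which is what produces the trending graphs shown above.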

Actually, would you like me to take a look at your queries and provide advice? It is possible.
I will be running a webinar titled “Analyze MySQL Query Performance with Percona Cloud Tools” on Feb. 12, 2014; please register and see the conditions.

The post Analyzing WordPress MySQL queries with Query Analytics appeared first on MySQL Performance Blog.

Percona Cloud Tools January 2014 status update

January 16, 2014 - 12:00am

Percona Cloud Tools, a new service currently in beta, applies Percona’s MySQL expertise to identify the best opportunities for application performance improvement.

The first component of Percona Cloud Tools that’s now available is Query Analytics. Based on pt-query-digest, Query Analytics enables DBAs and application developers to analyze MySQL queries over long periods of time and shines a spotlight on problems. One could say it’s “continuous slow log analysis through the web… as-a-Service” but that would be quite a long acronym. Percona Cloud Tools can help you:

  • Ensure queries perform well and with expected response times
  • Find and analyze the slowest, most problematic queries
  • Measure gains in query performance after implementing changes
  • Get low level metrics details and other data for each query

Join the other testers that are improving the performance of their applications and sign up for the beta today. Of course, it’s free.
Even with the December holidays the Percona Cloud Tools team was hard at work. Here are some of the ways we’ve recently improved the service.

New Features
Evolving over many months, we are now introducing a refined Query Analytics user experience to efficiently analyze MySQL query metrics. This is made up of three primary views.

  • Performance – The Performance view delivers at-a-glance metrics about MySQL performance. These include Queries Per Second (QPS), load, 95% response time, and others. Like the dashboard of a car, this data is vital but minimal. It’s useful when everything should be ok and you’re just taking a quick look to make sure that nothing has hit a wall.
  • Historical – The Historical view looks at the bigger picture and is one of the ways that Percona Cloud Tools is quite unique. It’s the history of metrics data and is very useful for trending, spotting problems, and seeing if changes in the past helped or hurt query performance. For example, not only can you view the most recent Query Response Time values but you can also view values over the past weeks.
  • Report – The Report view delivers “the facts” about actual queries and is easily accessible from the Performance and Historical views. Just move your mouse over a server name anywhere in the application to see a query report. For example, if your Performance view shows that MySQL is having problems, getting to the Report view to see the bad queries is simple and fast.

So not only are we making users’ lives easier through better data organization, the new Percona Cloud Tools user experience also enables you to quickly navigate to the specifics of any problem.

Bugs Fixed
Not only have we rolled out a new user experience, we’ve also addressed some bugs along the way.

  • An existing agent could fail to start, receiving a 500 error from the API.
  • Max values were shown for aggregated “QPS” and “Total” periods instead of average for “QPS” and sum for “Total”.
  • pt-agent would not apply new MySQL connection options without restarting

Thank you to all of our beta testers for using Percona Cloud Tools and providing feedback. Your suggestions and bug reports have been very helpful.

Try Percona Cloud Tools for free while it’s still in beta. You’ll be up and running in minutes! Having trouble setting up the service? We’re here to help. Follow the “Feedback” link after you sign in and tell us about the problem.

Finally, on February 12th, Vadim Tkachenko (CTO of Percona) will show you how to Analyze MySQL Query Performance with Percona Cloud Tools. As a bonus, Percona will raffle five (5) 60-minute MySQL query consulting sessions with Vadim to analyze your Percona Cloud Tools Query Analytics data and provide feedback and performance suggestions.  Register now to attend this free webinar!

The post Percona Cloud Tools January 2014 status update appeared first on MySQL Performance Blog.

Percona Live MySQL Conference Sessions Announced

January 15, 2014 - 5:00am

I’m very pleased to announce that the breakout session and tutorial schedules for the third annual Percona Live MySQL Conference and Expo 2014 are now online. The conference will be at The Hyatt Regency Santa Clara and Santa Clara Convention Center, April 1-4, 2014. I wish to personally thank our dedicated Conference Committee members who put in many hours evaluating hundreds of submissions in order to create these schedules.

Breakout Session Schedule

We have a strong and diverse schedule of breakout sessions which will take place April 2-4, following each morning’s keynote addresses. The speaker list includes top MySQL practitioners from some of the world’s leading MySQL vendors and users including Oracle, MariaDB, Facebook, Twitter, Google, LinkedIn, Box, Groupon, and Yahoo. Topics include the following and much, much more:

  • “MySQL 5.7: Core Server Improvements,” Morgan Tocker and Rune Humborstad, Oracle
  • “MySQL 5.6 at Facebook, 2014 Edition,” Yoshinori Matsunobu, Facebook
  • “MySQL 5.7: InnoDB – What’s New,” Sunny Bains, Oracle
  • “Scaling Twitter with MySQL,” Calvin Sun, Twitter
  • “Easy MySQL Replication Setup and Troubleshooting,” Bob Burgess, Salesforce Marketing Cloud
  • “Sharding and Scale-out using MySQL Fabric,” Mats Kindahl, Oracle
  • “Privacy and Security for MySQL at Google in the Snowden Age,” Ian Gulliver, Google
  • “MySQL 5.7: Performance Schema Improvements,” Mark Leith, Oracle
  • “Small Data and MySQL,” Domas Mituzas, Facebook
  • “Hadoop for MySQL People,” Chris Schneider, Groupon.com
  • “InnoDB: A Journey to the Core II,” Davi Arnaut, LinkedIn, and Jeremy Cole, Google
  • “High Availability Tools for use with MySQL,” Jay Janssen, Percona
  • “MySQL 5.7 Replication: A Close Look at What Is New,” Luis Soares, Oracle
  • “Performance Monitoring and Tuning MySQL at Yahoo,” Ashwin Nellore, Xiang Rao, Yahoo
  • “The MySQL NoSQL Wars: Can’t we all just get along?” Tamar Bercovici, Box
  • “MariaDB 10.0: What’s New,” Sergei Golubchik, SkySQL
  • “Virtually Available MySQL, or How to Stop Worrying and Love the Cloud,” Robert Hodges, Continuent
  • “MySQL 5.7: Performance & Scalability Benchmarks,” Dimitri Kravtchuk, Oracle
  • “Asynchronous MySQL: How Facebook Queries Databases,” Chip Turner, Facebook
  • “Architecture and Design of MySQL Powered Applications,” Peter Zaitsev, Percona
  • “MySQL Workload Replay,” Ashwin Nellore, Xiang Rao, Yahoo

Tutorial Schedule

The Percona Live MySQL Conference Committee has also selected a strong agenda of tutorials which will take place on April 1. Tutorials provide practical, in-depth knowledge of critical MySQL issues. Topics include:

  • “MySQL Replication: Advanced Features in all flavors,” Giuseppe Maxia, Continuent
  • “Hands On Trove: Database as a Service in OpenStack (for MySQL),” Florian Haas, hastexo
  • “MariaDB 10 – The complete tutorial,” Ivan Zoratti, Colin Charles, SkySQL Ab, MariaDB
  • “MySQL Break/Fix Lab,” René Cannaò, PalominoDB
  • “InnoDB Architecture and Performance Optimization,” Peter Zaitsev, Percona
  • “Percona XtraDB Cluster / Galera in Practice – 2014 edition,” Jay Janssen, Percona
  • “Deep diving into MySQL monitoring setup,” Roman Vynar, Akshay Suryawanshi, Michael Rikmas, Percona
  • “Introduction to MongoDB for MySQL Users,” Stephane Combaudon, Percona
  • “From Relational to Hadoop – Migrating your data pipeline,” Gwen Shapira, Zburivsky Danil, Cloudera, Pythian
  • “Using TokuDB: A Guided Walk Through a TokuDB Implementation,” Tim Callaghan, Tokutek

Early Bird Registration

Register now to take advantage of Early Bird Registration rates which can save you $200 on the cost of attending. Early Bird Registration for the Percona Live MySQL Conference 2014 ends on February 2.

Birds of a Feather Sessions

I once again expect a great lineup of Birds of a Feather sessions (BOFs) so attendees with interests in the same project or topic can enjoy some quality face time. We are currently accepting Birds of a Feather session proposals which can be organized for individual projects or broader topics (e.g., best practices, open data, standards). The deadline for submissions is January 31, 2014 and any Percona Live MySQL Conference attendee or conference speaker can propose and moderate a session.

Lightning Talks

The sometimes insightful, sometimes funny, always entertaining Lightning Talks are an opportunity for presenters to propose, explain, exhort, or rant on any MySQL-related topic for five minutes. Topics might include a new idea, successful project, cautionary story, quick tip, or demonstration. All submissions will be reviewed and the top 10 will be selected to present during the one-hour Lightning Talks session during the Wednesday night Community Networking Reception. We especially welcome lighthearted, fun or otherwise entertaining submissions. The deadline for submitting a Lightning Talk topic is January 31, 2014.

Save Money by Registering Your Room at the Hyatt Now

The Hyatt Regency Santa Clara is the best place to stay and enjoy all the conference activities including receptions, Birds of a Feather sessions, and Lightning talks. The Hyatt bar area is often packed with conference goers throughout the afternoon and long into the evening. Make your reservation now through the Percona Live MySQL Conference and Expo 2014 website to receive the special rate before our conference room block fills up and rates increase.

Sponsors

We have a great list of sponsors who have already signed up for the Percona Live MySQL Conference and Expo. The list includes many of the world’s top MySQL solution providers, including:

Diamond Plus Sponsors

  • Continuent
  • Fusion-io

Gold Sponsors

  • Pythian
  • Micron
  • SkySQL

Silver Sponsors

  • Yelp
  • AppDynamics
  • Parelastic
  • Box
  • Galera/Codership
  • Google
  • InfiniDB (Calpont)

I Hope to See You in Santa Clara

Visit the Percona Live MySQL Conference and Expo 2014 website for more information about the conference. If you are interested in upcoming announcements, you can also register to receive email updates about the conference. Keep an eye on the conference website for upcoming information on our strong group of keynote sponsors and for the lineup of BOFs and Lightning Talks. 2014 promises to once again be a great year at the Percona Live MySQL Conference and Expo!

The post Percona Live MySQL Conference Sessions Announced appeared first on MySQL Performance Blog.

Upcoming Webinar: What’s new in Percona XtraDB Cluster 5.6

January 15, 2014 - 12:00am

I’ve been blogging a lot about some of the new things you can expect with Percona XtraDB Cluster 5.6 and Galera 3.x – and GA is coming soon.

To get prepared, I’ll be giving a webinar on February 5th at 1PM EST/10AM PST to talk about some of the big new features, upgrading strategies, as well as answering your questions. Alexey Yurchenko, solutions architect from Codership, will be joining me for some extra brain power.

Topics to be covered include:

  • Galera 3 replication enhancements
  • WAN segments
  • Cluster integration with 5.6 Async replication GTIDs
  • Async to cluster inbound replication enhancements
  • Minimal replication images
  • Detecting Donor IST viability
  • Upgrade paths to Percona XtraDB Cluster 5.6

You can register for free right here!

The post Upcoming Webinar: What’s new in Percona XtraDB Cluster 5.6 appeared first on MySQL Performance Blog.

InnoDB file formats: Here is one pitfall to avoid

January 14, 2014 - 6:00am

Compressed tables are an example of an InnoDB feature that became available with the “Barracuda” file format, released with InnoDB 1.1. They can bring significant gains in raw performance and scalability: since the data is stored in a compressed format, the amount of memory and disk space necessary to hold it and move it around (disk/memory) is lower, making them attractive for servers equipped with SSD drives of smaller capacity.

Note that the notion of “file formats” (defined by the variable innodb_file_format) was introduced in MySQL 5.5: the evolution of InnoDB has led to the development of new features, and some of them required the support of new on-disk data structures not previously available. That means those particular features (like compressed tables) only work with the new file format. To make things clear, and to help manage compatibility issues when upgrading and (especially) downgrading MySQL, the original file format started being referred to as “Antelope.”

The default file format in MySQL 5.6 and the latest 5.5 releases is Antelope. Note this can be a bit confusing, as the first releases of 5.5 (until 5.5.7) had the new file format as the default, a decision that was later reversed to ensure maximum compatibility in replication configurations comprised of servers running different versions of MySQL. To be sure which file format is set as the default on your server, you can issue:

mysql> SHOW VARIABLES LIKE 'innodb_file_format';

The important lesson here, and what motivated me to write this post, is that the file format can only be defined for tablespaces – not for tables in general. This is documented in the manual but maybe not entirely clear:

innodb_file_format: The file format to use for new InnoDB tables. Currently, Antelope and Barracuda are supported. This applies only for tables that have their own tablespace, so for it to have an effect, innodb_file_per_table must be enabled. The Barracuda file format is required for certain InnoDB features such as table compression.

Even if you configure your server with innodb_file_format=Barracuda and recreate the datadir and basic tables with the script mysql_install_db, the shared tablespace will always use Antelope. So, to create tables under the new file format it is imperative that you use innodb_file_per_table. That’s OK and clearly documented, but what might be misleading here is that no warning is issued if you set the file format to Barracuda and create a new compressed table without having innodb_file_per_table set. Not even in the error log. Here’s an example:

mysql> SET GLOBAL innodb_file_format=Barracuda;
Query OK, 0 rows affected (0.00 sec)

mysql> create table test.testA (id int) row_format=Compressed;
Query OK, 0 rows affected (2.71 sec)

Now, let’s take a look at what the INFORMATION_SCHEMA tell us about this table:

mysql> SELECT * FROM information_schema.tables WHERE table_schema='test' and table_name='testA'\G
*************************** 1. row ***************************
TABLE_CATALOG: def
TABLE_SCHEMA: test
TABLE_NAME: testA
TABLE_TYPE: BASE TABLE
ENGINE: InnoDB
VERSION: 10
ROW_FORMAT: Compact
TABLE_ROWS: 0
AVG_ROW_LENGTH: 0
DATA_LENGTH: 16384
MAX_DATA_LENGTH: 0
INDEX_LENGTH: 0
DATA_FREE: 0
AUTO_INCREMENT: NULL
CREATE_TIME: 2014-01-07 14:21:05
UPDATE_TIME: NULL
CHECK_TIME: NULL
TABLE_COLLATION: latin1_swedish_ci
CHECKSUM: NULL
CREATE_OPTIONS: row_format=COMPRESSED
TABLE_COMMENT:
1 row in set (0.00 sec)

There are two at-first-glance “contradictory” fields here:

  •  ”ROW_FORMAT” says the table is using the Compact format while
  •  ”CREATE_OPTIONS” indicates “row_format=COMPRESSED” has been used when creating the table

The one to consider is ROW_FORMAT: CREATE_OPTIONS merely stores the options that were used at the moment the table was created, and it is used by the SHOW CREATE TABLE statement to “reconstruct” the table definition:

mysql> show create table test.testA;
*************************** 1. row ***************************
Table: testA
Create Table: CREATE TABLE `testA` (
`id` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=COMPRESSED
1 row in set (0.00 sec)
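This silent fallback boils down to a decision InnoDB makes at table creation. Here is an illustrative model of the documented behavior for this MySQL era (a sketch, not InnoDB source code):

```python
# Illustrative model of the silent row-format fallback described above.
def effective_row_format(requested, innodb_file_format, file_per_table):
    """Compressed (and Dynamic) need the Barracuda file format *and* a
    per-table tablespace; otherwise InnoDB silently falls back to Compact."""
    if requested.upper() in ("COMPRESSED", "DYNAMIC"):
        if not file_per_table or innodb_file_format.lower() != "barracuda":
            return "Compact"  # no warning is issued anywhere, not even the error log
    return requested.capitalize()

# The case from the example: Barracuda set, but innodb_file_per_table off.
print(effective_row_format("Compressed", "Barracuda", False))  # Compact
print(effective_row_format("Compressed", "Barracuda", True))   # Compressed
```

Because the fallback is silent, the only reliable check is the ROW_FORMAT column in INFORMATION_SCHEMA, as the conclusion below shows.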

Conclusion

A customer contacted us asking how he could get a list of the tables using the compression format, which we can obtain by interrogating INFORMATION_SCHEMA:

mysql> SELECT TABLE_NAME, ROW_FORMAT FROM INFORMATION_SCHEMA.TABLES WHERE ROW_FORMAT='Compressed';

To their surprise, this returned an empty set. We verified that the tables had been created specifying ROW_FORMAT=Compressed, but as shown in this article the creation options are not to be trusted – “ask” the INFORMATION_SCHEMA instead.

[0] http://dev.mysql.com/doc/refman/5.6/en/innodb-compression.html
[1] http://dev.mysql.com/doc/refman/5.6/en/innodb-file-format-enabling.html
[2] http://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_file_format

The post InnoDB file formats: Here is one pitfall to avoid appeared first on MySQL Performance Blog.

Webinars: Common (but deadly) MySQL mistakes

January 13, 2014 - 12:00am

On January 22 I’ll be presenting the first of a series of two new webinars focusing on avoiding common MySQL mistakes:

How to Avoid Common (but Deadly) MySQL Operations Mistakes.

“Don’t bother me with DBA ‘best practices,’ we have real work to get done.”

So go the famous last words from overworked IT managers (we’ve all been there at one time or another).

Best practices don’t have to interrupt your work or make it harder. Ideally, they should make database operations more predictable and less fraught with emergencies or surprises. This presentation provides a short list of common DBA failures and makes the case that you can save yourself time (and grey hair) by adopting better habits.

Some of the tips include:

  • How to know when and why your server changed configuration values.
  • How to have bulletproof backups.
  • How to save up to 20% of your database size at the same time as reducing the query optimizer’s work.

Then on March 5 I’ll present the second webinar in this series:

How to Avoid Common (but Deadly) MySQL Development Mistakes

“We need a database that ‘just works’ and runs at ‘web scale.’”

MySQL software developers are under pressure to do more in less time, and create applications that adapt to ever-changing requirements.

But it’s true that some of a developer’s time is wasted when their method of optimizing involves trying every combination of code and index definition. There must be a more direct path to the best database code. This talk shows you some of these methods, including:

  • How to find out which indexes are the best fit for your MySQL application
  • How to protect yourself from the number one MySQL database security vulnerability on the web
  • How to decide when to optimize a MySQL database application with denormalization, indexes, caching, partitioning, sharding

At the end of this webinar, you’ll be more productive and confident as you develop MySQL database-driven applications.

Please join me!  Register for the first webinar or register for the second webinar!  Or register for both and get two for the price of one (just kidding; they’re both free)!

The post Webinars: Common (but deadly) MySQL mistakes appeared first on MySQL Performance Blog.

The use of Iptables ClusterIP target as a load balancer for PXC, PRM, MHA and NDB

January 10, 2014 - 12:00am

Most technologies achieving high availability for MySQL need a load balancer to spread the client connections across the valid database hosts – even the special Tungsten connector can be seen as a sophisticated load balancer. People often use a hardware load balancer or a software solution like HAProxy. In both cases, to avoid having a single point of failure, multiple load balancers must be used. Load balancers have two drawbacks: they increase network latency, and/or they add a health-check load on the database servers. The increased network latency is obvious in the case of standalone load balancers, where you must first connect to the load balancer, which then completes the request by connecting to one of the database servers. Some workloads, like reporting and ad-hoc queries, are not affected by a small increase in latency, but others, like OLTP processing and real-time logging, are. Each load balancer must also check regularly whether the database servers are in a sane state, so adding more load balancers increases the idle chatter over the network. In order to reduce these impacts, a very different type of load balancer is needed: let me introduce the Iptables ClusterIP target.

Normally, as stated by RFC 1812, Requirements for IP Version 4 Routers, an IP address must be unique on a network and each host must respond only for the IPs it owns. To achieve load balancing, the Iptables ClusterIP target doesn't strictly respect the RFC. The principle is simple: each computer in the cluster shares an IP address and MAC address with the other members, but it answers requests only for a given subset of the traffic, based on the modulo of a hash over a network value, sourceIP-sourcePort by default. The behavior is controlled by an iptables rule and by the content of the kernel file /proc/net/ipt_CLUSTERIP/VIP_ADDRESS, which tells the kernel which portion of the traffic it should answer. I don't want to go too deep into the details here since all those things are handled by the Pacemaker resource agent, IPaddr2.
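The partitioning decision can be sketched roughly like this (a toy stand-in for the kernel's real jhash; the IP, port and bucket values below are made up for illustration):

```shell
# Toy sketch of the ClusterIP partitioning logic (NOT the kernel's real hash;
# the IP, port and bucket values below are made up for illustration).
total_nodes=3
local_node=1                     # bucket this node answers for
src_ip="10.0.0.42"; src_port=51512
# stand-in hash: sum the IP octets and the source port
h=$(( $(echo "$src_ip" | tr '.' '+') + src_port ))
bucket=$(( h % total_nodes + 1 ))
if [ "$bucket" -eq "$local_node" ]; then
    echo "this node answers the request"
else
    echo "this node stays silent; node $bucket answers"
fi
```

Every node computes the same hash for a given client connection, so exactly one node answers while the others silently drop the packets.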

The IPaddr2 Pacemaker resource agent is commonly used for a VIP, but what is less known is its behavior when defined as part of a clone set. When part of a clone set, the resource agent defines a VIP which uses the Iptables ClusterIP target; the iptables rules and the handling of the proc file are all done automatically. That seems very nice in theory, but until recently I never succeeded in getting a suitable distribution behavior. When starting the clone set on, let's say, three nodes, it distributes correctly, one instance on each, but if two nodes fail and then recover, the clone instances all go to the third node and stay there even after the first two nodes recover. That bugged me for quite a while, but I finally modified the resource agent and found a way to make it work correctly. It also now correctly sets the MAC address, if none is provided, to one in the multicast MAC address domain, which starts with "01:00:5E". The new agent, IPaddr3, is available here. Now, let's show what we can achieve with it.

We’ll start from the setup described in my previous post and we’ll modify it. First, download and install the IPaddr3 agent.

root@pacemaker-1:~# wget -O /usr/lib/ocf/resource.d/percona/IPaddr3 https://github.com/percona/percona-pacemaker-agents/raw/master/agents/IPaddr3
root@pacemaker-1:~# chmod u+x /usr/lib/ocf/resource.d/percona/IPaddr3

Repeat these steps on all 3 nodes. Then, we’ll modify the pacemaker configuration like this (I’ll explain below):

node pacemaker-1 \
        attributes standby="off"
node pacemaker-2 \
        attributes standby="off"
node pacemaker-3 \
        attributes standby="off"
primitive p_cluster_vip ocf:percona:IPaddr3 \
        params ip="172.30.212.100" nic="eth1" \
        meta resource-stickiness="0" \
        op monitor interval="10s"
primitive p_mysql_monit ocf:percona:mysql_monitor \
        params reader_attribute="readable_monit" writer_attribute="writable_monit" user="repl_user" password="WhatAPassword" pid="/var/lib/mysql/mysqld.pid" socket="/var/run/mysqld/mysqld.sock" max_slave_lag="5" cluster_type="pxc" \
        op monitor interval="1s" timeout="30s" OCF_CHECK_LEVEL="1"
clone cl_cluster_vip p_cluster_vip \
        meta clone-max="3" clone-node-max="3" globally-unique="true"
clone cl_mysql_monitor p_mysql_monit \
        meta clone-max="3" clone-node-max="1"
location loc-distrib-cluster-vip cl_cluster_vip \
        rule $id="loc-distrib-cluster-vip-rule" -1: p_cluster_vip_clone_count gt 1
location loc-enable-cluster-vip cl_cluster_vip \
        rule $id="loc-enable-cluster-vip-rule" 2: writable_monit eq 1
location loc-no-cluster-vip cl_cluster_vip \
        rule $id="loc-no-cluster-vip-rule" -inf: writable_monit eq 0
property $id="cib-bootstrap-options" \
        dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="3" \
        stonith-enabled="false" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1384275025" \
        maintenance-mode="off"

First, the VIP primitive is modified to use the new agent, IPaddr3, and we set resource-stickiness="0". Next, we define the cl_cluster_vip clone set using clone-max="3" to have three instances, clone-node-max="3" to allow up to three instances on the same node, and globally-unique="true" to tell Pacemaker it has to allocate an instance on a node even if there's already one. Finally, there are three location rules needed to get the behavior we want: one using the p_cluster_vip_clone_count attribute and the other two based on the writable_monit attribute. Enabling all that gives:

root@pacemaker-1:~# crm_mon -A1
============
Last updated: Tue Jan 7 10:51:38 2014
Last change: Tue Jan 7 10:50:38 2014 via cibadmin on pacemaker-1
Stack: openais
Current DC: pacemaker-2 - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
3 Nodes configured, 3 expected votes
6 Resources configured.
============
Online: [ pacemaker-1 pacemaker-2 pacemaker-3 ]
 Clone Set: cl_cluster_vip [p_cluster_vip] (unique)
     p_cluster_vip:0    (ocf::percona:IPaddr3): Started pacemaker-3
     p_cluster_vip:1    (ocf::percona:IPaddr3): Started pacemaker-1
     p_cluster_vip:2    (ocf::percona:IPaddr3): Started pacemaker-2
 Clone Set: cl_mysql_monitor [p_mysql_monit]
     Started: [ pacemaker-1 pacemaker-2 pacemaker-3 ]
Node Attributes:
* Node pacemaker-1:
    + p_cluster_vip_clone_count        : 1
    + readable_monit                   : 1
    + writable_monit                   : 1
* Node pacemaker-2:
    + p_cluster_vip_clone_count        : 1
    + readable_monit                   : 1
    + writable_monit                   : 1
* Node pacemaker-3:
    + p_cluster_vip_clone_count        : 1
    + readable_monit                   : 1
    + writable_monit                   : 1

and the network configuration is:

root@pacemaker-1:~# iptables -L INPUT -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
CLUSTERIP  all  --  0.0.0.0/0            172.30.212.100       CLUSTERIP hashmode=sourceip-sourceport clustermac=01:00:5E:91:18:86 total_nodes=3 local_node=1 hash_init=0
root@pacemaker-1:~# cat /proc/net/ipt_CLUSTERIP/172.30.212.100
2
root@pacemaker-2:~# cat /proc/net/ipt_CLUSTERIP/172.30.212.100
3
root@pacemaker-3:~# cat /proc/net/ipt_CLUSTERIP/172.30.212.100
1

In order to test the access, you need to query the VIP from a fourth node:

root@pacemaker-4:~# while [ 1 ]; do mysql -h 172.30.212.100 -u repl_user -pWhatAPassword -BN -e "select variable_value from information_schema.global_variables where variable_name like 'hostname';"; sleep 1; done
pacemaker-1
pacemaker-1
pacemaker-2
pacemaker-2
pacemaker-2
pacemaker-3
pacemaker-2
^C

So, all good… Let's now desync pacemaker-1 and pacemaker-2.

root@pacemaker-1:~# mysql -e 'set global wsrep_desync=1;'
root@pacemaker-2:~# mysql -e 'set global wsrep_desync=1;'
root@pacemaker-3:~# crm_mon -A1
============
Last updated: Tue Jan 7 10:53:51 2014
Last change: Tue Jan 7 10:50:38 2014 via cibadmin on pacemaker-1
Stack: openais
Current DC: pacemaker-2 - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
3 Nodes configured, 3 expected votes
6 Resources configured.
============
Online: [ pacemaker-1 pacemaker-2 pacemaker-3 ]
 Clone Set: cl_cluster_vip [p_cluster_vip] (unique)
     p_cluster_vip:0    (ocf::percona:IPaddr3): Started pacemaker-3
     p_cluster_vip:1    (ocf::percona:IPaddr3): Started pacemaker-3
     p_cluster_vip:2    (ocf::percona:IPaddr3): Started pacemaker-3
 Clone Set: cl_mysql_monitor [p_mysql_monit]
     Started: [ pacemaker-1 pacemaker-2 pacemaker-3 ]
Node Attributes:
* Node pacemaker-1:
    + p_cluster_vip_clone_count        : 1
    + readable_monit                   : 0
    + writable_monit                   : 0
* Node pacemaker-2:
    + p_cluster_vip_clone_count        : 1
    + readable_monit                   : 0
    + writable_monit                   : 0
* Node pacemaker-3:
    + p_cluster_vip_clone_count        : 3
    + readable_monit                   : 1
    + writable_monit                   : 1
root@pacemaker-3:~# cat /proc/net/ipt_CLUSTERIP/172.30.212.100
1,2,3
root@pacemaker-4:~# while [ 1 ]; do mysql -h 172.30.212.100 -u repl_user -pWhatAPassword -BN -e "select variable_value from information_schema.global_variables where variable_name like 'hostname';"; sleep 1; done
pacemaker-3
pacemaker-3
pacemaker-3
pacemaker-3
pacemaker-3
pacemaker-3

Now, once pacemaker-1 and pacemaker-2 are back in sync, we have the desired distribution:

root@pacemaker-1:~# mysql -e 'set global wsrep_desync=0;'
root@pacemaker-2:~# mysql -e 'set global wsrep_desync=0;'
root@pacemaker-3:~# crm_mon -A1
============
Last updated: Tue Jan 7 10:58:40 2014
Last change: Tue Jan 7 10:50:38 2014 via cibadmin on pacemaker-1
Stack: openais
Current DC: pacemaker-2 - partition with quorum
Version: 1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff
3 Nodes configured, 3 expected votes
6 Resources configured.
============
Online: [ pacemaker-1 pacemaker-2 pacemaker-3 ]
 Clone Set: cl_cluster_vip [p_cluster_vip] (unique)
     p_cluster_vip:0    (ocf::percona:IPaddr3): Started pacemaker-3
     p_cluster_vip:1    (ocf::percona:IPaddr3): Started pacemaker-1
     p_cluster_vip:2    (ocf::percona:IPaddr3): Started pacemaker-2
 Clone Set: cl_mysql_monitor [p_mysql_monit]
     Started: [ pacemaker-1 pacemaker-2 pacemaker-3 ]
Node Attributes:
* Node pacemaker-1:
    + p_cluster_vip_clone_count        : 1
    + readable_monit                   : 1
    + writable_monit                   : 1
* Node pacemaker-2:
    + p_cluster_vip_clone_count        : 1
    + readable_monit                   : 1
    + writable_monit                   : 1
* Node pacemaker-3:
    + p_cluster_vip_clone_count        : 1
    + readable_monit                   : 1
    + writable_monit                   : 1

All the clone instances redistributed on all nodes as we wanted.

As a conclusion, Pacemaker with a clone set of IPaddr3 is a very interesting kind of load balancer, especially if you already have Pacemaker deployed. It introduces almost no latency, it doesn't need any extra hardware, it doesn't increase the database health-check load, and it is as highly available as your database is. The only drawback I can see is a case where the inbound traffic is very heavy: all nodes receive all the traffic and are equally saturated. With database and web-type traffic, the inbound traffic is usually small. This solution also doesn't redistribute connections based on server load the way a regular load balancer can, but that would be fairly easy to implement with something like a server_load attribute and an agent similar to mysql_monitor that checks the server load instead of the database status. In such a case, I suggest using many more than one VIP clone instance per node to get better granularity in the load distribution. Finally, the ClusterIP target, although still fully supported, has been deprecated in favor of the Cluster-match target. It is basically the same principle, and I plan to adapt the IPaddr3 agent to Cluster-match in the near future.

The post The use of Iptables ClusterIP target as a load balancer for PXC, PRM, MHA and NDB appeared first on MySQL Performance Blog.

Percona Cloud Tools: Making MySQL performance easy

January 9, 2014 - 7:27am

One of our primary focuses at Percona is performance. Let me make some statements on what "performance" is.

In doing so I will refer to two pieces of content:

I highly recommend that you familiarize yourself with both of them.

Performance

Performance is about tasks and time.
We say that a system is performing well if it executes a task in an acceptable period of time, or if it executes the required number of tasks per given period (second, minute, hour, etc.).

To put it in our area of interest, what is a task for a database? Often we consider this to be an individual query, and that is a good enough level of detail. One may think of a database task as a "transaction" or the set of queries executed during a web page access, but I propose we focus on queries.

From this perspective, the database (MySQL) shows good performance if it executes a query in a reasonable amount of time (this is what we know as Response Time), or executes the expected amount of queries per second (hour/day) (this is Throughput).

So we have connected query execution with performance.
And let me state this: most database performance problems are related to queries. Either there are too many queries to handle (the system is overloaded; we ask it to perform more tasks than it is able to handle), or a query takes a long time to execute (the system is not handling the task in an optimal way). Of course there can be other reasons, like hardware failure, database misconfiguration, OS issues, etc., but in most cases queries are the problem.

“Performance is HARD”

Now, in [1] Cary Millsap states that "Performance is HARD," and it is true if you do not know where to look.
Very often people look in some funny places like the cache hit ratio ("You have only a 93% cache hit ratio, that's your problem!"), the amount of IO per second, CPU utilization, aborted_clients, handler_select, you name it. Looking at these numbers with some experience and black magic may give you some results, or may not. When you look at these numbers, draw some conclusions, then make some changes in the hope that they help: that is performance tuning, not performance optimization [1].

Fortunately, performance is EASY if you know what to look into [1].
With Percona Cloud Tools we make performance EASY: our tools show you what you should be looking for.

Right now we have one module: Query Analytics, which is focused on queries. Query Analytics is based on our command-line tool pt-query-digest from Percona Toolkit, and it implements the best practices we follow when analyzing queries on customers' systems. Peter actually calls this tool "Peter in a box."

Percona Cloud Tools / Query Analytics

Let me review what information Query Analytics provides…

1. General Query Profile
This shows which queries take the most time in the database, along with how many times each query executes, the total time it took, the average time per query and the 95% response time.

Let’s see an example for the MySQLPerformanceBlog.com for one week time frame. This blog is WordPress based, so the queries are coming from WordPress:

The two top queries are related to users:
SELECT wp_usermeta and SELECT wp_users, and we can see that both of them were executed 69M times over the last week. The 95% response time for the first one, though, is only 277.00µs, which is just fine.

From there we can click and see more details about a query:

Query Text and

Metrics

…which shows that the execution time for the query ranged from 5.00µs (Min) to 529.74ms (Max).
A bunch of metrics, like the InnoDB ones, are available only if you are using Percona Server.

What is more interesting is that Query Analytics lets us see historical data.
If we switch to the graph view we can see the data for the last week:

Here we clearly see a spike in the number of queries. This corresponds to the published post: Multiple column index vs multiple indexes with MySQL 5.6.

And there we can switch between the Total and 95% graphs,

…which shows that even with the spike, the 95% response time of this query was not affected and that means MySQL (Percona Server) was able to handle the load.

Query Analytics helps you:

  • Be sure your queries (tasks) perform as expected
  • Check that queries are executed within the time frame you need
  • If not, see which query (or queries) is problematic and requires attention
  • Get detailed metrics for each query

Percona Cloud Tools is to make MySQL Performance Easy!

You can (and should) sign up for the free Beta and try the tool yourself.

The post Percona Cloud Tools: Making MySQL performance easy appeared first on MySQL Performance Blog.

Finding a good IST donor in Percona XtraDB Cluster 5.6

January 8, 2014 - 3:00am
Gcache and IST

The Gcache is a memory-based cache of recent Galera transactions that is local to each node in a cluster.  If a node leaves and rejoins the cluster, it can use the gcache from another node that stayed in the cluster (i.e., its donor node) to fetch the transactions it missed (IST) as opposed to doing a full state snapshot transfer (SST).  However, there are a few nuances that are not obvious to the beginner:

  • The Gcache is lost when a node restarts
  • The Gcache is fixed size and implemented as a LRU.  Once it is full, older transactions roll off.
  • Donor selection is made regardless of the gcache state
  • If the given donor for a restarting node doesn’t have all transactions needed, a full SST (read: full backup) is done instead
  • Until recent developments, there was no way to tell what, precisely, was in the Gcache.

So, with (somewhat) arbitrary donor selection, it was hard to be certain that a node restart would not trigger a SST.  For example:

  • A node crashed over night or was otherwise down for some length of time.  How do you know if the gcache on any node is big enough to contain all the transactions necessary for IST?
  • If you brought down two nodes in your cluster simultaneously, the second one you restart might select the first one as its donor and be forced to SST.
Along comes PXC 5.6.15 RC1

Astute readers of the PXC 5.6.15 release notes will have noticed this little tidbit:

New wsrep_local_cached_downto status variable has been introduced. This variable shows the lowest sequence number in gcache. This information can be helpful with determining IST and/or SST.

Until this release there was no visibility into any node's Gcache and what was likely to happen when restarting a node. You could make some assumptions, but now it is a bit easier to:

  1. Tell if a given node would be a suitable donor
  2. And hence select a donor manually using wsrep_sst_donor instead of leaving it to chance.

 

What it looks like

Suppose I have a 3 node cluster where load is hitting node1.  I execute the following in sequence:

  1. Shut down node2
  2. Shut down node3
  3. Restart node2

At step 3, node1 is the only viable donor for node2.  Because our restart was quick, we can have some reasonable assurance that node2 will IST correctly (and it does).

However, before we restart node3, let’s check the oldest transaction in the gcache on nodes 1 and 2:

[root@node1 ~]# mysql -e "show global status like 'wsrep_local_cached_downto';"
+---------------------------+--------+
| Variable_name             | Value  |
+---------------------------+--------+
| wsrep_local_cached_downto | 889703 |
+---------------------------+--------+
[root@node2 mysql]# mysql -e "show global status like 'wsrep_local_cached_downto';"
+---------------------------+---------+
| Variable_name             | Value   |
+---------------------------+---------+
| wsrep_local_cached_downto | 1050151 |
+---------------------------+---------+

So we can see that node1 has a much more “complete” gcache than node2 does (i.e., a much smaller seqno). Node2′s gcache was wiped when it restarted, so it only has transactions from after its restart.

To check node3's GTID, we can either check the grastate.dat, or (if it has crashed and the grastate is zeroed) use --wsrep-recover:

[root@node3 ~]# cat /var/lib/mysql/grastate.dat
# GALERA saved state
version: 2.1
uuid:    7206c8e4-7705-11e3-b175-922feecc92a0
seqno:   1039191
cert_index:
[root@node3 ~]# mysqld_safe --wsrep-recover
140107 16:18:37 mysqld_safe Logging to '/var/lib/mysql/error.log'.
140107 16:18:37 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140107 16:18:37 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.pIVkT4' --pid-file='/var/lib/mysql/node3-recover.pid'
140107 16:18:39 mysqld_safe WSREP: Recovered position 7206c8e4-7705-11e3-b175-922feecc92a0:1039191
140107 16:18:41 mysqld_safe mysqld from pid file /var/lib/mysql/node3.pid ended

So, armed with this information, we can tell what would happen to node3, depending on which donor was selected:

Donor selected | Donor's gcache oldest seqno | Node3's seqno | Result for node3
node2          | 1050151                     | 1039191       | SST
node1          | 889703                      | 1039191       | IST
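The decision shown in the table boils down to a single comparison; here is a minimal sketch using the seqnos from this example:

```shell
# A donor can serve IST only if its gcache still reaches back to (or before)
# the joiner's last applied seqno; otherwise a full SST is triggered.
# All values are copied from the example above.
joiner_seqno=1039191             # node3, from grastate.dat / --wsrep-recover

check_donor() {                  # usage: check_donor <name> <cached_downto>
    if [ "$joiner_seqno" -ge "$2" ]; then
        echo "$1: IST possible"
    else
        echo "$1: SST required"
    fi
}

check_donor node1 889703         # wsrep_local_cached_downto on node1
check_donor node2 1050151        # wsrep_local_cached_downto on node2
```

Running this prints that node1 can serve IST while node2 would force an SST, matching the table.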

So, we can instruct node3 to use node1 as its donor on restart with wsrep_sst_donor:

[root@node3 ~]# service mysql start --wsrep_sst_donor=node1

Note that passing mysqld options on the command line is only supported in RPM packages; on Debian you must put that setting in your my.cnf. We can see from node3's log that it does properly IST:

2014-01-07 16:23:26 19834 [Note] WSREP: Prepared IST receiver, listening at: tcp://192.168.70.4:4568
2014-01-07 16:23:26 19834 [Note] WSREP: Node 0.0 (node3) requested state transfer from 'node1'. Selected 2.0 (node1)(SYNCED) as donor.
...
2014-01-07 16:23:27 19834 [Note] WSREP: Receiving IST: 39359 writesets, seqnos 1039191-1078550
2014-01-07 16:23:27 19834 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.15-56'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  Percona XtraDB Cluster (GPL), Release 25.2, Revision 645, wsrep_25.2.r4027
2014-01-07 16:23:41 19834 [Note] WSREP: IST received: 7206c8e4-7705-11e3-b175-922feecc92a0:1078550
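On Debian, where the init script does not pass mysqld options through, the equivalent would be a my.cnf entry along these lines (a sketch; note it stays in effect until you remove it, so take it out after the restart if you only wanted it once):

```ini
# /etc/mysql/my.cnf (Debian): pin the SST/IST donor for the next restart
[mysqld]
wsrep_sst_donor = node1
```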

Sometime in the future, this may be handled automatically on donor selection, but for now it is very useful that we can at least see the status of the gcache.

The post Finding a good IST donor in Percona XtraDB Cluster 5.6 appeared first on MySQL Performance Blog.

Increasing slow query performance with the parallel query execution

January 7, 2014 - 12:00am

MySQL and scaling up (using more powerful hardware) has always been a hot topic. Originally MySQL did not scale well with multiple CPUs; there were times when InnoDB performed worse with more CPU cores than with fewer. MySQL 5.6 scales significantly better; however, there is still one big limitation: a single SQL query will only ever use one CPU core (no parallelism). Here is what I mean by that: let's say we have a complex query which needs to scan millions of rows and may need to create a temporary table; in this case MySQL is not able to scan the table in multiple threads (even with partitioning), so the single query will not be faster on the more powerful server. On the contrary, a server with many slower CPUs can show worse performance than a server with fewer (but faster) CPUs.

To address this issue we can use parallel query execution. Vadim wrote about PHP asynchronous calls for MySQL. Another way to increase parallelism is the "sharding" approach, for example with Shard Query. I've decided to test parallel (asynchronous) query execution with a relatively large table: I've used the US Flights Ontime performance database, which was originally used by Vadim in the old post Analyzing air traffic performance. Let's see how this can help us increase the performance of complex query reports.

Parallel Query Example

To illustrate the parallel query execution with MySQL I’ve created the following table:

CREATE TABLE `ontime` (
  `YearD` year(4) NOT NULL,
  `Quarter` tinyint(4) DEFAULT NULL,
  `MonthD` tinyint(4) DEFAULT NULL,
  `DayofMonth` tinyint(4) DEFAULT NULL,
  `DayOfWeek` tinyint(4) DEFAULT NULL,
  `FlightDate` date DEFAULT NULL,
  `UniqueCarrier` char(7) DEFAULT NULL,
  `AirlineID` int(11) DEFAULT NULL,
  `Carrier` char(2) DEFAULT NULL,
  `TailNum` varchar(50) DEFAULT NULL,
  `FlightNum` varchar(10) DEFAULT NULL,
  `OriginAirportID` int(11) DEFAULT NULL,
  `OriginAirportSeqID` int(11) DEFAULT NULL,
  `OriginCityMarketID` int(11) DEFAULT NULL,
  `Origin` char(5) DEFAULT NULL,
  `OriginCityName` varchar(100) DEFAULT NULL,
  `OriginState` char(2) DEFAULT NULL,
  `OriginStateFips` varchar(10) DEFAULT NULL,
  `OriginStateName` varchar(100) DEFAULT NULL,
  `OriginWac` int(11) DEFAULT NULL,
  `DestAirportID` int(11) DEFAULT NULL,
  `DestAirportSeqID` int(11) DEFAULT NULL,
  `DestCityMarketID` int(11) DEFAULT NULL,
  `Dest` char(5) DEFAULT NULL,
  -- ... (removed number of fields)
  `id` int(11) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`),
  KEY `YearD` (`YearD`),
  KEY `Carrier` (`Carrier`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

And loaded 26 years of data into it. The table is 56G with ~152M rows.

Software: Percona Server 5.6.15-63.0. Hardware: Supermicro X8DTG-D; 48G of RAM; 24x Intel(R) Xeon(R) CPU L5639 @ 2.13GHz; 1x SSD drive (250G)

So we have 24 relatively slow CPUs.

Simple query

Now we can run some queries. The first query is very simple: find all flights per year (in the US):

select yeard, count(*) from ontime group by yeard

As we have the index on YearD, the query will use the index:

mysql> explain select yeard, count(*) from ontime group by yeard\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ontime
         type: index
possible_keys: YearD,comb1
          key: YearD
      key_len: 1
          ref: NULL
         rows: 148046200
        Extra: Using index
1 row in set (0.00 sec)

The query is simple; however, it will have to scan 150M rows. Here are the results of the query (cached):

mysql> select yeard, count(*) from ontime group by yeard;
+-------+----------+
| yeard | count(*) |
+-------+----------+
|  1988 |  5202096 |
|  1989 |  5041200 |
|  1990 |  5270893 |
|  1991 |  5076925 |
|  1992 |  5092157 |
|  1993 |  5070501 |
|  1994 |  5180048 |
|  1995 |  5327435 |
|  1996 |  5351983 |
|  1997 |  5411843 |
|  1998 |  5384721 |
|  1999 |  5527884 |
|  2000 |  5683047 |
|  2001 |  5967780 |
|  2002 |  5271359 |
|  2003 |  6488540 |
|  2004 |  7129270 |
|  2005 |  7140596 |
|  2006 |  7141922 |
|  2007 |  7455458 |
|  2008 |  7009726 |
|  2009 |  6450285 |
|  2010 |  6450117 |
|  2011 |  6085281 |
|  2012 |  6096762 |
|  2013 |  5349447 |
+-------+----------+
26 rows in set (54.10 sec)

The query took 54 seconds and utilized only 1 CPU core. However, this query is perfect for running in parallel. We can run 26 parallel queries, each counting its own year. I've used the following shell script to run the queries in the background:

#!/bin/bash
date
for y in {1988..2013}
do
        sql="select yeard, count(*) from ontime where yeard=$y"
        mysql -vvv ontime -e "$sql" &>par_sql1/$y.log &
done
wait
date

Here are the results:

Start: 11:41:21 EST 2014
End:   11:41:26 EST 2014

So the total execution time is ~5 seconds (10x faster). The individual results are here:

par_sql1/1988.log:1 row in set (3.70 sec)
par_sql1/1989.log:1 row in set (4.08 sec)
par_sql1/1990.log:1 row in set (4.59 sec)
par_sql1/1991.log:1 row in set (4.26 sec)
par_sql1/1992.log:1 row in set (4.54 sec)
par_sql1/1993.log:1 row in set (2.78 sec)
par_sql1/1994.log:1 row in set (3.41 sec)
par_sql1/1995.log:1 row in set (4.87 sec)
par_sql1/1996.log:1 row in set (4.41 sec)
par_sql1/1997.log:1 row in set (3.69 sec)
par_sql1/1998.log:1 row in set (3.56 sec)
par_sql1/1999.log:1 row in set (4.47 sec)
par_sql1/2000.log:1 row in set (4.71 sec)
par_sql1/2001.log:1 row in set (4.81 sec)
par_sql1/2002.log:1 row in set (4.19 sec)
par_sql1/2003.log:1 row in set (4.04 sec)
par_sql1/2004.log:1 row in set (5.12 sec)
par_sql1/2005.log:1 row in set (5.10 sec)
par_sql1/2006.log:1 row in set (4.93 sec)
par_sql1/2007.log:1 row in set (5.29 sec)
par_sql1/2008.log:1 row in set (5.59 sec)
par_sql1/2009.log:1 row in set (4.44 sec)
par_sql1/2010.log:1 row in set (4.91 sec)
par_sql1/2011.log:1 row in set (5.08 sec)
par_sql1/2012.log:1 row in set (4.85 sec)
par_sql1/2013.log:1 row in set (4.56 sec)

Complex Query 

Now we can try a more complex query. Let's imagine we want to find out which airlines had the maximum delays for flights inside the continental US during business days from 1988 to 2009 (I was trying to come up with a complex query with multiple conditions in the where clause).

select min(yeard), max(yeard), Carrier, count(*) as cnt,
       sum(ArrDelayMinutes>30) as flights_delayed,
       round(sum(ArrDelayMinutes>30)/count(*),2) as rate
FROM ontime
WHERE DayOfWeek not in (6,7)
  and OriginState not in ('AK', 'HI', 'PR', 'VI')
  and DestState not in ('AK', 'HI', 'PR', 'VI')
  and flightdate < '2010-01-01'
GROUP by carrier
HAVING cnt > 100000 and max(yeard) > 1990
ORDER by rate DESC

As the query has "group by" and "order by" plus multiple ranges in the where clause, it will have to create a temporary table:

           id: 1
  select_type: SIMPLE
        table: ontime
         type: index
possible_keys: comb1
          key: comb1
      key_len: 9
          ref: NULL
         rows: 148046200
        Extra: Using where; Using temporary; Using filesort

(for this query I’ve created the combined index: KEY comb1 (Carrier,YearD,ArrDelayMinutes)  to increase performance)

The query runs in ~15 minutes:

+------------+------------+---------+----------+-----------------+------+
| min(yeard) | max(yeard) | Carrier | cnt      | flights_delayed | rate |
+------------+------------+---------+----------+-----------------+------+
|       2003 |       2009 | EV      |  1454777 |          237698 | 0.16 |
|       2006 |       2009 | XE      |  1016010 |          152431 | 0.15 |
|       2006 |       2009 | YV      |   740608 |          110389 | 0.15 |
|       2003 |       2009 | B6      |   683874 |          103677 | 0.15 |
|       2003 |       2009 | FL      |  1082489 |          158748 | 0.15 |
|       2003 |       2005 | DH      |   501056 |           69833 | 0.14 |
|       2001 |       2009 | MQ      |  3238137 |          448037 | 0.14 |
|       2003 |       2006 | RU      |  1007248 |          126733 | 0.13 |
|       2004 |       2009 | OH      |  1195868 |          160071 | 0.13 |
|       2003 |       2006 | TZ      |   136735 |           16496 | 0.12 |
|       1988 |       2009 | UA      |  9593284 |         1197053 | 0.12 |
|       1988 |       2009 | AA      | 10600509 |         1185343 | 0.11 |
|       1988 |       2001 | TW      |  2659963 |          280741 | 0.11 |
|       1988 |       2009 | CO      |  6029149 |          673863 | 0.11 |
|       2007 |       2009 | 9E      |   577244 |           59440 | 0.10 |
|       1988 |       2009 | DL      | 11869471 |         1156267 | 0.10 |
|       1988 |       2009 | NW      |  7601727 |          725460 | 0.10 |
|       1988 |       2009 | AS      |  1506003 |          146920 | 0.10 |
|       2003 |       2009 | OO      |  2654259 |          257069 | 0.10 |
|       1988 |       2009 | US      | 10276941 |          991016 | 0.10 |
|       1988 |       1991 | PA      |   206841 |           19465 | 0.09 |
|       1988 |       2005 | HP      |  2607603 |          235675 | 0.09 |
|       1988 |       2009 | WN      | 12722174 |         1107840 | 0.09 |
|       2005 |       2009 | F9      |   307569 |           28679 | 0.09 |
+------------+------------+---------+----------+-----------------+------+
24 rows in set (15 min 56.40 sec)

 

Now we can split this query and run the 31 queries (=31 distinct airlines in this table) in parallel. I have used the following script:

date
for c in '9E' 'AA' 'AL' 'AQ' 'AS' 'B6' 'CO' 'DH' 'DL' 'EA' 'EV' 'F9' 'FL' 'HA' 'HP' 'ML' 'MQ' 'NW' 'OH' 'OO' 'PA' 'PI' 'PS' 'RU' 'TW' 'TZ' 'UA' 'US' 'WN' 'XE' 'YV'
do
        sql="select min(yeard), max(yeard), Carrier, count(*) as cnt, sum(ArrDelayMinutes>30) as flights_delayed, round(sum(ArrDelayMinutes>30)/count(*),2) as rate FROM ontime WHERE DayOfWeek not in (6,7) and OriginState not in ('AK', 'HI', 'PR', 'VI') and DestState not in ('AK', 'HI', 'PR', 'VI') and flightdate < '2010-01-01' and carrier = '$c'"
        mysql -uroot -vvv ontime -e "$sql" &>par_sql_complex/$c.log &
done
wait
date

In this case we will also avoid creating a temporary table (as we have an index that starts with carrier).

Results: the total time is 5 min 47 seconds (3x faster)

Start: 15:41:02 EST 2013
End:   15:46:49 EST 2013

Per query statistics:

par_sql_complex/9E.log:1 row in set (44.47 sec)
par_sql_complex/AA.log:1 row in set (5 min 41.13 sec)
par_sql_complex/AL.log:1 row in set (15.81 sec)
par_sql_complex/AQ.log:1 row in set (14.52 sec)
par_sql_complex/AS.log:1 row in set (2 min 43.01 sec)
par_sql_complex/B6.log:1 row in set (1 min 26.06 sec)
par_sql_complex/CO.log:1 row in set (3 min 58.07 sec)
par_sql_complex/DH.log:1 row in set (31.30 sec)
par_sql_complex/DL.log:1 row in set (5 min 47.07 sec)
par_sql_complex/EA.log:1 row in set (28.58 sec)
par_sql_complex/EV.log:1 row in set (2 min 6.87 sec)
par_sql_complex/F9.log:1 row in set (46.18 sec)
par_sql_complex/FL.log:1 row in set (1 min 30.83 sec)
par_sql_complex/HA.log:1 row in set (39.42 sec)
par_sql_complex/HP.log:1 row in set (2 min 45.57 sec)
par_sql_complex/ML.log:1 row in set (4.64 sec)
par_sql_complex/MQ.log:1 row in set (2 min 22.55 sec)
par_sql_complex/NW.log:1 row in set (4 min 26.67 sec)
par_sql_complex/OH.log:1 row in set (1 min 9.67 sec)
par_sql_complex/OO.log:1 row in set (2 min 14.97 sec)
par_sql_complex/PA.log:1 row in set (17.62 sec)
par_sql_complex/PI.log:1 row in set (14.52 sec)
par_sql_complex/PS.log:1 row in set (3.46 sec)
par_sql_complex/RU.log:1 row in set (40.14 sec)
par_sql_complex/TW.log:1 row in set (2 min 32.32 sec)
par_sql_complex/TZ.log:1 row in set (14.16 sec)
par_sql_complex/UA.log:1 row in set (4 min 55.18 sec)
par_sql_complex/US.log:1 row in set (4 min 38.08 sec)
par_sql_complex/WN.log:1 row in set (4 min 56.12 sec)
par_sql_complex/XE.log:1 row in set (24.21 sec)
par_sql_complex/YV.log:1 row in set (20.82 sec)

As we can see, there are large airlines (like AA, UA, US, DL, etc.) which took most of the time. In this case the load is not distributed as evenly as in the previous example; however, by running the query in parallel we got a 3x better response time on this server.

CPU utilization:

Cpu3  : 22.0%us,  1.2%sy,  0.0%ni, 74.4%id,  2.4%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  : 16.0%us,  0.0%sy,  0.0%ni, 84.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  : 39.0%us,  1.2%sy,  0.0%ni, 56.1%id,  3.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  : 33.3%us,  0.0%sy,  0.0%ni, 51.9%id, 13.6%wa,  0.0%hi,  1.2%si,  0.0%st
Cpu7  : 33.3%us,  1.2%sy,  0.0%ni, 48.8%id, 16.7%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  : 24.7%us,  0.0%sy,  0.0%ni, 60.5%id, 14.8%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  : 24.4%us,  0.0%sy,  0.0%ni, 56.1%id, 19.5%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 : 40.7%us,  0.0%sy,  0.0%ni, 56.8%id,  2.5%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 : 19.5%us,  1.2%sy,  0.0%ni, 65.9%id, 12.2%wa,  0.0%hi,  1.2%si,  0.0%st
Cpu12 : 40.2%us,  1.2%sy,  0.0%ni, 56.1%id,  2.4%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 : 82.7%us,  0.0%sy,  0.0%ni, 17.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 : 55.4%us,  0.0%sy,  0.0%ni, 43.4%id,  1.2%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 : 86.6%us,  0.0%sy,  0.0%ni, 13.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu16 : 61.0%us,  1.2%sy,  0.0%ni, 37.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu17 : 29.3%us,  1.2%sy,  0.0%ni, 69.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu18 : 18.8%us,  0.0%sy,  0.0%ni, 52.5%id, 28.8%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu19 : 14.3%us,  1.2%sy,  0.0%ni, 57.1%id, 27.4%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu20 : 12.3%us,  0.0%sy,  0.0%ni, 59.3%id, 28.4%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu21 : 10.7%us,  0.0%sy,  0.0%ni, 76.2%id, 11.9%wa,  0.0%hi,  1.2%si,  0.0%st
Cpu22 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu23 : 10.8%us,  2.4%sy,  0.0%ni, 71.1%id, 15.7%wa,  0.0%hi,  0.0%si,  0.0%st

Note that in the case of "order by" we will need to sort the results manually; however, sorting 10-100 rows will be fast.
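That manual sort step could look roughly like this (a sketch; the three tab-separated carrier/rate rows below are made-up stand-ins for the real per-carrier output files):

```shell
# Merge the per-carrier rows and re-apply "ORDER BY rate DESC" by hand.
# Hypothetical sample rows stand in for the real par_sql_complex/*.log output.
printf 'EV\t0.16\nUA\t0.12\nXE\t0.15\n' > /tmp/parallel_rows.tsv
# numeric, reverse sort on the rate column (field 2)
sort -t$'\t' -k2,2nr /tmp/parallel_rows.tsv
```

With real data you would first strip the mysql -vvv decoration from each log before concatenating and sorting.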

Conclusion

Splitting a complex report into multiple queries and running them in parallel (asynchronously) can increase performance (3x to 10x in the above examples) and will better utilize modern hardware. It is also possible to split the queries between multiple MySQL servers (i.e. MySQL slave servers) to further increase scalability (this will require more coding).

The post Increasing slow query performance with the parallel query execution appeared first on MySQL Performance Blog.
