
Latest MySQL Performance Blog posts


Percona's MySQL & InnoDB performance and scalability blog

Using MySQL triggers and views in Amazon RDS

July 2, 2014 - 3:00am

I recently had an opportunity to migrate a customer from a physical server into Amazon’s RDS environment. In this particular case the customer’s platform makes extensive use of MySQL triggers and views. I came across two significant issues that prevented me from following Amazon’s documentation, which basically states “use mysqldump” but doesn’t call out a specific method of dealing with MySQL triggers and views.

Amazon Relational Database Service (Amazon RDS) is a great platform if you’re looking for complete hands-off management of your MySQL environment, but it comes at a cost in flexibility: you don’t have the SUPER privilege, and this brings up additional challenges.

  1. You need to ensure you set log_bin_trust_function_creators=1 ( by default this is off, 0).
  2. You need to clean up your mysqldump syntax.

#1 is easy: you simply make a configuration change within the Amazon RDS GUI on the node’s Parameter Group to set log_bin_trust_function_creators=1 and then restart your Amazon RDS node. The restart is required since, without the SUPER privilege, you lose the ability to change DYNAMIC variables on the fly.
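
If you prefer to script this step, the same change can also be made with the AWS CLI; the sketch below is illustrative only and the parameter group name is a placeholder:

# hypothetical parameter group; attach it to your RDS instance, then reboot the node
aws rds modify-db-parameter-group \
    --db-parameter-group-name my-rds-params \
    --parameters "ParameterName=log_bin_trust_function_creators,ParameterValue=1,ApplyMethod=pending-reboot"
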
#2 is a little more complex.  If you go with vanilla mysqldump (from say a 5.5 mysqldump binary) on a schema that has triggers and views, you will see error 1227, something like this:

ERROR 1227 (42000) at line 27311: Access denied; you need (at least one of) the SUPER privilege(s) for this operation

You’re seeing this message because MySQL in Amazon RDS doesn’t provide the SUPER privilege, and thus you cannot set up a trigger or view to run as a different user — only a user with SUPER can do that.

mysqldump will generate syntax for a trigger like this:

DELIMITER ;;
/*!50003 CREATE*/ /*!50017 DEFINER=`root`@`%`*/ /*!50003 TRIGGER `after_insert_lead` AFTER INSERT ON `leads`
FOR EACH ROW BEGIN
    UPDATE analytics.mapping SET id_lead = NEW.id_lead WHERE mc_email = NEW.email;
END */;;
DELIMITER ;

and for a view like this:

/*!50001 CREATE ALGORITHM=UNDEFINED */ /*!50013 DEFINER=`web`@`%` SQL SECURITY DEFINER */ /*!50001 VIEW `admin_user_view` AS SELECT ...

The problem is in the “DEFINER” lines.

Here’s one method that worked for me:

  1. Identify all the DEFINER lines in your schema. I found it helpful to dump with --no-data and then weed through that output to get a unique list of the DEFINER lines.
  2. Create a sed line for each unique DEFINER line (see my example in a moment)
  3. Include this sed line in your dump/load script

Here’s what my sed matches looked like:

sed -e 's/\/\*!50017 DEFINER=`root`@`localhost`\*\///' \
    -e 's/\/\*!50017 DEFINER=`root`@`%`\*\///' \
    -e 's/\/\*!50017 DEFINER=`web`@`%`\*\///' \
    -e 's/\/\*!50017 DEFINER=`cron`@`%`\*\///' \
    -e 's/\/\*!50013 DEFINER=`cron`@`%` SQL SECURITY DEFINER \*\///' \
    -e 's/\/\*!50013 DEFINER=`root`@`localhost` SQL SECURITY DEFINER \*\///' \
    -e 's/\/\*!50013 DEFINER=`root`@`%` SQL SECURITY DEFINER \*\///' \
    -e 's/\/\*!50013 DEFINER=`web`@`%` SQL SECURITY DEFINER \*\///'

Note: the forward slashes and asterisks inside the version comments need to be escaped, as shown above, so that sed treats them literally.

A big caveat: this method is akin to a brute force method of getting your data into Amazon RDS — you’ve lost the elegance & security of running your triggers and views as separate defined users within the database — they are all now going to run as the user you loaded them in as. If this is a show-stopper for you, contact Percona and I’d be happy to take on your case and develop a more comprehensive solution. 

Now all that’s left is to integrate this into your dump flow.  Something like this should work:

mysqldump --host=source | sed -e ... lots of lines | mysql --host=destination
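
For completeness, here is a rough sketch of how the whole dump/load flow could be scripted. The hostnames, schema name and credentials are placeholders, and the two generic sed patterns are simply a generalization of the per-user lines shown earlier:

#!/bin/bash
# sketch only: dump from the source, strip DEFINER clauses, load straight into Amazon RDS
# credentials are assumed to come from ~/.my.cnf on the machine running the script
SRC_HOST=source.example.com
DST_HOST=myinstance.xxxxxxxx.us-east-1.rds.amazonaws.com
DB=mydb

mysqldump --host="$SRC_HOST" --single-transaction --routines --triggers "$DB" \
  | sed -e 's/\/\*!50017 DEFINER=`[^`]*`@`[^`]*`\*\///' \
        -e 's/\/\*!50013 DEFINER=`[^`]*`@`[^`]*` SQL SECURITY DEFINER \*\///' \
  | mysql --host="$DST_HOST" "$DB"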

I hope this helps someone!


Percona Server 5.6.19-67.0 with TokuDB (GA) now available

July 1, 2014 - 8:15am

Percona is glad to announce the release of Percona Server 5.6.19-67.0 on July 1, 2014. Download the latest version from the Percona web site or from the Percona Software Repositories.

Based on MySQL 5.6.19, including all the bug fixes in it, Percona Server 5.6.19-67.0 is the current GA release in the Percona Server 5.6 series. All of Percona’s software is open-source and free. Complete details of this release can be found in the 5.6.19-67.0 milestone on Launchpad.

New Features:

  • Percona has merged a contributed patch by Kostja Osipov implementing the Multiple user level locks per connection feature. This feature fixes the upstream bugs: #1118 and #67806.
  • TokuDB storage engine support is now considered general availability (GA) quality. The TokuDB storage engine from Tokutek improves scalability and the operational efficiency of MySQL with faster performance and increased compression. It is available as a separate package and can be installed along with the Percona Server by following the instructions in the release documentation.
  • Percona Server now supports the MTR --valgrind option for a server that is either statically or dynamically linked with jemalloc.

Bugs Fixed:

  • The libperconaserverclient18.1 package was missing the library files. Bug fixed #1329911.
  • Percona Server 5.6.17-66.0 introduced a regression when support for the TokuDB storage engine was first added. This regression caused spurious “wrong table structure” errors for PERFORMANCE_SCHEMA tables. Bug fixed #1329772.
  • A race condition in the group commit code could trigger a race condition in the PFS instrumentation code, resulting in a server crash. Bug fixed #1309026 (upstream #72681).

Other bugs fixed: #1326348 and #1167486.

NOTE: There was no Percona Server 5.6.18 release because there was no MySQL Community Server 5.6.18 release. That version number was used for a MySQL Enterprise Edition release to address the OpenSSL “Heartbleed” issue.

Release notes for Percona Server 5.6.19-67.0 are available in the online documentation. Please report any bugs on the launchpad bug tracker.


How to avoid even more of the common (but deadly) MySQL development mistakes

June 30, 2014 - 9:08am

On July 16 I’ll be presenting my next webinar focusing on common mistakes committed by MySQL users.

How to Avoid Even More of the Common (but Deadly) MySQL Development Mistakes

“Why can’t I just save my data to a file?”

Using an SQL database seems so complex to get right, and for good reason. The variety of data-driven applications is practically limitless, and as project requirements change, we find ourselves taking shortcuts and adopting bad habits. But there are proven methods to understanding how to develop and manage data in a scalable and reliable way. This talk shows you some of these methods, including:

  • How to optimize a database application with partitioning and sharding.
  • How to avoid the secret security vulnerability that you’re probably guilty of.
  • How to use optimizer hints effectively.

At the end of this webinar, you’ll be more productive and confident as you develop database-driven applications.

Please register for this webinar and join me on July 16!

You might also like to view recordings of my past “deadly mistakes” webinars:


Failover with the MySQL Utilities – Part 1: mysqlrpladmin

June 27, 2014 - 8:41am

MySQL Utilities are a set of tools provided by Oracle to perform many kinds of administrative tasks. When GTID-replication is enabled, 2 tools can be used for slave promotion: mysqlrpladmin and mysqlfailover. We will review mysqlrpladmin (version 1.4.3) in this post.

Summary
  • mysqlrpladmin can perform manual failover/switchover when GTID-replication is enabled.
  • You need to have your servers configured with --master-info-repository = TABLE or to add the --rpl-user option for the tool to work properly.
  • The check for errant transactions is failing in the current GA version (1.4.3) so be extra careful when using it or watch bug #73110 to see when a fix is committed.
  • There are some limitations, for instance the inability to pre-configure the list of slaves in a configuration file or the inability to check that the tool will work well without actually doing a failover or switchover.
Failover vs switchover

mysqlrpladmin can help you promote a slave to be the new master when the master goes down, and then automate replication reconfiguration after this slave promotion. There are 2 separate scenarios: unplanned promotion (failover) and planned promotion (switchover). More than a difference in wording, the distinction affects the way you have to run the tool.

Setup for this test

To test the tool, our setup will be a master with 2 slaves, all using GTID replication. mysqlrpladmin can show us the current replication topology with the health command:

$ mysqlrpladmin --master=root@localhost:13001 --discover-slaves-login=root health
# Discovering slaves for master at localhost:13001
# Discovering slave at localhost:13002
# Found slave: localhost:13002
# Discovering slave at localhost:13003
# Found slave: localhost:13003
# Checking privileges.
#
# Replication Topology Health:
+------------+--------+---------+--------+------------+---------+
| host       | port   | role    | state  | gtid_mode  | health  |
+------------+--------+---------+--------+------------+---------+
| localhost  | 13001  | MASTER  | UP     | ON         | OK      |
| localhost  | 13002  | SLAVE   | UP     | ON         | OK      |
| localhost  | 13003  | SLAVE   | UP     | ON         | OK      |
+------------+--------+---------+--------+------------+---------+
# ...done.

As you can see, we have to specify how to connect to the master (no surprise) but instead of listing all the slaves, we can let the tool discover them.

Simple failover scenario

What will the tool do when performing failover? Essentially we will give it the list of slaves and the list of candidates and it will:

  • Run a few sanity checks
  • Elect a candidate to be the new master
  • Make the candidate as up-to-date as possible by making it a slave of all the other slaves
  • Configure replication on all the other slaves to make them replicate from the new master

After killing -9 the master, let’s try failover:

$ mysqlrpladmin --slaves=root:@localhost:13002,root:@localhost:13003 --candidates=root@localhost:13002 failover

This time, the master is down so the tool has no way to automatically discover the slaves. Thus we have to specify them with the --slaves option.

However we get an error:

# Checking privileges.
# Checking privileges on candidates.
ERROR: You must specify either the --rpl-user or set all slaves to use --master-info-repository=TABLE.

The error message is clear, but it would have been nice to have such details when running the health command (maybe as a warning instead of an error). That would allow you to check beforehand that the tool can run smoothly, rather than discovering in the middle of an emergency that you have to dig through the documentation to find which option is missing.

Let’s choose to specify the replication user:

$ mysqlrpladmin --slaves=root:@localhost:13002,root:@localhost:13003 --candidates=root@localhost:13002 --rpl-user=repl:repl failover
# Checking privileges.
# Checking privileges on candidates.
# Performing failover.
# Candidate slave localhost:13002 will become the new master.
# Checking slaves status (before failover).
# Preparing candidate for failover.
# Creating replication user if it does not exist.
# Stopping slaves.
# Performing STOP on all slaves.
# Switching slaves to new master.
# Disconnecting new master as slave.
# Starting slaves.
# Performing START on all slaves.
# Checking slaves for errors.
# Failover complete.
#
# Replication Topology Health:
+------------+--------+---------+--------+------------+---------+
| host       | port   | role    | state  | gtid_mode  | health  |
+------------+--------+---------+--------+------------+---------+
| localhost  | 13002  | MASTER  | UP     | ON         | OK      |
| localhost  | 13003  | SLAVE   | UP     | ON         | OK      |
+------------+--------+---------+--------+------------+---------+
# ...done.

Simple switchover scenario

Let’s now restart the old master and configure it as a slave of the new master (by the way, this can be done with mysqlreplicate, another tool from the MySQL Utilities). If we want to promote the old master, we can run:

$ mysqlrpladmin --master=root@localhost:13002 --new-master=root@localhost:13001 --discover-slaves-login=root --demote-master --rpl-user=repl:repl --quiet switchover
# Discovering slave at localhost:13001
# Found slave: localhost:13001
# Discovering slave at localhost:13003
# Found slave: localhost:13003
+------------+--------+---------+--------+------------+---------+
| host       | port   | role    | state  | gtid_mode  | health  |
+------------+--------+---------+--------+------------+---------+
| localhost  | 13001  | MASTER  | UP     | ON         | OK      |
| localhost  | 13002  | SLAVE   | UP     | ON         | OK      |
| localhost  | 13003  | SLAVE   | UP     | ON         | OK      |
+------------+--------+---------+--------+------------+---------+

Notice that the master is available in this case so we can use the discover-slaves-login option. Also notice that we can tune the verbosity of the tool by using --quiet or --verbose or even log the output in a file with --log.

We also used --demote-master to make the old master a slave of the new master. Without this option, the old master will be isolated from the other nodes.

Extension points

In general doing switchover/failover at the database level is one thing but informing the other components of the application that something has changed is most often necessary for the application to keep on working correctly.

This is where the extension points are handy: you can execute a script before switchover/failover with --exec-before and after switchover/failover with --exec-after.

For instance with these simple scripts:

# cat /usr/local/bin/check_before
#!/bin/bash
/usr/local/mysql5619/bin/mysql -uroot -S /tmp/node1.sock -Ee 'SHOW SLAVE STATUS' > /tmp/before

# cat /usr/local/bin/check_after
#!/bin/bash
/usr/local/mysql5619/bin/mysql -uroot -S /tmp/node1.sock -Ee 'SHOW SLAVE STATUS' > /tmp/after

We can execute:

$ mysqlrpladmin --master=root@localhost:13001 --new-master=root@localhost:13002 --discover-slaves-login=root --demote-master --rpl-user=repl:repl --quiet --exec-before=/usr/local/bin/check_before --exec-after=/usr/local/bin/check_after switchover

Looking at /tmp/before and /tmp/after, we can see that our scripts have been executed:

# cat /tmp/before
# cat /tmp/after
*************************** 1. row ***************************
  Slave_IO_State: Queueing master event to the relay log
     Master_Host: localhost
     Master_User: repl
     Master_Port: 13002
[...]

If the external script does not seem to work, using --verbose can be useful to diagnose the issue.

What about errant transactions?

We already mentioned that errant transactions can create lots of issues when a new master is promoted in a cluster running GTIDs. So the question is: how does mysqlrpladmin behave when there is an errant transaction?

Let’s create an errant transaction:

# On localhost:13003
mysql> CREATE DATABASE test2;
mysql> FLUSH LOGS;
mysql> SHOW BINARY LOGS;
+------------------+-----------+
| Log_name         | File_size |
+------------------+-----------+
| mysql-bin.000001 |     69309 |
| mysql-bin.000002 |   1237667 |
| mysql-bin.000003 |       617 |
| mysql-bin.000004 |       231 |
+------------------+-----------+
mysql> PURGE BINARY LOGS TO 'mysql-bin.000004';

and let’s try to promote localhost:13003 as the new master:

$ mysqlrpladmin --master=root@localhost:13001 --new-master=root@localhost:13003 --discover-slaves-login=root --demote-master --rpl-user=repl:repl --quiet switchover
[...]
+------------+--------+---------+--------+------------+---------------------------------------------------------------+
| host       | port   | role    | state  | gtid_mode  | health                                                        |
+------------+--------+---------+--------+------------+---------------------------------------------------------------+
| localhost  | 13003  | MASTER  | UP     | ON         | OK                                                            |
| localhost  | 13001  | SLAVE   | UP     | ON         | IO thread is not running., Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.', Slave has 1 transactions behind master. |
| localhost  | 13002  | SLAVE   | UP     | ON         | IO thread is not running., Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.', Slave has 1 transactions behind master. |
+------------+--------+---------+--------+------------+---------------------------------------------------------------+

Oops! Although it is suggested by the documentation, the tool does not check errant transactions. This is a major issue as you cannot run failover/switchover reliably with GTID replication if errant transactions are not correctly detected.

The documentation suggests errant transactions should be checked, and a quick look at the code confirms that, but it does not work! So it has been reported (see bug #73110 mentioned above).
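
Until the tool’s check is fixed, you can do a manual sanity check yourself. Here is a minimal sketch using MySQL 5.6’s GTID functions; the GTID sets in the second statement are placeholders you paste in from each server:

-- on the master (or on the candidate you plan to promote):
SELECT @@GLOBAL.gtid_executed;

-- on each slave: anything returned here was executed locally and never came from the master
SELECT GTID_SUBTRACT('<slave gtid_executed>', '<master gtid_executed>') AS errant_transactions;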

Some limitations

Apart from the missing errant transaction check, I also noticed a few limitations:

  • You cannot use a configuration file listing all the slaves. This becomes tedious once you have a large number of slaves; in such a case, you could write a wrapper script around mysqlrpladmin to generate the right command for you (see the sketch after this list).
  • The slave election process is either automatic or it relies on the order of the servers given in the --candidates option. This is not very sophisticated.
  • It would be useful to have a --dry-run mode which would validate that everything is configured correctly but without actually failing/switching over. This is something MHA does for instance.
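
As an illustration of the wrapper idea mentioned above, a trivial sketch could look like this; the slave list file and its path are made up for the example:

#!/bin/bash
# rough sketch: build the --slaves list for mysqlrpladmin from a plain text file
# /etc/mysqlrpladmin.slaves is a hypothetical file with one host:port per line
SLAVES=$(awk '{ printf "%sroot:@%s", sep, $1; sep="," }' /etc/mysqlrpladmin.slaves)

mysqlrpladmin --slaves="$SLAVES" \
              --candidates=root@localhost:13002 \
              --rpl-user=repl:repl \
              failover
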
Conclusion

mysqlrpladmin is a very good tool to help you perform manual failover/switchover in a cluster using GTID replication. The main caveat at this point is the failing check for errant transactions, which requires a lot of care before executing the tool.


Percona Server with TokuDB (beta): Installation, configuration

June 26, 2014 - 9:18am

My previous post was an introduction to the TokuDB storage engine and aimed at explaining the basics of its design and how it differentiates from InnoDB/XtraDB. This post is all about motivating you to give it a try and have a look for yourself. Percona Server is not officially supporting TokuDB as of today, though the guys in the development team are working hard on this and the first GA release of Percona Server with TokuDB is looming on the horizon. However, there’s a beta version available now. For the installation tests in this post I’ve used the latest version of Percona Server alongside the accompanying TokuDB complement, which was published last week.

Installing Percona Server with TokuDB on a sandbox

One of the tools Percona Support Engineers really love is Giuseppe Maxia’s MySQL Sandbox. It allows us to set up a sandbox running a MySQL instance of our choice and makes running multiple instances for testing purposes very easy. Whenever a customer reaches us with a problem happening on a particular version of MySQL or Percona Server that we can reproduce, we quickly spin off a new sandbox and test it ourselves, so it’s very handy. I’ll use one here to explore this beta version of Percona Server with TokuDB, but if you prefer you can install it the regular way using a package from our apt experimental or yum testing repositories.

We start by downloading the tarballs from here: TokuDB’s plugin has been packaged in its own tarball, so there are two to download. Once you get them, decompress both and create a unified working directory, compressing it again to create a single tarball we’ll use as the source to create our sandbox:

[nando@test01 ~]# tar zxvf Percona-Server-5.6.17-rel66.0-608.Linux.x86_64.tar.gz
[nando@test01 ~]# tar zxvf Percona-Server-5.6.17-rel66.0-608.TokuDB.Linux.x86_64.tar.gz
[nando@test01 ~]# tar cfa Percona-Server-5.6.17-rel66.0-608.Linux.x86_64.tar.gz Percona-Server-5.6.17-rel66.0-608.Linux.x86_64/

Before going ahead, verify if you have transparent huge pages enabled as TokuDB won’t run if it is set. See this documentation page for explanation on what this is and how to disable it on Ubuntu. In my CentOS test server it was defined in a slightly different place and I’ve done the following to temporarily disable it:

[nando@test01]# echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
[nando@test01]# echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag

We’re now ready to create our sandbox. The following command should be enough (I’ve chosen to run Percona Server on port 5617, you can use any other available one):

[nando@test01 ~]# make_sandbox Percona-Server-5.6.17-rel66.0-608.Linux.x86_64.tar.gz -- --sandbox_directory=tokudb --sandbox_port=5617

If the creation process goes well you will see something like the following at the end:

.... sandbox server started
Your sandbox server was installed in $HOME/sandboxes/tokudb

You should now be able to access the MySQL console on the sandbox with the default credentials; if you cannot, verify the log in $HOME/sandboxes/tokudb/data/msandbox.err:

[nando@test01 ~]# mysql --socket=/tmp/mysql_sandbox5617.sock -umsandbox -pmsandbox

Alternatively, you can make use of the “use” script located inside the sandbox directory, which employs the same credentials (configured in the client section of the configuration file my.sandbox.cnf):

[nando@test01 ~]# cd sandboxes/tokudb/
[nando@test01 tokudb]# ./use

The first thing to check is whether TokuDB is listed as an available storage engine:

mysql> SHOW ENGINES;
+--------------------+---------+-----------------------------------------------------------------------------+--------------+-------+------------+
| Engine             | Support | Comment                                                                     | Transactions | XA    | Savepoints |
+--------------------+---------+-----------------------------------------------------------------------------+--------------+-------+------------+
| (...)              | (...)   | (...)                                                                       | (...)        | (...) | (...)      |
| TokuDB             | YES     | Tokutek TokuDB Storage Engine with Fractal Tree(tm) Technology              | YES          | YES   | YES        |
| InnoDB             | DEFAULT | Percona-XtraDB, Supports transactions, row-level locking, and foreign keys | YES          | YES   | YES        |
| (...)              | (...)   | (...)                                                                       | NO           | (...) | (...)      |
+--------------------+---------+-----------------------------------------------------------------------------+--------------+-------+------------+

If that’s not the case, you may need to load the plugins manually – I had to do so in my sandbox; you may not need to if you’re installing it from a package in a fresh setup:

mysql> INSTALL PLUGIN tokudb SONAME 'ha_tokudb.so';

TokuDB should now figure in the list of supported ENGINES but you still need to activate the related plugins:

mysql> INSTALL PLUGIN tokudb_file_map SONAME 'ha_tokudb.so';
mysql> INSTALL PLUGIN tokudb_fractal_tree_info SONAME 'ha_tokudb.so';
mysql> INSTALL PLUGIN tokudb_fractal_tree_block_map SONAME 'ha_tokudb.so';
mysql> INSTALL PLUGIN tokudb_trx SONAME 'ha_tokudb.so';
mysql> INSTALL PLUGIN tokudb_locks SONAME 'ha_tokudb.so';
mysql> INSTALL PLUGIN tokudb_lock_waits SONAME 'ha_tokudb.so';

Please note the INSTALL PLUGIN action results in permanent changes and thus is required only once. No modifications to MySQL’s configuration file are required to have those plugins load in subsequent server restarts.

Now you should see not only the main TokuDB plugin but also the add-ons to the INFORMATION SCHEMA:

mysql> SHOW PLUGINS;
+-------------------------------+----------+--------------------+--------------+---------+
| Name                          | Status   | Type               | Library      | License |
+-------------------------------+----------+--------------------+--------------+---------+
| (...)                         | (...)    | (...)              | (...)        | (...)   |
| TokuDB                        | ACTIVE   | STORAGE ENGINE     | ha_tokudb.so | GPL     |
| TokuDB_trx                    | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
| TokuDB_locks                  | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
| TokuDB_lock_waits             | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
| TokuDB_file_map               | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
| TokuDB_fractal_tree_info      | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
| TokuDB_fractal_tree_block_map | ACTIVE   | INFORMATION SCHEMA | ha_tokudb.so | GPL     |
+-------------------------------+----------+--------------------+--------------+---------+

We are now ready to create our first TokuDB table – the only different thing to do here is to specify TokuDB as the storage engine to use:

mysql> CREATE TABLE test.Numbers (id INT PRIMARY KEY, number VARCHAR(20)) ENGINE=TokuDB;

Note some unfamiliar files lying in the datadir; the details surrounding those are certainly good material for future posts:

[nando@test01]# ls ~/sandboxes/tokudb/data
auto.cnf                    _test_Numbers_main_3_2_19.tokudb
ibdata1                     _test_Numbers_status_3_1_19.tokudb
ib_logfile0                 tokudb.directory
ib_logfile1                 tokudb.environment
log000000000000.tokulog25   __tokudb_lock_dont_delete_me_data
msandbox.err                __tokudb_lock_dont_delete_me_environment
mysql                       __tokudb_lock_dont_delete_me_logs
mysql_sandbox5617.pid       __tokudb_lock_dont_delete_me_recovery
performance_schema          __tokudb_lock_dont_delete_me_temp
tc.log                      tokudb.rollback
test

Configuration: what’s really important

As noted by Vadim long ago, “Tuning of TokuDB is much easier than InnoDB, there’re only a few parameters to change, and actually out-of-box things running pretty well“:

mysql> show variables like 'tokudb_%';
+---------------------------------+------------------+
| Variable_name                   | Value            |
+---------------------------------+------------------+
| tokudb_alter_print_error        | OFF              |
| tokudb_analyze_time             | 5                |
| tokudb_block_size               | 4194304          |
| tokudb_cache_size               | 522651648        |
| tokudb_check_jemalloc           | 1                |
| tokudb_checkpoint_lock          | OFF              |
| tokudb_checkpoint_on_flush_logs | OFF              |
| tokudb_checkpointing_period     | 60               |
| tokudb_cleaner_iterations       | 5                |
| tokudb_cleaner_period           | 1                |
| tokudb_commit_sync              | ON               |
| tokudb_create_index_online      | ON               |
| tokudb_data_dir                 |                  |
| tokudb_debug                    | 0                |
| tokudb_directio                 | OFF              |
| tokudb_disable_hot_alter        | OFF              |
| tokudb_disable_prefetching      | OFF              |
| tokudb_disable_slow_alter       | OFF              |
| tokudb_disable_slow_update      | OFF              |
| tokudb_disable_slow_upsert      | OFF              |
| tokudb_empty_scan               | rl               |
| tokudb_fs_reserve_percent       | 5                |
| tokudb_fsync_log_period         | 0                |
| tokudb_hide_default_row_format  | ON               |
| tokudb_init_flags               | 11403457         |
| tokudb_killed_time              | 4000             |
| tokudb_last_lock_timeout        |                  |
| tokudb_load_save_space          | ON               |
| tokudb_loader_memory_size       | 100000000        |
| tokudb_lock_timeout             | 4000             |
| tokudb_lock_timeout_debug       | 1                |
| tokudb_log_dir                  |                  |
| tokudb_max_lock_memory          | 65331456         |
| tokudb_pk_insert_mode           | 1                |
| tokudb_prelock_empty            | ON               |
| tokudb_read_block_size          | 65536            |
| tokudb_read_buf_size            | 131072           |
| tokudb_read_status_frequency    | 10000            |
| tokudb_row_format               | tokudb_zlib      |
| tokudb_tmp_dir                  |                  |
| tokudb_version                  | tokudb-7.1.7-rc7 |
| tokudb_write_status_frequency   | 1000             |
+---------------------------------+------------------+
42 rows in set (0.00 sec)

The most important of the tokudb_ variables is arguably tokudb_cache_size. The test server where I ran those tests (test01) has a little less than 1GB of memory and, as you can see above, TokuDB is “reserving” half (50%) of it to itself. That’s the default behavior but, of course, you can change it. And you must do so if you are also going to have InnoDB tables on your server – you should not overcommit memory between the InnoDB and TokuDB engines. Shlomi Noach wrote a good post explaining the main TokuDB-specific variables and what they do. It’s definitely worth a read.
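
For example, a minimal excerpt of my.sandbox.cnf could look like the sketch below; the values are purely illustrative and should be sized to your own workload and memory:

[mysqld]
# split the ~1GB of RAM on this test box so TokuDB and InnoDB do not overcommit memory
tokudb_cache_size       = 256M
innodb_buffer_pool_size = 256M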

I hope you have fun testing Percona Server with TokuDB! If you run into any problems worth reporting, please let us know.


Why %util number from iostat is meaningless for MySQL capacity planning

June 25, 2014 - 3:00am

Earlier this month I wrote about vmstat iowait cpu numbers and some of the comments I got were advertising the use of util% as reported by the iostat tool instead. I find this number even more useless for MySQL performance tuning and capacity planning.

Now let me start by saying this is a really tricky and deceptive number. Many DBAs who report their systems having a very busy IO subsystem point out that %util in iostat was above 99%, and therefore they believe this number is a good indicator of an overloaded IO subsystem.

Indeed, when your IO subsystem is busy up to its full capacity, the utilization should be very close to 100%. However, it is perfectly possible for the IO subsystem, and MySQL with it, to have plenty of spare capacity even when utilization shows 100%, as I will show in an example.

Before that, though, let’s see what the iostat manual page has to say on this topic. From the man page we can read:

%util

Percentage of CPU time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100% for devices serving requests serially. But for devices serving requests in parallel, such as RAID arrays and modern SSDs, this number does not reflect their performance limits.

Which says right here that the number is useless for pretty much any production database server that is likely to be running RAID, Flash/SSD, SAN or cloud storage (such as EBS) capable of handling multiple requests in parallel.

Let’s look at the following illustration. I will run sysbench on a system with rather slow storage, a data size larger than the buffer pool, and a uniform access distribution to put pressure on the IO subsystem. I will use a read-only benchmark here as it keeps things simpler…

sysbench --num-threads=1 --max-requests=0 --max-time=6000000 --report-interval=10 --test=oltp --oltp-read-only=on --db-driver=mysql --oltp-table-size=100000000 --oltp-dist-type=uniform --init-rng=on --mysql-user=root --mysql-password= run

I’m seeing some 9 transactions per second, while disk utilization from iostat is at nearly 96%:

[ 80s] threads: 1, tps: 9.30, reads/s: 130.20, writes/s: 0.00 response time: 171.82ms (95%)
[ 90s] threads: 1, tps: 9.20, reads/s: 128.80, writes/s: 0.00 response time: 157.72ms (95%)
[ 100s] threads: 1, tps: 9.00, reads/s: 126.00, writes/s: 0.00 response time: 215.38ms (95%)
[ 110s] threads: 1, tps: 9.30, reads/s: 130.20, writes/s: 0.00 response time: 141.39ms (95%)

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
dm-0 0.00 0.00 127.90 0.70 4070.40 28.00 31.87 1.01 7.83 7.52 96.68

This makes a lot of sense: with a single-threaded read workload the drive is only busy fetching the data needed by the queries, which will not be 100% of the time, as there is some extra time needed to process the query on the MySQL side as well as to pass the result set back to sysbench.

So 96% utilization; 9 transactions per second, this is a close to full-system capacity with less than 5% of device time to spare, right?

Let’s run a benchmark with more concurrency – 4 threads at the time; we’ll see…

[ 110s] threads: 4, tps: 21.10, reads/s: 295.40, writes/s: 0.00 response time: 312.09ms (95%)
[ 120s] threads: 4, tps: 22.00, reads/s: 308.00, writes/s: 0.00 response time: 297.05ms (95%)
[ 130s] threads: 4, tps: 22.40, reads/s: 313.60, writes/s: 0.00 response time: 335.34ms (95%)

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
dm-0 0.00 0.00 295.40 0.90 9372.80 35.20 31.75 4.06 13.69 3.38 100.01

So we’re seeing 100% utilization now, but what is interesting is that we gained far more than the less-than-5% that utilization suggested was left: the throughput of the system increased about 2.5x.

Finally let’s do the test with 64 threads – this is more concurrency than exists at the storage level, which on this system is conventional hard drives in RAID…

[ 70s] threads: 64, tps: 42.90, reads/s: 600.60, writes/s: 0.00 response time: 2516.43ms (95%)
[ 80s] threads: 64, tps: 42.40, reads/s: 593.60, writes/s: 0.00 response time: 2613.15ms (95%)
[ 90s] threads: 64, tps: 44.80, reads/s: 627.20, writes/s: 0.00 response time: 2404.49ms (95%)

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
dm-0 0.00 0.00 601.20 0.80 19065.60 33.60 31.73 65.98 108.72 1.66 100.00

In this case we’re getting 4.5x of throughput compared to single thread and 100% utilization. We’re also getting almost double throughput of the run with 4 thread where 100% utilization was reported. This makes sense – there are 4 drives which can work in parallel and with many outstanding requests they are able to optimize their seeks better hence giving a bit more than 4x.

So what have we seen? The system that appeared to be at 96% of its capacity could still be driven to provide 4.5x the throughput, so it had plenty of extra IO capacity. More powerful storage might have a significantly greater ability to run requests in parallel, so it is quite possible to see 10x or more headroom after utilization% starts being reported close to 100%.

So if utilization% is not very helpful, what can we use to understand our database IO capacity better? First let’s look at the performance reported by those sysbench runs. If we look at the 95% response time, you can see that 1 thread and 4 threads had relatively close 95% times, growing just from about 150ms to 250-300ms. This is the number I really like to look at: if the system is able to respond to queries with a response time not significantly higher than it has at a concurrency of 1, it is not overloaded. I like using 3x as a multiplier, i.e. when the 95% response time spikes to more than 3x of the single-concurrency value, the system might be approaching overload.

With 64 threads the 95% response time is 15-20x of the one we see with single thread so it is surely overloaded.

Do we have anything reported by iostat which we can use in a similar way? It turns out we do! Check out the “await” column, which tells us how long the requester had to wait for an IO request to be serviced. With single concurrency it is 7.8ms, which is what these drives can do for random IO and is as good as it gets. With 4 threads it is 13.7ms, less than double the best possible, so also good enough… with a concurrency of 64, however, it is 108ms, which is over 10x of what this device could deliver with no waiting and is surely a sign of overload.
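
If you want to keep an eye on await over time, something like the rough sketch below can help; the device name comes from the example above and the awk field numbers match the iostat columns shown here, but they can differ between sysstat versions, so check your own header first:

# print r/s, w/s and await for dm-0 every 10 seconds
iostat -dx dm-0 10 | awk '$1 == "dm-0" { print "r/s=" $4, "w/s=" $5, "await=" $10 "ms" }'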

A couple words of caution. First, do not look at svctm, which is not designed with parallel processing in mind. You can see in our case it actually gets better with high concurrency, while in reality the database had to wait a lot longer for the requests it submitted. Second, iostat mixes reads and writes together in a single statistic, which for databases, and especially on RAID with a BBU, can be like mixing apples and oranges: writes should go to the writeback cache and be acknowledged essentially instantly, while reads only complete when the actual data can be delivered. The pt-diskstats tool from Percona Toolkit breaks them apart and so can be much more useful for tuning storage for database workloads.

Final note: I used a read-only workload on purpose. When it comes to writes things can be even more complicated: the MySQL buffer pool can be more efficient with more intensive writes, plus group commit might be able to commit a lot of transactions with a single disk write. Still, the same base logic applies.

Summary: The takeaway is simple – util% only tells you whether a device has at least one operation to deal with or is completely idle, which does not reflect actual utilization for the majority of modern IO subsystems. So you may have a lot of storage IO capacity left even when utilization is close to 100%.


Sysbench Benchmarking of Tesora’s Database Virtualization Engine

June 24, 2014 - 5:00am

Tesora, previously called Parelastic, asked Percona to do a sysbench benchmark evaluation of its Database Virtualization Engine on specific architectures on Amazon EC2.

The focus of Tesora is to provide a scalable Database As A Service platform for OpenStack. The Database Virtualization Engine (DVE) plays a part in this as it aims at allowing databases to scale transparently across multiple MySQL shards.

DVE was open sourced last week. Downloads and source are already available on tesora.com

Some of the features include:

  • Transparent sharding of data across multiple storage nodes.
  • Applications can connect to DVE directly, using the MySQL Protocol, no code changes required.
  • Transactional and full ACID compliance with multiple storage nodes.
  • Storage Nodes can be added on an existing cluster.
Benchmarking Setup

Synthetic benchmarks were run using sysbench on different environments and different DVE architectures, as provided by Tesora. Architectures with 1 and 3 DVE nodes were benchmarked using up to 5 storage nodes.

The environments include a disk-bound dataset, the purpose of which is to create more insight into how large datasets might scale across multiple nodes. This type of use case is a common reason for companies to look into solutions like sharding.

A memory-bound dataset was also benchmarked to find out how DVE performs.

The memory-bound dataset was also used to compare a standalone MySQL Instance with a single DVE node using a single storage node. This provides more insight into the amount of overhead DVE creates at its most basic setup.

The remainder of this blog post gives a general overview on the findings of these benchmarks.
The full report, which includes more information on the configuration and goes deeper in its analysis of the results, can be downloaded here (PDF).

Scalability

OLTP – Disk Bound – Throughput:

OLTP – Disk Bound – Response Time:

In terms of scalability DVE lives up to its expectations as long as there is enough CPU available on the DVE nodes.

More complex transactions that query across multiple shards scale quite well. The explanation for that scalability, which is beyond linear, is that the available memory grows as storage nodes are added. The environment becomes less disk bound and performs better.

SELECT, INSERT, UPDATE – Disk Bound – Throughput:

SELECT, INSERT, UPDATE – Disk Bound – Response Time:

Single row reads and writes scale even better. This demonstrates that sharding has to be tailored towards both the data and the queries that will be executed across those tables.

Especially when using storage nodes with default EBS storage, disk performance is bad which makes the difference even larger.

The good thing about this solution is that storage and DVE nodes can be added, and the capacity of the whole system increases. No application code changes are necessary.

Running such an environment could come at a higher cost, but it could save a lot of resources and thus money on development. Of course, since a sharded architecture such as this is more complex than a non-sharded one, the operational cost should not be ignored.

Overhead

Memory Bound Throughput:

Memory Bound Response Time:

The overhead of using DVE is noticeable, but much of it is related to the added network hop in the database layer and the CPU requirements of the DVE nodes.

DVE suffers as we go above 64 threads as you can see in the big increase in response time.

CPU Intensive

OLTP – Disk Bound:

DVE nodes are very CPU intensive. Throughout the analysis, the bottleneck is often caused by the CPU-constrained DVE nodes; this is very visible when looking at the benchmark results of p1_s3 and p1_s5, which use only one DVE node.

Even with simpler single-row, primary key-based SELECT statements, the overhead is noticeable.

Keep in mind that the hardware specifications of the DVE nodes (8 cores) were much higher than those of the database nodes (2 cores) themselves. This makes the issue even more apparent.

Summary

Overall DVE looks very promising. When looking at the sysbench results, DVE seems to scale very well, provided that there are enough CPU resources available for the DVE nodes. You can download the full report here (PDF).

It will be interesting to see how much DVE scales on larger instance types. Other types of benchmarks should also be performed, such as linkbench. It is also important to understand the impact on the operational side, such as adding storage nodes, backups & recovery, high availability, and how well it handles transactions in various scenarios. Stay tuned.


Getting to know TokuDB for MySQL

June 23, 2014 - 12:00am

During last April’s Percona Live MySQL Conference and Expo, TokuDB celebrated its first full year as an open source storage engine. I still remember reading the official announcement and the expectations it created one year ago. The premise was very interesting, as it had the potential of helping MySQL manage “big data” in a way InnoDB just couldn’t. It also provided additional interesting features like “hot schema changes,” all the while making our dear flash storages last longer.

While I’ve kept an eye on the evolution of TokuDB this past year, I reckon I hadn’t given it a try… until recently, when a beta version of Percona Server was released with support for TokuDB as a plugin.

If you haven’t tried TokuDB yet here’s an opportunity to do so. This first post looks into giving a bit of context about how TokuDB fits into MySQL and the next one, to be published in the following days, will document my experience in installing it with Percona Server. I hope you enjoy both and I would appreciate it if you’d take the time to add your comments and questions at the end of the post, so we can learn from each other.

The rise of InnoDB

As most of you know well, the heart of MySQL lies in the storage engine. InnoDB has completely changed MySQL, bringing not only transactional capabilities but also stability and new levels of maturity to the whole system. Even those that don’t really need transactions rejoice in the crash resistance of InnoDB. But do you remember that not so long ago InnoDB was a third-party proprietary plugin? At first you would need to compile MySQL with it. Later they made it all easier by allowing the installation and loading of the plugin into existing servers. But things really started to flourish when InnoDB went open source: its adoption increased and slowly it started gaining traction and people’s trust. With the code available for anyone to see, fix and extend, companies started committing their own resources into making InnoDB better, until it became the de facto storage engine of MySQL.

Balancing “big data” and storage cost

It’s true that today data stored (even compressed) in an InnoDB table takes more disk space compared to a similar MyISAM table, but no one would expect there to be no trade-offs in the development of a new technology. At the same time, disk capacities were also increasing, which improved the bytes/$ ratio and somewhat compensated for InnoDB’s larger appetite.

But the increase in disk capacity also pushed the boundaries of what’s worth storing. What was for many the unlimited capacity of Gigabyte disks became, well, a limit, and then Terabyte disks developed into a “must-have”, a true need. At the same time though, with so many interesting things to see and surf, people’s attention became disputed, and what was a commodity before turned into a scarce good. Today, if a website takes more than a handful of seconds to load, chances are it may lose the attention of some. SSD disks came to the rescue here, providing access to data in a fraction of the time a regular spindle disk would take. However, they don’t scale as well: the increase in cost per byte is proportional to the data access speed gains they brought, and the longevity (or durability) of SSD disks is not as good, which makes for an expensive investment. They need to be used wisely. For this reason, it’s becoming common to find a mixed use of fast and expensive SSD drives to store “popular” data and slower and cheaper spindle disks to store all the rest. Of course, this is a short term solution as it’s not that practical to maintain and requires a good amount of manual labor to decide which one stores what. In the long haul, it’s safe to predict SSD-based solutions will prosper as inexpensive storage, but until then it is necessary to find a compromise between “big data” and hardware investment.

TokuDB’s premise

Another way of tackling this problem is changing the logical part. If one could store more data in the same amount of disk space and yet be able to store and retrieve it as fast, or even faster, then we would possibly get better results (in terms of performance) and a better return on the investment made in storage. That was the approach targeted by Tokutek in the development of the TokuDB storage engine. The core of its architecture is based on a different, modern indexing approach, baptized Fractal Tree Indexes (FTI). I say “different” because most popular storage engines such as MyISAM and InnoDB have a B-tree indexing foundation, which has remained the somewhat “unchallenged” standard for at least the past three decades. And “modern” because its design takes into account the write-intensive workloads we see rising more and more in contemporary data systems, as well as the “wear and tear” characteristics of the latest storage devices.

Both data structures are tree-based, store data in similar leaf nodes and make use of index keys for ordering. But the way they manage and store data across the trees is different. TokuDB and its Fractal Tree structure use larger block sizes (bigger leaves) compared to InnoDB’s implementation of B-tree, which allows for better compression (the key to using less disk space) while also improving the performance of range queries. Just as important, TokuDB claims to make better use of I/O by employing a message propagation system and an “optimal” buffering mechanism.

While in a traditional B-tree based system a change made to the table is reflected by an update to the index to accommodate it, TokuDB treats each change as a message at first. The interesting bit here is that even before the message reaches the respective leaf and modifies it, the changes it carries are already accounted for by the database. It is as if the database’s content were composed of the data found in the nodes plus the messages circulating in the tree. This brings agility to the storage engine and plays an important role in providing Hot Schema Changes, for instance.

Regarding the optimized I/O buffering system, it is partly inherent to the use of bigger leaves. Or, if you prefer, the other way around: the practical use of bigger leaves is made possible because buffers are used in a more efficient way. Efficiency here is measured in terms of bandwidth use. Remember, an I/O to disk is many times more expensive (in time) than an I/O to memory; that’s why buffers are used – you fill data into the buffer more frequently (less costly) so you can “flush” its content to disk less frequently (more costly). The fuller the buffer is when you flush it to disk, the more efficiently you use bandwidth. TokuDB tries to make the most out of this, “with hundreds or thousands of operations for a single I/O“. The problem with B-trees is that, by design, it’s difficult to implement an efficient buffering system, so you tend to flush slightly filled buffers often. For this reason it is better to maintain smaller leaves in B-trees, which has the side effect of allowing for poorer compression. Tim Callaghan, head of engineering at Tokutek, explained those differences much better than I can at Percona Live London last November and his slides are available here.

One scenario that benefits from this optimized use of I/O is write-intensive applications. We have recently been using TokuDB with our Percona Cloud Tools (PCT) service to store and analyze slow query logs from MySQL servers. The compression benefits were also a driving reason in the choice of TokuDB as the storage engine for PCT, without which we would be much more limited in the number of organizations we could accommodate in this beta phase of the service. How big is the compression impact? Like everything else in MySQL, it depends on your schema. Shlomi Noach reported he was able to convert 4 Terabytes worth of uncompressed InnoDB data (or 2 Terabytes of compressed InnoDB data using KEY_BLOCK_SIZE=8) down to 200 Gigabytes. It may be that impressive.
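
To give an idea of how this surfaces in practice, the compression algorithm can be chosen per table through TokuDB’s row format. A small sketch follows; the table is made up for illustration and tokudb_zlib is the default row format:

-- pick a compression algorithm per table via the row format
CREATE TABLE slow_log_archive (
  id BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  query_text TEXT
) ENGINE=TokuDB ROW_FORMAT=tokudb_lzma;  -- heavier compression than the default tokudb_zlib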

Compression alone is a hugely attractive characteristic of TokuDB, but the storage engine also fits well in scenarios where storage space is not an issue. The optimization in I/O can help lagging replicas where writes (INSERTs) are the limiting factor, and not the network. The “hot schema changes” functionality can be a blessing if you need to add a column to a big table, or a secondary index. There’s also the not-less-important impact on flash drive durability. Mark Callaghan commented the following in a past post on this blog: “For pure-disk servers the advantages of TokuDB versus InnoDB are limited to a few workloads. For pure-flash servers the advantages of TokuDB are universal — 2X better compression (compared to InnoDB compression on my data) and larger (more sequential) writes means that you will buy much less flash and either it will last longer or you can buy less-expensive flash and it will last long enough“. And let’s not forget Vadim’s favorite feature in TokuDB: following query progress live in SHOW PROCESSLIST.

The future

Tokutek was clever in breaking with tradition and looking at the problem from another angle in the development of TokuDB. It benefited from the openness of MySQL and its storage engine API to implement a different solution, one that contemplates the reality of today: faster multi-core CPUs, modern but more “fragile” storage devices and a thirst for “big data.” Of course, it benefited as well from observing how B-tree based storage engines coped with evolving data systems in the last decades, and from the development of new algorithms, to come up with a new approach. And it made some things simpler along the way. It’s easier to tune TokuDB compared to InnoDB: I’ve counted 40 “tokudb_” variables while we find at least 100 more “innodb_“ ones. But it has yet to endure the test of time. Even though we’re not talking about a brand-new storage engine (Vadim reported his first experiences with it 5 years ago), it has only recently gone open source and community adoption is still in its initial phase, though steadily growing, as we can see by the number of opened bugs.

One thing that must worry many is the fact that there’s no open source hot backup software for TokuDB. Even though there’s a community HotBackup API available on GitHub, which is a specification for a pluggable backup utility, the only working hot backup solution available today is bundled in the Enterprise Edition of TokuDB. And since the design of TokuDB doesn’t allow for a backup approach based on copying the database files and then applying the logs containing the changes made during the backup, which is how MySQL Enterprise Backup and XtraBackup work, there’s no hope of easily extending an existing open source tool such as Percona XtraBackup to include TokuDB.

Hopefully we’ll see new open source backup software implementing the available API in the near future, but for now it seems the community is left with filesystem-level, snapshot-based tools such as mylvmbackup and xfs_freeze as the sole alternatives to the proprietary solution.


How to setup Docker for Percona ClusterControl and add existing Percona XtraDB Cluster

June 20, 2014 - 8:12am

In my previous post I showed you how to set up Percona XtraDB Cluster 5.6 on Docker. This time I will show you how to set up Percona ClusterControl and add the existing Percona XtraDB Cluster 5.6 that we built in the previous post.

Let us note the following details about our existing containers:

  • 172.17.0.2 dockerpxc1
  • 172.17.0.3 dockerpxc2
  • 172.17.0.4 dockerpxc3
  • 172.17.0.5 dockerccui-test

A quick tip for everyone who has followed my previous blog on setting up Percona XtraDB Cluster 5.6 on Docker: I did not install OpenSSH on the Docker instances on purpose and relied on the ‘docker attach’ command to get into each container. For this case, however, we will need to install openssh-server in each container, make sure SSH is running, and set up SSH key access so the Percona ClusterControl container can SSH into each Percona XtraDB Cluster node.
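
As a rough sketch of what that involves (package and service names are for Ubuntu 12.04, and you may also need to permit root logins in /etc/ssh/sshd_config):

# inside each Percona XtraDB Cluster container: install and start the SSH server
apt-get update && apt-get install -y openssh-server
service ssh start

# inside the Percona ClusterControl container: generate a key and copy it to every node
ssh-keygen -t rsa
ssh-copy-id root@172.17.0.2   # repeat for 172.17.0.3 and 172.17.0.4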

Create the Percona ClusterControl UI Docker container

We will need to create a docker container manually instead of building a container from a Dockerfile since we can’t run the ClusterControl installation non-interactively.

root@Perconallc-Support / # docker run --name dockerccui-test -p 80 -i -t ubuntu:12.04 bash

Notice that I had to add ‘-p 80’ to expose port 80 to the host network so we can access the Percona ClusterControl UI from a web browser; we will use the Ubuntu 12.04 Docker image. Later I will show you how to check which port was dynamically allocated on the host network.

I would recommend running ‘apt-get upgrade’ and ‘apt-get dist-upgrade’ just to make sure we have the latest software packages installed. Install the wget and lsb-release packages since these are needed in the next steps.

Download and run the Percona ClusterControl installer and follow instructions on the prompt.

root@Perconallc-Support / # chmod +x install-cc.sh
root@Perconallc-Support / # ./install-cc.sh

The installer will give you several options, one of which is to install Percona Server as Percona ClusterControl’s backend database; I’d highly recommend choosing ‘Yes’. If everything goes well you will then continue the setup of Percona ClusterControl in the web browser. If you missed installing lsb-release earlier you will get an error midway through the installation, but you can always install the lsb-release package and re-run the installation.

To identify the exposed port on the host’s side we will need to verify it:

root@Perconallc-Support / # docker inspect dockerccui-test | grep HostPort
    "HostPort": "49154"
    "HostPort": "49154"

As we can see, port 49154 was dynamically allocated on the host network and mapped to port 80 in the Docker instance. You may explicitly set the port mapping on the host network with ‘-p {hostPort}:{containerPort}’; please consult the official Docker documentation for further reading.
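
For example, re-creating the container with a fixed mapping would look like this (8080 is just an arbitrary free port on the host):

root@Perconallc-Support / # docker run --name dockerccui-test -p 8080:80 -i -t ubuntu:12.04 bash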

Setup Percona ClusterControl on the web browser

We can now access the Percona ClusterControl user interface through http://{Host}:49154/clustercontrol. Use the username (in email form) that you indicated during the installation and the default password ‘admin’ to log in.

We should see the following page after successfully logging in:

Percona ClusterControl Wizard

Select ‘Add an existing cluster’ and click Next then follow further instructions to get to the next page:

Add existing cluster

If all goes well you will be seeing the Database Clusters and can view your cluster nodes.

Percona ClusterControl UI on Docker container with Percona XtraDB Clusters 5.6 with each node on docker containers


Summary

In this blog I showed you how to set up Percona ClusterControl on Docker and add an existing cluster to ClusterControl.

* Created a Docker container for Percona ClusterControl
* Downloaded and installed Percona ClusterControl
* Added the existing Percona XtraDB Cluster in the Percona ClusterControl UI
* Profit!

You may also read the following blogs related to Percona ClusterControl:

For those who are new to Docker and containerization you may read through Patrick Galbraith’s blog series about Docker.

The post How to setup Docker for Percona ClusterControl and add existing Percona XtraDB Cluster appeared first on MySQL Performance Blog.

Using UDFs for geo-distance search in MySQL

June 19, 2014 - 7:44am

In my previous post about geo-spatial search in MySQL I described (along with other things) how to use geo-distance functions. In this post I will describe the geo-spatial distance functions in more detail.

If you need to calculate an exact distance between 2 points on Earth in MySQL (very common for geo-enabled applications) you have at least 3 choices.

  • Use stored function and implement haversine formula
  • Use UDF (user defined function) for haversine (see below)
  • In MySQL 5.6 you can use the st_distance function (newly documented); however, you will get the distance on a plane and not on Earth. The value returned will be good for sorting by distance but will not represent actual miles or kilometers.

MySQL stored function for calculating distance on Earth

I previously gave an example of a MySQL stored function which implements the haversine formula. However, the approach I used was not very precise: it was optimized for speed. If you need a more precise haversine implementation you can use this function (the result will be in miles):

delimiter //
create DEFINER = CURRENT_USER function haversine_distance_sp (lat1 double, lon1 double, lat2 double, lon2 double) returns double
begin
  declare R int DEFAULT 3958.76;
  declare phi1 double;
  declare phi2 double;
  declare d_phi double;
  declare d_lambda double;
  declare a double;
  declare c double;
  declare d double;
  set phi1 = radians(lat1);
  set phi2 = radians(lat2);
  set d_phi = radians(lat2-lat1);
  set d_lambda = radians(lon2-lon1);
  set a = sin(d_phi/2) * sin(d_phi/2) + cos(phi1) * cos(phi2) * sin(d_lambda/2) * sin(d_lambda/2);
  set c = 2 * atan2(sqrt(a), sqrt(1-a));
  set d = R * c;
  return d;
end;
//
delimiter ;

(the algorithm is based on the standard formula, I’ve used the well-known Movable Type scripts calculator)

This is a slower implementation as it uses the arctangent; however, it is more precise.
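
As a quick sanity check, here is a hypothetical call to the function above, using the same two Bay Area points that appear later in this post; it should return roughly the same value as the UDF example below (about 28.3 miles):

-- Hypothetical usage of the stored function defined above (result in miles)
SELECT haversine_distance_sp(37.470295464, -122.572938858498,
                              37.760150536, -122.20701914150199) AS dist_in_miles;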

MySQL UDF for Haversine distance

Another approach, which will give you much better performance, is to use a UDF. There are a number of implementations; I’ve used lib_mysqludf_haversine.

Here are the simple steps to install it in MySQL 5.6 (it will also work with earlier versions):

$ wget 'https://github.com/lucasepe/lib_mysqludf_haversine/archive/master.zip'
$ unzip master.zip
$ cd lib_mysqludf_haversine-master/
$ make
mysql> show global variables like 'plugin%';
+---------------+-------------------------+
| Variable_name | Value                   |
+---------------+-------------------------+
| plugin_dir    | /usr/lib64/mysql/plugin |
+---------------+-------------------------+
1 row in set (0.00 sec)
$ sudo cp lib_mysqludf_haversine.so /usr/lib64/mysql/plugin/
mysql> CREATE FUNCTION haversine_distance RETURNS REAL SONAME 'lib_mysqludf_haversine.so';
mysql> select haversine_distance(37.470295464, -122.572938858498, 37.760150536, -122.20701914150199, 'mi') as dist_in_miles;
+---------------+
| dist_in_miles |
+---------------+
|     28.330467 |
+---------------+
1 row in set (0.00 sec)

Please note:

  • Make sure you have the mysql-devel or percona-server-devel package installed (MySQL development libraries) before installing.
  • You will need to specify the last parameter to be “mi” if you want to get the results in miles, otherwise it will give you kilometers.
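
To put the UDF to work in an application query, here is a hypothetical “find everything within 10 miles” search. The places table and its lat/lon columns are made up for illustration; for large tables you would typically pre-filter with an indexed bounding box so the UDF only runs on candidate rows:

-- Hypothetical radius search using the UDF; places(name, lat, lon) is an assumed table
SELECT name,
       haversine_distance(lat, lon, 37.470295464, -122.572938858498, 'mi') AS dist_mi
FROM places
HAVING dist_mi < 10
ORDER BY dist_mi;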

MySQL ST_distance function

In MySQL 5.6 you can use the ST_distance function:

mysql> select st_distance(point(37.470295464, -122.572938858498), point( 37.760150536, -122.20701914150199)) as distance_plane; +---------------------+ | distance_plane | +---------------------+ | 0.46681174155173943 | +---------------------+ 1 row in set (0.00 sec)

As we can see it does not give us an actual distance in miles or kilometers, as it does not take into account that we have latitude and longitude rather than X and Y on a plane.
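
That said, st_distance can still be useful for ranking results by proximity, since for nearby points the relative order on the plane is roughly the relative order on the sphere. A hypothetical example, again assuming a places(name, lat, lon) table:

-- Hypothetical: rank rows by plane distance from a fixed point (the value has no real units)
SELECT name,
       st_distance(point(lat, lon),
                   point(37.470295464, -122.572938858498)) AS plane_dist
FROM places
ORDER BY plane_dist
LIMIT 10;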

Geo Distance Functions Performance

Stored procedures and functions in MySQL are known to be slow, especially when they involve trigonometric functions. I did a quick test using the MySQL BENCHMARK() function.

First I set 2 points (10 miles from SFO airport):

set @rlon1 = -122.572938858498;
set @rlat1 = 37.470295464;
set @rlon2 = -122.20701914150199;
set @rlat2 = 37.760150536;

Next I used 4 functions in the benchmark:

  • Less precise stored function (haversine)
  • More precise stored function (haversine)
  • UDF for haversine
  • MySQL 5.6 native ST_distance (plane)

The BENCHMARK() function executes each of the above functions 100,000 times.

Here are the results:

mysql> select benchmark(100000, haversine_old_sp(@rlat1, @rlon1, @rlat2, @rlon2)) as less_precise_mysql_stored_proc;
+--------------------------------+
| less_precise_mysql_stored_proc |
+--------------------------------+
|                              0 |
+--------------------------------+
1 row in set (1.46 sec)

mysql> select benchmark(100000, haversine_distance_sp(@rlat1, @rlon1, @rlat2, @rlon2)) as more_precise_mysql_stored_proc;
+--------------------------------+
| more_precise_mysql_stored_proc |
+--------------------------------+
|                              0 |
+--------------------------------+
1 row in set (2.58 sec)

mysql> select benchmark(100000, haversine_distance(@rlat1, @rlon1, @rlat2, @rlon2, 'mi')) as udf_haversine_function;
+------------------------+
| udf_haversine_function |
+------------------------+
|                      0 |
+------------------------+
1 row in set (0.17 sec)

mysql> select benchmark(100000, st_distance(point(@rlat1, @rlon1), point(@rlat2, @rlon2))) as mysql_builtin_st_distance;
+---------------------------+
| mysql_builtin_st_distance |
+---------------------------+
|                         0 |
+---------------------------+
1 row in set (0.10 sec)

As we can see, the UDF gives a much faster response time (comparable to the built-in function).

Benchmark chart (smaller is better)

Conclusion

The lib_mysqludf_haversine UDF provides a good function for geo-distance search in MySQL. Please let me know in the comments which geo-distance functions or approaches you use in your applications.

The post Using UDFs for geo-distance search in MySQL appeared first on MySQL Performance Blog.

“How to monitor MySQL performance” with Percona Cloud Tools: June 25 webinar

June 18, 2014 - 6:58am

We recently released a new version of Percona Cloud Tools with MySQL monitoring capabilities. Join me June 25 and learn the details about all of the great new features inside Percona Cloud Tools – which is now free in beta. The webinar is titled “Monitoring All (Yes, All!) MySQL Metrics with Percona Cloud Tools” and begins at 10 a.m. Pacific time.

In addition to MySQL metrics, Percona Cloud Tools also monitors OS performance-related stats. The new Percona-agent gathers metrics with fine granularity (up to once per second), so you are able to see any of these metrics updated in real time.

During the webinar I’ll explain how the new Percona-agent works and how to configure it. And I’ll demonstrate the standard dashboard with the most important MySQL metrics and how to read them to understand your MySQL performance.

Our goal with the new implementation was to make installation as easy as possible. Seriously, it should not take as much effort as it has in the past to get visibility into your MySQL performance. We also wanted to provide as much visibility as possible.

Please take a moment and register now for the webinar. I also encourage you, if you haven’t already, to sign up for access to the free Percona Cloud Tools beta ahead of time. At the end of next week’s webinar you’ll know how to use the Percona-agent and will be able to start monitoring MySQL in less than 15 minutes!

See you June 25 and in the meantime you can check out our previous related posts: “From zero to full visibility of MySQL in 3 minutes with Percona Cloud Tools” and “Introducing the 3-Minute MySQL Monitor.”

The post “How to monitor MySQL performance” with Percona Cloud Tools: June 25 webinar appeared first on MySQL Performance Blog.

MySQL Backup Service from Percona

June 17, 2014 - 5:00am

The Percona Backup Service managed service launched today. It ensures properly configured backups run successfully as scheduled on customer provided backup storage – whether on premise, in the cloud, or a hybrid. Backup issues or production data recovery are efficiently handled by Percona Managed Services technicians with deep knowledge of MySQL.

As we state in our white papers, “Backup and recovery are a foundational piece of any infrastructure. A well-tested backup and recovery system can be the difference between a minor outage and the end of a business.” While MySQL backups are “foundational,” they still require constant management, and the most important use of a backup, namely recovery, is often complex.

The Percona MySQL Backup Service is available for companies using any variant of single-node or clustered MySQL — on premise or in the cloud. This includes MySQL, Percona Server, MariaDB, and Percona XtraDB Cluster.

Reliable MySQL Backup Services

Developed by MySQL experts with decades of operational experience, the Percona MySQL Backup Service is based on widely adopted open source database backup software solutions such as Percona XtraBackup. Monitoring software is installed to ensure MySQL backups run as expected and alert the Percona Managed Services team to any issues. Percona experts on duty 24x7x365 resolve problems before the availability of backups is impacted and can implement a full, partial, or point in time recovery to minimize downtime and data loss on production servers.

MySQL backup data sets are secure and compliant with regulatory requirements. 256 bit encryption meets or exceeds common security and compliance requirements. Internal procedures ensure that backups are managed with extreme care and are only stored safely on secure servers. Backups are available for on-demand recovery to comply with HIPAA, SOX, or PCI requirements.

The status of current and past backups is easily accessible through the Percona MySQL Backup Service customer portal. The portal includes instructions on how to use the backups to restore data for routine purposes such as restoration of development databases.

Efficient MySQL Data Recovery

Percona Backup Service technical experts respond within 30 minutes to ensure that production data is recovered quickly and as completely as possible. We can recover:

  • Data back to the last full capture image
  • Specific tables which saves significant time when only a single table or set of tables needs to be recovered
  • Full data to a specific point in time which ensures an application can be recovered to the same state as when the data was lost

Unlike database-as-a-service solutions, the Percona Backup Service can recover specific tables and full data to a specific point in time because we create additional backup points.

Cost Effective and Highly Flexible

Our MySQL backup-as-a-service solution costs less than managing backups in-house. Our 24x7x365 team serves multiple customers so the cost of the Percona MySQL Backup Service is lower than having someone on staff to manage backups. We also use a proprietary backup approach which significantly reduces the size of backup data sets, requiring much less storage space than conventional methods.

The Percona MySQL Backup Service accommodates any combination of MySQL server and backup server locations. We can work with MySQL server and backup servers on premise or in the cloud and can even configure the process to store one backup set on premise and another in the cloud for additional protection. This configuration flexibility means we can meet a wide range of data availability and budget needs.

MySQL Backup Service Features

The Percona Backup Service is designed so backups will run smoothly and reliably. The following features are included:

  • Customer portal for anytime access to current and past backup status as well as instructions on how to restore the MySQL backups for non-production purposes
  • Efficient data recovery for production issues with full, partial, and point in time recovery options
  • A high level of security with 256 bit encryption and backups only stored on the customer’s secure servers
  • Regulatory compliance with backups available for on-demand recovery to comply with HIPAA, SOX, and PCI requirements
  • Lower cost than managing MySQL backups in-house with 24x7x365 monitoring and issue resolution combined with a proprietary backup approach which significantly reduces the size of the backup data set versus conventional methods
  • Ability to accommodate any combination of MySQL server and backup server locations – on premise, in the cloud, or a hybrid
  • Flexible configuration options which enable the service to meet a wide range of data availability and budget requirements
Learn More

Learn more about our MySQL Backup Service solution as well as register for our upcoming “MySQL Backup and Recovery Best Practices” webinar on June 18, 2014. Contact us now to learn more about the Percona Backup Service and how we can ensure your backups are ready when you need them!

The post MySQL Backup Service from Percona appeared first on MySQL Performance Blog.

mydumper [less] locking

June 13, 2014 - 7:15am
In this post I would like to review how mydumper for MySQL works from the point of view of locks. Since the 0.6 series we have different options, so I will try to explain how they work.

As you may know, mydumper is multithreaded, and this adds a lot of complexity compared with other logical backup tools, as it also needs to coordinate all threads on the same snapshot to be consistent. So let’s review how mydumper does this with the default settings.

By default mydumper uses 4 threads to dump data and 1 main thread:

Main Thread
  • FLUSH TABLES WITH READ LOCK
Dump Thread X
  • START TRANSACTION WITH CONSISTENT SNAPSHOT;
  • dump non-InnoDB tables
Main Thread
  • UNLOCK TABLES
Dump Thread X
  • dump InnoDB tables

As you can see, in this case we need FTWRL for two things: to coordinate the transactions’ snapshots and to dump the non-InnoDB tables in a consistent way. So we hold the global read lock until all the non-InnoDB tables have been dumped.

What less-locking does is this:

Main Thread
  • FLUSH TABLES WITH READ LOCK
Dump Thread X
  • START TRANSACTION WITH CONSISTENT SNAPSHOT;
 LL Dump Thread X
  • LOCK TABLES non-InnoDB
Main Thread
  • UNLOCK TABLES
 LL Dump Thread X
  • dump non-InnoDB tables
  • UNLOCK non-InnoDB
Dump Thread X
  • dump InnoDB tables

So now the global read lock is only in place until the less-locking threads lock the non-InnoDB tables, and this is really fast. The only downside is that it uses double the amount of threads, so for the default (4 threads) we will end up having 9 connections, but only 4 of them will be running at the same time.

Less-locking really helps when you have MyISAM or ARCHIVE tables that are not heavily updated by the production workload. You should also know that LOCK TABLE … READ LOCAL allows non-conflicting INSERTs on MyISAM, so if you use those tables to keep logs (append only) you will not notice that lock at all.
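
Here is a small sketch of that behavior with a hypothetical MyISAM log table; while the less-locking thread holds the READ LOCAL lock, plain appends from the application are still accepted:

-- Session 1 (dump thread): lock the table the way less-locking mode does
LOCK TABLES app_log READ LOCAL;    -- app_log is a made-up, append-only MyISAM table

-- Session 2 (application): concurrent INSERTs at the end of the table still succeed
INSERT INTO app_log (msg) VALUES ('still logging during the dump');

-- Session 1: release the lock once the table has been dumped
UNLOCK TABLES;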

For the next release we will implement support for backup locks, which will let us avoid running FTWRL altogether.

The post mydumper [less] locking appeared first on MySQL Performance Blog.

Announcing Percona XtraBackup 2.2.3 GA

June 12, 2014 - 7:13am

Percona is glad to announce the release of Percona XtraBackup 2.2.3 on June 12th 2014. Downloads are available from our download site here and Percona Software Repositories.

Percona XtraBackup enables backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backup. The new 2.2.3 GA version offers improved performance, enterprise-grade security, and lower resource usage.

This release is the first GA (Generally Available) stable release in the 2.2 series.

New Features:

  • Percona XtraBackup implemented support for Backup Locks.
  • Percona XtraBackup can now store backup history on the server itself, in a special table created for that purpose (see the query sketch after this list).
  • Percona XtraBackup has been rebased on MySQL 5.6.17.
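
For the backup history feature mentioned above, a quick way to see what has been recorded is to query the history table directly. This is only a sketch; the exact table and column layout is described in the Percona XtraBackup documentation:

-- Sketch: show the most recent backups recorded by the backup history feature
SELECT name, tool_name, start_time, end_time, incremental
FROM PERCONA_SCHEMA.xtrabackup_history
ORDER BY start_time DESC
LIMIT 5;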

Bugs Fixed:

  • Fixed the InnoDB redo log incompatibility with 5.1/5.5 server and compressed tables which was introduced by the upstream fix in MySQL 5.6.11 that could make InnoDB crash on recovery when replaying redo logs created on earlier versions. Bug fixed #1255476.
  • Percona XtraBackup did not flush the InnoDB REDO log buffer before finalizing the log copy. This would only become a problem when the binary log coordinates were used after restoring from a backup: the actual data files state after recovery could be inconsistent with the binary log coordinates. Bug fixed #1320685.
  • innobackupex now sets wsrep_causal_reads to 0 before executing FLUSH TABLES WITH READ LOCK if the server is a member of the Galera cluster. Bug fixed #1320441.
  • storage/innobase/xtrabackup/CMakeLists.txt now honors the XB_DISTRIBUTION environment variable when configuring innobackupex.pl to innobackupex. Bug fixed #1320856.
  • Percona XtraBackup does not add XtraDB-specific fields when preparing a backup of an upstream MySQL server. Bug fixed #988310.
  • Information about backup type and parameters is now stored in the xtrabackup_info file in the backup directory. Bug fixed #1133017.
  • When backup locks are used, xtrabackup_slave_info should be written under BINLOG lock instead of TABLE lock. Bug fixed #1328532.

Release notes with all the bugfixes for Percona XtraBackup 2.2.3 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

The post Announcing Percona XtraBackup 2.2.3 GA appeared first on MySQL Performance Blog.

Percona Server 5.6.17-66.0 is now available

June 11, 2014 - 11:20am

 

Percona is glad to announce the release of Percona Server 5.6.17-66.0 on June 11, 2014. Downloads are available here and from the Percona Software Repositories.

Based on MySQL 5.6.17, including all the bug fixes in it, Percona Server 5.6.17-66.0 is the current GA release in the Percona Server 5.6 series. All of Percona’s software is open-source and free, all the details of the release can be found in the 5.6.17-66.0 milestone at Launchpad.

 

New Features:

  • Percona Server has ported the MariaDB code enhancement for START TRANSACTION WITH CONSISTENT SNAPSHOT.
  • Percona Server has implemented the ability to clone a snapshot created by START TRANSACTION WITH CONSISTENT SNAPSHOT in another session (see the sketch after this list).
  • TokuDB Storage engine is now available as a separate package that can be installed along with the Percona Server 5.6.17-66.0. This feature is currently considered Release Candidate quality.
  • SQL-bench has been made compatible with the TokuDB storage engine.
  • Percona Server 5.6 now includes HandlerSocket in addition to Percona Server 5.5.
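
For the snapshot-cloning feature above, usage looks roughly like this; this is only a sketch, so please check the Percona Server 5.6 documentation for the exact syntax:

-- Session A: open a consistent snapshot and note this connection's id
START TRANSACTION WITH CONSISTENT SNAPSHOT;
SELECT CONNECTION_ID();   -- suppose this returns 12

-- Session B: clone the same snapshot, so both sessions see identical data
START TRANSACTION WITH CONSISTENT SNAPSHOT FROM SESSION 12;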

Bugs Fixed:

  • Fix for #1225189 introduced a regression in Percona Server 5.6.13-61.0 which could lead to an error when running mysql_install_db. Bug fixed #1314596.
  • InnoDB could crash if workload contained writes to compressed tables. Bug fixed #1305364.
  • GUI clients such as MySQL Workbench could not authenticate with a user defined with auth_pam_compat plugin. Bug fixed #1166938.
  • Help in Percona Server 5.6 command line client was linking to Percona Server 5.1 manual. Bug fixed #1198775.
  • InnoDB redo log resizing could crash when XtraDB changed page tracking was enabled. Bug fixed #1204072.
  • Audit Log wasn’t parsing escape characters correctly in the OLD format. Bug fixed #1313696.
  • Percona Server version was reported incorrectly in Debian/Ubuntu packages. Bug fixed #1319670.

Other bugs fixed: #1318537 (upstream #72615), #1318874, #1285618, #1272732, #1314568, #1271178, and #1323014.

Release notes for Percona Server 5.6.17-66.0 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

The post Percona Server 5.6.17-66.0 is now available appeared first on MySQL Performance Blog.

Measure the impact of MySQL configuration changes with Percona Cloud Tools

June 11, 2014 - 12:00am

When you make a change to your MySQL configuration in production it would be great to know the impact (a “before and after” type of picture). Some changes are obvious. For many variables proper values can be determined beforehand, i.e. innodb_buffer_pool_size or innodb_log_file_size. However, there is 1 configuration variable which is much less obvious for many people working with MySQL: query_cache.

The idea of the query cache is great; however, there are a lot of issues with the MySQL query cache. One of the most important is the query cache mutex, which can cause severe contention on CPU-bound workloads. In MySQL 5.6 you can remove the mutex by disabling the query cache completely (set query_cache_type=0).
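
A minimal sketch of doing that; note that MySQL 5.6 skips the query cache mutex entirely only when the server is started with the cache disabled, so the my.cnf change plus a restart is the clean way to do it:

# In my.cnf, then restart:
#   [mysqld]
#   query_cache_type = 0
#   query_cache_size = 0

-- At runtime you can also free the cache and stop using it (assuming the server
-- was not already started with query_cache_type=0):
SET GLOBAL query_cache_size = 0;
SET GLOBAL query_cache_type = 0;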

There is a big issue with disabling query_cache though, and it is not a technical issue. The issue is how you convince your boss or dev/operations team that the query cache is bad. Almost all measurements available in MySQL will tell you that the query cache is utilized: there will be a good number of Qcache_hits. On the other hand you may see the contention in the processlist: for example, you can see one of the states associated with query cache contention: Waiting for query cache lock or Waiting on query cache mutex (if you are running Percona Server).
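
To make the case with numbers rather than opinions, you can put the hit counters next to what threads are actually waiting on; something along these lines (state names vary slightly between versions):

-- How "used" the query cache looks:
SHOW GLOBAL STATUS LIKE 'Qcache%';

-- How much it hurts: threads currently waiting on the query cache
SELECT id, user, time, state, LEFT(info, 60) AS query
FROM information_schema.processlist
WHERE state LIKE '%query cache%';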

Now you can use Percona Cloud Tools (with Percona Server) to actually see how the query cache is used, both globally and on a per-query basis.

Query cache visualization

Percona Server has a very useful additional feature: if you set the log_slow_verbosity option to “microtime,query_plan,innodb” it will also log whether each query was a query cache hit or a query cache miss. This can give you valuable insight into the real query cache utilization.

Percona Cloud Tools is really helpful here as it will visualize this information and actually show the “before and after” graph. (It will set the log_slow_verbosity option to “microtime,query_plan,innodb” for you if you set Slow log verbosity = Full in the agent configuration options.)
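
If you prefer to set this yourself rather than letting the agent do it, a minimal sketch for Percona Server looks like this (long_query_time = 0 logs every query, so use it carefully on busy servers):

SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 0;    -- capture all queries for the analysis window
SET GLOBAL log_slow_verbosity = 'microtime,query_plan,innodb';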

Here is the example. I’ve disabled the query cache and then looked at the total query and lock time.

As we can see, the total query time (across all queries) decreased significantly. The “Average QPS” on this picture is actually a little bit misleading and should be named “queries running at the same time” (it is calculated as query count / time). The above graph shows clear contention at the query cache level.

Just to confirm, we can look at the number of query cache hits:

The number of Query Cache “hits” dropped to 0 when we disabled the query cache.

Please note: if your workload is read-only and you are running the same complex query over and over again, the query cache may be helpful. In normal circumstances, however, the query cache can cause contention and increase response times (as you can see above).

Percona Cloud Tools is a free beta, please give it a try and share your experience in the comments. If you want to learn more about it, join the June 25 webinar hosted by Vadim Tkachenko titled, “Monitoring All (Yes, All!) MySQL Metrics with Percona Cloud Tools.” The webinar, which starts at 10 a.m. Pacific time, is free but I recommend registering now to reserve your spot.

 

The post Measure the impact of MySQL configuration changes with Percona Cloud Tools appeared first on MySQL Performance Blog.

Architecture and Design of MySQL-powered applications: June 11 Webinar

June 9, 2014 - 1:21pm

The architecture of MySQL-powered applications is one of my favorite topics to talk about. It’s a very important topic because if you do not get the architecture right you are very likely to fail with your project – either by failing to meet performance, high-availability or security requirements, or by failing to deliver on time and at the planned cost.

It’s also a great topic because there is so much knowledge available these days about MySQL-powered applications. MySQL has been around for a rather long time compared with many other solutions – and now we know what architectures have enabled people to build successful MySQL-powered applications and grow them to hundreds of millions of users as well as what applications did not work out.

This level of maturity really allows us to essentially take “off-the-shelf” MySQL architectures that can be used to build very successful applications with very low risk, at least when it comes to the database backend.

June 11 at 10 a.m. Pacific

On Wednesday, June 11 at 10 a.m. Pacific I’ll be leading  a webinar titled, “Architecture and Design of MySQL Powered Applications” during which I’ll be covering a lot of high-level scenarios such as “architectures that have been proven to be successful.” I’ll also discuss the architectures best suited for different kinds of applications and different kind of teams. Additionally, I’ll address what’s changed in the MySQL space in the last few years and how those changes have impacted modern MySQL architecture design approaches.

I hope to see you June 11. Register now to reserve your spot!

The post Architecture and Design of MySQL-powered applications: June 11 Webinar appeared first on MySQL Performance Blog.

Using MySQL 5.6 Performance Schema in multi-tenant environments

June 9, 2014 - 7:33am

Hosting a shared MySQL instance for your internal or external clients (“multi-tenant”) was always a challenge. A multi-tenant or “schema-per-customer” approach is pretty common nowadays for hosting multiple clients on the same MySQL server. One of the issues of this approach, however, is the lack of visibility: it is hard to tell how many resources (queries, disk, CPU, etc.) each user will use.

Percona Server contains the userstats Google patch, which will allow you to get the resource utilization per user. The new MySQL 5.6 performance_schema has even more instrumentation, which can give you better visibility on a per-user or per-schema/per-database level. And if you are running MySQL 5.6.6 or higher, Performance Schema is already enabled (with a minimal set of “instrumentation” though) and ready to use. In this post I will share examples of using Performance Schema for multi-tenant environments.
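
For completeness, here is roughly what the userstats route looks like on Percona Server before we move on; the rest of this post uses performance_schema and the sys views instead:

-- Percona Server only (a sketch): enable user statistics and look at per-user activity
SET GLOBAL userstat = ON;
SELECT * FROM information_schema.user_statistics ORDER BY busy_time DESC LIMIT 5;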

Overview

If you want to use Performance Schema, make sure it is enabled (enabled by default starting with MySQL 5.6.6):

mysql> show global variables like 'performance_schema'; +--------------------+-------+ | Variable_name | Value | +--------------------+-------+ | performance_schema | ON | +--------------------+-------+ 1 row in set (0.00 sec)

Performance_schema provides you with the “raw” metrics and it may be difficult to select data from it directly. The good news is that you can use the “sys” schema project by Mark Leith. The “sys” schema (old name: ps_helper) is a collection of views and stored procedures which will provide you with reports in a human-readable format. The installation is easy: download it from GitHub and run:

$ mysql -u root -p < ./sys_56.sql

(it will only create database “sys” and a set of views and stored procedures/stored functions in it)

Examples

For a multi-tenant environment the most interesting part is the resource utilization breakdown. Let’s say you want to “charge per IO,” similar to the Amazon RDS for MySQL model. You can now run this simple query against the “sys” schema, which will use Performance Schema in MySQL 5.6 to generate the report (in my example all users start with “a”):

mysql sys> select * from user_summary_by_file_io where user like 'a%'; +-------+------------+-------------+ | user | ios | io_latency | +-------+------------+-------------+ | api01 | 3061469563 | 4.04h | | api03 | 912296937 | 1.36h | | api02 | 815473183 | 1.22h | | app01 | 9704381 | 00:01:06.33 | | app02 | 1160149 | 8.18 s | | app03 | 1424065 | 7.23 s | +-------+------------+-------------+ 6 rows in set (0.06 sec)

If you need more extensive metrics you can use this report:

mysql sys> select * from user_summary where user like 'a%' limit 10; +-------+------------+-------------------+-----------------------+-------------+------------+-----------------+---------------------+-------------------+--------------+ | user | statements | statement_latency | statement_avg_latency | table_scans | file_ios | file_io_latency | current_connections | total_connections | unique_hosts | +-------+------------+-------------------+-----------------------+-------------+------------+-----------------+---------------------+-------------------+--------------+ | api01 | 1837898366 | 192.29h | 376.65 us | 2221018 | 3064724488 | 4.05h | 22 | 73748519 | 1 | | api02 | 833712609 | 206.83h | 893.11 us | 1164 | 836753563 | 1.26h | 7 | 246730 | 1 | | api03 | 988875004 | 222.03h | 808.29 us | 1156 | 943363030 | 1.43h | 8 | 247639 | 1 | | app01 | 1864159 | 4.01h | 7.75 ms | 46756 | 9704623 | 00:01:06.33 | 0 | 300004 | 1 | | app02 | 812520 | 00:20:51.69 | 1.54 ms | 42513 | 1164954 | 8.37 s | 0 | 122994 | 1 | | app03 | 811558 | 00:21:59.57 | 1.63 ms | 42310 | 1429373 | 7.40 s | 0 | 123110 | 1 | +-------+------------+-------------------+-----------------------+-------------+------------+-----------------+---------------------+-------------------+--------------+ 6 rows in set

Or a breakdown per user per statement:

mysql sys> select user, statement, total, rows_sent, rows_examined, rows_affected, full_scans from user_summary_by_statement_type where user = 'api01' limit 100; +-------+-------------+-----------+-----------+---------------+---------------+------------+ | user | statement | total | rows_sent | rows_examined | rows_affected | full_scans | +-------+-------------+-----------+-----------+---------------+---------------+------------+ | api01 | insert | 368405774 | 0 | 0 | 494179199 | 0 | | api01 | commit | 376276686 | 0 | 0 | 0 | 0 | | api01 | select | 476129532 | 761038004 | 1067363348 | 0 | 0 | | api01 | update | 142896430 | 0 | 142896437 | 141978066 | 0 | | api01 | begin | 376280424 | 0 | 0 | 0 | 0 | | api01 | show_fields | 2215016 | 310102240 | 310102240 | 0 | 2215016 | | api01 | Quit | 73421361 | 0 | 0 | 0 | 0 | | api01 | change_db | 17258767 | 0 | 0 | 0 | 0 | | api01 | delete | 7896 | 0 | 7896 | 7896 | 0 | | api01 | rollback | 3711 | 0 | 0 | 0 | 0 | +-------+-------------+-----------+-----------+---------------+---------------+------------+ 10 rows in set (0.04 sec)

If you are using the “schema per customer” approach you can get the per-schema resource utilization using sys.schema_table_statistics. Example (from Peter Zaitsev’s webinar):

mysql [localhost] {msandbox} > select * from schema_table_statistics where table_schema='sbtest' limit 1 G *************************** 1. row *************************** table_schema: sbtest table_name: sbtest rows_fetched: 158764154 fetch_latency: 1.37h rows_inserted: 378901 insert_latency: 00:07:17.38 rows_updated: 1136714 update_latency: 00:45:40.08 rows_deleted: 378902 delete_latency: 00:03:00.34 io_read_requests: 636003 io_read: 9.70 GiB io_read_latency: 00:28:12.01 io_write_requests: 203925 io_write: 3.11 GiB io_write_latency: 17.26 s io_misc_requests: 2449 io_misc_latency: 3.87 s

This report may be really slow if you have lots of tables. If you are only interested in disk utilization per database, you can query the performance_schema directly:

SELECT extract_schema_from_file_name(fsbi.file_name) AS table_schema, SUM(fsbi.count_read) AS io_read_requests, sys.format_bytes(SUM(fsbi.sum_number_of_bytes_read)) AS io_read, sys.format_time(SUM(fsbi.sum_timer_read)) AS io_read_latency, SUM(fsbi.count_write) AS io_write_requests, sys.format_bytes(SUM(fsbi.sum_number_of_bytes_write)) AS io_write, sys.format_time(SUM(fsbi.sum_timer_write)) AS io_write_latency, SUM(fsbi.count_misc) AS io_misc_requests, sys.format_time(SUM(fsbi.sum_timer_misc)) AS io_misc_latency FROM performance_schema.file_summary_by_instance AS fsbi GROUP BY table_schema ORDER BY io_write_requests DESC limit 10; +--------------------+------------------+------------+-----------------+-------------------+------------+------------------+------------------+-----------------+ | table_schema | io_read_requests | io_read | io_read_latency | io_write_requests | io_write | io_write_latency | io_misc_requests | io_misc_latency | +--------------------+------------------+------------+-----------------+-------------------+------------+------------------+------------------+-----------------+ | binlogs | 6943921992 | 19.57 TiB | 2.46h | 1124408959 | 6.60 TiB | 2.18h | 914411 | 00:03:55.75 | | d1 | 146448 | 2.24 GiB | 1.97 s | 22768241 | 241.44 GiB | 00:12:57.97 | 5095817 | 00:12:50.16 | | stats1 | 853 | 1.81 MiB | 97.12 ms | 1042768 | 15.93 GiB | 10.27 s | 292303 | 22.74 s | | stats2 | 3024 | 11.22 MiB | 781.96 ms | 750182 | 11.45 GiB | 9.32 s | 267780 | 50.84 s | | relaylogs | 105065 | 925.71 MiB | 312.82 ms | 447607 | 925.71 MiB | 1.19 s | 72169 | 530.70 ms | | o1268 | 2078 | 3.02 MiB | 7.40 ms | 1591 | 203.97 KiB | 1.17 ms | 2693 | 92.34 ms | +--------------------+------------------+------------+-----------------+-------------------+------------+------------------+------------------+-----------------+

Conclusion

Performance_schema in MySQL 5.6 is a very useful tool and can help you get better visibility in MySQL multi-user (multi-tenant) installations. It can also do much more, e.g. find all queries with temporary tables or profile locks and mutexes. The “sys” schema provides you with useful pre-defined reports; the table (view) names are almost self-explanatory:

+-----------------------------------------------+ | Tables_in_sys | +-----------------------------------------------+ | innodb_buffer_stats_by_schema | | innodb_buffer_stats_by_table | ... | processlist | ... | schema_tables_with_full_table_scans | | schema_unused_indexes | | statement_analysis | | statements_with_errors_or_warnings | | statements_with_full_table_scans | | statements_with_runtimes_in_95th_percentile | | statements_with_sorting | | statements_with_temp_tables | ...
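
For example, a quick “top statements by total latency across all tenants” report from one of those views could look like this (column names may differ slightly between sys schema versions):

-- statement_analysis is already ordered by total latency, so the top offenders come first
SELECT query, db, exec_count, total_latency
FROM sys.statement_analysis
LIMIT 5;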

If you are using MySQL 5.6, performance_schema will give you additional valuable insight.

The post Using MySQL 5.6 Performance Schema in multi-tenant environments appeared first on MySQL Performance Blog.

Why did we develop percona-agent in Go?

June 4, 2014 - 8:12am

We recently open-sourced our percona-agent and if you check out the source code, you’ll find that it is written in the Go programming language (aka Golang). For those not up to speed, the percona-agent is a real-time client-side agent for Percona Cloud Tools.

Our requirements for our agents are quite demanding. This is software that runs on real production servers, so it must be fast, reliable, lightweight and easy to distribute. Surprisingly enough, binaries compiled by Go fit these characteristics.

There are of course alternatives that we considered. On the scripting side: Perl, Python, PHP, Ruby et al. These are not necessarily fast, and distribution is also a challenge. We have enough experience with Percona Toolkit and Perl’s “module dependency hell.”

On the lower-level, high-performance side, there is C/C++, and I am sure we could produce an efficient agent. However, we also have experience with the distribution of Percona Server / Percona XtraDB Cluster / Percona XtraBackup. Well, I have had enough of the different versions of Ubuntus, Red Hats, CentOSes and the rest of the flavors of Linux.

And, yeah, there is Java, but let me just say that we are not the Java sort of developers.

So what is so charming about Go? Go feels like a scripting language, but produces executable binaries. I see it as having the ability to attack performance issues from two sides. First is the performance of software developers: they are more productive working with scripting-like languages. Second is the performance of the final product: native self-executable binaries are more efficient than a script running through an interpreter.

It is worth noting that the included batteries (the set of packages that come with Go) are excellent, and in many cases they will be just enough to get you going and produce software that is quite complex. And if that is not enough, there is also a list of packages and projects for Go.

Of course, there are some limitations you will find in Go (some of them are temporary I hope). These are:

1. The list of supported platforms is limited: FreeBSD (release 8 and above), Linux (ARM not supported), Mac OS X and Windows. There are no packages for Solaris yet.
2. A good MySQL driver is still a work in progress. The most advanced is Go-MySQL-Driver.
3. Go comes with built-in testing capabilities, but our testing enthusiast, Daniel, says they are not enough to build a complicated testing suite.
4. There is no support for “generics” (or “templates” if you are in the C++ world). Basically it means that if you developed a data structure that works with integers, you will need to copy-paste-replace to make it work with floats. Yes, there are workarounds like using a “parent-to-everything” type “interface{}”, but often it is not efficient and just looks ugly.

There is also no automatic type conversion between ints and floats, so if you need to do complex math that involves ints and floats, you may end up with a lot of back-and-forth conversions, i.e. int(math.Floor(t.epsilon*float64(t.count*2)))

To finish, I would like to invite you to my webinar, “Monitoring All (Yes, All!) MySQL Metrics with Percona Cloud Tools” on Wednesday, June 25 at 10 a.m. Pacific Daylight Time, where I will talk on the new features in Percona Cloud Tools, including our new percona-agent.

The post Why did we develop percona-agent in Go? appeared first on MySQL Performance Blog.

Percona Toolkit 2.2.8 is now available

June 4, 2014 - 5:53am

Percona is glad to announce the release of Percona Toolkit 2.2.8 on June 4th, 2014 (Downloads are available here and from the Percona Software Repositories). This release is the current GA (Generally Available) stable release in the 2.2 series.

New Features:

Bugs Fixed:

All Percona software is free and open-source. Details of the release can be found in the release notes and the 2.2.8 milestone at Launchpad. Bugs can be reported on the Percona Toolkit launchpad bug tracker.

The post Percona Toolkit 2.2.8 is now available appeared first on MySQL Performance Blog.
