Latest MySQL Performance Blog posts

Percona's MySQL & InnoDB performance and scalability blog

Using Apache Hadoop and Impala together with MySQL for data analysis

April 21, 2014 - 6:43am

Apache Hadoop is commonly used for data analysis. It is fast for data loads and it scales well. In a previous post I showed how to integrate MySQL with Hadoop. In this post I will show how to export a table from MySQL to Hadoop, load the data into Cloudera Impala (columnar format) and run reporting queries on top of it. For the examples below I will use the “ontime flight performance” data from my previous post (Increasing MySQL performance with parallel query execution). I used Cloudera Manager v.4 to install Apache Hadoop and Impala. For this test I intentionally used old hardware (servers from 2006) to show that Hadoop can utilize old hardware and still scale. The test cluster consists of 6 datanodes. Below are the specs:

  • Namenode, Hive metastore, etc. + Datanodes: 2x PowerEdge 2950, 2x L5335 CPU @ 2.00GHz, 8 cores, 16G RAM, RAID 10 with 8 SAS drives
  • Datanodes only: 4x PowerEdge SC1425, 2x Xeon CPU @ 3.00GHz, 2 cores, 8G RAM, single 4TB drive

As you can see those are pretty old servers; the only thing I changed was adding a 4TB drive to each to be able to store more data. Hadoop provides redundancy at the server level (it writes 3 copies of the same block across the datanodes), so we do not need RAID on the datanodes (we do still need redundancy for the namenode, though).

Data export

There are a couple of ways to export data from MySQL to Hadoop. For the purpose of this test I have simply exported the ontime table into a text file with:

select * into outfile '/tmp/ontime.csv' FIELDS TERMINATED BY ',' from ontime;

(You can use “|” or any other symbol as a delimiter.) Alternatively, you can download the data directly from the www.transtats.bts.gov site using this simple script:

for y in {1988..2013}; do
  for i in {1..12}; do
    u="http://www.transtats.bts.gov/Download/On_Time_On_Time_Performance_${y}_${i}.zip"
    wget $u -o ontime.log
    unzip On_Time_On_Time_Performance_${y}_${i}.zip
  done
done

Load into Hadoop HDFS

The first thing we need to do is load the data into HDFS as a set of files. Hive or Impala will work with a directory into which you have imported your data, treating all files inside that directory as the table's data. In our case it is easiest to simply copy all our files into a directory inside HDFS:

$ hdfs dfs -mkdir /data/ontime/
$ hdfs dfs -copyFromLocal On_Time_On_Time_Performance_*.csv /data/ontime/

Create external table in Impala

Now that we have all the data files loaded, we can create an external table:

CREATE EXTERNAL TABLE ontime_csv (
  YearD int,
  Quarter tinyint,
  MonthD tinyint,
  DayofMonth tinyint,
  DayOfWeek tinyint,
  FlightDate string,
  UniqueCarrier string,
  AirlineID int,
  Carrier string,
  TailNum string,
  FlightNum string,
  OriginAirportID int,
  OriginAirportSeqID int,
  OriginCityMarketID int,
  Origin string,
  OriginCityName string,
  OriginState string,
  OriginStateFips string,
  OriginStateName string,
  OriginWac int,
  DestAirportID int,
  DestAirportSeqID int,
  DestCityMarketID int,
  Dest string,
  ...
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/ontime';

Note the “EXTERNAL” keyword and the LOCATION clause (LOCATION points to a directory inside HDFS, not a file). Impala will only create the metadata (it will not modify or move the data). We can query this table right away; however, Impala will need to scan all files (a full scan) for every query.


[d30.local:21000] > select yeard, count(*) from ontime_csv group by yeard;
Query: select yeard, count(*) from ontime_csv group by yeard
+-------+----------+
| yeard | count(*) |
+-------+----------+
| 2010  | 6450117  |
| 2013  | 5349447  |
| 2009  | 6450285  |
| 2002  | 5271359  |
| 2004  | 7129270  |
| 1997  | 5411843  |
| 2012  | 6096762  |
| 2005  | 7140596  |
| 1999  | 5527884  |
| 2007  | 7455458  |
| 1994  | 5180048  |
| 2008  | 7009726  |
| 1988  | 5202096  |
| 2003  | 6488540  |
| 1996  | 5351983  |
| 1989  | 5041200  |
| 2011  | 6085281  |
| 1998  | 5384721  |
| 1991  | 5076925  |
| 2006  | 7141922  |
| 1993  | 5070501  |
| 2001  | 5967780  |
| 1995  | 5327435  |
| 1990  | 5270893  |
| 1992  | 5092157  |
| 2000  | 5683047  |
+-------+----------+
Returned 26 row(s) in 131.38s

(Note that “group by” will not sort the rows, unlike MySQL. To sort we will need to add “ORDER BY yeard”)

Explain plan:

Query: explain select yeard, count(*) from ontime_csv group by yeard
+-----------------------------------------------------------+
| Explain String                                            |
+-----------------------------------------------------------+
| PLAN FRAGMENT 0                                           |
|   PARTITION: UNPARTITIONED                                |
|                                                           |
|   4:EXCHANGE                                              |
|                                                           |
| PLAN FRAGMENT 1                                           |
|   PARTITION: HASH_PARTITIONED: yeard                      |
|                                                           |
|   STREAM DATA SINK                                        |
|     EXCHANGE ID: 4                                        |
|     UNPARTITIONED                                         |
|                                                           |
|   3:AGGREGATE (merge finalize)                            |
|   |  output: SUM(COUNT(*))                                |
|   |  group by: yeard                                      |
|   |                                                       |
|   2:EXCHANGE                                              |
|                                                           |
| PLAN FRAGMENT 2                                           |
|   PARTITION: RANDOM                                       |
|                                                           |
|   STREAM DATA SINK                                        |
|     EXCHANGE ID: 2                                        |
|     HASH_PARTITIONED: yeard                               |
|                                                           |
|   1:AGGREGATE                                             |
|   |  output: COUNT(*)                                     |
|   |  group by: yeard                                      |
|   |                                                       |
|   0:SCAN HDFS                                             |
|     table=ontime.ontime_csv #partitions=1/1 size=45.68GB  |
+-----------------------------------------------------------+
Returned 31 row(s) in 0.13s

As we can see, it will scan 45.68GB of data.

Impala with columnar format and compression

A great benefit of Impala is that it supports columnar formats and compression. I tried the new “parquet” format with the “snappy” compression codec. As our table is very wide (and denormalized), a columnar format helps a lot. To take advantage of the “parquet” format we need to load the data into it, which is easy to do when we already have a table in Impala and the files in HDFS:

[d30.local:21000] > set PARQUET_COMPRESSION_CODEC=snappy;
[d30.local:21000] > create table ontime_parquet_snappy LIKE ontime_csv STORED AS PARQUET;
[d30.local:21000] > insert into ontime_parquet_snappy select * from ontime_csv;
Query: insert into ontime_parquet_snappy select * from ontime_csv
Inserted 152657276 rows in 729.76s

Then we can test our query against the new table:

Query: explain select yeard, count(*) from ontime_parquet_snappy group by yeard
+---------------------------------------------------------------------+
| Explain String                                                      |
+---------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                     |
|   PARTITION: UNPARTITIONED                                          |
|                                                                     |
|   4:EXCHANGE                                                        |
|                                                                     |
| PLAN FRAGMENT 1                                                     |
|   PARTITION: HASH_PARTITIONED: yeard                                |
|                                                                     |
|   STREAM DATA SINK                                                  |
|     EXCHANGE ID: 4                                                  |
|     UNPARTITIONED                                                   |
|                                                                     |
|   3:AGGREGATE (merge finalize)                                      |
|   |  output: SUM(COUNT(*))                                          |
|   |  group by: yeard                                                |
|   |                                                                 |
|   2:EXCHANGE                                                        |
|                                                                     |
| PLAN FRAGMENT 2                                                     |
|   PARTITION: RANDOM                                                 |
|                                                                     |
|   STREAM DATA SINK                                                  |
|     EXCHANGE ID: 2                                                  |
|     HASH_PARTITIONED: yeard                                         |
|                                                                     |
|   1:AGGREGATE                                                       |
|   |  output: COUNT(*)                                               |
|   |  group by: yeard                                                |
|   |                                                                 |
|   0:SCAN HDFS                                                       |
|     table=ontime.ontime_parquet_snappy #partitions=1/1 size=3.95GB  |
+---------------------------------------------------------------------+
Returned 31 row(s) in 0.02s

As we can see, it will scan a much smaller amount of data: 3.95GB (with compression) compared to 45.68GB.


Query: select yeard, count(*) from ontime_parquet_snappy group by yeard
+-------+----------+
| yeard | count(*) |
+-------+----------+
| 2010  | 6450117  |
| 2013  | 5349447  |
| 2009  | 6450285  |
...
Returned 26 row(s) in 4.17s

And the response time is much better as well.
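As a quick sanity check on those numbers, the improvement can be quantified with a bit of arithmetic (the figures below come straight from the EXPLAIN outputs and query timings above):

```python
# Back-of-the-envelope comparison of the CSV vs. Parquet+Snappy runs:
# scan sizes from the two EXPLAIN outputs, timings from the two queries.
csv_scan_gb, parquet_scan_gb = 45.68, 3.95
csv_time_s, parquet_time_s = 131.38, 4.17

size_reduction = csv_scan_gb / parquet_scan_gb   # ~11.6x less data scanned
speedup = csv_time_s / parquet_time_s            # ~31.5x faster

print(f"scanned {size_reduction:.1f}x less data, ran {speedup:.1f}x faster")
```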

Impala complex query example

I used the complex query from my previous post, but had to adapt it for Impala: it does not support the “sum(ArrDelayMinutes>30)” notation, but “sum(if(ArrDelayMinutes>30, 1, 0))” works fine.
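For illustration, the same conditional-count idiom looks like this in Python (the sample delay values here are made up, not from the dataset):

```python
# Impala's sum(if(cond, 1, 0)) is a conditional count; the same idiom in
# Python, using a handful of hypothetical ArrDelayMinutes values.
arr_delay_minutes = [0, 45, 12, 90, 31, 5]

flights_delayed = sum(1 if d > 30 else 0 for d in arr_delay_minutes)
rate = round(flights_delayed / len(arr_delay_minutes), 2)
print(flights_delayed, rate)  # 3 0.5
```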

select min(yeard), max(yeard), Carrier, count(*) as cnt,
       sum(if(ArrDelayMinutes>30, 1, 0)) as flights_delayed,
       round(sum(if(ArrDelayMinutes>30, 1, 0))/count(*),2) as rate
FROM ontime_parquet_snappy
WHERE DayOfWeek not in (6,7)
  and OriginState not in ('AK', 'HI', 'PR', 'VI')
  and DestState not in ('AK', 'HI', 'PR', 'VI')
  and flightdate < '2010-01-01'
GROUP by carrier
HAVING cnt > 100000 and max(yeard) > 1990
ORDER by rate DESC
LIMIT 1000;

The query is intentionally designed so that it cannot take advantage of indexes: most of the conditions each filter out less than 30% of the data.

Impala results:

+------------+------------+---------+----------+-----------------+------+
| min(yeard) | max(yeard) | carrier | cnt      | flights_delayed | rate |
+------------+------------+---------+----------+-----------------+------+
| 2003       | 2009       | EV      | 1454777  | 237698          | 0.16 |
| 2003       | 2009       | FL      | 1082489  | 158748          | 0.15 |
| 2006       | 2009       | XE      | 1016010  | 152431          | 0.15 |
| 2003       | 2009       | B6      | 683874   | 103677          | 0.15 |
| 2006       | 2009       | YV      | 740608   | 110389          | 0.15 |
| 2003       | 2005       | DH      | 501056   | 69833           | 0.14 |
| 2001       | 2009       | MQ      | 3238137  | 448037          | 0.14 |
| 2004       | 2009       | OH      | 1195868  | 160071          | 0.13 |
| 2003       | 2006       | RU      | 1007248  | 126733          | 0.13 |
| 2003       | 2006       | TZ      | 136735   | 16496           | 0.12 |
| 1988       | 2009       | UA      | 9593284  | 1197053         | 0.12 |
| 1988       | 2009       | AA      | 10600509 | 1185343         | 0.11 |
| 1988       | 2001       | TW      | 2659963  | 280741          | 0.11 |
| 1988       | 2009       | CO      | 6029149  | 673863          | 0.11 |
| 2007       | 2009       | 9E      | 577244   | 59440           | 0.10 |
| 1988       | 2009       | US      | 10276941 | 991016          | 0.10 |
| 2003       | 2009       | OO      | 2654259  | 257069          | 0.10 |
| 1988       | 2009       | NW      | 7601727  | 725460          | 0.10 |
| 1988       | 2009       | DL      | 11869471 | 1156267         | 0.10 |
| 1988       | 2009       | AS      | 1506003  | 146920          | 0.10 |
| 1988       | 2005       | HP      | 2607603  | 235675          | 0.09 |
| 2005       | 2009       | F9      | 307569   | 28679           | 0.09 |
| 1988       | 1991       | PA      | 206841   | 19465           | 0.09 |
| 1988       | 2009       | WN      | 12722174 | 1107840         | 0.09 |
+------------+------------+---------+----------+-----------------+------+
Returned 24 row(s) in 15.28s

15.28 seconds is significantly faster than the original MySQL results (15 min 56.40 sec without parallel execution and 5 min 47 sec with parallel execution). However, this is not an apples-to-apples comparison:

  • MySQL will scan 45GB of data while Impala with Parquet will only scan 3.95GB
  • MySQL will run on a single server, Hadoop + Impala will run in parallel on 6 servers.

Nevertheless, Hadoop + Impala shows impressive performance and the ability to scale out of the box, which can help a lot with analysis of large data volumes.
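Putting rough numbers on that comparison (timings quoted above; keep the caveats in mind, since Impala scanned far less data across 6 nodes):

```python
# Rough speedup figures from the timings quoted above. Not apples-to-apples:
# Impala scanned ~3.95GB across 6 nodes, MySQL scanned 45GB on one server.
mysql_single_s = 15 * 60 + 56.40   # MySQL without parallel execution
mysql_parallel_s = 5 * 60 + 47     # MySQL with parallel query execution
impala_s = 15.28

print(f"vs single-threaded MySQL: {mysql_single_s / impala_s:.0f}x faster")
print(f"vs parallel MySQL: {mysql_parallel_s / impala_s:.0f}x faster")
```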


Hadoop + Impala will give us an easy way to analyze large datasets using SQL with the ability to scale even on the old hardware.

In my next posts I plan to explore more of this setup.

As always, please share your thoughts in the comments.

The post Using Apache Hadoop and Impala together with MySQL for data analysis appeared first on MySQL Performance Blog.

Percona Software in Ubuntu 14.04 LTS (Trusty Tahr) release

April 18, 2014 - 6:22am

I’d like to congratulate Canonical on the new Ubuntu 14.04 LTS (Trusty Tahr) release. It really looks like a great release, and I say that having my own agenda: it looks even better because it comes with a full line of Percona software.
If you install Ubuntu 14.04 and run aptitude search you will find:

Percona Toolkit and Percona XtraBackup are at their latest versions, but Percona Server and Percona XtraDB Cluster come in their 5.5 versions, in line with the default MySQL version, which again is 5.5.

I expect this release will make it much easier for users to get familiar with our software, so you can go and try this today!


How to find bugs in MySQL

April 16, 2014 - 10:00pm

Finding bugs in MySQL is not only fun, it’s also something I have been doing for the last four years of my life.

Whether you want to become the next Shane Bester (who is generally considered the most skilled MySQL bug hunter worldwide), or just want to prove you can outsmart some of the world’s best programmers, finding bugs in MySQL is a skill no longer reserved for top QA engineers armed with loads of scripts, expensive flash storage and top-range server hardware. Of course, for professionals that’s still the way to go, but now anyone with an average laptop and a standard HDD can have a lot of fun trying to find that elusive crash…

If you follow this post carefully, you may well be able to find a nice crashing bug (or two) running RQG, an excellent database QA tool. Linux is the preferred testing OS, but if you are using Windows as your main OS, I would recommend getting VirtualBox and running a Linux guest in a suitably sized (i.e. large) VM. As for the acronym “RQG”: it stands for “Random Query Generator,” also named “randgen.”

If you’re not just after finding any bug out there (“bug hunting”), you can tune the RQG grammars (files that define what sort of SQL RQG executes) to more or less match your “issue area.” For example, if you are always running into a situation where the server crashes on a DELETE query (as seen at the end of the mysqld error log for example), you would want an SQL grammar that definitely has a variety of DELETE queries in it. These queries should be closely matched with the actual crashing query – crashes usually happen due to exactly the same, or similar statements with the same clauses, conditions etc.

But, taking a step back: to get started with RQG, you can either use the setup_server.sh script in percona-qa (more on how to obtain percona-qa from Launchpad using bazaar below), or you can use yum to manually install the required packages on your Linux machine:

$ sudo yum install kernel-devel wget patch make cmake automake autoconf libtool bzr gtest zlib-static \
gcc gcc-c++ ncurses-devel libaio libaio-devel bison valgrind perl-DBD-mysql cpan zlib-devel \
bzip2 valgrind-devel svn pam-devel openssl openssl-devel screen strace sysbench

(Note that I have included a few extra items needed for Percona Server & a few items that are handy like screen and sysbench for example. Note also that you can change yum to apt-get, though this may require a few package name changes – search the web for info if you want to use apt-get.)

After these modules are installed, you can:

1. Pull the tree from Launchpad:
$ bzr branch lp:randgen

2. Make sure to have the many (Perl modules etc.) dependencies installed. Follow:

3. Do a quick test to see if RQG is working fine:

If all that worked, then congratulations, you now have simple RQG runs working. Let’s now look into how RQG is structured.

The way you can think about RQG “execution” in a hierarchical tree format is like this: combinations.pl starts runall.pl, which starts gentest.pl, which may start gendata.pl. You don’t have to use combinations.pl (or even runall.pl, as you learned by following the ‘running your first test’ example above; i.e. mysqld can be started manually and gentest.pl can then be used to test the already-started server), but runall.pl and especially combinations.pl add much more power to our testing, as we will see soon.

In terms of the various Perl scripts (.pl) listed in the tree, it is:

- combinations.pl generates the many different ways (read: ‘trials’) in which RQG is started (using runall.pl) with various options to mysqld etc.
- runall.pl starts/stops mysqld, does various high-level checking etc. (in other words, ‘a basic RQG run’)
- gentest.pl is the executor component (in other words, ‘the actual basic RQG test’)
- gendata.pl sets up the data (tables + data)

If you know the performance testing tool SysBench, you may compare gentest.pl with an actual SysBench run and gendata.pl with a “sysbench prepare.”

To get into real bug-hunting territory, we will use combinations.pl to do an extensive random testing run against a server. You never know what you may find. Small warning: before you log your newly discovered bug, make sure it is not already logged on bugs.mysql.com (for MySQL Server bugs) or on bugs.launchpad.net/percona-server (for Percona Server bugs).

In this example, we will be testing Percona Server, as the combinations.pl (.cc) grammar we use is optimized for Percona Server. If you would like to test the MySQL server, you can build your own grammars, or use one of the many grammars available in RQG (though there are not many combinations.pl grammars in RQG yet; there are plenty of less-powerful runall.pl grammars, however). For a MySQL-compatible combinations.pl grammar which tests the optimizer, see randgen/conf/optimizer/starfish.cc, a grammar which I developed whilst working for Oracle.

Another very extensive grammar set, usable with Percona Server 5.6, can be found in randgen/conf/percona_qa/5.6/* (we call this our ‘base grammar’ as it tests many features developed for Percona Server). To get started, edit and then use 5.6.sh, the startup script (in this file, set WORKDIR and RQG_DIR), and 5.6.cc, the combinations file (in this file, change the path names for the optimized/debug and Valgrind-compiled servers to match your system). More on this below.

An earlier and more limited version of this base grammar can be found in your randgen tree; go to randgen/conf/percona_qa/ and review the files there. This more limited base grammar can be used for testing any version of Percona Server, or you can follow along and use the 5.6 grammar mentioned above and test Percona Server 5.6 – the same basic steps (detailed below) apply.

In the percona_qa directory, the percona_qa.sh script is the start script (like 5.6.sh), percona_qa.yy file contains the SQL (like 5.6.yy etc.), the .zz file contains the data definitions, and finally the .cc file is a combinations.pl setup which “combines” various options from various blocks. Combinations.pl has great bug-hunting power.

Side note: you can read more about how the option blocks work at:

All you need to do to get an exhaustive test run started, is edit some options (assuming you have Percona Server installed on your test machine already) and start the percona_qa.sh script:

1. Edit the “percona_qa.sh” script and set the WORKDIR and RQG_DIR variables (regarding RQG_DIR, the script will normally assume that randgen is stored under WORKDIR, but you can change RQG_DIR to point to your randgen download path instead, for example RQG_DIR=/randgen).

2. Edit the “percona_qa.cc” script and point it to the location of your server in the --basedir= setting (i.e. replace “/Percona-Server” with “/path_to_your_Percona_Server_installation”).

For the moment, you can just use a standard Percona Server installation and remove the Valgrind line directly under the one we just edited (use “dd” in vim), but once you get a bit more professional, compiling from source (“building”) is the way to go.

The reason for building yourself is that a debug-compiled server (i.e. execute ./build/build-binary.sh --debug in a Percona Server source download) or a Valgrind-instrumented compiled server (i.e. execute ./build/build-binary.sh --debug --valgrind in a Percona Server source download) will find more bugs (the debug server contains more developer debug asserts etc.).

Note you can use the “build_percona.sh” in the percona-qa Launchpad project (more on this below) to quickly build an optimized, debug and Valgrind server from source. build_mysql.sh does the same for MySQL server.

3. Now you’re ready to go; execute ./percona_qa.sh and watch the screen carefully. You’ll likely immediately see some STATUS_ENVIRONMENT_FAILURE runs. This is quite common and means you have made a little error somewhere along the way. Stop the run (ctrl+z, then kill -9 all relevant pids, then execute “fg”). Now edit the files as needed (check all the logs, starting with the failed trials’ ‘trial<no>.log’, etc.). Then start the run again. If your machine is used for testing only (i.e. no production programs running), you can use the following quick command to kill all relevant running mysqld, perl and Valgrind processes:

ps -ef | grep `whoami` | egrep "mysql|perl|valgrind" | grep -v "grep" | awk '{print $2}' | xargs sudo kill -9;

4. Once your run is going, leave it going for a few hours, or a few days (we regularly test with runs that go for 2-5 days or more), and then start checking logs (trial<nr>.log is the one you want to study first. Use “:$” in vim to jump to the end of the file, or “:1” to jump back to the first line).

5. Once you get a bit more professional, use the percona-qa scripts (bzr branch lp:percona-qa) to quickly handle trials of interest. You may initially want to check out rqg_results.sh, analyze_crash.sh, startup.sh, build_percona.sh, delete_single_trial.sh and keep_single_trial.sh; simply execute them without parameters to get an initial idea of how to use them. These scripts greatly reduce the effort required when analyzing multiple trials.

6. Finally, for those of you needing to reduce long SQL testcases (from any source) quickly, see reducer.sh in randgen/util/reducer/reducer.sh; it’s a multi-threaded high-performance simplification script I developed whilst working at Oracle. They kindly open sourced it some time ago. You may then also want to check out parse_general_log.pl in the percona-qa scripts listed in point 5 above. This script parses a general log created by mysqld (the --general_log option) into a ready-to-use SQL trace.

If you happen to find a bug, share the joy! If you happen to run into issues, post your questions below so others who run into the same can find answers quickly. Also feel free to share any tips you find while playing around with this.



‘Open Source Appreciation Day’ draws OpenStack, MySQL and CentOS faithful

April 16, 2014 - 3:00am

210 people registered for the inaugural “Open Source Appreciation Day” March 31 in Santa Clara, Calif. The event will be held each year at Percona Live henceforth.

To kick off the Percona Live MySQL Conference & Expo 2014, Percona held the first “Open Source Appreciation Day” on Monday, March 31st. Over 210 people registered and the day’s two free events focused on CentOS and OpenStack.

The OpenStack Today event brought together members of the OpenStack community and MySQL experts in an afternoon of talks and sharing of best practices for both technologies. After a brief welcome message from Peter Zaitsev, co-founder and CEO of Percona, Florian Haas shared an introduction to OpenStack including its history and the basics of how it works.

Jay Pipes delivered lessons from the field based on his years of OpenStack experience at AT&T, at Mirantis, and as a frequent code contributor to the project. Jay Janssen, a Percona managing consultant, complemented Jay Pipes’ talk with a MySQL expert’s perspective of OpenStack. He also shared ways to achieve High Availability using the latest version of Galera (Galera 3) and other new features found in the open source Percona XtraDB Cluster 5.6.

Amrith Kumar’s presentation focused on the latest happenings in project Trove, OpenStack’s evolving DBaaS component, and Tesora’s growing involvement. Amrith also won quote of the day for his response to a question about the difference between “elastic” and “scalable.” Amrith: “The waistband on my trousers is elastic. It is not scalable.” Sandro Mazziotta wrapped up the event by sharing the challenges and opportunities of OpenStack from both an integrator as well as operator point of view based on the customer experiences of eNovance.

OpenStack Today was made possible with the support of our sponsors, Tesora and hastexo. Here are links to presentations from the OpenStack Today event. Any missing presentations will soon be added to the OpenStack Today event page.

Speakers in the CentOS Dojo Santa Clara event shared information about the current status of CentOS, the exciting road ahead, and best practices in key areas such as system administration, running MySQL, and administration tools. Here’s a rundown of topics and presentations from the event. Any missing presentations will soon be added to the CentOS Dojo Santa Clara event page.

  • Welcome and Housekeeping
    Karsten Wade, CentOS Engineering Manager, Red Hat
  • The New CentOS Project
    Karsten Wade, CentOS Engineering Manager, Red Hat
  • Systems Automation and Metrics at Pinterest
    Jeremy Carroll, Operations Engineer, Pinterest
  • Software Collections on CentOS
    Joe Brockmeier, Open Source & Standards, Red Hat
  • Two Years Living Your Future
    Joe Miller, Lead Systems Engineer, Pantheon
  • Running MySQL on CentOS Linux
    Peter Zaitsev, CEO and Co-Founder, Percona
  • Notes on MariaDB 10
    Michael Widenius, Founder and CTO, MariaDB Foundation
  • Happy Tools
    Jordan Sissel, Systems Engineer, DreamHost

Thank you to all of the presenters at the Open Source Appreciation Day events and to all of the attendees for joining.

I hope to see you all again this November 3-4 at Percona Live London. The Percona Live MySQL Conference and Expo 2015 will also return to the Hyatt Santa Clara and Santa Clara Convention Center from April 13-16, 2015 – watch for more details in the coming months!


percona-millipede – Sub-second replication monitor

April 15, 2014 - 6:00am

I recently helped a client implement a custom replication delay monitor and wanted to share the experience and discuss some of the iterations and decisions that were made. percona-millipede was developed in conjunction with Vimeo with the following high-level goal in mind: implement a millisecond level replication delay monitor and graph the results.  Please visit http://making.vimeo.com for more information and thanks to Vimeo for sharing this tool!

Here is the rough list of iterations we worked through in developing this tool/process:

  1. Standard pt-heartbeat update/monitor
  2. Asynchronous, threaded update/monitor tool
  3. Synchronized (via zeroMQ), threaded version of the tool

Initially, we had been running pt-heartbeat (with the default interval of 1.0) to monitor real replication delay. This was fine for general alerting, but didn’t allow us to gain deeper insight into the running slaves. Even when pt-heartbeat says it is “0 seconds behind,” that can actually mean the slave is up to .99 seconds behind (which in SQL time can be a lot). Given the sensitivity to replication delay and the high-end infrastructure in place (Percona Server 5.6, 384G RAM, Virident and FusionIO PCI-Flash cards), we decided it was important to absorb the extra traffic in an effort to gain further insight into the precise points of any slave lag.
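To illustrate why a 1.0-second update interval hides sub-second lag, here is a simplified model (hypothetical code, not pt-heartbeat’s actual logic): the heartbeat row only changes once per interval, so the monitor can only resolve lag in multiples of that interval.

```python
# Hypothetical model (not pt-heartbeat's actual code): with an update
# interval of 1.0s, any lag below the interval is reported as 0.
def observed_delay(real_lag_s, interval_s=1.0):
    # The monitor can only resolve lag in multiples of the update interval.
    return (real_lag_s // interval_s) * interval_s

print(observed_delay(0.99))  # 0.0 -> "0 seconds behind", yet ~1s of real lag
print(observed_delay(1.5))   # 1.0
```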

There had been discussion about tweaking the use of pt-heartbeat (--interval=0.01 with reduced skew) to look at sub-second delay, but there were some other considerations:

  • The tool needed to update/monitor multiple hosts from a single process (i.e. we didn’t want multiple pt-heartbeat processes to track)
  • Native integration with existing statsd/graphite system

We likely could’ve achieved the above using pt-heartbeat, but it would’ve required a custom wrapper application to handle the other pieces (threads/subprocesses/inter-process communication).  As the main gist of pt-heartbeat is fairly straightforward (select ts from heartbeat table, calculate delay), it was decided to mimic that logic in a custom application that was configurable and met the other goals.

First Iteration – Async update/monitor

The first iteration was to spin up several threads within a main process (I chose Python for this, but it could be anything really) and set each in a loop with a fixed delay (.01 seconds, for example). One thread sends update statements with a timestamp, and the other threads simply select that timestamp and calculate how long ago the row was updated (i.e. current time minus row timestamp).

This iteration was better (we could see replication delay of 1-999 ms and isolate some sub-optimal processes), but there was still some concern. When testing the solution, we noticed that when pointing the monitor at a single box (for both the update and the select), we were still seeing replication delay. After some discussion, it became apparent that using the monitor’s current clock time as the baseline meant the time taken to issue the select and compute the result was being counted as part of the delay. Further, since these threads weren’t synchronized, there was no way to determine how long after the update statement the select was even run.

Final Iteration – ZeroMQ update/monitor

Based on this analysis, we opted to tweak the process and use a broadcast model to keep the monitor threads in sync.  For this, I chose to use ZeroMQ for the following reasons:

  • Built in PUB/SUB model with filtering – allows for grouping masters with slaves
  • Flexibility in terms of synchronization (across threads, processes, or servers – just a config tweak to the sockets)

After the update, here was the final architecture:

In this model, the update thread publishes the timestamp that was set on the master and each monitor thread simply waits as a consumer and then checks the timestamp on the slave vs the published timestamp.  Synchronization is built in using this model and we saw much more accurate results.  As opposed to sitting at 5-10ms all the time with spikes up to 50ms, we found 0ms in most cases (keep in mind that means 0ms from an application standpoint with network latency, time to process the query, etc) with spikes up to 40ms.  Here is a sample graph (from Graph Explorer on top of statsd, courtesy of Vimeo) showing some micro spikes in delay that pointed us to a process that was now noticeable and able to be optimized:

While this process was what the client needed, there are some things that I’d like to point out about this approach that may not apply (or even hinder) other production workloads:

  • The frequency of update/select statements added several hundred queries per second
    • You could configure less frequent update/selects, but then you may see less accuracy
    • The longer the delay between updates, the less chance you will catch any delay
  • For replication delay monitoring (i.e. Nagios), 1 second granularity is plenty
    • Typically, you would only alert after several seconds of delay were noticed

Naturally, there are some other factors that can impact the delay/accuracy of this system (pub/sub time, time to issue select, etc), but for the purpose of isolating some sub-optimal processes at the millisecond level, this approach was extremely helpful.
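The synchronized measurement described above can be sketched as follows (the ZeroMQ sockets and MySQL reads are replaced with plain functions, and all names here are illustrative, not taken from percona-millipede):

```python
# Illustrative sketch of the synchronized PUB/SUB model described above.
import time

def publish_update():
    ts = time.time()   # timestamp written to the master's heartbeat row...
    return ts          # ...and broadcast to every monitor thread via PUB/SUB

def slave_delay(published_ts, slave_row_ts):
    # Each monitor compares the row it reads on the slave with the *published*
    # timestamp, so its own clock and processing time no longer inflate the
    # measured delay.
    return max(0.0, published_ts - slave_row_ts)

print(slave_delay(100.0, 100.0))  # 0.0 (slave already applied the update)
print(slave_delay(100.5, 100.0))  # 0.5 (slave row is 0.5s behind)
```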

Stay tuned for a followup post where I’ll share the tool and go over its installation, configuration, and other details!


Advisory on Heartbleed (CVE-2014-0160) for Percona’s customers and users

April 14, 2014 - 8:03am

Over the last few days, the Percona team has spent a lot of time evaluating the impact of the Heartbleed bug (CVE-2014-0160) for our customers and for the users of our software. We published a formal disclosure a few days ago. However, I thought a quick summary and some additional information would be good to provide for our MySQL Performance Blog readers.

First, I want to point out that “Heartbleed” is an issue in a commonly used third-party library which typically comes with your operating system, so there is a lot of software which is impacted. An openly exposed service, which is typically a website or some form of API, can potentially cause the biggest impact for anyone. Even though we talk a lot about MySQL Server (and its variants), it will not be the primary concern for organizations following best practices and not exposing their MySQL server to the open Internet.

Second, if you take care of patching your operating system, this will take care of Percona Server, MariaDB or MySQL Server (see note below), as well as other software which uses the OpenSSL library, as long as it is linked dynamically. It is highly recommended to link OpenSSL dynamically precisely so that such security issues can be taken care of with a single library update, instead of waiting for separate security updates for multiple software packages. Note that updating the library is not enough – you need to restart the service for the new library to be loaded. In most cases, I recommend a full system restart as the simplest way to guarantee that all processes using the library have been restarted.
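As an illustration of that last point, here is a small helper of my own (not from the advisory) that looks for Linux processes still mapping a deleted libssl/libcrypto after a library update; those are exactly the processes that still need a restart:

```python
# Sketch: after the OpenSSL package is updated, a process that still maps the
# *old* (now deleted) library keeps running the vulnerable code until it is
# restarted. /proc/<pid>/maps marks such mappings with "(deleted)".
import glob

def deleted_ssl_lines(maps_text):
    """Return map lines that reference a deleted libssl/libcrypto."""
    return [line for line in maps_text.splitlines()
            if ('libssl' in line or 'libcrypto' in line)
            and '(deleted)' in line]

def pids_needing_restart():
    pids = []
    for path in glob.glob('/proc/[0-9]*/maps'):
        try:
            with open(path) as f:
                if deleted_ssl_lines(f.read()):
                    pids.append(path.split('/')[2])
        except OSError:
            pass  # process exited, or no permission to read its maps
    return pids
```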

Third, it is worth noting that not all MySQL variants have been impacted, and not in all cases. Obviously, your MySQL Server is not impacted if you’re running an operating system which is not vulnerable. You will also not be vulnerable if the MySQL Server or variant you’re using uses yaSSL instead of OpenSSL. In addition, in many cases SSL support is disabled on the server side by default, which might not be the best thing from a security standpoint but can save us from this bug. Finally, in many configurations the SSL/TLS connection setup takes place after the initial handshake, which means the vulnerability is not exploitable in all cases. I do not have hard numbers, but I would guess no more than 10-20% of MySQL (and variants) installations would be vulnerable, even before you consider whether they are exposed to the Internet.

To find out whether MySQL is dynamically linked with OpenSSL or compiled with yaSSL, you can use this command:

[root@localhost vagrant]# ldd /usr/sbin/mysqld | grep ssl
        libssl.so.10 => /usr/lib64/libssl.so.10 (0x00007fb7f4cbc000)

It will show “libssl” for a server linked dynamically with OpenSSL, and no matches for a server compiled with yaSSL.
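If you want to script this check across several binaries, the grep can be wrapped in a small helper. This is only a sketch: the helper reads `ldd` output from stdin (which also makes it easy to test), the binary path in the usage comment is an example, and the classification only distinguishes a dynamic libssl link from everything else:

```shell
#!/bin/sh
# Classify a binary by its dynamic library list, reading `ldd` output
# from stdin. Prints whether the binary links OpenSSL dynamically.
classify_ssl() {
    if grep -q 'libssl'; then
        echo "dynamic OpenSSL"
    else
        echo "yaSSL or statically linked"
    fi
}

# Example usage (path is an illustration; adjust for your system):
#   ldd /usr/sbin/mysqld | classify_ssl
```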

It is worth noting, as Lenz Grimmer pointed out in a blog post comment, that Heartbleed impacts not only vulnerable servers: vulnerable clients can be at risk as well if they connect to a compromised server which implements code specifically targeting clients. This means you want to make sure to update your client machines too, especially if you’re connecting to a non-trusted MySQL Server.

But enough on Percona Software. There is an impact to Percona web systems as well. The majority of our systems have not been impacted directly because they were running an OpenSSL version which did not have the Heartbleed vulnerability. However, because of how our systems are integrated, there was a small probability that some customer accounts could be exposed through impacted services: https://rdba.percona.com and https://cloud.percona.com. We promptly patched these services last week, regenerated keys, and reset passwords for all accounts which had even a small chance of being impacted.

We believe our teams have acted quickly and appropriately to secure our systems and minimize the chance of information leaks. We will continue to monitor the situation closely and update you via our advisory document if there is any new information needing your attention.

The post Advisory on Heartbleed (CVE-2014-0160) for Percona’s customers and users appeared first on MySQL Performance Blog.

Percona Live MySQL Conference Highlights

April 10, 2014 - 4:59am

The Percona Live MySQL Conference and Expo 2014 was March 31st through April 4th in Santa Clara, California. I heard numerous positive comments from attendees and saw even more on social media. Our conference team, led by Kortney Runyan, pulled together a smooth, enjoyable event which made it easy for attendees to focus on learning and networking. Some of the Percona Live MySQL Conference highlights from this year follow.

Percona Live MySQL Conference Highlights

A few stats for the conference this year versus last year:

  • Total registrations were up nearly 15%
  • Attendees represented 40 countries, up from 36 in 2013
  • 34 companies sponsored the conference this year, up from 33 last year
  • This year’s conference covered 5 days including the Open Source Appreciation Day, up from 4 days last year

The conference could not grow without the support of attendees and sponsors. It’s great to see continued growth in attendees which is driven in large part by the word of mouth endorsements of those who attended in previous years.

Keynotes & SiliconANGLE Interviews Video Recordings

The Percona Live MySQL Conference 2014 featured some great keynote presentations. I would like to thank our great keynote speakers who covered a wide variety of topics including MySQL 5.7, integrating MySQL and Hadoop, SSD storage, OpenStack, and the future of running MySQL at scale. If you couldn’t attend this year, you can watch the keynote recordings by visiting the conference website. I was particularly interested in the presentation on MySQL 5.7 from Tomas Ulin, VP of MySQL Engineering for Oracle, which demonstrated the ongoing improvements and commitment Oracle has made to MySQL. I also found the talk by Boris Renski of Mirantis especially helpful in increasing my understanding of the OpenStack ecosystem and the organizations involved in that space. The talk by Robert Hodges of Continuent on integrating MySQL and Hadoop and the talk by Nisha Talagala from Fusion-io on advances in SSD storage are also definitely worth a look.

We were fortunate to have SiliconANGLE at the conference this year, recording video interviews with a wide range of exhibitors and attendees. Whether you attended the conference or not, I think you’ll find the interviews entertaining and enlightening. You can find the videos on the Percona Live conference website or on the SiliconANGLE website or YouTube channel.

Conference Committee

Thanks go first to the nearly 130 speakers who were chosen from over 300 submitted proposals. The Conference Committee, led by Shlomi Noach and representing a wide array of ecosystem participants, created a solid schedule of tutorials and breakout sessions covering a wide variety of MySQL-related topics at a variety of levels. Thanks go out to all of the Committee members:

  • Shlomi Noach, Engineer, Outbrain
  • Roland Bouman, Software Engineer, Pentaho
  • Tim Callaghan, VP of Engineering, Tokutek
  • Laine Campbell, CEO and Co-Founder, Blackbird
  • Jeremy Cole, Sr. Systems Engineer, Google
  • Todd Farmer, Director, Oracle
  • Jay Janssen, Consulting Lead, Percona
  • Giuseppe Maxia, QA Director, Continuent
  • Cedric Peintre, DBA, Dailymotion
  • Ivan Zoratti, CTO, SkySQL
  • Liz van Dijk, Head of Technical Account Management, Percona

Visit the Percona Live MySQL Conference 2014 website and click the “Session Slides” link to download the slides from the sessions. Check back periodically for new sets of slides added by the conference speakers.


Sponsors

We cannot put on this conference without the support of our sponsors. Special thanks go out to our Diamond Plus sponsors, Fusion-io and Continuent, for their support and their knowledgeable keynote speakers. The full list of sponsors is:

Diamond Plus: Continuent, Fusion-io
Platinum: Booking.com
Gold: Machine Zone, Micron, Pythian, SkySQL
Silver: AppDynamics, Box, InfiniDB, Diablo, Galera/Codership, Google, Rackspace, Tesora, Twitter, Yelp
Exhibitors: Attunity, Blackbird (formerly PalominoDB), Dropbox, Etsy, FoundationDB, HGST, Rocket Fuel, RSSBus, ScaleArc, ScaleBase, Severalnines, Sphinx, Tokutek, VividCortex
Other: Devart, Facebook, Webyog, MailChimp
Media: ADMIN Magazine, Datanami, DBTA, Linux Journal, Linux Pro Magazine, O’Reilly, Software Developer’s Journal
Open Source Appreciation Day: Tesora, hastexo, CentOS

Open Source Appreciation Day

Special thanks go to our partners who helped put together the OpenStack Today and CentOS Dojo Santa Clara events on Monday during the inaugural Open Source Appreciation Day. Attendees for these free events were treated to great talks from noteworthy speakers from the OpenStack, CentOS, and MySQL worlds on a wide variety of topics. Watch for links to the presentations in a future blog post.


One of the biggest announcements this year wasn’t made by a sponsor. The announcement of the WebScaleSQL project, which is being driven by team members from Facebook, Google, Twitter, and LinkedIn, got a lot of attention from attendees and from the media. My takeaway was that the project is both a strong vote of confidence in MySQL (versus other MySQL variants or alternatives) and a promise of continued improvement in MySQL for large scale web applications in the future.

Percona Live London 2014 and Percona Live MySQL Conference 2015 Dates

Our next event is Percona Live London which is November 2-3, 2014. The Percona Live MySQL Conference and Expo 2015 will be April 13-16, 2015 at the Hyatt Santa Clara and Santa Clara Convention Center – watch for more details in the coming months!

The post Percona Live MySQL Conference Highlights appeared first on MySQL Performance Blog.

Heartbleed: Separating FAQ From FUD

April 9, 2014 - 11:52am

If you’ve been following this blog (my colleague, David Busby, posted about it yesterday) or any tech news outlet in the past few days, you’ve probably seen some mention of the “Heartbleed” vulnerability in certain versions of the OpenSSL library.

So what is ‘Heartbleed’, really?

In short, Heartbleed is an information-leak issue. An attacker can exploit this bug to retrieve the contents of a server’s memory without any need for local access. According to the researchers that discovered it, this can be done without leaving any trace of compromise on the system. In other words, if you’re vulnerable, they can steal your keys and you won’t even notice that they’ve gone missing. I use the word “keys” literally here; by being able to access the contents of the impacted service’s memory, the attacker is able to retrieve, among other things, private encryption keys for SSL/TLS-based services, which means that the attacker would be able to decrypt the communications, impersonate other users (see here, for example, for a session hijacking attack based on this bug), and generally gain access to data which is otherwise believed to be secure. This is a big deal. It isn’t often that bugs have their own dedicated websites and domain names, but this one does: http://www.heartbleed.com

Why is it such a big deal?

One, because it has apparently been in existence since at least 2012. Two, because SSL encryption is widespread across the Internet. And three, because there’s no way to know if your keys have been compromised. The best detection that currently exists for this are some Snort rules, but if you’re not using Snort or some other IDS, then you’re basically in the dark.

What kinds of services are impacted?

Any software that uses the SSL/TLS features of a vulnerable version of OpenSSL. This means Apache, NGINX, Percona Server, MariaDB, the commercial version of MySQL 5.6.6+, Dovecot, Postfix, SSL/TLS VPN software (OpenVPN, for example), instant-messaging clients, and many more. Also, software packages that bundle their own vulnerable version of OpenSSL rather than relying on the system’s version, which might be patched. In other words, it’s probably easier to explain what isn’t affected.

What’s NOT impacted?

SSH does not use SSL/TLS, so you’re OK there. If you downloaded a binary installation of MySQL community from Oracle, you’re also OK, because the community builds use yaSSL, which is not known to be vulnerable to this bug. Obviously, any service which doesn’t use SSL/TLS is not going to be vulnerable, either, since the salient code paths aren’t going to be executed. So, for example, if you don’t use SSL for your MySQL connections, then this bug isn’t going to affect your database server, although it probably still impacts you in other ways (e.g., your web servers).

What about Amazon cloud services?

According to Amazon’s security bulletin on the issue, all vulnerable services have been patched, but they still recommend that you rotate your SSL certificates.

Do I need to upgrade Percona Server, MySQL, NGINX, Apache, or other server software?

No, not unless you built any of these yourself and statically linked them with a vulnerable version of OpenSSL. This is not common. 99% of affected users can fix this issue by upgrading their OpenSSL libraries and cycling their keys/certificates.

What about client-level tools, like Percona Toolkit or XtraBackup?

Again, no. The client sides of Percona Toolkit, Percona XtraBackup, and Percona Cloud Tools (PCT) are not impacted. Moreover, the PCT website has already been patched. Encrypted backups are still secure.

There are some conflicting reports out there about exactly how much information leakage this bug allows. What’s the deal?

Some of the news articles and blogs claim that with this bug, any piece of the server’s memory can be accessed. Others have stated that the disclosure is limited to memory space owned by processes using a vulnerable version of OpenSSL. As far as we are aware, and as reported in CERT’s Vulnerability Notes Database, the impact of the bug is the latter; i.e., it is NOT possible for an attacker to use this exploit to retrieve arbitrary bits of your server’s memory, only bits of memory from your vulnerable services. That said, your vulnerable services could still leak information that attackers could exploit in other ways.

How do I know if I’m affected?

You can test your web server at http://filippo.io/Heartbleed/ – enter your site’s URL and see what it says. If you have Go installed, you can also get a command-line tool from Github and test from the privacy of your own workstation. There’s also a Python implementation. You can also check the version of OpenSSL that’s installed on your servers. If you’re running OpenSSL 1.0.1 through 1.0.1f or 1.0.2-beta, you’re vulnerable. Side note: some distributions, such as RHEL/CentOS, have patched the issue without actually updating the OpenSSL version number; on RPM-based systems you can view the changelog to see if it’s been fixed, for example:

rpm -q --changelog openssl | head -2
* Mon Apr 07 2014 Tomáš Mráz <tmraz@redhat.com> 1.0.1e-16.7
- fix CVE-2014-0160 - information disclosure in TLS heartbeat extension

Also, note that versions of OpenSSL prior to 1.0.1 are not known to be vulnerable. If you’re still unsure, we would be happy to assist you in determining the extent of the issue in your environment and taking any required corrective action. Just give us a call.
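The upstream version check can be encoded in a small shell helper. This is a minimal sketch: it only matches upstream version strings (1.0.1 through 1.0.1f and the 1.0.2 betas) and will mis-classify distribution builds that backport the fix while keeping the old version number, so consult the package changelog as described above in that case:

```shell
#!/bin/sh
# Classify an upstream OpenSSL version string against the known
# Heartbleed-vulnerable range. Distribution backports (e.g. a patched
# 1.0.1e on RHEL/CentOS) are NOT detected here.
is_vulnerable() {
    case "$1" in
        1.0.1|1.0.1[a-f]|1.0.2-beta*) echo "vulnerable" ;;
        *)                            echo "not known vulnerable" ;;
    esac
}

is_vulnerable "1.0.1e"   # prints: vulnerable
is_vulnerable "1.0.1g"   # prints: not known vulnerable
```

Live usage would be something like `is_vulnerable "$(openssl version | awk '{print $2}')"`.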

How can I know if I’ve been compromised?

If you’ve already been compromised, you won’t be able to tell. However, if you use Snort as an IDS, you can use some rules developed by Fox-IT to detect and deter new attacks; other IDS/IPS vendors may have similar rule updates available. Without some sort of IDS in place, however, attacks can and will go unnoticed.

Are there any exploits for this currently out there?

Currently, yes, there are some proofs-of-concept floating around out there, although before this week, that was uncertain. But given that this is likely a 2-year-old bug, I would be quite surprised if someone, somewhere (you came to my talk at Percona Live last week, didn’t you? Remember what I said about assuming that you’re already owned?) didn’t have a solid, viable exploit.

So, then, what should I do?

Ubuntu, RedHat/CentOS, Amazon, and Fedora have already released patched versions of the OpenSSL library. Upgrade now. Do it now, as in right now. If you’ve compiled your own version of OpenSSL from source, upgrade to 1.0.1g or rebuild your existing source with the -DOPENSSL_NO_HEARTBEATS flag.

Once that’s been done, stop any certificate-using services, regenerate the private keys for any services that use SSL (Apache, MySQL, whatever), and then obtain a new SSL certificate from whatever certificate authority you’re using. Once the new certificates are installed, restart your services. You can also, of course, do the private key regeneration and certificate cycling on a separate machine, and only bring the service down long enough to install the new keys and certificates. Yes, you will need to restart MySQL. Or Apache. Or whatever. Full stop, full start, no SIGHUP (or equivalent).
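The key-regeneration step looks roughly like the following. This is a sketch under assumptions: the scratch directory, file names, and subject fields are placeholders, and for a public-facing service you would submit the resulting CSR to your certificate authority rather than self-signing:

```shell
#!/bin/sh
set -e
# Work in a scratch directory; a real deployment would write to wherever
# the service's configuration expects its key material.
D=$(mktemp -d)

# Generate a brand-new 2048-bit RSA private key and a CSR in one step.
openssl req -new -newkey rsa:2048 -nodes \
    -keyout "$D/new.key" -out "$D/new.csr" \
    -subj "/C=US/O=Example/CN=www.example.com"

# Sanity-check that the CSR parses before sending it to the CA.
openssl req -noout -verify -in "$D/new.csr"
```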

Unfortunately, that’s still not all. Keeping in mind the nature of this bug, you should also change / reset any user passwords and/or user sessions that were in effect prior to patching your system and recycling your keys. See, for example, the session hijacking exploit referenced above. Also note that Google services were, prior to their patching of the bug, vulnerable to this issue, which means that it’s entirely possible that your Gmail login (or any other Google-related login) could have been compromised.

Can I get away with just upgrading OpenSSL?

NO. At a bare minimum, you will need to restart your services, but in order to really be sure you’ve fixed the problem, you also need to cycle your keys and certificates (and revoke your old ones, if possible). This is the time-consuming part, but since you have no way of knowing whether or not someone has previously compromised your private keys (and you can bet that now that the bug is public, there will be a lot of would-be miscreants out there looking for servers to abuse), the only safe thing to do is cycle them. You wouldn’t leave your locks unchanged after being burgled, would you?

Also note that once you do upgrade OpenSSL, you can get a quick list of the services that need to be restarted by running the following:

sudo lsof | grep ssl | grep DEL

Where can I get help and/or more information?

In addition to the assorted links already mentioned, you can read up on the nuts and bolts of this bug, or various news articles such as this or this. There are a lot of articles out there right now, most of which are characterizing this as a major issue. It is. Test your vulnerability, upgrade your OpenSSL and rotate your private keys and certificates, and then change your passwords.

The post Heartbleed: Separating FAQ From FUD appeared first on MySQL Performance Blog.

OpenSSL heartbleed CVE-2014-0160 – Data leaks make my heart bleed

April 8, 2014 - 8:04am

The heartbleed bug was introduced in OpenSSL 1.0.1 and is present in

  • 1.0.1
  • 1.0.1a
  • 1.0.1b
  • 1.0.1c
  • 1.0.1d
  • 1.0.1e
  • 1.0.1f

The bug is not present in 1.0.1g, nor in the 1.0.0 or 0.9.8 branches of OpenSSL. Some sources report that 1.0.2-beta is also affected by this bug at the time of writing; however, it is a beta product, and I would really recommend not using beta-quality releases of something as fundamentally important as OpenSSL in production.

The bug itself is within the heartbeat extension of OpenSSL (RFC 6520). The bug allows an attacker to leak memory in up to 64k chunks. This is not to say the data being leaked is limited to 64k, as the attacker can continually abuse this bug to leak data until they are satisfied with what has been recovered.

At worst the attacker can retrieve the private keys, the implication being that the attacker then has the keys to decrypt the encrypted data. As such, the only way to be 100% certain of protection against this bug is to first update OpenSSL (>= 1.0.1g) and then revoke and regenerate keys and certificates. Expect to see a tirade of revocations and re-issuing of CA certs and the like in the coming days.

So how does this affect you as a MySQL user?

Taking Percona Server as an example: it is linked against OpenSSL, meaning that if you want to use TLS for your client connections and/or your replication connections, you need to have OpenSSL installed.

You can find your version easily via your package manager for example:

  • rpm -q openssl
  • dpkg-query -W openssl

If you’re running a vulnerable installation of OpenSSL an update will be required.

  • update OpenSSL to >= 1.0.1g (1.0.1e-2+deb7u5 is reported as patched on Debian; 1.0.1e-16.el6_5.7 is reported as patched on Red Hat and CentOS)
  • shutdown mysqld
  • regenerate keys and certs used by mysql for TLS connections (revoking the old certs if possible to do so)
  • start mysqld
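The regeneration step for MySQL TLS typically means a self-made CA plus a CA-signed server certificate. The sketch below makes several assumptions (scratch directory, file names, 365-day lifetime, subject fields are all placeholders); you would point ssl-ca, ssl-cert and ssl-key in my.cnf at the resulting files before starting mysqld again:

```shell
#!/bin/sh
set -e
D=$(mktemp -d)   # placeholder location; use your real config directory

# 1. New CA key and self-signed CA certificate.
openssl req -new -x509 -nodes -days 365 -newkey rsa:2048 \
    -keyout "$D/ca-key.pem" -out "$D/ca-cert.pem" \
    -subj "/CN=Example MySQL CA"

# 2. New server key and CSR, then sign the CSR with the CA.
openssl req -new -nodes -newkey rsa:2048 \
    -keyout "$D/server-key.pem" -out "$D/server-req.pem" \
    -subj "/CN=mysql.example.com"
openssl x509 -req -days 365 -set_serial 01 \
    -in "$D/server-req.pem" \
    -CA "$D/ca-cert.pem" -CAkey "$D/ca-key.pem" \
    -out "$D/server-cert.pem"

# 3. Sanity check: the server cert must verify against the new CA.
openssl verify -CAfile "$D/ca-cert.pem" "$D/server-cert.pem"
```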

You can read more about the Heartbleed bug at heartbleed.com, and in the Red Hat Bugzilla entry, the MITRE CVE filing, and the Ubuntu Security Notice.

The post OpenSSL heartbleed CVE-2014-0160 – Data leaks make my heart bleed appeared first on MySQL Performance Blog.

Optimizing MySQL Performance: Choosing the Right Tool for the Job

April 7, 2014 - 2:00pm

Next Wednesday, I will present a webinar about MySQL performance profiling tools that every MySQL DBA should know.

Application performance is a key aspect of ensuring a good experience for your end users. But finding and fixing performance bottlenecks is difficult in the complex systems that define today’s web applications. Having a method and knowing how to use the tools available can significantly reduce the amount of time between problems manifesting and fixes being deployed.

In the webinar, titled “Optimizing MySQL Performance: Choosing the Right Tool for the Job,” we’ll start with the basics: top, iostat, and vmstat, then move on to advanced tools like GDB, OProfile, and strace.

I’m looking forward to this webinar and invite you to join us April 16th at 10 a.m. Pacific time. You can learn more and also register here to reserve your spot. I also invite you to submit questions ahead of time by leaving them in the comments section below. Thanks for reading and see you next Wednesday!

The post Optimizing MySQL Performance: Choosing the Right Tool for the Job appeared first on MySQL Performance Blog.

Facebook’s Yoshinori Matsunobu on MySQL, WebScaleSQL & Percona Live

April 4, 2014 - 10:41am

Facebook’s Yoshinori Matsunobu

I spoke with Facebook database engineer Yoshinori Matsunobu here at Percona Live 2014 today about a range of topics, including MySQL at Facebook, the company’s recent move to WebScaleSQL, new MySQL flash storage technologies – and why attending the Percona Live MySQL Conference and Expo each year is very important to him.

Facebook engineers are hosting several sessions at this year’s conference and all have been standing room only. That’s not too surprising considering that Facebook, the undisputed king of online social networks, has 1.23 billion monthly active users collectively contributing to an ocean of data-intensive tasks – making the company one of the world’s top MySQL users. And they are sharing what they’ve learned in doing so this week.

You can read my previous post where I interviewed five Facebook MySQL experts (Steaphan Greene, Evan Elias, Shlomo Priymak, Yoshinori Matsunobu and Mark Callaghan) and below is my short video interview with Yoshi, who will lead his third and final session of the conference today at 12:50 p.m. Pacific time titled, “Global Transaction ID at Facebook.”

The post Facebook’s Yoshinori Matsunobu on MySQL, WebScaleSQL & Percona Live appeared first on MySQL Performance Blog.

