
Feed aggregator

Percona Toolkit 2.2.11 for MySQL is now available

Latest MySQL Performance Blog posts - September 30, 2014 - 6:20am

Percona is pleased to announce the availability of Percona Toolkit 2.2.11.  Released on Sept. 25, Percona Toolkit is a collection of advanced command-line tools to perform a variety of MySQL server and system tasks that are too difficult or complex for DBAs to perform manually. Percona Toolkit, like all Percona software, is free and open source.

This release contains bug fixes for pt-query-digest, pt-mysql-summary, pt-stalk, as well as other tools and is the current GA (Generally Available) stable release in the 2.2 series. Downloads are available here and from the Percona Software Repositories.

Bugs Fixed:

  • Fixed bug #1262456: pt-query-digest didn’t report host details when the host was using the skip-name-resolve option. Fixed by using the IP address of the host instead of its name when the hostname is missing.
  • Fixed bug #1264580: pt-mysql-summary was incorrectly parsing key/value pairs in the wsrep_provider_options option, which resulted in incomplete my.cnf information.
  • Fixed bug #1318985: pt-stalk is now using SQL_NO_CACHE when executing queries for locks and transactions. Previously this could lead to situations where most of the queries that were waiting on query cache mutex were the pt-stalk queries (INNODB_TRX).
  • Fixed bug #1348679: When using the -- -p option to supply the password, pt-stalk would ask the user to re-enter the password every time the tool connected to the server to retrieve information. A new option, --ask-pass, has been introduced so that the password needs to be specified only once.
  • Fixed bug #1368379: A parsing error caused pt-summary (specifically the report_system_info module) to choke on the “Memory Device” parameter named “Configured Clock Speed” when using dmidecode to report memory slot information.

Details of the release can be found in the release notes and the 2.2.11 milestone at Launchpad. Bugs can be reported on the Percona Toolkit launchpad bug tracker.

MySQL & OpenStack: How to overcome issues as your dataset grows

Latest MySQL Performance Blog posts - September 29, 2014 - 6:00am

MySQL is the database of choice for most OpenStack components (Ceilometer is a notable exception). If you start with a small deployment, it will probably run like a charm. But as soon as the dataset grows, you will suddenly face several challenges. We will write a series of blog posts explaining the issues you may hit and how to overcome them.

Where is MySQL used in OpenStack?

Have a look at the logical diagram of OpenStack below (click the image for a larger view).

[Diagram: OpenStack logical architecture]

The diagram is a bit outdated: Neutron appears as Quantum and newer components like Heat are not pictured. But it shows that a database has to be used to store metadata or runtime information. And although many databases are supported, MySQL is the most common choice. Of course MySQL can also be used in instances running inside an OpenStack cloud.

What are the most common issues?

As with many applications, when you start small, the database runs well and maintenance operations are fast and easy to perform. But as the dataset grows, you will find that the following operations become increasingly difficult:

  1. Having good backups: mysqldump is the standard backup tool for small deployments. While backing up an instance with 100GB of data is still quite fast, restore is single-threaded and will take hours. You will probably need to use other tools such as Percona XtraBackup (see the sketch after this list), but what are the tradeoffs?
  2. Changing the schema: whenever you have to add an index, change a datatype or add a column, it can trigger a table rebuild which will prevent writes from proceeding on the table. While the rebuild is fast when the table holds only a few hundred MB of data, ALTER TABLE statements can easily take hours or days for very large tables. Using pt-online-schema-change from Percona Toolkit is a good workaround, but it doesn’t mean that you can blindly run it without any precaution.
  3. Making the database highly available: whenever the database is down, the whole platform is down or runs in a degraded state. So you need to plan for a high availability solution. One option is to use Galera, but that can introduce subtle issues.
  4. Monitoring the health of your database instances: MySQL exposes hundreds of metrics; how do you know which ones to look at to quickly identify potential issues?
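
To illustrate item 1, here is a quick sketch of the two backup approaches (paths are placeholders, the timestamped directory name is illustrative, and XtraBackup’s innobackupex wrapper is assumed to be installed):

# Logical backup: easy to take, but restore replays SQL single-threaded
$ mysqldump --single-transaction --all-databases > /backups/full.sql

# Physical hot backup with Percona XtraBackup: copies the data files,
# then "prepares" them so they are consistent
$ innobackupex /backups/
$ innobackupex --apply-log /backups/2014-09-29_06-00-00/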

Items 1 and 2 are not likely to be issues for the MySQL instance backing your OpenStack cloud, as it will be very small, but they can be big hurdles for guest databases that can grow very large depending on the application.

Items 3 and 4 are highly desirable no matter the size of the database.

Stay tuned for more related posts on MySQL & OpenStack – and feel free to give us your feedback! And remember that if MySQL is showing bad performance in your OpenStack deployment, Percona is here to help. Just give us a call anytime, 24/7. I also invite you and your team to attend the inaugural OpenStack Live 2015 conference, which runs April 13-14, 2015 in Santa Clara, Calif. It runs alongside the Percona Live MySQL Conference and Expo (April 13-16) at the Hyatt Regency Santa Clara and the Santa Clara Convention Center.

‘Bash Bug’ giving you Shellshock? CVE-2014-6271 update

Latest MySQL Performance Blog posts - September 26, 2014 - 9:34am

The media train is at full steam today over the CVE-2014-6271 programming flaw, better known as the “Bash Bug” or “Shellshock” – the original problem was disclosed on Wednesday via this post. First, this issue exploits bash environment variables in order to execute arbitrary commands; a simple check for this, per the Red Hat security blog, is the following:

env x='() { :;}; echo vulnerable' bash -c "echo this is a test"

If you see an error:

bash: warning: x: ignoring function definition attempt
bash: error importing function definition for `x'

then your version of bash is not vulnerable. If instead you see the text “vulnerable,” it is.

The question becomes “how much of a problem is this really?” It’s a valid question given that, even with the potential to exploit this issue via AcceptEnv for SSH connections, the attack appears to be a “local user” exploit.

I’d like to point out that the issue is potentially exploitable in CGI web applications. It’s also worth being aware of a new metasploit module which exploits an issue in dhclient where code injection can occur via a crafted hostname response to DHCP requests; in my personal opinion this is a far wider issue in dhclient itself.

Red Hat also notes that the current fix for “Shellshock” is incomplete, as per CVE-2014-7169.

Is MySQL affected?

MySQL does not appear to be directly affected at this time, unless you have a UDF allowing shell command execution.

Per the MySQL documentation on environment variables, these variables modify the local behavior of the client only, not the server.

Additional resources:

Is my application affected?

There’s no singular answer here given the diversity of applications. For instance, if you’re using PHP and putenv then you potentially have quite a large attack surface in your application for this specific vulnerability. The best recourse is to follow best practices: update to the latest packages, test for the vulnerability, ensure your application is running as a non-privileged user, ensure your application only has access to the MySQL permissions it needs, and run a mandatory access control system, e.g. SELinux or AppArmor, as an additional layer of defense.

Suricata and Snort signatures for Shellshock, as per this Volexity blog post:

Suricata

alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"Volex - Possible CVE-2014-6271 bash Vulnerability Requested (header)"; flow:established,to_server; content:"() {"; http_header; threshold:type limit, track by_src, count 1, seconds 120; sid:2014092401;)

Snort


alert tcp $EXTERNAL_NET any -> $HOME_NET $HTTP_PORTS (msg:"Volex - Possible CVE-2014-6271 bash Vulnerability Requested (header)"; flow:established,to_server; content:"() {"; http_header; threshold:type limit, track by_src, count 1, seconds 120; sid:2014092401;)

Logical MySQL backup tool mydumper 0.6.2 now available

Latest MySQL Performance Blog posts - September 26, 2014 - 2:51am

We are pleased to announce the third release in the 0.6 series of mydumper, a tool for performing logical MySQL backups. In this release, we focused on simplifying compilation of the code and added new features for making logical backups. These new features include enhancements to AWS RDS support and extended TokuDB support.

Due to recent changes in the MySQL client libraries, it became impossible to compile mydumper without the complete MySQL source code. To simplify this, we had to disable the binlog functionality by default, as that was the feature affected by the changes. Now you should be able to compile against any MySQL version without issues. If you still want the binlog feature, it is still there and you can enable it with:

cmake . -DWITH_BINLOG=ON

The one caveat is that you will need to compile against a MySQL version greater than 5.5.34. It’s also possible to compile with binlog enabled against the latest 5.6 versions; in this case, you will need the source code to make some changes. You can find related information about this issue at these locations:

Download mydumper-0.6.2 source code here.

Bugs Fixed:
  • #1347392 Last row of table not dumped if it brings statement over statement_size
  • #1157113 Compilation of latest branch fails on CentOS 6.3 64bit
  • #1326368 Can’t make against Percona-Server-devel-55 headers
  • #1282862 Unknown type name ‘HASH’
  • #1336860 k is used twice
  • #913307 Can’t compile – missing libs crypto and ssl
  • #1364393 Rows chunks doesn’t increase non innodb jobs
New MySQL Backup Features:
  • --lock-all-tables

Use LOCK TABLES on all tables instead of FLUSH TABLES WITH READ LOCK. With this option you will be able to back up RDS instances and also get the binlog coordinates (5.6).

  • TokuDB support

Now TokuDB tables are dumped within the consistent snapshot instead of being locked like MyISAM.

  • Support to dump tables from different schemas

mydumper has two arguments,

    --database (-B)

and

    --tables-list (-T)

so until now you were able to do:

-B db1

or

-B db1 -T t1,t2,tn

to dump a whole database or a list of tables from one database, respectively. In 0.6.2 you can list tables in different databases in this way:

-T db1.t1,db1.t2,db2.t1,db2.t1

NOTE: You should not use -B here because mydumper will take the -T list as table names.
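
Putting it together, a minimal sketch of a cross-schema dump (host, credentials and output directory are placeholder assumptions):

# Dump two tables from db1 and one from db2 in a single run
$ mydumper --host=127.0.0.1 --user=backup --password=secret \
    --tables-list=db1.t1,db1.t2,db2.t1 \
    --outputdir=/backups/mydumper --threads=4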

Looking forward to Oracle OpenWorld 2014; visit us at Booth 2413!

Latest MySQL Performance Blog posts - September 25, 2014 - 1:45pm

I’m excited to once again be heading to San Francisco next week for Oracle OpenWorld, and also very pleased to have a booth there this year along with some great speakers from Percona.

The scope of Oracle OpenWorld 2014, which runs Sept. 28-Oct. 2, is enormous and there are several keynotes and sessions I’m looking forward to attending, especially the talks focused on the areas of:

  • MySQL architecture and application development
  • Cloud and Big Data
  • Database administration and DevOps
  • High-Availability and Replication
  • Performance and Scalability

I also have the honor of speaking on the topic of “Practical MySQL Optimization.” Other sessions led by Percona staff include:

All of us will, at one time or another, be at the Percona booth (Booth 2413), so please stop by and say “hello.” And while you’re there, register to win a new Kindle – the Fire HDX 8.9 Tablet.

We’ll also be raffling off passes to Percona Live London (Nov. 3-4), the annual Percona Live MySQL Conference and Expo 2015 (PLMCE, April 13-16) and the new OpenStack Live 2015 conference, which runs alongside PLMCE April 13-14 – also at the Hyatt Regency Santa Clara & the Santa Clara Convention Center.

Let me know if you’ll be attending and do be sure to stop by our booth and say “hello!” I hope to see you there!

More than 1000 columns – get transactional with TokuDB

Latest MySQL Performance Blog posts - September 25, 2014 - 8:03am

Recently I encountered a situation in which a customer was forced to stay with the MyISAM engine due to a legacy application using tables with over 1000 columns. Unfortunately InnoDB has a limit here. I did not expect to hear this argument for MyISAM; it is usually about the full-text search or spatial index functionality that was missing in InnoDB, and which was introduced in MySQL 5.6 and 5.7, respectively, to let people forget about MyISAM. In this case though, InnoDB still could not be used, so I gave TokuDB a try.

I created a simple bash script to generate a SQL file with a CREATE TABLE statement with the number of columns I desired, and then tried to load this using different storage engines. A bit surprisingly, InnoDB failed with a column count above 1017, a little more than the documented maximum of 1000:

mysql> source /home/vagrant/multicol2.sql
ERROR 1117 (HY000): Too many columns

MyISAM let me create a maximum of 2410 columns, and I could achieve the same result with the TokuDB table! I tried with both the tinyint and char(10) datatypes; the same cap applied, though I’m not quite sure why it’s exactly 2410.

mysql> SELECT tables.TABLE_NAME, count(*) columns, engine, row_format
       FROM information_schema.columns
       JOIN information_schema.tables USING (TABLE_NAME)
       WHERE TABLE_NAME LIKE "multicol%" GROUP BY TABLE_NAME;
+-----------------+---------+--------+-------------+
| TABLE_NAME      | columns | engine | row_format  |
+-----------------+---------+--------+-------------+
| multicol_innodb |    1017 | InnoDB | Compact     |
| multicol_myisam |    2410 | MyISAM | Fixed       |
| multicol_toku   |    2410 | TokuDB | tokudb_zlib |
+-----------------+---------+--------+-------------+
3 rows in set (0.31 sec)
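
The generator script itself wasn’t published; here is a minimal sketch of what such a script could look like (table name, engine and column count are parameters):

#!/bin/bash
# Usage: ./gen_multicol.sh <table_name> <engine> <num_columns> > multicol.sql
TABLE=$1; ENGINE=$2; COLS=$3
echo "CREATE TABLE ${TABLE} ("
for i in $(seq 1 "$COLS"); do
    echo "  c${i} TINYINT,"            # the same cap applied with CHAR(10), too
done
echo "  id INT AUTO_INCREMENT PRIMARY KEY"
echo ") ENGINE=${ENGINE};"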

So if you have that rare kind of table schema with that many columns and you wish to use a transactional storage engine, you may go with TokuDB, also available with recent Percona Server 5.6 versions.

You can find more details about column number limits in MySQL in this post, “Understanding the maximum number of columns in a MySQL table.”

Percona Server, OpenStack and the Tesora DBaaS Platform

Latest MySQL Performance Blog posts - September 24, 2014 - 8:13am

Percona Server and Percona XtraDB Cluster provide high-performance alternatives for MySQL users. We have also seen rapidly growing interest in these solutions in the OpenStack community where higher performance and high availability are crucial. Many OpenStack users are adopting these solutions but we’ve also seen demand from companies creating OpenStack distros. For example, Percona XtraDB Cluster is now certified for the RHEL OSP (OpenStack Platform) and is included in the Ubuntu 14.04 LTS release. Tesora recently joined this rapidly growing list when they announced the Tesora DBaaS Platform Enterprise Edition which includes Percona Server.

The Tesora platform is an enterprise-ready database as a service (DBaaS) platform based on the OpenStack Trove project. Percona Server was certified by Tesora in August as part of their Tesora DBaaS Platform certification program.

What does this mean for you if you are one of the users who have downloaded Percona Server more than 1,000,000 times?

Many enterprises are evaluating how to deliver robust, high-performance MySQL-as-a-service. OpenStack Trove is an open-source platform that addresses this challenge. OpenStack operators can build and offer multiple databases as a service (such as MySQL, MongoDB, or Redis). These users can create, operate, maintain and delete different kinds of SQL or NoSQL databases on demand. Users don’t need to worry about the administrative details of the database technology nor complexities such as availability, resiliency, backups, recovery and security.

The Tesora DBaaS Platform Enterprise Edition builds upon OpenStack Trove and makes the “provisioning, deployment, configuration, tuning, monitoring and administration” simpler for operators. The platform includes support for multiple database backends including Percona Server which is certified on the Tesora DBaaS platform. Administrators and DBAs looking to build and offer a database as a service using the Tesora DBaaS Platform Enterprise Edition can rest assured that Percona Server is ready to meet their needs.

OpenStack is a rapidly evolving open-source platform that depends heavily on MySQL for optimal performance. Percona is participating in the Trove project and is sharing our knowledge on OpenStack through webinars (such as “OpenStack: A MySQL DBA Perspective“), blog posts (such as “OpenStack users shed light on Percona XtraDB Cluster deadlock issues“), professional services (such as Percona Consulting Services) and the recently announced OpenStack Live Conference, which will be held April 13-14, 2015 at the Santa Clara Convention Center in Silicon Valley (the Call for Speakers is open until November 9!).

We look forward to working with Tesora as they build out their platform as well as the rest of the OpenStack community as this exciting technology continues to mature. I hope you can join us in Santa Clara for the OpenStack Live Conference this April – submit your speaker proposal now or purchase your ticket at Super Saver prices!

How to scale big data applications using MySQL sharding frameworks

Latest MySQL Performance Blog posts - September 23, 2014 - 12:00am

This Wednesday I’ll be discussing two common types of big data: machine-generated data and user-generated content. These types of big data are amenable to sharding, a commonly used technique for spreading data over more than one database server.

I’ll be discussing this in-depth during a live webinar at 10 a.m. Pacific time on Sept. 24. I’ll also talk about two major sharding frameworks: MySQL Fabric and Shard-Query, for OLTP and OLAP workloads, respectively. Following the webinar there will be a brief Q&A session.

Find the webinar link here: “How to Scale Big Data Applications Using MySQL Sharding Frameworks” for more information or register directly here.

Find Shard-Query (part of Swanhart-Tools) here, on GitHub
Find MySQL Fabric (part of MySQL Utilities) here, at the MySQL documentation page

Should you migrate to Percona XtraDB Cluster?

Latest MySQL Performance Blog posts - September 22, 2014 - 9:02am

Interest in Percona XtraDB Cluster / Galera has been high ever since we introduced the product in 2012.  I typically have a conversation about Galera and Percona XtraDB Cluster (PXC) at least once a week with a consulting customer who wants to know if it will be a good fit for their application.  Last week I gave a webinar entitled “Migrating to Percona XtraDB Cluster.”

I covered everything in the webinar that I feel is important for someone considering Galera to know, and I’d suggest anyone who wants a brief overview of PXC/Galera spend an hour watching the recording. Many questions were asked during the webinar, and I answered all of those regarding Percona XtraDB Cluster. Access to the webinar is free, along with download of the accompanying slides.

* * *

What is Percona XtraDB Cluster? PXC is a replacement for conventional MySQL master/slave architectures to eliminate replication lag and achieve a highly-available masterless cluster of MySQL servers. Like all Percona software, PXC is open source and free.

MySQL upgrade best practices

Latest MySQL Performance Blog posts - September 19, 2014 - 10:06am

MySQL upgrades are necessary tasks and we field a variety of questions here at Percona Support regarding MySQL upgrade best practices. This post highlights recommended ways to upgrade MySQL in different scenarios.

Why are MySQL upgrades needed? The reasons are many and include access to new features, performance benefits and bug fixes. However, MySQL upgrades can be risky if not tested extensively beforehand with your application, because the process might break the application, prevent it from functioning properly – or performance issues could arise following the upgrade. Moreover, I suggest keeping an eye on new releases of MySQL and Percona Server – check what has changed in the most recent version. Perhaps the latest release has a fix for an issue that you have been experiencing.

Upgrading one major version via SQL Dump:

Upgrading by one major version covers upgrading from Percona Server 5.1 to 5.5 or Percona Server 5.5 to 5.6; the same applies to Oracle MySQL.

First of all, upgrading by one major version is neither straightforward nor risk-free. Initially you should read the “Upgrading from Previous Series” documentation here and here. In that documentation, pay special attention to all of the sections marked “Incompatible Change” and check whether you may be affected by those changes. There might be configuration changes, renamed variables, a few older variables made obsolete and new variables introduced – so make sure that you adjust your my.cnf accordingly. For Percona Server specific changes, please refer here and here for Percona Server 5.5 and Percona Server 5.6, respectively.

Now there are several possible approaches you may take, where one may be more feasible than another depending on the current replication topology and total data size – and one might also be safer than another. Let me show you an upgrade procedure, using an example of upgrading from Percona Server 5.5 to Percona Server 5.6.

In general, there are two types of MySQL upgrades:

  • In-place, where you use the existing datadir with the new MySQL major version, just running mysql_upgrade after the binaries are upgraded,
  • SQL dump on an old version and then restore it on a new version (using mysqldump utility or alternatives, like mydumper).

In general the second type is safer, but as you may expect, it is a much slower MySQL upgrade process.

Theoretically, the safest scenario is the SQL dump and restore. Here’s a basic procedure (you should stop application writes before starting):

1) Capture users and permissions information. This will back up all your existing user privileges.

$ wget percona.com/get/pt-show-grants
$ perl pt-show-grants --user=root --ask-pass --flush > /root/grants.sql

2) Produce a logical dump of the 5.5 instance, excluding the mysql, information_schema and performance_schema databases:

$ mysql -BNe "SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA WHERE SCHEMA_NAME NOT IN ('mysql', 'performance_schema', 'information_schema')" | tr 'n' ' ' > /root/dbs-to-dump.sql $ mysqldump --routines --events --single-transaction --databases $(cat /root/dbs-to-dump.sql) > /root/full-data-dump.sql

3) Stop 5.5 instance.

$ service mysql stop
or
$ /etc/init.d/mysql stop

4) Move the old datadir (assuming /var/lib/mysql; edit according to your setup):

$ mv /var/lib/mysql/ /var/lib/mysql-55

5) Install 5.6 (simply as you would do when not upgrading). If you don’t use a package manager (yum/apt-get) then it is likely that you will need to run mysql_install_db and mysql_upgrade.
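
For a manual (non-package-manager) install, that step might look like this sketch (paths assume a default layout; adjust to your setup):

$ mysql_install_db --user=mysql --datadir=/var/lib/mysql
$ service mysql start
$ mysql_upgrade -uroot -p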

6) Load the users back to new upgraded version of MySQL.

$ mysql -uroot < /root/grants.sql

7) Load the data back to new version of MySQL.

$ mysql -e "SET GLOBAL max_allowed_packet=1024*1024*1024"; $ mysql -uroot -p --max-allowed-packet=1G < /root/full-data-dump.sql;

At this point all tables have been re-created and loaded in MySQL 5.6, so every binary representation is native to MySQL 5.6. You’ve also completed the cleanest/most-stable upgrade path, and your application can resume service. It’s worth mentioning that this upgrade path is the same whether you are upgrading vanilla MySQL or Percona Server. Further, you may upgrade from Oracle MySQL to Percona Server, for example upgrading Oracle MySQL 5.5 to Percona Server 5.6. Again, the upgrade path as described would be the same, as Percona Server is a drop-in replacement for Oracle MySQL.

“SQL dump” is also known as a logical backup. It is safer in the sense that when restoring, all tables will be created using the format of the new MySQL binaries you’re using, which bypasses compatibility issues in general. Still, for large datasets – hundreds of gigabytes or terabytes – this may be a very time-consuming approach. On the other hand, by dumping/reloading such large data sets, it is possible that you will be able to recover a lot of free disk space, as the InnoDB tablespaces will be re-created from scratch, thus optimized and defragmented. If the data was often updated/deleted, the benefits may be significant.

Minor version MySQL upgrade within the same major version via In-Place Upgrade:

This applies to upgrading within the same series, e.g. MySQL 5.5.35 to MySQL 5.5.38 or Percona Server 5.6.14 to the latest Percona Server 5.6.20.

This is known as an in-place upgrade, where you just install the newer binary package and then run the mysql_upgrade script, which checks and updates system tables if necessary. Still, even with an in-place upgrade we highly recommend checking the release notes for new features, bug fixes, etc. For Percona Server 5.5 and Percona Server 5.6, release notes can be found here and here, respectively.

For Percona Server we have additional documents describing some details of upgrading Percona Server, with a focus on Percona-specific features, which can be found here and here. These also cover the complete in-place upgrade procedure with the yum/apt package manager.

Also, to be on the safe side you can do the upgrade with a logical dump using the procedure described earlier, via the mysqldump or mydumper program – where the latter performs parallel backups and restores – which is the safest approach for the upgrade.

MySQL Upgrade directly to the latest version by skipping one major version in between:

This includes upgrading from MySQL 5.0 to MySQL 5.5 by skipping version 5.1 in between, or upgrading MySQL 5.1 to MySQL 5.6 by skipping version 5.5 in between. It also includes upgrading to MySQL 5.6 directly from MySQL 5.0, although there should be very few users still on version 5.0. The same applies to Percona Server.

For this topic, assume upgrading from Oracle MySQL or Percona Server 5.1 directly to version 5.6, skipping major version 5.5 in between.

Before anything, this is a serious upgrade, and a huge step over one major MySQL version. That is, it’s risky. Upgrading with just a binaries update is not supported and it’s not safe to skip major versions in between, so you should never do this from 5.0->5.5 or 5.1->5.6, and surely not for 5.0->5.6. One problem is that not all changes in MySQL versions are backwards compatible. Some differences were introduced that may affect how the data is handled as well as how the server behaves, including the SQL language and MySQL server and storage engine internals. Another thing is that between MySQL 5.0 and 5.6, a number of default settings were changed, which may result in completely different, unexpected behavior. For example, since MySQL 5.5 the default storage engine is InnoDB, since MySQL 5.6 InnoDB will by default use a separate tablespace for each table, and GTID replication was also introduced in 5.6. There are many more details which I won’t list here. All of those changes are described in the “Upgrading from Previous Series” documentation mentioned above.
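
As an illustration of how the defaults shifted between series, here is a my.cnf fragment pinning a few of the settings mentioned above (the values shown are the 5.5/5.6 defaults; verify against your own baseline before relying on them):

[mysqld]
default_storage_engine = InnoDB   # default since MySQL 5.5 (was MyISAM before)
innodb_file_per_table  = 1        # default ON since MySQL 5.6
gtid_mode              = OFF      # GTIDs introduced in 5.6, disabled by default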

It’s worth repeating that upgrading by skipping one major version is highly discouraged. Upgrading from MySQL 5.1 to 5.6 shouldn’t be done in one shot. Instead, I would suggest upgrading from version 5.1 to 5.5, and then from version 5.5 to 5.6, running mysql_upgrade at each step. That will cope with the changes in formats as explained in the manual.

MySQL Upgrade Precautions:

MySQL upgrade precautions are an essential part of the upgrade itself. Before you upgrade make sure you have thoroughly tested all application parts with the desired version of MySQL. This is especially needed for an upgrade between major versions or if you are upgrading by skipping one major version in-between (e.g. upgrade from MySQL 5.1 to MySQL 5.6).

Make sure you read release notes carefully and that you are aware of all the changes. You can find Oracle MySQL 5.5 and 5.6 release notes as follows:

http://dev.mysql.com/doc/relnotes/mysql/5.5/en/
http://dev.mysql.com/doc/relnotes/mysql/5.6/en/

Percona Server specific release notes for the same versions can be found below:

http://www.percona.com/doc/percona-server/5.5/release-notes/release-notes_index.html
http://www.percona.com/doc/percona-server/5.6/release-notes/release-notes_index.html

If you are planning to upgrade to Oracle MySQL 5.6 or Percona Server 5.6, I would recommend first checking for existing critical bugs. Bugs you should be aware of:

http://bugs.mysql.com/bug.php?id=66546
http://bugs.mysql.com/bug.php?id=68953
http://bugs.mysql.com/bug.php?id=69444
http://bugs.mysql.com/bug.php?id=70922
http://bugs.mysql.com/bug.php?id=72794
http://bugs.mysql.com/bug.php?id=73820

Upgrade Hierarchy:

This is yet another important aspect of any MySQL upgrade. You should plan your upgrade along with an upgrade hierarchy. The recommendation is always: upgrade your dev/QA servers first, then staging servers, before moving to production. In fact, you can keep spare upgraded instances running the desired version of MySQL and test your application against them extensively.

Once you are happy with the MySQL upgrade on your test and staging servers, you can begin the upgrade on your production servers. In replication environments, we highly recommend upgrading the MySQL slaves first (one by one) and then finally upgrading the MySQL master. In reality, you can upgrade one of the slaves first and run it for a few days to be on the safe side, all the while closely monitoring its performance. If you don’t have a replication setup, it may be worth creating a replica to test the newer version of MySQL on first. Once you are happy with the results, you can upgrade the remaining slaves and finally the master.

 How Percona software helps you in a MySQL upgrade:

In any MySQL upgrade, Percona Toolkit comes to the rescue. Percona Toolkit contains a number of tools that help a great deal.

pt-upgrade is one such tool. It allows you to test whether the new MySQL instance handles specific queries at least as fast as the old version. There may be some substantial differences, as the MySQL query optimizer has changed a lot between versions 5.1 and 5.6, and data statistics may be refreshed, hence the query plan may change. You can check further in the manual about optimizer changes.
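
A minimal pt-upgrade invocation might look like this sketch (hostnames and the slow log path are placeholders):

$ pt-upgrade /var/lib/mysql/slow.log h=server-55.example.com h=server-56.example.com > pt-upgrade-report.txt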

pt-query-digest is another great tool that can help you in the upgrade. You can replay your slow query log against the existing and new desired MySQL versions for before-and-after query performance validation.
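
For example, a simple before/after comparison could digest the slow log captured on each version and diff the reports (file names are placeholders):

$ pt-query-digest slow-55.log > digest-55.txt
$ pt-query-digest slow-56.log > digest-56.txt
$ diff digest-55.txt digest-56.txt | less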

You can also benefit from Percona Cloud Tools for MySQL, a hosted service providing access to query performance insights for all MySQL users. You can sign up for free now because this service is in public beta. Percona Cloud Tools, among other things, allows you to visually check your queries’ performance after a MySQL upgrade.

It’s highly recommended to back up your data before your MySQL upgrade. Percona XtraBackup is free and open source (like all Percona software). It’s a hot backup tool which backs up your data online without sacrificing read/write availability of the database, and with minor impact.

Last but not least, you will find this post pretty useful, too: “Upgrading MySQL.” It’s a few years old but still very relevant. Also take a look at this informative webinar, “Upgrading to MySQL 5.6: Best Practices.” Both are from Percona CEO Peter Zaitsev.

Conclusion:
A MySQL upgrade might look like a simple task, but actually it’s not. I’ve tried to cover most of the MySQL upgrade scenarios that you will encounter in this post. Again, I recommend testing your application parts thoroughly before pushing the upgrade “live,” otherwise it may break your application or part of it, or degrade performance instead of delivering a gain. Finally, I recommend having a downgrade plan in place before the MySQL upgrade, just in case something goes wrong. Planning a proper downgrade procedure will minimize your app’s downtime when things go wrong. I’m looking forward to your comments and questions below.

Managing big data? Say ‘hello’ to HP Vertica

Latest MySQL Performance Blog posts - September 18, 2014 - 7:29am

Over the past few months, I’ve seen an increase in the following use case while working on performance and schema review engagements:

I need to store exponentially increasing amounts of data and analyze all of it in real-time.

This is also known simply as: “We have big data.” Typically, this data is used for user interaction analysis, ad tracking, or other common click stream applications. However, it can also be seen in threat assessment (DDoS mitigation, etc.), financial forecasting, and other applications as well. While MySQL (and other OLTP systems) can handle this to a degree, it is by no means a forte. Some of the pain points include:

  • Cost of rapidly increasing, expensive disk storage (OLTP disks need to be fast == $$)
  • Performance decrease as the data size increases
  • Wasted hardware resources (excess I/O, etc)
  • Impact against other time-sensitive transactions (i.e. OLTP workload)

While there are many approaches to this problem – and oftentimes the solution is actually a hybrid of many individually tailored components – a solution that I have seen more frequently in recent work is HP Vertica.

At the 30,000-foot level, Vertica is built around the following principles:

  • Columnar data store
  • Highly compressed data
  • Clustered solution for both availability and scalability

Over the next few weeks, I’ll discuss several aspects of Vertica including:

  • Underlying architecture and concepts
  • Basic installation and use
  • Different data loading techniques
  • Some various maintenance/operational procedures
  • Some comparisons vs. traditional OLTP (MySQL) performance
  • Some potential use-cases
  • Integration with other tools (such as Hadoop)

While Vertica is by no means the silver bullet that will solve all of your needs, it may prove to be a very valuable tool in your overall approach to managing big data.

Syncing MySQL slave table with pt-online-schema-change

Latest MySQL Performance Blog posts - September 17, 2014 - 7:06am

I recently encountered a situation in which, after running Percona Toolkit’s pt-table-checksum on a customer system, 95% of one table on the MySQL master differed from the copy on the MySQL slave. Although this table was not a critical part of the infrastructure, from time to time writes to it from the master would break replication. Additionally, this table has about 6 million rows, and running pt-table-sync would take some time. Rebuilding the slave from a backup of the master would not be an easy option either, since the slave acts as an archive and holds a lot more data than the master.

So how did we solve it? With pt-online-schema-change and a NOOP ALTER.

pt-online-schema-change --alter 'ENGINE=INNODB' D=dbname,t=tblname

How is this possible? pt-online-schema-change works by creating a shadow copy of the original table and copying the rows from the original table into the shadow copy. An additional set of TRIGGERs ensures that any changes made to rows after they have been copied will also propagate to the shadow copy.

There is one little caveat though: the binlog_format on the master has to be ROW, so the actual row images from the master get copied to the slave. If your binlog_format is set to STATEMENT, you’d still end up with the same inconsistency: since the change statements will be logged as STATEMENT, the NOOP ALTER will operate on the slave’s own table instead of copying from the master. You can configure the binlog_format for the alter process via the --set-vars option, i.e. --set-vars 'binlog_format=ROW', to make it easier!
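
Putting it together, the full invocation could look like this sketch (database and table names are placeholders; note that pt-online-schema-change requires --execute to actually perform the change):

pt-online-schema-change --alter 'ENGINE=INNODB' --set-vars 'binlog_format=ROW' --execute D=dbname,t=tblname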

Percona Live London: Top Ten reasons to attend Nov. 3-4

Latest MySQL Performance Blog posts - September 16, 2014 - 12:00am

Percona Live London 2014 is fast approaching – November is just around the corner. This year’s conference, November 3-4, will be even bigger and better than last year thanks to the participation of leading MySQL experts the world over (including you!).

The Percona Live London MySQL Conference is a great event for users of any level using any of the major MySQL branches: MySQL, MariaDB or Percona Server. And this year we once again host a star-studded group of keynote speakers from industry-leading companies in the MySQL space.

We’ll also be welcoming leading MySQL practitioners from across the industry (and from all corners of the world) who will speak on topics that matter to you now – see the full conference schedule here:

Monday starts early with a full day of tutorials and a fun evening at the community dinner. Attendees will be arriving in true London style on a double-decker bus! Tuesday morning will kick off with a series of keynotes, followed by interactive breakout sessions – wrapping things up at the end of the day with a fun post-conference reception (a great chance to make new friends and reconnect with old ones).

Here’s a sneak peek at some of the must-see events this year:

To recap, here are the Top Ten reasons to attend Percona Live London this November 3-4:

10. Advanced Rate Pricing ends October 5.
9. Hear about the hottest current topics and trends.
8. Network! Meet face-to-face in the “hallway track” and make lasting connections.
7. Learn how to make MySQL work better for you – regardless of your expertise.
6. Have a blast at the community dinner!
5. Discuss your unique challenges with experts and discover options for solving them.
4. Engage with the sponsors at their tabletop exhibits.
3. Listen to top industry leaders describe the future of the MySQL ecosystem.
2. Learn what works, and what doesn’t, from leading companies using MySQL.
1. And the Number 1 reason to attend Percona Live London 2014: ALL of the above!

I look forward to seeing you in London this November, and don’t forget that Advanced Rate pricing ends October 5, so be sure to register now!

OpenStack Live 2015: Call for speakers open through November 9

Latest MySQL Performance Blog posts - September 12, 2014 - 10:32am

I am proud to announce OpenStack Live, a new annual conference that will run in parallel with the Percona Live MySQL Conference & Expo at the Santa Clara Convention Center in Silicon Valley. The inaugural event, OpenStack Live 2015, is April 13-14, 2015. We are lining up a strong Conference Committee and are now accepting tutorial and breakout session speaking proposals through November 9.

OpenStack Live will emphasize the essential elements of making OpenStack work better with emphasis on the critical role of MySQL and the value of Trove. You’ll hear about the hottest current topics, learn about operating a high-performing OpenStack deployment, and listen to top industry leaders describe the future of the OpenStack ecosystem. We are seeking speaking proposals on the following topics:

  • Performance Optimization of OpenStack
  • OpenStack Operations
  • OpenStack Trove
  • Replication and Backup for OpenStack
  • High Availability for OpenStack
  • OpenStack User Stories
  • Monitoring and Tools for OpenStack

The conference features a full day of keynotes, breakout sessions, and Birds of a Feather sessions on April 14, preceded by an optional day of tutorials on April 13. A Monday reception will be held on the exhibit floor, and joint lunches offer you the opportunity to network with both the OpenStack and MySQL communities. The OpenStack Live conference is a great event for users of any level.

As a bonus, OpenStack Live attendees may attend any Percona Live MySQL Conference session during the days of the OpenStack event. Conference-only passes are available for April 14, and conference-and-tutorial passes are available for both April 13 and 14.

If you are using OpenStack and have a story to share – or a skill to teach – then now is the time to put pen to paper (or fingers to keyboard) and write your speaking proposal for either breakout or tutorial sessions (or both). Submissions will be reviewed by the OpenStack Live Conference Committee, which includes:

  • Mark Atwood: Director – Open Source Evangelism for HP Cloud Services
  • Rich Bowen: OpenStack Community Liaison at Red Hat
  • Jason Rouault: Senior Director OpenStack Cloud at Time Warner Cable
  • Peter Boros: Principal Architect at Percona

Presenting at OpenStack Live 2015 is your chance to put your ideas, case studies, best practices and technical knowledge in front of an intelligent, engaged audience of OpenStack users. If selected as a speaker by our Conference Committee, you will receive a complimentary full conference pass.

Public speaking not your thing, or do you just want to learn about the latest and greatest OpenStack technologies, deployments and projects? Then register now and save big with our early bird discount. OpenStack Live 2015 is an ideal opportunity for organizations to connect with the community of OpenStack enthusiasts from Silicon Valley and around the world. The Percona Live MySQL Conference this past April had over 1,100 registered attendees from 40 countries, and the OpenStack Open Source Appreciation Day on the Monday before the conference was fully booked – so don’t delay, register today to save your seat!

We are currently accepting sponsors. You can learn more about sponsorship opportunities here.

I hope to see you at OpenStack Live 2015 next April! And speakers, remember the deadline to submit your proposals is November 9. In the meantime you can learn more by visiting the official OpenStack Live 2015 website.

Percona XtraBackup 2.2.4 is now available

Latest MySQL Performance Blog posts - September 12, 2014 - 6:47am

Percona is glad to announce the release of Percona XtraBackup 2.2.4 on September 12, 2014. Downloads are available from our download site here and from the Percona Software Repositories.

Percona XtraBackup enables backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backup.

New Features:

  • Percona XtraBackup has implemented support for Galera GTID auto-recovery. When backing up a server with backup locks support, Percona XtraBackup retrieves the GTID information from the InnoDB trx header on recovery and creates the xtrabackup_galera_info file during that stage.

Bugs Fixed:

  • Percona XtraBackup is now built with the system zlib library instead of the older bundled one. Bug fixed #1108016.
  • apt-get source was downloading an older version of Percona XtraBackup. Bug fixed #1363259.
  • innobackupex would ignore the --databases option without the --stream option and back up all the databases. Bug fixed #569387.
  • The rsync package wasn’t a dependency although it is required for the innobackupex --rsync option. Bug fixed #1259436.
  • innobackupex --galera-info was checking only for non-capitalized wsrep_* status variables which was incompatible with MariaDB Galera Cluster 10.0. Bug fixed #1306875.
  • Percona XtraBackup would crash trying to remove absent table from InnoDB data dictionary while preparing a partial backup. Bug fixed #1340717.
  • Percona XtraBackup now supports MariaDB GTID. Bugs fixed #1329539 and #1326967 (Nirbhay Choubey).
  • MariaDB 10.1 is now added to the list of supported servers. Bug fixed #1364398.
  • Percona XtraBackup would fail to restore (copy-back) tables that have partitions with their own tablespace location. Bug fixed #1322658.

Other bugs fixed: #1333570, #1326224, and #1181171.

Release notes with all the bug fixes for Percona XtraBackup 2.2.4 are available in our online documentation. Bugs can be reported on the Launchpad bug tracker.

OpenStack: A MySQL DBA Perspective – Sept. 17 webinar

Latest MySQL Performance Blog posts - September 11, 2014 - 1:58pm

I’ll have the pleasure of presenting a webinar titled “OpenStack: A MySQL DBA Perspective” next Wednesday, September 17, at 10 a.m. PDT (1 p.m. EDT). Everyone is invited.

The webinar will be divided into two parts. The first part will cover how MySQL can be used by the OpenStack infrastructure, including the expected load, high-availability solutions and geo-DR.

The second part will focus on the use of MySQL within an OpenStack cloud. We’ll look into the various options that are available, both the traditional ones and Trove. We’ll also discuss the block device options with regard to MySQL performance and, finally, we’ll review the high-availability implications of running MySQL in an OpenStack cloud.

Register here. I look forward to your questions, and if you have any related to OpenStack that I can help with in advance of the webinar please feel free to post those in the comments section below. I’ll write a followup post after the webinar to recap all related questions and answers. I’ll also provide the slides.

See you next Wednesday!

OpenStack users shed light on Percona XtraDB Cluster deadlock issues

Latest MySQL Performance Blog posts - September 11, 2014 - 8:27am

I was fortunate to attend an Ops discussion about databases at the OpenStack Summit in Atlanta this past May as one of the panelists. The discussion was about deadlock issues OpenStack operators see with Percona XtraDB Cluster (this is applicable to any Galera-based solution). I asked them to describe what they were seeing, and as it turned out, Nova and Neutron use the SELECT … FOR UPDATE SQL construct quite heavily. This is a topic I thought was worth writing about.

Write set replication in a nutshell (with oversimplification)

Any node is writable, and replication happens in write sets. A write set is practically a row-based binary log event or events plus “some additional stuff.” The “some additional stuff” is good for two things.

  • Two write sets can be compared to determine whether they conflict.
  • A write set can be checked against a database to determine whether it’s applicable.

Before committing on the originating node, the write set is transferred to all other nodes in the cluster. The originating node checks that the transaction does not conflict with any of the transactions in the receive queue and checks whether it’s applicable to the database. This process is called certification. After the write set is certified, the transaction is committed. The remote nodes do certification asynchronously compared to the local node. Since certification is deterministic, they will get the same result, and for the same reason the write set can be applied later on the remote nodes. This kind of replication is called virtually synchronous, which means that the data transfer is synchronous, but the actual apply is not.

We have a nice flowchart about this.

Since the write set is only transferred before commit, InnoDB row-level locks, which are held locally, are not held on remote nodes (if they were, each row lock would take a network round trip to acquire). This also means that by default, if multiple nodes are used, the ability to read your own writes is not guaranteed: a certified transaction that is already committed on the originating node can still sit in the receive queue of the node the application is reading from, waiting to be applied.

SELECT … FOR UPDATE

The SELECT … FOR UPDATE construct reads the given records in InnoDB, and locks the rows that are read from the index the query used, not only the rows that it returns. Given how write set replication works, the row locks of SELECT … FOR UPDATE are not replicated.

Putting it together

Let’s create a test table.

CREATE TABLE `t` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

And some records we can lock.

pxc1> insert into t values();
Query OK, 1 row affected (0.01 sec)
pxc1> insert into t values();
Query OK, 1 row affected (0.01 sec)
pxc1> insert into t values();
Query OK, 1 row affected (0.01 sec)
pxc1> insert into t values();
Query OK, 1 row affected (0.00 sec)
pxc1> insert into t values();
Query OK, 1 row affected (0.01 sec)

pxc1> select * from t;
+----+---------------------+
| id | ts                  |
+----+---------------------+
|  1 | 2014-06-26 21:37:01 |
|  4 | 2014-06-26 21:37:02 |
|  7 | 2014-06-26 21:37:02 |
| 10 | 2014-06-26 21:37:03 |
| 13 | 2014-06-26 21:37:03 |
+----+---------------------+
5 rows in set (0.00 sec)

On the first node, lock the record.

pxc1> start transaction;
Query OK, 0 rows affected (0.00 sec)
pxc1> select * from t where id=1 for update;
+----+---------------------+
| id | ts                  |
+----+---------------------+
|  1 | 2014-06-26 21:37:01 |
+----+---------------------+
1 row in set (0.00 sec)

On the second, update it with an autocommit transaction.

pxc2> update t set ts=now() where id=1;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0
pxc1> select * from t;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

Let’s examine what happened here. The local record lock held by the open transaction on pxc1 didn’t play any part in replication or certification (replication happens at commit time, and there was no commit yet). Once the node received the write set from pxc2, that write set had a conflict with a transaction still in flight locally. In this case, our transaction on pxc1 has to be rolled back. This is a type of conflict as well, but here the conflict is not caught at certification time. This is called a brute force abort: it happens when a transaction applied by a slave thread conflicts with a transaction that is in flight on the node. The first commit wins (the one already replicated) and the original transaction is aborted. Jay Janssen discusses multi-node writing conflicts in detail in this post.

The same thing happens when two nodes hold record locks via SELECT … FOR UPDATE. Whichever node commits first wins; the other transaction hits the deadlock error and is rolled back. This behavior is correct.

Here is the same SELECT … FOR UPDATE transaction overlapping on the two nodes.

pxc1> start transaction;
Query OK, 0 rows affected (0.00 sec)
pxc2> start transaction;
Query OK, 0 rows affected (0.00 sec)

pxc1> select * from t where id=1 for update;
+----+---------------------+
| id | ts                  |
+----+---------------------+
|  1 | 2014-06-26 21:37:48 |
+----+---------------------+
1 row in set (0.00 sec)
pxc2> select * from t where id=1 for update;
+----+---------------------+
| id | ts                  |
+----+---------------------+
|  1 | 2014-06-26 21:37:48 |
+----+---------------------+
1 row in set (0.00 sec)

pxc1> update t set ts=now() where id=1;
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0
pxc2> update t set ts=now() where id=1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

pxc1> commit;
Query OK, 0 rows affected (0.00 sec)
pxc2> commit;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction

Where does this happen in OpenStack?

For example, in OpenStack Nova (the compute project in OpenStack), quota usage tracking uses the SELECT … FOR UPDATE construct.

# User@Host: nova[nova] @ [10.10.10.11] Id: 147
# Schema: nova Last_errno: 0 Killed: 0
# Query_time: 0.001712 Lock_time: 0.000000 Rows_sent: 4 Rows_examined: 4 Rows_affected: 0
# Bytes_sent: 1461 Tmp_tables: 0 Tmp_disk_tables: 0 Tmp_table_sizes: 0
# InnoDB_trx_id: C698
# QC_Hit: No Full_scan: Yes Full_join: No Tmp_table: No Tmp_table_on_disk: No
# Filesort: No Filesort_on_disk: No Merge_passes: 0
# InnoDB_IO_r_ops: 0 InnoDB_IO_r_bytes: 0 InnoDB_IO_r_wait: 0.000000
# InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000
# InnoDB_pages_distinct: 2
SET timestamp=1409074305;
SELECT quota_usages.created_at AS quota_usages_created_at,
       quota_usages.updated_at AS quota_usages_updated_at,
       quota_usages.deleted_at AS quota_usages_deleted_at,
       quota_usages.deleted AS quota_usages_deleted,
       quota_usages.id AS quota_usages_id,
       quota_usages.project_id AS quota_usages_project_id,
       quota_usages.user_id AS quota_usages_user_id,
       quota_usages.resource AS quota_usages_resource,
       quota_usages.in_use AS quota_usages_in_use,
       quota_usages.reserved AS quota_usages_reserved,
       quota_usages.until_refresh AS quota_usages_until_refresh
FROM quota_usages
WHERE quota_usages.deleted = 0
  AND quota_usages.project_id = '12ce401aa7e14446a9f0c996240fd8cb'
FOR UPDATE;

So where does it come from?

These constructs are generated by SQLAlchemy using with_lockmode(‘update’). Even in Nova’s pydoc, it’s recommended to avoid with_lockmode(‘update’) whenever possible. Galera replication is not mentioned among the reasons to avoid this construct, but knowing how many OpenStack deployments use Galera for high availability (either Percona XtraDB Cluster, MariaDB Galera Cluster, or Codership’s own mysql-wsrep), it can be a very good reason to avoid it. The solution proposed in the linked pydoc above is also a good one: an INSERT INTO … ON DUPLICATE KEY UPDATE is a single atomic write, which will be replicated as expected, and it will also keep correct track of quota usage.
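
As a rough illustration of that pattern (the table and column names below are simplified assumptions, not Nova’s actual schema; a unique key on (project_id, resource) is assumed):

-- A single atomic write: no row lock is held across a round trip,
-- so there is nothing for a remote write set to brute-force abort.
INSERT INTO quota_usages (project_id, resource, in_use)
VALUES ('12ce401aa7e14446a9f0c996240fd8cb', 'instances', 1)
ON DUPLICATE KEY UPDATE in_use = in_use + 1;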

The simplest way to overcome this issue from the operator’s point of view is to use only one writer node for these types of transactions. This usually involves a configuration change at the load-balancer level. See this post for possible load-balancer configurations.

Generating test data from the mysql> prompt

Latest MySQL Performance Blog posts - September 10, 2014 - 1:00am

There are a lot of tools that generate test data. Many of them have complex XML scripts or GUI interfaces that let you specify characteristics of the data. For testing query performance and many other applications, however, a simple, quick-and-dirty data generator that can be constructed at the MySQL command line is useful.

First, let’s talk about what kind of data you can easily create with MySQL function calls:

You can generate a decimal number between zero and another number using the MySQL RAND() function like the following query (here between 0 and 10000):

SELECT RAND() * 10000;

Similarly, you can generate a random integer by adding the FLOOR() function:

SELECT FLOOR(RAND() * 10000);

You can generate a random string of 32 characters using MD5():

SELECT MD5(RAND() * 10000);

You can return a random integer between 500 and 1000 (exclusive of the upper bound) with the following:

SELECT FLOOR(500 + RAND() * (1000 - 500));

You can return a random string from a list of strings by using a table to hold the list. A subselect can then pick a random name from that list; ORDER BY RAND() LIMIT 1 is used below because an equality test against a per-row RAND() value rarely matches (and can occasionally match more than one row):

create table names(id int auto_increment primary key, name varchar(20));
insert into names (name) values ('Justin'),('Jerry'),('James'),('Josh'),('Julien');
select (select name from names order by rand() limit 1);

Now we can generate a “fact” table with many rows using fairly simple SQL statements.

First create a table to generate data into:

CREATE TABLE fact (
  dim1 int,
  dim2 int,
  name varchar(20),
  hash varchar(32),
  measure1 double
);

Seed the table with one initial row:

INSERT INTO fact VALUES (1,1,'Justin',md5(''), .1);

Now grow the table by selecting from the table but providing new random values for the inserted rows:

INSERT INTO fact
SELECT FLOOR(1 + rand()*9999),
       FLOOR(1 + rand()*499),
       (select name from names order by rand() limit 1),
       MD5(1 + rand()*9999),
       rand()
FROM fact;

As you repeat the INSERT … SELECT, the table will grow exponentially. You may want to add a LIMIT clause to the INSERT … SELECT to reduce the amount of data generated as the table grows.
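For example, here is a sketch of the same INSERT … SELECT capped at an arbitrary 100,000 new rows per pass:

INSERT INTO fact
SELECT FLOOR(1 + rand()*9999),
       FLOOR(1 + rand()*499),
       (select name from names order by rand() limit 1),
       MD5(1 + rand()*9999),
       rand()
FROM fact LIMIT 100000;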

This process creates a table with an even data distribution in each column. You can then run additional queries to add skew, using either INSERT … SELECT or UPDATE. For example:

INSERT INTO fact SELECT 1,1,'Justin',md5(''), .1 FROM fact LIMIT 10000;

That will skew the values by creating many rows with the same data as our initial row.
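An UPDATE works just as well for adding skew after the fact; this sketch (the 10% probability is an arbitrary choice) collapses roughly a tenth of the rows onto a single dim1 value:

-- RAND() is evaluated per row, so each row has a ~10% chance of being rewritten
UPDATE fact SET dim1 = 1 WHERE rand() < 0.1;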

Using these simple tools, you can generate a data set that is great for testing purposes. For example, dim1 might be a customer_id and dim2 a product_id; the corresponding dimension tables could then be populated with 10,000 and 500 rows, respectively.
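As a quick sketch of that last idea (the customer table and its columns are invented for illustration), the distinct keys already present in fact can seed a dimension table, with INSERT IGNORE silently skipping duplicate primary keys:

CREATE TABLE customer (customer_id int primary key, name varchar(20));
-- INSERT IGNORE keeps only the first row seen for each customer_id
INSERT IGNORE INTO customer SELECT dim1, name FROM fact;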

The post Generating test data from the mysql> prompt appeared first on MySQL Performance Blog.

Multi-Valued INSERTs, AUTO_INCREMENT & Percona XtraDB Cluster

Latest MySQL Performance Blog posts - September 9, 2014 - 7:38am

A common migration path from standalone MySQL/Percona Server to a Percona XtraDB Cluster (PXC) environment involves some measure of time where one node in the new cluster has been configured as a slave of the production master that the cluster is slated to replace. In this way, the new cluster acts as a slave of the production environment – traditional replication takes care of getting the data into the cluster, and then Galera replication handles the intra-cluster traffic. This often works without issue, although there is one case that I’ve encountered recently where special care must be taken to properly configure the stream to ensure that replication does not break. If you use multi-valued inserts with auto-increment columns, then this post is for you.

For purposes of our discussion, assume that we have a basic 3-node PXC cluster that we've set up using the PXC Reference Architecture document, and that we're replicating from an asynchronous master (call it "server A") into one of the PXC nodes. Without loss of generality, we'll pick PXC03. We'll be working with the following table definition:

serverA> show create table auto_inc_test;
CREATE TABLE `auto_inc_test` (
  `i` int(11) NOT NULL AUTO_INCREMENT,
  `stuff` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`i`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

serverA> SELECT * FROM auto_inc_test;
Empty set (0.00 sec)

If we insert rows into this table one at a time, we have no issues.

serverA> INSERT INTO auto_inc_test(stuff) VALUES ('first row');
serverA> INSERT INTO auto_inc_test(stuff) VALUES ('second row');
serverA> INSERT INTO auto_inc_test(stuff) VALUES ('third row');
serverA> SELECT * FROM auto_inc_test;
+---+------------+
| i | stuff      |
+---+------------+
| 1 | first row  |
| 2 | second row |
| 3 | third row  |
+---+------------+

PXC03> SELECT * FROM auto_inc_test;
+---+------------+
| i | stuff      |
+---+------------+
| 1 | first row  |
| 2 | second row |
| 3 | third row  |
+---+------------+

But if we start doing multi-valued inserts, we can run into a problem.

serverA> INSERT INTO auto_inc_test(stuff) VALUES('first row'),('second row'),('third row');
serverA> INSERT INTO auto_inc_test(stuff) VALUES('fourth row'),('fifth row');
serverA> SELECT * FROM auto_inc_test;
+---+------------+
| i | stuff      |
+---+------------+
| 1 | first row  |
| 2 | second row |
| 3 | third row  |
| 4 | fourth row |
| 5 | fifth row  |
+---+------------+

PXC03> SELECT * FROM auto_inc_test;
+---+------------+
| i | stuff      |
+---+------------+
| 1 | first row  |
| 2 | second row |
| 5 | third row  |
+---+------------+

PXC03> SHOW SLAVE STATUS;
... output elided ...
Last_SQL_Errno: 1062
Last_SQL_Error: Error 'Duplicate entry '5' for key 'PRIMARY'' on query. Default database: 'test'. Query: 'INSERT INTO auto_inc_test (stuff) VALUES ('fourth row'),('fifth row')'
... output elided ...

Uh oh. Replication is broken and our data is now inconsistent. So why does this happen and how can we prevent it?

binlog_format

The astute observer will note that I have not yet said anything about the binary log format on the master. If the binary log format on the master is already set to ROW, then the above error will not occur. RBR will properly replicate multi-valued INSERTs to the PXC cluster without issue, and the data will be consistent. Problem solved. However, there may be reasons that the master is not using or cannot use RBR, such as disk space or IOPS limitations, and thus it’s running in MIXED or STATEMENT mode. In that case, we need to look elsewhere….

wsrep_auto_increment_control

When set to ON (the default), this variable has the effect of automatically specifying values for auto_increment_increment and auto_increment_offset based on the cluster size. The idea behind it is to help prevent auto-increment value conflicts when writing to multiple nodes. However, what it also means is that in a multi-node cluster, the auto-increment values generated by any given node will never be consecutive and the “next” auto-increment value on the slave cluster node will always be higher than what the master believes it should be. For example:

serverA> INSERT INTO auto_inc_test (stuff) VALUES ('first row'),('second row'),('third row');
serverA> SHOW CREATE TABLE auto_inc_test;
CREATE TABLE `auto_inc_test` (
  `i` int(11) NOT NULL AUTO_INCREMENT,
  `stuff` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`i`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1

PXC03> SHOW CREATE TABLE auto_inc_test;
CREATE TABLE `auto_inc_test` (
  `i` int(11) NOT NULL AUTO_INCREMENT,
  `stuff` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`i`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=latin1

Hmm, that’s not good.
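You can check what Galera has derived on any node with a quick query (the output below is from a hypothetical 3-node cluster, so your values may differ):

PXC03> SHOW VARIABLES LIKE 'auto_increment%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| auto_increment_increment | 3     |
| auto_increment_offset    | 3     |
+--------------------------+-------+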

The above scenario can be avoided by setting wsrep_auto_increment_control to OFF on the node that’s acting as a slave while the cluster is still receiving asynchronous replication traffic. This can be configured in /etc/my.cnf or set dynamically from the MySQL command line with:

SET GLOBAL wsrep_auto_increment_control='OFF';
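To make the setting persistent across restarts, a minimal my.cnf fragment for that node might look like this:

# /etc/my.cnf on the PXC node acting as the asynchronous slave
[mysqld]
wsrep_auto_increment_control = OFF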

In the testing that I’ve done, this appears to be sufficient to deal with the problem, even if the original master has non-standard values specified for the auto_increment_* variables (e.g., because it’s part of a master-master pair where the odd ID numbers are generated by one server and the even ID numbers from the other one).

In fact, if the cluster is always going to be used in single-writer mode, there’s a compelling argument to be made for setting this variable to OFF on all of the cluster nodes even when going into production: it will prevent the cluster from burning through N auto-increment IDs every time a single row is inserted.

Fixing it when it’s broken

There’s no secret magic here. Percona Toolkit’s pt-table-checksum and pt-table-sync can be used to check and repair the data divergence between the master and the slave cluster node, just as if PXC were not involved, although you may find that it’s just faster to rebuild the cluster, depending upon how many diffs are found. Sometimes this issue happens right away when the master-to-cluster replication is started, and fixing it involves just one or two rows; other times I have seen it not occur for days with lots of rows out of sync once it finally breaks.
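A minimal invocation might look like the following (host and credentials are placeholders; run the checksum through the asynchronous master so the results propagate down the replication stream):

pt-table-checksum --replicate percona.checksums h=serverA,u=ptuser,p=secret
pt-table-sync --print --replicate percona.checksums h=serverA,u=ptuser,p=secret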

The tl;dr version

In sum, if you’re doing traditional replication into a PXC cluster (or any other flavor of MySQL/MariaDB + Galera), you may have issues with multi-valued INSERT statements, and this can be prevented with a configuration change on either side of the replication stream:

  • On the master, set binlog_format=ROW
  • On the PXC slave node, set wsrep_auto_increment_control=OFF

The post Multi-Valued INSERTs, AUTO_INCREMENT & Percona XtraDB Cluster appeared first on MySQL Performance Blog.

How to calculate the correct size of Percona XtraDB Cluster’s gcache

Latest MySQL Performance Blog posts - September 8, 2014 - 7:12am

When a write query is sent to Percona XtraDB Cluster, all the nodes store the writeset in a file called gcache. By default that file is named galera.cache and is stored in the MySQL datadir. This is a very important file, and as usual with the most important settings in MySQL, the default value is not a good fit for heavily loaded servers. Let's see why it's important and how we can calculate a correct value for our cluster's workload.

What’s the gcache?
When a node leaves the cluster (crash or maintenance) it obviously stops receiving changes. When you reconnect the node to the cluster, its data will be outdated. The joiner node then needs to ask a donor to send the changes that happened during the downtime.

The donor will first try an incremental state transfer (IST), that is, sending only the writesets the cluster received while the node was down. The donor checks the last writeset received by the joiner and then checks its local gcache file. If all the needed writesets are still in that cache, the donor sends them to the joiner. The joiner applies them and that's all; it is up to date and ready to join the cluster. Therefore, an IST is only possible if all the changes missed by the departed node are still in the donor's gcache file.

On the other hand, if the writesets are not there, a full state transfer (SST) is needed, using one of the supported methods: XtraBackup, rsync or mysqldump.

In summary, the difference between an IST and an SST is the time a node needs to rejoin the cluster. That difference can range from seconds to hours, and in the case of WAN connections and large datasets, maybe days.

That's why having a correctly sized gcache is important. It works as a circular log: when it is full, it starts overwriting the writesets from the beginning. With a larger gcache a node can stay out of the cluster longer without requiring an SST. My colleague Jay Janssen explains in more detail how IST works and how to find the right server to use as a donor.
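You can check the gcache size currently in effect by inspecting the provider options (128M is the Galera default; the rest of the output is trimmed here):

mysql> SHOW VARIABLES LIKE 'wsrep_provider_options'\G
*************************** 1. row ***************************
Variable_name: wsrep_provider_options
        Value: ...; gcache.size = 128M; ...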

Calculating the correct size
The trick is pretty similar to the one used to calculate the correct InnoDB log file size: we need to check how many bytes are written every minute. The variables to check are:

wsrep_replicated_bytes: Total size (in bytes) of writesets sent to other nodes.

wsrep_received_bytes: Total size (in bytes) of writesets received from other nodes.

mysql> show global status like 'wsrep_received_bytes';
       show global status like 'wsrep_replicated_bytes';
       select sleep(60);
       show global status like 'wsrep_received_bytes';
       show global status like 'wsrep_replicated_bytes';

+----------------------+----------+
| Variable_name        | Value    |
+----------------------+----------+
| wsrep_received_bytes | 83976571 |
+----------------------+----------+
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| wsrep_replicated_bytes | 0     |
+------------------------+-------+
[...]
+----------------------+----------+
| Variable_name        | Value    |
+----------------------+----------+
| wsrep_received_bytes | 90576957 |
+----------------------+----------+
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| wsrep_replicated_bytes | 800   |
+------------------------+-------+

Therefore:

Bytes per minute:

(second wsrep_received_bytes - first wsrep_received_bytes) + (second wsrep_replicated_bytes - first wsrep_replicated_bytes)

(90576957 - 83976571) + (800 - 0) = 6601186 bytes, or roughly 6 MB per minute.

Bytes per hour:

6 MB * 60 minutes = 360 MB per hour of writesets received by the cluster.
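If you'd rather not copy the numbers by hand, here is a sketch of the same sampling done directly from the mysql> prompt (INFORMATION_SCHEMA.GLOBAL_STATUS is available in MySQL 5.5/5.6):

-- sum both counters, wait a minute, then report the delta in bytes
SELECT SUM(VARIABLE_VALUE) INTO @start
  FROM information_schema.GLOBAL_STATUS
 WHERE VARIABLE_NAME IN ('wsrep_received_bytes', 'wsrep_replicated_bytes');
DO SLEEP(60);
SELECT SUM(VARIABLE_VALUE) - @start AS bytes_per_minute
  FROM information_schema.GLOBAL_STATUS
 WHERE VARIABLE_NAME IN ('wsrep_received_bytes', 'wsrep_replicated_bytes');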

If you want to allow one hour of maintenance (or downtime) for a node, increase gcache.size so it can hold at least that hour's worth of writesets (360 MB in the example above). If you want more time, just make it bigger.
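Note that gcache.size is set through wsrep_provider_options and is not a dynamic setting, so the node has to be restarted for the change to take effect. A minimal my.cnf fragment sized for the example above (rounded up to 512M for some headroom):

# /etc/my.cnf
[mysqld]
wsrep_provider_options = "gcache.size=512M"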

The post How to calculate the correct size of Percona XtraDB Cluster’s gcache appeared first on MySQL Performance Blog.
