November 25, 2014

Schema changes in MySQL for OpenStack Trove users

People using OpenStack Trove instances can hit a common issue in the MySQL world: how to perform schema change operations while minimizing the impact on the database server? Let’s explore the options that can allow online schema changes. Summary With MySQL 5.5, pt-online-schema-change from Percona Toolkit is your best option for large tables while regular […]

How a set of queries can be killed in MySQL using Percona Toolkit’s pt-kill

You might have encountered situations where you had to kill some specific select queries that were running for long periods and choking the database. This post will go into more detail with an example of report query offloading. Report query (select) offloading to a slave server is a common practice to reduce the workload of the […]

Data inconsistencies on MySQL replicas: Beyond pt-table-checksum

Percona Toolkit’s pt-table-checksum is a great tool to find data inconsistencies between a MySQL master and its replicas. However it is sometimes not enough to know that there are inconsistencies and let pt-table-sync fix the issue: you may want to know which exact rows are different to identify the statements that created the inconsistency. This […]

How to avoid hash collisions when using MySQL’s CRC32 function

Percona Toolkit’s  pt-table-checksum performs an online replication consistency check by executing checksum queries on the master, which produces different results on replicas that are inconsistent with the master – and the tool pt-table-sync synchronizes data efficiently between MySQL tables. The tools by default use the CRC32. Other good choices include MD5 and SHA1. If you have installed the FNV_64 user-defined function, pt-table-sync will […]

MySQL compression: Compressed and Uncompressed data size

MySQL has information_schema.tables that contain information such as “data_length” or “avg_row_length.” Documentation on this table however is quite poor, making an assumption that those fields are self explanatory – they are not when it comes to tables that employ compression. And this is where inconsistency is born. Lets take a look at the same table […]

Examining the TokuDB MySQL storage engine file structure

As we know different storage engines in MySQL have different file structures. Every table in MySQL 5.6 must have a .frm file in the database directory matching the table name. But where the rest of the data resides depends on the storage engine. For MyISAM we have .MYI and .MYD files in the database directory […]

Managing shards of MySQL databases with MySQL Fabric

This is the fourth post in our MySQL Fabric series. In case you’re joining us now, we started with an introductory post, and then discussed High Availability (HA) using MySQL Fabric here (Part 1) and here (Part 2). Today we will talk about how MySQL Fabric can help you scale out MySQL databases with sharding. Introduction At the […]

Check for MySQL slave lag with Percona Toolkit plugin for Tungsten Replicator

A while back, I made some changes to the plugin interface for pt-online-schema-change which allows custom replication checks to be written. As I was adding this functionality, I also added the –plugin option to pt-table-checksum. This was released in Percona Toolkit 2.2.8. With these additions, I spent some time writing a plugin that allows Percona […]

Using InfiniDB MySQL server with Hadoop cluster for data analytics

In my previous post about Hadoop and Impala I benchmarked performance of analytical queries in Impala. This time I’ve tried InfiniDB for Hadoop (open-source version) on the modern hardware with an 8-node Hadoop cluster. One of the main advantages (at least for me) of InifiniDB for Hadoop is that it stores the data inside the Hadoop cluster but uses the […]

High Availability with MySQL Fabric: Part II

This is the third post in our MySQL Fabric series. If you missed the previous two, we started with an overall introduction, and then a discussion of MySQL Fabric’s high-availability (HA) features. MySQL Fabric was RC when we started this series, but it went GA recently. You can read the press release here, and see this blog post from Oracle’s Mats […]