November 23, 2014

How a set of queries can be killed in MySQL using Percona Toolkit’s pt-kill

You might have encountered situations where you had to kill some specific select queries that were running for long periods and choking the database. This post will go into more detail with an example of report query offloading. Report query (select) offloading to a slave server is a common practice to reduce the workload of the […]

How to avoid hash collisions when using MySQL’s CRC32 function

Percona Toolkit’s  pt-table-checksum performs an online replication consistency check by executing checksum queries on the master, which produces different results on replicas that are inconsistent with the master – and the tool pt-table-sync synchronizes data efficiently between MySQL tables. The tools by default use the CRC32. Other good choices include MD5 and SHA1. If you have installed the FNV_64 user-defined function, pt-table-sync will […]

How to scale big data applications using MySQL sharding frameworks

This Wednesday I’ll be discussing two common types of big data: machine-generated data and user-generated content. These types of big data are amenable to sharding, a commonly used technique for spreading data over more than one database server. I’ll be discussing this in-depth during a live webinar at 10 a.m. Pacific time on Sept. 24. […]

Using MySQL 5.6 Global Transaction IDs (GTIDs) in production: Q&A

Thank you to all of you who attended my webinar last week about Global Transaction IDs (GTIDs), which were introduced in MySQL 5.6 to make the reconfiguration of replication straightforward. If you missed my webinar, you can still listen to the recording and download the sides (free). We had a lot of questions during the […]

Using sysbench 0.5 for performing MySQL benchmarks

Given the recent excitement & interest around OpenStack I wanted to make sure I was ready to conduct appropriate evaluations of system performance.  I generally turn to sysbench since it comes with a variety of different tests (accessed via –test= option interface), including: fileio – File I/O test cpu – CPU performance test memory – […]

Using MySQL triggers and views in Amazon RDS

I recently had an opportunity to migrate a customer from a physical server into Amazon’s RDS environment. In this particular case the customers’ platform makes extensive use of MySQL triggers and views.  I came across two significant issues that prevented me from following Amazon’s documentation, which basically states “use mysqldump” but doesn’t call out a […]

Using UDFs for geo-distance search in MySQL

In my previous post about geo-spatial search in MySQL I described (along with other things) how to use geo-distance functions. In this post I will describe the geo-spatial distance functions in more details. If you need to calculate an exact distance between 2 points on Earth in MySQL (very common for geo-enabled applications) you have at […]

Using MySQL 5.6 Performance Schema in multi-tenant environments

Hosting a shared MySQL instance for your internal or external clients (“multi-tenant”) was always a challenge. Multi-tenants approach or a “schema-per-customer” approach is pretty common nowadays to host multiple clients on the same MySQL sever. One of issues of this approach, however, is the lack of visibility: it is hard to tell how many resources (queries, disk, […]

Using InfiniDB MySQL server with Hadoop cluster for data analytics

In my previous post about Hadoop and Impala I benchmarked performance of analytical queries in Impala. This time I’ve tried InfiniDB for Hadoop (open-source version) on the modern hardware with an 8-node Hadoop cluster. One of the main advantages (at least for me) of InifiniDB for Hadoop is that it stores the data inside the Hadoop cluster but uses the […]

Using Apache Hadoop and Impala together with MySQL for data analysis

Apache Hadoop is commonly used for data analysis. It is fast for data loads and scalable. In a previous post I showed how to integrate MySQL with Hadoop. In this post I will show how to export a table from  MySQL to Hadoop, load the data to Cloudera Impala (columnar format) and run a reporting […]