November 28, 2014

Using InfiniDB MySQL server with Hadoop cluster for data analytics

In my previous post about Hadoop and Impala I benchmarked performance of analytical queries in Impala. This time I’ve tried InfiniDB for Hadoop (open-source version) on the modern hardware with an 8-node Hadoop cluster. One of the main advantages (at least for me) of InifiniDB for Hadoop is that it stores the data inside the Hadoop cluster but uses the […]

Using Apache Hadoop and Impala together with MySQL for data analysis

Apache Hadoop is commonly used for data analysis. It is fast for data loads and scalable. In a previous post I showed how to integrate MySQL with Hadoop. In this post I will show how to export a table from  MySQL to Hadoop, load the data to Cloudera Impala (columnar format) and run a reporting […]

Increasing slow query performance with the parallel query execution

MySQL and Scaling-up (using more powerful hardware) was always a hot topic. Originally MySQL did not scale well with multiple CPUs; there were times when InnoDB performed poorer with more  CPU cores than with less CPU cores. MySQL 5.6 can scale significantly better; however there is still 1 big limitation: 1 SQL query will eventually use only […]

Why MySQL Performance at Low Concurrency is Important

A few weeks ago I wrote about “MySQL Performance at High Concurrency” and why it is important, which was followed up by Vadim’s post on ThreadPool in Percona Server providing some great illustration on the topic. This time I want to target an opposite question: why MySQL performance at low concurrency is important for you. […]

Accessing Percona XtraDB Cluster nodes in parallel from PHP using MySQL asynchronous queries

This post is followup to Peter’s recent post, “Investigating MySQL Replication Latency in Percona XtraDB Cluster,” in which a question was raised as to whether we can measure latency to all nodes at the same time. It is an interesting question: If we have N nodes, can we send queries to nodes to be executed in […]

Load management Techniques for MySQL

One of the very frequent cases with performance problems with MySQL is what they happen every so often or certain times. Investigating them we find out what the cause is some batch jobs, reports and other non response time critical activities are overloading the system causing user experience to degrade. The first thing you need […]

Is your MySQL Application having Busy IO by Oracle Measures ?

Preparing Choosing Storage Systems for MySQL talk for Percona Live in Washington,DC I ran into great paper called Sane SAN 2010 by James Morle from Scale Abilities – and Oracle consulting company. It is worth to read for variety of reason yet for this post I wanted to mention what James calls “Busy” Oracle database […]

Distributed Set Processing with Shard-Query

Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: […]

What would make MySQL Multiple Queries Usable ?

MySQL Has API to run Multiple Queries at once. This feature was designed mainly with saving network round trip in mind and got a little traction due to associated security risks and not significant gains in most cases. What would make MySQL Multiple Queries API more usable ?

Top 5 Wishes for MySQL

About a week ago Marten send me email pointing to his article published on Jays Blog (Come on Marten, it is time for you to get your own blog). I should have replied much earlier but only found time to do that now. So here is my list 1. Be Pluggable Unlike many OpenSource projects […]