October 22, 2014

Using InfiniDB MySQL server with Hadoop cluster for data analytics

In my previous post about Hadoop and Impala I benchmarked performance of analytical queries in Impala. This time I’ve tried InfiniDB for Hadoop (open-source version) on the modern hardware with an 8-node Hadoop cluster. One of the main advantages (at least for me) of InifiniDB for Hadoop is that it stores the data inside the Hadoop cluster but uses the […]

Using Apache Hadoop and Impala together with MySQL for data analysis

Apache Hadoop is commonly used for data analysis. It is fast for data loads and scalable. In a previous post I showed how to integrate MySQL with Hadoop. In this post I will show how to export a table from  MySQL to Hadoop, load the data to Cloudera Impala (columnar format) and run a reporting […]

The small improvements of MySQL 5.6: Duplicate Index Detection

Here at the MySQL Performance Blog, we’ve been discussing the several new features that MySQL 5.6 brought: GTID-based replication, InnoDB Fulltext, Memcached integration, a more complete performance schema, online DDL and several other InnoDB and query optimizer improvements. However, I plan to focus on a series of posts on the small but handy improvements – […]

10 years of MySQL User Conferences

In preparing for this month’s Percona Live MySQL Conference and Expo, I’ve been reminiscing about the annual MySQL User Conference’s history – the 9 times it previously took place in its various reincarnations – and there are a lot of good things, fun things to remember. 2003 was the year that marked the first MySQL user conference […]

MySQL Life Cycle. Your Feedback is needed.

When I started with MySQL 3.22 I would start running MySQL from early beta (if not alpha) and update MySQL the same date as release would hit the web. Since that time I matured and so did MySQL ecosystem. MySQL is powering a lot more demanding and business critical applications now than 12 years ago […]

MySQL and PostgreSQL SpecJAppServer benchmark results

Listening to Josh Berkus presentation on OSCON today I decided to take a closer look at SpecJAppServer benchmarks results which were published by PostgreSQL recently and which as Josh Puts it “This publication shows that a properly tuned PostgreSQL is not only as fast or faster than MySQL, but almost as fast as Oracle (since […]

MySQL Installation and upgrade scripts.

I generally find MySQL Sever sufficiently tested, meaning at least minor version upgrades rarely cause the problems. Of course it is not perfect and I remember number of big issues when some releases could not be used due to behavior changes in them and when something had to be rolled back in the next release. […]

WITHer Recursive Queries?

Over the past few years, we’ve seen MySQL technology advance in leaps and bounds, especially when it comes to scalability. But by focusing on the internals of the storage engine for so long, MySQL has fallen behind regarding support for advanced SQL features. SQLite, another popular open-source SQL database, just released version 3.8.3, including support […]

Increasing slow query performance with the parallel query execution

MySQL and Scaling-up (using more powerful hardware) was always a hot topic. Originally MySQL did not scale well with multiple CPUs; there were times when InnoDB performed poorer with more  CPU cores than with less CPU cores. MySQL 5.6 can scale significantly better; however there is still 1 big limitation: 1 SQL query will eventually use only […]

Shard-Query turbo charges Infobright community edition (ICE)

Shard-Query is an open source tool kit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide and conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to […]