September 2, 2014

Parallel Query for MySQL with Shard-Query

While Shard-Query can work over multiple nodes, this blog post focuses on using Shard-Query with a single node.  Shard-Query can add parallelism to queries which use partitioned tables.  Very large tables can often be partitioned fairly easily. Shard-Query can leverage partitioning to add paralellism, because each partition can be queried independently. Because MySQL 5.6 supports the […]

Creating GEO-enabled applications with MySQL 5.6

In my previous post I’ve showed some new MySQL 5.6 features which can be very helpful when creating geo-enabled applications. In this post I will show how we can obtain open-source GIS data, convert it to MySQL and use it in our GEO-enabled applications. I will also present at the upcoming Percona Live conference on this […]

Designing one to many relations – MongoDB vs MySQL

We already discussed one to one relations in MongoDB, and the main conclusion was that you should design your collections according to the most frequent access pattern. With one to many relations, this is still valid, but other factors may come into play. Let’s look at a simple problem: we are a shop and we […]

MySQL 5.6 vs MySQL 5.5 and the Star Schema Benchmark

So far most of the benchmarks posted about MySQL 5.6 use the sysbench OLTP workload.  I wanted to test a set of queries which, unlike sysbench, utilize joins.  I also wanted an easily reproducible set of data which is more rich than the simple sysbench table.  The Star Schema Benchmark (SSB) seems ideal for this. […]

MySQL Indexing Best Practices: Webinar Questions Followup

I had a lot of questions on my MySQL Indexing: Best Practices Webinar (both recording and slides are available now) We had lots of questions. I did not have time to answer some and others are better answered in writing anyway. Q: One developer on our team wants to replace longish (25-30) indexed varchars with […]

A case for MariaDB’s Hash Joins

MariaDB 5.3/5.5 has introduced a new join type “Hash Joins” which is an implementation of a Classic Block-based Hash Join Algorithm. In this post we will see what the Hash Join is, how it works and for what types of queries would it be the right choice. I will show the results of executing benchmarks […]

High-Performance Click Analysis with MySQL

We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work.  The first thing these have in common is that they’re generally some kind of loggable event. The next characteristic of a lot of these systems (real or planned) is […]

Enum Fields VS Varchar VS Int + Joined table: What is Faster?

Really often in customers’ application we can see a huge tables with varchar/char fields, with small sets of possible values. These are “state”, “gender”, “status”, “weapon_type”, etc, etc. Frequently we suggest to change such fields to use ENUM column type, but is it really necessary (from performance standpoint)? In this post I’d like to present […]

When (and how) to move an InnoDB table outside the shared tablespace

In my last post, “A closer look at the MySQL ibdata1 disk space issue and big tables,” I looked at the growing ibdata1 problem under the perspective of having big tables residing inside the so-called shared tablespace. In the particular case that motivated that post, we had a customer running out of disk space in his […]

Shard-Query EC2 images available

Infobright and InnoDB AMI images are now available There are now demonstration AMI images for Shard-Query. Each image comes pre-loaded with the data used in the previous Shard-Query blog post. The data in the each image is split into 20 “shards”. This blog post will refer to an EC2 instances as a node from here […]