September 18, 2014

Distributed Set Processing with Shard-Query

Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: […]

Shard-Query turbo charges Infobright community edition (ICE)

Shard-Query is an open source tool kit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide and conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to […]

Moving Subtrees in Closure Table Hierarchies

Many software developers find they need to store hierarchical data, such as threaded comments, personnel org charts, or nested bill-of-materials. Sometimes it’s tricky to do this in SQL and still run efficient queries against the data. I’ll be presenting a webinar for Percona on February 28 at 9am PST. I’ll describe several solutions for storing […]

How is join_buffer_size allocated?

When examining MySQL configuration, we quite often want to know how various buffer sizes are used. This matters because some buffers (sort_buffer_size for example) are allocated to their full size immediately as soon as they are needed, but others are effectively a “max size” and the corresponding buffers are allocated only as big as needed […]

Recall of Percona Server 5.1.47-11.0

Percona Server release 11.0 which we announced few days ago unfortunately was released with a bug introduced while implementing stripping comments in query cache which could cause server crash with certain types of queries if query cache is enabled. We have released Percona Server release 11.1 which includes a fix for this issue. If you […]

Multi Column indexes vs Index Merge

The mistake I commonly see among MySQL users is how indexes are created. Quite commonly people just index individual columns as they are referenced in where clause thinking this is the optimal indexing strategy. For example if I would have something like AGE=18 AND STATE=’CA’ they would create 2 separate indexes on AGE and STATE […]

AUTO_INCREMENT and MERGE TABLES

How would you expect AUTO_INCREMENT to work with MERGE tables ? Assuming INSERT_METHOD=LAST is used I would expect it to work same as in case insertion happens to the last table… which does not seems to be the case. Alternatively I would expect AUTO_INCREMENT to be based off the maximum value across all tables, respecting […]

Multiple column index vs multiple indexes

(There is an updated version of the content in this post by Percona’s Stephane Combaudon available here.) After my previous post there were questions raised about Index Merge on Multiple Indexes vs Two Column Index efficiency. I mentioned in most cases when query can use both of the ways using multiple column index would be […]

How adding another table to JOIN can improve performance ?

JOINs are expensive and it most typical the fewer tables (for the same database) you join the better performance you will get. As for any rules there are however exceptions The one I’m speaking about comes from the issue with MySQL optimizer stopping using further index key parts as soon as there is a range […]

Merge Tables Gotcha

I had the interesting customer case today which made me to do a bit research on the problem. You can create merge table over MyISAM tables which contain primary key and global uniqueness would not be enforced in this case, this is as far as most people will think about it. In fact however it […]