October 1, 2014

Shard-Query EC2 images available

Infobright and InnoDB AMI images are now available There are now demonstration AMI images for Shard-Query. Each image comes pre-loaded with the data used in the previous Shard-Query blog post. The data in the each image is split into 20 “shards”. This blog post will refer to an EC2 instances as a node from here […]

Shard-Query turbo charges Infobright community edition (ICE)

Shard-Query is an open source tool kit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide and conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to […]

Using Flexviews – part one, introduction to materialized views

If you know me, then you probably have heard of Flexviews. If not, then it might not be familiar to you. I’m giving a talk on it at the MySQL 2011 CE, and I figured I should blog about it before then. For those unfamiliar, Flexviews enables you to create and maintain incrementally refreshable materialized […]

Intro to OLAP

This is the first of a series of posts about business intelligence tools, particularly OLAP (or online analytical processing) tools using MySQL and other free open source software. OLAP tools are a part of the larger topic of business intelligence, a topic that has not had a lot of coverage on MPB. Because of this, […]

How fast is FLUSH TABLES WITH READ LOCK?

A week or so ago at the MySQL conference, I visited one of the backup vendors in the Expo Hall. I started to chat with them about their MySQL backup product. One of the representatives told me that their backup product uses FLUSH TABLES WITH READ LOCK, which he admitted takes a global lock on […]

How number of columns affects performance ?

It is pretty understood the tables which have long rows tend to be slower than tables with short rows. I was interested to check if the row length is the only thing what matters or if number of columns we have to work with also have an important role. I was interested in peak row […]

PROCEDURE ANALYSE

Quite common task during schema review is to find the optimal data type for the column value – for example column is defined as INT but is it really needed or may be SMALLINT or even TINYINT will do instead. Does it contain any NULLs or it can be defined NOT NULL which reduces space […]

When would you use SAN with MySQL ?

One question which comes up very often is when one should use SAN with MySQL, which is especially popular among people got used to Oracle or other Enterprise database systems which are quite commonly deployed on SAN. My question in such case is always what exactly are you trying to get by using SAN ?

5.0.75-build12 Percona binaries

After several important fixes to our patches we made binaries for build12. Fixes include: Control of InnoDB insert buffer to address problems Peter mentioned http://www.percona.com/blog/2009/01/13/some-little-known-facts-about-innodb-insert-buffer/, also check Bug 41811 to see symptoms of problem with Insert buffer. http://www.percona.com/docs/wiki/patches:innodb_io_patches * innodb_flush_neighbor_pages (default 1) – When the dirty page are flushed (written to datafile), this parameter determines […]

High-Performance Click Analysis with MySQL

We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work.  The first thing these have in common is that they’re generally some kind of loggable event. The next characteristic of a lot of these systems (real or planned) is […]