October 25, 2014

Shard-Query EC2 images available

Infobright and InnoDB AMI images are now available. There are now demonstration AMI images for Shard-Query. Each image comes pre-loaded with the data used in the previous Shard-Query blog post. The data in each image is split into 20 “shards”. This blog post will refer to an EC2 instance as a node from here […]

Shard-Query turbo charges Infobright community edition (ICE)

Shard-Query is an open source toolkit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide-and-conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to […]

Innodb row size limitation

I recently worked on a customer case where, at seemingly random times, inserts would fail with InnoDB error 139. This is a rather simple problem, but due to its nature, it may only affect you after you already have a system running in production for a while.

High-Performance Click Analysis with MySQL

We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work. The first thing these have in common is that they’re generally some kind of loggable event. The next characteristic of a lot of these systems (real or planned) is […]

Innodb Performance Optimization Basics

Note: There is an updated post on this topic here. When interviewing people for our job openings, I like to ask them a basic question: if you have a server with 16GB of RAM dedicated to MySQL, with a large InnoDB database and a typical Web workload, what settings would you adjust? And interestingly […]

MyISAM Scalability and Innodb, Falcon Benchmarks

We have written many times about InnoDB scalability problems; this time we are faced with one for MyISAM tables. We have seen it several times in synthetic benchmarks but never in production, which is why we did not escalate the MyISAM scalability question. This time, working on a customer system, we figured out that a box with 1 CPU core […]

Working with large data sets in MySQL

What does working with large data sets in MySQL teach you? Of course you have to learn a lot about query optimization, the art of building summary tables, and tricks for executing queries exactly as you want. I already wrote about the development and configuration side of the problem, so I will not go into details […]

Predicting how long data load would take

I was asked this question many times during the last week, and there is really no easy answer. There are just way too many variables to consider, especially when loading large schemas with a lot of different table structures. So which variables affect the load speed? Table structure: this one is tricky. The shorter rows you […]

Merge Tables Gotcha

I had an interesting customer case today which made me do a bit of research on the problem. You can create a merge table over MyISAM tables which contain a primary key, and global uniqueness will not be enforced in this case; this is as far as most people will think about it. In fact, however, it […]

PBXT benchmarks

The PBXT Storage Engine (http://www.primebase.com/xt/) is getting stable, and we decided to benchmark it under different workloads. This time I tested only READ queries, similar to the ones in the InnoDB vs MyISAM vs Falcon benchmark (http://www.percona.com/blog/2007/01/08/innodb-vs-myisam-vs-falcon-benchmarks-part-1). The difference is that I used the new sysbench with the Lua scripting language, so all queries were scripted for sysbench.