October 20, 2014

Using any general purpose computer as a special purpose SIMD computer

Often times, from a computing perspective, one must run a function on a large amount of input. Often times, the same function must be run on many pieces of input, and this is a very expensive process unless the work can be done in parallel. Shard-Query introduces set based processing, which on the surface appears […]

High availability for MySQL on Amazon EC2 – Part 2 – Setting up the initial instances

This post is the second of a series that started here. The first step to build the HA solution is to create two working instances, configure them to be EBS based and create a security group for them. A third instance, the client, will be discussed in part 7. Since this will be a proof […]

Friday challenge: ibd recovery

I want to make this Friday a bit more interesting – how do you feel to train a bit in InnoDB data recovery techniques. I have .ibd datafile which was created by query CREATE TABLE `tryme` ( `email` mediumblob, `content` mediumblob ) ENGINE=InnoDB … (SOME PARAMETERS SKIPPED) …; and I inserted one record into this […]

Percona builds with Percona patchsets

Percona has a strong team of MySQL developers and consultants on board, so we decided to prepare builds with our patches and third-party patches which we think are very useful. We actually use these internally and for our customers. Current releases include: microslow patch (enables microsecond resolution in slow logs) execution plan (show info about […]

Idea: Couple of more string types

MySQL has a lot of string data types – CHAR, VARCHAR, BLOB, TEXT, ENUM and bunch of variants such as VARBINARY but I think it is not enough I would also like to see type HEXCHAR which would be able to store hex strings, such as those returned as MD5() and SHA1() efficiently. With little […]

To SQL_CALC_FOUND_ROWS or not to SQL_CALC_FOUND_ROWS?

When we optimize clients’ SQL queries I pretty often see a queries with SQL_CALC_FOUND_ROWS option used. Many people think, that it is faster to use this option than run two separate queries: one – to get a result set, another – to count total number of rows. In this post I’ll try to check, is […]

Using CHAR keys for joins, how much is the overhead ?

I prefer to use Integers for joins whenever possible and today I worked with client which used character keys, in my opinion without a big need. I told them this is suboptimal but was challenged with rightful question about the difference. I did not know so I decided to benchmark. The results below are for […]

To UUID or not to UUID ?

Brian recently posted an article comparing UUID and auto_increment primary keys, basically advertising to use UUID instead of primary keys. I wanted to clarify this a bit as I’ve seen it being problems in so many cases. First lets look at the benchmark – we do not have full schema specified in the article itself […]

Long PRIMARY KEY for Innodb tables

I’ve written and spoke a lot about using short PRIMARY KEYs with Innodb tables due to the fact all other key will refer to the rows by primary key. I also recommended to use sequential primary keys so you do not end up having random primary key BTREE updates which can be very expensive. Today […]

How much memory Innodb locks really take ?

After playing yesterday a bit with INSERT … SELECT I decided to check is Innodb locks are relly as efficient in terms of low resource usage as they are advertised. Lets start with a bit of background – in Innodb row level locks are implemented by having special lock table, located in the buffer pool […]