November 22, 2014

Beware of MySQL Data Truncation

Here is nice gotcha which I’ve seen many times and which can cause just a minefield for many reasons. Lets say you had a system storing articles and you use article_id as unsigned int. As the time goes and you see you may get over 4 billions of articles you change the type for article_id […]

Innodb redo log archiving

Percona Server 5.6.11-60.3 introduces a new “log archiving” feature. Percona XtraBackup 2.1.5 supports “apply archived logs.” What does it mean and how it can be used? Percona products propose three kinds of incremental backups. The first is full scan of data files and comparison the data with backup data to find some delta. This approach […]

InnoDB file formats: Here is one pitfall to avoid

UPDATED: explaining the role of innodb_strict_mode and correcting introduction of innodb_file_format Compressed tables is an example of an InnoDB feature that became available with the Barracuda file format, introduced in the InnoDB plugin. They can bring significant gains in raw performance and scalability: given the data is stored in a compressed format the amount of […]

Fishing with dynamite, brought to you by the randgen and dbqp

I tend to speak highly of the random query generator as a testing tool and thought I would share a story that shows how it can really shine. At our recent dev team meeting, we spent approximately 30 minutes of hack time to produce test cases for 3 rather hard to duplicate bugs. Of course, […]

Shard-Query EC2 images available

Infobright and InnoDB AMI images are now available There are now demonstration AMI images for Shard-Query. Each image comes pre-loaded with the data used in the previous Shard-Query blog post. The data in the each image is split into 20 “shards”. This blog post will refer to an EC2 instances as a node from here […]

Where does HandlerSocket really save you time?

HandlerSocket has really generated a lot of interest because of the dual promises of ease-of-use and blazing-fast performance. The performance comes from eliminating CPU consumption. Akira Higuchi’s HandlerSocket presentation from a couple of months back had some really good profile results for libmysql versus libhsclient (starting at slide 15). Somebody in the audience at Percona […]

Sample datasets for benchmarking and testing

Sometimes you just need some data to test and stress things. But randomly generated data is awful — it doesn’t have realistic distributions, and it isn’t easy to understand whether your results are meaningful and correct. Real or quasi-real data is best. Whether you’re looking for a couple of megabytes or many terabytes, the following […]

Analyzing air traffic performance with InfoBright and MonetDB

Accidentally me and Baron played with InfoBright (see http://www.percona.com/blog/2009/09/29/quick-comparison-of-myisam-infobright-and-monetdb/) this week. And following Baron’s example I also run the same load against MonetDB. Reading comments to Baron’s post I tied to load the same data to LucidDB, but I was not successful in this. I tried to analyze a bigger dataset and I took public […]

PROCEDURE ANALYSE

Quite common task during schema review is to find the optimal data type for the column value – for example column is defined as INT but is it really needed or may be SMALLINT or even TINYINT will do instead. Does it contain any NULLs or it can be defined NOT NULL which reduces space […]

Efficient Boolean value storage for Innodb Tables

Sometimes you have the task of storing multiple of boolean values (yes/now or something similar) in the table and if you get many columns and many rows you may want to store them as efficient way as possible. For MyISAM tables you could use BIT(1) fields which get combined together for efficient storage: