October 30, 2014

MySQL 5.6 vs MySQL 5.5 and the Star Schema Benchmark

So far most of the benchmarks posted about MySQL 5.6 use the sysbench OLTP workload.  I wanted to test a set of queries which, unlike sysbench, utilize joins.  I also wanted an easily reproducible set of data which is more rich than the simple sysbench table.  The Star Schema Benchmark (SSB) seems ideal for this. […]

Upgrading MySQL

Upgrading MySQL Server is a very interesting task as you can approach it with so much different “depth”. For some this is 15 minutes job for others it is many month projects. Why is that ? Performing MySQL upgrade two things should normally worry you. It is Regressions – functionality regressions when what you’ve been […]

Slow DROP TABLE

It is a known fact that ext3 is not the most efficient file system out there and for example file removals can be painfully slow and cause a lot of random I/O. However, as it turns out, it can sometimes have a much more severe impact on the MySQL performance that it would seem. When […]

Why MySQL could be slow with large tables ?

If you’ve been reading enough database related forums, mailing lists or blogs you probably heard complains about MySQL being unable to handle more than 1.000.000 (or select any other number) rows by some of the users. On other hand it is well known with customers like Google, Yahoo, LiveJournal,Technocarati MySQL has installations with many billions […]

Find and remove duplicate indexes

Having duplicate keys in our schemas can hurt the performance of our database: They make the optimizer phase slower because MySQL needs to examine more query plans. The storage engine needs to maintain, calculate and update more index statistics DML and even read queries can be slower because MySQL needs update fetch more data to […]

Distributed Set Processing with Shard-Query

Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: […]

table_cache negative scalability

Couple of months ago there was a post by FreshBooks on getting great performance improvements by lowering table_cache variable. So I decided to investigate what is really happening here. The “common sense” approach to tuning caches is to get them as large as you can if you have enough resources (such as memory). With MySQL […]

Four ways to optimize paginated displays

A paginated display is one of the top optimization scenarios we see in the real world. Search results pages, leaderboards, and most-popular lists are good examples. You know the design pattern: display 20 results in some most-relevant order. Show a “next” and “previous” link. And usually, show how many items are in the whole list […]

Enum Fields VS Varchar VS Int + Joined table: What is Faster?

Really often in customers’ application we can see a huge tables with varchar/char fields, with small sets of possible values. These are “state”, “gender”, “status”, “weapon_type”, etc, etc. Frequently we suggest to change such fields to use ENUM column type, but is it really necessary (from performance standpoint)? In this post I’d like to present […]