September 20, 2014

How can we bring query to the data?

Baron recently wrote about sending the query to the data looking at distributed systems like Cassandra. I want to take a look at more simple systems like MySQL and see how we’re doing in this space. It is obvious getting computations as closer to the data as possible is the most efficient as we will […]

Joining many tables in MySQL – optimizer_search_depth

Working on customer case today I ran into interesting problem – query joining about 20 tables (thank you ORM by joining all tables connected with foreign keys just in case) which would take 5 seconds even though in the read less than 1000 rows and doing it completely in memory. The plan optimizer picked was […]

Shard-Query turbo charges Infobright community edition (ICE)

Shard-Query is an open source tool kit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide and conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to […]

Should we give a MySQL Query Cache a second chance ?

Over last few years I’ve been suggesting more people to disable Query Cache than to enable it. It can cause contention problems as well as stalls and due to coarse invalidation is not as efficient as it could be. These are however mostly due to neglect Query Cache received over almost 10 years, with very […]

Recovering Innodb table Corruption

Assume you’re running MySQL with Innodb tables and you’ve got crappy hardware, driver bug, kernel bug, unlucky power failure or some rare MySQL bug and some pages in Innodb tablespace got corrupted. In such cases Innodb will typically print something like this: InnoDB: Database page corruption on disk or a failed InnoDB: file read of […]

Performance impact of complex queries

What is often underestimated is impact of MySQL Performance by complex queries on large data sets(ie some large aggregate queries) and batch jobs. It is not rare to see queries which were taking milliseconds to stall for few seconds, especially in certain OS configurations, and on low profile servers (ie having only one disk drive) […]

MySQL Query Cache

MySQL has a great feature called “Query Cache” which is quite helpful for MySQL Performance optimization tasks but there are number of things you need to know. First let me clarify what MySQL Query Cache is – I’ve seen number of people being confused, thinking MySQL Query Cache is the same as Oracle Query Cache […]

Why MySQL could be slow with large tables ?

If you’ve been reading enough database related forums, mailing lists or blogs you probably heard complains about MySQL being unable to handle more than 1.000.000 (or select any other number) rows by some of the users. On other hand it is well known with customers like Google, Yahoo, LiveJournal,Technocarati MySQL has installations with many billions […]

Generating test data from the mysql> prompt

There are a lot of tools that generate test data.  Many of them have complex XML scripts or GUI interfaces that let you identify characteristics about the data. For testing query performance and many other applications, however, a simple quick and dirty data generator which can be constructed at the MySQL command line is useful. […]

Reducer.sh – A powerful MySQL test-case simplification/reducer tool

Let me start by saying a big “thank you” to the staff at Oracle for deciding to open source reducer.sh. It’s a tool I developed whilst I was working for them several years ago. Its sole purpose is to do one thing – but do it good: test-case simplification. So, let’s say some customer just […]