September 23, 2014

Inexpensive SSDs for Database Workloads

The cost of SSDs has been dropping rapidly, and at the time of this writing, 2.5-drives have reached the 1TB capacity mark.  You can actually get inexpensive drives for as little as 60 cents per GB. Even inexpensive SSDs can perform tens of thousands of IOPs and come with 1.5M – 2M hous MTBF and […]

MySQL and Hadoop integration

Dolphin and Elephant: an Introduction This post is intended for MySQL DBAs or Sysadmins who need to start using Apache Hadoop and want to integrate those 2 solutions. In this post I will cover some basic information about the Hadoop, focusing on Hive as well as MySQL and Hadoop/Hive integration. First of all, if you […]

Join Optimizations in MySQL 5.6 and MariaDB 5.5

This is the third blog post in the series of blog posts leading up to the talk comparing the optimizer enhancements in MySQL 5.6 and MariaDB 5.5. This blog post is targeted at the join related optimizations introduced in the optimizer. These optimizations are available in both MySQL 5.6 and MariaDB 5.5, and MariaDB 5.5 […]

Multi Range Read (MRR) in MySQL 5.6 and MariaDB 5.5

This is the second blog post in the series of blog posts leading up to the talk comparing the optimizer enhancements in MySQL 5.6 and MariaDB 5.5. This blog post is aimed at the optimizer enhancement Multi Range Read (MRR). Its available in both MySQL 5.6 and MariaDB 5.5 Now let’s take a look at […]

High-Performance Click Analysis with MySQL

We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work.  The first thing these have in common is that they’re generally some kind of loggable event. The next characteristic of a lot of these systems (real or planned) is […]

Quickly preloading Innodb tables in the buffer pool

In the previous post I mentioned a way I use to preload Clustered Index (data) for Innodb tables. Though I thought this topic would benefit from a bit more information. But lest first start with feature request for Innodb Team: All ways I mention here are hacks and they can’t be as efficient as native […]

The MySQL optimizer, the OS cache, and sequential versus random I/O

In my post on estimating query completion time, I wrote about how I measured the performance on a join between a few tables in a typical star schema data warehousing scenario. In short, a query that could take several days to run with one join order takes an hour with another, and the optimizer chose […]

How much overhead is caused by on disk temporary tables

As you might know while running GROUP BY and some other kinds of queries MySQL needs to create temporary tables, which can be created in memory, using MEMORY storage engine or can be created on disk as MYISAM tables. Which one will be used depends on the allowed tmp_table_size and also by the data which […]

A case for MariaDB’s Hash Joins

MariaDB 5.3/5.5 has introduced a new join type “Hash Joins” which is an implementation of a Classic Block-based Hash Join Algorithm. In this post we will see what the Hash Join is, how it works and for what types of queries would it be the right choice. I will show the results of executing benchmarks […]

The relationship between Innodb Log checkpointing and dirty Buffer pool pages

This is a time-honored topic, and there’s no shortage of articles on the topic on this blog. I wanted to write a post trying to condense and clarify those posts, as it has taken me a while to really understand this relationship. Some basic facts Most of us know that writing into Innodb updates buffer […]