October 23, 2014

Full Text Search Webinar Questions Followup

I presented a webinar this week to give an overview of several Full Text Search solutions and compare their performance.  Even if you missed the webinar, you can register for it, and you’ll be emailed a link to the recording. During my webinar, a number of attendees asked some good questions.  Here are their questions and my […]

Find and remove duplicate indexes

Having duplicate keys in our schemas can hurt the performance of our database: They make the optimizer phase slower because MySQL needs to examine more query plans. The storage engine needs to maintain, calculate and update more index statistics DML and even read queries can be slower because MySQL needs update fetch more data to […]

A case for MariaDB’s Hash Joins

MariaDB 5.3/5.5 has introduced a new join type “Hash Joins” which is an implementation of a Classic Block-based Hash Join Algorithm. In this post we will see what the Hash Join is, how it works and for what types of queries would it be the right choice. I will show the results of executing benchmarks […]

Emulating global transaction ID with pt-heartbeat

Global transaction IDs are being considered for a future version of MySQL. A global transaction ID lets you determine a server’s replication position reliably, among other benefits. This is great when you need to switch a replica to another master, or any number of other needs. Sometimes you can’t wait for the real thing, but […]

Using any general purpose computer as a special purpose SIMD computer

Often times, from a computing perspective, one must run a function on a large amount of input. Often times, the same function must be run on many pieces of input, and this is a very expensive process unless the work can be done in parallel. Shard-Query introduces set based processing, which on the surface appears […]

Flexviews – part 3 – improving query performance using materialized views

Combating “data drift” In my first post in this series, I described materialized views (MVs). An MV is essentially a cached result set at one point in time. The contents of the MV will become incorrect (out of sync) when the underlying data changes. This loss of synchronization is sometimes called drift. This is conceptually […]

Using Flexviews – part two, change data capture

In my previous post I introduced materialized view concepts. This post begins with an introduction to change data capture technology and describes some of the ways in which it can be leveraged for your benefit. This is followed by a description of FlexCDC, the change data capture tool included with Flexviews. It continues with an […]

EXPLAIN EXTENDED can tell you all kinds of interesting things

While many people are familiar with the MySQL EXPLAIN command, fewer people are familiar with “extended explain” which was added in MySQL 4.1 EXPLAIN EXTENDED can show you what the MySQL optimizer does to your query. You might not know this, but MySQL can dramatically change your query before it actually executes it. This process […]

Joining on range? Wrong!

The problem I am going to describe is likely to be around since the very beginning of MySQL, however unless you carefully analyse and profile your queries, it might easily go unnoticed. I used it as one of the examples in our talk given at phpDay.it conference last week to demonstrate some pitfalls one may […]

High-Performance Click Analysis with MySQL

We have a lot of customers who do click analysis, site analytics, search engine marketing, online advertising, user behavior analysis, and many similar types of work.  The first thing these have in common is that they’re generally some kind of loggable event. The next characteristic of a lot of these systems (real or planned) is […]