November 26, 2014

Distributed Set Processing with Shard-Query

Can Shard-Query scale to 20 nodes? Peter asked this question in comments to to my previous Shard-Query benchmark. Actually he asked if it could scale to 50, but testing 20 was all I could due to to EC2 and time limits. I think the results at 20 nodes are very useful to understand the performance: […]

Optimizing slow web pages with mk-query-digest

I don’t use many tools in my consulting practice but for the ones I do, I try to know them as best as I can. I’ve been using mk-query-digest for almost as long as it exists but it continues to surprise me in ways I couldn’t imagine it would. This time I’d like to share […]

Flexviews – part 3 – improving query performance using materialized views

Combating “data drift” In my first post in this series, I described materialized views (MVs). An MV is essentially a cached result set at one point in time. The contents of the MV will become incorrect (out of sync) when the underlying data changes. This loss of synchronization is sometimes called drift. This is conceptually […]

Using Flexviews – part one, introduction to materialized views

If you know me, then you probably have heard of Flexviews. If not, then it might not be familiar to you. I’m giving a talk on it at the MySQL 2011 CE, and I figured I should blog about it before then. For those unfamiliar, Flexviews enables you to create and maintain incrementally refreshable materialized […]

Intro to OLAP

This is the first of a series of posts about business intelligence tools, particularly OLAP (or online analytical processing) tools using MySQL and other free open source software. OLAP tools are a part of the larger topic of business intelligence, a topic that has not had a lot of coverage on MPB. Because of this, […]

3 ways MySQL uses indexes

I often see people confuse different ways MySQL can use indexing, getting wrong ideas on what query performance they should expect. There are 3 main ways how MySQL can use the indexes for query execution, which are not mutually exclusive, in fact some queries will use indexes for all 3 purposes listed here.

Sphinx 0.9.8 is released just in time for OSCON 2008

As you probably already seen in a post by Baron, Sphinx Release 0.9.8 is finally out, just in time for OSCON 2008. Even though it is “minor release” if you look at the number, it is major release in practice (and you can view snapshots as minor releases). The changes since 0.9.7 are dramatic with […]

Heikki Tuuri Innodb answers – Part I

Its almost a month since I promised Heikki Tuuri to answer Innodb Questions. Heikki is a busy man so I got answers to only some of the questions but as people still poking me about this I decided to publish the answers I have so far. Plus we may get some interesting follow up questions […]

Using GROUP BY WITH ROLLUP for Reporting Performance Optimization

Quite typical query for reporting applications is to find top X values. If you analyze Web Site logs you would look at most popular web pages or search engine keywords which bring you most of the traffic. If you’re looking at ecommerce reporting you may be interested in best selling product or top sales people. […]