How to scale big data applications using MySQL sharding frameworks

This Wednesday I’ll be discussing two common types of big data: machine-generated data and user-generated content. These types of big data are amenable to sharding, a commonly used technique for spreading data over more than one database server. I’ll be discussing this in-depth during a live webinar at 10 a.m. Pacific time on Sept. 24. […]

Generating test data from the mysql> prompt

There are a lot of tools that generate test data.  Many of them have complex XML scripts or GUI interfaces that let you identify characteristics about the data. For testing query performance and many other applications, however, a simple quick and dirty data generator which can be constructed at the MySQL command line is useful. […]

Trawling the binlog with FlexCDC and new FlexCDC plugins for MySQL

Swanhart-Tools includes FlexCDC, a change data capture tool for MySQL. FlexCDC follows a server’s binary log and usually writes “changelogs” that track the changes to tables in the database. I say usually because the latest version of Swanhart-Tools (only in github for now) supports FlexCDC plugins, which allow you to send the updates to a remote […]

Parallel Query for MySQL with Shard-Query

While Shard-Query can work over multiple nodes, this blog post focuses on using Shard-Query with a single node.  Shard-Query can add parallelism to queries which use partitioned tables.  Very large tables can often be partitioned fairly easily. Shard-Query can leverage partitioning to add paralellism, because each partition can be queried independently. Because MySQL 5.6 supports the […]

MySQL webinar: ‘Introduction to open source column stores’

Join me Wednesday, September 18 at 10 a.m. PDT for an hour-long webinar where I will introduce the basic concepts behind column store technology. The webinar’s title is: “Introduction to open source column stores.” What will be discussed? This webinar will talk about Infobright, LucidDB, MonetDB, Hadoop (Impala) and other column stores I will compare […]

MySQL and the SSB – Part 2 – MyISAM vs InnoDB low concurrency

This blog post is part two in what is now a continuing series on the Star Schema Benchmark. In my previous blog post I compared MySQL 5.5.30 to MySQL 5.6.10, both with default settings using only the InnoDB storage engine.  In my testing I discovered that innodb_old_blocks_time had an effect on performance of the benchmark.  There was […]

MySQL 5.6 vs MySQL 5.5 and the Star Schema Benchmark

So far most of the benchmarks posted about MySQL 5.6 use the sysbench OLTP workload.  I wanted to test a set of queries which, unlike sysbench, utilize joins.  I also wanted an easily reproducible set of data which is more rich than the simple sysbench table.  The Star Schema Benchmark (SSB) seems ideal for this. […]

Webinar: Building a highly scaleable distributed row, document or column store with MySQL and Shard-Query

On Friday, February 15, 2013 10:00am Pacific Standard Time, I will be delivering a webinar entitled “Building a highly scaleable distributed row, document or column store with MySQL and Shard-Query” The first part of this webinar will focus on why distributed databases are needed, and on the techniques employed by Shard-Query to implement a distributed […]

Replication of the NOW() function (also, time travel)

Notice the result of the NOW() function in the following query. The query was run on a real database server and I didn’t change the clock of the server or change anything in the database configuration settings.

You may proceed to party like it is 1999. How can the NOW() function return a value […]