November 25, 2014

Quickly finding unused indexes (and estimating their size)

I had a customer recently who needed to reduce their database size on disk quickly without a lot of messy schema redesign and application recoding.  They didn’t want to drop any actual data, and their index usage was fairly high, so we decided to look for unused indexes that could be removed. Collecting data It’s […]

Emulating global transaction ID with pt-heartbeat

Global transaction IDs are being considered for a future version of MySQL. A global transaction ID lets you determine a server’s replication position reliably, among other benefits. This is great when you need to switch a replica to another master, or any number of other needs. Sometimes you can’t wait for the real thing, but […]

Scaling problems still exist in MySQL 5.5 and Percona Server 5.5

MySQL 5.5 and Percona Server 5.5 do not solve all scalability problems even for read only workloads. Workloads which got a lot of attention such as Sysbench and DBT2/TPC-C scale pretty well a they got a lot of attention, there can be other quite typical workloads however which do not scale that well. This is […]

Using any general purpose computer as a special purpose SIMD computer

Often times, from a computing perspective, one must run a function on a large amount of input. Often times, the same function must be run on many pieces of input, and this is a very expensive process unless the work can be done in parallel. Shard-Query introduces set based processing, which on the surface appears […]

Shard-Query EC2 images available

Infobright and InnoDB AMI images are now available There are now demonstration AMI images for Shard-Query. Each image comes pre-loaded with the data used in the previous Shard-Query blog post. The data in the each image is split into 20 “shards”. This blog post will refer to an EC2 instances as a node from here […]

Shard-Query turbo charges Infobright community edition (ICE)

Shard-Query is an open source tool kit which helps improve the performance of queries against a MySQL database by distributing the work over multiple machines and/or multiple cores. This is similar to the divide and conquer approach that Hive takes in combination with Hadoop. Shard-Query applies a clever approach to parallelism which allows it to […]

Should we give a MySQL Query Cache a second chance ?

Over last few years I’ve been suggesting more people to disable Query Cache than to enable it. It can cause contention problems as well as stalls and due to coarse invalidation is not as efficient as it could be. These are however mostly due to neglect Query Cache received over almost 10 years, with very […]

Moving Subtrees in Closure Table Hierarchies

Many software developers find they need to store hierarchical data, such as threaded comments, personnel org charts, or nested bill-of-materials. Sometimes it’s tricky to do this in SQL and still run efficient queries against the data. I’ll be presenting a webinar for Percona on February 28 at 9am PST. I’ll describe several solutions for storing […]

Ultimate MySQL variable and status reference list

I am constantly referring to the amazing MySQL manual, especially the option and variable reference table. But just as frequently, I want to look up blog posts on variables, or look for content in the Percona documentation or forums. So I present to you what is now my newest Firefox toolbar bookmark: an option and […]

Shard-Query adds parallelism to queries

Preamble: On performance, workload and scalability: MySQL has always been focused on OLTP workloads. In fact, both Percona Server and MySQL 5.5.7rc have numerous performance improvements which benefit workloads that have high concurrency. Typical OLTP workloads feature numerous clients (perhaps hundreds or thousands) each reading and writing small chunks of data. The recent improvements to […]