October 21, 2014

Shard-Query adds parallelism to queries

Preamble: On performance, workload and scalability: MySQL has always been focused on OLTP workloads. In fact, both Percona Server and MySQL 5.5.7rc have numerous performance improvements which benefit workloads that have high concurrency. Typical OLTP workloads feature numerous clients (perhaps hundreds or thousands) each reading and writing small chunks of data. The recent improvements to […]

Quick comparison of MyISAM, Infobright, and MonetDB

Recently I was doing a little work for a client who has MyISAM tables with many columns (the same one Peter wrote about recently). The client’s performance is suffering in part because of the number of columns, which is over 200. The queries are generally pretty simple (sums of columns), but they’re ad-hoc (can access […]

How to load large files safely into InnoDB with LOAD DATA INFILE

Recently I had a customer ask me about loading two huge files into InnoDB with LOAD DATA INFILE. The goal was to load this data on many servers without putting it into the binary log. While this is generally a fast way to load data (especially if you disable unique key checks and foreign key […]

Working with large data sets in MySQL

What does working with large data sets in mySQL teach you ? Of course you have to learn a lot about query optimization, art of building summary tables and tricks of executing queries exactly as you want. I already wrote about development and configuration side of the problem so I will not go to details […]

Beware large Query_Cache sizes

During last couple of months I had number of consulting projects with MySQL Query Cache being source of serious problems. One would see queries both for MyISAM and Innodb tables mysteriously stalling for few seconds with “NULL” in State column. These could be troubleshooted to waiting on Query Cache which at the same time had […]

Why MySQL could be slow with large tables ?

If you’ve been reading enough database related forums, mailing lists or blogs you probably heard complains about MySQL being unable to handle more than 1.000.000 (or select any other number) rows by some of the users. On other hand it is well known with customers like Google, Yahoo, LiveJournal,Technocarati MySQL has installations with many billions […]

Are larger buffers always better ?

Sometimes I see people thinking about buffers as “larger is always better” so if “large” MySQL sample configuration is designed for 2GB and they happen to have 16, they would simply multiply all/most values by 10 and hope it will work well. Obviously it does not. The least problem would be wasting memory, allocating a […]

Getting to know TokuDB for MySQL

During last April’s Percona Live MySQL Conference and Expo, TokuDB celebrated it’s first full-year as an open source storage engine. I still remember reading the official announcement and the expectations it created one year ago. The premises were very interesting as it had the potential of helping MySQL manage “big data” in a way InnoDB just […]

Parallel Query for MySQL with Shard-Query

While Shard-Query can work over multiple nodes, this blog post focuses on using Shard-Query with a single node.  Shard-Query can add parallelism to queries which use partitioned tables.  Very large tables can often be partitioned fairly easily. Shard-Query can leverage partitioning to add paralellism, because each partition can be queried independently. Because MySQL 5.6 supports the […]

How rows_sent can be more than rows_examined?

When looking at queries that are candidates for optimization I often recommend that people look at rows_sent and rows_examined values as available in the slow query log (as well as some other places). If rows_examined is by far larger than rows_sent, say 100 larger, then the query is a great candidate for optimization. Optimization could […]