November 22, 2014

Speeding up GROUP BY if you want aproximate results

Doing performance analyzes today I wanted to count how many hits come to the pages which get more than couple of visits per day. We had SQL logs in the database so It was pretty simple query:

Unfortunately this query ran for over half an hour badly overloaded server and I had to kill […]

Group commit and real fsync

During the recent months I’ve seen few cases of customers upgrading to MySQL 5.0 and having serious performance slow downs, up to 10 times in certain cases. What was the most surprising for them is the problem was hardware and even OS specific – it could show up with one OS version but not in […]

Innodb transaction history often hides dangerous ‘debt’

In many write-intensive workloads Innodb/XtraDB storage engines you may see hidden and dangerous “debt” being accumulated – unpurged transaction “history” which if not kept in check over time will cause serve performance regression or will take all free space and cause an outage. Let’s talk about where it comes from and what can you do […]

Using InfiniDB MySQL server with Hadoop cluster for data analytics

In my previous post about Hadoop and Impala I benchmarked performance of analytical queries in Impala. This time I’ve tried InfiniDB for Hadoop (open-source version) on the modern hardware with an 8-node Hadoop cluster. One of the main advantages (at least for me) of InifiniDB for Hadoop is that it stores the data inside the Hadoop cluster but uses the […]

A technical WebScaleSQL review and comparison with Percona Server

The recent WebScaleSQL announcement has made quite a splash in the MySQL community over the last few weeks, and with a good reason. The collaboration between the major MySQL-at-scale users to develop a single code branch that addresses the needs of, well, web scale, is going to benefit the whole community. But I feel that […]

Parallel Query for MySQL with Shard-Query

While Shard-Query can work over multiple nodes, this blog post focuses on using Shard-Query with a single node.  Shard-Query can add parallelism to queries which use partitioned tables.  Very large tables can often be partitioned fairly easily. Shard-Query can leverage partitioning to add paralellism, because each partition can be queried independently. Because MySQL 5.6 supports the […]

ScaleArc: Benchmarking with sysbench

ScaleArc recently hired Percona to perform various tests on its database traffic management product. This post is the outcome of the benchmarks carried out by Uday Sawant (ScaleArc) and myself. You can also download the report directly as a PDF here. The goal of these benchmarks is to identify the potential overhead of the ScaleArc […]

A closer look at Percona Server 5.6

Yesterday we announced the GA release of Percona Server 5.6, the latest release of our enhanced, drop-in replacement for MySQL. Percona Server 5.6 is the best free MySQL alternative for demanding applications. Our third major release, Percona Server 5.6 offers all the improvements found in MySQL 5.6 Community Edition plus scalability, availability, backup, and security […]

Checking B+tree leaf nodes list consistency in InnoDB

If we have InnoDB pages there are two ways to learn how many records they contain: PAGE_N_RECS field in the page header Count records while walking over the list of records from infimum to supremum In some previous revision of the recovery tool a short summary was added to a dump which is produced by […]

More on MySQL transaction descriptors optimization

Since my first post on MySQL transaction descriptors optimization introduced in Percona Server 5.5.30-30.2 and a followup by Dimitri Kravchuk, we have received a large number of questions on why the benchmark results in both posts look rather different. We were curious as well, so we tried to answer that question by retrying benchmarks on […]