I have been working for a customer benchmarking insert performance on Amazon EC2, and I have some interesting results that I wanted to share. I used a nice and effective tool iiBench which has been developed by Tokutek. Though the “1 billion row insert challenge” for which this tool was originally built is long over, […]
Are you using InnoDB tables on MySQL version 5.1.22 or newer? If so, you probably have gaps in your auto-increment columns. A simple INSERT IGNORE query creates gaps for every ignored insert, but this is undocumented behaviour. This documentation bug is already submitted. Firstly, we will start with a simple question. Why do we have […]
When we’re looking at benchmarks we typically run some stable workload and we run it in isolation – nothing else is happening on the system. This is not however how things happen in real world when we have significant variance in the load and many things can be happening concurrently. It is very typical to […]
I’ve been working with Clustrix team for long time on the evaluation of Clustrix product, and this is the report on performance characteristics of Clustrix under tpcc-mysql workload. I tested tpcc 5000W (~500GB of data in InnoDB) on Clustrix systems with 3, 6, 9-nodes and also, to have base for comparison, ran the same workload […]
As part of work on “High Performance MySQL, 3rd edition”, Baron asked me to compare different MySQL version in some simple benchmark, but on decent hardware. So why not.
It is known that MySQL due internal limitations is not able to utilize all CPU and IO resources available on modern hardware. Idea is to run multiple instances of MySQL to gain better performance on Fusion-io ioDrive card. Full report is available in PDF
We raised topic of problems with flushing in InnoDB several times, some links: InnoDB Flushing theory and solutions MySQL 5.5.8 in search of stability This was not often recurring problem so far, however in my recent experiments, I observe it in very simple sysbench workload on hardware which can be considered as typical nowadays.
My previous benchmark on Performance Schema was mainly in memory workload and against single tables. Now after adding multi-tables support to sysbench, it is interesting to see what statistic we can get from workload that produces some disk IO. So let’s run sysbench against 100 tables, each 5000000 rows (~1.2G ) and buffer pool 30G. […]
“The least expensive query is the query you never run.” Data access is expensive for your application. It often requires CPU, network and disk access, all of which can take a lot of time. Using less computing resources, particularly in the cloud, results in decreased overall operational costs, so caches provide real value by avoiding […]
Combating “data drift” In my first post in this series, I described materialized views (MVs). An MV is essentially a cached result set at one point in time. The contents of the MV will become incorrect (out of sync) when the underlying data changes. This loss of synchronization is sometimes called drift. This is conceptually […]