October 23, 2014

ScaleArc: Benchmarking with sysbench

ScaleArc recently hired Percona to perform various tests on its database traffic management product. This post is the outcome of the benchmarks carried out by Uday Sawant (ScaleArc) and myself. You can also download the report directly as a PDF here. The goal of these benchmarks is to identify the potential overhead of the ScaleArc […]

16000 active connections – Percona Server continues to work when others die

We just published results with improvements in Thread Pool in Percona Server: Percona Server: Thread Pool Improvements for Transactional Workloads Percona Server: Improve Scalability with Thread Pool What I am happy to see is that Percona Server is able to handle a tremendous amount of user connections. From our charts you can see it can […]

Percona Server: Improve Scalability with Percona Thread Pool

By default, for every client connection the MySQL server spawns a separate thread which will process all statements for this connection. This is the ‘one-thread-per-connection’ model. It’s simple and efficient until some number of connections N is reached. After this point performance of the MySQL server will degrade, mostly due to various contentions caused by […]

Automation: A case for synchronous replication

Just yesterday I wrote about math of automatic failover today I’ll share my thoughts about what makes MySQL failover different from many other components and why asynchronous nature of standard replication solution is causing problems with it. Lets first think about properties of simple components we fail over – web servers, application servers etc. We […]

The Math of Automated Failover

There are number of people recently blogging about MySQL automated failover, based on production incident which GitHub disclosed. Here is my take on it. When we look at systems providing high availability we can identify 2 cases of system breaking down. First is when the system itself has a bug or limitations which does not […]

Spreading .ibd files across multiple disks; the optimization that isn’t

Inspired by Baron’s earlier post, here is one I hear quite frequently – “If you enable innodb_file_per_table, each table is it’s own .ibd file.  You can then relocate the heavy hit tables to a different location and create symlinks to the original location.” There are a few things wrong with this advice:

More on dangers of the caches

I wrote couple of weeks ago on dangers of bad cache design. Today I’ve been troubleshooting the production down case which had fair amount of issues related to how cache was used. The deal was as following. The update to the codebase was performed and it caused performance issues, so it was rolled back but […]

Cache Miss Storm

I worked on the problem recently which showed itself as rather low MySQL load (probably 5% CPU usage and close to zero IO) would spike to have hundreds instances of threads running at the same time, causing intense utilization spike and server very unresponsive for anywhere from half a minute to ten minutes until everything […]

Data mart or data warehouse?

This is part two in my six part series on business intelligence, with a focus on OLAP analysis. Part 1 – Intro to OLAP Identifying the differences between a data warehouse and a data mart. (this post) Introduction to MDX and the kind of SQL which a ROLAP tool must generate to answer those queries. […]

What do we optimize with mk-query-digest ?

When we’re looking at mk-query-digest report we typically look at the Queries causing the most impact (sum of the query execution times) as well as queries having some longest samples. Why are we looking at these ? Queries with highest Impact are important because looking at these queries and optimizing them typically helps to improve […]